Non-stationary time series data for natural rubber inventory forecasting: A case study

PPLK Corp. is a company that uses natural rubber as the main raw material to produce crumb rubber. The problem identified at PPLK Corp. is that the amount of natural rubber received is insufficient to meet production and consumer demand. The amount of natural rubber received has fluctuated, with high variability between periods. To minimize this variability, it is necessary to forecast natural rubber requirements. The purpose of this study is to forecast the natural rubber inventory for the next periods using the best-fitted Autoregressive Integrated Moving Average (ARIMA) model. A total of 547 daily data points from 2021 to 2022 were used. The ARIMA (1,1,2) model was found to be the best model for natural rubber forecasting in the rubber factory, as it had the smallest AIC value among the candidate models. The total daily natural rubber need is forecast to be around 67,588 kilograms, with a range between 64,805 and 70,421 kilograms per day. However, this study was limited to short-term forecasting only.


Introduction
Forecasting is the process of predicting future outcomes based on past data [1]. Time series data is prevalent in numerous fields, including production, transportation, medicine, economics, and energy [2], [3]. Forecasting research is particularly relevant to manufacturing and operations, where challenges such as inconsistent supply and dynamic demand can impede production targets [4], [5]. One such challenge is maintaining adequate natural rubber inventory.
PPLK Corp. is a company that uses natural rubber as the main raw material to produce crumb rubber. The natural rubber is sourced from the Sumatra region. However, the company has experienced issues with insufficient supply to meet consumer demand. At other times, the supply has surged, leading to accelerated production to fulfill crumb rubber obligations and minimize delays. As shown in Figure 1, both the amount of natural rubber received and the variability between periods have fluctuated significantly. Therefore, it is necessary to forecast natural rubber requirements to minimize this variability.
In previous studies, natural rubber prices were predicted using the simple moving average (MA) and ARMA-GARCH models [6], [7]. The MA (3) model was found to be more accurate in forecasting natural rubber compared to other models [8]. For dry and wet rubber, the VARX (1,1) model was found to predict better, as it is a multivariate model with exogenous variables [9]. The Economic Order Quantity (EOQ) method was used as an alternative for controlling natural rubber inventory [10].

Industrial Engineering Advance Research & Application
In Thailand, the multiplicative decomposition method was rated as the most suitable for forecasting the total natural rubber production [11]. A combination of ARIMA and SVM was found to improve forecasting accuracy and reduce forecast errors [12]. The forecast target timescale was found to influence the degree of fit between the forecast and the sub-sequence of modes [13]. The moving average (MA), autoregressive (AR), and ARIMA methods were considered effective in forecasting quantities and increasing forecasting accuracy in these studies.
The ARIMA method is widely used for forecasting and has been applied to predict crude palm oil (CPO) prices using both the ARIMA and ARIMA-GARCH integration methods. In Malaysia and Indonesia, the combination of the ARIMA and GARCH models was found to be more effective in predicting CPO prices than either model alone [14]. Time series models have also been found to be effective in predicting other variables, such as notebook production [15], poverty levels [16], the price of quality goods [17], and the demand for cups of milk coffee [18]. In the manufacturing industry, ARIMA is often used to forecast short-term demand based on historical data. The ARIMA (1,0,1) model has been found to be the most effective, as demonstrated by its successful validation against historical demand information in food manufacturing under similar conditions [19]. Additionally, seasonal ARIMA models have been used to forecast small-scale agricultural loads and manage energy in Japan [20], [21].
Using a time series approach such as MA, WMA, or ARIMA to model and forecast demand can help minimize errors in material forecasting for food manufacturing. Weighted Moving Average (WMA) is assumed to be more sensitive to changes in data [22], making it a potentially valuable tool for predicting future demand.
The ARIMA model has been found to perform better than both ARCH-GARCH [23], [24] and Holt's linear model [25]. ARIMA models have shown high accuracy and precision in predicting time series data at the nearest lag. However, the ARIMA-GARCH model may be superior to ARIMA since it updates safety stocks and calculates order quantities at each replenishment cycle [26]. LSTM models have also demonstrated superiority over ARIMA and have been found to reduce error rates and improve long-term projections [27]-[30]. Despite these findings, previous studies have reported differing results when attempting to determine the best forecasting model.
In these studies, the moving average (MA) method, the autoregressive (AR) method, and the ARIMA method were found to be effective in predicting quantities, improving forecasting accuracy, and minimizing errors [12], [23], [24]. ARIMA-related studies have mainly focused on price prediction, production forecasting, and poverty level estimation. However, there has been limited research on forecasting natural rubber inventory using time series approaches.
A novel approach is required to assess the accuracy of future predictions.
This study proposes the Autoregressive Integrated Moving Average (ARIMA) method to determine the best-fitting model. The purpose of this study is to forecast the natural rubber inventory for the next periods using the best-fitted ARIMA model. The paper is structured as follows: introduction, material and method, results and discussion, and conclusion.

Material and method
The ARIMA model involves three steps: model identification, parameter estimation, and residual diagnostics. Data on the natural rubber inventory was gathered from January 2021 to June 2022, and a total of 547 daily data points were used. The R software was used for conducting the analysis.
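ARIMA is a combination of two models, Autoregressive (AR) and Moving Average (MA). In the autoregressive (AR) model, the variable depends on its own previous values. It can be specified as:
$$y_t = \phi_0 + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \varepsilon_t$$
where $y_t$ is the response variable at time $t$, $\phi_0$ is the constant term of the process, $y_{t-1}, \dots, y_{t-p}$ are the response variables at lags $t-1, \dots, t-p$, $\phi_1, \dots, \phi_p$ are the autoregressive coefficients, and $\varepsilon_t$ is the error term at time $t$.
In the moving average (MA) model, the variable depends on previous values of the errors. It can be specified as:
$$y_t = \theta_0 + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$$
where $y_t$ is the response variable at time $t$, $\theta_0$ is the constant mean of the process, $\varepsilon_{t-1}, \dots, \varepsilon_{t-q}$ are the error terms at lags $t-1, \dots, t-q$, and $\theta_1, \dots, \theta_q$ are the moving average coefficients [24], [30].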
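As a minimal illustration of the two building blocks, the sketch below computes one-step predictions from an AR model and an MA model in their standard textbook forms. The function names, the coefficient values, and the sign convention (MA terms added) are assumptions made for illustration; they are not taken from the study.

```python
def ar_predict(history, phi0, phi):
    """One-step AR(p) prediction: phi0 + phi[0]*y[t-1] + ... + phi[p-1]*y[t-p]."""
    p = len(phi)
    # Most recent observation first, so it pairs with phi[0].
    recent = history[-p:][::-1] if p else []
    return phi0 + sum(c * y for c, y in zip(phi, recent))


def ma_predict(errors, theta0, theta):
    """One-step MA(q) prediction: theta0 + theta[0]*e[t-1] + ... + theta[q-1]*e[t-q].

    The current error e[t] is unknown at prediction time, so its expected
    value (zero) is used.
    """
    q = len(theta)
    recent = errors[-q:][::-1] if q else []
    return theta0 + sum(c * e for c, e in zip(theta, recent))


# Hypothetical values, chosen only to make the arithmetic easy to follow.
ar_value = ar_predict([3.0, 2.0], phi0=1.0, phi=[0.5])        # 1 + 0.5 * 2.0
ma_value = ma_predict([0.4, 1.0], theta0=1.0, theta=[0.2, 0.1])  # 1 + 0.2 * 1.0 + 0.1 * 0.4
print(ar_value, ma_value)
```

An ARIMA (p, d, q) model combines both kinds of terms and applies them to the series after it has been differenced d times.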

Model identification
The ARIMA model, also known as the Box-Jenkins model, involves three steps. In the identification step, the stationarity of the time series is checked first by plotting the data and observing its pattern. Stationarity is a key requirement for using the ARIMA model. A stationary process has a constant mean and variance over time. The Augmented Dickey-Fuller (ADF) test is used to check the stationarity of the data, and the Autocorrelation Function (ACF) and Partial ACF (PACF) plots are also considered. If the data are not stationary, they must be transformed to achieve stationarity. First differencing is usually sufficient to stabilize the mean. Differencing computes the difference between successive observations to remove trend from the data. The correlogram (the ACF and PACF plots) is then used to identify the AR and MA orders of the ARIMA model. After stationarity is achieved, the next step is parameter estimation, followed by residual diagnostics.
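The differencing and correlogram steps can be sketched as follows. This is an illustrative pure-Python version, not the R code used in the study; the function names and the toy series are assumptions, and the sample ACF uses the usual biased estimator (dividing by the lag-0 sum of squares).

```python
def difference(series, lag=1):
    """Return the lag-differenced series: series[t] - series[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag.

    Requires a non-constant series (otherwise the denominator is zero).
    """
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Toy series with an upward trend (hypothetical values, not the study data).
toy = [5, 7, 6, 9, 8, 11, 10, 13]
diffed = difference(toy)          # the trend is removed by first differencing
acf = sample_acf(diffed, max_lag=3)
print(diffed)
print([round(r, 3) for r in acf])
```

In practice the ACF and PACF of the differenced series are inspected together: a sharp cut-off in the ACF suggests an MA order, while a sharp cut-off in the PACF suggests an AR order.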

Parameter estimation
The parameters of the ARIMA model are estimated using the least squares estimator. Auto ARIMA is used to identify the orders p and q of the ARIMA (p, d, q) model, with or without differencing. Candidate models are compared from the simplest upward to obtain the most parsimonious model, that is, the model with the fewest parameters that still performs well. The best-fitting model should have statistically significant coefficients, and the Akaike Information Criterion (AIC) is used to select among the candidates: the model with the lowest AIC value is preferred.
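The AIC comparison can be sketched as below, using the standard definition AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood. The log-likelihood values are invented purely for illustration and are not results from this study; the parameter counts assume that the error variance is counted as one parameter, which is a convention choice.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(candidates):
    """Pick the candidate with the lowest AIC.

    candidates maps a model order (p, d, q) to (log_likelihood, n_params).
    Returns the best order and the AIC of every candidate.
    """
    scores = {order: aic(ll, k) for order, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical log-likelihoods (NOT taken from Table 2 of the study).
# n_params = p + q + 1, counting the error variance as a parameter.
candidates = {
    (0, 1, 0): (-110.0, 1),
    (1, 1, 1): (-104.0, 3),
    (1, 1, 2): (-101.5, 4),
    (2, 1, 2): (-101.2, 5),
}
best_order, scores = select_by_aic(candidates)
print(best_order, scores[best_order])
```

The penalty term 2k is what makes the criterion favor parsimony: in this toy example the largest model has the highest likelihood but does not win, because its extra parameter costs more than the fit it gains.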

Residual diagnostics
Residual diagnostics are used to check whether the residuals of the fitted model behave as white noise. The residuals must satisfy the assumption of a non-autocorrelated, stationary random process, i.e. $\varepsilon_t \sim \mathrm{NID}(0, \sigma^2)$ (normally and independently distributed). This is assessed by plotting the residuals, performing the Ljung-Box test, and interpreting the Anderson-Darling statistic for the normality test. If the assumptions are met, the best-fitted model can be used to forecast the demand for natural rubber for the next period. If the assumptions are not met, a better model must be identified, for example by adding parameters to the AR or MA component while taking care not to overfit.
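For reference, the Ljung-Box statistic is Q = n(n + 2) Σ ρ_k² / (n − k), summed over lags k = 1, ..., h, where ρ_k is the lag-k sample autocorrelation of the residuals. The sketch below computes Q only; the function name and the toy residuals are assumptions for illustration, and the p-value step (a chi-square tail probability, usually with degrees of freedom reduced by the number of fitted AR and MA parameters) is left to a statistics package.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic for lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Lag-k sample autocorrelation of the residuals.
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


# Toy residuals that alternate in sign, i.e. clearly NOT white noise.
toy_residuals = [1.0, -1.0] * 10
q_stat = ljung_box_q(toy_residuals, max_lag=1)
# Q is approximately 20.9 here, far above 3.84, the 5% critical value of a
# chi-square distribution with 1 degree of freedom, so white noise is rejected.
print(round(q_stat, 3))
```

A large Q (small p-value) signals remaining autocorrelation, whereas a p-value above the significance level is consistent with white-noise residuals.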

Study framework
The study consists of pre-study (identify problem, literature review, set purpose, and gather data), model identification, parameter estimation, residual diagnostics, and forecast for next periods. The study framework can be seen in Fig. 2.

Model identification
Based on Table 1, the data series is plotted in Fig. 3. The figure shows that the series is non-stationary, as its mean and variance are not constant over time. This non-stationarity can be attributed to the recovery period after the COVID-19 vaccine was developed.
To handle the non-stationary series, first differencing was performed. Fig. 4 displays the first-differenced series, which appears stationary: its mean and variance remain constant over time. The ACF and PACF can then be examined to identify candidate ARIMA models. Fig. 5 displays the ACF and PACF plots after first differencing. Based on the correlogram, the candidate orders are 0, 1, and 2 for the moving average component (q) and 0, 1, and 2 for the autoregressive component (p).
Table 2 shows the AIC values for all candidate models, from ARIMA (0,1,0) to ARIMA (2,1,2), with and without drift. ARIMA (1,1,2) without drift has the lowest AIC value, approximately 12,558.78, and is therefore preferred as the best model. Table 3 shows the coefficients of the ARIMA (1,1,2) model. The AR (1) and MA (2) coefficients are significant, with estimates of -0.959 and -0.916, respectively.
Table 4 shows the Ljung-Box Q-statistic and normality test results for the residuals of the best model. The Ljung-Box p-value is 0.5875, which is greater than the significance level (0.05), indicating that the residuals are white noise. The Anderson-Darling test is reported to indicate normally distributed residuals, with a stated p-value of less than 0.05; note that normality is conventionally supported only when the p-value exceeds 0.05, so this result should be read with caution.
Next, the natural rubber inventory was forecast for the following month. Table A3 (see Appendix) shows the daily natural rubber inventory forecast for 31 days. The mean is around 67,588 kilograms, with a range between 64,805 and 70,421 kilograms per day. Based on the forecasting results, the variability is smaller than under the company's current condition. This result is similar to those obtained by Oktiani [12], Khamis et al. [23], and Haque & Shaik [24], which show that forecasting using ARIMA can improve predictions and minimize error.
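To show how multi-step forecasts arise from an ARIMA (1,1,2) model without drift, the sketch below applies the standard recursion to the differenced series and then accumulates the predicted differences back onto the last observed level. It assumes the convention in which MA terms are added, and all numbers in the example are hypothetical: the study reports the AR (1) and MA (2) estimates but not the MA (1) estimate in the text, so the fitted model cannot be reproduced from the text alone.

```python
def forecast_arima_112(y, residuals, phi1, theta1, theta2, steps):
    """Point forecasts from ARIMA(1,1,2) without drift.

    y         : observed series (at least two values)
    residuals : in-sample residuals (at least two values), aligned with y
    Model on differences w[t] = y[t] - y[t-1]:
        w[t] = phi1*w[t-1] + e[t] + theta1*e[t-1] + theta2*e[t-2]
    Future errors are replaced by their expected value, zero.
    """
    w_prev = y[-1] - y[-2]                     # last observed difference
    e_last, e_prev = residuals[-1], residuals[-2]
    level = y[-1]
    forecasts = []
    for h in range(1, steps + 1):
        if h == 1:
            ma_part = theta1 * e_last + theta2 * e_prev
        elif h == 2:
            ma_part = theta2 * e_last          # only e[T] is still known
        else:
            ma_part = 0.0                      # MA memory has run out
        w_hat = phi1 * w_prev + ma_part
        level += w_hat                         # undo the differencing
        forecasts.append(level)
        w_prev = w_hat
    return forecasts


# Hypothetical inputs chosen for easy checking (not the fitted model).
print(forecast_arima_112([10.0, 12.0], [0.0, 0.0],
                         phi1=0.5, theta1=0.0, theta2=0.0, steps=3))
```

Because the MA terms only influence the first two steps and the AR term decays geometrically, the predicted differences shrink toward zero, which is why such forecasts settle into a narrow band over a 31-day horizon.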

Conclusions
The company had been facing fluctuations in the amount of natural rubber received, leading to production targets not being met in certain periods and accelerated production in others. This research proposed an ARIMA method to address this problem by developing the best-fitted model for forecasting natural rubber inventory. First differencing transformed the non-stationary data into data that are stationary in terms of mean and variance.
The study revealed that the ARIMA (1,1,2) model provided better predictions at the 5% significance level, as it had the smallest AIC value among the candidate models. The total daily natural rubber requirement was estimated to be around 67,588 kilograms, with a range of 64,805 to 70,421 kilograms per day.
However, this study was limited to short-term forecasting, and future research could focus on developing multivariate forecasting models for predicting demand in the medium or long term. Additionally, a combination of methods with SVM could be considered in future research.
Engineering, Universitas Sultan Ageng Tirtayasa who has facilitated this study.

Disclosure statement
The authors report there are no competing interests to declare.