Paper Title
Sales Prediction Model using Autoregressive Integrated Moving Average

Abstract
As companies have expanded their presence on social media, the volume of information about them has grown significantly. This work aims to help companies extract insights from their data and use them to predict future sales. The proposed model has three phases: analysis, prediction, and evaluation. The first phase analyzes the company's data to extract the information related to its sales and produces a time series that serves as the input to the second phase. In the second phase, the model uses the Autoregressive Integrated Moving Average (ARIMA) technique to predict future sales. The third phase evaluates the results; both the Mean Absolute Error (MAE) and the Mean Absolute Percent Error (MAPE) metrics are used to measure the accuracy of the model. A case study using sales data from a large-scale pharmaceutical company is conducted to demonstrate the effectiveness of the model. The company's data were collected over a period of four years. In the first phase of the case study, the open-source Pandas data analysis library is used to aggregate the company's raw data into time series points. The R language is used in the second phase to predict future sales. When selecting the ARIMA parameters, we use 80% of the sales data in this second phase and hold out the remaining 20% for the evaluation phase, where it is compared with the predicted results. The evaluation phase shows good prediction performance, with an MAE of 492 and a MAPE of 2.8%. The MAE indicates that the predictions were off by 492 units on average, which is small given that the sales values are very large. The MAPE is an acceptably low percentage error and, being a relative measure, does not depend on how large or small the data are.

Keywords - Sales Prediction, Data Aggregation, Time Series
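
The 80/20 hold-out and the two evaluation metrics described in the abstract can be illustrated with a minimal Python sketch. The function names, the sample sales values, and the naive "repeat the last observed value" forecast are all assumptions made for illustration; the forecast is only a stand-in for the ARIMA model, which the paper fits in R, and none of the numbers come from the case study.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    series: Sequence[float], train_fraction: float = 0.8
) -> Tuple[List[float], List[float]]:
    """Split a time series in order: the earliest points are used for
    fitting, the most recent points are held out for evaluation."""
    cut = int(len(series) * train_fraction)
    return list(series[:cut]), list(series[cut:])


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """MAE: average absolute difference, in the same units as the data."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percent_error(
    actual: Sequence[float], predicted: Sequence[float]
) -> float:
    """MAPE: average absolute error relative to the actual value, in percent.
    Undefined when an actual value is zero."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical sales figures, not taken from the case study.
sales = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0, 160.0, 170.0, 180.0, 200.0]
train, test = chronological_split(sales)

# Placeholder forecast (NOT ARIMA): repeat the last training value.
predicted = [train[-1]] * len(test)

mae = mean_absolute_error(test, predicted)
mape = mean_absolute_percent_error(test, predicted)
print(f"MAE = {mae:.1f} units, MAPE = {mape:.2f}%")
```

The split is chronological rather than random because the held-out points must lie in the future relative to the data used for fitting. MAE is reported in sales units, so its size has to be judged against the scale of the series, whereas MAPE is a percentage and can be compared across series of different magnitudes.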