Task 1. Regression analysis
You need to construct a meaningful regression model, select appropriate explanatory variables (using your judgment and comparing models as we did in a workshop), run basic diagnostics on the model and answer the following questions using it:
1. In which month has the highest number of deaths historically occurred?
2. Was the seatbelt law effective in decreasing the number of driver deaths?
3. What is the meaning of the parameter for November? Provide an interpretation.
4. What would the number of deaths be in January 1985 if the law were still in effect, the distance driven were 20,000 km, the petrol price were 0.11 and the number of van drivers killed were 6?
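The mechanics behind questions 3 and 4 can be sketched on synthetic data. This is only an illustration under stated assumptions, not the expected solution: the series is noise-free and made up, January is assumed to be the baseline month, distance is assumed to be measured in thousands of km, and the names SEASONAL, design_row and ols are hypothetical helpers. With monthly dummy variables, the parameter for November is the expected difference between November and the baseline month when all other variables are held fixed, and a prediction is the sum of each parameter multiplied by the corresponding value in the new row.

```python
# Synthetic, noise-free data (NOT the real dataset); all numbers are assumptions.
SEASONAL = [0, -5, -3, -8, -2, 1, 4, 6, 9, 12, 15, 20]  # month effects, Jan..Dec
B0, B_LAW, B_KMS = 120.0, -20.0, 1.5                     # assumed true parameters


def design_row(month, law, kms):
    """Intercept, 11 month dummies (Feb..Dec, January is the baseline),
    law dummy, distance driven in thousands of km."""
    dummies = [1.0 if month == m else 0.0 for m in range(1, 12)]
    return [1.0] + dummies + [float(law), float(kms)]


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        # Partial pivoting: bring the largest remaining entry onto the diagonal
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


X, y = [], []
for t in range(36):
    month = t % 12                    # 0 = January, ..., 11 = December
    law = 1 if t >= 26 else 0         # assumed date at which the law starts
    kms = 10 + 0.3 * t + (t * t) % 7  # made-up distance series
    X.append(design_row(month, law, kms))
    y.append(B0 + SEASONAL[month] + B_LAW * law + B_KMS * kms)

beta = ols(X, y)
# Index 0 is the intercept, so the dummy for month m sits at index m.
beta_november = beta[10]
# Prediction for a January with the law in effect and 20 thousand km driven.
pred_january = sum(b * x for b, x in zip(beta, design_row(0, 1, 20)))
print(beta_november, pred_january)
```

Because the synthetic data contain no noise, the estimated November parameter recovers the assumed November effect relative to January. With real data the estimate comes with a standard error that should be reported alongside the interpretation.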
Task 2. Time series decomposition
To better understand the structure of the distance driven (the variable kms) and whether it contains any anomalies, you need to decompose the variable “kms” using an appropriate technique and then answer the question:
5. Are there any anomalies in the data? Why do you think they happened?
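One common option for this kind of task is classical additive decomposition; other techniques (for example STL) may suit the real data better, so the choice is yours to justify. The sketch below is a minimal illustration on a made-up monthly series with one injected spike, not on the real kms variable. The function name decompose_additive and all numbers are assumptions, and the moving average as written assumes an even period.

```python
def decompose_additive(y, period=12):
    """Classical additive decomposition: y = trend + seasonal + remainder."""
    n, half = len(y), period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = y[t - half:t + half + 1]
        # 2 x period centred moving average: the two end points get half weight
        trend[t] = (sum(window) - 0.5 * (window[0] + window[-1])) / period
    # Average the detrended values for each position in the seasonal cycle
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(y[t] - trend[t])
    raw = [sum(b) / len(b) for b in buckets]
    centre = sum(raw) / period  # force the seasonal indices to sum to zero
    seasonal = [raw[t % period] - centre for t in range(n)]
    remainder = [None if trend[t] is None else y[t] - trend[t] - seasonal[t]
                 for t in range(n)]
    return trend, seasonal, remainder


# Made-up series: linear trend + fixed seasonal pattern + one spike at t = 30.
pattern = [-6, -4, -2, 0, 2, 4, 6, 4, 2, 0, -2, -4]
y = [100 + 0.5 * t + pattern[t % 12] for t in range(48)]
y[30] += 30

trend, seasonal, remainder = decompose_additive(y)
valid = [t for t in range(len(y)) if remainder[t] is not None]
worst = max(valid, key=lambda t: abs(remainder[t]))
print("largest remainder at index", worst)
```

Observations with unusually large remainders are candidates for anomalies. Note that a single spike also leaks into the trend and seasonal estimates, so the same calendar month in other years can show smaller residual distortions.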
Task 3. Forecasting
Finally, the analyst needs to understand the tendency in the number of van drivers killed (the variable VanKilled) and get a forecast of it for the next 12 months. For this task, you need to use several simple forecasting methods (at least three), select the best one based on RMSE and produce the point forecast for the next 12 months. After that, you need to answer the following questions:
6. Why do you think that the simple forecasting methods are appropriate for this data?
7. How can you interpret the value of the smoothing parameter of the Simple Exponential Smoothing for your data?
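The workflow of Task 3 can be sketched as follows. This is an illustration under assumptions rather than the required procedure: the counts are made up (not the real VanKilled values), RMSE is computed on a holdout of the last 12 observations, the smoothing parameter is chosen by a grid search on the in-sample one-step-ahead errors, and all function names are hypothetical.

```python
import math


def rmse(actual, forecast):
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def naive(train, h):
    """Repeat the last observed value."""
    return [train[-1]] * h


def global_mean(train, h):
    """Repeat the average of all observed values."""
    return [sum(train) / len(train)] * h


def ses_fit(train, alpha):
    """Return the final smoothed level and the in-sample one-step-ahead SSE."""
    level, sse = train[0], 0.0
    for obs in train[1:]:
        sse += (obs - level) ** 2
        level = alpha * obs + (1 - alpha) * level
    return level, sse


def ses_alpha(train):
    """Grid search for the smoothing parameter in (0, 1]."""
    grid = [i / 100 for i in range(1, 101)]
    return min(grid, key=lambda a: ses_fit(train, a)[1])


def ses(train, h):
    """Simple Exponential Smoothing: a flat forecast at the final level."""
    return [ses_fit(train, ses_alpha(train))[0]] * h


# Made-up monthly counts, for illustration only.
series = [12, 9, 11, 8, 10, 13, 7, 9, 10, 8, 11, 9,
          8, 10, 7, 9, 6, 8, 9, 7, 10, 6, 8, 7,
          5, 8, 6, 7, 9, 5, 6, 8, 4, 7, 6, 5]
train, test = series[:-12], series[-12:]

methods = {"naive": naive, "mean": global_mean, "ses": ses}
scores = {name: rmse(test, f(train, 12)) for name, f in methods.items()}
best = min(scores, key=scores.get)
forecast = methods[best](series, 12)  # refit the selected method on all data
print(best, scores, ses_alpha(series))
```

When reading the smoothing parameter, a value close to 0 means the level is updated slowly and the forecast relies on a long history of observations, while a value close to 1 means the forecast tracks the most recent observation and behaves much like the naive method.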