Regression analysis is an essential tool for determining the relationships between multiple independent variables and a single dependent variable. Businesses and organizations use regression analysis to develop models that help explain phenomena affecting productivity. Through regression, it is possible to build a model that shows which variables to control in order to achieve a positive outcome in the one variable that determines the success of the business (Atkinson & Riani, 2007). Regression is also valuable for developing predictive models from data available across various parts of an organization. The major problem facing Dupree Fuels Company is how to achieve a competitive advantage over other companies in the same sector. The company collected data from customers to test a model that is practical and more reliable.

Data were collected from a sample of 40 customers using oil-fired water heaters. The data comprised oil usage, degree days, home index and number of people in the home. The company seeks to develop a model that determines whether degree days, home index and number of people can predict the level of customer consumption. Multiple linear regression was conducted on the variables to determine how effectively they predict a customer's oil consumption requirements. Oil usage is therefore the dependent variable in this analysis, and degree days, home index and number of people are the independent variables whose reliability as predictors of consumption is assessed. The following section presents the results of the regression, produced in Excel, and explains the output for each variable to conclude whether the model will help the company solve the problem it faces in serving customers.

Multiple regression is one of the best methods for predicting a dependent variable from several independent variables, since it describes the overall fit of the model based on all variables together as well as the contribution of each individual variable (Atkinson & Riani, 2007). The various outputs of a multiple regression describe different characteristics of the model that are useful in predicting the dependent variable. The most important values are: the R square value, which gives the proportion of the variance in the dependent variable that the independent variables account for; the Adjusted R square value, which is R square adjusted for the number of independent variables in the model; the p-value from the analysis of variance (ANOVA), which describes the significance of the overall model in making the prediction; the coefficients, which describe the effect of each independent variable on the dependent variable; and the p-value for every independent variable, which describes the significance of that variable in the model (Morrissey & Ruxton, 2018). The significance value in the ANOVA is compared to the standard alpha value of 0.05, and any value below this alpha implies that the model is significant.
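The relationship between R square and Adjusted R square can be checked with a short calculation. The sketch below assumes the standard adjustment formula, 1 - (1 - R²)(n - 1)/(n - k - 1), and takes R square, the number of observations and the regression degrees of freedom from the Excel output reported in the Results section. The function name `adjusted_r_squared` is only an illustrative choice.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjust R square for the number of predictors in the model."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Values copied from the regression output in the Results section.
r2 = 0.121092502  # R Square
n = 40            # Observations
k = 4             # regression degrees of freedom in the ANOVA table

adj_r2 = adjusted_r_squared(r2, n, k)
print(f"Adjusted R square: {adj_r2:.6f}")
```

The result should agree with the Adjusted R Square figure in the regression statistics table, which shows how strongly the adjustment penalises a model whose predictors explain little variance.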

Results

Regression Statistics

| Statistic | Value |
| --- | --- |
| Multiple R | 0.347983479 |
| R Square | 0.121092502 |
| Adjusted R Square | 0.020645931 |
| Standard Error | 11.56914241 |
| Observations | 40 |

ANOVA

| Source | df | SS | MS | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 4 | 645.423 | 161.3558 | 1.205541 | 0.325865 |
| Residual | 35 | 4684.577 | 133.8451 | | |
| Total | 39 | 5330 | | | |

| Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Intercept | 40.05571309 | 9.960512 | 4.021451 | 0.000293 | 19.8348 | 60.27663 | 19.8348 | 60.27662636 |
| Oil Usage | 0.022776589 | 0.022561 | 1.009578 | 0.319627 | -0.02302 | 0.068577 | -0.02302 | 0.068576859 |
| Degree Days | -0.012090105 | 0.007918 | -1.52683 | 0.13579 | -0.02817 | 0.003985 | -0.02817 | 0.003985133 |
| Home Index | -3.928041836 | 2.356021 | -1.66724 | 0.104393 | -8.71102 | 0.854935 | -8.71102 | 0.854934679 |
| Number People | -1.393669581 | 1.434607 | -0.97146 | 0.337983 | -4.30608 | 1.518737 | -4.30608 | 1.518737095 |

The first two tables give insight into how the three variables (degree days, home index and number of people), taken as a set, help predict oil consumption for customers. The regression statistics table gives the value of R square, which is the first indication of whether the company should consider the model reliable based on this set of independent variables. R^{2}=0.1211, that is 12.11%, implying that taken as a set, the predictors account for only 12.11% of the variance in oil usage. Dupree Fuels Company should therefore take the further step of examining the predictive value of the individual variables rather than only the set as a whole.

From the Analysis of Variance (ANOVA) table, the overall regression model was not significant, F(4,35)=1.2055, p=0.326, R^{2}=0.1211. The Significance F value is well above 0.05, which implies that the model as a whole does not predict oil usage reliably. To further assess how this model can help the company predict oil consumption among customers, we next examine the regression results for the individual variables.
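The ANOVA figures are internally linked, and a few lines of arithmetic reproduce them. The following is a minimal sketch that assumes only the sums of squares and degrees of freedom reported in the ANOVA table; the variable names are illustrative. It recovers the mean squares, the F statistic, R square and the standard error of the regression.

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_regression, df_regression = 645.423, 4
ss_residual, df_residual = 4684.577, 35

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F compares explained variance per predictor with unexplained variance.
f_statistic = ms_regression / ms_residual

# R square is the share of total variation explained by the regression.
ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total

# The standard error of the regression is the root of the residual mean square.
standard_error = math.sqrt(ms_residual)

print(f"F = {f_statistic:.4f}")
print(f"R square = {r_squared:.4f}")
print(f"Standard error = {standard_error:.4f}")
```

Because F is barely above 1, the variance explained per predictor is about the same size as the unexplained variance, which is consistent with the large Significance F value.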

The coefficients table helps determine the significance of each variable for oil consumption. Each p-value was assessed against alpha=.05. The p-value for degree days was 0.13579, which is >.05, implying that degree days is not a significant predictor of oil consumption among customers. The p-value for home index was 0.104393, which is also >.05, meaning that this variable was likewise not significant in predicting oil usage. Finally, number of people had a p-value of 0.337983, which again is greater than alpha=.05, so it is not a significant predictor either.
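The same decision rule can be written out explicitly. This sketch assumes the coefficients, standard errors and p-values exactly as they appear in the coefficients table (the standard errors are rounded there, so recomputed t statistics match only to about three decimal places). The dictionary `terms` and the other names are illustrative.

```python
ALPHA = 0.05

# (coefficient, standard error, p-value) copied from the coefficients table.
terms = {
    "Degree Days": (-0.012090105, 0.007918, 0.13579),
    "Home Index": (-3.928041836, 2.356021, 0.104393),
    "Number People": (-1.393669581, 1.434607, 0.337983),
}

results = {}
for name, (coefficient, std_error, p_value) in terms.items():
    # The t statistic is the coefficient divided by its standard error.
    t_stat = coefficient / std_error
    is_significant = p_value < ALPHA
    results[name] = (t_stat, is_significant)
    verdict = "significant" if is_significant else "not significant"
    print(f"{name}: t = {t_stat:.3f}, p = {p_value:.3f} -> {verdict}")
```

Each variable is tested on its own p-value, so the loop reports a separate verdict per predictor rather than one verdict for the set.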

Unfortunately for the company, this regression model does not help in predicting oil consumption among customers. The lack of significance has two possible explanations: either the variables are appropriate but the data collected were not accurate, or the data were accurate but the variables are not appropriate for predicting oil consumption.

Conclusion and Recommendation

According to the results, Dupree will not get meaningful information from the model for predicting oil consumption. There are several approaches the company can use to improve the model and possibly obtain a different outcome. One is to collect more accurate information from customers so that predictability is assessed on reliable data. Another is to change the variables by looking for other factors that affect oil consumption.

The variables were chosen for their perceived impact on oil consumption, but the regression results suggest that this model is inappropriate. The problem may arise from either the data collected or the choice of variables, and the company needs to conduct further analysis to determine how it can predict consumption. Regression is an effective method for prediction, especially with large data sets, and is widely used for data modelling and prediction in various applications. Dupree will have to conduct further research on how to predict the consumption of their customers after the failure of the first model and the second regression model.

References

Atkinson, A., & Riani, M. (2007). Building regression models with the forward search. *Journal of Computing and Information Technology*, *15*(4), 287. doi: 10.2498/cit.1001135

Morrissey, M., & Ruxton, G. (2018). Multiple regression is not multiple regressions: The meaning of multiple regression and the non-problem of collinearity. *Philosophy, Theory, and Practice in Biology*, *10*(20200624). doi: 10.3998/ptpbio.16039257.0010.003