Requirements
- Linear regression needs at least two variables: one response and at least one predictor
- As a rule of thumb, at least 20 observations per variable are preferred to have confidence in the model
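The rule of thumb above can be checked directly on a data set. The following is a minimal sketch using the built-in cars data; the variable names (n_obs, n_vars, obs_per_var, meets_rule) are illustrative and not part of the original.

```r
# Count observations and variables in the built-in cars data set.
n_obs  <- nrow(cars)   # number of observations (rows)
n_vars <- ncol(cars)   # number of variables (columns): speed and dist

# Observations available per variable.
obs_per_var <- n_obs / n_vars

# TRUE when there are at least two variables and 20 observations per variable.
meets_rule <- n_vars >= 2 && obs_per_var >= 20
print(c(n_obs = n_obs, n_vars = n_vars, obs_per_var = obs_per_var))
print(meets_rule)
```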
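1. Linear Relationship
- The relationship between the dependent variable (response) and the independent variable(s) (predictors) needs to be linear
- Limit the number of outliers, as linear regression is sensitive to them; check with scatter plots
R methods: scatter plots
plot(cars)
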
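One possible way to judge linearity visually is to overlay a straight-line fit and a smoother on the scatter plot. This is a sketch under the assumption that the built-in cars data is used with dist as the response; the names linear_fit and r are illustrative.

```r
# Scatter plot of the built-in cars data.
plot(cars$speed, cars$dist, xlab = "speed", ylab = "dist")

# Straight-line least-squares fit drawn on top of the points.
linear_fit <- lm(dist ~ speed, data = cars)
abline(linear_fit, col = "blue")

# Locally weighted smoother; a strong departure from the straight line
# hints at a non-linear relationship.
lines(lowess(cars$speed, cars$dist), col = "red", lty = 2)

# Pearson correlation as a rough numeric summary of linear association.
r <- cor(cars$speed, cars$dist)
print(r)
```

If the dashed smoother stays close to the straight line, a linear model is a plausible starting point; points far from both curves are candidate outliers worth inspecting.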
2. Multivariate Normality
Each of the variables is assumed to follow a normal (or Gaussian) distribution: random, independent observations tend towards a central mean. In practice, the assumption matters most for the residuals of the fitted model.
R methods: Histograms, Shapiro-Wilk Test, Q-Q plot, Kolmogorov-Smirnov test
hist(cars$speed)
hist(cars$dist)

shapiro.test(cars$speed)
##
## Shapiro-Wilk normality test
##
## data: cars$speed
## W = 0.97765, p-value = 0.4576
shapiro.test(cars$dist)
##
## Shapiro-Wilk normality test
##
## data: cars$dist
## W = 0.95144, p-value = 0.0391
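To read the Shapiro-Wilk results, the p-value is compared with a significance level. The sketch below assumes the conventional level of 0.05 (the original does not state one); alpha, p_speed and p_dist are illustrative names.

```r
alpha <- 0.05  # assumed significance level, not stated in the original

# Extract the p-values from the Shapiro-Wilk tests on each variable.
p_speed <- shapiro.test(cars$speed)$p.value
p_dist  <- shapiro.test(cars$dist)$p.value

# TRUE means the null hypothesis of normality is NOT rejected at alpha.
not_rejected <- c(speed = p_speed > alpha, dist = p_dist > alpha)
print(not_rejected)
```

With the p-values reported above, normality is not rejected for speed (0.4576) but is rejected for dist (0.0391) at the 0.05 level.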
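# Q-Q plot of speed against dist: compares the two samples with each other, not with a normal distribution
qqplot(cars$speed, cars$dist)

# two-sample test: checks whether speed and dist share the same distribution, not whether either one is normal
ks.test(cars$speed, cars$dist)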
##
## Two-sample Kolmogorov-Smirnov test
##
## data: cars$speed and cars$dist
## D = 0.76, p-value = 5.735e-13
## alternative hypothesis: two-sided
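The two-sample calls above compare speed with dist. To compare a single variable with a normal distribution instead, a sketch along the following lines may be closer to the intent. It assumes the normal parameters are estimated from the data, which makes the Kolmogorov-Smirnov p-value only approximate; ties in the data would normally trigger a warning, which is suppressed here. The names ks_dist, res and sw_res are illustrative.

```r
# Q-Q plot of one variable against theoretical normal quantiles,
# with a reference line through the quartiles.
qqnorm(cars$dist)
qqline(cars$dist)

# One-sample Kolmogorov-Smirnov test against a normal distribution whose
# mean and sd are estimated from the sample (an assumption of this sketch).
ks_dist <- suppressWarnings(
  ks.test(cars$dist, "pnorm", mean = mean(cars$dist), sd = sd(cars$dist))
)
print(ks_dist)

# The same kind of check applied to the residuals of the fitted model.
res <- residuals(lm(dist ~ speed, data = cars))
sw_res <- shapiro.test(res)
print(sw_res)
```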
3. No or little multicollinearity
- Multicollinearity occurs when the independent variables are not independent of each other, i.e. one predictor can be predicted from another
- A second important independence assumption is that the error term has to be independent of the independent variables
R methods: correlation, tolerance, variance inflation factor (VIF)
cor(cars)
##           speed      dist
## speed 1.0000000 0.8068949
## dist  0.8068949 1.0000000
# example of correlated variables
cars2 <- cars
cars2$speed2 <- cars2$speed
cor(cars2) # a 1.0 anywhere off the diagonal means two variables are perfectly correlated
##            speed      dist    speed2
## speed  1.0000000 0.8068949 1.0000000
## dist   0.8068949 1.0000000 0.8068949
## speed2 1.0000000 0.8068949 1.0000000
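Tolerance and VIF are listed as methods but not demonstrated. The sketch below computes them by hand in base R: each predictor is regressed on the others, VIF is 1 / (1 - R^2) and tolerance is its reciprocal. The built-in mtcars data, the chosen predictors (wt, disp, hp) and the helper vif_manual are assumptions made for illustration, since cars has only one predictor.

```r
# Hypothetical helper (not from the original): VIF for every column of a
# data frame that contains only numeric predictors.
vif_manual <- function(predictors) {
  sapply(names(predictors), function(v) {
    others <- setdiff(names(predictors), v)
    # Regress predictor v on all remaining predictors.
    r2 <- summary(lm(reformulate(others, response = v), data = predictors))$r.squared
    1 / (1 - r2)
  })
}

# Assumed example predictors from the built-in mtcars data.
preds <- mtcars[, c("wt", "disp", "hp")]

vif_values <- vif_manual(preds)
tolerance  <- 1 / vif_values   # tolerance = 1 - R^2

print(round(vif_values, 2))
print(round(tolerance, 2))
```

A VIF of 1 means a predictor is uncorrelated with the others; larger values indicate stronger collinearity. A perfectly duplicated column such as speed2 gives an R^2 of 1, so its VIF is unbounded.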
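4. No auto-correlation
- Autocorrelation occurs when the residuals are not independent of each other
- It is the correlation of a signal with itself at different points in time, and is common in time series analysis
R methods: Durbin-Watson Test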
library(car)
fit <- lm(dist ~ speed, data=cars)
summary(fit)
##
## Call:
## lm(formula = dist ~ speed, data = cars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -29.069  -9.525  -2.272   9.215  43.201
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -17.5791     6.7584  -2.601   0.0123 *
## speed         3.9324     0.4155   9.464 1.49e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared: 0.6511, Adjusted R-squared: 0.6438
## F-statistic: 89.57 on 1 and 48 DF, p-value: 1.49e-12
durbinWatsonTest(fit)
##  lag Autocorrelation D-W Statistic p-value
##    1       0.1604322      1.676225   0.166
##  Alternative hypothesis: rho != 0
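The D-W statistic itself is simple to compute from the residuals: the sum of squared differences between successive residuals divided by the sum of squared residuals. The base R sketch below should reproduce the statistic reported above without the car package; it does not compute the p-value. The names e and dw are illustrative.

```r
# Refit the model so the block is self-contained.
fit <- lm(dist ~ speed, data = cars)
e <- residuals(fit)

# Durbin-Watson statistic: squared successive differences over squared residuals.
dw <- sum(diff(e)^2) / sum(e^2)
print(dw)
```

Values near 2 suggest little first-order autocorrelation; values towards 0 suggest positive and values towards 4 negative autocorrelation.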
5. Homoscedasticity
The variance of the error terms is the same along the regression line (homoscedastic) rather than changing with the fitted values (heteroscedastic).
R methods: Residual plot, Goldfeld-Quandt Test
library(car)
residualPlot(fit)

# or in base
plot(fit$fitted.values, fit$residuals)
