Instructor – Micky Midha

- Distinguish between the relative assumptions of single and multiple regression.
- Interpret regression coefficients in a multiple regression.
- Interpret goodness of fit measures for single and multiple regressions, including R^2 and adjusted R^2.
- Construct, apply, and interpret joint hypothesis tests and confidence intervals for multiple coefficients in a regression.


- Introduction
- Additional Assumptions of Multiple Regression
- Interpretation of Coefficients
- Interpretation of Coefficients – Indistinct Variables
- OLS Estimators for Multiple Regression Parameters
- Measuring Model Fit
- Standard Error of Regression
- Coefficient of Determination, R^2
- Limitations of R^2
- Adjusted R^2
- Testing Parameters in Regression Models
- The F-Test – Joint Hypothesis Testing
- Multivariate Confidence Intervals

- In practice, models typically use multiple explanatory variables, which makes it possible to isolate the unique contribution of each one. A k-variate regression model enables the coefficients to measure the distinct contribution of each explanatory variable to the variation in the dependent variable.
- Multiple regression is regression analysis with more than one independent variable. The general form of a multiple regression can be written as

Y_i = α + β_1 X_{1i} + β_2 X_{2i} + ⋯ + β_k X_{ki} + ϵ_i

where:

Y_i = i-th observation of the dependent variable Y

X_{ki} = i-th observation of the k-th independent variable X_k

α = intercept term

β_k = slope coefficient of the k-th independent variable

ϵ_i = error term of the i-th observation

n = number of observations

k = total number of independent variables
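The model above can be estimated numerically. The following is a minimal sketch using NumPy on simulated data; all coefficient values, the seed, and the variable names are illustrative assumptions, not values from the lecture:

```python
# Hypothetical example: estimate Y_i = alpha + beta1*X1_i + beta2*X2_i + eps_i
# by OLS on simulated data (true alpha=1, beta1=2, beta2=-3 are made up).
import numpy as np

rng = np.random.default_rng(42)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
eps = rng.normal(scale=0.5, size=n)
y = 1.0 + 2.0 * x1 - 3.0 * x2 + eps

# Design matrix with a column of ones for the intercept term alpha
X = np.column_stack([np.ones(n), x1, x2])
# OLS minimizes ||y - X b||^2; lstsq returns b = (alpha_hat, beta1_hat, beta2_hat)
b, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha_hat, beta1_hat, beta2_hat = b
```

With this much data the estimates land close to the true values used in the simulation.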

- Extending the model to multiple regressors requires one additional assumption, along with some modifications to the six assumptions of linear regression with a single regressor.
- The additional assumption is –
- Multiple linear regression assumes that the explanatory variables are not perfectly linearly dependent (i.e., each explanatory variable must have some variation that cannot be perfectly explained by the other variables in the model). If this assumption is violated, then the variables are perfectly collinear.
- The remaining assumptions require simple modifications to account for k explanatory variables. These become –
- All explanatory variables must have positive variance, so that σ²_{X_j} > 0 for each j = 1, …, k.
- The error is assumed to have mean zero conditional on the explanatory variables: E[ϵ_i | X_{1i}, …, X_{ki}] = 0.
- The random variables (X_{1i}, …, X_{ki}, Y_i) are independent and identically distributed (iid) across observations.
- The probability of large outliers in each explanatory variable should be small, so that the variables have finite fourth moments.
- The constant variance (homoskedasticity) assumption is similarly extended to hold conditional on all explanatory variables: Var(ϵ_i | X_{1i}, …, X_{ki}) = σ².
- The error terms should be uncorrelated across all observations, i.e., Corr(ϵ_i, ϵ_j) = 0 for i ≠ j.
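The no-perfect-collinearity assumption can be illustrated numerically: if one regressor is an exact linear function of another, the design matrix loses a rank and the OLS normal equations no longer have a unique solution. A small sketch (the data here are made up):

```python
# Sketch of the perfect-collinearity problem: X2 is an exact linear
# function of X1, so the design matrix [1, X1, X2] has rank 2, not 3,
# and X'X cannot be inverted to produce unique OLS estimates.
import numpy as np

n = 100
x1 = np.arange(n, dtype=float)
x2 = 2.0 * x1 + 3.0            # perfectly collinear with x1 and the constant
X = np.column_stack([np.ones(n), x1, x2])

rank = np.linalg.matrix_rank(X)  # 2 despite X having 3 columns
```

Any variation in X2 is perfectly explained by X1 and the constant, which is exactly what the assumption rules out.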

- Slope coefficient (β_k) – It is the change in the dependent variable from a unit change in the corresponding independent variable (X_{ki}), keeping all other independent variables constant. In observed data, when one independent variable changes by one unit, the actual change in the dependent variable is generally not equal to the slope coefficient, because the other independent variables are correlated with it and tend to move as well.
- Intercept coefficient (α) – The intercept (or the constant) is the expected value of the dependent variable Y when all the independent variables X_k are equal to 0.

When all explanatory variables are distinct (i.e., no variable is an exact function of the others), the coefficients are interpreted as holding all other variables fixed. For example, β_1 is the effect of a small increase in X_1 holding all other variables constant.

Therefore, the slope coefficients are called partial slope coefficients.

- If some explanatory variables are functions of the same random variable (e.g., if X_2 = X_1^2), then the model is

Y_i = α + β_1 X_{1i} + β_2 X_{1i}^2 + ϵ_i

In this case, it is not possible to change X_1 while holding the other variables constant.

The interpretation of the coefficients in models with this structure depends on the value of X_1, because a small change of ΔX_1 in X_1 changes Y by

ΔY ≈ (β_1 + 2β_2 X_1) ΔX_1

This captures the direct, linear effect of a change in X_1 through β_1 and its nonlinear effect through β_2.
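A quick numeric illustration of the level-dependent marginal effect, using made-up coefficient values:

```python
# Illustrative (made-up) coefficients for Y = alpha + beta1*X1 + beta2*X1^2 + eps
beta1, beta2 = 0.5, -0.2

def marginal_effect(x1):
    # dY/dX1 = beta1 + 2*beta2*X1: the effect of a small change in X1
    # depends on the level of X1 itself
    return beta1 + 2.0 * beta2 * x1

effect_at_0 = marginal_effect(0.0)   # 0.5
effect_at_2 = marginal_effect(2.0)   # 0.5 - 0.8 = -0.3
```

The same unit change in X_1 increases Y near X_1 = 0 but decreases it at X_1 = 2, which is why these coefficients cannot be read as ordinary partial slopes.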

- Estimating the multiple regression parameters can be quite demanding since it involves many calculations. A basic understanding can be developed using the multiple regression model with two independent variables, which can then be extended to more than two independent variables.
- Suppose the two-variable model is

Y_i = α + β_1 X_{1i} + β_2 X_{2i} + ϵ_i

The OLS estimator for β_1 can be computed using three single-variable regressions.

- The first regresses X_1 on X_2 and retains the residuals from this regression.
- The second regression does the same for Y.
- The final step regresses the residuals of Y on the residuals of X_1.
- The first two regressions have a single purpose – to remove the direct effect of X_2 from Y and X_1. They do this by decomposing each variable into two components: one that is perfectly correlated with X_2 (i.e., the fitted value) and one that is uncorrelated with X_2 (i.e., the residual). As a result, the two residuals are uncorrelated with X_2 by construction.
- The final regression estimates the linear relationship (i.e., β_1) between the components of Y and X_1 that are uncorrelated with (and so cannot be explained by) X_2.
- Finally, the OLS estimate of β_2 can be computed in the same manner by reversing the roles of X_1 and X_2 (i.e., so that β_2 measures the effect of the component in X_2 that is uncorrelated with X_1).
- This stepwise estimation can be used to estimate models with any number of regressors. In the k-variable model, the OLS estimate of β_1 is computed by first regressing each of X_1 and Y on a constant and the remaining k − 1 explanatory variables. The residuals from these two regressions are mean zero and uncorrelated with the remaining k − 1 explanatory variables. The OLS estimator of β_1 is then estimated by regressing the residuals of Y on the residuals of X_1.
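The three-step procedure above can be verified numerically on simulated data: the slope from the final residual-on-residual regression matches the β_1 from the full multiple regression (the data-generating numbers below are illustrative):

```python
# Stepwise estimation of beta1 by partialling out X2, compared with the
# full two-regressor OLS fit, on simulated (made-up) data.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)              # x1 correlated with x2
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

Z = np.column_stack([np.ones(n), x2])
r_x1 = x1 - Z @ ols(Z, x1)   # step 1: residual of X1 after removing X2
r_y = y - Z @ ols(Z, y)      # step 2: residual of Y after removing X2
# step 3: slope of r_y on r_x1 (no intercept needed; residuals are mean zero)
beta1_stepwise = (r_x1 @ r_y) / (r_x1 @ r_x1)

X_full = np.column_stack([np.ones(n), x1, x2])
beta1_full = ols(X_full, y)[1]   # beta1 from the full regression
```

The two estimates agree up to floating-point error, which is the content of the stepwise (partialling-out) result.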

- The total variation in the dependent variable is called the total sum of squares (TSS), defined as the sum of squared deviations of Y_i around the sample mean Ȳ:

TSS = Σ_{i=1}^{n} (Y_i − Ȳ)^2

- Each observation of the dependent variable is decomposed into two components: the fitted value (Ŷ_i) and the estimated residual (ϵ̂_i), so that:

Y_i = Ŷ_i + ϵ̂_i

- Minimizing the squared residuals decomposes the total variation of the dependent data into two distinct components:
- RSS – one that captures the unexplained variation (due to the error in the model), and
- ESS – another that measures the explained variation (which depends on both the estimated parameters and the variation in the explanatory variables).
- The residual sum of squares (RSS/SSR) is the sum of squared deviations of the actual (or observed) values Y_i from the predicted values Ŷ_i:

RSS = Σ_{i=1}^{n} (Y_i − Ŷ_i)^2

- The explained sum of squares (ESS) is the sum of squared deviations of the predicted values Ŷ_i from the sample mean Ȳ:

ESS = Σ_{i=1}^{n} (Ŷ_i − Ȳ)^2

- It is important to note that Y_i − Ŷ_i = ϵ̂_i, hence RSS is simply the sum of the squared residuals:

RSS = Σ_{i=1}^{n} ϵ̂_i^2

These three measures are related by the identity

**TSS = ESS + RSS**
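The identity TSS = ESS + RSS can be checked numerically for any OLS fit that includes an intercept. A sketch on simulated (made-up) data:

```python
# Verify TSS = ESS + RSS for an OLS fit with an intercept.
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 0.5 + 1.0 * x1 + 2.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ b                 # fitted values
resid = y - y_hat             # estimated residuals

tss = np.sum((y - y.mean()) ** 2)
ess = np.sum((y_hat - y.mean()) ** 2)
rss = np.sum(resid ** 2)      # sum of squared residuals
```

The identity holds exactly (up to floating-point error) because the residuals are orthogonal to the fitted values by construction of OLS.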

- The standard error of the regression (SER) measures the degree of variability of the actual values Y_i with respect to the fitted values Ŷ_i. It is a measure of the spread (or standard deviation) of the observations around the regression line:

SER = √(RSS / (n − k − 1))

- The SER conveys the "fit" of the regression line, and the fit is better if the SER is smaller.

- R^2 is the proportion of the variation in the dependent variable that is explained by (the variation in) the independent variables. It is calculated as the ratio of the explained sum of squares to the total sum of squares:

R^2 = ESS / TSS = 1 − RSS / TSS

- Because OLS estimates parameters by finding the values that minimize the RSS, the OLS estimator also maximizes R^2.
- In the case of linear regression with a single regressor, R^2 is the squared correlation between the dependent variable and the explanatory variable.
- A model that is completely incapable of explaining the observed data has an R^2 of 0 (because all variation is in the residuals). A model that perfectly explains the data (so that all residuals are 0) has an R^2 of 1. All other models produce values between these two bounds, so that R^2 is never negative and never greater than 1.

When a model has multiple explanatory variables, R^{2} is a complicated function of the correlations among the explanatory variables and those between the explanatory variables and the dependent variable.

However, R^2 in a model with multiple regressors is the squared correlation between Y_i and the fitted value Ŷ_i.

This interpretation of R^2 provides another interpretation of the OLS estimator: the regression coefficients β̂_1, …, β̂_k are chosen to produce the linear combination of X_1, …, X_k that maximizes the correlation with Y.
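The equivalence of the two views of R^2 (ratio of sums of squares, and squared correlation between Y and Ŷ) can be checked numerically. A self-contained sketch on simulated (made-up) data:

```python
# Check that ESS/TSS equals corr(Y, Y_hat)^2 for an OLS fit with intercept.
import numpy as np

rng = np.random.default_rng(3)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 0.5 + 1.0 * x1 - 2.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ b

tss = np.sum((y - y.mean()) ** 2)
ess = np.sum((y_hat - y.mean()) ** 2)
r2_ratio = ess / tss                          # R^2 as ESS/TSS
r2_corr = np.corrcoef(y, y_hat)[0, 1] ** 2    # R^2 as squared correlation
```

Both computations give the same number, which is the multiple-regressor analogue of the single-regressor squared-correlation result.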

- While R^2 is a useful measure of model fit, it has three important limitations.
- Adding a new variable to the model always increases the R^2, even if the new variable has an insignificant effect on the dependent variable. For example, if a regression model with one explanatory variable is modified to have two explanatory variables, the new R^2 is greater than or equal to that of the original model. That is, if the original model is

Y_i = α + β_1 X_{1i} + ϵ_i

and the expanded model is

Y_i = α + β_1 X_{1i} + β_2 X_{2i} + ϵ_i

then the R^2 of the expanded model must be greater than or equal to the R^2 of the original model. This is because the expanded model always has the same TSS and nearly always has a smaller RSS, resulting in a higher R^2. The only situation where adding a variable does not increase R^2 is if β̂_2 = 0. In that case, the RSS remains the same (as does the R^2).

- The coefficient of determination R^2 cannot be compared across models with different dependent variables. For example, when Y_i is always positive, it is not possible to compare the R^2 of a model in levels (Y_i) and one in logs (ln Y_i). It is also not possible to compare the R^2 for two models that are logically equivalent (in the sense that both the fit of the model as measured by RSS and the predictions from the models are identical). This can occur when the dependent variable is transformed by adding or subtracting one or more of the explanatory variables.

- There is no general value that can be considered a "good" value for R^2. Whether a model provides a good description of the data depends on the nature of the data. For example, an R^2 that appears low may still be respectable for inherently noisy data such as asset returns, while an R^2 that appears high may be unimpressive for a highly predictable series.

- As discussed, R^2 mostly increases with the number of independent variables, even if the new independent variables contribute little to explaining the variation in the dependent variable. Hence, a high value of R^2 might falsely indicate high collective explanatory power of the independent variables when, in reality, it might just reflect the impact of a large set of independent variables.
- This limitation is addressed (in a limited way) through another measure known as the adjusted R^2 (written as R̄^2 or R_a^2), which adjusts the R^2 for the degrees of freedom (or the number of independent variables). It is defined as

R̄^2 = 1 − [(n − 1)/(n − k − 1)] × (RSS/TSS)

where

n is the number of observations in the sample, and

k is the number of explanatory variables included in the model (not including the constant term α).

- The adjusted R^2 can also be expressed as

R̄^2 = 1 − c(1 − R^2)

where the adjustment factor

c = (n − 1)/(n − k − 1)

Note that c must be greater than 1 because the denominator is less than the numerator.

- Including additional explanatory variables (i.e., increasing k) always increases R^2. The adjusted R^2 captures the tradeoff between the increase in k (which costs degrees of freedom) and the decrease in the RSS as models become larger. If a model with additional explanatory variables produces only a negligible decrease in the RSS compared to a base model, then the loss of a degree of freedom produces a smaller R̄^2.
- The adjustment to the R^2 may produce negative values if a model produces an exceptionally poor fit. In most financial data applications, n is relatively large, so the loss of a degree of freedom has little effect on R̄^2. In large samples the adjustment factor c is very close to 1, and so R̄^2 tends to increase even when an additional variable has little explanatory power.
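A sketch of the adjusted R^2 computation on simulated data, using the definition above. Adding a pure-noise regressor always raises R^2 but only sometimes raises R̄^2; all data below are made up:

```python
# Compare R^2 and adjusted R^2 for a base model and a model with one
# extra, irrelevant regressor (simulated data; n kept small so the
# degrees-of-freedom adjustment is visible).
import numpy as np

rng = np.random.default_rng(7)
n = 60
x1 = rng.normal(size=n)
noise_reg = rng.normal(size=n)                # irrelevant regressor
y = 1.0 + 0.8 * x1 + rng.normal(size=n)

def r2_and_adj(X, y):
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    rss = np.sum((y - X @ b) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    k = X.shape[1] - 1                        # regressors, excluding constant
    r2 = 1.0 - rss / tss
    adj = 1.0 - (len(y) - 1) / (len(y) - k - 1) * rss / tss
    return r2, adj

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([np.ones(n), x1, noise_reg])
r2_s, adj_s = r2_and_adj(X_small, y)
r2_b, adj_b = r2_and_adj(X_big, y)
```

R^2 can never fall when the extra column is added, while the adjusted version penalizes the lost degree of freedom, so R̄^2 is always at or below the corresponding R^2.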

- Testing a hypothesis about a single coefficient in a model with multiple regressors is identical to testing in a model with a single explanatory variable. Tests of the null hypothesis

H_0: β_j = β_j^0

are implemented using a t-test with sample test statistic

t = (β̂_j − β_j^0) / ŝ.e.(β̂_j)

where

ŝ.e.(β̂_j) is the estimated standard error of β̂_j.

However, the t-test is not directly applicable when testing complex hypotheses that involve more than one parameter, because the parameter estimators can be correlated. This correlation complicates extending the univariate t-test to tests of multiple parameters.

- Instead, the more common choice is an alternative called the F-test. This type of test compares the fit of the model (measured using the RSS) when the null hypothesis is true relative to the fit of the model without the restriction on the parameters assumed by the null.
- Implementing an F-test requires estimating two models. The first is the full model that is to be tested. This model is called the unrestricted model and has an RSS denoted by RSS_U. The second model, called the restricted model, imposes the null hypothesis on the unrestricted model, and its RSS is denoted RSS_R. The F-test compares the fit of these two models:

F = [(RSS_R − RSS_U) / q] / [RSS_U / (n − k_U − 1)]

where

q is the number of restrictions imposed on the unrestricted model to produce the restricted model, and

k_U is the number of explanatory variables in the unrestricted model.

The F-test statistic has an F_{q, n−k_U−1} distribution.

- F-tests can be equivalently expressed in terms of the R^2 from the restricted and unrestricted models. Using this alternative parameterization:

F = [(R_U^2 − R_R^2) / q] / [(1 − R_U^2) / (n − k_U − 1)]

- If the restriction imposed by the null hypothesis does not meaningfully alter the fit of the model, then the two RSS measures are similar, and the test statistic is small. On the other hand, if the unrestricted model fits the data significantly better than the restricted model, then the RSS values from the two models differ by a large amount, so the value of the F-test statistic is large. A large test statistic indicates that the unrestricted model provides a superior fit, and so the null hypothesis is rejected.
- Implementing an F-test requires imposing the null hypothesis on the model and then estimating the restricted model using OLS. For example, consider a test of whether CAPM, which only includes the market return as a factor, provides as good a fit as a multi-factor model that additionally includes the size and value factors. The unrestricted model includes all three explanatory variables:

R_i = α + β_m R_{mi} + β_s R_{si} + β_v R_{vi} + ϵ_i

where

m indicates the market (i.e., so that R_{mi} is the return to the market factor above the risk-free rate),

s indicates size, and

v indicates value.

The null hypothesis is then:

H_0: β_s = 0 and β_v = 0

The alternative hypothesis is that at least one of the parameters is not equal to zero:

H_1: β_s ≠ 0 or β_v ≠ 0

so that the null should be rejected if at least one of the coefficients is different from zero.

Imposing the null hypothesis requires replacing the parameters with their assumed values if the null is true. Imposing the null hypothesis on the unrestricted model produces the restricted model:

R_i = α + β_m R_{mi} + ϵ_i

which is the CAPM.

- In this hypothesis test, two coefficients are restricted to specific values, and so q = 2. The F-test is then computed by estimating both regressions, storing the two RSS values, and then computing:

F = [(RSS_R − RSS_U) / 2] / [RSS_U / (n − 4)]

- Finally, if the test statistic F is larger than the critical value of an F_{2, n−4} distribution using a size of α (e.g., 5%), then the null is rejected. If the test statistic is smaller than the critical value, then the null hypothesis is not rejected, and it is concluded that CAPM appears to be adequate in explaining the returns to the portfolio.
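The F statistic from the CAPM example above can be computed with a few lines; the sample size and RSS values below are made-up numbers purely to show the arithmetic:

```python
# Made-up inputs for the CAPM vs three-factor F-test (q = 2 restrictions).
n = 120            # hypothetical number of observations
q = 2              # restrictions: beta_s = 0 and beta_v = 0
k_u = 3            # explanatory variables in the unrestricted model
rss_r = 250.0      # hypothetical RSS of the restricted (CAPM) model
rss_u = 200.0      # hypothetical RSS of the unrestricted three-factor model

# F = [(RSS_R - RSS_U) / q] / [RSS_U / (n - k_U - 1)]
f_stat = ((rss_r - rss_u) / q) / (rss_u / (n - k_u - 1))
# (50 / 2) / (200 / 116) = 25 / 1.7241... = 14.5
```

The resulting statistic would then be compared with the critical value of an F_{2, 116} distribution at the chosen size to decide whether to reject the CAPM restriction.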

- The method for constructing a confidence interval for a single coefficient in the multiple regression model is also the same as in the single-regressor model.
- The confidence interval for β_j can be constructed as

β̂_j ± C_α × ŝ.e.(β̂_j)

or

[β̂_j − C_α × ŝ.e.(β̂_j), β̂_j + C_α × ŝ.e.(β̂_j)]

where C_α is the critical value corresponding to the chosen confidence level.
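A minimal numeric sketch of the interval, assuming a large sample so the two-sided 95% critical value is the normal 1.96 (the coefficient estimate and standard error below are made up):

```python
# Hypothetical 95% confidence interval for a single slope coefficient.
beta_hat = 1.2               # made-up OLS estimate of beta_j
se = 0.3                     # made-up estimated standard error
c = 1.96                     # two-sided 95% critical value (large sample)

ci_low = beta_hat - c * se   # 1.2 - 0.588 = 0.612
ci_high = beta_hat + c * se  # 1.2 + 0.588 = 1.788
```

A hypothesized value of β_j outside [0.612, 1.788] would be rejected by the corresponding two-sided t-test at the 5% level, which is the usual duality between confidence intervals and tests.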