What Is Regression?
Regression is a statistical method used in finance, investing, and other disciplines that attempts to determine the strength and character of the relationship between a dependent variable and one or more independent variables.
Linear regression is the most common form of this technique. Also called simple regression or ordinary least squares (OLS), linear regression establishes the linear relationship between two variables.
Linear regression is graphically depicted using a straight line of best fit, with the slope representing how a change in the independent variable is associated with a change in the dependent variable. The y-intercept of a linear regression relationship represents the value of the dependent variable when the value of the independent variable is zero. Nonlinear regression models also exist, but are far more complex.
Key Takeaways
- Regression is a statistical technique that relates a dependent variable to one or more independent variables.
- A regression model is able to show whether changes observed in the dependent variable are associated with changes in one or more of the independent variables.
- It does this by essentially determining a best-fit line and seeing how the data is dispersed around this line.
- Regression helps economists and financial analysts in things ranging from asset valuation to making predictions.
- For regression results to be properly interpreted, several assumptions about the data and the model itself must hold.
In economics, regression is used to help investment managers value assets and understand the relationships between factors such as commodity prices and the stocks of businesses dealing in those commodities.
While a powerful tool for uncovering the associations between variables observed in data, it cannot easily indicate causation. Regression as a statistical technique should not be confused with the concept of regression to the mean, also known as mean reversion.
Understanding Regression
Regression captures the correlation between variables observed in a data set and quantifies whether those correlations are statistically significant or not.
The two basic types of regression are simple linear regression and multiple linear regression, although there are nonlinear regression methods for more complicated data and analysis. Simple linear regression uses one independent variable to explain or predict the outcome of the dependent variable Y, while multiple linear regression uses two or more independent variables to predict the outcome. Analysts can use stepwise regression to examine each independent variable contained in the linear regression model.
Regression can help finance and investment professionals. For instance, a company might use it to predict sales based on weather, previous sales, gross domestic product (GDP) growth, or other types of conditions. The capital asset pricing model (CAPM) is an often-used regression model in finance for pricing assets and discovering the costs of capital.
Regression and Econometrics
Econometrics is a set of statistical techniques used to analyze data in finance and economics. An example of the application of econometrics is to study the income effect using observable data. An economist may, for example, hypothesize that as a person increases their income, their spending will also increase.
If the data show that such an association is present, a regression analysis can then be conducted to understand the strength of the relationship between income and consumption and whether or not that relationship is statistically significant.
Note that you can have several independent variables in an analysis—for example, changes to GDP and inflation in addition to unemployment in explaining stock market prices. When more than one independent variable is used, it is referred to as multiple linear regression. This is the most commonly used tool in econometrics.
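The multiple-regression idea above can be sketched in a few lines of Python. This is a minimal illustration with simulated data, not real economic figures: a hypothetical "market" series is explained by GDP growth, inflation, and unemployment, and the coefficients are recovered by least squares.

```python
import numpy as np

# Hypothetical illustration: explain a simulated "market" series with
# three macro variables. All data here are synthetic, not real figures.
rng = np.random.default_rng(0)
n = 100
gdp = rng.normal(2.0, 0.5, n)
inflation = rng.normal(3.0, 1.0, n)
unemployment = rng.normal(5.0, 0.8, n)
noise = rng.normal(0.0, 1.0, n)
market = 10 + 4.0 * gdp - 1.5 * inflation - 2.0 * unemployment + noise

# Design matrix with a leading column of ones for the intercept
X = np.column_stack([np.ones(n), gdp, inflation, unemployment])
coef, *_ = np.linalg.lstsq(X, market, rcond=None)
a, b1, b2, b3 = coef
print(a, b1, b2, b3)  # estimates should land near 10, 4.0, -1.5, -2.0
```

Because the data were generated with known coefficients, the fitted values come out close to them; with real data, the true coefficients are of course unknown.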
Econometrics is sometimes criticized for relying too heavily on the interpretation of regression output without linking it to economic theory or looking for causal mechanisms. It is crucial that the findings revealed in the data are able to be adequately explained by a theory.
Calculating Regression
Linear regression models often use a least-squares approach to determine the line of best fit. The least-squares technique chooses the line that minimizes the sum of squared residuals, where each residual is the vertical distance between a data point and the fitted regression line.
Once this process has been completed (usually done today with software), a regression model is constructed. The general form of each type of regression model is:
Simple linear regression:
Y = a + bX + u
Multiple linear regression:
Y = a + b1X1 + b2X2 + b3X3 + ... + btXt + u

where:

- Y = The dependent variable you are trying to predict or explain
- X = The explanatory (independent) variable(s) you are using to predict or associate with Y
- a = The y-intercept
- b = The beta coefficient(s), i.e., the slope of the explanatory variable(s)
- u = The regression residual or error term
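The simple form Y = a + bX + u can be fitted in one line with NumPy. This sketch uses simulated data with a known intercept and slope so the least-squares estimates can be checked against them; `np.polyfit` with degree 1 minimizes the sum of squared residuals described above.

```python
import numpy as np

# Minimal sketch of fitting Y = a + bX + u by least squares,
# using synthetic data with known a = 1.0 and b = 2.5.
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, 50)
Y = 1.0 + 2.5 * X + rng.normal(0, 0.5, 50)

# A degree-1 polyfit returns (slope, intercept)
b, a = np.polyfit(X, Y, 1)

residuals = Y - (a + b * X)
sse = np.sum(residuals ** 2)  # the quantity least squares minimizes
print(a, b, sse)
```

In practice this is usually done with statistical software that also reports standard errors and significance tests, but the fitted line itself is just this minimization.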
Example of How Regression Analysis Is Used in Finance
Regression is often used to determine how specific factors—such as the price of a commodity, interest rates, particular industries, or sectors—influence the price movement of an asset. The aforementioned CAPM is based on regression, and it's utilized to project the expected returns for stocks and to generate costs of capital. A stock’s returns are regressed against the returns of a broader index, such as the S&P 500, to generate a beta for the particular stock.
Beta is the stock’s risk in relation to the market or index and is reflected as the slope in the CAPM. The return for the stock in question would be the dependent variable Y, while the independent variable X would be the market risk premium.
Additional variables such as the market capitalization of a stock, valuation ratios, and recent returns can be added to the CAPM to get better estimates for returns. These additional factors are known as the Fama-French factors, named after the professors who developed the multiple linear regression model to better explain asset returns.
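The beta estimation described above reduces to a simple regression of the stock's excess returns on the market's excess returns. The sketch below uses simulated daily return series (a real analysis would use historical returns for the stock and an index such as the S&P 500); the slope of the fitted line is the beta estimate and the intercept is the alpha.

```python
import numpy as np

# Hedged sketch: estimating a stock's beta. Return series are
# simulated with a known true beta of 1.3, not real market data.
rng = np.random.default_rng(1)
n = 252  # roughly one year of daily returns
market_excess = rng.normal(0.0004, 0.01, n)
stock_excess = 0.0001 + 1.3 * market_excess + rng.normal(0.0, 0.008, n)

# Slope = beta, intercept = alpha
beta, alpha = np.polyfit(market_excess, stock_excess, 1)
print(beta, alpha)
```

With real data, the beta obtained this way depends on the sample period and return frequency chosen, which is why published betas for the same stock often differ.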
Why Is It Called Regression?
Although there is some debate about the origins of the name, the statistical technique described above most likely was termed “regression” by Sir Francis Galton in the 19th century to describe the statistical feature of biological data (such as heights of people in a population) to regress to some mean level. In other words, while there are shorter and taller people, only outliers are very tall or short, and most people cluster somewhere around (or “regress” to) the average.
What Is the Purpose of Regression?
In statistical analysis, regression is used to identify the associations between variables occurring in some data. It can show the magnitude of such an association and determine its statistical significance. Regression is a powerful tool for statistical inference and has been used to try to predict future outcomes based on past observations.
How Do You Interpret a Regression Model?
A regression model output may be in the form Y = 1.0 + 3.2X1 - 2.0X2 + 0.21.
Here we have a multiple linear regression that relates some variable Y with two explanatory variables X1 and X2. We would interpret the model as follows: holding X2 constant, every one-unit increase in X1 is associated with a 3.2-unit increase in Y (if X1 goes up by 2, Y goes up by 6.4, and so on). Likewise, holding X1 constant, every one-unit increase in X2 is associated with a 2.0-unit decrease in Y. We can also note the y-intercept of 1.0, meaning that Y = 1 when X1 and X2 are both zero. The final term, 0.21, is the regression residual or error term.
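The interpretation above can be verified directly by treating the fitted part of the model as a plain function (the error term averages to zero, so predictions use only the fitted coefficients):

```python
# The example model Y = 1.0 + 3.2*X1 - 2.0*X2 as a prediction function.
def predict(x1: float, x2: float) -> float:
    return 1.0 + 3.2 * x1 - 2.0 * x2

# Intercept: Y equals 1.0 when both explanatory variables are zero
print(predict(0, 0))
# Holding X2 fixed, a one-unit increase in X1 raises Y by 3.2 units
print(predict(1, 0) - predict(0, 0))
# Holding X1 fixed, a one-unit increase in X2 lowers Y by 2.0 units
print(predict(0, 1) - predict(0, 0))
```

Note the coefficients describe unit changes, not multiplicative effects: a one-unit change in X1 adds 3.2 to Y rather than scaling it by 3.2.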
What Are the Assumptions That Must Hold for Regression Models?
To properly interpret the output of a regression model, the following main assumptions about the underlying data process of what you are analyzing must hold:
- The relationship between variables is linear;
- There must be homoskedasticity, meaning the variance of the error term must remain constant across observations;
- All explanatory variables are independent of one another (no multicollinearity);
- The residuals (error terms) are normally distributed.
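The assumptions above can be given quick numeric sanity checks on the residuals of a fitted model. The sketch below uses simulated data that satisfies the assumptions by construction; these are rough checks, not formal tests (formal counterparts include Breusch-Pagan for homoskedasticity and Shapiro-Wilk for normality).

```python
import numpy as np

# Illustrative residual diagnostics on a simple simulated regression.
rng = np.random.default_rng(7)
x = rng.uniform(0, 10, 200)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, 200)

b, a = np.polyfit(x, y, 1)
resid = y - (a + b * x)

# Homoskedasticity: residual spread should be similar across the range of x
low, high = resid[x < 5], resid[x >= 5]
print(low.std(), high.std())  # roughly equal under constant variance

# Normality (rough check): skewness near 0 for symmetric residuals
skew = np.mean((resid - resid.mean()) ** 3) / resid.std() ** 3
print(skew)
```

When these checks fail on real data, common remedies include transforming variables, using robust standard errors, or choosing a different model specification.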
The Bottom Line
Regression is a statistical method that tries to determine the strength and character of the relationship between one dependent variable and a series of other variables. It is used in finance, investing, and other disciplines.
Regression analysis uncovers the associations between variables observed in data, but cannot easily indicate causation.