Linear Regression - Data Science Discovery (2024)

Linear Regression

The idea of fitting a line as closely as possible to a set of points is known as linear regression. The most common technique is to fit the line that minimizes the sum of the squared vertical distances between the points and the line. This is called OLS, or Ordinary Least Squares, regression.
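As a minimal sketch of what "minimizes the squared distance" means, the Python below fits a line to a small made-up data set using the usual closed-form least-squares expressions, then checks that nudging the intercept or slope makes the sum of squared errors larger. The data values and the helper names `sum_squared_errors` and `least_squares_line` are assumptions for illustration, not part of the original page.

```python
# Illustrative data only (not from the original page).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def sum_squared_errors(intercept, slope):
    """Sum of squared vertical distances from each point to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

def least_squares_line():
    """Closed-form OLS intercept and slope for a single x-variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

b0, b1 = least_squares_line()
best = sum_squared_errors(b0, b1)

# Nudge the intercept or the slope away from the OLS values and recompute.
nudged = [sum_squared_errors(b0 + d0, b1 + d1)
          for d0, d1 in [(0.1, 0), (-0.1, 0), (0, 0.1), (0, -0.1)]]
print(round(best, 3), [round(v, 3) for v in nudged])
```

Every nudged line has a larger sum of squared errors than the fitted line, which is the sense in which the OLS line is the "closest" one.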

We can find the equation of this line and use it to make predictions. Since our regression estimates form a straight line, we can describe them using an equation in slope-intercept form:

Regression Equation

$$\hat{y} = b_0 + b_1 x_1$$

When we have one x-variable (x1) and one y-variable (y-hat), this is called simple linear regression. This means that we are using one independent variable to predict the y-variable. We can have multiple independent variables to predict the y-variable and this is called multiple regression. For now, we are going to focus on simple linear regression because it's easy to interpret the results.

The Slope and Y Intercept of the Regression Line

In our regression equation, b0 is the y-intercept and b1 is the slope. Here's how you calculate the slope and y-intercept:

[Image: formulas for the slope and y-intercept]

Here's how you interpret them:

  • SLOPE = The average change in Y associated with a 1-unit increase in X.
  • Y-INTERCEPT = The predicted value of Y when X is equal to 0.

In order to make predictions using the equation of the regression line, first find the slope and y-intercept. Next, you can plug in values of x to get predicted values of y.
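The two steps can be sketched in Python. This sketch assumes one standard form of the formulas: slope = r × (SD of y) / (SD of x) and y-intercept = (average of y) − slope × (average of x), where r is the correlation coefficient and the SDs divide by n. Other equivalent forms exist, and the data and function names here are made up for illustration.

```python
from statistics import mean, pstdev

# Illustrative data only (not from the original page).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def correlation(xs, ys):
    """Correlation r: the average product of the z-scores of x and y."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    return mean([((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)])

# Step 1: find the slope and y-intercept.
r = correlation(xs, ys)
slope = r * pstdev(ys) / pstdev(xs)
intercept = mean(ys) - slope * mean(xs)

# Step 2: plug in a value of x to get a predicted value of y.
def predict(x):
    return intercept + slope * x

print(round(slope, 3), round(intercept, 3), round(predict(3.5), 3))
```

For this data the slope is about 0.6 and the y-intercept about 2.2, so an x of 3.5 gives a predicted y of about 4.3.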

Warning About Regression

When making predictions using regression, it's important to be aware of the following:

  • Predicting y at values of x beyond the range of x in the data is called extrapolation.
  • This is risky because we have no evidence to believe that the association between x and y remains linear for unseen values of x.
  • Extrapolated predictions can be badly wrong.
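One way to guard against accidental extrapolation is to flag any prediction whose x falls outside the range seen in the data. The sketch below is only an illustration: the function name `predict_with_flag`, the coefficients, and the observed range of x are all hypothetical.

```python
def predict_with_flag(x, intercept, slope, x_min, x_max):
    """Return the prediction and whether x lies outside the observed x range."""
    prediction = intercept + slope * x
    is_extrapolation = x < x_min or x > x_max
    return prediction, is_extrapolation

# Hypothetical fitted line and observed range of x (assumed values).
B0, B1 = 2.2, 0.6
X_MIN, X_MAX = 1, 5

inside = predict_with_flag(3, B0, B1, X_MIN, X_MAX)    # x within the data
outside = predict_with_flag(10, B0, B1, X_MIN, X_MAX)  # x beyond the data

for label, (value, flagged) in [("x=3", inside), ("x=10", outside)]:
    note = "extrapolation, treat with caution" if flagged else "within observed range"
    print(label, round(value, 2), note)
```

The flag does not make the extrapolated value any more reliable; it only makes the risk visible to whoever uses the prediction.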

Residuals and RMSE

Unless there is a perfect correlation, our predictions are not going to be perfect. When thinking about this graphically, this means that for most of the points in any scatter plot, the actual y-values and the predicted y-values are different. The distance between the actual value and the predicted value from the line is called the residual or prediction error.

The residual is calculated as the actual value of y minus the predicted value of y.

The residuals are the vertical distances between the points and the line.

  • If the point is above the regression line, the residual is positive.
  • If the point is below the regression line, the residual is negative.
  • If the point is exactly on the regression line, the residual is 0.
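The sign rules above can be checked with a few lines of Python. The line (y-intercept 1, slope 2) and the three points are made up for illustration, chosen so that one point falls above the line, one below, and one exactly on it.

```python
def residual(x, y, intercept, slope):
    """Residual = actual y minus predicted y."""
    return y - (intercept + slope * x)

def position(res):
    """Describe where a point sits relative to the line, given its residual."""
    if res > 0:
        return "above"
    if res < 0:
        return "below"
    return "on"

# Hypothetical line and points (assumed values).
B0, B1 = 1, 2
points = [(1, 5), (2, 4), (3, 7)]

results = []
for x, y in points:
    res = residual(x, y, B0, B1)
    results.append((res, position(res)))

print(results)  # [(2, 'above'), (-1, 'below'), (0, 'on')]
```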

Two Key Features of the Regression Line:

  1. For any least-squares regression line, the average (and the sum) of the errors is always zero because the positives and negatives cancel out.
  2. The SD of the errors (also called the Root Mean Square Error or RMSE) is a measure of the typical spread of the data around the regression line.
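Both features can be verified numerically. The sketch below fits a least-squares line to a small made-up data set, then checks that the errors sum to (approximately) zero and that their SD matches the root mean square of the errors. It assumes the SD divides by n (Python's `statistics.pstdev`); the data is illustrative only.

```python
from math import sqrt
from statistics import mean, pstdev

# Illustrative data only (not from the original page).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Fit the least-squares line.
mean_x, mean_y = mean(xs), mean(ys)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Errors (residuals): actual y minus predicted y.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

total_error = sum(residuals)                      # feature 1: sums to ~0
rmse = sqrt(mean([e ** 2 for e in residuals]))    # root mean square of the errors
sd_errors = pstdev(residuals)                     # feature 2: SD of the errors

print(round(total_error, 9), round(rmse, 4), round(sd_errors, 4))
```

Because the errors average to zero, their SD and their root mean square are the same number, up to floating-point rounding.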

RMSE = SD of the errors: The SD of the prediction errors is a measure of how accurate our predictions are. The better the predictions, the smaller the errors and the smaller the RMSE.

Rather than finding all the errors and then taking their root mean square, it's much easier to use the formula below. The RMSE is in the same units as your y variable.

[Image: shortcut formula for the RMSE]
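A commonly used shortcut for the least-squares line is RMSE = sqrt(1 − r²) × (SD of y), where r is the correlation coefficient and the SD divides by n. Whether this is the exact form shown in the original figure is an assumption. The sketch below compares the shortcut with the direct calculation on made-up data.

```python
from math import sqrt
from statistics import mean, pstdev

# Illustrative data only (not from the original page).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

mean_x, mean_y = mean(xs), mean(ys)
sd_x, sd_y = pstdev(xs), pstdev(ys)

# Correlation r: the average product of the z-scores.
r = mean([((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys)])

# Shortcut (assumed form): no individual errors needed.
rmse_shortcut = sqrt(1 - r ** 2) * sd_y

# Direct route: fit the line, find every error, take the root mean square.
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x
errors = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
rmse_direct = sqrt(mean([e ** 2 for e in errors]))

print(round(rmse_shortcut, 4), round(rmse_direct, 4))
```

The two values agree up to floating-point rounding, and both are in the units of y.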

Video 1: Simple Linear Regression

Follow along with the worksheet to work through the problem:

Video 2: Residuals and RMSE

Follow along with the worksheet to work through the problem:

Q1: Which one is better?

Q2: [Image: plot of a straight line]
What is the Y-INTERCEPT for the given straight line?

Q3: Suppose we have clinical data for 400 patients and the task is to predict if a patient has cancer from the given data. Should we use linear regression in this situation?
