by Xiangyu Wang

with Greg Page

Leverage in Simple Linear Regression: Overview

In this article, we discuss high-leverage points in simple linear regression (SLR). Simply put, a high-leverage point is an observation whose independent variable value is unusually far from the mean of that variable, in either direction (very large or very small). Such points are noteworthy because they have the potential to exert considerable “pull”, or leverage, on the model’s best-fit line.

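The mathematical formula used to calculate the leverage score for any particular input value in an SLR model is shown here:

$$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}$$

where $n$ is the number of observations, $x_i$ is the input value of interest, and $\bar{x}$ is the mean of the input values.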
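As a minimal sketch of how the standard textbook SLR leverage calculation might look in Python: the function name `leverage_scores`, the sample data, and the 2p/n cutoff (a common rule of thumb, with p = 2 coefficients in SLR) are illustrative assumptions, not taken from the article.

```python
from statistics import mean


def leverage_scores(x):
    """Standard SLR leverage: h_i = 1/n + (x_i - x_bar)^2 / sum_j (x_j - x_bar)^2."""
    n = len(x)
    x_bar = mean(x)
    # Total squared deviation of the inputs from their mean
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical; leverage is undefined")
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]


# Hypothetical input values; 20 sits far from the rest
x = [1, 2, 3, 4, 20]
h = leverage_scores(x)
print([round(v, 3) for v in h])  # [0.3, 0.264, 0.236, 0.216, 0.984]

# Assumed rule of thumb: flag points whose leverage exceeds 2p/n, with p = 2
cutoff = 2 * 2 / len(x)
flagged = [xi for xi, hi in zip(x, h) if hi > cutoff]
print(flagged)  # [20]
```

In this sketch the leverage scores sum to 2, the number of coefficients in an SLR model (intercept and slope), which is a handy sanity check on the calculation.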

Further down in this post, we will show a step-by-step breakdown of how this…

Scatterplots are a valuable tool for understanding the relationship between two numeric variables. They can show us the direction of a correlation (positive or negative) and the type of relationship (linear or non-linear) between the variables. In addition, they can help with outlier detection and with understanding the overall distribution of each variable shown.

When we wish to analyze truly massive datasets, however, scatterplots’ value can be limited because of a problem known as overplotting. Overplotting is the result of too many data points landing atop one another, thereby rendering many of the individual…

Xiangyu Wang

Master's degree in Applied Business Analytics from Boston University.

