
Goal 4

Correlation and Regression

Linear regression is one of the essential tools in statistical analysis. In this course, we'll walk through step-by-step how to conduct many important analyses using SPSS. 

Although you will learn the basics of what these statistics are, we'll avoid complicated mathematical discussions and go right to what you need to know to conduct these analyses.

Linear regression lets you test relationships among several variables at the same time, control for the effects of other variables, and build simple statistical models that make predictions.

In this course, we'll cover the following key topics:

  1. Correlations: You probably already know this, but understanding how to test the correlation between two variables gets us started in this course.

  2. Simple Linear Regression: Taking correlations one step further by creating a statistical model.

  3. Multiple Linear Regression: Being able to test multiple predictors at the same time and testing the unique effect of each.

  4. Hierarchical Linear Regression: How to test for the influence of different variables by adding them to the model one at a time.

  5. Interaction Analysis: How to test whether there's a two-way interaction between variables (also known as a "moderator" analysis).
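To make the first two topics concrete, here is a minimal sketch in Python rather than SPSS. It computes a Pearson correlation and then fits a simple linear regression by ordinary least squares. The data (`hours`, `score`) and the function names are made up for illustration and do not come from the course materials.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def simple_regression(x, y):
    """Least-squares slope and intercept for the model y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

r = pearson_r(hours, score)
slope, intercept = simple_regression(hours, score)
predicted = intercept + slope * 6  # model-based prediction for 6 hours

print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"predicted score at 6 hours = {predicted:.1f}")
```

The correlation only says how strongly the two variables move together. The regression goes one step further and gives an equation you can use to predict a score from a new value of hours.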


Choose the best statistical tool

Statistical Test Selector by Laerd

Work through the steps below to select the appropriate statistical test
for your research. If we do not have a study design that matches your own

Market Analysis

Regression and Correlation

These days many employees spend part of their work hours on the Internet doing personal things unrelated to their jobs. This is called “cyberloafing.” Research at ECU by Mike Sage, a graduate student in Industrial/Organizational Psychology, has related the frequency of cyberloafing to personality and age.

Reviewing Graphs

Suggested statistical tests

Descriptive statistics 

Skewed data

Recoding and creating new variables

Chi-squared test

Mann-Whitney U test

Kruskal-Wallis test

Logistic regression

Multiple linear regression

Research #4 - Titanic

The Titanic sank in 1912 with the loss of most of its passengers. Details are available for 1309 passengers and crew on board. The main use of this data set is Chi-squared tests and logistic regression, with survival as the key dependent variable. Summary statistics can be demonstrated for the categorical variables, and the cost of the ticket (fare) is very skewed, so it can be used to demonstrate skewed data and the difference between means and medians.
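The two ideas above can be sketched in a few lines of Python. The fares and the 2x2 counts below are invented for illustration and are not taken from the Titanic data set. The Chi-squared value is the plain Pearson statistic with no continuity correction, so it will not match a continuity-corrected value reported for a 2x2 table.

```python
from statistics import mean, median

# Hypothetical right-skewed fares: one very expensive ticket pulls the mean up.
fares = [7.25, 7.9, 8.05, 10.5, 13.0, 26.0, 30.0, 263.0]
fare_mean = mean(fares)
fare_median = median(fares)
print(f"mean fare = {fare_mean:.2f}, median fare = {fare_median:.2f}")

def chi_squared(table):
    """Pearson Chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows are two passenger groups,
# columns are survived / did not survive.
counts = [[30, 10],
          [20, 40]]
stat = chi_squared(counts)
print(f"Chi-squared = {stat:.3f} on 1 degree of freedom")
```

With skewed data like these fares, the median describes a typical value better than the mean, which is why both are worth reporting.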

Medical Record Analysis

Research #5 - Psychological

Logistic regression is used to predict a categorical (usually dichotomous) variable from a set of predictor variables. 
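As a rough illustration of the idea, the sketch below fits a logistic regression with one predictor to a dichotomous outcome. It uses plain gradient descent on the log-likelihood, which is a simplification chosen for readability and not the estimation routine a statistics package uses. The data and all names are hypothetical.

```python
from math import exp

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))

def fit_logistic(x, y, learning_rate=0.1, steps=5000):
    """Estimate b0 and b1 in P(y = 1) = sigmoid(b0 + b1 * x) by gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        grad0 = 0.0
        grad1 = 0.0
        for xi, yi in zip(x, y):
            error = sigmoid(b0 + b1 * xi) - yi  # predicted probability minus outcome
            grad0 += error
            grad1 += error * xi
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1

# Hypothetical data: a numeric predictor and a yes/no (1/0) outcome.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(x, y)
p_low = sigmoid(b0 + b1 * 1)   # predicted probability at x = 1
p_high = sigmoid(b0 + b1 * 8)  # predicted probability at x = 8
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
print(f"P(y=1 | x=1) = {p_low:.2f}, P(y=1 | x=8) = {p_high:.2f}")
```

The model outputs a probability for each case. A positive `b1` means the outcome becomes more likely as the predictor increases.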

As an example of the use of logistic regression in psychological research, consider the research done by Wuensch and Poteat and published in the Journal of Social Behavior and Personality, 1998, 13, 139-150.


Research #6 - Financial Distress

This data set deals with financial distress prediction for a sample of companies.


Which features are most indicative of financial distress?

What types of machine learning models perform best on this dataset?


Research #7 - Pizza Store Evaluation

PizzaOL is a new pizzeria in the city that has been selling pizza exclusively online, through a mobile app, for more than 2 years. Customers can pay via e-payment methods such as credit card or PizzaPay. PizzaOL's owner has appointed you to recommend key indicators for assessing PizzaOL's performance and to build its strategy for the next year.


The owner has assigned this task to you to help her understand her customers in a more objective manner. She also wants to plan for next year and allocate the budget optimally. To do so, you will need to perform different data analysis tasks (see sheet Requirements) in order to capture the effect of different factors and variables on other key indicators.


Capstone Project Employee Attrition

This project is based on a hypothetical dataset downloaded from IBM HR Analytics Employee Attrition & Performance. It has 1,470 data points (rows) and 35 features (columns) describing each employee’s background and characteristics. Each row is labelled (for supervised learning) with whether the employee is still with the company or has gone to work somewhere else. Machine learning models can help determine how these factors relate to workforce attrition.


Perform exploratory data analysis to find a pattern or find and filter the criteria which are most responsible for attrition.

Arabic document

Here you can find some SPSS books and notes

Arabic Article 

Here you can find a very useful article that guides you in choosing the most appropriate statistical test

Statistical Analysis

Here you can find a map that will help you to follow the best track for statistical analysis in SPSS

VIF Issue

If you have any statistical interpretation issue(s), here you can find the solution
