Correlated and scalar subqueries in SQL

This recipe explains correlated and scalar subqueries in SQL using SQL server

Correlated and scalar subqueries in SQL

In this tutorial, let us understand what are subqueries and what are the different types of subqueries with clear examples.

Generally, subqueries are nothing but a query within a query. There are a lot of ways in which we can write subqueries in a query. Subqueries can sometimes be called as inner queries and the main queries are called as outer queries. Subqueries can be inserted in various places like SELECT, FROM, or WHERE clause of the main query.

There are three main types of subqueries

  1. Scalar subquery : A subquery that returns just one row and one column. Or in general terms, a subquery that returns just one value
  2. Multiple row / column subqueries : A subquery that returns multiple rows or multiple columns or both.
  3. Correlated subquery : A subquery that depends on the results of the outer query.

Let us understand Scalar and Correlated subqueries in detail with an example in this tutorial.

In this tutorial we are using a simple employee table which has 7 columns and 49 rows. The sample data is shown below.

Scalar subqueries

Scalar subqueries are fairly easy to understand. Let us consider an example where we want to find all the employees from the employees table whose salary is greater than the average salary. The best way to find this out is by using the concept of scalar subquery. Take a look at the following query.

In the WHERE clause of the query, we can see another simple query which just returns the average of all the salaries. If this query is executed separately, the result obtained is 5818 which the average of all the salaries. This value can now be used as a condition for the WHERE clause in the outer query.

The result of the above query is

EMPLOYEE_ID FIRST_NAME LAST_NAME JOB_TYPE SALARY MANAGER_ID DEPARTMENT_ID
4 Michael Hartstein MK_MAN 13000 100 20
5 Pat Fay MK_REP 6000 201 20
6 Susan Mavris HR_REP 6500 101 40
7 Hermann Baer PR_REP 10000 101 70
8 Shelley Higgins AC_MGR 12008 101 110
9 William Gietz AC_ACCOUNT 8300 205 110
10 Neena Kochhar AD_VP 17000 100 90
11 Lex De Haan AD_VP 17000 100 90
12 Alexander Hunold IT_PROG 9000 102 60
13 Bruce Ernst IT_PROG 6000 103 60
17 Nancy Greenberg FI_MGR 12008 101 100
18 Daniel Faviet FI_ACCOUNT 9000 108 100
19 John Chen FI_ACCOUNT 8200 108 100
20 Ismael Sciarra FI_ACCOUNT 7700 108 100
21 Jose Manuel Urman FI_ACCOUNT 7800 108 100
22 Luis Popp FI_ACCOUNT 6900 108 100
23 Den Raphaely PU_MAN 11000 100 30
29 Matthew Weiss ST_MAN 8000 100 50
30 Adam Fripp ST_MAN 8200 100 50
31 Payam Kaufling ST_MAN 7900 100 50
32 Shanta Vollman ST_MAN 6500 100 50

Scalar subqueries can be used within various clauses such as SELECT, FROM, WHERE, JOIN, UPDATE, DELETE INSERT and so on. Let us try to reframe the same example by replacing the subquery within a different clause.

The above query also returns the same results as that of the previous query.

Correlated subqueries

Correlated subqueries can be used in scenarios where a subquery needs to be executed for each candidate row in the outer query. In correlated subqueries, the subquery returns a different set of results for each candidate row considered by the main query. This way the outer query also depends on the inner query.

The best way to understand this is by considering an example. In the previous example we saw how to fetch the employee details of those whose salary is greater than the overall average salary. In this example, let us try to find out the employees whose salary is greater than the average salary of their department.

In the above query, we can see that the DEPARTMENT_ID of the outer table is being used inside the WHERE clause of the inner query. Here, for each row corresponding to the outer query,we calculate the average salary of that department and compare it with the current salary.

The result of the above query is

EMPLOYEE_ID FIRST_NAME LAST_NAME JOB_TYPE SALARY MANAGER_ID DEPARTMENT_ID
4 Michael Hartstein MK_MAN 13000 100 20
23 Den Raphaely PU_MAN 11000 100 30
29 Matthew Weiss ST_MAN 8000 100 50
30 Adam Fripp ST_MAN 8200 100 50
31 Payam Kaufling ST_MAN 7900 100 50
32 Shanta Vollman ST_MAN 6500 100 50
33 Kevin Mourgos ST_MAN 5800 100 50
12 Alexander Hunold IT_PROG 9000 102 60
13 Bruce Ernst IT_PROG 6000 103 60
17 Nancy Greenberg FI_MGR 12008 101 100
18 Daniel Faviet FI_ACCOUNT 9000 108 100
8 Shelley Higgins AC_MGR 12008 101 110

Hope this tutorial was helpful!

What Users are saying..

profile image

Savvy Sahai

Data Science Intern, Capgemini
linkedin profile url

As a student looking to break into the field of data engineering and data science, one can get really confused as to which path to take. Very few ways to do it are Google, YouTube, etc. I was one of... Read More

Relevant Projects

NLP Project to Build a Resume Parser in Python using Spacy
Use the popular Spacy NLP python library for OCR and text classification to build a Resume Parser in Python.

A/B Testing Approach for Comparing Performance of ML Models
The objective of this project is to compare the performance of BERT and DistilBERT models for building an efficient Question and Answering system. Using A/B testing approach, we explore the effectiveness and efficiency of both models and determine which one is better suited for Q&A tasks.

Build a Graph Based Recommendation System in Python -Part 1
Python Recommender Systems Project - Learn to build a graph based recommendation system in eCommerce to recommend products.

Multilabel Classification Project for Predicting Shipment Modes
Multilabel Classification Project to build a machine learning model that predicts the appropriate mode of transport for each shipment, using a transport dataset with 2000 unique products. The project explores and compares four different approaches to multilabel classification, including naive independent models, classifier chains, natively multilabel models, and multilabel to multiclass approaches.

Time Series Analysis with Facebook Prophet Python and Cesium
Time Series Analysis Project - Use the Facebook Prophet and Cesium Open Source Library for Time Series Forecasting in Python

Learn How to Build a Logistic Regression Model in PyTorch
In this Machine Learning Project, you will learn how to build a simple logistic regression model in PyTorch for customer churn prediction.

Model Deployment on GCP using Streamlit for Resume Parsing
Perform model deployment on GCP for resume parsing model using Streamlit App.

Forecasting Business KPI's with Tensorflow and Python
In this machine learning project, you will use the video clip of an IPL match played between CSK and RCB to forecast key performance indicators like the number of appearances of a brand logo, the frames, and the shortest and longest area percentage in the video.

NLP Project for Multi Class Text Classification using BERT Model
In this NLP Project, you will learn how to build a multi-class text classification model using using the pre-trained BERT model.

NLP Project for Beginners on Text Processing and Classification
This Project Explains the Basic Text Preprocessing and How to Build a Classification Model in Python