SQL Joins Explained

Discover the fundamentals of the SQL join types in this comprehensive guide. Learn how to use inner, outer, left, right, and self joins to combine data from multiple tables, with clear examples and easy-to-follow explanations.
Dec 5, 2024
12 min read

Introduction

Ever felt like you're juggling multiple tables of data and wish there was an easier way to piece it all together? Welcome to the world of SQL joins! In this SQL joins tutorial, we’ll show you how to merge data from different tables without the confusion. Whether you're new to SQL or just need a refresher, this guide breaks down the different types of joins—minus the tech jargon and complicated explanations. From inner to outer joins and everything in between, we’ll walk through how they can help you get the answers you need from your data. Ready to dive in? Let’s get started!

Primary Keys and Foreign keys

In a relational database management system (RDBMS), tables are often connected through primary keys and foreign keys. A primary key is a unique identifier for each row in a table. It ensures that no two rows have the same value in this column (or set of columns), and every row must have a value (i.e., no NULLs).

A foreign key is a column (or set of columns) in one table that points to the primary key in another table. It creates a relationship between two tables, ensuring referential integrity (i.e., the foreign key value in one table must exist as a primary key value in the referenced table).

Join types

While there are many Join techniques in SQL, there are 5 primary Join types. 

  1. Inner Join
  2. Left Join or Left Outer Join
  3. Right Join or Right Outer Join
  4. Full Outer Join
  5. Cross Join

1. Inner Join

An inner join in SQL can be thought of as finding the common ground between two tables. Imagine you have two lists (or tables) of information, and you want to combine the data only where there’s a match between them.

Here’s a simple analogy:

Example Scenario

You have two lists:

  • List A: A list of people and their favorite colours.
  • List B: A list of people and their favorite fruits.

Now, you want to create a new list that only includes people who appear on both lists.

Syntax

SELECT *

FROM FavoriteColours

INNER JOIN FavoriteFruits

ON FavoriteColours.StudentID = FavoriteFruits.StudentID;

In this case, an inner join would give you a new list that only includes John and Alice (because they are in both lists). Bob and Eve are excluded because they don't have a match in the other list.

Explanation

  • SELECT * Retrieves all columns from both tables involved in the join.
  •  FROM FavoriteColours Specifies the first table (FavoriteColours).
  • INNER JOIN FavoriteFruits Joins the FavoriteColors table with the FavoriteFruits table.
  • ON FavoriteColors.StudentID = FavoriteFruits.StudentID Specifies the join condition based on the StudentID column.

Key Points

  • Inner join looks for rows in both tables that match based on a common column (in this case, the Name).
  • If there’s no match, those rows are left out.
  • You only get the rows where data is present in both tables.

In short, use inner joins to connect related information from two different sources, but only when that information overlaps.

2. Natural Joins

A natural join is like an inner join with a twist: it automatically looks for columns in two tables that have the same name and joins the tables based on those matching columns without you having to specify the condition manually.

Example Scenario

You have two lists (tables) about students. You're going to combine them based on some shared details, but you don’t need to say what that detail is—SQL figures it out on its own.

StudentInfo:

StudentGrades

 Syntax

SELECT *

FROM StudentInfo

NATURAL JOIN StudentGrades;

Explanation

- SELECT * Selects all columns from both tables.

- FROM StudentInfo Specifies the first table (`StudentInfo`).

- NATURAL JOIN StudentGrades Joins `StudentInfo` with `StudentGrades` based on columns with the same names (`StudentID` in this case).

Result

This query will automatically join the tables based on the `StudentID` column and return only the rows where there is a match in both tables.

Summary

- The natural join automatically detects and uses the column(s) with the same name in both tables to perform the join.

- It simplifies the query by removing the need to specify join conditions manually.

3. Left (outer) Join

A left join (or left outer join) is a way to combine two tables in SQL. It’s like saying, "Give me everything from the left table, and match it with data from the right table. If there's no match, still include the left table’s data and put NULL for the missing parts from the right table."

Example Scenario

Imagine you’re having a party and you make a guest list (left table). You also want to know which guests have RSVP'd (right table). The left join helps you see all your guests and also shows whether they’ve RSVP'd or not. If a guest hasn't RSVP'd, you’ll still see their name on the list, but the RSVP status will be blank. You have two lists:

You want to see all guests and whether they’ve RSVP'd. If they haven’t RSVP'd, you still want to see their name.

Syntax

SELECT GuestList.GuestID, GuestList.Name, RSVPs.RSVPStatus

FROM GuestList

LEFT JOIN RSVPs

ON GuestList.GuestID = RSVPs.GuestID;

Explanation

SELECT GuestList.GuestID, GuestList.Name, RSVPs.RSVPStatus selects the columns to show.

FROM GuestList specifies the left table (GuestList).

LEFT JOIN RSVPs joins the GuestList with the RSVPs table.

ON GuestList.GuestID = RSVPs.GuestID matches rows based on the GuestID column.

This query gives you a complete list of guests with their RSVP status, showing NULL where there’s no matching RSVP.

4. Left excluding Join

A left excluding join (sometimes called a left anti join) is used to find rows in the left table that do not have a match in the right table. It’s like saying, "Give me everything from the left table, but exclude anything that matches with the right table."

Using the same example, Left excluding join gives us a different result:

Syntax

SELECT InvitedGuests.GuestID, InvitedGuests.Name

FROM InvitedGuests

LEFT JOIN RSVPs

ON InvitedGuests.GuestID = RSVPs.GuestID

WHERE RSVPs.GuestID IS NULL;

Explanation

The result includes only those guests from the InvitedGuests table who do not have a matching GuestID in the RSVPs table. Here, Bob is included because he hasn’t RSVP'd, whereas John and Alice are excluded as they have matches in the RSVPs table.

5. Right (outer) Join

A right excluding join (or right anti join) is used to find rows in the right table that do not have a match in the left table. It’s like saying, "Give me everything from the right table, but exclude anything that matches with the left table." In other words, it will give you the same output as a left outer join if you reverse the positions of the “left” and “right tables”.

Example Scenario

Continuing with the party theme, imagine you’re hosting a party and you have a list of guests who have RSVP'd (right table). You also have a list of all the invited guests (left table). You want to see all the RSVP'd guests and any information about them from the invitation list if available. If an RSVP'd guest is not on the invitation list, you still want to see their RSVP status.

You want to see all the guests who have RSVP'd, and also see if they are on the invitation list. If a guest has RSVP'd but is not on the invitation list, you still want to see their RSVP status.

Syntax

SELECT RSVPs.GuestID, RSVPs.RSVPStatus, InvitedGuests.Name

FROM RSVPs

RIGHT JOIN InvitedGuests

ON RSVPs.GuestID = InvitedGuests.GuestID;

Explanation

  • SELECT RSVPs.GuestID, RSVPs.RSVPStatus, InvitedGuests.Name selects the columns to show from both tables.
  • FROM RSVPs specifies the right table (RSVPs).
  • RIGHT JOIN InvitedGuests joins the RSVPs table with the InvitedGuests table, focusing on including all rows from the right table.
  • ON RSVPs.GuestID = InvitedGuests.GuestID specifies the condition for the join based on GuestID.

6. Right excluding Join 

A right excluding join (or right anti join) is used to find rows in the right table that do not have a match in the left table. It’s like saying, "Give me everything from the right table, but exclude anything that matches with the left table."

Using the above example, this time you only want to find out which guests who have RSVP'd were not in the invitation list.

SELECT RSVPs.GuestID, RSVPs.RSVPStatus

FROM RSVPs

RIGHT JOIN InvitedGuests

ON RSVPs.GuestID = InvitedGuests.GuestID

WHERE InvitedGuests.GuestID IS NULL;

Explanation

  • SELECT RSVPs.GuestID, RSVPs.RSVPStatus selects the columns to show from the right table.
  • FROM RSVPs specifies the right table (RSVPs).
  • RIGHT JOIN InvitedGuests joins the RSVPs table with the InvitedGuests table.
  • ON RSVPs.GuestID = InvitedGuests.GuestID specifies the condition for the join based on GuestID.
  • WHERE InvitedGuests.GuestID IS NULL filters the result to include only those rows from the right table where there was no match in the left table, effectively excluding any RSVPs that are in the invited list.

The output will show:

7. Full (outer) Join

A full outer join combines rows from both tables being joined, including all rows from both the left table and the right table. It gives you a complete view, showing all the rows where there is a match between the tables, as well as all rows where there is no match. If a row from one table has no corresponding row in the other table, the result will still include that row, with NULL values for the missing side.

Example Scenario

Imagine you’re organising a party and have two lists:

  1. List of Invited Guests (left table)
  2. List of RSVP'd Guests (right table)

You want to see everyone who was either invited or who has RSVP'd, and see if they match. If someone was invited but did not RSVP, or if someone RSVP'd but was not on the invitation list, you still want to see their details.

You have these two lists:

List A (Invited Guests):

You want to see a complete list of both invited guests and RSVP'd guests, including those who don’t match between the two lists.

SELECT InvitedGuests.GuestID, InvitedGuests.Name, RSVPs.RSVPStatus

FROM InvitedGuests

FULL OUTER JOIN RSVPs

ON InvitedGuests.GuestID = RSVPs.GuestID;

Explanation

  • SELECT InvitedGuests.GuestID, InvitedGuests.Name, RSVPs.RSVPStatus selects the columns to show from both tables.
  • FROM InvitedGuests specifies the left table (InvitedGuests).
  • FULL OUTER JOIN RSVPs joins the InvitedGuests table with the RSVPs table, including all rows from both tables.
  • ON InvitedGuests.GuestID = RSVPs.GuestID specifies the condition for the join based on GuestID.

Here:

  • John and Carol are included because they match between the two lists.
  • Alice is included because she was invited but didn’t RSVP.
  • The guest with ID 3 is included because they RSVP'd but were not on the invitation list.

8. Outer excluding Join

An outer excluding join (also known as a full anti join) retrieves rows from both tables where there is no match between them. It’s like saying, “Give me all the rows that exist in one table but don’t have a corresponding row in the other table.” It excludes any rows where there’s a match between the two tables.

Explanation Scenario

Imagine you have two lists:

  1. List of Invited Guests (left table)
  2. List of RSVP'd Guests (right table)

You want to find out who is either on the invitation list but hasn’t RSVP'd, or who has RSVP'd but wasn’t invited. You’re not interested in guests who both appear on the invitation list and have RSVP'd — only those who are on one list and not the other.

List A (Invited Guests):

Syntax

SELECT InvitedGuests.GuestID, InvitedGuests.Name, RSVPs.RSVPStatus 

FROM InvitedGuests

FULL OUTER JOIN RSVPs

ON InvitedGuests.GuestID = RSVPs.GuestID

WHERE InvitedGuests.GuestID IS NULL 

   OR RSVPs.GuestID IS NULL;

Explanation

  • SELECT InvitedGuests.GuestID, InvitedGuests.Name, RSVPs.RSVPStatus
    This part selects the columns you want to display from both tables. You’re asking for:some text
    • InvitedGuests.GuestID and InvitedGuests.Name from the InvitedGuests table.
    • RSVPs.RSVPStatus from the RSVPs table.
  • FROM InvitedGuests
    This specifies the left table, which is the InvitedGuests table. The query starts from here.
  • FULL OUTER JOIN RSVPs
    This means you’re performing a full outer join between the two tables. A full outer join includes all rows from both the left table (InvitedGuests) and the right table (RSVPs), even if there’s no match between them.
  • ON InvitedGuests.GuestID = RSVPs.GuestID
    This is the condition that specifies how the two tables should be matched. It says to match rows where the GuestID in InvitedGuests is equal to the GuestID in RSVPs.
  • WHERE InvitedGuests.GuestID IS NULL OR RSVPs.GuestID IS NULL
    This is the filter that ensures you only get the non-matching rows. It excludes any rows where both tables have matching GuestIDs, leaving only the rows that:
  • Exist in the InvitedGuests table but not in RSVPs (where RSVPs.GuestID is NULL).
  • Exist in the RSVPs table but not in InvitedGuests (where InvitedGuests.GuestID is NULL).

Summary

  • This query first joins both tables to get all possible rows, including matches and non-matches.
  • Then, the WHERE clause filters out any matching rows, keeping only the entries that exist exclusively in one of the two tables.

9. Self Join

A self join is when a table is joined with itself. It’s used when you want to compare rows within the same table or find relationships between rows in a single table. This is especially helpful when there’s hierarchical or relational data in one table.

Example Scenario

You have a table of employees, where each employee has a ManagerID that refers to another employee in the same table.

Employees Table

Syntax

SELECT e1.EmployeeID, e1.Name AS EmployeeName, e2.Name AS ManagerName

FROM Employees e1

LEFT JOIN Employees e2

ON e1.ManagerID = e2.EmployeeID;

Explanation

  • SELECT e1.EmployeeID, e1.Name AS EmployeeName, e2.Name AS ManagerName: This selects the employee's ID and name from the first instance of the table (e1), and the manager's name from the second instance (e2).
  • FROM Employees e1: This specifies the first instance of the Employees table and gives it the alias e1.
  • LEFT JOIN Employees e2: This joins the table Employees to itself, giving the second instance the alias e2. The LEFT JOIN ensures that employees without managers (like Carol) are still included in the result.
  • ON e1.ManagerID = e2.EmployeeID: This specifies the condition for the join, where the ManagerID in the first instance of the table (e1) is matched to the EmployeeID in the second instance (e2), essentially connecting employees to their managers.

This query produces a list that shows each employee along with their manager’s name by joining the table to itself.

10. Cross Join

A cross join combines every row from one table with every row from another table, creating a “Cartesian product” of the two tables. This means you’ll get all possible combinations of rows from both tables. It doesn't require any condition or matching column like other joins do.

Example scenario

Imagine you have two sets of things:

  1. Set A: A list of different shirts.
  2. Set B: A list of different trousers.

A cross join would create all possible outfit combinations by pairing every shirt with every trouser.

You have 2 lists:

A cross join would give you every combination of shirts and trousers.

Syntax

SELECT Shirts.ShirtID, Shirts.ShirtColour, Trousers.TrouserID, Trousers.TrouserColour

FROM Shirts

CROSS JOIN Trousers;

Explanation

  • SELECT Shirts.ShirtID, Shirts.ShirtColour, Trousers.TrouserID, Trousers.TrouserColourr selects columns from both tables to show.
  • FROM Shirts specifies the first table (Shirts).
  • CROSS JOIN Trousers performs the cross join, creating a combination of every row in the Shirts table with every row in the Trousers table.

A cross join gives you all possible pairings, which can be useful in scenarios where you want to compare or mix different sets of data without needing a specific relationship between the two.

11. Combining Joins

Combining joins is when you use more than one JOIN in a SQL query to bring together data from multiple tables based on related columns. The goal is to merge data by specifying how different tables relate to each other, often using keys like IDs.

Let’s say you have 3 tables: 

We perform the query:

SELECT customers.name, products.name, orders.order_date

FROM orders

INNER JOIN customers ON orders.customer_id = customers.id

INNER JOIN products ON orders.product_id = products.id;

This would return:

Each table plays a role:

  • Customers provides the customer names.
  • Products gives the product names.
  • Orders links customers and products together with order dates.

This is how combining multiple joins works in practice with the tables.

Conclusion

Understanding SQL join types is key to working effectively with data from multiple tables. Whether you're using inner joins to find exact matches or outer joins to get a full picture, each type of join has its own unique purpose. Once you get the hang of it, SQL joins become a powerful tool in your data toolkit. So next time you're working with different tables, you’ll know exactly which join type to use to get the answers you need. If you liked this SQL tutorial, all this and more are included in our free SQL course. Happy querying!

SIMILAR BLOGS

Interested in Writing for Us?

Share your expertise, inspire others, and join a community of passionate writers. Submit your articles on topics that matter to our readers. Gain visibility, grow your portfolio, and make an impact.
Join Now