Introduction
Ever felt like you're juggling multiple tables of data and wish there was an easier way to piece it all together? Welcome to the world of SQL joins! In this SQL joins tutorial, we’ll show you how to merge data from different tables without the confusion. Whether you're new to SQL or just need a refresher, this guide breaks down the different types of joins—minus the tech jargon and complicated explanations. From inner to outer joins and everything in between, we’ll walk through how they can help you get the answers you need from your data. Ready to dive in? Let’s get started!
Primary Keys and Foreign keys
In a relational database management system (RDBMS), tables are often connected through primary keys and foreign keys. A primary key is a unique identifier for each row in a table. It ensures that no two rows have the same value in this column (or set of columns), and every row must have a value (i.e., no NULLs).
A foreign key is a column (or set of columns) in one table that points to the primary key in another table. It creates a relationship between two tables, ensuring referential integrity (i.e., the foreign key value in one table must exist as a primary key value in the referenced table).
Join types
While there are many Join techniques in SQL, there are 5 primary Join types.
- Inner Join
- Left Join or Left Outer Join
- Right Join or Right Outer Join
- Full Outer Join
- Cross Join
1. Inner Join
An inner join in SQL can be thought of as finding the common ground between two tables. Imagine you have two lists (or tables) of information, and you want to combine the data only where there’s a match between them.
Here’s a simple analogy:
Example Scenario
You have two lists:
- List A: A list of people and their favorite colours.
- List B: A list of people and their favorite fruits.
Now, you want to create a new list that only includes people who appear on both lists.
Syntax
SELECT *
FROM FavoriteColours
INNER JOIN FavoriteFruits
ON FavoriteColours.StudentID = FavoriteFruits.StudentID;
In this case, an inner join would give you a new list that only includes John and Alice (because they are in both lists). Bob and Eve are excluded because they don't have a match in the other list.
Explanation
- SELECT * Retrieves all columns from both tables involved in the join.
- FROM FavoriteColours Specifies the first table (FavoriteColours).
- INNER JOIN FavoriteFruits Joins the FavoriteColors table with the FavoriteFruits table.
- ON FavoriteColors.StudentID = FavoriteFruits.StudentID Specifies the join condition based on the StudentID column.
Key Points
- Inner join looks for rows in both tables that match based on a common column (in this case, the Name).
- If there’s no match, those rows are left out.
- You only get the rows where data is present in both tables.
In short, use inner joins to connect related information from two different sources, but only when that information overlaps.
2. Natural Joins
A natural join is like an inner join with a twist: it automatically looks for columns in two tables that have the same name and joins the tables based on those matching columns without you having to specify the condition manually.
Example Scenario
You have two lists (tables) about students. You're going to combine them based on some shared details, but you don’t need to say what that detail is—SQL figures it out on its own.
StudentInfo:
StudentGrades
Syntax
SELECT *
FROM StudentInfo
NATURAL JOIN StudentGrades;
Explanation
- SELECT * Selects all columns from both tables.
- FROM StudentInfo Specifies the first table (`StudentInfo`).
- NATURAL JOIN StudentGrades Joins `StudentInfo` with `StudentGrades` based on columns with the same names (`StudentID` in this case).
Result
This query will automatically join the tables based on the `StudentID` column and return only the rows where there is a match in both tables.
Summary
- The natural join automatically detects and uses the column(s) with the same name in both tables to perform the join.
- It simplifies the query by removing the need to specify join conditions manually.
3. Left (outer) Join
A left join (or left outer join) is a way to combine two tables in SQL. It’s like saying, "Give me everything from the left table, and match it with data from the right table. If there's no match, still include the left table’s data and put NULL for the missing parts from the right table."
Example Scenario
Imagine you’re having a party and you make a guest list (left table). You also want to know which guests have RSVP'd (right table). The left join helps you see all your guests and also shows whether they’ve RSVP'd or not. If a guest hasn't RSVP'd, you’ll still see their name on the list, but the RSVP status will be blank. You have two lists:
You want to see all guests and whether they’ve RSVP'd. If they haven’t RSVP'd, you still want to see their name.
Syntax
SELECT GuestList.GuestID, GuestList.Name, RSVPs.RSVPStatus
FROM GuestList
LEFT JOIN RSVPs
ON GuestList.GuestID = RSVPs.GuestID;
Explanation
SELECT GuestList.GuestID, GuestList.Name, RSVPs.RSVPStatus selects the columns to show.
FROM GuestList specifies the left table (GuestList).
LEFT JOIN RSVPs joins the GuestList with the RSVPs table.
ON GuestList.GuestID = RSVPs.GuestID matches rows based on the GuestID column.
This query gives you a complete list of guests with their RSVP status, showing NULL where there’s no matching RSVP.
4. Left excluding Join
A left excluding join (sometimes called a left anti join) is used to find rows in the left table that do not have a match in the right table. It’s like saying, "Give me everything from the left table, but exclude anything that matches with the right table."
Using the same example, Left excluding join gives us a different result:
Syntax
SELECT InvitedGuests.GuestID, InvitedGuests.Name
FROM InvitedGuests
LEFT JOIN RSVPs
ON InvitedGuests.GuestID = RSVPs.GuestID
WHERE RSVPs.GuestID IS NULL;
Explanation
The result includes only those guests from the InvitedGuests table who do not have a matching GuestID in the RSVPs table. Here, Bob is included because he hasn’t RSVP'd, whereas John and Alice are excluded as they have matches in the RSVPs table.
5. Right (outer) Join
A right excluding join (or right anti join) is used to find rows in the right table that do not have a match in the left table. It’s like saying, "Give me everything from the right table, but exclude anything that matches with the left table." In other words, it will give you the same output as a left outer join if you reverse the positions of the “left” and “right tables”.
Example Scenario
Continuing with the party theme, imagine you’re hosting a party and you have a list of guests who have RSVP'd (right table). You also have a list of all the invited guests (left table). You want to see all the RSVP'd guests and any information about them from the invitation list if available. If an RSVP'd guest is not on the invitation list, you still want to see their RSVP status.
You want to see all the guests who have RSVP'd, and also see if they are on the invitation list. If a guest has RSVP'd but is not on the invitation list, you still want to see their RSVP status.
Syntax
SELECT RSVPs.GuestID, RSVPs.RSVPStatus, InvitedGuests.Name
FROM RSVPs
RIGHT JOIN InvitedGuests
ON RSVPs.GuestID = InvitedGuests.GuestID;
Explanation
- SELECT RSVPs.GuestID, RSVPs.RSVPStatus, InvitedGuests.Name selects the columns to show from both tables.
- FROM RSVPs specifies the right table (RSVPs).
- RIGHT JOIN InvitedGuests joins the RSVPs table with the InvitedGuests table, focusing on including all rows from the right table.
- ON RSVPs.GuestID = InvitedGuests.GuestID specifies the condition for the join based on GuestID.
6. Right excluding Join
A right excluding join (or right anti join) is used to find rows in the right table that do not have a match in the left table. It’s like saying, "Give me everything from the right table, but exclude anything that matches with the left table."
Using the above example, this time you only want to find out which guests who have RSVP'd were not in the invitation list.
SELECT RSVPs.GuestID, RSVPs.RSVPStatus
FROM RSVPs
RIGHT JOIN InvitedGuests
ON RSVPs.GuestID = InvitedGuests.GuestID
WHERE InvitedGuests.GuestID IS NULL;
Explanation
- SELECT RSVPs.GuestID, RSVPs.RSVPStatus selects the columns to show from the right table.
- FROM RSVPs specifies the right table (RSVPs).
- RIGHT JOIN InvitedGuests joins the RSVPs table with the InvitedGuests table.
- ON RSVPs.GuestID = InvitedGuests.GuestID specifies the condition for the join based on GuestID.
- WHERE InvitedGuests.GuestID IS NULL filters the result to include only those rows from the right table where there was no match in the left table, effectively excluding any RSVPs that are in the invited list.
The output will show:
7. Full (outer) Join
A full outer join combines rows from both tables being joined, including all rows from both the left table and the right table. It gives you a complete view, showing all the rows where there is a match between the tables, as well as all rows where there is no match. If a row from one table has no corresponding row in the other table, the result will still include that row, with NULL values for the missing side.
Example Scenario
Imagine you’re organising a party and have two lists:
- List of Invited Guests (left table)
- List of RSVP'd Guests (right table)
You want to see everyone who was either invited or who has RSVP'd, and see if they match. If someone was invited but did not RSVP, or if someone RSVP'd but was not on the invitation list, you still want to see their details.
You have these two lists:
List A (Invited Guests):
You want to see a complete list of both invited guests and RSVP'd guests, including those who don’t match between the two lists.
SELECT InvitedGuests.GuestID, InvitedGuests.Name, RSVPs.RSVPStatus
FROM InvitedGuests
FULL OUTER JOIN RSVPs
ON InvitedGuests.GuestID = RSVPs.GuestID;
Explanation
- SELECT InvitedGuests.GuestID, InvitedGuests.Name, RSVPs.RSVPStatus selects the columns to show from both tables.
- FROM InvitedGuests specifies the left table (InvitedGuests).
- FULL OUTER JOIN RSVPs joins the InvitedGuests table with the RSVPs table, including all rows from both tables.
- ON InvitedGuests.GuestID = RSVPs.GuestID specifies the condition for the join based on GuestID.
Here:
- John and Carol are included because they match between the two lists.
- Alice is included because she was invited but didn’t RSVP.
- The guest with ID 3 is included because they RSVP'd but were not on the invitation list.
8. Outer excluding Join
An outer excluding join (also known as a full anti join) retrieves rows from both tables where there is no match between them. It’s like saying, “Give me all the rows that exist in one table but don’t have a corresponding row in the other table.” It excludes any rows where there’s a match between the two tables.
Explanation Scenario
Imagine you have two lists:
- List of Invited Guests (left table)
- List of RSVP'd Guests (right table)
You want to find out who is either on the invitation list but hasn’t RSVP'd, or who has RSVP'd but wasn’t invited. You’re not interested in guests who both appear on the invitation list and have RSVP'd — only those who are on one list and not the other.
List A (Invited Guests):
Syntax
SELECT InvitedGuests.GuestID, InvitedGuests.Name, RSVPs.RSVPStatus
FROM InvitedGuests
FULL OUTER JOIN RSVPs
ON InvitedGuests.GuestID = RSVPs.GuestID
WHERE InvitedGuests.GuestID IS NULL
OR RSVPs.GuestID IS NULL;
Explanation
- SELECT InvitedGuests.GuestID, InvitedGuests.Name, RSVPs.RSVPStatus
This part selects the columns you want to display from both tables. You’re asking for:some text- InvitedGuests.GuestID and InvitedGuests.Name from the InvitedGuests table.
- RSVPs.RSVPStatus from the RSVPs table.
- FROM InvitedGuests
This specifies the left table, which is the InvitedGuests table. The query starts from here. - FULL OUTER JOIN RSVPs
This means you’re performing a full outer join between the two tables. A full outer join includes all rows from both the left table (InvitedGuests) and the right table (RSVPs), even if there’s no match between them. - ON InvitedGuests.GuestID = RSVPs.GuestID
This is the condition that specifies how the two tables should be matched. It says to match rows where the GuestID in InvitedGuests is equal to the GuestID in RSVPs.
- WHERE InvitedGuests.GuestID IS NULL OR RSVPs.GuestID IS NULL
This is the filter that ensures you only get the non-matching rows. It excludes any rows where both tables have matching GuestIDs, leaving only the rows that:
- Exist in the InvitedGuests table but not in RSVPs (where RSVPs.GuestID is NULL).
- Exist in the RSVPs table but not in InvitedGuests (where InvitedGuests.GuestID is NULL).
Summary
- This query first joins both tables to get all possible rows, including matches and non-matches.
- Then, the WHERE clause filters out any matching rows, keeping only the entries that exist exclusively in one of the two tables.
9. Self Join
A self join is when a table is joined with itself. It’s used when you want to compare rows within the same table or find relationships between rows in a single table. This is especially helpful when there’s hierarchical or relational data in one table.
Example Scenario
You have a table of employees, where each employee has a ManagerID that refers to another employee in the same table.
Employees Table
Syntax
SELECT e1.EmployeeID, e1.Name AS EmployeeName, e2.Name AS ManagerName
FROM Employees e1
LEFT JOIN Employees e2
ON e1.ManagerID = e2.EmployeeID;
Explanation
- SELECT e1.EmployeeID, e1.Name AS EmployeeName, e2.Name AS ManagerName: This selects the employee's ID and name from the first instance of the table (e1), and the manager's name from the second instance (e2).
- FROM Employees e1: This specifies the first instance of the Employees table and gives it the alias e1.
- LEFT JOIN Employees e2: This joins the table Employees to itself, giving the second instance the alias e2. The LEFT JOIN ensures that employees without managers (like Carol) are still included in the result.
- ON e1.ManagerID = e2.EmployeeID: This specifies the condition for the join, where the ManagerID in the first instance of the table (e1) is matched to the EmployeeID in the second instance (e2), essentially connecting employees to their managers.
This query produces a list that shows each employee along with their manager’s name by joining the table to itself.
10. Cross Join
A cross join combines every row from one table with every row from another table, creating a “Cartesian product” of the two tables. This means you’ll get all possible combinations of rows from both tables. It doesn't require any condition or matching column like other joins do.
Example scenario
Imagine you have two sets of things:
- Set A: A list of different shirts.
- Set B: A list of different trousers.
A cross join would create all possible outfit combinations by pairing every shirt with every trouser.
You have 2 lists:
A cross join would give you every combination of shirts and trousers.
Syntax
SELECT Shirts.ShirtID, Shirts.ShirtColour, Trousers.TrouserID, Trousers.TrouserColour
FROM Shirts
CROSS JOIN Trousers;
Explanation
- SELECT Shirts.ShirtID, Shirts.ShirtColour, Trousers.TrouserID, Trousers.TrouserColourr selects columns from both tables to show.
- FROM Shirts specifies the first table (Shirts).
- CROSS JOIN Trousers performs the cross join, creating a combination of every row in the Shirts table with every row in the Trousers table.
A cross join gives you all possible pairings, which can be useful in scenarios where you want to compare or mix different sets of data without needing a specific relationship between the two.
11. Combining Joins
Combining joins is when you use more than one JOIN in a SQL query to bring together data from multiple tables based on related columns. The goal is to merge data by specifying how different tables relate to each other, often using keys like IDs.
Let’s say you have 3 tables:
We perform the query:
SELECT customers.name, products.name, orders.order_date
FROM orders
INNER JOIN customers ON orders.customer_id = customers.id
INNER JOIN products ON orders.product_id = products.id;
This would return:
Each table plays a role:
- Customers provides the customer names.
- Products gives the product names.
- Orders links customers and products together with order dates.
This is how combining multiple joins works in practice with the tables.
Conclusion
Understanding SQL join types is key to working effectively with data from multiple tables. Whether you're using inner joins to find exact matches or outer joins to get a full picture, each type of join has its own unique purpose. Once you get the hang of it, SQL joins become a powerful tool in your data toolkit. So next time you're working with different tables, you’ll know exactly which join type to use to get the answers you need. If you liked this SQL tutorial, all this and more are included in our free SQL course. Happy querying!