SQL JOIN

  

/////////////////////////////////////////////////////////////////////////////////////////////////

1. What is an SQL JOIN command, and when do you need it?

The SQL JOIN command is used to combine data from two tables in SQL. The JOIN clause is often used when tables have at least one column of data in common.

Typically, the JOIN condition is an equality between columns from the different tables, but other JOIN conditions are also possible. You can join more than two tables by using consecutive JOIN clauses.

There are different types of JOINs: INNER JOINLEFT JOINRIGHT JOINFULL JOIN, and others. The function of the JOIN command is illustrated by this image:

Inner JOIN

2. How would you write a query to JOIN these two tables?

During the interview process, you may be tasked with applying your knowledge to a practical scenario by writing a JOIN command. Let’s look at an example so that you can solve this problem with ease.

We have two tables:

  • employees - This table contains each employee’s ID, name, and department ID.

idemployee_namedepartment_id
1Homer Simpson4
2Ned Flanders1
3Barney Gumble5
4Clancy Wiggum3
5Moe SyzslakNULL
  • departments - This table contains each department’s ID and name.

iddepartment_name
1Sales
2Engineering
3Human Resources
4Customer Service
5Research And Development

If you have been asked to JOIN tables, try to find a column that exists in each of the tables. In this example, it is the department_id column.

SELECT *
FROM employees
JOIN departments
ON employees.department_id = departments.department_id;

Executing this code will produce the following result:

idemployee_namedepartment_iddepartment_iddepartment_name
1Homer Simpson44Customer Service
2Ned Flanders11Sales
3Barney Gumble55Research And Development
4Clancy Wiggum33Human Resources

The ON condition indicates how the two tables (the one after FROM and the one after JOIN) should be combined. You can see in the example above that both tables contain the column department_id. Our SQL query will return rows where the employees.department_id is equal to the departments.department_id.

Sometimes relational fields are slightly less obvious. For example, you might have a table called employees with a field called id, which could be joined against employee_id in any other table.

You can also specify what exact columns you would like to return from each of the tables included in your JOIN clause. When you include a column name that exists in both tables, you must specify the exact table you want to retrieve it from.

We cannot write department_id because this would cause an ambiguity error in SQL. We must write employees.department_id or departments.department_id. For example, we could write:

SELECT
employees.department_id, employee_name, department_name
FROM employees
JOIN departments
ON employees.department_id = departments.department_id;

Notice our SELECT statement. We specified the exact table name for the department_id column because this column exists in both of the tables that make up our JOIN clause. We don’t have to do this for the columns employee_name or department_name because these are unique. Executing this SQL query yields the following result set:

idemployee_namedepartment_name
1Ned FlandersSales
3Clancy WiggumHuman Resources
4Homer SimpsonCustomer Service
5Barney GumbleResearch And Development

When writing our SQL JOIN clauses, we can also employ the use of SQL aliases. Column names can be quite technical and not very understandable. This can make the query’s output difficult to understand. Here are some rules to follow when implementing an SQL alias:

  • To give a column a descriptive name, you can use a column alias.
  • To assign an alias to a column, use the AS keyword followed by the alias.
  • If the alias contains spaces, you must quote it.

An SQL alias can be applied to both table names and column names. If we rewrite our previous query to include an alias for each column name, it may look something like this:

SELECT
employees.department_id AS ID,
employee_name AS ‘Employee Name’,
department_name AS Department
FROM employees
JOIN departments
ON employees.department_id = departments.department_id;

Notice how we had to use quotes for our ‘Employee Name’ column because this new name contains spaces.

If we rewrite our above code, this time using an alias for each table name, we get the following:

SELECT *
FROM employees AS emp
JOIN departments AS dep
ON emp.department_id = dep.department_id;

The AS keyword used here is also completely optional. You can omit it from the statement. Implementing this small change results in our code looking like so:

SELECT *
FROM employees emp
JOIN departments dep
ON emp.department_id = dep.department_id;

This should be all the information you need to JOIN two tables and answer any follow-up questions you might be asked regarding the basic JOIN syntax.

3. What types of JOINs are there?

As mentioned in the introduction to this article, there are many varieties of the SQL JOIN clause. Demonstrating that you have mastery of each command is one method of displaying your SQL JOIN knowledge. Here are some of the most common types of JOIN clauses you will encounter:

SQL INNER JOIN

The INNER JOIN clause is the default JOIN clause in SQL. If you look at our previous example (SELECT * FROM employees JOIN departments), this was actually an INNER JOIN.

The INNER JOIN is used to return rows from both tables that satisfy the given condition. The INNER JOIN matches rows from the first and second tables that satisfy the ON condition.

This image demonstrates the relationship between the two tables included in our INNER JOIN clause:

Inner JOIN

Let’s explore the INNER JOIN syntax and functionality further by looking at a practical example using the two tables, employees and departments, described above.

The following SQL code looks for matches between the employees and departments tables based on the department_id column.

SELECT from employees emp
INNER JOIN departments dep
ON emp.department_id = dep.department_id;

Executing this code will produce the following result:

idemployee_namedepartment_iddepartment_iddepartment_name
1Homer Simpson44Customer Service
2Ned Flanders11Sales
3Barney Gumble55Research And Development
4Clancy Wiggum33Human Resources

When looking at our result, you will notice our employee, Moe Szyslak, is missing. In our employees table, this employee has no current department_id. Therefore, no match could be found when you try to JOIN the departments table on this column. Thus, the employee is excluded from the result. We’ll fix this problem with the next JOIN type, LEFT JOIN.

If you want to see more examples of INNER JOINs, looking at an article with visual, easy-to-understand examples can help you grasp this complex topic.

SQL LEFT JOIN

Similar to the INNER JOIN clause, the LEFT JOIN allows you to query data from two tables. But what is the key difference between LEFT JOIN and INNER JOIN? A LEFT JOIN returns all the rows that are in the first (left) table listed. Matching rows from the right table are also returned.

When you use the LEFT JOIN clause, the concepts of the left table and the right table are introduced.

LEFT JOIN

In the diagram above, Table 1 is the left table, and Table 2 is the right table.

The LEFT JOIN clause selects data starting from the left table. It matches each row from the left table with rows from the right table based on the condition of the JOIN clause.

The SQL LEFT JOIN clause returns all rows from the left table, even if there are no matches to be found in the right table. This means that if the ON clause matches no records in the right table, the JOIN will still return a row in the result but with NULL in each column from the right table.

An SQL LEFT JOIN returns all the values from the left table, plus matched values from the right table. If no match could be found, LEFT JOIN returns a NULL value instead.

The syntax for our SQL LEFT JOIN clause is as follows:

SELECT FROM employees emp
LEFT JOIN departments dep
ON emp.department_id = dep.department_id;

We specify we want a LEFT JOIN. This will be the same for all JOIN types. Specify which variant of JOIN you are using before the JOIN keyword.

The ON keyword works the same as it did for our INNER JOIN example. We are looking for matching values between the department_id column of our employees table and the department_id column of our departments table.

Here, our employees table will act as the left table because this is the first table we specify.

The result from executing this SQL query would be the following result set:

idemployee_namedepartment_iddepartment_iddepartment_name
1Homer Simpson44Customer Service
2Ned Flanders11Sales
3Barney Gumble55Research And Development
4Clancy Wiggum33Human Resources
5Moe SzyslakNULLNULLNULL

Notice how our employee, Moe Szyslak, has been included in the result set, even though there is not a matching department_id in the departments table. This is exactly the purpose of the LEFT JOIN clause, to include all the data from our left table, regardless of whether any matches were found.

SQL RIGHT JOIN

RIGHT JOIN is similar to LEFT JOIN, except that the action performed on the joined tables is reversed. Essentially, it executes the opposite action of the LEFT JOIN. This means that a RIGHT JOIN returns all the values from the right table, plus matched values from the left table or NULL in case of no matching JOIN predicate.

In the diagram below, Table 2 is our right table, and Table 1 is our left table:

RIGHT JOIN

When we apply the following code to our employees and departments tables:

SELECT FROM employees emp
RIGHT JOIN departments dep
ON emp.department_id = dep.department_id;

The syntax is similar to that of the LEFT JOIN. We specify that we want to perform a RIGHT JOIN, specifically looking for matches between the departments table and the employees table.

Here, our employees table will act as the left table, as this is the first table we specify. The departments table will be the right table. The result from executing this SQL JOIN query would be the following result set:

idemployee_namedepartment_iddepartment_iddepartment_name
2Ned Flanders11Sales
NULLNULLNULL2Engineering
4Clancy Wiggum33Human Resources
1Homer Simpson44Customer Service
3Barney Gumble55Research And Development

The RIGHT JOIN starts selecting data from the right table (departments). It matches each row from the right table with every row from the left table. If both rows cause the JOIN condition to evaluate to true, it combines the columns into a new row and includes this new row in the result set.

SQL FULL JOIN

The SQL FULL JOIN combines the results of both left and right outer joins. The joined table will contain all records from both the tables and fill in NULL values for missing matches on either side.

Be aware that a FULL JOIN can potentially return a very large dataset. A FULL JOIN returns all the rows from the joined tables, whether they are matched or not.

The SQL FULL JOIN is a type of OUTER JOIN (we’ll look at these later in the article) which is why it can also be referred to as a FULL OUTER JOIN.

Here is the concept of an SQL FULL JOIN clearly illustrated:

FULL JOIN

Notice how in our diagram, every row from both tables is returned.

Let’s look at the syntax of the SQL FULL JOIN clause by looking at some example code.

SELECT FROM employees emp
FULL JOIN departments dep
ON emp.department_id = dep.department_id;

When this SQL query is executed against our employees and departments tables, it produces the following result:

idemployee_namedepartment_iddepartment_iddepartment_name
1Homer Simpson44Customer Service
2Ned Flanders11Sales
3Barney Gumble55Research And Development
4Clancy Wiggum33Human Resources
5Moe SzyslakNULLNULLNULL
2Ned Flanders11Sales
NULLNULL2Engineering
4Clancy Wiggum33Human Resources
1Homer Simpson44Customer Service
3Barney Gumble55Research And Development

Compare this result set with the results of our LEFT JOIN and RIGHT JOIN. You will see how this data is a combination of the data returned from our previous examples. This specific type of JOIN clause produces a vast data set. Think carefully before using the FULL JOIN clause.

CROSS JOIN

An SQL CROSS JOIN is used when you need to find out all the possibilities of combining two tables, where the result set includes every row from each contributing table. The CROSS JOIN clause returns the Cartesian product of rows from the joined tables.

The diagram below is a good illustration of how the rows are combined:

CROSS JOIN

Using a CROSS JOIN produces a result set where its size is the number of rows in the first table multiplied by the number of rows in the second table. This type of result is called the Cartesian Product of two tables (Table 1 x Table 2).

Let us look at our two tables from earlier:

  • The employees table
idemployee_namedepartment_id
1Homer Simpson4
2Ned Flanders1
3Barney Gumble5
4Clancy Wiggum3
5Moe SyzslakNULL
  • The departments table
idemployee_name
1Sales
2Engineering
3Human Resources
4Customer Service
5Research And Development

To perform a CROSS JOIN using these tables, we would write an SQL query like so:

SELECT FROM employees
CROSS JOIN departments;

Notice how CROSS JOIN does not use ON or USING when it is being declared. This is different from the JOIN clauses we have previously looked at.

After performing a CROSS JOIN, the result set would look as follows:

idemployee_namedepartment_iddepartment_iddepartment_name
1Homer Simpson41Sales
2Ned Flanders11Sales
3Barney Gumble51Sales
4Clancy Wiggum31Sales
5Moe SzyslakNULL1Sales
1Homer Simpson42Engineering
2Ned Flanders12Engineering
3Barney Gumble52Engineering
4Clancy Wiggum32Engineering
5Moe SzyslakNULL2Engineering
1Homer Simpson43Human Resources
2Ned Flanders13Human Resources
3Barney Gumble53Human Resources
4Clancy Wiggum33Human Resources
5Moe SzyslakNULL3Human Resources
1Homer Simpson44Customer Service
2Ned Flanders14Customer Service
3Barney Gumble54Customer Service
4Clancy Wiggum34Customer Service
5Moe SzyslakNULL4Customer Service
1Homer Simpson45Research And Development
2Ned Flanders15Research And Development
3Barney Gumble55Research And Development
4Clancy Wiggum35Research And Development
5Moe SzyslakNULL5Research And Development

Our result set contains every combination possible between the two tables. Even when the tables used have little data, such as our employees and departments tables, it can produce a massive result set when they are used in conjunction with the SQL CROSS JOIN clause.

SQL NATURAL JOIN

NATURAL JOIN is a type of JOIN that combines tables based on columns with the same name and data type. When you use the NATURAL JOIN clause, it creates an implicit JOIN clause for you based on the common columns in the two tables being joined.

Common columns are columns that have the same name in both tables. There is no need to specify the column names to join. The resulting table will not contain any repeated columns.

The syntax for a NATURAL JOIN is simple:

SELECT FROM employees
NATURAL JOIN departments;

When this query is executed, it will produce the following result set:

department_ididemployee_namedepartment_name
12Ned FlandersSales
34Clancy WiggumHuman Resources
41Homer SimpsonCustomer Service
53Barney GumbleResearch And Development

The NATURAL JOIN is performed on the column that is shared between our two tables. In this case, it is the department_id column. This matched column is only displayed once in our result set.

4. What is an OUTER JOIN?

With an SQL OUTER JOIN, unmatched rows in one or both tables can be returned. There are several variations of the OUTER JOIN clause, some of which we have covered already in this article. Here are the common types of OUTER JOIN clauses:

  • LEFT OUTER JOIN
  • RIGHT OUTER JOIN
  • FULL OUTER JOIN

LEFT JOIN is a synonym for LEFT OUTER JOIN. The functionality of both is identical. This may be one of the SQL JOIN interview questions you are asked! The same can be said for RIGHT JOIN and RIGHT OUTER JOIN, and FULL JOIN and FULL OUTER JOIN. Let’s look at an example for each.

SQL LEFT OUTER JOIN

Use a LEFT OUTER JOIN when you want all the results that are in the first table listed. A LEFT OUTER JOIN will return only matching rows from the second table.

The syntax for the LEFT OUTER JOIN clause is as follows:

SELECT FROM employees emp
LEFT OUTER JOIN departments dep
ON emp.department_id = dep.department_id;

The result from executing this SQL query would be the following result set:

idemployee_namedepartment_iddepartment_iddepartment_name
1Homer Simpson44Customer Service
2Ned Flanders11Sales
3Barney Gumble55Research And Development
4Clancy Wiggum33Human Resources
5Moe SzyslakNULLNULLNULL

Notice how our employee, Moe Syzslak, has been included in the result set even though there is not a matching department_id in the departments table. This is exactly the purpose of the LEFT OUTER JOIN clause, to include all the data from our left table, regardless of whether any matches were found.

SQL RIGHT OUTER JOIN

RIGHT OUTER JOIN is similar to LEFT OUTER JOIN, except that the action performed to the joined tables is reversed. It essentially performs the opposite action of the LEFT OUTER JOIN. This means that a RIGHT OUTER JOIN returns all the values from the right table, plus matched values from the right table or NULL in case of no matching

When we apply the RIGHT OUTER JOIN to our employees and departments tables, the code looks as follows:

SELECT FROM employees emp
RIGHT OUTER JOIN departments dep
ON emp.department_id = dep.department_id;

Here, our employees table will act as the left table because this is the first table we specify.

The result from executing this SQL query would be the following result set:

idemployee_namedepartment_iddepartment_iddepartment_name
2Ned Flanders11Sales
NULLNULLNULL2Engineering
4Clancy Wiggum33Human Resources
1Homer Simpson44Customer Service
3Barney Gumble55Research And Development

The RIGHT OUTER JOIN starts selecting data from the right table, in this case, our departments table. It matches each row from the right table with every row from the left table. If both rows cause the JOIN condition to evaluate to true, it combines the columns into a new row and includes this new row in the result set.

SQL FULL OUTER JOIN

The SQL FULL OUTER JOIN combines the results of both left and right outer joins. The joined table will contain all records from both the tables and fill in NULLs for missing matches on either side. A FULL OUTER JOIN returns all the rows from the joined tables, whether they are matched or not.

Let’s look at the syntax of the SQL FULL OUTER JOIN clause:

SELECT FROM employees emp
FULL OUTER JOIN departments dep
ON emp.department_id = dep.department_id;

When this SQL query is executed against our employees and departments tables, it produces the following result:

idemployee_namedepartment_iddepartment_iddepartment_name
1Homer Simpson44Customer Service
2Ned Flanders11Sales
3Barney Gumble55Research And Development
4Clancy Wiggum33Human Resources
5Moe SzyslakNULLNULLNULL
2Ned Flanders11Sales
NULLNULL2Engineering
4Clancy Wiggum33Human Resources
1Homer Simpson44Customer Service
3Barney Gumble55Research And Development

You will notice that this dataset is a combination of our previous LEFT OUTER JOIN and RIGHT OUTER JOIN queries.

Common SQL Interview Questions Answered

You are now equipped with the knowledge required to answer complex SQL JOIN interview questions. If you still feel overwhelmed or unsure about SQL JOIN clauses, there is excellent advice about the best approach to practicing SQL JOIN clauses here.

You can use this in combination with the SQL JOIN cheat sheet, which acts as a great reference tool for both new and experienced SQL programmers. Whether you are new to SQL or feel like you need to refresh your knowledge of the topic, this interactive SQL JOINs course serves as a wonderful learning resource.

5. What is the difference between an SQL INNER JOIN and an SQL LEFT JOIN?

There are some key differences to remember about these commonly used JOIN variants. INNER JOIN returns rows when there is a match in both tables. LEFT JOIN returns all of the rows from the left table and any matching rows from the right table.

Let’s look at a practical example to explore the differences between these clauses. This will help you confidently answer this common SQL JOIN interview question.

Imagine we have two tables:

  • employees - This table contains each employee’s ID, name, and department ID.
idemployee_namedepartment_id
1Homer Simpson4
2Ned Flanders1
3Barney Gumble5
4Clancy Wiggum3
5Moe SyzslakNULL
  • departments - This table contains each department’s ID and name.
iddepartment_name
1Sales
2Engineering
3Human Resources
4Customer Service
5Research and Development

The following SQL code looks for matches between the employees and departments tables based on the department_id column:

SELECT from employees emp
INNER JOIN departments dep
ON emp.department_id = dep.department_id;

Executing this code will produce the following result:

idemployee_namedepartment_iddepartment_iddepartment_name
1Homer Simpson44Customer Service
2Ned Flanders11Sales
3Barney Gumble55Research and Development
4Clancy Wiggum33Human Resources

When looking at our result, you will notice our employee, Moe Szyslak, is missing. In our employees table, this employee has no current department_id. Therefore, no match could be found when you try to join the departments table on this column. Thus, the employee is excluded from the result.

Now, let’s use a LEFT JOIN and see what result that produces. An SQL LEFT JOIN returns all the values from the left table, plus matched values from the right table. If no match could be found, LEFT JOIN returns a NULL value.

The syntax for our SQL LEFT JOIN clause is as follows:

SELECT FROM employees emp
LEFT JOIN departments dep
ON emp.department_id = dep.department_id;

The ON keyword works the same as it did for our INNER JOIN example. We are looking for matching values between the department_id column of our employees table and the department_id column of our departments table.

Here, our employees table will act as the left table because this is the first table we specify.

The result of executing this SQL query is the following result set:

idemployee_namedepartment_iddepartment_iddepartment_name
1Homer Simpson44Customer Service
2Ned Flanders11Sales
3Barney Gumble55Research and Development
4Clancy Wiggum33Human Resources
5Moe SzyslakNULLNULLNULL

Notice how Moe Szyslak has been included in this result set, even though there is not a matching department_id in the departments table. This is exactly the purpose of the LEFT JOIN clause, to include all the data from our left table, regardless of whether any matches were found.

6. What is the difference between a LEFT JOIN and a FULL JOIN?

This is one of the SQL JOIN questions that you may face throughout the interview process.

As we mentioned before, an SQL LEFT JOIN returns all the values from the left table, plus matched values from the right table. If no match is found, LEFT JOIN returns a NULL value instead. An SQL FULL JOIN returns all the rows from the joined tables, whether they are matched or not. It essentially combines the functionality of LEFT JOIN and RIGHT JOIN.

Let’s compare the result set of a LEFT JOIN clause to the result set of a FULL JOIN.

Below is a query that makes use of LEFT JOIN:

SELECT FROM employees emp
LEFT JOIN departments dep
ON emp.department_id = dep.department_id;

Here, our employees table will act as the left table because this is the first table we specify.

The result of executing this SQL query is:

idemployee_namedepartment_iddepartment_iddepartment_name
1Homer Simpson44Customer Service
2Ned Flanders11Sales
3Barney Gumble55Research and Development
4Clancy Wiggum33Human Resources
5Moe SzyslakNULLNULLNULL

Let’s look at how this compares to using an SQL FULL JOIN. The syntax is similar, as demonstrated by this code:

SELECT FROM employees emp
FULL JOIN departments dep
ON emp.department_id = dep.department_id;

When this SQL query is executed against our employees and departments tables, it produces the following result:

idemployee_namedepartment_iddepartment_iddepartment_name
1Homer Simpson44Customer Service
2Ned Flanders11Sales
3Barney Gumble55Research and Development
4Clancy Wiggum33Human Resources
5Moe SzyslakNULLNULLNULL
NULLNULLNULL2Engineering

Compare this result set with the results of our LEFT JOIN and RIGHT JOIN queries. You can see that no match was found for the Engineering department, but it was still returned. It’s clear this data is a combination of the data returned from our previous examples. This specific type of JOIN clause produces a vast data set. Think carefully before using an SQL FULL JOIN.

7. Write a query that will JOIN these two tables so that all rows from Table 1 are in the result.

When interviewing for a data analyst or software developer role, you may be asked to complete a technical challenge involving SQL. A common SQL JOIN interview task is writing a query that will join two tables in a certain way. Let’s imagine that you are asked to write a query that will join two tables so that all rows from Table 1 are in the result.

First, you must understand the concept of right and left tables.

LEFT JOIN

In the diagram above, Table 1 is the left table and Table 2 is the right table. In other words, the left table comes first in the query; it gets its name from being to the left of the join condition. The right table comes after the JOIN keyword.

The LEFT JOIN clause selects data starting from the left table. It matches each row from the left table with rows from the right table, based on the condition of the JOIN clause. It returns all the values from the left table, plus matched values from the right table. If no match is found, LEFT JOIN returns a NULL value. This means that if the ON clause matches no records in the right table, the JOIN will still return that row, but with a NULL in each column from the right table.

For our practical example, we will use the employees and departments tables from our previous example:

employees - This table contains each employee’s ID, name, and department ID.

idemployee_namedepartment_id
1Homer Simpson4
2Ned Flanders1
3Barney Gumble5
4Clancy Wiggum3
5Moe SyzslakNULL

departments - This table contains each department’s ID and name.

iddepartment_name
1Sales
2Engineering
3Human Resources
4Customer Service
5Research and Development

If we want to keep all of the rows from Table 1 (in this case, employees) we must specify this as our left table.

The syntax for this LEFT JOIN clause is as follows:

SELECT FROM employees emp
LEFT JOIN departments dep
ON emp.department_id = dep.department_id;

Executing this query yields this result set:

idemployee_namedepartment_iddepartment_iddepartment_name
1Homer Simpson44Customer Service
2Ned Flanders11Sales
3Barney Gumble55Research and Development
4Clancy Wiggum33Human Resources
5Moe SzyslakNULLNULLNULL

Notice that the employee Moe Szyslak has been included in this result set – even though there is no matching department_id in the departments table. This is exactly the purpose of the LEFT JOIN clause, to include all the data from our left table, regardless of whether any matches were found in the right table.

8. How do you join more than two tables?

Joining more than two tables in a single SQL query can be quite difficult for newcomers to understand. The following example should make it clear.

You perform a JOIN on more than two tables when the data you want to include in the result exists in three or more tables. A multi-table join requires consecutive JOIN operations: first, you join the first and the second table and get a virtual result set; then you join another table to this virtual table. Let’s see an example.

For our multiple JOIN example, let’s imagine we have three tables:

departments - This table contains each department’s ID and name.

iddepartment_name
1Sales
2Engineering
3Human Resources
4Customer Service
5Research and Development

office - This table contains the address for each office.

idaddress
15 Wisteria Lane, Springfield, USA
2124 Chestmount Street, Springfield, USA
36610 Bronzeway, Springfield, USA
4532 Executive Lane, Springfield, USA
510 Meadow View, Springfield, USA

department_office - This table links the office information to the associated department. Some departments may span multiple offices.

office_iddepartment_id
11
23
32
44
55
21
51
43

In our case, we have used a link table called department_office which links or relates departments to offices.

To write an SQL query that prints the department_name and address attributes alongside each other, we need to join three tables:

  • The first JOIN clause will join departments and department_office and create a temporary table that will have an office_id column.
  • The second JOIN statement will join this temporary table with the office table on office_id to get the desired result.

Examine the SQL query below:

SELECT department_name, address
FROM departments d
JOIN department_office do ON d.department_id=do.department_id
JOIN office o ON do.office_id=o.id;

You can see we just want to retrieve two columns, the department name and the associated address. We join the department_office table, which has a link to both our departments and office tables. This enables us to then join the office table, which contains the address column in our SELECT statement.

Executing this code yields the following result set:

department_nameaddress
Sales5 Wisteria Lane, Springfield, USA
Engineering124 Chestmount Street, Springfield, USA
Human Resources6610 Bronzeway, Springfield, USA
Customer Service532 Executive Lane, Springfield, USA
Research and Development10 Meadow View, Springfield, USA
Sales124 Chestmount Street, Springfield, USA
Sales10 Meadow View, Springfield, USA
Human Resources532 Executive Lane, Springfield, USA

There it is! We have our desired output of having each department and its corresponding address. Notice how our Sales department is the largest, spanning three different offices. The second largest department is Human Resources, which spans two different offices.

You can see how JOIN clauses can be used on multiple tables to create links between tables that have columns in common. There are many different situations when joining multiple tables can be useful; for more information, check out this article on how to JOIN three or more tables in SQL.

9. How do you join a table to itself?

Many beginners don’t realize it, but you can join a table to itself. Such an operation is commonly called a self-join. It is useful for querying hierarchical data or comparing rows within the same table. When using a self-join, it is important to use a SQL alias for each table.

For our self-join example, we will use the following table:

employee - This table stores all company employees’ names, the IDs of their departments, and the IDs of their managers.

idemployee_namedepartment_idmanager_id
1Montgomery Burns4NULL
2Waylon Smithers11
3Homer Simpson21
4Carl Carlson51
5Lenny Leonard31
6Frank Grimes23

Say we want a result set that only shows employees with their managers. This can easily be done using table aliases in combination with a self-join. We will use an SQL LEFT JOIN for our first self-join. Look at the code below:

SELECT
e.employee_name AS 'Employee',
m.employee_name AS 'Manager'
FROM employee e
LEFT JOIN employee m ON m.id = e.manager_id

Watch out for the ambiguous column error, which can easily occur if you are not careful when writing a self-join query. To avoid this error, you must make good use of SQL aliases – i.e. by giving an alias to each occurrence of the table in your SQL query. This is demonstrated by the following snippet from the above query:

FROM employee e LEFT JOIN employee m

You must also prefix column names with the table alias so that the table each column is referring to is clear. We have explicitly specified e.employee_name and m.employee_name.

These rules will help you successfully execute a SQL self-join query while avoiding the ambiguous column error.

Executing the above query yields the following result set:

EmployeeManager
Montgomery BurnsNULL
Waylon SmithersMontgomery Burns
Homer SimpsonMontgomery Burns
Carl CarlsonMontgomery Burns
Lenny LeonardMontgomery Burns
Frank GrimesHomer Simpson

There’s our desired result! You can clearly see each employee and their corresponding manager. Most employees report to Mr. Burns, although the manager for Frank Grimes is Homer Simpson. Notice the NULL value under the Manager column for Montgomery Burns. This is because Montgomery Burns has no manager – he is the boss.

Let's tweak the query slightly and use an INNER JOIN this time:

SELECT
e.employee_name AS 'Employee',
m.employee_name AS 'Manager'
FROM employee e
INNER JOIN tbl_employee m ON m.id = e.manager_id
EmployeeManager
Waylon SmithersMontgomery Burns
Homer SimpsonMontgomery Burns
Carl CarlsonMontgomery Burns
Lenny LeonardMontgomery Burns
Frank GrimesHomer Simpson

The only major difference is the absence of Montgomery Burns from the Employee column. This is because the manager_id value for him was NULL; INNER JOIN only returns matching columns, with NULL values excluded.

Now you can perform self-joins, which are applicable in many different use cases. If you want to see more examples of self-joins, check out this excellent illustrated guide to self-joins.

Want to know how to use JOINs? Check out our interactive SQL JOINs course.

10. Must the JOIN condition be equality?

A non-equi join is any JOIN clause that does not use equality ( = ) as the JOIN condition. You can use common comparison operators (e.g. <, >, <=, >=, !=, and <>) in conjunction with join clauses. The BETWEEN operator can also be used.

There are many situations where non-equi joins can prove useful, including listing unique pairs, listing records within a range, and identifying duplicates. Let’s look at our last use case example: how to identify duplicates using a non-equi join.

First, look at the data which we will be querying. We’ll be using just one table, the familiar employee table, for this example:

idemployee_namedepartment_idmanager_id
1Montgomery Burns4NULL
2Waylon Smithers11
3Homer Simpson21
4Carl Carlson51
5Lenny Leonard31
6Frank Grimes23
7Lenny Leonard31

If we wanted to quickly identify any duplicate values, we’d write the following query, which makes good use of a non-equi join:

SELECT e1.id, e1.employee_name, e2.id, e2.employee_name
FROM employee e1
JOIN employee e2
ON e1.employee_name = e2.employee_name AND e1.id < e2.id

Taking a closer look at the JOIN clause, we can see it has two conditions:

  1. It matches records that have the same name.
  2. It retrieves records where the ID is less than the ID of the temporary self-joined table.

Executing this query yields the following result set:

idemployee_nameidemployee_name
5Lenny Leonard7Lenny Leonard

We can see that Lenny Leonard has a duplicate record in this table. Duplicates can cause unpredictable errors and taint the data in your reports.


/////////////////////////////////////////////////////////////////////////////////////////////////


/////////////////////////////////////////////////////////////////////////////////////////////////


/////////////////////////////////////////////////////////////////////////////////////////////////

Comments

Popular posts from this blog

Java-INterview-1