Friday, 5 August 2011

Understanding the Different Join Types

There are various types of joins and various kinds we can use in MySQL.

Understanding the Basic Join
In the preceding section, we mentioned the concept of a Cartesian product. This is sometimes called a full join or cross join, but regardless of nomenclature, it returns a complete set of combinations. When we add a conditional statement to the join (like employee.departmentID = department.departmentID), we make it into something called an equijoin, which limits the number of rows in the result set.

So far, we have used a set of tables listed in the FROM clause separated by the comma operator. This gives us a cross join, as described previously, converted to an equijoin with the WHERE clause. MySQL has various forms of syntax we can use for this type of join.

Consider our original query:

select employee.name, department.name
from employee, department
where employee.departmentID = department.departmentID;

Optionally, we could have used the keyword JOIN, instead of a comma:

select employee.name, department.name
from employee join department
where employee.departmentID = department.departmentID;

Instead of JOIN, we could just as well have written CROSS JOIN or INNER JOIN.

When we perform this type of join, MySQL looks at the tables we are joining and tries to work out the most efficient way to join them together, rather than necessarily joining the tables in the order we have listed. Sometimes the query optimization can go slightly wrong. We will discuss this in more detail in Chapter 19, "Optimizing Your Queries." If you want to override the optimizer and force MySQL to join tables in the order in which you list them, replace the word JOIN with STRAIGHT JOIN.

Understanding LEFT and RIGHT Joins
When we used equijoins in the preceding section, we were using a JOIN, CROSS JOIN, INNER JOIN, or perhaps STRAIGHT JOIN and were looking for rows that matched in two or more tables. What about cases in which we want to find rows in one table that don't have a corresponding row in the other table?

Consider, for example, a situation in which we want to find employees who have not yet worked on any outside assignments—that is, employees whose employeeIDs are not listed in the assignment table. We can do this using LEFT JOIN, as follows:

select employee.name
from employee left join assignment
on employee.employeeID = assignment.employeeID
where clientID is null;

This will give the following results:

+---------------+
| name |
+---------------+
| Ajay Patel |
| Candy Burnett |
| Ben Smith |
+---------------+
3 rows in set (0.49 sec)

You can easily confirm by visual inspection of the tables that this is the right answer, but why and how does it work?

The left join works by taking the left-hand table in the join (in this case, employee) and trying to match it to rows in the right-hand table. These matching rows are placed alongside the left table. For each row in the left table that has no matching rows in the right table, the LEFT JOIN substitutes a row of NULL values. We can find rows from the left table that have no matching row in the right table by searching for a NULL key value.

Look back at the example. In this join, for each employee who has worked on an assignment, we will get a row showing the employee and the assignment details. When an employee has no matching row in the assignment table, the left join will make up a "dummy row" consisting of all NULLs. We can find these dummy rows by looking for employees who have worked on an assignment for which the clientID is NULL. (The clientID is a key field, so this should never occur in the assignment table.)

In versions of MySQL prior to 4.1, this technique was often used to work around the absence of subqueries. Subqueries are covered later in this chapter.

In this example, we have used LEFT JOIN, but we could as easily have used RIGHT JOIN, which does the same thing but uses the right table as a base and fills any missing rows from the left table with NULLs.

No comments:

Post a Comment