Sometimes you want to work with every row in a table, in cases like this you don’t need a where clause since you don’t need to exclude any rows from consideration. Most of the time, however, you will want to narrow your focus to a subset of a table’s rows. Therefore, all the SQL data statements (except the insert statement) include an optional where clause containing one or more filter conditions used to restrict the number of rows acted on by the SQL statement.
In this post you will learn various ways to filter data in sql using WHERE clause. And all the examples in this post uses this database – sakila
Condition Evaluation –
A where clause may contain one or more conditions, separated by the operators and and or. If multiple conditions are separated only by the and operator, then all the conditions must evaluate to true for the row to be included in the result set. Consider the following where clause:
WHERE first_name = 'STEVEN' AND create_date > '2006-01-01'
Given these two conditions, only rows where the first name is Steven and the creation date was after January 1, 2006, will be included in the result set. Though this example uses only two conditions, no matter how many conditions are in your where clause, if they are separated by the and operator, they must all evaluate to true for the row to be included in the result set.
If all conditions in the where clause are separated by the or operator, however, only one of the conditions must evaluate to true for the row to be included in the result set. Consider the following two conditions:
WHERE first_name = 'STEVEN' OR create_date > '2006-01-01'
There are now various ways for a given row to be included in the result set:
The first name is Steven, and the creation date was after January 1, 2006.
The first name is Steven, and the creation date was on or before January 1, 2006.
The first name is anything other than Steven, but the creation date was after January 1, 2006.
Below table shows the possible outcomes for a where clause containing two conditions separated by the or operator.
|Intermediate result||Final result|
|WHERE true or true||true|
|WHERE true or false||true|
|WHERE false or true||true|
|WHERE false or false||false|
In the case of the preceding example, the only way for a row to be excluded from the result set is if the person’s first name was not Steven and the creation date was on or before January 1, 2006.
Using Parentheses –
If your where clause includes three or more conditions using both the and and or operators, you should use parentheses to make your intent clear, both to the database server and to anyone else reading your code. Here’s a where clause that extends the previous example by checking to make sure that the first name is Steven or the last name is Young, and the creation date is after January 1, 2006:
WHERE (first_name = 'STEVEN' OR last_name = 'YOUNG') AND create_date > '2006-01-01'
There are now three conditions; for a row to make it to the final result set, either the first or second condition (or both) must evaluate to true, and the third condition must evaluate to true.
Using the not operator –
Hopefully, the previous three-condition example is fairly easy to understand. Consider the following modification, however:
WHERE NOT (first_name = 'STEVEN' OR last_name = 'YOUNG') AND create_date > '2006-01-01'
Did you spot the change from the previous example? I added the not operator before the first set of conditions. Now, instead of looking for people with the first name of Steven or the last name of Young whose record was created after January 1, 2006, I am retrieving only rows where the first name is not Steven or the last name is not Young whose record was created after January 1, 2006.
Condition Types –
There are many different ways to filter out unwanted data. You can look for specific values, sets of values, or ranges of values to include or exclude, or you can use various pattern-searching techniques to look for partial matches when dealing with string data. The next four subsections explore each of these condition types in detail.
Equality Conditions –
A large percentage of the filter conditions that you write or come across will be of the form ‘column = expression’ as in:
title = 'RIVER OUTLAW' fed_id = '111-11-1111' amount = 375.25 film_id = (SELECT film_id FROM film WHERE title = 'RIVER OUTLAW')
Conditions such as these are called equality conditions because they equate one expression to another. The first three examples equate a column to a literal (two strings and a number), and the fourth example equates a column to the value returned from a subquery. The following query uses two equality conditions, one in the on clause (a join condition) and the other in the where clause (a filter condition):
SELECT c.email FROM customer c INNER JOIN rental r ON c.customer_id = r.customer_id WHERE date(r.rental_date) = '2005-06-14';
This query shows all email addresses of every customer who rented a film on June 14, 2005.
Inequality Conditions –
Another fairly common type of condition is the inequality condition, which asserts that two expressions are not equal. Here’s the previous query with the filter condition in the where clause changed to an inequality condition:
SELECT c.email FROM customer c INNER JOIN rental r ON c.customer_id = r.customer_id WHERE date(r.rental_date) <> '2005-06-14';
This query returns all email addresses for films rented on any other date than June 14, 2005. When building inequality conditions, you may choose to use either the != or <> operator.
Data modification using equality conditions –
Equality/inequality conditions are commonly used when modifying data. For example, let’s say that the movie rental company has a policy of removing old account rows once per year. Your task is to remove rows from the rental table where the rental date was in 2004. Here’s one way to tackle it:
DELETE FROM rental WHERE year(rental_date) = 2004;
This statement includes a single equality condition; here’s an example that uses two inequality conditions to remove any rows where the rental date was not in 2005 or 2006:
DELETE FROM rental WHERE year(rental_date) <> 2005 AND year(rental_date) <> 2006;
Range Conditions –
Along with checking that an expression is equal to (or not equal to) another expression, you can build conditions that check whether an expression falls within a certain range. This type of condition is common when working with numeric or temporal data. Consider the following query:
SELECT customer_id, rental_date FROM rental WHERE rental_date < '2005-05-25';
This query finds all film rentals prior to May 25, 2005. As well as specifying an upper limit for the rental date, you may also want to specify a lower range:
SELECT customer_id, rental_date FROM rental WHERE rental_date <= '2005-06-16' AND rental_date >= '2005-06-14';
This version of the query retrieves all films rented on June 14 or 15 of 2005.
Between Operator –
When you have both an upper and lower limit for your range, you may choose to use a single condition that utilizes the between operator rather than using two separate conditions, as in:
SELECT customer_id, rental_date FROM rental WHERE rental_date BETWEEN '2005-06-14' AND '2005-06-16';
When using the between operator, there are a couple of things to keep in mind. You should always specify the lower limit of the range first (after between) and the upper limit of the range second (after and). Here’s what happens if you mistakenly specify the upper limit first:
SELECT customer_id, rental_date FROM rental WHERE rental_date BETWEEN '2005-06-16' AND '2005-06-14';
As you can see, no data is returned. This is because the server is, in effect, generating two conditions from your single condition using the <= and >= operators, as in:
SELECT customer_id, rental_date FROM rental WHERE rental_date >= '2005-06-16' AND rental_date <= '2005-06-14';
Since it is impossible to have a date that is both greater than June 16, 2005, and less than June 14, 2005, the query returns an empty set. This brings me to the second pitfall when using between, which is to remember that your upper and lower limits are inclusive, meaning that the values you provide are included in the range limits. In this case, I want to return any films rented on June 14 or 15, so I specify 2005-06-14 as the lower end of the range and 2005-06-16 as the upper end. Since I am not specifying the time component of the date, the time defaults to midnight, so the effective range is 2005-06-14 00:00:00 to 2005-06-16 00:00:00, which will include any rentals made on June 14 or 15.
Along with dates, you can also build conditions to specify ranges of numbers. Numeric ranges are fairly easy to grasp, as demonstrated by the following:
SELECT customer_id, payment_date, amount FROM payment WHERE amount BETWEEN 10.0 AND 11.99;
Membership conditions –
In some cases, you will not be restricting an expression to a single value or range of values but rather to a finite set of values. For example, you might want to locate all films that have a rating of either ‘G’ or ‘PG’:
SELECT title, rating FROM film WHERE rating = 'G' OR rating = 'PG';
While this where clause (two conditions or’d together) wasn’t too tedious to generate, imagine if the set of expressions contained 10 or 20 members. For these situations, you can use the in operator instead:
SELECT title, rating FROM film WHERE rating IN ('G','PG');
With the in operator, you can write a single condition no matter how many expressions are in the set.
Using Subqueries –
Along with writing your own set of expressions, such as (‘G’,’PG’), you can use a subquery to generate a set for you on the fly. For example, if you can assume that any film whose title includes the string ‘PET’ would be safe for family viewing, you could execute a subquery against the film table to retrieve all ratings associated with these films and then retrieve all films having any of these ratings:
SELECT title, rating FROM film WHERE rating IN (SELECT rating FROM film WHERE title LIKE '%PET%');
Using not in –
Sometimes you want to see whether a particular expression exists within a set of expressions, and sometimes you want to see whether the expression does not exist within the set. For these situations, you can use the not in operator:
SELECT title, rating FROM film WHERE rating NOT IN ('PG-13','R','NC-17');
This query finds all accounts that are not rated ‘PG-13′ ,’R’, or ‘NC-17’, which will return the same set of 372 rows as the previous queries.