Tutorial

How To Use WHERE Clauses in SQL

Published on December 18, 2020
Default avatar

By Mark Drake

Manager, Developer Education

How To Use WHERE Clauses in SQL

Introduction

In Structured Query Language (SQL) statements, WHERE clauses limit what rows the given operation will affect. They do this by defining specific criteria, referred to as search conditions, that each row must meet in order for it to be impacted by the operation.

This guide will go over the general syntax used in WHERE clauses. It will also outline how to combine multiple search condition predicates in a single WHERE clause to filter data with more granularity, as well as how to use the NOT operator to exclude, rather than include, rows that meet a given search condition.

Although this guide will exclusively use SELECT statements in its examples, the concepts explained here can be used in a number of SQL operations. In fact, WHERE clauses are critical components of UPDATE and DELETE operations.

Prerequisites

In order to follow this guide, you will need a computer running some type of relational database management system (RDBMS) that uses SQL. The instructions and examples in this guide were validated using the following environment:

  • A server running Ubuntu 20.04, with a non-root user with administrative privileges and a firewall configured with UFW, as described in our initial server setup guide for Ubuntu 20.04.
  • MySQL installed and secured on the server, as outlined in How To Install MySQL on Ubuntu 20.04. This guide was verified with a newly-created user, as described in Step 3.
  • You’ll also need a database with some tables loaded with sample data which you can use to practice writing queries that include WHERE clauses. We encourage you to go through the following Connecting to MySQL and Setting up a Sample Database section for details on how to connect to a MySQL server and create the testing database used in examples throughout this guide.

Note: Please note that many RDBMSs use their own unique implementations of SQL. Although the commands outlined in this tutorial will work on most RDBMSs, the exact syntax or output may differ if you test them on a system other than MySQL.

Connecting to MySQL and Setting up a Sample Database

If your SQL database system runs on a remote server, SSH into your server from your local machine:

  1. ssh sammy@your_server_ip

Then open up the MySQL server prompt, replacing sammy with the name of your MySQL user account:

  1. mysql -u sammy -p

From the prompt, create a database named where_db:

  1. CREATE DATABASE where_db;

If the database was created successfully, you’ll receive output like this:

Output
Query OK, 1 row affected (0.01 sec)

To select the where_db database, run the following USE statement:

  1. USE where_db;
Output
Database changed

After selecting where_db, create a table within it.

To follow along with the examples used in this guide, imagine that you run a golf league at a local golf course. You decide to track information about the individual performances of the league’s players at the outings they attend. To do so, you decide to store the information in an SQL database.

You decide that this table will need six columns:

  • name: each of the golfers’ names, expressed using the varchar data type with a maximum of 20 characters
  • rounds_played: the total number of complete rounds each golfer has played, expressed with the int data type
  • best: each golfer’s best, or lowest, score for an individual outing, also expressed as an int
  • worst: each golfer’s worst, or highest, score for an individual outing, again expressed as an int
  • average: an approximate average of each golfers’ scores over the rounds they’ve played. This column will hold values of the decimal type, limited to a maximum of 4 digits with one of those digits to the right of the decimal point
  • wins: the number of rounds in which each golfer had the lowest score of everyone in the playing group, expressed using the int type

Run the following CREATE TABLE statement to create a table named golfers that has these six columns:

  1. CREATE TABLE golfers (
  2. name varchar(20),
  3. rounds_played int,
  4. best int,
  5. worst int,
  6. average decimal (4,1),
  7. wins int
  8. );

Then load the golfers table with some sample data. Run the following INSERT INTO operation to add seven rows of data representing seven of the league’s golfers:

  1. INSERT INTO golfers
  2. VALUES
  3. ('George', 22, 68, 103, 84.6, 3),
  4. ('Pat', 25, 65, 74, 68.7, 9),
  5. ('Grady', 11, 78, 118, 97.6, 0),
  6. ('Diane', 23, 70, 92, 78.8, 1),
  7. ('Calvin', NULL, 63, 76, 68.5, 7),
  8. ('Rose', NULL, 69, 84, 76.7, 4),
  9. ('Raymond', 18, 67, 92, 81.3, 1);

Notice that two of these rows’ rounds_played values are NULL. For the purposes of this tutorial, assume that these golfers haven’t reported how many rounds they’ve played and thus those values are recorded as NULL.

You may also notice that each golfer’s best value is less than their worst. This is because, in common golf rules, a golfer’s score is determined by the number of strokes it takes for them to get their ball into each hole in the course, with the winner being the person with the fewest total strokes. Thus, unlike most other sports, a golfer’s best score will be lower than their worst.

With that, you’re ready to follow the rest of the guide and begin learning how to use WHERE clauses in SQL.

Filtering Data with WHERE clauses

In SQL, a statement is any operation sent to the database system that will perform some sort of task, like creating a table, inserting or deleting data, or changing the structure of a column or table. SQL statements are made up of various clauses, which consist of certain keywords and the information that they require.

As mentioned in the introduction, WHERE clauses allow you to filter out certain rows of data from being impacted by an SQL operation. In queries, the WHERE clause comes after the FROM clause, as in the following example:

  1. SELECT columns_to_query
  2. FROM table_to_query
  3. WHERE search_condition;

Following the WHERE keyword is a search condition. A search condition is a set of one or more predicates, or expressions that can evaluate one or more value expressions and return a result of either “true,” “false,” or “unknown.” Note that in cases where a search condition only contains a single predicate, the terms “search condition” and “predicate” are synonymous.

Predicates in a WHERE clause search condition can take many forms, but they typically follow this syntax:

. . .
WHERE column_name OPERATOR value_expression
. . .

In SQL, a value expression — also sometimes referred to as a scalar expression — is any expression that will return a single value. A value expression can be a literal value, like a string or numeric value, a mathematical expression. Most often, though, at least one of the value expressions in a WHERE clause’s search condition will be a column name.

When running SQL queries that contain a WHERE clause, the database management system will apply the search condition to every row in the logical table defined by the FROM clause. It will then return only the rows for which every predicate in the search condition evaluates to “true.”

To illustrate this idea, run the following query. This will return every value from the golfers table’s name column:

  1. SELECT name
  2. FROM golfers
  3. WHERE (2 + 2) = 4;

This query includes a WHERE clause, but instead of specifying a column name it uses (2 + 2) as the first value expression and tests whether it’s equal to the second value expression, 4. Because (2 + 2) is always equal to 4, this search condition evaluates to “true” for every row. Consequently, every row gets returned in the result set:

Output
+---------+ | name | +---------+ | George | | Pat | | Grady | | Diane | | Calvin | | Rose | | Raymond | +---------+ 7 rows in set (0.01 sec)

This WHERE clause isn’t very useful, since it always evaluates to “true” and always returns every row in the table. As mentioned previously, you will typically use at least one column name as a value expression in WHERE clause search conditions. When running queries, the database system will apply the search condition to each row individually in sequence. By providing a column name as a value expression in a search condition, you’re telling the DBMS to use each row’s value from that column as the value expression for that row’s iteration of the search condition.

The following query’s WHERE clause applies a more exclusive search condition to each row than the previous example. It will return the name and wins values from any row whose wins column value is equal to 1:

  1. SELECT name, wins
  2. FROM golfers
  3. WHERE wins = 1;

Only two golfers have won exactly one round, so the query only returns those two rows:

Output
+---------+------+ | name | wins | +---------+------+ | Diane | 1 | | Raymond | 1 | +---------+------+ 2 rows in set (0.01 sec)

The previous examples use the equals sign (=) to test whether two value expressions are equivalent, but the operator you use depends on what type of predicate you want to use to filter your result sets.

The SQL standard defines 18 types of predicates, although not all of them are included in every SQL implementation. Here are five of the most commonly used predicate types, as well as a brief explanation of each one and the operators they use:

Comparison

Comparison predicates use a comparison operator to compare one value (in queries, typically values in a specified column) with another. The six comparison operators are:

  • =: tests whether the two values are equal
  1. SELECT name
  2. FROM golfers
  3. WHERE name = 'George';
Output
+--------+ | name | +--------+ | George | +--------+ 1 row in set (0.00 sec)
  • <>: tests whether two values are not equal
  1. SELECT name, wins
  2. FROM golfers
  3. WHERE wins <> 1;
Output
+--------+------+ | name | wins | +--------+------+ | George | 3 | | Pat | 9 | | Grady | 0 | | Calvin | 7 | | Rose | 4 | +--------+------+ 5 rows in set (0.00 sec)
  • <: tests whether the first value is less than the second
  1. SELECT name, wins
  2. FROM golfers
  3. WHERE wins < 1;
Output
+-------+------+ | name | wins | +-------+------+ | Grady | 0 | +-------+------+ 1 row in set (0.00 sec)
  • >: tests whether the first value is greater than the second
  1. SELECT name, wins
  2. FROM golfers
  3. WHERE wins > 1;
Output
+--------+------+ | name | wins | +--------+------+ | George | 3 | | Pat | 9 | | Calvin | 7 | | Rose | 4 | +--------+------+ 4 rows in set (0.00 sec)
  • <=: tests whether the first value is less than or equal to the second
  1. SELECT name, wins
  2. FROM golfers
  3. WHERE wins <= 1;
Output
+---------+------+ | name | wins | +---------+------+ | Grady | 0 | | Diane | 1 | | Raymond | 1 | +---------+------+ 3 rows in set (0.00 sec)
  • >=: tests whether the first value is greater than or equal to the second
  1. SELECT name, wins
  2. FROM golfers
  3. WHERE wins >= 1;
Output
+---------+------+ | name | wins | +---------+------+ | George | 3 | | Pat | 9 | | Diane | 1 | | Calvin | 7 | | Rose | 4 | | Raymond | 1 | +---------+------+ 6 rows in set (0.00 sec)

Null

Predicates that use the IS NULL operator test whether values in a given column are Null. If so, the predicate evaluates to “true” and the row is included in the result set:

  1. SELECT name, rounds_played
  2. FROM golfers
  3. WHERE rounds_played IS NULL;
Output
+--------+---------------+ | name | rounds_played | +--------+---------------+ | Calvin | NULL | | Rose | NULL | +--------+---------------+ 2 rows in set (0.00 sec)

Range

Range predicates use the BETWEEN operator to test whether the specified column values fall between two value expressions:

  1. SELECT name, best
  2. FROM golfers
  3. WHERE best BETWEEN 67 AND 73;
Output
+---------+------+ | name | best | +---------+------+ | George | 68 | | Diane | 70 | | Rose | 69 | | Raymond | 67 | +---------+------+ 4 rows in set (0.00 sec)

Membership

Membership predicates use the IN operator to test whether a value is a member of a given set:

  1. SELECT name, best
  2. FROM golfers
  3. WHERE best IN (65, 67, 69, 71);
Output
+---------+------+ | name | best | +---------+------+ | Pat | 65 | | Rose | 69 | | Raymond | 67 | +---------+------+ 3 rows in set (0.00 sec)

Pattern Match

Pattern matching predicates use the LIKE operator to test whether a value matches a string pattern that contains one or more wildcard characters, also known as wildcards. SQL defines two wildcards, % and _:

  • _: an underscore represents a single unknown character
  1. SELECT name, rounds_played
  2. FROM golfers
  3. WHERE rounds_played LIKE '2_';
Output
+--------+---------------+ | name | rounds_played | +--------+---------------+ | George | 22 | | Pat | 25 | | Diane | 23 | +--------+---------------+ 3 rows in set (0.00 sec)
  • %: a percentage sign represents zero or more unknown characters
  1. SELECT name, rounds_played
  2. FROM golfers
  3. WHERE name LIKE 'G%';
Output
+--------+---------------+ | name | rounds_played | +--------+---------------+ | George | 22 | | Grady | 11 | +--------+---------------+ 2 rows in set (0.00 sec)

It’s beyond the scope of this tutorial to go into each of these predicate types in greater detail. If you’d like to learn more about them, though, we encourage you to check out the following guides:

Combining Multiple Predicates with AND and OR

There may be times when you need more granularly filtered results than a WHERE clause with a single search condition predicate can provide. On the other hand, there could also be times when rows that meet one of several search conditions would be acceptable in a result set. In cases like these, you can write WHERE clauses that include multiple predicates with the AND or OR operators, respectively.

To get started with these operators, run the following query which returns values from the golfers table’s name, best, worst, and average columns. Its WHERE clause includes two predicates, separated by AND:

  1. SELECT name, best, worst, average
  2. FROM golfers
  3. WHERE best < 70 AND worst < 96;

The first predicate tests whether each row’s best value is less than 70, while the second tests whether each row’s worst value is less than 96. If either test evaluates to “false” for a row, that row will not get returned in the result set:

Output
+---------+------+-------+---------+ | name | best | worst | average | +---------+------+-------+---------+ | Pat | 65 | 74 | 68.7 | | Calvin | 63 | 76 | 68.5 | | Rose | 69 | 84 | 76.7 | | Raymond | 67 | 92 | 81.3 | +---------+------+-------+---------+ 4 rows in set (0.00 sec)

Next, run the following query. This is identical to the previous example, except that it separates the two predicates with the OR operator instead of AND:

  1. SELECT name, best, worst, average
  2. FROM golfers
  3. WHERE best < 70 OR worst < 96;

Because only one of the predicates must evaluate to “true” for a row to be returned, this result set includes two more rows than the previous example:

Output
+---------+------+-------+---------+ | name | best | worst | average | +---------+------+-------+---------+ | George | 68 | 103 | 84.6 | | Pat | 65 | 74 | 68.7 | | Diane | 70 | 92 | 78.8 | | Calvin | 63 | 76 | 68.5 | | Rose | 69 | 84 | 76.7 | | Raymond | 67 | 92 | 81.3 | +---------+------+-------+---------+ 6 rows in set (0.00 sec)

You can include as many predicates as you’d like in a single WHERE clause as long as you combine them with the correct syntax. As your search conditions get more complex, though, it can become difficult to predict what data they will filter.

It’s important to note that database systems generally give precedence to AND operators. This means that any predicates separated by an AND operator (or operators in the case of more than two predicates) are treated as a single, isolated search condition that gets tested before any other predicates in a WHERE clause.

To illustrate, run the following query, which returns values from the name, average, worst, and rounds_played columns for any row that meets the search conditions defined in the WHERE clause:

  1. SELECT name, average, worst, rounds_played
  2. FROM golfers
  3. WHERE average < 85 OR worst < 95 AND rounds_played BETWEEN 19 AND 23;

This query first tests whether the predicates separated by the AND operator — worst < 95 and rounds_played BETWEEN 19 AND 23 — both evaluate to “true” for the row in the current iteration. If so, then that row will appear in the result set. But if either evaluates to “false”, the query will then check whether the current row’s average value is less than 85. If so, the row will be returned:

Output
+---------+---------+-------+---------------+ | name | average | worst | rounds_played | +---------+---------+-------+---------------+ | George | 84.6 | 103 | 22 | | Pat | 68.7 | 74 | 25 | | Diane | 78.8 | 92 | 23 | | Calvin | 68.5 | 76 | NULL | | Rose | 76.7 | 84 | NULL | | Raymond | 81.3 | 92 | 18 | +---------+---------+-------+---------------+ 6 rows in set (0.00 sec)

You can prioritize a set two or more predicates by wrapping them in parentheses. The following example is identical to the previous one, but it wraps the average < 85 and worst < 95 predicates, separated by an OR operator, in parentheses:

  1. SELECT name, average, worst, rounds_played
  2. FROM golfers
  3. WHERE (average < 85 OR worst < 95) AND rounds_played BETWEEN 19 AND 23;

Because the first two predicates are surrounded by parentheses, the subsequent AND operator treats them as a discrete search condition that must evaluate to “true”. If both of these predicates — average < 85 and worst < 95 — evaluate to “false”, then the entire search condition evaluates to “false” and the query immediately drops the row from the result set before moving on to evaluate the next one.

However, if either of these first two predicates evaluate to “true”, the query then tests whether the given golfer’s rounds_played value is between 19 and 23. If so, that row gets returned in the result set:

Output
+--------+---------+-------+---------------+ | name | average | worst | rounds_played | +--------+---------+-------+---------------+ | George | 84.6 | 103 | 22 | | Diane | 78.8 | 92 | 23 | +--------+---------+-------+---------------+ 2 rows in set (0.00 sec)

As this output indicates, by prioritizing sets of predicates and wrapping them in parentheses, otherwise identical queries can return significantly different result sets.

Although it isn’t always necessary to do so, it’s recommended that you always include parentheses when combining more than two predicates in a single search condition. Doing so can help make queries more readable and easier to understand.

Excluding Results with NOT

All of this guide’s examples so far have focused on how to write queries with WHERE clauses that only include rows that meet the specified search conditions in their result set. However, you can write queries that exclude specific rows by including the NOT operator in your WHERE clauses.

Range, membership, and pattern matching predicates clauses that include the NOT operator generally follow this syntax:

  1. . . .
  2. WHERE column_name NOT OPERATOR value_expression
  3. . . .

To illustrate, run the following query. This will return values from the golfers table’s name column, but the NOT operator in its WHERE clause will cause the DBMS to exclude any rows that match the wildcard pattern:

  1. SELECT name
  2. FROM golfers
  3. WHERE name NOT LIKE 'R%';
Output
+--------+ | name | +--------+ | George | | Pat | | Grady | | Diane | | Calvin | +--------+ 5 rows in set (0.00 sec)

Things get a little different when adding the NOT operator to IS NULL predicates. In such cases you place the NOT between IS and NULL, as in the following example. This query returns the name and rounds_played values of every golfer whose rounds_played value isn’t Null:

  1. SELECT name, rounds_played
  2. FROM golfers
  3. WHERE rounds_played IS NOT NULL;
Output
+---------+---------------+ | name | rounds_played | +---------+---------------+ | George | 22 | | Pat | 25 | | Grady | 11 | | Diane | 23 | | Raymond | 18 | +---------+---------------+ 5 rows in set (0.00 sec)

You can also place the NOT operator immediately after the WHERE keyword. This is useful if you’re excluding rows based on whether they meet multiple search conditions, as in this example query which returns the golfers’ name, average, best, and wins values:

  1. SELECT name, average, best, wins
  2. FROM golfers
  3. WHERE NOT (average < 80 AND best < 70) OR wins = 9;
Output
+---------+---------+------+------+ | name | average | best | wins | +---------+---------+------+------+ | George | 84.6 | 68 | 3 | | Pat | 68.7 | 65 | 9 | | Grady | 97.6 | 78 | 0 | | Diane | 78.8 | 70 | 1 | | Raymond | 81.3 | 67 | 1 | +---------+---------+------+------+ 5 rows in set (0.00 sec)

Take note of this result set’s second row. Pat’s average score is less than 80 and her best score is less than 70. However, her row is still included in the result set, as the NOT operator only negates the search condition wrapped in parentheses.

Recall that when you wrap multiple predicates separated by AND or OR in parentheses, SQL will prioritize those predicates and treat them as a single isolated search condition. Because of this, the NOT operator only excludes rows based on the first two predicates, average < 80 and best < 70. But it includes rows based on the third predicate, wins = 9.

You can rewrite the query to exclude rows based on the third predicate along with the first two by wrapping all three in parentheses, like this:

  1. SELECT name, average, best, wins
  2. FROM golfers
  3. WHERE NOT ((average < 80 AND best < 70) OR wins = 9);
Output
+---------+---------+------+------+ | name | average | best | wins | +---------+---------+------+------+ | George | 84.6 | 68 | 3 | | Grady | 97.6 | 78 | 0 | | Diane | 78.8 | 70 | 1 | | Raymond | 81.3 | 67 | 1 | +---------+---------+------+------+ 4 rows in set (0.00 sec)

Depending on its SQL implementation, your database system may consider a query’s syntax invalid if you include NOT before a comparison operator. As an example, try running this query:

  1. SELECT name
  2. FROM golfers
  3. WHERE name NOT = 'Grady';

On MySQL and its derivatives, this will cause an error:

Output
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '= 'Grady'' at line 1

The reason for this error is that the NOT operator generally isn’t used with comparison operators (=, <>, <, <=, >, and >=), since you can achieve the opposite effect of one comparison operator by replacing it with another that would return the rows that the first would exclude. For example, you can replace the equivalence operator (=) with the inequivalence operator (<>).

Conclusion

By reading this guide, you learned how to write WHERE clauses so a query only returns rows that meet a specified condition. You also learned how to combine multiple predicates and search conditions in a single query, as well as how to use the NOT keyword to exclude information from result sets.

While the commands shown here should work on most relational databases, be aware that every SQL database uses its own unique implementation of the language. You should consult your DBMS’s official documentation for a more complete description of each command and their full sets of options.

If you’d like to learn more about working with SQL, we encourage you to check out the other tutorials in this series on How To Use SQL.

Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.

Learn more about us


Tutorial Series: How To Use SQL

Series Description

Structured Query Language — commonly known as SQL — is a language used to define, control, manipulate, and query data held in a relational database. SQL has been widely adopted since it was first developed in the 1970s, and today it’s the predominant language used to manage relational database management systems.

Ideal for managing structured data (data that can fit neatly into an existing data model), SQL is an essential tool for developers and system administrators in a wide variety of contexts. Also, because of its maturity and prevalence, candidates with SQL experience are highly sought after for jobs across a number of industries.

This series is intended to help you get started with using SQL. It includes a mix of conceptual articles and tutorials which provide introductions to various SQL concepts and practices. You can also use the entries in this series for reference while you continue to hone your skills with SQL.

Note: Please be aware that the tutorials in this series use MySQL in examples, but many RDBMSs use their own unique implementations of SQL. Although the commands outlined in this tutorial will work on most RDBMSs, the exact syntax or output may differ if you test them on a system other than MySQL.

About the authors
Default avatar

Manager, Developer Education

Technical Writer @ DigitalOcean

Still looking for an answer?

Ask a questionSearch for more help

Was this helpful?
 
Leave a comment


This textbox defaults to using Markdown to format your answer.

You can type !ref in this text area to quickly search our full set of tutorials, documentation & marketplace offerings and insert the link!

Try DigitalOcean for free

Click below to sign up and get $200 of credit to try our products over 60 days!

Sign up

Join the Tech Talk
Success! Thank you! Please check your email for further details.

Please complete your information!

Get our biweekly newsletter

Sign up for Infrastructure as a Newsletter.

Hollie's Hub for Good

Working on improving health and education, reducing inequality, and spurring economic growth? We'd like to help.

Become a contributor

Get paid to write technical tutorials and select a tech-focused charity to receive a matching donation.

Welcome to the developer cloud

DigitalOcean makes it simple to launch in the cloud and scale up as you grow — whether you're running one virtual machine or ten thousand.

Learn more
DigitalOcean Cloud Control Panel