Report this

What is the reason for this report?

How To Use GROUP BY and ORDER BY in SQL: A Complete Guide

Updated on July 29, 2025
How To Use GROUP BY and ORDER BY in SQL: A Complete Guide

Introduction

Structured Query Language (SQL) databases can store and manage a lot of data across numerous tables. With large data sets, it’s important to understand how to sort data, especially for analyzing result sets or organizing data for reports or external communications.

Two common statements in SQL that help with sorting your data are GROUP BY and ORDER BY. A GROUP BY statement sorts data by grouping it based on column(s) you specify in the query and is used with aggregate functions. An ORDER BY clause allows you to organize result sets alphabetically or numerically and in ascending or descending order.

This article provides a hands-on guide to mastering data sorting and aggregation in SQL, focusing on the GROUP BY and ORDER BY clauses. We’ll walk you through creating a sample MySQL database and populating it with data. From there, we’ll delve into detailed, real-world examples on how to use GROUP BY with various aggregate functions like SUM, AVG, and COUNT to summarize data effectively. You will also learn how to combine these with the WHERE clause for pre-filtering and the HAVING clause for post-aggregation filtering. We will also explain the ORDER BY clause for sorting results and demonstrate how to combine all these statements to create powerful, organized reports. To ensure a complete understanding, the article will conclude with a crucial section on common errors and debugging, along with a detailed FAQ section, making it a complete resource for both beginners and those looking to solidify their SQL skills.

Key Takeaways:

  • GROUP BY is used for aggregation, not sorting: The GROUP BY clause groups rows that share a common value in specified columns, and is typically used with aggregate functions like SUM(), AVG(), COUNT(), MAX(), and MIN() to produce summarized results.
  • ORDER BY is used to sort query results: The ORDER BY clause arranges the output of a query in ascending (ASC) or descending (DESC) order based on one or more columns. It does not group data, only sorts it.
  • You can use GROUP BY and ORDER BY together: Combining these clauses allows you to first aggregate the data (using GROUP BY) and then sort the aggregated results (using ORDER BY). This is useful for building analytical reports or dashboards.
  • Use WHERE before grouping and HAVING after: The WHERE clause filters rows before grouping occurs. The HAVING clause filters groups after aggregation, making it essential when working with aggregate values.
  • Avoid selecting non-aggregated columns not included in GROUP BY: Including a column in the SELECT clause that is neither part of GROUP BY nor wrapped in an aggregate function will cause an error in strict SQL modes (e.g., MySQL’s ONLY_FULL_GROUP_BY).
  • DISTINCT can sometimes replace GROUP BY (without aggregation): If you’re only interested in retrieving unique values from a column and not performing aggregation, SELECT DISTINCT can produce the same result as GROUP BY, but is often clearer and more efficient.
  • Sorting must follow correct clause order: In SQL syntax, the correct clause order is: SELECTFROMWHEREGROUP BYHAVINGORDER BY. Placing ORDER BY before GROUP BY results in a syntax error.
  • Common mistakes include misusing HAVING, ORDER BY, or forgetting aliases: Using HAVING without aggregation, referencing undefined column names in ORDER BY, or failing to use aliases consistently can lead to errors or unexpected output. Always double-check your logic and clause order.

Prerequisites

To follow this guide, you will need a computer running some type of relational database management system (RDBMS) that uses SQL. The instructions and examples in this tutorial were validated using the following environment:

  • A server running Ubuntu 20.04, with a non-root user with sudo administrative privileges and firewall enabled. Follow our Initial Server Setup with Ubuntu 20.04 to get started.
  • MySQL installed and secured on the server. Follow our How To Install MySQL on Ubuntu 20.04 guide to set this up. This guide assumes you’ve also set up a non-root MySQL user, as outlined in Step 3 of this guide.

Note: Please note that many relational database management systems use their own unique implementations of SQL. Although the commands outlined in this tutorial will work on most RDBMSs, the exact syntax or output may differ if you test them on a system other than MySQL.

To practice sorting data results in this tutorial, you’ll need a database and table loaded with sample data. If you do not have one ready to use, you can read the following Connecting to MySQL and Setting up a Sample Database section to learn how to create a database and table. This tutorial will refer to this sample database and table throughout.

Connecting to MySQL and Setting up a Sample Database

If your SQL database runs on a remote server, SSH into your server from your local machine:

ssh sammy@your_server_ip

Next, open the MySQL prompt, replacing sammy with your MySQL user account information:

mysql -u sammy -p

Create a database named movieDB:

  1. CREATE DATABASE movieDB;

If the database was created successfully, you’ll receive the following output:

Output
Query OK, 1 row affected (0.01 sec)

To select the movieDB database run the following USE statement:

  1. USE movieDB;
Output
Database changed

After selecting the database, create a table within it. For this tutorial’s example, we’ll create a table that stores information about a local movie theater’s showings. This table will hold the following seven columns:

  • theater_id: stores values of the int data type for each theater’s showing rooms, and will serve as the table’s primary key, meaning each value in this column will function as a unique identifier for its respective row.
  • date: uses the DATE data type to store the specific date by the year, month, and day when a movie was shown. This data type adheres to the following parameters: four digits for the year, and a maximum of two digits for the month and day (YYYY-MM-DD).
  • time: represents the movie’s scheduled showing with the TIME data type by hours, minutes, and seconds (HH:MM:SS).
  • movie_name: stores the movie’s name using the varchar data type with a maximum of 40 characters.
  • movie_genre: uses the varchar data type with a maximum of 30 characters, to hold information on each movie’s respective genre.
  • guest_total: shows the total number of guests that attended a movie showing with the int data type.
  • ticket_cost: uses the decimal data type, with a precision of four and a scale of two, meaning values in this column can have four digits, and two digits to the right of the decimal point. This column represents the ticket cost for the specific movie showing.

Create a table named movie_theater that contains each of these columns by running the following CREATE TABLE command:

  1. CREATE TABLE movie_theater (
  2. theater_id int,
  3. date DATE,
  4. time TIME,
  5. movie_name varchar(40),
  6. movie_genre varchar(30),
  7. guest_total int,
  8. ticket_cost decimal(4,2),
  9. PRIMARY KEY (theater_id)
  10. );

Next, insert some sample data into the empty table:

  1. INSERT INTO movie_theater
  2. (theater_id, date, time, movie_name, movie_genre, guest_total, ticket_cost)
  3. VALUES
  4. (1, '2022-05-27', '10:00:00', 'Top Gun Maverick', 'Action', 131, 18.00),
  5. (2, '2022-05-27', '10:00:00', 'Downton Abbey A New Era', 'Drama', 90, 18.00),
  6. (3, '2022-05-27', '10:00:00', 'Men', 'Horror', 100, 18.00),
  7. (4, '2022-05-27', '10:00:00', 'The Bad Guys', 'Animation', 83, 18.00),
  8. (5, '2022-05-28', '09:00:00', 'Top Gun Maverick', 'Action', 112, 8.00),
  9. (6, '2022-05-28', '09:00:00', 'Downton Abbey A New Era', 'Drama', 137, 8.00),
  10. (7, '2022-05-28', '09:00:00', 'Men', 'Horror', 25, 8.00),
  11. (8, '2022-05-28', '09:00:00', 'The Bad Guys', 'Animation', 142, 8.00),
  12. (9, '2022-05-28', '05:00:00', 'Top Gun Maverick', 'Action', 150, 13.00),
  13. (10, '2022-05-28', '05:00:00', 'Downton Abbey A New Era', 'Drama', 118, 13.00),
  14. (11, '2022-05-28', '05:00:00', 'Men', 'Horror', 88, 13.00),
  15. (12, '2022-05-28', '05:00:00', 'The Bad Guys', 'Animation', 130, 13.00);
Output
Query OK, 12 rows affected (0.00 sec) Records: 12 Duplicates: 0 Warnings: 0

Once you’ve inserted the data, you’re ready to start sorting query results in SQL.

Using GROUP BY

The function of a GROUP BY statement is to group records with shared values. A GROUP BY statement is always used with an aggregate function in a query. As you may recall, an aggregate function summarizes information and returns a single result. For instance, you can query for the total count or sum of a column and this will produce a single value in your result. With a GROUP BY clause, you can implement the aggregate function to get one result value for each group you desire.

GROUP BY is useful for returning multiple summarized results sorted by your specified group(s), rather than solely one column. Additionally, GROUP BY must always come after the FROM statement and the WHERE clause, if you choose to use one. Here’s an example of how a query with a GROUP BY and aggregate function is structured:

GROUP BY syntax
SELECT column_1, AGGREGATE_FUNCTION(column_2) FROM table GROUP BY column_1;

To illustrate how you can use GROUP BY statements, say you’re leading the campaign for several movie releases, and you want to evaluate the success of your marketing efforts. You ask a local theater to share the data they collected from guests on Friday and Saturday. Start by reviewing the data by running SELECT and the * symbol to select “every column” from the movie_theater table:

  1. SELECT * FROM movie_theater;
Output
+------------+------------+----------+-------------------------+-------------+-------------+-------------+ | theater_id | date | time | movie_name | movie_genre | guest_total | ticket_cost | +------------+------------+----------+-------------------------+-------------+-------------+-------------+ | 1 | 2022-05-27 | 10:00:00 | Top Gun Maverick | Action | 131 | 18.00 | | 2 | 2022-05-27 | 10:00:00 | Downton Abbey A New Era | Drama | 90 | 18.00 | | 3 | 2022-05-27 | 10:00:00 | Men | Horror | 100 | 18.00 | | 4 | 2022-05-27 | 10:00:00 | The Bad Guys | Animation | 83 | 18.00 | | 5 | 2022-05-28 | 09:00:00 | Top Gun Maverick | Action | 112 | 8.00 | | 6 | 2022-05-28 | 09:00:00 | Downton Abbey A New Era | Drama | 137 | 8.00 | | 7 | 2022-05-28 | 09:00:00 | Men | Horror | 25 | 8.00 | | 8 | 2022-05-28 | 09:00:00 | The Bad Guys | Animation | 142 | 8.00 | | 9 | 2022-05-28 | 05:00:00 | Top Gun Maverick | Action | 150 | 13.00 | | 10 | 2022-05-28 | 05:00:00 | Downton Abbey A New Era | Drama | 118 | 13.00 | | 11 | 2022-05-28 | 05:00:00 | Men | Horror | 88 | 13.00 | | 12 | 2022-05-28 | 05:00:00 | The Bad Guys | Animation | 130 | 13.00 | +------------+------------+----------+-------------------------+-------------+-------------+-------------+ 12 rows in set (0.00 sec)

While this data is helpful, you want to perform a deeper assessment and sort the results for some specific columns.

SQL GROUP BY with AVG Function

Since you worked on movies across a few different genres, you’re interested in knowing how well-received they were by movie-goers. Specifically, you want to know the average amount of people that watched each movie genre. Use SELECT to retrieve the various types of movies from the movie_genre column. Then apply the aggregate function AVG on the guest_total column, use AS to create an alias for a column called average, and include the GROUP BY statement to group results by movie_genre. Grouping them this way will provide you with the average results for each movie genre:

  1. SELECT movie_genre, AVG(guest_total) AS average
  2. FROM movie_theater
  3. GROUP BY movie_genre;
Output
+-------------+----------+ | movie_genre | average | +-------------+----------+ | Action | 131.0000 | | Drama | 115.0000 | | Horror | 71.0000 | | Animation | 118.3333 | +-------------+----------+ 4 rows in set (0.00 sec)

This output provides the four averages for each genre within the movie_genre group. Based on this information, Action movies attracted the highest average number of guests per showing.

SQL GROUP BY with COUNT Function

Now you may want to know how many times each movie was shown over the weekend. This is useful if you’re planning the number of future screenings based on past frequency.

To do this, use the COUNT() aggregate function with the GROUP BY clause. COUNT() returns the number of rows that match a specific condition. In this case, the number of showings for each movie.

The following query selects the movie_name column, counts how many entries exist for each movie, and groups the results by movie_name:

SELECT movie_name, COUNT(*) AS showings
FROM movie_theater
GROUP BY movie_name;
Output
+-------------------------+----------+ | movie_name | showings | +-------------------------+----------+ | Top Gun Maverick | 3 | | Downton Abbey A New Era | 3 | | Men | 3 | | The Bad Guys | 3 | +-------------------------+----------+ 4 rows in set (0.00 sec)

The output shows that each movie was shown three times during the data collection period. This kind of insight can be valuable for assessing scheduling decisions or planning future showtimes. If some movies had significantly more or fewer showings, it could reflect popularity or availability constraints.

By grouping your results and counting them, you can easily summarize large volumes of data into a readable and actionable format.

SQL GROUP BY with SUM Function

Next, let’s say you want to measure the theater’s revenues over two separate days. The following query returns values from the date column, as well as values returned by the SUM aggregate function. Specifically, the aggregate function SUM will enclose a mathematical equation in parentheses to multiply (using the * operator) the number of total guests by the cost of a ticket, represented as: SUM(guest_total * ticket_cost). This query includes the AS clause to provide the alias total_revenue for the column returned by the aggregate function. Then complete the query with the GROUP BY statement to group the query results by the date column:

  1. SELECT date, SUM(guest_total * ticket_cost)
  2. AS total_revenue
  3. FROM movie_theater
  4. GROUP BY date;
Output
+------------+---------------+ | date | total_revenue | +------------+---------------+ | 2022-05-27 | 7272.00 | | 2022-05-28 | 9646.00 | +------------+---------------+ 2 rows in set (0.00 sec)

Since you used GROUP BY to group the date column, your output provides the results for the total revenue in ticket sales for each day, in this case, $7,272 for Friday, May 27, and $9,646 for Saturday, May 28.

SQL GROUP BY with WHERE Clause and MAX Function

Now imagine you want to focus on and analyze one movie: The Bad Guys. In this scenario, you want to figure out how timing and price points impact a family’s choice to watch an animated film. For this query use the aggregate function MAX to retrieve the maximum ticket_cost, making sure to include AS to create the alias for the price_data column. After, use the WHERE clause to narrow down the results by only “The Bad Guys”, and use AND to also determine the most popular movie times based on guest_total numbers that were more than 100 with the comparison operator >. Then complete the query with the GROUP BY statement and group it by time:

  1. SELECT time, MAX(ticket_cost) AS price_data
  2. FROM movie_theater
  3. WHERE movie_name = "The Bad Guys"
  4. AND guest_total > 100
  5. GROUP BY time;
Output
+----------+------------+ | time | price_data | +----------+------------+ | 09:00:00 | 8.00 | | 05:00:00 | 13.00 | +----------+------------+ 2 rows in set (0.00 sec)

According to this output, more guests attended The Bad Guys movie at the early matinee time of 9:00 am, which had the more affordable price point of $8.00 per ticket. However, these results also show movie guests paid the higher ticket price of $13.00 at 5:00 pm, suggesting that families prefer showings that aren’t too late in the day and will pay a bit more for a ticket. This seems to be a fair assessment when compared to the 10:00 PM time when The Bad Guys movie only had 83 guests and the price per ticket was $18.00. This can be helpful information to provide the movie theater manager with evidence that opening more matinee and early evening time slots can increase the attendance for families that are making a choice based on a preferred time and price point.

GROUP BY vs. DISTINCT

Even though GROUP BY is almost always used with an aggregate function, there can be exceptions, although unlikely. However, if you did want to group your results without an aggregate function, you can use the DISTINCT statement to achieve the same result. A DISTINCT clause removes any duplicates in a result set by returning the unique values in the column, and it can only be used with a SELECT statement. For example, if you wanted to group all the movies together by name, you could do so with the following query:

  1. SELECT DISTINCT movie_name FROM movie_theater;
Output
+-------------------------+ | movie_name | +-------------------------+ | Top Gun Maverick | | Downton Abbey A New Era | | Men | | The Bad Guys | +-------------------------+ 4 rows in set (0.00 sec)

As you recall from viewing all the data in the table, there were duplicates of the movie names since there were multiple showings. Therefore, DISTINCT removed those duplicates and effectively grouped the unique values under the single column movie_name. This is effectively identical to the following query, which includes a GROUP BY statement:

  1. SELECT movie_name FROM movie_theater GROUP BY movie_name;

Now that you’ve practiced using GROUP BY with aggregate functions, next you’ll learn how to sort your query results with the ORDER BY statement.

USING ORDER BY

The function of the ORDER BY statement is to sort results in ascending or descending order based on the column(s) you specify in the query. Depending on the data type stored by the column you specify after it, ORDER BY will organize them in alphabetical or numerical order. By default, ORDER BY will sort results in ascending order; if you prefer descending order, however, you have to include the keyword DESC in your query. You can also use the ORDER BY statement with GROUP BY, but it must come after in order to function properly. Similar to GROUP BY, ORDER BY must also come after the FROM statement and WHERE clause. The general syntax for using ORDER BY is as follows:

ORDER BY syntax
SELECT column_1, column_2 FROM table ORDER BY column_1;

Let’s continue with the sample data for the movie theater and practice sorting results with ORDER BY. Begin with the following query which retrieves values from the guest_total column and sorts those numberical values using the ORDER BY clause:

  1. SELECT guest_total FROM movie_theater
  2. ORDER BY guest_total;
Output
+-------------+ | guest_total | +-------------+ | 25 | | 83 | | 88 | | 90 | | 100 | | 112 | | 118 | | 130 | | 131 | | 137 | | 142 | | 150 | +-------------+ 12 rows in set (0.00 sec)

Since your query specified a column with numerical values, the ORDER BY statement organized the results by numerical and ascending order, starting with 25 under the guest_total column.

If you preferred to order the column in descending order, you would add the DESC keyword at the end of the query. Additionally, if you wanted to order the data by the character values under movie_name, you would specify that in your query. Let’s perform that type of query using ORDER BY to order the movie_name column with character values in descending order. Sort the results even further by including a WHERE clause to retrieve the data on movies showing at 10:00 pm from the time column:

  1. SELECT movie_name FROM movie_theater
  2. WHERE time = '10:00:00'
  3. ORDER BY movie_name DESC;
Output
+-------------------------+ | movie_name | +-------------------------+ | Top Gun Maverick | | The Bad Guys | | Men | | Downton Abbey A New Era | +-------------------------+ 4 rows in set (0.01 sec)

This result set lists the four different movie showings at 10:00 pm in descending alphabetical order, starting from Top Gun Maverick to Downton Abbey A New Era.

Combining GROUP BY with ORDER BY

For this next query, combine the ORDER BY and GROUP BY statements with the aggregate function SUM to generate results on the total revenue received for each movie. However, let’s say the movie theater miscounted the total guests and forgot to include special parties that had pre-purchased and reserved tickets for a group of 12 people at each showing.

In this query use SUM and include the additional 12 guests at each movie showing by implementing the operator for addition + and then adding 12 to the guest_total. Make sure to enclose this in parenthesis. Then, multiply this total by the ticket_cost with the operator *, and complete the mathematical equation by closing the parenthesis at the end. Add the AS clause to create the alias for the new column titled total_revenue. Then, use GROUP BY to group total_revenue results for each movie based on the data retrieved from the movie_name column. Lastly, use ORDER BY to organize the results under the new column total_revenue in ascending order:

  1. SELECT movie_name, SUM((guest_total + 12) * ticket_cost)
  2. AS total_revenue
  3. FROM movie_theater
  4. GROUP BY movie_name
  5. ORDER BY total_revenue;
Output
+-------------------------+---------------+ | movie_name | total_revenue | +-------------------------+---------------+ | Men | 3612.00 | | Downton Abbey A New Era | 4718.00 | | The Bad Guys | 4788.00 | | Top Gun Maverick | 5672.00 | +-------------------------+---------------+ 4 rows in set (0.00 sec)

This result set tells us the total revenue for each movie with the additional 12 guest ticket sales and organizes the total ticket sales in ascending order from lowest to highest. From this, we learn that Top Gun Maverick received the most ticket sales, while Men received the least. Meanwhile, The Bad Guys and Downton Abbey A New Era movies were very close in total ticket sales.

In this section, you practiced various ways to implement the ORDER BY statement and how to specify the order you prefer, such as ascending and descending orders for both character and numerical data values. You also learned how to include the WHERE clause to narrow down your results, and performed a query using both the GROUP BY and ORDER BY statements with an aggregate function and mathematical equation.

When to Use ORDER BY vs. GROUP BY in SQL

While both GROUP BY and ORDER BY help organize your query results, they serve different purposes and are used in different scenarios. Knowing when to use each can help you write more efficient and effective SQL queries, especially when analyzing large datasets.

Use GROUP BY when you want to summarize or aggregate your data. This is helpful when you want to count, sum, average, or otherwise perform a calculation on a group of rows that share a common value. For example, if you’re evaluating movie attendance and want to know the average number of guests per movie genre, GROUP BY is the right tool.

SELECT movie_genre, AVG(guest_total) AS average_guests
FROM movie_theater
GROUP BY movie_genre;

In this case, the query groups all the rows by movie_genre and then calculates the average guest count for each group.

On the other hand, use ORDER BY when you want to sort the final results of your query. This is useful when you’re displaying data and want to organize it in a specific sequence: alphabetically, numerically, by date, or by custom priority. If you want to see which movie had the most guests, you’d use ORDER BY to sort your results by guest_total in descending order.

SELECT movie_name, guest_total
FROM movie_theater
ORDER BY guest_total DESC;

Here, ORDER BY doesn’t group the data, it simply sorts the rows based on the values in the guest_total column.

In many cases, you’ll use both together. For example, you might group data to calculate totals or averages, and then sort those grouped results to identify trends:

SELECT movie_name, SUM(guest_total * ticket_cost) AS total_revenue
FROM movie_theater
GROUP BY movie_name
ORDER BY total_revenue DESC;

This combined query gives you total revenue per movie and organizes it from highest to lowest, offering both analysis and clear presentation.

In summary:

  • Use GROUP BY to calculate and analyze grouped data.
  • Use ORDER BY to organize query results for readability or reporting.
  • Use both when you need to aggregate and then sort the results.

Understanding when to use each statement will help you extract insights more efficiently and present your data more clearly.

Combining GROUP BY with HAVING

While the WHERE clause filters rows before grouping occurs, the HAVING clause filters grouped records after the GROUP BY operation. This distinction is important when working with aggregate functions. You’ll use HAVING to apply conditions to summarized data, such as totals or averages, after grouping your results.

For example, say you want to find out which movie genres attracted an average of more than 100 guests per showing. Since you’re working with grouped results using an aggregate function (AVG()), you need to use the HAVING clause rather than WHERE.

Here’s how you can write this query:

SELECT movie_genre, AVG(guest_total) AS avg_guests
FROM movie_theater
GROUP BY movie_genre
HAVING AVG(guest_total) > 100;
+-------------+-------------+
| movie_genre | avg_guests |
+-------------+-------------+
| Action      | 131.0000    |
| Drama       | 115.0000    |
| Animation   | 118.3333    |
+-------------+-------------+
3 rows in set (0.00 sec)

In this example, the query groups the data by movie_genre, calculates the average number of guests using the AVG() function, and then filters the grouped results to only include genres where the average exceeds 100. This allows you to focus on genres that performed well in terms of audience size.

Now you might be wondering when to use HAVING? Use HAVING when:

  • You’re filtering results based on aggregate values like SUM(), AVG(), COUNT(), MAX(), or MIN().
  • You need to apply conditions after the rows have been grouped.

Let’s look at another example. Say you want to know which movies were shown more than twice. You’d use the COUNT() function to tally the number of showings, group the results by movie name, and then filter the groups using HAVING:

SELECT movie_name, COUNT(*) AS total_showings
FROM movie_theater
GROUP BY movie_name
HAVING COUNT(*) > 2;
+-------------------------+----------------+
| movie_name              | total_showings |
+-------------------------+----------------+
| Top Gun Maverick        |              3 |
| Downton Abbey A New Era |              3 |
| Men                     |              3 |
| The Bad Guys            |              3 |
+-------------------------+----------------+
4 rows in set (0.00 sec)

Although all movies in the sample dataset were shown three times, this query gives you the flexibility to filter based on show count, especially useful in larger datasets where some movies may only have one or two showings.

Using HAVING in combination with GROUP BY gives you more control and insight over aggregated data, helping you refine your reports, dashboards, and data-driven decisions.

Common Errors and Debugging

While working with GROUP BY and ORDER BY in SQL, you may occasionally run into errors or unexpected results. Understanding how and why these issues occur can save you time and help you write more accurate and efficient queries. Let’s review some of the most common issues.

Misuse of GROUP BY with Non-Aggregated Columns

One of the most frequent mistakes developers encounter when using GROUP BY is attempting to include columns in the SELECT clause that are neither part of the GROUP BY clause nor aggregated using an aggregate function. Most relational database management systems (RDBMS), such as MySQL (in strict mode), PostgreSQL, and SQL Server, will return an error when this occurs.

Example:

Let’s say you attempt to write the following query to get the average number of guests per movie genre, but also include the movie_name column:

SELECT movie_genre, movie_name, AVG(guest_total)
FROM movie_theater
GROUP BY movie_genre;

Error (in many SQL engines):

ERROR 1055 (42000): 'movie_theater.movie_name' isn't in GROUP BY

This error occurs because movie_name is not part of the GROUP BY clause and isn’t being aggregated. SQL doesn’t know how to handle multiple movie names within the same genre group unless you tell it exactly what to do with them, such as applying an aggregate function like MIN(), MAX(), or GROUP_CONCAT().

Correct Usage:

To fix the error, either:

  • Add movie_name to the GROUP BY clause (which would create a group for each unique movie_genre and movie_name combination), or
  • Remove movie_name from the SELECT clause, since it doesn’t make sense without aggregation in this case.

Corrected query:

SELECT movie_genre, AVG(guest_total) AS avg_guests
FROM movie_theater
GROUP BY movie_genre;

This query successfully groups data by genre and returns the average number of guests per genre, as expected.

+-------------+------------+
| movie_genre | avg_guests |
+-------------+------------+
| Action      |   131.0000 |
| Drama       |   115.0000 |
| Horror      |    71.0000 |
| Animation   |   118.3333 |
+-------------+------------+
4 rows in set (0.00 sec)

Incorrect Sorting Order Using ORDER BY

Another common issue arises when sorting results using ORDER BY, especially if the intent of the sort isn’t clearly defined. Some mistakes include:

  • Sorting by columns that are ambiguous or do not exist in the result set.
  • Forgetting to specify ASC or DESC when needed for clarity.
  • Attempting to use ORDER BY before GROUP BY or in the wrong order of SQL operations.

Example:

Consider the following query:

SELECT movie_name, SUM(guest_total) AS total_guests
FROM movie_theater
GROUP BY movie_name
ORDER BY guest_total;

This query may return an error or unexpected results because guest_total is not in the SELECT list; it has been aggregated as SUM(guest_total) and aliased as total_guests.

Correct Usage:

To resolve this, make sure you’re ordering by either:

  • The correct aggregate expression (SUM(guest_total)), or
  • The alias you provided (total_guests).

Corrected query:

SELECT movie_name, SUM(guest_total) AS total_guests
FROM movie_theater
GROUP BY movie_name
ORDER BY total_guests DESC;

This version of the query sorts the movies by their total number of guests in descending order, which is likely what you intended.

Avoiding these common pitfalls will make your queries more robust, maintainable, and error-free—especially when working with large and complex datasets.

Using Aggregate Functions Without GROUP BY

You cannot mix aggregate functions and non-aggregated columns in a SELECT clause unless you specify how to group the data. If SQL doesn’t know how to aggregate the non-aggregated column, it raises an error.

Incorrect Example:

SELECT movie_genre, SUM(guest_total)
FROM movie_theater;

Error (MySQL):

ERROR 1140 (42000): In aggregated query without GROUP BY, expression #1 of SELECT list contains nonaggregated column 'movieDB.movie_theater.movie_genre'; this is incompatible with sql_mode=only_full_group_by

Here, SQL doesn’t know how to summarize multiple movie_genre values without grouping them.

Corrected Example:

SELECT movie_genre, SUM(guest_total) AS total_guests
FROM movie_theater
GROUP BY movie_genre;

Output:

+-------------+--------------+
| movie_genre | total_guests |
+-------------+--------------+
| Action      |          393 |
| Drama       |          345 |
| Horror      |          213 |
| Animation   |          355 |
+-------------+--------------+
4 rows in set (0.00 sec)

This groups guest totals by genre, so every movie_genre appears only once with its total guest count.

Misusing HAVING Without GROUP BY or Aggregates

HAVING is designed for filtering aggregated results, not raw rows. Using it without GROUP BY or an aggregate function confuses the SQL engine and results in an error.

Incorrect Example:

SELECT *
FROM movie_theater
HAVING movie_genre = 'Drama';

Error (pgsql):

ERROR:  column "movie_theater.theater_id" must appear in the GROUP BY clause or be used in an aggregate function

To filter ungrouped rows, use WHERE:

SELECT *
FROM movie_theater
WHERE movie_genre = 'Drama';

And if you want to filter grouped results:

SELECT movie_genre, AVG(guest_total) AS avg_guests
FROM movie_theater
GROUP BY movie_genre
HAVING avg_guests > 100;

This filters out genres whose average guest count is below 100.

Placing ORDER BY Before GROUP BY

The sequence of SQL clauses matters. ORDER BY must always come after GROUP BY. If reversed, the SQL engine will throw a syntax error or ignore the intended behavior.

Incorrect Clause Order:

SELECT movie_genre, AVG(guest_total)
FROM movie_theater
ORDER BY movie_genre
GROUP BY movie_genre;

Error:

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'movie_theater order by movie_genre group by movie_genre' at line 1

Correct Clause Order:

SELECT movie_genre, AVG(guest_total) AS avg_guests
FROM movie_theater
GROUP BY movie_genre
ORDER BY avg_guests DESC;

This groups first, then sorts the aggregated results.

Frequently Asked Questions (FAQs)

1. What is the difference between GROUP BY and ORDER BY in SQL?

GROUP BY and ORDER BY serve different purposes in SQL:

  • GROUP BY groups rows that have the same values in specified columns. It’s usually used with aggregate functions like SUM(), AVG(), COUNT(), etc., to perform calculations on grouped data.
  • ORDER BY is used to sort the result set in ascending (ASC) or descending (DESC) order based on one or more columns.

Example:

-- GROUP BY groups data by genre and calculates average attendance
SELECT movie_genre, AVG(guest_total) AS average_attendance
FROM movie_theater
GROUP BY movie_genre;

-- ORDER BY sorts the data by average attendance in descending order
SELECT movie_genre, AVG(guest_total) AS average_attendance
FROM movie_theater
GROUP BY movie_genre
ORDER BY average_attendance DESC;

2. Can you use GROUP BY and ORDER BY together in SQL?

Yes, you can use both GROUP BY and ORDER BY in the same query. The GROUP BY clause groups the data, and the ORDER BY clause sorts the grouped result set.

Example:

SELECT movie_name, SUM(guest_total * ticket_cost) AS total_revenue
FROM movie_theater
GROUP BY movie_name
ORDER BY total_revenue DESC;

This query groups data by movie_name, calculates total_revenue, and then sorts the results from highest to lowest revenue.

3. Does GROUP BY require an aggregate function in SQL?

Almost always. The primary purpose of GROUP BY is to perform calculations on groups of data using aggregate functions. As the article notes, if you only want to get a list of unique values without any aggregation, it is often clearer and more efficient to use SELECT DISTINCT.

4. What is the default sorting order of ORDER BY in SQL?

The default sorting order for ORDER BY is ascending (ASC). You can specify descending order explicitly using DESC.

Examples:

-- Ascending order (default)
SELECT guest_total
FROM movie_theater
ORDER BY guest_total;

-- Descending order
SELECT guest_total
FROM movie_theater
ORDER BY guest_total DESC;

These queries sort guest counts in ascending and descending order, respectively.

5. How do you group by multiple columns in SQL?

To group by more than one column, list each column in the GROUP BY clause separated by commas. This allows you to create subgroup aggregations.

Example:

SELECT movie_genre, date, COUNT(*) AS showings
FROM movie_theater
GROUP BY movie_genre, date
ORDER BY date, movie_genre;

This query counts how many showings of each genre occurred on each date.

6. What is the difference between GROUP BY and DISTINCT in SQL?

  • GROUP BY groups rows and is often used with aggregate functions to compute metrics per group.
  • DISTINCT eliminates duplicate rows in the result set and cannot perform aggregations.

Example using DISTINCT:

SELECT DISTINCT movie_name
FROM movie_theater;

Equivalent using GROUP BY:

SELECT movie_name
FROM movie_theater
GROUP BY movie_name;

For aggregations like total revenue or average guest count, use GROUP BY. Use DISTINCT only when you want unique records without summarization.

Conclusion

In this article, you’ve learned the essential SQL skills of sorting and aggregation. You learned to use ORDER BY to arrange results and GROUP BY with aggregate functions such as SUM(), AVG(), and COUNT() to summarize data into meaningful groups. We covered the critical distinction between filtering rows with WHERE before grouping and filtering groups with HAVING after aggregation. By combining these clauses, you can now build powerful, sophisticated queries to transform raw data into clear reports.

To continue developing your skills, consider exploring the following tutorials:

Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.

Learn more about our products

Tutorial Series: How To Use SQL

Series Description

Structured Query Language — commonly known as SQL — is a language used to define, control, manipulate, and query data held in a relational database. SQL has been widely adopted since it was first developed in the 1970s, and today it’s the predominant language used to manage relational database management systems.

Ideal for managing structured data (data that can fit neatly into an existing data model), SQL is an essential tool for developers and system administrators in a wide variety of contexts. Also, because of its maturity and prevalence, candidates with SQL experience are highly sought after for jobs across a number of industries.

This series is intended to help you get started with using SQL. It includes a mix of conceptual articles and tutorials which provide introductions to various SQL concepts and practices. You can also use the entries in this series for reference while you continue to hone your skills with SQL.

Note: Please be aware that the tutorials in this series use MySQL in examples, but many RDBMSs use their own unique implementations of SQL. Although the commands outlined in this tutorial will work on most RDBMSs, the exact syntax or output may differ if you test them on a system other than MySQL.

About the author(s)

Jeanelle Horcasitas
Jeanelle Horcasitas
Author
Technical Writer
See author profile

Educator and writer committed to empowering our community by providing access to the knowledge and tools for making creative ideas into a reality

Manikandan Kurup
Manikandan Kurup
Editor
Senior Technical Content Engineer I
See author profile

With over 6 years of experience in tech publishing, Mani has edited and published more than 75 books covering a wide range of data science topics. Known for his strong attention to detail and technical knowledge, Mani specializes in creating clear, concise, and easy-to-understand content tailored for developers.

Category:
Tags:

Still looking for an answer?

Was this helpful?


This textbox defaults to using Markdown to format your answer.

You can type !ref in this text area to quickly search our full set of tutorials, documentation & marketplace offerings and insert the link!

Creative CommonsThis work is licensed under a Creative Commons Attribution-NonCommercial- ShareAlike 4.0 International License.
Join the Tech Talk
Success! Thank you! Please check your email for further details.

Please complete your information!

The developer cloud

Scale up as you grow — whether you're running one virtual machine or ten thousand.

Get started for free

Sign up and get $200 in credit for your first 60 days with DigitalOcean.*

*This promotional offer applies to new accounts only.