Single Table Queries

Photo by Andrew Neel on Unsplash

Single Table Queries

Logical Query Processing

Quite recently, I started reading a book about T-SQL. It was written by Itzik Ben-Gan. I'm learning so much that I thought it'd be great to document my journey so here goes...


Elements of the SELECT statement

THE SELECT statement is used to query tables, perform some kind of logical manipulation and return a result. In this section, I'd like to highlight how standard SQL processes SQL statements to achieve a final result. Although some logical phases may be inefficient, the Microsoft SQL Server engine tries to optimise the whole process.

The sample query I'll be analysing in this article is the one Listing 1 below.

SELECT 
  order_id, 
  COUNT(product_id) AS numOfProducts, 
  SUM(list_price) AS sumOfListPrice
FROM sales.order_items
WHERE order_id = 3
GROUP BY order_id
HAVING COUNT(product_id) > 1
ORDER BY order_id;

Listing 1


This query filters order items that were placed in order 3; groups the order items by order (just order 3 in this case); and filters only groups of orders that have more than one order item. The query is far from perfect because the aim of this article is to highlight logical query processing -- the order in which the query clauses are logically processed.

It's interesting that although the SELECT statement appears at the top of the query, it's logically processed almost last. The clauses are logically processed in the following order:

  1. FROM
  2. WHERE
  3. GROUP BY
  4. HAVING
  5. SELECT
  6. ORDER BY

Thus, the query in Listing 1 is processed as follows

FROM sales.order_items
WHERE order_id = 3
GROUP BY order_id
HAVING COUNT(product_id) > 1
SELECT order_id, COUNT(product_id) AS numOfProducts, SUM(list_price) AS sumOfListPrice
ORDER BY order_id

What the query is actually doing:

  1. Queries from the sales.order_items table.
  2. Filters only order items where order_id is 3.
  3. Groups the order items by order_id.
  4. Filters only groups that have more than 1 product.
  5. Returns for each group, the order_id, numOfProducts, and sumOfListPrice.
  6. Lastly, for presentation purposes, it sorts the result by order_id.

Sadly, you can't write the query in the correct logical order but here's the idea behind the design. The SQL designers wanted to develop a declarative language where users can query data using English-like syntax. For example, let's scrutinise an instruction from one person to another.

Bring me the iPad from my room

Notice that the iPad was mentioned before it's location. If you were to give this same instruction to a robot, you'd have to mention the location of the iPad first before specifying an action for the robot. Your instruction would have been something like, Go to the my room; search for the iPad on my bed; grab the iPad; bring it to me.

Concretely, the order of the query clauses is similar to English however, the logical query processing order is similar to how you'd instruct a robot to perform some task.


Conclusion

In this article, I discussed the logical query processing of standard SQL. Additionally, the logical processing order of the clauses were mentioned, and finally, the reason behind the interesting design of the SELECT statement was briefly discussed.