Quite recently, I started reading a book about T-SQL. It was written by Itzik Ben-Gan. I'm learning so much that I thought it'd be great to document my journey so here goes...
Elements of the SELECT
statement
THE SELECT
statement is used to query tables, perform some kind of logical manipulation and return a result. In this section, I'd like to highlight how standard SQL processes SQL statements to achieve a final result.
Although some logical phases may be inefficient, the Microsoft SQL Server engine tries to optimise the whole process.
The sample query I'll be analysing in this article is the one Listing 1 below.
SELECT
order_id,
COUNT(product_id) AS numOfProducts,
SUM(list_price) AS sumOfListPrice
FROM sales.order_items
WHERE order_id = 3
GROUP BY order_id
HAVING COUNT(product_id) > 1
ORDER BY order_id;
Listing 1
This query filters order items that were placed in order 3; groups the order items by order (just order 3 in this case); and filters only groups of orders that have more than one order item. The query is far from perfect because the aim of this article is to highlight logical query processing -- the order in which the query clauses are logically processed.
It's interesting that although the SELECT
statement appears at the top of the query, it's logically processed almost last. The clauses are logically processed in the following order:
- FROM
- WHERE
- GROUP BY
- HAVING
- SELECT
- ORDER BY
Thus, the query in Listing 1 is processed as follows
FROM sales.order_items
WHERE order_id = 3
GROUP BY order_id
HAVING COUNT(product_id) > 1
SELECT order_id, COUNT(product_id) AS numOfProducts, SUM(list_price) AS sumOfListPrice
ORDER BY order_id
What the query is actually doing:
- Queries from the
sales.order_items table
. - Filters only order items where
order_id
is 3. - Groups the order items by
order_id
. - Filters only groups that have more than 1 product.
- Returns for each group, the
order_id
,numOfProducts
, andsumOfListPrice
. - Lastly, for presentation purposes, it sorts the result by
order_id
.
Sadly, you can't write the query in the correct logical order but here's the idea behind the design. The SQL designers wanted to develop a declarative language where users can query data using English-like syntax. For example, let's scrutinise an instruction from one person to another.
Bring me the iPad from my room
Notice that the iPad was mentioned before it's location. If you were to give this same instruction to a robot, you'd have to mention the location of the iPad first before specifying an action for the robot. Your instruction would have been something like, Go to the my room; search for the iPad on my bed; grab the iPad; bring it to me.
Concretely, the order of the query clauses is similar to English however, the logical query processing order is similar to how you'd instruct a robot to perform some task.
Conclusion
In this article, I discussed the logical query processing of standard SQL. Additionally, the logical processing order of the clauses were mentioned, and finally, the reason behind the interesting design of the SELECT
statement was briefly discussed.