Hash is used in 2 most commonly seen physical operators, Hash Join and Hash Aggregate. Those 2 happen when there are no any other alternatives (merge, nested loop, or stream aggregate) which can be used for more efficient operation. For instance, when SQL Server joins 2 tables together but none of them has an index. SQL Server has no idea whether the joining keys are sorted or not. In most of the case for such scenarios, hash join will take place. As its name, hash join uses hash algorithm to encode the joining keys from both side, compares the hashed values, and produce the result. This sounds very complex — yes, it is a very heavy operator.
Few steps are taken when while hash join of 2 sets taking place
- Build a hash list(or table) from joining keys from a smaller set of two. SQL Server will use statistics on the joining keys of the set to determine number of rows being returned. If there isn’t any stats associated, SQL Server will build it automatically before generating the query plans.This is the first step of the hash join, which called Hash Keys Build.
- After the hash table is built, the rows in larger set will be retrieved one by one. Hash of joining keys from each row will be checked against the hash list (table) from the smaller set. This is called Hash Keys Probe
- Comparing the hashes, and generate output
No matter what, the hash of the joining keys from both side will always be calculated. This mean that it requires CPU cycles. Hash Keys Probe has to run after Hash Key Build completed entirely. Hash Keys Build will be saved in memory. While generating query plan, SQL Server will try to get sufficient memory to hold hash list based on statistics. When the stats is accurate and memory is enough for the query, this process will be running in the memory. The perfromance will not be too bad.
However, when the hash table is too big to fit into memory. The hash list will be partitioned by the hash keys. Partitioned lists will be saved in both memory and tempdb locations. When this happen, an Hash Warning will be showing in SQL Server Profiler. While hash lists are building, the incoming row will go to memroy hash list if the hash is in that bucket, otherwise, it will go to tempdb. Now both lists in memory and/or tempdb keep growing. It’s possible that the memory for the hash table in memory gets exhaused again. When this happens, SQL Server will have to repartition this part.
Regardless number of partitions generated, they are all in memory or disk. After build is completed, probing starts. If the hash is in the memory partitions, output will be generated and sent, the probing row will be discarded. But when the probing row does not match the rows in memory hash table, the hash of this row will be saved in tempdb without producing any result . aha, tempdb now saves 2 hash tables in this case, one is from build phase and the other one is from probing phase.
After all records in the probing table processed against the hash table in memory, the hash table in memory will be unloaded. The hash table in tempdb will be loaded to memory partition by partitioin. Then the probing process will use the same approach with the probing hash in tempdb to generate the result.
Hue, it’s a complex expensive process. What’s the cost formula on this? I don’t think I can use my previous way to test it out easily. But I have some document I downloaded before regarding this, Accurate Modelling of The Hybrid Hash Join Algorithm.
SQL Server sometimes uses Sort operator to make un-indexed joining keys ordered then perform merge to avoid hash joins. Avoiding using hash join in your query is the key for performance in most of the cases but not always. If you see this operator in your query plan, try to create/adjust indexes to remove this operator. If hash join is inevitable, try to reduce number of rows in the hash list.
2 thoughts on “Query Plan (10) – Hash”
Could you explain what is the difference between Hash bailout and Hash recursion?
Partitioning hash table is based on hash function. Whenever hash table is running out of memory, some partitions will be written to tempdb. This is grace hash join. After this happening, the part of partitioned hashed table resides in memory will still be possible running out of memory. When this happen, the memory part of hash table will be partially save in tempdb again based on hash function like grace hash join. If sql server keeps spilling hash table to tempdb, in ternally, it really performs a recusion of “grace join” — this is hash recursion.
If recursion repeats until it reaches its maximum recursion level, then bailout – stop current plan and start another one to process the hash table.