In oracle 11gR2, oracle uses one of broadcast, hash, round robin, range or key distribution methods, to redistribute data among parallel query servers. If the cardinality estimation at a given step happened to be incorrect (eg: the estimate was about a 100 rows and in reality there was 100,000 rows), oracle could end up broadcasting a lot of rows to all the receiving parallel query servers. Another issue frequently encountered was that there could be skew in the popular values, resulting in hash distribution sending most of the values to one or a small subset of the receiving pq servers (Causing them to do most of the work).
To address such situations in 12cR1, as part of the new Adaptive Query Optimization techniques, oracle introduced the ability to switch the distribution method from hash to broadcast, if the actual number of rows was less than the threshold . In 12.1 the threshold was 2X degree of parallelism . This new functionality is called Adaptive Parallel Distribution Methods.
If you check the execution plan you will see that a “Statistics Collector” has been inserted before the send operation , and the send operation is called “PX Send Hybrid Hash”.
Let us look at an example of a simple hybrid hash distribution.
SQL_ID d1qmm37g288qz, child number 0
-------------------------------------
select /*+ full(emp) full(dept) parallel(4) */
dept.deptno,dept.dname,emp.empno,emp.ename from dept,emp where
dept.deptno = emp.deptno
Plan hash value: 717794318
------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | TQ |IN-OUT| PQ Distrib | OMem | 1Mem | O/1/M |
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 4 (100)| | | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | 73728 | 73728 | |
| 2 | PX SEND QC (RANDOM) | :TQ10002 | 14 | 770 | 4 (0)| 00:00:01 | Q1,02 | P->S | QC (RAND) | | | |
|* 3 | HASH JOIN BUFFERED | | 14 | 770 | 4 (0)| 00:00:01 | Q1,02 | PCWP | | 3105K| 2823K| 4/0/0|
| 4 | PX RECEIVE | | 4 | 88 | 2 (0)| 00:00:01 | Q1,02 | PCWP | | | | |
| 5 | PX SEND HYBRID HASH | :TQ10000 | 4 | 88 | 2 (0)| 00:00:01 | Q1,00 | P->P | HYBRID HASH| | | |
| 6 | STATISTICS COLLECTOR | | | | | | Q1,00 | PCWC | | | | |
| 7 | PX BLOCK ITERATOR | | 4 | 88 | 2 (0)| 00:00:01 | Q1,00 | PCWC | | | | |
|* 8 | TABLE ACCESS FULL | DEPT | 4 | 88 | 2 (0)| 00:00:01 | Q1,00 | PCWP | | | | |
| 9 | PX RECEIVE | | 14 | 462 | 2 (0)| 00:00:01 | Q1,02 | PCWP | | | | |
| 10 | PX SEND HYBRID HASH | :TQ10001 | 14 | 462 | 2 (0)| 00:00:01 | Q1,01 | P->P | HYBRID HASH| | | |
| 11 | PX BLOCK ITERATOR | | 14 | 462 | 2 (0)| 00:00:01 | Q1,01 | PCWC | | | | |
|* 12 | TABLE ACCESS FULL | EMP | 14 | 462 | 2 (0)| 00:00:01 | Q1,01 | PCWP | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------
In the execution plan you can see that in steps 5 and 10 the operation is “PX SEND HYBRID HASH”. During execution, the statistics collector in step 6 buffers the rows being retrieved from DEPT. If the number of rows are below a threshold, then all rows from DEPT are broadcast to each of the PQ servers that perform the join. If the number of rows exceed a threshold then the rows from DEPT are send using a HASH distribution to the PQ servers that perform the join.
Immediately after the query above is executed, from the same session, you can execute the script pqstat to see how the rows were distributed among the producers and consumers.
SQL> @pqstat
DFO_NUMBER TQ_ID SERVER_TYPE PROCESS NUM_ROWS BYTES WAITS TIMEOUTS AVG_LATENCY
---------- ---------- ---------------------------------------- ------------------------ ---------- ---------- ---------- ---------- -----------
1 0 Producer P004 0 96 0 0 0
P005 16 324 0 0 0
P006 0 96 0 0 0
P007 0 96 0 0 0
Consumer P000 4 153 125 119 0
P001 4 153 127 119 0
P002 4 153 127 119 0
P003 4 153 125 119 0
2 Producer P000 0 24 0 0 0
P001 14 375 1 0 0
P002 0 24 0 0 0
P003 0 24 0 0 0
Consumer QC 14 447 2 0 0
Now let us take a look at a variation of the hybrid hash distribution operation , which is denoted as “PX SEND HYBRID HASH (SKEW)”
SQL_ID 20fky17tasvn9, child number 0
-------------------------------------
SELECT /*+ parallel(4) full(e) full(d) gather_plan_statistics */
department_name, sum(salary) FROM employees e, departments d
WHERE d.department_id=e.department_id GROUP BY department_name
Plan hash value: 2940813933
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | TQ |IN-OUT| PQ Distrib | A-Rows | A-Time | Buffers | OMem | 1Mem | O/1/M |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 5 (100)| | | | | 11 |00:00:00.04 | 24 | | | |
| 1 | PX COORDINATOR | | 1 | | | | | | | | 11 |00:00:00.04 | 24 | 73728 | 73728 | |
| 2 | PX SEND QC (RANDOM) | :TQ10003 | 0 | 27 | 621 | 5 (20)| 00:00:01 | Q1,03 | P->S | QC (RAND) | 0 |00:00:00.01 | 0 | | | |
| 3 | HASH GROUP BY | | 4 | 27 | 621 | 5 (20)| 00:00:01 | Q1,03 | PCWP | | 11 |00:00:00.01 | 0 | 1071K| 1071K| 4/0/0|
| 4 | PX RECEIVE | | 4 | 27 | 621 | 5 (20)| 00:00:01 | Q1,03 | PCWP | | 12 |00:00:00.01 | 0 | | | |
| 5 | PX SEND HASH | :TQ10002 | 0 | 27 | 621 | 5 (20)| 00:00:01 | Q1,02 | P->P | HASH | 0 |00:00:00.01 | 0 | | | |
| 6 | HASH GROUP BY | | 4 | 27 | 621 | 5 (20)| 00:00:01 | Q1,02 | PCWP | | 12 |00:00:00.03 | 18 | 1063K| 1063K| 4/0/0|
|* 7 | HASH JOIN | | 4 | 106 | 2438 | 4 (0)| 00:00:01 | Q1,02 | PCWP | | 106 |00:00:00.03 | 18 | 1572K| 1572K| 4/0/0|
| 8 | PX RECEIVE | | 4 | 27 | 432 | 2 (0)| 00:00:01 | Q1,02 | PCWP | | 108 |00:00:00.02 | 0 | | | |
| 9 | PX SEND HYBRID HASH | :TQ10000 | 0 | 27 | 432 | 2 (0)| 00:00:01 | Q1,00 | P->P | HYBRID HASH| 0 |00:00:00.01 | 0 | | | |
| 10 | STATISTICS COLLECTOR | | 4 | | | | | Q1,00 | PCWC | | 27 |00:00:00.01 | 9 | | | |
| 11 | PX BLOCK ITERATOR | | 4 | 27 | 432 | 2 (0)| 00:00:01 | Q1,00 | PCWC | | 27 |00:00:00.01 | 9 | | | |
|* 12 | TABLE ACCESS FULL | DEPARTMENTS | 1 | 27 | 432 | 2 (0)| 00:00:01 | Q1,00 | PCWP | | 27 |00:00:00.01 | 9 | | | |
| 13 | PX RECEIVE | | 4 | 107 | 749 | 2 (0)| 00:00:01 | Q1,02 | PCWP | | 107 |00:00:00.01 | 18 | | | |
| 14 | PX SEND HYBRID HASH (SKEW)| :TQ10001 | 4 | 107 | 749 | 2 (0)| 00:00:01 | Q1,01 | P->P | HYBRID HASH| 107 |00:00:00.01 | 18 | | | |
| 15 | PX BLOCK ITERATOR | | 4 | 107 | 749 | 2 (0)| 00:00:01 | Q1,01 | PCWC | | 107 |00:00:00.01 | 18 | | | |
|* 16 | TABLE ACCESS FULL | EMPLOYEES | 2 | 107 | 749 | 2 (0)| 00:00:01 | Q1,01 | PCWP | | 107 |00:00:00.01 | 18 | | | |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
In the execution plans you can see the steps “PX SEND HYBRID HASH” (Step 9) and “PX SEND HYBRID HASH (SKEW)” (Step 14).
“PX SEND HYBRID HASH (SKEW)” is a technique used by 12c parallel query to avoid hash distribution of skewed popular key values. During optimization of the query, if histograms are present on the join expression, oracle finds and keeps the popular values in the cursor. At execution time , when the build table is being read, all values with matches in the popular values, are BROADCAST to all receiving pq serves and non matching values are distributed by HASH. When the probe table is being read all values with matches in the popular values are send ROUND ROBIN/RANDOM to the receiving pq serves and the non matching values are distributed by HASH.
Please read Randolf Geist’s excellent Articles Linked below for indepth info on this topic.
Parallel Execution Skew – 12c Hybrid Hash Distribution with Skew Detection
12c Hybrid Hash Distribution with Skew Detection / Handling.