Oracle

April 27, 2011

Bloom Filters

Filed under: Uncategorized — srivenu @ 9:52 am
----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name            | Rows  |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                 | 86185 |        |      |            |
|   1 |  PX COORDINATOR               |                 |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)         | :TQ10004        | 86185 |  Q1,04 | P->S | QC (RAND)  |
|   3 |    HASH GROUP BY              |                 | 86185 |  Q1,04 | PCWP |            |
|   4 |     PX RECEIVE                |                 | 86185 |  Q1,04 | PCWP |            |
|   5 |      PX SEND HASH             | :TQ10003        | 86185 |  Q1,03 | P->P | HASH       |
|   6 |       HASH GROUP BY           |                 | 86185 |  Q1,03 | PCWP |            |
|*  7 |        HASH JOIN              |                 |  1145K|  Q1,03 | PCWP |            |
|   8 |         PX RECEIVE            |                 |    22 |  Q1,03 | PCWP |            |
|   9 |          PX SEND BROADCAST    | :TQ10000        |    22 |  Q1,00 | P->P | BROADCAST  |
|  10 |           PX BLOCK ITERATOR   |                 |    22 |  Q1,00 | PCWC |            |
|  11 |            TABLE ACCESS FULL  | A               |    22 |  Q1,00 | PCWP |            |
|* 12 |         HASH JOIN SEMI        |                 |  1353K|  Q1,03 | PCWP |            |
|  13 |          PX JOIN FILTER CREATE| :BF0000         |  1353K|  Q1,03 | PCWP |            |
|  14 |           PX RECEIVE          |                 |  1353K|  Q1,03 | PCWP |            |
|  15 |            PX SEND HASH       | :TQ10001        |  1353K|  Q1,01 | P->P | HASH       |
|  16 |             PX BLOCK ITERATOR |                 |  1353K|  Q1,01 | PCWC |            |
|* 17 |              TABLE ACCESS FULL| B               |  1353K|  Q1,01 | PCWP |            |
|  18 |          PX RECEIVE           |                 |   154M|  Q1,03 | PCWP |            |
|  19 |           PX SEND HASH        | :TQ10002        |   154M|  Q1,02 | P->P | HASH       |
|  20 |            PX JOIN FILTER USE | :BF0000         |   154M|  Q1,02 | PCWP |            |
|  21 |             PX BLOCK ITERATOR |                 |   154M|  Q1,02 | PCWC |            |
|* 22 |              TABLE ACCESS FULL| C               |   154M|  Q1,02 | PCWP |            |
----------------------------------------------------------------------------------------------

Looking at steps 13 & 20 in the above plan reminded me of a post i had written long time back – Hash Join

At that time, i could not get any information on Oracle’s implementation of the “bit vector” filtering mechanism. I assumed that it’s a sort of “Bloom filter”.

Later on I had come across this good note by Christian Antognini.

I think “Bloom Filter” and “Bit Vector” are the same except as to where they are applied.

Bit Vector is created by a process (or process in case of a parallel join) based on the input from the build table. This bit vector is then applied by the same process(processes) on the probe table rows before performing the actual join.

Bloom filter is also created the same way, by a process (processes) based on the input from the build table. But the key difference is that this Bloom Filter is then passed on to the process (or processes) scanning the probe table, which uses this filter to reject rows during the scan itself.

In normal Bit Vector filtering the rows are filtered after they are transferred between the slave processes and just before the join.
In the case of Bloom Filters the rows are discarded at an even earlier stage – during the scan of probe table data itself. This has the advantage of reducing the data transfer between the slave processes.

I need to design some test cases to verify my supposition that Bloom Filters & Bit Vectors are the same.

If case they are the same, then for a sql plan showing a Bloom filter operation, i should not see any rows filtered at the Bit Vector step (because the bloom filter should have filtered all the non-matching rows at an even earlier step)

Advertisements

Blog at WordPress.com.

%d bloggers like this: