To illustrate this in a 'real world' situation say you're in a restaurant and the service appears to be slow; you think 'if I had three waitresses instead of one I'd get my food faster', but let's look at that from a different point of view. Three waitresses for one table means that all three waitresses need to communicate with each other to avoid repeating work done by the others (this, of course, takes time). Each waitress needs to be assigned specific duties but also must be available to take over for another if something happens (one broke her toe, for instance); such an occurrence requires reassignment of duties and adjusting the schedule to accomodate the change. Eventually you get your order but it will take MORE time than if a single waitress performed all of the necessary tasks.
Parallel processing does more than simply 'divide and conquer' as it requires several steps most developers may not know about or may ignore entirely. DSS and DW systems, running with large numbers of CPUs, can benefit from parallel processing as the load can be distributed among the CPUs reducing the load on a single processor. OLTP systems, on the other hand, usually involve operations which are quick to begin with and the overhead of implementing parallel processing is quite large compared to the overall execution time; additionally it may take longer to complete the parallel execution than it would to properly tune the query for single-threaded processing and eliminate the parallelism altogether.
So what does parallel processing bring to the table? Obviously the
SQL> select id, txtval, status
2 from para_tst pt
3 where id between 9001 and 34001;
ID TXTVAL STATUS
---------- ------------------------------ -------
9389 ALL_SCHEDULER_RUNNING_JOBS VALID
9390 USER_SCHEDULER_RUNNING_JOBS VALID
9391 USER_SCHEDULER_RUNNING_JOBS VALID
[...]
20772 /130d52e2_JDK2Sorter VALID
20773 /130d52e2_JDK2Sorter VALID
20774 /4c28cb16_ToolLogOptions VALID
20775 /4c28cb16_ToolLogOptions VALID
20776 /cbd9a55f_AbortException VALID
20777 /cbd9a55f_AbortException VALID
591960 rows selected.
Elapsed: 00:02:03.68
Execution Plan
----------------------------------------------------------
Plan hash value: 350193380
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 593K| 20M| 1698 (2)| 00:00:21 | | |
| 1 | PARTITION RANGE ITERATOR| | 593K| 20M| 1698 (2)| 00:00:21 | 19 | 21 |
|* 2 | TABLE ACCESS FULL | PARA_TST | 593K| 20M| 1698 (2)| 00:00:21 | 19 | 21 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("ID"<=34001)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
45396 consistent gets
156 physical reads
0 redo size
21517854 bytes sent via SQL*Net to client
434616 bytes received via SQL*Net from client
39465 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
591960 rows processed
SQL>
SQL> select /*+ parallel(pt 2) pq_distribute(pt partition) */
2 id, txtval, status
3 from para_tst pt
4 where id between 9001 and 34001;
ID TXTVAL STATUS
---------- ------------------------------- -------
18404 /79bc64f9_JvmMemoryMeta VALID
18405 /b80019c2_EnumJvmThreadContent VALID
18406 /b80019c2_EnumJvmThreadContent VALID
18407 /bcfa29b5_EnumJvmThreadCpuTime VALID
18408 /bcfa29b5_EnumJvmThreadCpuTime VALID
17997 /b3bd73eb_CommonClassObject VALID
17998 /b3bd73eb_CommonClassObject VALID
17999 /c5a69e17_ServerSchemaObject1 VALID
[...]
20724 /4bd3ef8d_KnownOptions4 VALID
20725 /4bd3ef8d_KnownOptions4 VALID
20726 /75d2b0ba_KnownOptions5 VALID
20727 /75d2b0ba_KnownOptions5 VALID
20728 /75e9b2d4_KnownOptions VALID
20729 /75e9b2d4_KnownOptions VALID
591960 rows selected.
Elapsed: 00:02:23.59
Execution Plan
----------------------------------------------------------
Plan hash value: 1393746857
-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distr |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 593K| 20M| 942 (1)| 00:00:12 | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM)| :TQ10000 | 593K| 20M| 942 (1)| 00:00:12 | | | Q1,00 | P->S | QC (RAND)|
| 3 | PX BLOCK ITERATOR | | 593K| 20M| 942 (1)| 00:00:12 | 19 | 21 | Q1,00 | PCWC | |
|* 4 | TABLE ACCESS FULL| PARA_TST | 593K| 20M| 942 (1)| 00:00:12 | 19 | 21 | Q1,00 | PCWP | |
-------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter("ID"<=34001)
Statistics
----------------------------------------------------------
11 recursive calls
1 db block gets
6396 consistent gets
6154 physical reads
96 redo size
21534533 bytes sent via SQL*Net to client
434616 bytes received via SQL*Net from client
39465 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
591960 rows processed
SQL>
On a system running 11.2.0.2 with
When is it good practice to use parallel processing? One situation which comes to mind is in creating/populating tables using complex queries that may return faster in parallel than with standard serial processing. An application on which I worked was populating a table with a rather benign join but it was taking far too long to complete the load -- the elapsed time exceeded the batch processing window. Using parallelism (along with regularly updated statistics) dramatically reduced the response time from over an hour to less than two minutes, well within the batch window allowing the rest of the processing to continue. [Note that this was a batch process, run outside of the normal business day, which freed resources normally allocated to user sessions.] There are others, outside of data warehousing applications, but they're exceptions and not the rule.
So, we've learned that even though parallelism would appear to make things go faster in reality that's not often the case due to the extra overhead in managing additional processes and consolodating the individual results into the final result set. We've also learned that simply slapping a /*+ parallel */ hint into a query doesn't constitute tuning and doing so can make performance worse instead of better. The correct choice is to properly tune the query or queries in question using resources such as AWR and ASH reports, execution plans and wait statistics to pinpoint the problem area or areas to address. Parallelism isn't a silver bullet and wasn't intended to be and should be used sparingly, if at all.
Now, where's my sandwich?

1 comments:
Here your sandwich David, good explanation and waitress example is amazing.
In the restaurant works just as well ....
Ciao
Alberto
Post a Comment