Oracle

December 24, 2010

Parallel Query – events in waitclass Other – PX Deq Credit: send blkd

Filed under: Uncategorized — srivenu @ 2:16 am

This entry relates to a create table statement. The table in question is created every day in the report database by joining several mviews.
The table creation which normally completed in a hour hasn’t completed even after several hours yesterday.
The sql is something like this

CREATE 	TABLE 	User.abcd nologging parallel 8 AS
	with 	tbl as (
		select	service_type, circle_id 
		from 	User.F
		group 	by service_type, circle_id
		)
SELECT	.....
	.....
FROM	A, 
	B, 
	C, 
	D, 
	E, 
	tbl	
WHERE	b.CUST_ACCT_NBR = A.CUST_ACCT_NBR 
AND	A.CUST_ACCT_CTGY_ID = c.CUST_ACCT_CTGRY_ID(+) 
AND	A.MARKET_CODE = d.ACC_SUB_CAT(+) 
AND	b.SRVC_TYPE_ID = st.SERVICE_TYPE_ID 
AND	b.SRVC_TYPE_ID = tbl.SERVICE_TYPE 
AND	b.CIRCLE_ID = tbl.CIRCLE_ID
/

One of the Mview tables was missing statistics. The initial plan was showing an index access on that table. I added the full & use_hash hints and without properly observing the resulting plan ran the hinted sql. After 20 mins of execution, when i monitored one of the parallel slave sessions (V$SESSION_EVENT), i observed that almost 18 mins of elapsed time was spent on “events in waitclass Other”. I immediately suspected CPU exhaustion (violating the cardinal rule of tuning – “Never assume anything”). But sar showed 30% cpu free. I ruled out CPU exhaustion, and a short duration continuous monitoring of session waits showed a steady stream of “PX Deq Credit: send blkd” waits. (For more complex situations, you could use the excellent waitprof script from Tanel Podar). Again i violated the cardinal principle and checked PARALLEL_EXECUTION_MESSAGE_SIZE. It was set to a decent size of 8k. I then looked at the plan and found that the table with missing stats was being Broadcast to the HASH Join. It was a 16GB table and Broadcasting it had caused all those waits. I added the PQ_DISTRIBUTE HASH HASH hint on that table to rectify the issue.

But what puzzled me even more was what I had observed during the next run. The plan for the create table session was showing full scans and HASH joins for all tables involved. But after 1 hour of execution, when i monitored, all the parallel slaves in one set were showing a steady stream of continuous “db file sequential read” waits. When i checked the blocks, I found them to be of another table H which was not at all referred to in the sql. All the Mviews A, B, C, D, E, F referred in the sql were against remote tables across a DBlink and get refreshed periodically. I could not find the reason for the access to H (which is again a Mview against another remote table). There is no sort of dependency among any of these tables. There is also no fine-grained access control setup.
After stopping the periodic Mview refreshes and restarting the create table, i did not find the access to table H in the next run.
I want to resolve this mystery and blog about it.

Advertisements

1 Comment »

  1. Hi

    Did you find the reason? even i am facing the same issue. Mview refresh suddenly started taking time. We checked all the Parallel session found that it is waiting for PX Deq Credit: send blkd and PX qref latch
    event. In our case, PARALLEL_EXECUTION_MESSAGE_SIZE is 16 K. Please let me know if you have found any solution for this.

    Thanks

    Comment by Daniesh — December 13, 2013 @ 4:49 am | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: