I was looking into a system having serious performance issues. The environment was a 4-node RAC of 64-bit Oracle 10.2.0.4 on AIX 6.1.
Every few minutes or so, randomly on one or more nodes, the cpu spikes from around the regular 40% usage to 100% and the system grinds to a halt for a minute or two till normalcy returns. A quick look at the Session Waits during the issue showed around 100 sessions on each instance waiting for “latch: row cache objects”. The sql they were executing was similar, but not same due to literal usage.
The event “latch: row cache objects” was around 33% of the total DB time.
The row cache gets and misses were mainly for dc_histogram_defs & dc_object_ids
col cache# head "Cache|no" form 999 col parameter head "Parameter" form a25 col type head "Type" form a12 col subordinate# head "Sub|ordi|nate" form 9999 col rcgets head "Cache|Gets" form 999999999999 col rcmisses head "Cache|Misses" form 999999999999 col rcmodifications head "Cache|Modifica|tions" form 999999999999 col rcflushes head "Cache|Flushes" form 999999999999 col kqrstcln head "Child#" form 999 col lagets head "Latch|Gets" form 999999999999 col lamisses head "Latch|Misses" form 999999999999 col laimge head "Latch|Immediate|gets" form 999999999999 select dc.kqrstcid CACHE#, dc.kqrsttxt PARAMETER, decode(dc.kqrsttyp, 1,'PARENT','SUBORDINATE') type, decode(dc.kqrsttyp, 2, kqrstsno, null) subordinate#, dc.kqrstgrq rcgets, dc.kqrstgmi rcmisses, dc.kqrstmrq rcmodifications, dc.kqrstmfl rcflushes, dc.kqrstcln, la.gets lagets, la.misses lamisses, la.immediate_gets laimge from x$kqrst dc, v$latch_children la where dc.inst_id = userenv('instance') and la.child# = dc.kqrstcln and la.name = 'row cache objects' order by rcgets desc / Sub Cache ordi Cache Cache Latch Latch no Parameter Type nate Gets Misses Gets Misses ----- ------------------------- ------------ ----- ------------- ------------- ------------- ------------- 16 dc_histogram_defs PARENT 1163261239 182775 3492413701 862439459 11 dc_object_ids PARENT 1091768860 37575 3275630976 221260636 0 dc_tablespaces PARENT 440255133 1310 1320780266 8747309 7 dc_users PARENT 279457505 747 843421881 9823132 2 dc_segments PARENT 178559327 1192786 545703514 8000261 3 dc_rollback_segments PARENT 146295144 9143 437217211 6452501 8 dc_objects PARENT 43116889 31984 129632484 943641 15 dc_database_links PARENT 21401418 79 64204915 15514 7 dc_users SUBORDINATE 1 10638770 323 843421881 9823132 10 dc_usernames PARENT 8048301 856 24151948 24725 7 dc_users SUBORDINATE 0 2809074 101 843421881 9823132 19 dc_table_scns PARENT 2389350 2103 7208283 12624 14 dc_profiles PARENT 1413630 7 4240943 2677 16 dc_histogram_data SUBORDINATE 0 716143 41435 3492413701 862439459 16 dc_histogram_data SUBORDINATE 1 315044 4578 3492413701 862439459 13 dc_sequences PARENT 181111 56571 2997694 930 6 dc_files PARENT 90210 61110 758952 191 17 dc_global_oids PARENT 58365 691 180794 3 24 outstanding_alerts PARENT 40882 38340 582110 1617 5 dc_tablespace_quotas PARENT 29368 1679 336175 316 8 dc_object_grants SUBORDINATE 0 27432 941 129632484 943641 12 dc_constraints PARENT 2911 1802 46892 48 22 dc_awr_control PARENT 2378 46 8001 0 19 dc_partition_scns SUBORDINATE 0 0 0 7208283 12624 33 kqlsubheap_object PARENT 0 0 0 0 7 dc_users SUBORDINATE 2 0 0 843421881 9823132 4 dc_used_extents PARENT 0 0 0 0 1 dc_free_extents PARENT 0 0 0 0 34 realm cache PARENT 0 0 0 0 34 realm auth SUBORDINATE 0 0 0 0 0 35 Command rule cache PARENT 0 0 0 0 36 Realm Object cache PARENT 0 0 0 0 36 Realm Subordinate Cache SUBORDINATE 0 0 0 0 0 40 Rule Set Cache PARENT 0 0 0 0 37 event map PARENT 0 0 0 0 38 format PARENT 0 0 0 0 39 audit collector PARENT 0 0 0 0 26 global database name PARENT 0 0 0 0 20 rule_info PARENT 0 0 0 0 21 rule_or_piece PARENT 0 0 0 0 21 rule_fast_operators SUBORDINATE 0 0 0 0 0 9 dc_qmc_cache_entries PARENT 0 0 0 0 23 dc_qmc_ldap_cache_entries PARENT 0 0 0 0 27 qmtmrcin_cache_entries PARENT 0 0 0 0 28 qmtmrctn_cache_entries PARENT 0 0 0 0 29 qmtmrcip_cache_entries PARENT 0 0 0 0 30 qmtmrctp_cache_entries PARENT 0 0 0 0 31 qmtmrciq_cache_entries PARENT 0 0 0 0 32 qmtmrctq_cache_entries PARENT 0 0 0 0 25 dc_hintsets PARENT 0 0 0 0 18 dc_outlines PARENT 0 0 0 0 51 rows selected. The misses were mainly at these locations NoWait Waiter Latch Name Where Misses Sleeps Sleeps ------------------------ -------------------------- ------- ---------- -------- row cache objects kqrpre: find obj 0 869,898 937,846 row cache objects kqreqd: reget 0 578,411 414,607 row cache objects kqreqd 0 407,455 503,634
Following is one of the sql executed by the sessions contending for the row cache latch. (They were using literal values as shown below)
select ......, SUB_STATUS, DOC_STATUS, CRE_STATUS, PHY_STATUS, ... from OWNER.VIEW1 where ( 10.0 = 10 AND 7.07 = 7.07 AND 'NA' = 'G_60_TO' ) AND ISD = 987654321 AND ACCOUNT_ID = 1234567 AND circle_id = 55555 AND ACCOUNT_ID NOT IN ( SELECT ACCOUNT_ID FROM OWNER.TAB1 WHERE ID1 = 12345 GROUP BY ACCOUNT_ID HAVING COUNT(*) >= 1000 )
There were around a 100 sessions executing the same piece of code in a tight loop. The offending sql was executed roughly around 2 times/second by each session. There were others sql’s in the loop and some of them were having literals. But the latch contention shows up only during the execution of the above sql. This piqued my interest to find the root cause.
A trace showed the parse call of the offending sql to be preceded by 3 recursive calls.
===================== select text from view$ where rowid=:1 Bind value=0000D6E8.0000.0001 ===================== select condition from cdef$ where rowid=:1 Bind value=00011BD3.004F.0001 ===================== select condition from cdef$ where rowid=:1 Bind value=00011BD3.0050.0001 ===================== select condition from cdef$ where rowid=:1 Bind value=00011BD3.0051.0001 =====================
I queried view$ to find out what this recursive call related to.
select rowid, dbms_rowid.ROWID_TO_RESTRICTED(rowid,1) from view$ where dbms_rowid.ROWID_TO_RESTRICTED(rowid,1) = '0000D6E8.0000.0001' / ROWID DBMS_ROWID.ROWID_T ------------------ ------------------ AAAAA/AABAAANboAAA 0000D6E8.0000.0001 select obj# from view$ where rowid = 'AAAAA/AABAAANboAAA' / OBJ# ---------- 58193 select owner, object_type, object_name from dba_objects where object_id = 58193 / Owner Object type Object Name ---------------- ------------------ ------------------------------ OWNER VIEW VIEW1
Similar queries on cdef$ revealed that the data related to 3 constraints on a table (lets call it SUB)
DOC_STATUS IN ('P', 'S', 'F','N') CRE_STATUS IN ('P', 'S', 'F','N') PHY_STATUS IN ('P', 'S', 'F','N')
I then checked the view definition, which was something like this
select .........., SUB_STATUS, DOC_STATUS, CRE_STATUS, PHY_STATUS, ........... from SUB, X, Y where SUB.id = x.id and x.id = y.id and SUB.SUB_STATUS != 'C'
The table SUB also has a check constraint on the SUB_STATUS column – SUB_STATUS IN (‘I’, ‘A’, ‘G1’, ‘G2’, ‘S’, ‘C’). But there was no recursive call to get this constraint definition even though a predicate on that column was used in the view definition. Looks like it only gets the constraint definitions for columns used in the query for doing a possible rewrite.
This led to the next question, why is it executing these recursive calls again and again.
Wasn’t that the main purpose for the existence of the row cache? To cache dictionary data?
As usual my master comes to the rescue with this note – Constraints, Inserts and bind
TEXT column in VIEW$ table and CONDITION column in CDEF$ table are LONG datatype and hence their values are not going to be cached in the row cache.
We had resolved the issue by changing CURSOR_SHARING to FORCE from EXACT.
There is also another workaround of setting event 10195 to disable transitive predicate generation using check constraints.
So using literals in sql’s involving views is going to have an additional penalty.
Leave a comment