Oracle

March 18, 2011

ORA-00018: maximum number of sessions exceeded

Filed under: Uncategorized — srivenu @ 11:48 am

Recently in one of our RAC databases, we started getting ORA-00018. A quick count of V$SESSION showed around 9000. Since i’m a regular reader of Tanel Podar’s articles, i could quickly figure out what the issue was. Here is the link to article – Recursive Sessions and ORA-00018: maximum number of sessions exceeded

Advertisements

Digging in the ASH

Filed under: Uncategorized — srivenu @ 11:11 am

Recently logins to one of our databases had frozen for around 10 minutes from 1:04 PM till around 1:16PM.
New users were not able to login to the database.
By the time i had logged in, i saw several sessions waiting on “gc buffer busy”.
The object involved was found out to be AUD$ table. The issue got resolved immediately after auditing was turned off.
I decided to dig in further for a portmortem using ASH.

Get the latest snap_time and snap_id from dba_hist_snapshot.

select 	* 
from 	dba_hist_snapshot 
where 	snap_id in (
	select 	max(snap_id) 
	from 	dba_hist_snapshot 
	where 	instance_number=1
	)
/

   SNAP_ID       DBID INSTANCE_NUMBER STARTUP_TIME
---------- ---------- --------------- ---------------------------------------------------------------------------
BEGIN_INTERVAL_TIME
---------------------------------------------------------------------------
END_INTERVAL_TIME
---------------------------------------------------------------------------
FLUSH_ELAPSED                                                               SNAP_LEVEL ERROR_COUNT
--------------------------------------------------------------------------- ---------- -----------
     45044 2872548522               1 11-MAR-11 04.52.14.000 AM
18-MAR-11 12.30.35.275 PM
18-MAR-11 01.30.19.435 PM
+00000 00:00:41.6                                                                    1           0

     45044 2872548522               2 11-MAR-11 05.01.23.000 AM
18-MAR-11 12.30.16.417 PM
18-MAR-11 01.30.01.443 PM
+00000 00:00:36.6                                                                    

Get the count of samples from DBA_HIST_ACTIVE_SESS_HISTORY for the previous and current snaps.

— Previous Snap

col sample_id head "Sample ID" form 999999999
col sample_time head "Sample Time" form a28

select 	SAMPLE_ID, SAMPLE_TIME, count(*)
from  	DBA_HIST_ACTIVE_SESS_HISTORY
where 	snap_id = 45043
and   	dbid = 2872548522
and   	instance_number = 1
group 	by SAMPLE_ID, SAMPLE_TIME
order 	by sample_id, SAMPLE_TIME
/


 Sample ID Sample Time                    COUNT(*)
---------- ---------------------------- ----------
 158418490 18-MAR-11 12.27.45.691 PM            74
 158418500 18-MAR-11 12.27.56.181 PM            64
 158418510 18-MAR-11 12.28.06.441 PM            64
 158418520 18-MAR-11 12.28.16.771 PM            83
 158418530 18-MAR-11 12.28.27.121 PM            64
 158418540 18-MAR-11 12.28.37.372 PM            57
 158418550 18-MAR-11 12.28.47.721 PM            74
 158418560 18-MAR-11 12.28.58.211 PM            60
 158418570 18-MAR-11 12.29.08.521 PM            70
 158418580 18-MAR-11 12.29.18.871 PM            71
 158418590 18-MAR-11 12.29.29.231 PM            75
 158418600 18-MAR-11 12.29.39.491 PM            69
 158418610 18-MAR-11 12.29.49.883 PM            75
 158418620 18-MAR-11 12.30.00.412 PM            75

— Current Snap

col sample_id head "Sample ID" form 999999999
col sample_time head "Sample Time" form a28

select	SAMPLE_ID, SAMPLE_TIME, count(*)
from  	DBA_HIST_ACTIVE_SESS_HISTORY
where 	snap_id = 45044
and   	dbid = 2872548522
and   	instance_number = 1
group 	by SAMPLE_ID, SAMPLE_TIME
order 	by sample_id, SAMPLE_TIME
/



 Sample ID Sample Time                    COUNT(*)
---------- ---------------------------- ----------
 158420640 18-MAR-11 01.04.58.204 PM          4493
 158420650 18-MAR-11 01.05.08.781 PM          4565
 158420660 18-MAR-11 01.05.19.481 PM          4629
 158420670 18-MAR-11 01.05.30.311 PM          2480
 158420680 18-MAR-11 01.05.40.881 PM          2400
 158420690 18-MAR-11 01.05.51.321 PM          2148
 158420700 18-MAR-11 01.06.03.851 PM          1731
 158420710 18-MAR-11 01.06.14.401 PM          1608
 158420720 18-MAR-11 01.06.24.960 PM          1511
 158420730 18-MAR-11 01.06.35.391 PM          1419
 158420740 18-MAR-11 01.06.45.890 PM          1412
 158420750 18-MAR-11 01.06.58.430 PM          1437
 158420760 18-MAR-11 01.07.08.870 PM          1373
 158420770 18-MAR-11 01.07.19.394 PM          1323
 158420780 18-MAR-11 01.07.29.960 PM          1294
 158420790 18-MAR-11 01.07.40.470 PM          1287
 158420800 18-MAR-11 01.07.50.855 PM          1288
 158420810 18-MAR-11 01.08.01.570 PM          1281
 158420820 18-MAR-11 01.08.12.071 PM          1281
 158420830 18-MAR-11 01.08.22.423 PM          1287
 158420840 18-MAR-11 01.08.32.921 PM          1283
 158420850 18-MAR-11 01.08.43.431 PM          1275
 158420860 18-MAR-11 01.08.53.811 PM          1298
 158420870 18-MAR-11 01.09.04.541 PM          1286
 158420880 18-MAR-11 01.09.15.084 PM          1282
 158420890 18-MAR-11 01.09.25.581 PM          1271
 158420900 18-MAR-11 01.09.35.951 PM          1292
 158420910 18-MAR-11 01.09.46.481 PM          1284
 158420920 18-MAR-11 01.09.57.221 PM          1279
 158420930 18-MAR-11 01.10.07.601 PM          1274
 158420940 18-MAR-11 01.10.18.111 PM          1300
 158420950 18-MAR-11 01.10.28.641 PM          1279
 158420960 18-MAR-11 01.10.39.031 PM          1267
 158420970 18-MAR-11 01.10.49.541 PM          1287
 158420980 18-MAR-11 01.11.00.341 PM          1271
 158420990 18-MAR-11 01.11.10.846 PM          1284
 158421000 18-MAR-11 01.11.21.251 PM          1281
 158421010 18-MAR-11 01.11.31.791 PM          1275
 158421020 18-MAR-11 01.11.42.331 PM          1276
 158421030 18-MAR-11 01.11.52.701 PM          1270
 158421040 18-MAR-11 01.12.03.441 PM          1278
 158421050 18-MAR-11 01.12.13.972 PM          1264
 158421060 18-MAR-11 01.12.24.361 PM          1263
 158421070 18-MAR-11 01.12.34.881 PM          1260
 158421080 18-MAR-11 01.12.45.410 PM          1279
 158421090 18-MAR-11 01.12.56.191 PM          1275
 158421100 18-MAR-11 01.13.06.571 PM          1266
 158421110 18-MAR-11 01.13.17.111 PM          1286
 158421120 18-MAR-11 01.13.27.641 PM          1256
 158421130 18-MAR-11 01.13.38.021 PM          1259
 158421140 18-MAR-11 01.13.48.531 PM          1263
 158421150 18-MAR-11 01.13.59.403 PM          1264
 158421160 18-MAR-11 01.14.09.981 PM          1276
 158421170 18-MAR-11 01.14.20.381 PM          1271
 158421180 18-MAR-11 01.14.30.951 PM          1268
 158421190 18-MAR-11 01.14.41.524 PM          1286
 158421200 18-MAR-11 01.14.51.941 PM          1261
 158421210 18-MAR-11 01.15.02.841 PM          1257
 158421220 18-MAR-11 01.15.13.396 PM          1264
 158421230 18-MAR-11 01.15.23.821 PM          1276
 158421240 18-MAR-11 01.15.34.431 PM          1267
 158421250 18-MAR-11 01.15.45.033 PM          1265
 158421260 18-MAR-11 01.15.55.871 PM          1258
 158421270 18-MAR-11 01.16.06.291 PM          1253
 158421280 18-MAR-11 01.16.16.901 PM            60
 158421290 18-MAR-11 01.16.27.381 PM            71
 158421300 18-MAR-11 01.16.37.721 PM            74
 158421310 18-MAR-11 01.16.48.204 PM            56
 158421320 18-MAR-11 01.16.58.881 PM            66
 158421330 18-MAR-11 01.17.09.211 PM            63
 158421340 18-MAR-11 01.17.19.681 PM            73
 158421350 18-MAR-11 01.17.30.181 PM            89
 158421360 18-MAR-11 01.17.40.721 PM           159
 158421370 18-MAR-11 01.17.51.104 PM           283
 158421380 18-MAR-11 01.18.02.671 PM           145
 158421390 18-MAR-11 01.18.13.151 PM           128
 158421400 18-MAR-11 01.18.23.501 PM           110
 158421410 18-MAR-11 01.18.33.992 PM            79
 158421420 18-MAR-11 01.18.44.461 PM            91
 158421430 18-MAR-11 01.18.55.161 PM            91
 158421440 18-MAR-11 01.19.05.501 PM            70
 158421450 18-MAR-11 01.19.15.981 PM           384
 158421460 18-MAR-11 01.19.26.463 PM            74
 158421470 18-MAR-11 01.19.36.801 PM            79
 158421480 18-MAR-11 01.19.47.291 PM            55
 158421490 18-MAR-11 01.19.57.961 PM            69
 158421500 18-MAR-11 01.20.08.321 PM            74
 158421510 18-MAR-11 01.20.18.801 PM            62
 158421520 18-MAR-11 01.20.29.281 PM            80
 158421530 18-MAR-11 01.20.39.631 PM            93
 158421540 18-MAR-11 01.20.50.121 PM            73
 158421550 18-MAR-11 01.21.00.791 PM            83
 158421560 18-MAR-11 01.21.11.261 PM            66
 158421570 18-MAR-11 01.21.21.601 PM            72
 158421580 18-MAR-11 01.21.32.111 PM            66
 158421590 18-MAR-11 01.21.42.581 PM            57
 158421600 18-MAR-11 01.21.52.922 PM            65

There is a gap between 12:30 PM and 1:04 PM. (I will try to find out about this later)
We only have data from 1:04 PM.

These are the main wait events at that time.

col name head "Event" form a40

select 	en.NAME, count(*)
from  	DBA_HIST_ACTIVE_SESS_HISTORY  dhas,
      	v$event_name                  en
where 	snap_id = 45044
and   	dbid = 2872548522
and   	instance_number = 1
and   	sample_id = 158420640
and   	dhas.event_id = en.event_id
group 	by en.NAME
order 	by count(*) desc
/


NAME                                                               COUNT(*)
---------------------------------------------------------------- ----------
gc buffer busy                                                         1694
buffer busy waits                                                      1451
log file sync                                                          1193
gc cr request                                                            85
gc current request                                                       26

Now lets find out the object involved in “gc buffer busy”

col name head "Event" form a20

select 	en.NAME, p1, p2, count(*)
from  	DBA_HIST_ACTIVE_SESS_HISTORY  dhas,
      	v$event_name                  en
where 	snap_id = 45044
and   	dbid = 2872548522
and   	instance_number = 1
and   	sample_id = 158420640
and   	en.NAME = 'gc buffer busy'
and   	dhas.event_id = en.event_id
group 	by en.NAME, p1, p2
order 	by count(*) desc
/


Event                        P1         P2   COUNT(*)
-------------------- ---------- ---------- ----------
gc buffer busy              319       6413       1387
gc buffer busy              769     113488         49
gc buffer busy              582     284415         37

They are mostly for the file#/block# 319/6413

Dumping the block shows that its a level 1 BMB for AUD$ table

Start dump data blocks tsn: 20 file#: 319 minblk 6413 maxblk 6413
buffer tsn: 20 rdba: 0x4fc0190d (319/6413)
scn: 0x0a35.c98b07d1 seq: 0x01 flg: 0x00 tail: 0x07d12001
frmt: 0x02 chkval: 0x0000 type: 0x20=FIRST LEVEL BITMAP BLOCK
Hex dump of block: st=0, typ_found=1

When the query is repeated for SAMPLE_ID=158420690, this is the data

Event                        P1         P2   COUNT(*)
-------------------- ---------- ---------- ----------
gc buffer busy              319       2590        421
gc buffer busy              741     630505         18

This is the SEGMENT HEADER for the same TABLE AUD$

Start dump data blocks tsn: 20 file#: 319 minblk 2590 maxblk 2590
buffer tsn: 20 rdba: 0x4fc00a1e (319/2590)
scn: 0x0a35.c3264fe1 seq: 0x01 flg: 0x00 tail: 0x4fe12301
frmt: 0x02 chkval: 0x0000 type: 0x23=PAGETABLE SEGMENT HEADER
Hex dump of block: st=0, typ_found=1

Why so much contention on AUD$. Lets look at the per minute login rate

select      count(*), to_char(cast((from_tz(ntimestamp#,'00:00') at local) as date), 'YY-MM-DD HH24:MI')
from  aud$
group by to_char( cast((from_tz(ntimestamp#,'00:00') at local) as date), 'YY-MM-DD HH24:MI')
order by to_char( cast((from_tz(ntimestamp#,'00:00') at local) as date), 'YY-MM-DD HH24:MI')
/

  COUNT(*) TO_CHAR(CAST((FROM_TZ(NTIMESTA
---------- ---------------------------------------------------------------------------
       377 11-03-18 12:55
       359 11-03-18 12:56
       385 11-03-18 12:57
       338 11-03-18 12:58
       473 11-03-18 12:59
       752 11-03-18 13:00
      1307 11-03-18 13:01
       814 11-03-18 13:02
       596 11-03-18 13:03
       598 11-03-18 13:04
      1095 11-03-18 13:05
      1122 11-03-18 13:06
       935 11-03-18 13:07
       780 11-03-18 13:08
       730 11-03-18 13:09
       437 11-03-18 13:10
       217 11-03-18 13:11
       266 11-03-18 13:12
       161 11-03-18 13:13
       426 11-03-18 13:14
       591 11-03-18 13:15
      1905 11-03-18 13:16
      1640 11-03-18 13:17
      1021 11-03-18 13:18
       603 11-03-18 13:19
       529 11-03-18 13:20
       618 11-03-18 13:21
       642 11-03-18 13:22
       451 11-03-18 13:23
       603 11-03-18 13:24
       576 11-03-18 13:25
       516 11-03-18 13:26
       414 11-03-18 13:27
       438 11-03-18 13:28
       428 11-03-18 13:29
       492 11-03-18 13:30

There are too many Logins at 1 PM (13:01)

The issue at around 1PM was that AUD$ was not able to handle the concurrent inserts at that time.

But what caused the ASH history to disappear between 12:30PM and 1:04PM?
Did something happen around 12:30PM that caused the system freeze and existing user connections to freeze?
Did this cause users to relogin causing a flood of new connections and hence contention at AUD$?
Was there a shift change at around 1PM to cause this flood?
Will need to dig in further into analyze what happened between 12:30PM and 1:04PM using other means as ASH is not available

March 8, 2011

Hugepages in Linux

Filed under: Uncategorized — srivenu @ 9:02 am

I was recently looking into a performance issue on one of the databases. It was Oracle 10.2.0.4 on linux (RHEL 5). Having had the prior experience of working with Oracle on Linux, i could quickly make out that the issue was related to not using Huge pages for the SGA. I thought i would blog about the issue and while looking for similar articles on the web, found one Riyaz Shamshuddin. Since it was quite identical to our issue, i decided to link it here – Performance tuning: HugePages and Linux

Some notes from our site
1) We had a RAM of around 16GB and our SGA was around 4GB. Our pagetables without Hugepages were around 6.5GB!

/proc/meminfo before hugepages

MemTotal:     16305888 kB
MemFree:        823340 kB
.............
PageTables:    6631560 kB

/proc/meminfo after confuguring hugepages

MemTotal:     16305888 kB
MemFree:         84268 kB
..........
PageTables:     276628 kB
HugePages_Total:  2050
HugePages_Free:    356
HugePages_Rsvd:    355
Hugepagesize:     2048 kB

2) Modifying vm.nr_hugepages in /etc/sysctl.conf and reloading it using sysctl -p did not work in our case – probably because the system could not get contiguous physical memory. We had to reboot the system.

3) Due to a calculation mistake i had given one “0” less for the ulimit parameter “memlock” in /etc/securities/limits.conf. So the first time we started the instance, the SGA did not sit in the Huge pages. Restarting the instance after modifying /etc/securities/limits.conf with the proper value resolved the issue.

For those who have access there is also a very good article on Metalink about Hugepages on Linux- Article ID 361323.1

March 4, 2011

Calculate iops for a RAC database

Filed under: Uncategorized — srivenu @ 11:13 am

I had blogged previously about calculating iops for a non-RAC database instance. It could be a little tricky to get the IOPS for a RAC database.

In the previous blog entry, i had shown sql which allows you to get the iops during peak hours. The following sql can be used to get the iops for all times.

define start_time='03-FEB-2011 10:00'  -- please specify this in 'DD-MON-YYYY HH24:MI' format
define end_time  ='03-MAR-2011 18:00'  -- please specify this in 'DD-MON-YYYY HH24:MI' format
define rangesize=500
define name='physical read total IO requests'
define inst_no=1

undef v_noofmins

col mns new_value v_noofmins

select trunc((to_date('&end_time', 'DD-MON-YYYY HH24:MI') - to_date('&start_time', 'DD-MON-YYYY HH24:MI'))*1440) mns
from dual
/


col range head "   IOPS Range" form a20
col noofslots head "No Of|Minutes" form 999,999

with 	v2 as 
	(
	select	/*+materialize*/
		to_date('&start_time', 'DD-MON-YYYY HH24:MI')+(rownum/1440) timewithmin
	from	dual
	connect	by rownum < &v_noofmins
	)
select	/*+use_merge(v1 v2)*/
	lpad(trunc(vpersec/&rangesize) * &rangesize, 6) ||' - '||
	lpad(((trunc(vpersec/&rangesize) + 1) * &rangesize) - 1, 6) range,
	count(*) noofslots
from	(
	select 	/*+ full(st) full(ss) use_hash(st)*/
		lag(ROUND(CAST(ss.END_INTERVAL_TIME AS DATE)+(1/1440), 'MI')) over(order by ss.snap_id) dsst,
		ROUND(CAST(ss.END_INTERVAL_TIME AS DATE), 'MI') dest,
		(st.value-lag(st.value,1,0) over(order by ss.snap_id)) st,
		round(
		(st.value-lag(st.value,1,0) over(order by ss.snap_id) ) /
		round((CAST(ss.END_INTERVAL_TIME AS DATE) - 
			lag(ROUND(CAST(ss.END_INTERVAL_TIME AS DATE), 'MI')) over(order by ss.snap_id)) * 1440 * 60))
		vpersec
	from 	sys.WRH$_SYSSTAT 	st,
		sys.WRM$_SNAPSHOT 	ss
	where 	st.snap_id=ss.snap_id
	and	ss.INSTANCE_NUMBER = &inst_no
	and	st.INSTANCE_NUMBER = &inst_no
	and	ss.dbid=st.dbid
	and	st.stat_id in (
		select	stat_id
		from	v$statname
		where	upper(name)=upper(trim('&name'))
		)
	and 	ss.END_INTERVAL_TIME > to_date('&start_time', 'DD-MON-YYYY HH24:MI')
	and	ss.BEGIN_INTERVAL_TIME < to_date('&end_time', 'DD-MON-YYYY HH24:MI')
	order	by 1
	) v1,
	v2
where	v2.timewithmin between v1.dsst and v1.dest
and	v2.timewithmin between v1.dsst and v1.dest
group	by lpad(trunc(vpersec/&rangesize) * &rangesize, 6) ||' - '||
	lpad(((trunc(vpersec/&rangesize) + 1) * &rangesize) - 1, 6)
order	by lpad(trunc(vpersec/&rangesize) * &rangesize, 6) ||' - '||
	lpad(((trunc(vpersec/&rangesize) + 1) * &rangesize) - 1, 6)
/

The output is something like this

                          No Of
   IOPS Range           Minutes
-------------------- ----------
     0 -    499              59
   500 -    999             479
  1000 -   1499            1804
  1500 -   1999            3595
  2000 -   2499            8092
  2500 -   2999           11107
  3000 -   3499            7074
  3500 -   3999            3578
  4000 -   4499            2342
  4500 -   4999            1260
  5000 -   5499             539
  5500 -   5999             420
  7000 -   7499              59

13 rows selected.

For Instance 2 this is the data.
define inst_no=2

                          No Of
   IOPS Range           Minutes
-------------------- ----------
     0 -    499             179
   500 -    999             543
  1000 -   1499            1680
  1500 -   1999            3419
  2000 -   2499            4251
  2500 -   2999            4503
  3000 -   3499            4617
  3500 -   3999            3852
  4000 -   4499            1976
  4500 -   4999            2126
  5000 -   5499            1260
  5500 -   5999             840
  6000 -   6499             540
  6500 -   6999             300
  7000 -   7499             661
  7500 -   7999             240
  8000 -   8499             120

17 rows selected.

The question now is, what is the peak iops for both instances combined? Can we just sum up the peaks of the 2 nodes and say that the peak is 15500 – 16000? This might result in overestimation as both nodes might not have peaked at the same time.

I use the following sql to get the IOPS for all (could be more than 2) nodes.

define start_time='03-FEB-2011 10:00'  -- please specify this in 'DD-MON-YYYY HH24:MI' format
define end_time  ='03-MAR-2011 18:00'  -- please specify this in 'DD-MON-YYYY HH24:MI' format
define rangesize=500
define name='physical read total IO requests'

undef v_noofmins

col mns new_value v_noofmins

select trunc((to_date('&end_time', 'DD-MON-YYYY HH24:MI') - to_date('&start_time', 'DD-MON-YYYY HH24:MI'))*1440) mns
from dual
/

col range head "   IOPS Range" form a20
col noofslots head "No Of|Minutes" form 999,999

with 	v2 as 
	(
	select	/*+ materialize */
		to_date('&start_time', 'DD-MON-YYYY HH24:MI')+(rownum/1440) timewithmin
	from	dual
	connect	by rownum < &v_noofmins
	)
select	lpad(trunc(v11.vpersec/&rangesize) * &rangesize, 6) ||' - '||
	lpad(((trunc(v11.vpersec/&rangesize) + 1) * &rangesize) - 1, 6) range,
	count(*) noofslots
from	(
	select	/*+use_merge(v1 v2)*/
		v2.timewithmin, sum(vpersec) vpersec
	from	(
		select 	/*+ full(st) full(ss) use_hash(st)*/
			lag(ROUND(CAST(ss.END_INTERVAL_TIME AS DATE)+(1/1440), 'MI')) 
				over(partition by ss.INSTANCE_NUMBER order by ss.snap_id) dsst,
			ROUND(CAST(ss.END_INTERVAL_TIME AS DATE), 'MI') dest,
			(st.value-lag(st.value,1,0) over(partition by ss.INSTANCE_NUMBER order by ss.snap_id)) st,
			round(
			(st.value-lag(st.value,1,0) over(partition by ss.INSTANCE_NUMBER order by ss.snap_id) ) /
			round((ROUND(CAST(ss.END_INTERVAL_TIME AS DATE), 'MI') - 
				lag(ROUND(CAST(ss.END_INTERVAL_TIME AS DATE)+(1/1440), 'MI')) 
				over(partition by ss.INSTANCE_NUMBER order by ss.snap_id)) * 1440 * 60))
			vpersec
		from 	sys.WRH$_SYSSTAT 	st,
			sys.WRM$_SNAPSHOT 	ss
		where 	ss.snap_id=st.snap_id
		and	ss.dbid=st.dbid
		and	ss.INSTANCE_NUMBER = st.INSTANCE_NUMBER 
		and	st.stat_id in (
			select	stat_id
			from	v$statname
			where	upper(name)=upper(trim('&name'))
			)
		and 	ss.END_INTERVAL_TIME > to_date('&start_time', 'DD-MON-YYYY HH24:MI')
		and	ss.BEGIN_INTERVAL_TIME < to_date('&end_time', 'DD-MON-YYYY HH24:MI')
		order	by 1
		) v1,
		v2
	where	v2.timewithmin between v1.dsst and v1.dest
	and	v1.st > 0
	and	v1.dsst is not null
	group	by v2.timewithmin
	) v11
group	by lpad(trunc(v11.vpersec/&rangesize) * &rangesize, 6) ||' - '||
	lpad(((trunc(v11.vpersec/&rangesize) + 1) * &rangesize) - 1, 6)
order	by lpad(trunc(v11.vpersec/&rangesize) * &rangesize, 6) ||' - '||
	lpad(((trunc(v11.vpersec/&rangesize) + 1) * &rangesize) - 1, 6)
/

This is an output from our 2-node RAC database.

                          No Of
   IOPS Range           Minutes
-------------------- ----------
     0 -    499              59
  1000 -   1499             601
  1500 -   1999             359
  2000 -   2499            2094
  2500 -   2999            5407
  3000 -   3499            3303
  3500 -   3999            2459
  4000 -   4499            1496
  4500 -   4999            2224
  5000 -   5499            3238
  5500 -   5999            3294
  6000 -   6499            3482
  6500 -   6999            2400
  7000 -   7499            2400
  7500 -   7999            1558
  8000 -   8499            1111
  8500 -   8999            1203
  9000 -   9499            1019
  9500 -   9999             481
 10000 -  10499             482
 10500 -  10999             300
 11000 -  11499             721
 11500 -  11999             238
 12000 -  12499             241
 12500 -  12999             177
 13000 -  13499             179
 13500 -  13999              61

27 rows selected.

So we can see that the peak was only around 14000.

Just for a little bit of explanation about the above sql.

Get the Number of Minutes between the start_time and end_time with this sql
select trunc((to_date(‘&end_time’, ‘DD-MON-YYYY HH24:MI’) – to_date(‘&start_time’, ‘DD-MON-YYYY HH24:MI’))*1440) mns
from dual

Using sub query factoring, generate a row for each minute between start_time and end_time
with v2 as
(
select /*+ materialize */
to_date(‘&start_time’, ‘DD-MON-YYYY HH24:MI’)+(rownum/1440) timewithmin
from dual
connect by rownum < &v_noofmins
)

In AWR, we have snaps running at regular intervals on all nodes. The start and end timings might not exactly match all the time.

Lets say we take hourly snaps. So from AWR, we get the value of the stat (lets say “physical read total IO requests”) for 1 hour.
(st.value-lag(st.value,1,0) over(partition by ss.INSTANCE_NUMBER order by ss.snap_id))

Then get the per second average
round(
(st.value-lag(st.value,1,0) over(partition by ss.INSTANCE_NUMBER order by ss.snap_id) ) /
round((ROUND(CAST(ss.END_INTERVAL_TIME AS DATE), ‘MI’) –
lag(ROUND(CAST(ss.END_INTERVAL_TIME AS DATE)+(1/1440), ‘MI’))
over(partition by ss.INSTANCE_NUMBER order by ss.snap_id)) * 1440 * 60))

Then join this row with the view V2 so that we get the per second average of data for each instance for each minute.
v2.timewithmin between v1.dsst and v1.dest

Then sum up the data from all instances for each minute.
group by v2.timewithmin

Then generate the IOPS ranges for good presentation of the data
group by lpad(trunc(v11.vpersec/&rangesize) * &rangesize, 6) ||’ – ‘||
lpad(((trunc(v11.vpersec/&rangesize) + 1) * &rangesize) – 1, 6)

I hope i did not mess up this explanation part!

Blog at WordPress.com.