Oracle

July 21, 2011

Don’t close that connection

Filed under: Uncategorized — srivenu @ 9:29 am

Consider this scenario.

The cafeteria i eat in, like many others i had seen in India, gives out stainless steel spoons to eat your food. (Ignoring the hygienic factor, I wonder how eco friendly this is, considering the water and soap needed to wash them after each use). At the spoon counter you have a bin into which a cleaning lady periodically dumps a bunch of spoons after each cleaning session. You wait in line at the food counters, get your food, and then come to the spoon counter. Since there is a perennial shortage of spoons, most of the time you wait in line to get a spoon. If your eating time is 20 mins, food counter queue time is 2 mins and spoon counter waiting time in 1 min, you perhaps wouldn’t mind and might take it in your stride. Now what if a person changes his spoon after each mouthful? (You might wonder why someone would think of doing that? That’s stupid ain’t it!) Well we might not do this in real life, but in the programming world, i bet many of us might have seen some strange and similar things indeed!

Let’s say you need to write a java application which periodically wakes up (say every 10 secs or so), checks for any new mail and in case there is any, inserts it into a database. What sort of database connection management do you use for this code? Well, it seems like simple common sense, If it is not multi threaded, you would just open one connection and keep it till the end. But i have seen programs where people have opened and closed the connection for each iteration of the above loop.

Advertisements

DriverManager calls not intended for production

Filed under: Uncategorized — srivenu @ 9:28 am

I was looking into the performance issue in an OC4J application, which occasionally froze for 2-3 minutes. The issue got escalated as the problem frequency increased.

Step 1 : Get a thread dump.
First of all, I needed a series of thread dumps during the freeze for analysis. But the freeze duration was too short. The issue passed off by the time users complained, and the system was checked. So i decided to used dmstool to identify the symptoms during the freeze. If i could identify the symptoms, i schedule a job to thread dump the Jserv processes during the freeze. Using dmstool, i gathered all metrics data for the application at 30 second duration for 1 day. Next day, after correlating with the freeze times reported by the users, i could see a steady spike in the “activeThreads.value” metric which peaked during the freeze. Next, i set up a job which monitored the “activeThreads.value” and perfomed a kill -3 on the java process if it spiked to a certain value.

Step 2 : Analyze the thread dump.
Shown below is a part of that thread dump. I removed several waiting threads as they all waited at the same location. I also renamed our class names to something like abc.def etc. I showed the waiting threads at the top with the blocker at the end.


"AJPRequestHandler-RMICallHandler-623" prio=1 tid=0x089fed00 nid=0x473b waiting for monitor entry [0x9db2e000..0x9db2efa0]
	at java.sql.DriverManager.registerDriver(DriverManager.java:259)
	- waiting to lock  (a java.lang.Class)
	at com.abc.def.pqr.stu(DBConnection.java:33)
	at com.abc.def.ghi.jkl(mno.java:113)
	at com.evermind.server.http.ServletRequestDispatcher.invoke(ServletRequestDispatcher.java:644)
	at com.evermind.server.http.ServletRequestDispatcher.forwardInternal(ServletRequestDispatcher.java:391)
	at com.evermind.server.http.HttpRequestHandler.doProcessRequest(HttpRequestHandler.java:908)
	at com.evermind.server.http.HttpRequestHandler.processRequest(HttpRequestHandler.java:458)
	at com.evermind.server.http.AJPRequestHandler.run(AJPRequestHandler.java:313)
	at com.evermind.server.http.AJPRequestHandler.run(AJPRequestHandler.java:199)
	at oracle.oc4j.network.ServerSocketReadHandler$SafeRunnable.run(ServerSocketReadHandler.java:260)
	at oracle.oc4j.network.ServerSocketAcceptHandler.procClientSocket(ServerSocketAcceptHandler.java:234)
	at oracle.oc4j.network.ServerSocketAcceptHandler.access$700(ServerSocketAcceptHandler.java:29)
	at oracle.oc4j.network.ServerSocketAcceptHandler$AcceptHandlerHorse.run(ServerSocketAcceptHandler.java:879)
	at com.evermind.util.ReleasableResourcePooledExecutor$MyWorker.run(ReleasableResourcePooledExecutor.java:303)
	at java.lang.Thread.run(Thread.java:595)

"AJPRequestHandler-HTTPThreadGroup-622" prio=1 tid=0xa18ac588 nid=0x46e8 waiting for monitor entry [0x9effe000..0x9efff120]
	at java.sql.DriverManager.registerDriver(DriverManager.java:259)
	- waiting to lock  (a java.lang.Class)
	at com.abc.def.pqr.stu(DBConnection.java:33)
	at com.abc.def.ghi.jkl(mno.java:113)
	at com.evermind.server.http.ServletRequestDispatcher.invoke(ServletRequestDispatcher.java:644)
	at com.evermind.server.http.ServletRequestDispatcher.forwardInternal(ServletRequestDispatcher.java:391)
	at com.evermind.server.http.HttpRequestHandler.doProcessRequest(HttpRequestHandler.java:908)
	at com.evermind.server.http.HttpRequestHandler.processRequest(HttpRequestHandler.java:458)
	at com.evermind.server.http.AJPRequestHandler.run(AJPRequestHandler.java:313)
	at com.evermind.server.http.AJPRequestHandler.run(AJPRequestHandler.java:199)
	at oracle.oc4j.network.ServerSocketReadHandler$SafeRunnable.run(ServerSocketReadHandler.java:260)
	at oracle.oc4j.network.ServerSocketAcceptHandler.procClientSocket(ServerSocketAcceptHandler.java:234)
	at oracle.oc4j.network.ServerSocketAcceptHandler.access$700(ServerSocketAcceptHandler.java:29)
	at oracle.oc4j.network.ServerSocketAcceptHandler$AcceptHandlerHorse.run(ServerSocketAcceptHandler.java:879)
	at com.evermind.util.ReleasableResourcePooledExecutor$MyWorker.run(ReleasableResourcePooledExecutor.java:298)
	at java.lang.Thread.run(Thread.java:595)

"AJPRequestHandler-HTTPThreadGroup-620" prio=1 tid=0xa306f4a0 nid=0x46de waiting for monitor entry [0x9ea77000..0x9ea77e20]
	.......................................


"AJPRequestHandler-RMICallHandler-619" prio=1 tid=0x08535340 nid=0x46dd waiting for monitor entry [0xa2a6e000..0xa2a6eda0]
	.......................................

"AJPRequestHandler-RMICallHandler-612" prio=1 tid=0xa22d7018 nid=0x3c70 runnable [0x9e9f6000..0x9e9f6e20]
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.read(SocketInputStream.java:129)
	at oracle.net.ns.Packet.receive(Unknown Source)
	at oracle.net.ns.DataPacket.receive(Unknown Source)
	at oracle.net.ns.NetInputStream.getNextPacket(Unknown Source)
	at oracle.net.ns.NetInputStream.read(Unknown Source)
	at oracle.net.ns.NetInputStream.read(Unknown Source)
	at oracle.net.ns.NetInputStream.read(Unknown Source)
	at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:978)
	at oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:950)
	at oracle.jdbc.driver.T4CTTIoauthenticate.receiveOauth(T4CTTIoauthenticate.java:618)
	at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:301)
	at oracle.jdbc.driver.PhysicalConnection.(PhysicalConnection.java:346)
	at oracle.jdbc.driver.T4CConnection.(T4CConnection.java:149)
	at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:31)
	at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:543)
	at java.sql.DriverManager.getConnection(DriverManager.java:525)
	- locked  (a java.lang.Class)
	at java.sql.DriverManager.getConnection(DriverManager.java:171)
	- locked  (a java.lang.Class)
	at com.abc.def.dbutilities.abcDBConnection.openOracleConnection(abcDBConnection.java:21)
	at com.abc.def.dao.InsertFeedbackDao.insertabcData(InsertFeedbackDao.java:447)
	at com.abc.def.bos.CloseMailBo.closeMail(CloseMailBo.java:366)
	at com.abc.def.actions.CloseMailAction.execute(CloseMailAction.java:60)
	at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:484)
	at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:274)
	at org.apache.struts.action.ActionServlet.process(ActionServlet.java:1482)
	at org.apache.struts.action.ActionServlet.doPost(ActionServlet.java:525)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:763)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:856)
	at com.evermind.server.http.ResourceFilterChain.doFilter(ResourceFilterChain.java:64)
	at com.abc.def.ghi.jkl(mno.java:113)
	at com.evermind.server.http.ServletRequestDispatcher.invoke(ServletRequestDispatcher.java:644)
	at com.evermind.server.http.ServletRequestDispatcher.forwardInternal(ServletRequestDispatcher.java:391)
	at com.evermind.server.http.HttpRequestHandler.doProcessRequest(HttpRequestHandler.java:908)
	at com.evermind.server.http.HttpRequestHandler.processRequest(HttpRequestHandler.java:458)
	at com.evermind.server.http.AJPRequestHandler.run(AJPRequestHandler.java:313)
	at com.evermind.server.http.AJPRequestHandler.run(AJPRequestHandler.java:199)
	at oracle.oc4j.network.ServerSocketReadHandler$SafeRunnable.run(ServerSocketReadHandler.java:260)
	at oracle.oc4j.network.ServerSocketAcceptHandler.procClientSocket(ServerSocketAcceptHandler.java:234)
	at oracle.oc4j.network.ServerSocketAcceptHandler.access$700(ServerSocketAcceptHandler.java:29)
	at oracle.oc4j.network.ServerSocketAcceptHandler$AcceptHandlerHorse.run(ServerSocketAcceptHandler.java:879)
	at com.evermind.util.ReleasableResourcePooledExecutor$MyWorker.run(ReleasableResourcePooledExecutor.java:303)
	at java.lang.Thread.run(Thread.java:595)

The thread RMICallHandler-612 is the culprit and has locked out all the other threads. A listener issue and no timeout settings have caused intermittent connections hangs to the abcDBConnection. But the important issue is that this slow connection to database abc has blocked out all the DriverManager.registerDriver calls in other threads connecting to other databases. There in lies the severe limitation of the java.sql.DriverManager calls. Most of the DriverManager calls are all synchronized and are not for use in production. Joe Weinstein has written several notes on these limitations – here is one

All the hanging threads are calling a servlet which queries a small table in the database to apply a security filter. The developer did not know how to include the servlet in the struts framework and was using this connection method just for this servlet. Actually the lookup table used for security filtering was quite small and static and they could have used ServletContext to avoid going to the database for this.

Dude, Where’s my Memory

Filed under: Uncategorized — srivenu @ 9:17 am

Being a movie buff, i couldn’t help myself from using this title (Unfortunately, a quick search on the web revealed that i ain’t the first or the only one to use this title!)

In brief, we have a 2-Node oracle 10.2.0.4 RAC database on Sun server’s running Solaris 10. Each of the servers had around 48GB RAM. OEM was showing memory utilization to be more than 90%. There was no paging on the system even during peak and vmstat continuously showed a page scan rate of zero. But the management was uncomfortable with the high memory utilization, and it was decided to increase the RAM to 64GB. But this did not improve matters as the additional memory was quickly consumed and total utilization once again spiked up to 90%.

I used the following method to debug the memory utilization.
Using mdb is documented here

# mdb -k
Loading modules: [ unix genunix specfs dtrace ufs ssd fcp fctl qlc pcisch sd md mpt ip hook neti 
sctp arp usba nca lofs zfs cpc random crypto wrsmd fcip logindmux ptm sppp nfs ipc ]
>
> ::memstat

Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                    5219013             40773   68%
Anon                      2136575             16691   28%
Exec and libs               42034               328    1%
Page cache                  20094               156    0%
Free (cachelist)            16281               127    0%
Free (freelist)            285106              2227    4%

Total                     7719103             60305
Physical                  7715086             60274

40GB of kernel?
Let’s dig deeper into kernel memory details.

> ::kmastat
cache                        buf    buf    buf    memory     alloc alloc
name                        size in use  total    in use   succeed  fail
------------------------- ------ ------ ------ --------- --------- -----
kmem_magazine_1               16  57994 142240   2293760   6014139     0
kmem_magazine_3               32  21610  32258   1040384    162150     0
...................................................
...................................................
------------------------- ------ ------ ------ --------- --------- -----
Total [static]                                   7536640  55505290     0
Total [hat_memload]                            305258496 655495609     0
Total [kmem_msb]                                74088448  19240511     0
Total [kmem_va]                                1052770304    256555     0
Total [kmem_default]                           1836761088 3094356297     0
Total [bp_map]                                   4194304    302511     0
Total [kmem_tsb_default]                         1040384  16858149     0
Total [hat_memload1]                            25165824 116449345     0
Total [segkmem_ppa]                                98304      6588     0
Total [umem_np]                                  3932160     69620     0
Total [segkp]                                  119144448   7942386     0
Total [zio_buf]                                149708800  16558948     0
Total [pcisch1_dvma]                              557056    912597     0
Total [pcisch0_dvma]                              327680       503     0
Total [pcisch3_dvma]                            14155776 742137383     0
Total [ip_minor_arena_sa]                            384   1097630     0
Total [ip_minor_arena_la]                           5504   3540980     0
Total [pcisch2_dvma]                             9895936 563983687     0
Total [dld_ctl]                                       64        36     0
Total [spdsock]                                       64         3     0
Total [namefs_inodes]                                 64        27     0
------------------------- ------ ------ ------ --------- --------- -----

vmem                         memory     memory    memory     alloc alloc
name                         in use      total    import   succeed  fail
------------------------- --------- ---------- --------- --------- -----
heap                      1100848578560 4398046511104         0    331671     0
    vmem_metadata          95453184   95682560  95682560     10995     0
        vmem_seg           89260032   89260032  89260032     10892     0
        vmem_hash           5831808    5840896   5840896       118     0
        vmem_vmem            334080     387056    352256       102     0
    static                  7675904    7675904   7675904       971     0
        static_alloc         123136     131072    131072        19     0
    hat_memload           305258496  305258496 305258496     37428     0
    kstat                    947760     966656    901120      3547     0
    kmem_metadata          76922880   77332480  77332480     11850     0
        kmem_msb           74088448   74088448  74088448     11696     0
        kmem_cache           565360     606208    606208       436     0
        kmem_hash           2191872    2228224   2228224       350     0
    kmem_log                1051232    1056768   1056768         6     0
    kmem_firewall_va              0          0         0         0     0
        kmem_firewall             0          0         0         0     0
    mod_sysfile                2448       8192      8192        58     0
    kmem_oversize         1810785470 1895497728 1895497728   1408517     0
    kmem_va               1068105728 1068105728 1068105728     13783     0
        kmem_default      1836761088 1836761088 1836761088    333110     0
    little_endian           1129024    1335296   1335296      1979     0
    big_endian             26265304   28803072  28803072      1356     0
    bp_map                  4194304    4194304   4194304       550     0
    ksyms                   2144568    2244608   2244608       332     0
    ctf                      253453     278528    278528       319     0
    kmem_tsb              314572800  314572800 314572800      1028     0
        kmem_tsb_default  234659840  314572800 314572800   2938958     0
    hat_memload1           25165824   25165824  25165824      3072     0
    kcage_arena                 520       8192      8192        13     0
    segkmem_ppa               98304      98304     98304         6     0
    umem_np                 4063232    4063232   4063232     39490     0
heap32                      6594560  134217728         0    334959     0
    promplat                      0          0         0    377546     0
    id32                          0          0         0         0     0
    module_data             2884190    7061504   6537216       454     0
    trapstat                      0          0         0         0     0
heaptext                   42352640  134217728         0       169     0
    module_text            10199000   10403840   8790016       330     0
logminor_space                   32     262137         0     40702     0
taskq_id_arena                   52 2147483647         0        73     0
heap_lp                   1979711488 1098169450496         0      6588     0
    kmem_lp               1979711488 1979711488 1979711488      5370  1218
segkp                     119144448 2147483648         0      4720     0
zio                       38809133056 63234891776         0   5469732     0
    zio_buf               38809133056 38809133056 38809133056   5469732     0
rctl_ids                         36      32767         0        36     0
zoneid_space                      0       9998         0         0     0
taskid_space                     48     999999         0     29967     0
pool_ids                          0     999998         0         0     0
contracts                        50 2147483646         0     46404     0
regspec                    22519808 5368709120         0       271     0
pcisch1_dvma                4980736 1040187392         0     74898     0
pcisch0_dvma                 327680 1040187392         0     87013     0
pcisch3_dvma               31989760 1040187392         0  70370904     0
ip_minor_arena_sa               384     262140         0         6     0
ip_minor_arena_la              5504 4294705152         0        87     0
pcisch2_dvma              158040064 1040187392         0     61207     0
dls_minor_arena                   2 4294967295         0         2     0
dld_ctl                          64 4294967295         0         1     0
tl_minor_space                  194     262138         0    258549     0
keysock                           0 4294967295         0         0     0
spdsock                          64 4294967295         0         1     0
namefs_inodes                    64      65536         0         1     0
Hex0x60054e9f648_minor            0 4294967294         0         0     0
Hex0x600570a4008_minor            0 4294967294         0         0     0
ptms_minor                        5         16         0        81     0
devfsadm_event_channel            1        101         0         1     0
devfsadm_event_channel            1          2         0         1     0
syseventconfd_event_channel         0        101         0         0     0
syseventconfd_event_channel         1          2         0         1     0
syseventd_channel                 6        101         0         6     0
syseventd_channel                 1          2         0         1     0
Hex0x600570a1c88_minor            0 4294967294         0         0     0
dtrace                        62582 4294967295         0   2057812     0
dtrace_minor                      0 4294967293         0         1     0
heaptext_holesrc_15          335872    2097152         0        21     0
    heaptext_hole_15         313264     335872    335872        54     0
heaptext_holesrc_16          483328    2097152         0        18     0
    heaptext_hole_16         473272     483328    483328        55     0
heaptext_holesrc_14          540672    2097152         0        17     0
    heaptext_hole_14         522480     540672    540672        38     0
heaptext_holesrc_13          360448    2097152         0        25     0
    heaptext_hole_13         332924     360448    360448        63     0
heaptext_holesrc_12          491520    2097152         0        13     0
    heaptext_hole_12         453996     491520    491520        31     0
lmsysid_space                     0      16383         0         0     0
shmids                            2        128         0      4762     0
semids                           10        128         0        16     0
crypto                            0         16         0      7098     0
msqids                            0        128         0         0     0
logdmux_minor                     0        256         0         0     0
module_text_holesrc         1236992    4194304         0        19     0
    heaptext_hole_0         1216292    1236992   1236992        42     0
sppptun_minor                     0         16         0         0     0
ibcm_local_sid                    0 4294967295         0         0     0
ibcm_ip_sid                       0      65535         0         0     0
------------------------- --------- ---------- --------- --------- -----
> exit

zio shows very high memory usage. (We are using ZFS).
Let’s dig deeper into zfs memory usage and zfs kernel parameters

# kstat -m zfs
module: zfs                             instance: 0
name:   arcstats                        class:    misc
        c                               38539677282
        c_max                           62161149952
        c_min                           7770143744
        crtime                          282.9648021
        deleted                         5979967
        demand_data_hits                46535834
        demand_data_misses              89071
        demand_metadata_hits            5274233
        demand_metadata_misses          27598
        evict_skip                      0
        hash_chain_max                  7
        hash_chains                     136691
        hash_collisions                 3827378
        hash_elements                   661092
        hash_elements_max               720203
        hdr_size                        111542928
        hits                            61247851
        l2_abort_lowmem                 0
        l2_cksum_bad                    0
        l2_evict_lock_retry             0
        l2_evict_reading                0
        l2_feeds                        0
        l2_free_on_write                0
        l2_hdr_size                     0
        l2_hits                         0
        l2_io_error                     0
        l2_misses                       0
        l2_rw_clash                     0
        l2_size                         0
        l2_writes_done                  0
        l2_writes_error                 0
        l2_writes_hdr_miss              0
        l2_writes_sent                  0
        memory_throttle_count           0
        mfu_ghost_hits                  1537137
        mfu_hits                        47087889
        misses                          4019246
        mru_ghost_hits                  25177
        mru_hits                        8989078
        mutex_miss                      22768
        p                               17711720156
        prefetch_data_hits              9033089
        prefetch_data_misses            3895947
        prefetch_metadata_hits          404695
        prefetch_metadata_misses        6630
        recycle_miss                    8277
        size                            38444263776
        snaptime                        653587.0459765

module: zfs                             instance: 0
name:   vdev_cache_stats                class:    misc
        crtime                          282.9648542
        delegations                     7158
        hits                            11575
        misses                          18702
        snaptime                        653587.0477537

ZFS memory usage parameters were not configured on this server, hence it was using system defined defaults.
The max was set to 57GB (c_max 62161149952) and min was 7GB (c_min 7770143744). A google search fetched me this note.

Following is the output of arcstat.pl on our server (Thanks to Neelakanth Nadgir)

# arcstat.pl
    Time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
17:48:29   65M    4M      6  116K    0    3M   29   34K    0    37G   37G
17:48:30    28     0      0     0    0     0    0     0    0    37G   37G
17:48:31     6     0      0     0    0     0    0     0    0    37G   37G
17:48:32    19     0      0     0    0     0    0     0    0    37G   37G
17:48:33   361     0      0     0    0     0    0     0    0    37G   37G

I found the following note on ZFS tuning – ZFS Evil Tuning Guide

Based on the situations listed out in the above note, in the section “Limiting the ARC cache”, we have decided to limit the ARC cache on our servers to a value after considering the memory usage of oracle and others.
I would like to blog about my experiences on ZFS ARC memory tuning in an other note.

Blog at WordPress.com.