Coskan’s Approach to Oracle

August 18, 2010

Session Snapper on duty

Filed under: Performance — coskan @ 11:52 am

I have joined another company 2 weeks ago and I immediately started  to enjoy of the active Oracle Environment.
It looks like  I really missed the action too much in last 3 years (Last company was a bit quite).
Because I am going to upgrade major systems to 11GR2 from 10GR2 I will be very busy but I hope I can write somethin.
Trying to keep up with blogroll report but no chance :(
Anyways today’s story does not need long paragraphs. Everything is self explanatory. (Because I did not write the tools I used here,  all credit can go to Tanel Poder and his scripts)
Long story short I got a call that system is slow and waiting on library cache lock. Here is the action I took (the script output is also available as text at the bottom)
######################################################
#####SESSION STATE WHEN I CHECK COUPLE OF TIMES#######
######################################################

DBA@CLONE1> @swact

    SID STATE   EVENT                                          SEQ# SEC_IN_STATE         P1         P2         P3 P1TRANSL                                   SQL_ID          CHILD_NO
------- ------- ---------------------------------------- ---------- ------------ ---------- ---------- ---------- ------------------------------------------ ------------- ----------
    698 WAITING DIAG idle wait                                  696          662          7          8          0                                            4tq4zndbtty5g          0
    897 WAITING Streams AQ: waiting for messages in the       64312            0       8560 1.2335E+10          5
   1047 WAITING Streams AQ: waiting for messages in the       15818            0     539956 1.2251E+10         10                                            ady0t9s6h5x5n          0
    732 WAITING cursor: pin S wait on X                       37240            0 2272359238 2.5469E+12 2.1475E+10                                            fwxq99y3r2wu6
    680 WAITING cursor: pin S wait on X                       30734            0 2272359238 2.5469E+12 2.1475E+10                                            fwxq99y3r2wu6
    926 WAITING cursor: pin S wait on X                       48363            0 2272359238 2.5469E+12 2.1475E+10                                            fwxq99y3r2wu6
    682 WAITING cursor: pin S wait on X                       38796            0 3849361425 3.4145E+12 2.1476E+10                                            7v63nkmkr160j
    906 WAITING cursor: pin S wait on X                       31566            0 3849361425 3.4145E+12 2.1476E+10                                            7v63nkmkr160j
   1037 WAITING cursor: pin S wait on X                         854            0 3849361425 3.4145E+12 2.1476E+10                                            7v63nkmkr160j
    612 WAITING jobq slave wait                                   1          271          0          0          0                                                                   0
   1053 WAITING library cache lock                             1474         4650 1.2269E+10 1.1683E+10        501                                            7xvut41x0dxj2          0
    774 WAITING library cache lock                               41         2866 1.2269E+10 1.1663E+10        301                                            9r5pb2sbs10v4          0
    593 WAITING library cache lock                            15936          834 1.2269E+10 1.1775E+10        301                                            fwxq99y3r2wu6
    979 WAITING library cache lock                              553         2646 1.2288E+10 1.1516E+10        301                                            a610tnjgas2xg          2
    613 WAITING library cache lock                              750         1131 1.2288E+10 1.1511E+10        301                                            gyg59d0f8hk1t          0
    961 WAITING library cache lock                            34638         2752 1.2305E+10 1.1518E+10        501                                            0000000000000          0
    914 WAITING library cache pin                               543          111 1.2300E+10 1.1665E+10        202                                            a610tnjgas2xg          2
   1061 WAITING wait for unread message on broadcast cha      30681            4 1.2239E+10 1.2221E+10          0                                            2b064ybzkwf1y          0
    760 WORKING On CPU / runqueue                               369            0 1413697536          1          0                                            3y2zzajvss71p          0
    795 WORKING On CPU / runqueue                             42057           75 1.1969E+10        215          0                                            b221muwskhmq6          0

20 rows selected.

#########################################################################
#### ONLY ONE SESSION IS RUNNING OTHER THAN MINE LETS SEE WHAT IS THE SQL PLAN for 795  ####
#########################################################################

DBA@CLONE1> @e3
Enter value for sql_id:  b221muwskhmq6
Enter value for child_number: 0

PLAN_TABLE_OUTPUT
-----------------------------------------------------------
SQL_ID:  b221muwskhmq6, child number: 0 cannot be found

######################################
######NO PLAN BUT WHAT IS THE SQL#####
######################################

DBA@CLONE1> @sqltext  b221muwskhmq6

no rows selected

######################################
###NO SQL IN CACHE MAYBE IN HISTORY###
######################################

DBA@CLONE1>  @h_sqltext b221muwskhmq6

no rows selected

DBA@CLONE1> l
  1* select sql_text from dba_hist_sqltext where sql_id='&1'

######################################
#######NO CHANCE- TIME TO SNAP########
######################################

DBA@CLONE1> @snapper ash,stats 50 1 795
Sampling...

-- Session Snapper v3.10 by Tanel Poder @ E2SN ( http://tech.e2sn.com )

----------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                               ,         DELTA, HDELTA/SEC,    %TIME, GRAPH
----------------------------------------------------------------------------------------------------------------------
    795, VALDATACAP, STAT, opened cursors cumulative               ,           210,       4.12,
    795, VALDATACAP, STAT, recursive calls                         ,          2856,         56,
    795, VALDATACAP, STAT, session logical reads                   ,          1344,      26.35,
    795, VALDATACAP, STAT, CPU used by this session                ,          4648,      91.14,
    795, VALDATACAP, STAT, concurrency wait time                   ,           337,       6.61,
    795, VALDATACAP, STAT, session pga memory                      ,       5111808,    100.23k,
    795, VALDATACAP, STAT, session pga memory max                  ,       5111808,    100.23k,
    795, VALDATACAP, STAT, global enqueue gets sync                ,            42,        .82,
    795, VALDATACAP, STAT, global enqueue releases                 ,            42,        .82,
    795, VALDATACAP, STAT, consistent gets                         ,          1344,      26.35,
    795, VALDATACAP, STAT, consistent gets from cache              ,          1344,      26.35,
    795, VALDATACAP, STAT, consistent gets - examination           ,           672,      13.18,
    795, VALDATACAP, STAT, calls to get snapshot scn: kcmgss       ,           378,       7.41,
    795, VALDATACAP, STAT, no work - consistent read gets          ,           672,      13.18,
    795, VALDATACAP, STAT, table fetch by rowid                    ,           546,      10.71,
    795, VALDATACAP, STAT, cluster key scans                       ,            84,       1.65,
    795, VALDATACAP, STAT, cluster key scan block gets             ,           210,       4.12,
    795, VALDATACAP, STAT, rows fetched via callback               ,           126,       2.47,
    795, VALDATACAP, STAT, index fetch by key                      ,           252,       4.94,
    795, VALDATACAP, STAT, index scans kdiixs1                     ,            42,        .82,
    795, VALDATACAP, STAT, sql area purged                         ,            42,        .82,
    795, VALDATACAP, STAT, session cursor cache hits               ,           378,       7.41,
    795, VALDATACAP, STAT, buffer is not pinned count              ,          1306,      25.61,
    795, VALDATACAP, STAT, parse time cpu                          ,             1,        .02,
    795, VALDATACAP, STAT, parse count (total)                     ,           379,       7.43,
    795, VALDATACAP, STAT, parse count (hard)                      ,            42,        .82,
    795, VALDATACAP, STAT, execute count                           ,           379,       7.43,
    795, VALDATACAP, STAT, sorts (memory)                          ,            42,        .82,
    795, VALDATACAP, STAT, sorts (rows)                            ,           126,       2.47,
    795, VALDATACAP, TIME, hard parse elapsed time                 ,      47918246,   939.57ms,    94.0%, |@@@@@@@@@@|
    795, VALDATACAP, TIME, parse time elapsed                      ,      47918246,   939.57ms,    94.0%, |@@@@@@@@@@|
    795, VALDATACAP, TIME, DB CPU                                  ,      44676472,   876.01ms,    87.6%, |@@@@@@@@@ |
    795, VALDATACAP, TIME, sql execute elapsed time                ,      47918246,   939.57ms,    94.0%, |@@@@@@@@@@|
    795, VALDATACAP, TIME, DB time                                 ,      47918246,   939.57ms,    94.0%, |@@@@@@@@@@|
    795, VALDATACAP, WAIT, latch: library cache                    ,       3370707,    66.09ms,     6.6%, |@         |
--  End of Stats snap 1, end=2010-08-18 11:28:06, seconds=51

-----------------------------------------------------------------------
Active% | SQL_ID          | EVENT                     | WAIT_CLASS
-----------------------------------------------------------------------
    94% | b221muwskhmq6   | ON CPU                    | ON CPU
     6% | b221muwskhmq6   | latch: library cache      | Concurrency

--  End of ASH snap 1, end=2010-08-18 11:28:06, seconds=50, samples_taken=100

##############################################################
#######HARD PARSE SOFT PARSE AND ON CPU ALL THE TIME ?########
##############################################################

######################################
#####SESSIONS ARE STILL WAITING#######
######################################

DBA@CLONE1> @swact

    SID STATE   EVENT                                          SEQ# SEC_IN_STATE         P1         P2         P3 P1TRANSL                                   SQLID             CHILD_NO
------- ------- ---------------------------------------- ---------- ------------ ---------- ---------- ---------- ------------------------------------------ --------------- ----------
    698 WAITING DIAG idle wait                                  731          151          7          8          0                                            4tq4zndbtty5g            0
    897 WAITING Streams AQ: waiting for messages in the       64728            2       8560 1.2335E+10          5
   1047 WAITING Streams AQ: waiting for messages in the       16512            3     539956 1.2251E+10         10                                            ady0t9s6h5x5n            0
    732 WAITING cursor: pin S wait on X                         692            0 2272359238 3.9771E+12 2.1475E+10                                            fwxq99y3r2wu6
    682 WAITING cursor: pin S wait on X                       34091            0 3849361425 3.4145E+12 2.1476E+10                                            7v63nkmkr160j
    906 WAITING cursor: pin S wait on X                       26951            0 3849361425 3.4145E+12 2.1476E+10                                            7v63nkmkr160j
   1037 WAITING cursor: pin S wait on X                       61795            0 3849361425 3.4145E+12 2.1476E+10                                            7v63nkmkr160j
    612 WAITING jobq slave wait                                   1          946          0          0          0                                                                     0
   1053 WAITING library cache lock                             1474         5325 1.2269E+10 1.1683E+10        501                                            7xvut41x0dxj2            0
    774 WAITING library cache lock                               41         3541 1.2269E+10 1.1663E+10        301                                            9r5pb2sbs10v4            0
    926 WAITING library cache lock                            42987            9 1.2269E+10 1.1689E+10        301                                            fwxq99y3r2wu6
    979 WAITING library cache lock                              553         3321 1.2288E+10 1.1516E+10        301                                            a610tnjgas2xg            2
    961 WAITING library cache lock                            34638         3427 1.2305E+10 1.1518E+10        501                                            0000000000000            0
   1061 WAITING wait for unread message on broadcast cha      30791            9 1.2239E+10 1.2221E+10          0                                            2b064ybzkwf1y            0
    760 WORKING On CPU / runqueue                               750            0 1413697536          1          0                                            3y2zzajvss71p            0
    795 WORKING On CPU / runqueue                             42061           67 1.1969E+10        215          0                                            b221muwskhmq6            0

######################################################################
#######GOT PERMISSION TO KILL THE SESSION AND I KILLED FROM OS########
######################################################################

-----KILL THE SESSION AND CHECK AGAIN

######################################
#####NO MORE LIBRARY CACHE LOCK#######
######################################

COSKANDBA@CLONE1> @swact

    SID STATE   EVENT                                          SEQ# SEC_IN_STATE         P1         P2         P3 P1TRANSL                                   SQLID             CHILD_NO
------- ------- ---------------------------------------- ---------- ------------ ---------- ---------- ---------- ------------------------------------------ --------------- ----------
    698 WAITING DIAG idle wait                                  733          122          7          8          0                                            4tq4zndbtty5g            0
   1053 WAITING PX Deq: Execute Reply                          1541            0        200          2          0                                            62gz6xzzrb8r3            0
    732 WAITING PX Deq: Execution Msg                             8            1  268566527          4          0                                                                     0
    774 WAITING PX Deq: Execution Msg                             8            1  268566527          4          0                                                                     0
    897 WAITING Streams AQ: waiting for messages in the       64880            1       8560 1.2335E+10          5
   1047 WAITING Streams AQ: waiting for messages in the       16720            5     539956 1.2251E+10         10                                            ady0t9s6h5x5n            0
    977 WAITING db file scattered read                          738            0        217    1214913         32                                                                     0
    926 WAITING db file scattered read                         1076            0        243    4114803         32                                                                     0
    612 WAITING jobq slave wait                                   1         1149          0          0          0                                                                     0
   1061 WAITING wait for unread message on broadcast cha      30826            1 1.2239E+10 1.2221E+10          0                                            2b064ybzkwf1y            0
    961 WORKING On CPU / runqueue                             45580            0 1230372869 2507953651  559265612                                            0000000000000            0
    760 WORKING On CPU / runqueue                               799            0 1413697536          1          0                                            3y2zzajvss71p            0

Cause of the problem :

It is caused because of a session whose application server was shut down but somehow it stuck on DB at parsing and continuously spinning on CPU and blocking others.

I wish I could see which sql it was running but no chance. App owner was not sure what it could be doing :(

Moral of the story :

Snapper is just so cool :)

Thanks again Tanel, you are a superstar.

For the ones who is loving OEM, The Performance  Graph is just like this but no chance to find what the session is doing from OEM (or I don’t know how to look)

Because of the formatting I think it is better to put a text output


The scripts used here are

Tanel Poders sw.sql for session waits   (swact.sql  is just to call it like @sw ‘select sid from v$session where status=”ACTIVE” and type<>”BACKGROUND”’)

Tanel Poders snapper.sql for snapping

e3.sql  is for gathering plan (SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR((‘&sql_id’),&child_number,’ALLSTATS LAST’));)

sqltext.sql  is for getting the sqltext (select sql_text from dba_hist_sqltext where sql_id=’&1′)

instead of e3 and sqltext.sql you can use Tanel Poders usql.sql :)

About these ads

11 Comments »

  1. Hi Coskan

    Great job, but it’s a bit sad that you didn’t/couldn’t use os_explain. But I guess you decided it’s to risky to take a stack trace of the oracle process on production.

    For those who don’t know os_explain: It’s another very useful tool written by Tanel Poder. This tool takes a stack trace as input and then translates the call stack into an execution plan. This is done by mapping kernel functions to specific operations i.e. qerhj does a HASH JOIN.

    Wish you a good start at your new job.

    Cheers Dani

    Comment by danirey — August 18, 2010 @ 3:25 pm

    • Thanks for the feedback and good wishes Dani. For the os_explain I wish I could but I am not armed with that much privileges yet and business were already ok to kill the session. Normally it is the last thing I do after running out of diagnosing options.

      Strace is also a good option to see what is really going on behind the scene

      Comment by coskan — August 18, 2010 @ 4:16 pm

  2. I have circulated this link to my team as it is a very good example of a thought process in action and how it leads to a successful finding.

    Nice write-up

    John

    Comment by John — August 19, 2010 @ 10:06 am

    • Thanks for the feedback John. Nice to see you around here :)

      Comment by coskan — August 19, 2010 @ 1:26 pm

  3. Coşkan selam, yeni işinde başarılar, umarım gönlünce olur ve blog’larından daha fazla faydalanma şansımız olur. Görüşürüz.

    Comment by H.Tonguç Yılmaz — August 19, 2010 @ 9:42 pm

  4. Well done Coskan and good luck on your new job !
    Very good infos , my coffe was better today :-)

    Luc

    PS. Please , keep up with with blogroll report . I’m a FAN !

    Comment by Lucian — August 20, 2010 @ 8:10 am

  5. [...] Gundgobar (Log Buffer #49) responds to a slow system waiting on library cache lock and credits Tanel Poder’s scripts as the [...]

    Pingback by Log Buffer #200, A Carnival of the Vanities for DBAs | The Pythian Blog — August 21, 2010 @ 4:28 am

  6. Nice blog post.

    Future mysteries involving missing SQL might be answered by Kerry Osborne’s article here:

    http://kerryosborne.oracle-guy.com/2009/04/hidden-sql-why-cant-i-find-my-sql-text

    Is essence: check out v$open_cursor.

    Comment by Richard Armstrong-Finnerty — January 10, 2011 @ 5:00 pm

    • The problem here wasn’t about the situation Kerry was explaining. This is a bug Oracle loses track of sqlid when it does too many revursive sqls. When I see this I usually check the last sqlid in ash data (99% of the time it is the sqlid which is running ) or I dump the error stack which reveals the sqlid in the dump

      Comment by coskan — January 12, 2011 @ 11:40 pm

  7. hi Coskan, great article. But had a quick question. When you run @swact. I see that you have 2 extra column output ( SQLID,CHILD_NO)…but when i took that script and tested it out….it seemed like i do not get the output…i am on 10.2.0.4. Did you modify the script a little bit more to get that output…if so…can you share that with us. Thanks. Altough we can get its from other dynamic views, its nice to have it when running this…

    Comment by max — January 28, 2011 @ 4:18 pm


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Silver is the New Black Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 203 other followers

%d bloggers like this: