Coskan’s Approach to Oracle

August 2, 2013

oratop from MOS

Filed under: Basics, Linux, Monitoring, Performance, Tips — coskan @ 10:12 am

This is just a pointer post to a tool which I discovered today. It is ORATOP, to monitor databases real time like top of unix flavours. It is pretty much like MOATS   (tool from Tanel Poder and Adrian Billington and RAC version by  Jagjeet Singh),   with a bit better output (at least for my taste)

 

It doesn’t need anything to be installed which is the best part of the tool.

oratop – utility for near real-time monitoring of databases, RAC and Single Instance (Doc ID 1500864.1)

 

Sample output is like below. It can be a good addition to your performance tuning toolkits.

 

oratop

February 6, 2012

Reduce Transportable Tablespace Downtime using Cross Platform Incremental Backups (on Exadata!!)

Filed under: Basics — coskan @ 7:47 am

Just a quick note to point a MOS note which I found out yesterday . I already tweeted about it and it has been retweeted many time by my follower, but for wider audience I decided to write a blog post as well.

As you may already know, to migrate databases between different endian formats, using RMAN Transportable Tablespaces is an option but for this option to work, you need to have a downtime since tablespaces needs to be read only, and this downtime can be very long duration depending on the data size.

 

To sort this problem for migrations to Exadata platform (method only supported on Exadata) Oracle now brought incremental backup availability for TTS, till Oracle puts this as a public document (or somebody post about the steps) you need a MOS access to  see the method.

For more information.

Reduce Transportable Tablespace Downtime using Cross Platform Incremental Backups [ID 1389592.1]

 

In my humble opinion, Oracle better decide to support this on other platforms since method should not have anything to do with Exadata spesific other than being supported (don’t know any rman method which needs Exadata spesific features)

January 13, 2012

Session Snapper on Duty 2 – (how to support your answers, with proof)

Filed under: Basics, Performance, Tips — coskan @ 10:51 pm

Warning : This long blog is for educational and quality purposes and especially targeting Junior DBAs and DBAs who does not know how to use the Session Snapper
(greatest Oracle Tuning Tool yet) by Tanel Poder . If you feel comfortable with snapper and you think your relationship with your
clients just rocks then there is no need to read the post but you may still give it a go :)

Story started with a long email thread forward by a  developer who is asking question below for a system I am responsible but not feeling perfectly comfortable about how things really work.

Hi Josh (note to reader: Coskan is pronounced as Joshkahn so I use Josh at office)
Any ideas on how we can work with dbas to find out why our database is only committing at a range of 
2-30 sec for 100 commits instead of the ideal 100 commits a sec for a third party process?

I answered

From my point of view 

I will be requesting 

1-	Time interval when this process run
2-	If you have SID information when this process run 
3-	If it gives same results on UAT as well then can we run the operation while tracing is on (which need Change Request-hassle on production)

Then we can contione to understand what can be the problem

Developer replied

It runs 7/24 and UAT testing is not possible since we can't point the feeds to UAT but username is USER_XXX 
if it helps.

Long story short all I understood was in simplest words  they want to be able to process more data in same amount of time,  so I replied with a long mail below

BEGINNING OF THE EMAIL

I did some sampling on your sessions to understand what is happening and I think I can come up with rough conclusions about improvement suggestions.

Lets walk through on what is/maybe happening

This is 500 second sampling of one of your sessions

---Some important Statistics
-------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                                                 ,     HDELTA, HDELTA/SEC,    %TIME, GRAPH
-------------------------------------------------------------------------------------------------------------------------------------
     71, USER_XXX  , STAT, user commits                                              ,        326,        .65,
     71, USER_XXX  , STAT, session logical reads                                     ,      1.36M,      2.71k,

—Wait profile

     71, USER_XXX  , WAIT, buffer busy waits                                         ,     1.18ms,     2.36us,      .0%, |          |
     71, USER_XXX  , WAIT, log file sync                                             ,     36.12s,    72.24ms,     7.2%, |@         |
     71, USER_XXX  , WAIT, db file sequential read                                   ,    276.17s,   552.34ms,    55.2%, |@@@@@@    |
     71, USER_XXX  , WAIT, flashback buf free by RVWR                                ,   403.28ms,   806.55us,      .1%, |          |
     71, USER_XXX  , WAIT, undo segment extension                                    ,    72.93ms,   145.86us,      .0%, |          |
     71, USER_XXX  , WAIT, SQL*Net message to client                                 ,     2.23ms,     4.47us,      .0%, |          |
     71, USER_XXX  , WAIT, SQL*Net message from client                               ,     178.2s,    356.4ms,    35.6%, |@@@@      |
     71, USER_XXX  , WAIT, SQL*Net more data from client                             ,   710.26ms,     1.42ms,      .1%, |          |

—Wait profile breakdown per sql

---------------------------------------------------------------------------------
Active% | SQL_ID          | EVENT                               | WAIT_CLASS
---------------------------------------------------------------------------------
    20% | ggjkmtv8cxxh3   | db file sequential read             | User I/O
    19% | 64djbhqv5v93u   | db file sequential read             | User I/O
    13% | 3b8gt3dqtn345   | db file sequential read             | User I/O
     8% |                 | log file sync                       | Commit
     2% | 8zt375nzk26rh   | db file sequential read             | User I/O
     1% | 8ztz7p814q9u2   | db file sequential read             | User I/O
     1% | ggjkmtv8cxxh3   | ON CPU                              | ON CPU
     1% | 64djbhqv5v93u   | ON CPU                              | ON CPU
     0% | 64djbhqv5v93u   | flashback buf free by RVWR          | Configuration
     0% | 64djbhqv5v93u   | SQL*Net more data from client       | Network

--  End of ASH snap 1, end=2012-01-11 21:51:46, seconds=500, samples_taken=499

When we check the stats, you have done 326 commits in 500 seconds which makes 0.65 commit per second but if we take 180 second of idle events (SQL*Net message to client + SQL*Net message from client) out, it is 326 commits in 320 (500-180) seconds active time which is roughly 1 commit per second.

When we look at the wait profile we can see 276 seconds of “db file sequential read” (physical read) over 320 second ofnon-iddle period and 36  second  “log file sync” which we will come later

If we focus on the “db file sequential read” we can see it is %86 (276/320) of the active time and there are 3 main contributer sqls for this case.

SQLs are below

SQL> @sqlid2 ggjkmtv8cxxh3

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ---------------------------------------------------------------------------------------------------------------
ggjkmtv8cxxh3 3503224323 DELETE FROM TABLE_T WHERE ESMP = :1

1 row selected.

  PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
---------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
4033870845        4          0         21      20219           .0        392.7       .26172575          1198.3            28.1


Plan hash value: 4033870845

-----------------------------------------------------------------------------------------------
| Id  | Operation         | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT  |                           |       |       |     4 (100)|          |
|   1 |  DELETE           | TABLE_T                   |       |       |            |          |
|*  2 |   INDEX RANGE SCAN| PK_TABLE_T                |     2 |    50 |     4   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
SQL> @sqlid2 64djbhqv5v93u

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
64djbhqv5v93u 3059590266 some complex update statement which does not need to be shared

PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
---------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
1856144614        3         26          2     473152           .0         85.4      .270458035           454.4            25.4

Plan hash value: 1856144614

-------------------------------------------------------------------------------------------
| Id  | Operation          | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |                      |       |       |     3 (100)|          |
|   1 |  UPDATE            | TABLE_XXXXXXX        |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN| PK_TABLE_XXXXXXX     |     1 |   192 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

SQL> @sqlid2 3b8gt3dqtn345

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
3b8gt3dqtn345 1838812293 some complex update statement which does not need to be shared

PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
--------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
861798698        3          0          7      22308           .0         97.5      .113096364           382.6            10.9

Plan hash value: 861798698

------------------------------------------------------------------------------------------------
| Id  | Operation          | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |                           |       |       |     3 (100)|          |
|   1 |  UPDATE            | TABLE_ZZZZ                |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN| PK_TABLE_ZZZZ             |     1 |    69 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

There is nothing much we can do about index unique scans other than changing the table structure but I think there is room for improve on index range scan for sql_id ggjkmtv8cxxh3 for delete statement which is waitig on db file sequential read for  %31 (500*%20 / 320 ) of the active time.

The reason the delete is doing index range scan is because we are trying to access 3 column composite index PK_TABLE_T by only using 1 column which is COL1. If business logic lets you also give other 2 columns as filter as well we will be doing much throughput with less LIO and PIO

TABLE_OWNER          TABLE_NAME                     INDEX_NAME                     POS# COLUMN_NAME
-------------------- ------------------------------ ------------------------------ ---- ------------------------------
APP_OWNER            TABLE_T                        PK_TABLE_T                        1 COL1
                                                                                      2 COL2
                                                                                      3 COL3

While I was doing sampling on above session at the same time I also did sampling on another session which is also having similar behaviour with different sqls like below.

—Some important Statistics

-------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                                                 ,     HDELTA, HDELTA/SEC,    %TIME, GRAPH
-------------------------------------------------------------------------------------------------------------------------------------
    192, USER_XXX  , STAT, user commits                                              ,        343,        .69,
    192, USER_XXX  , STAT, session logical reads                                     ,      2.78M,      5.56k,

—Wait profile

  192, USER_XXX  , WAIT, buffer busy waits                                         ,      702us,      1.4us,      .0%, |          |
  192, USER_XXX  , WAIT, log file switch completion                                ,    94.34ms,   188.68us,      .0%, |          |
  192, USER_XXX  , WAIT, log file sync                                             ,     28.26s,    56.53ms,     5.7%, |@         |
  192, USER_XXX  , WAIT, db file sequential read                                   ,     364.3s,    728.6ms,    72.9%, |@@@@@@@@  |
  192, USER_XXX  , WAIT, flashback buf free by RVWR                                ,   348.79ms,   697.57us,      .1%, |          |
  192, USER_XXX  , WAIT, undo segment extension                                    ,    492.4ms,    984.8us,      .1%, |          |
  192, USER_XXX  , WAIT, SQL*Net message to client                                 ,     1.03ms,     2.07us,      .0%, |          |
  192, USER_XXX  , WAIT, SQL*Net message from client                               ,     94.02s,   188.05ms,    18.8%, |@@        |
  192, USER_XXX  , WAIT, SQL*Net more data from client                             ,   368.33ms,   736.67us,      .1%, |          |

—Wait profile breakdown per sql

---------------------------------------------------------------------------------
Active% | SQL_ID          | EVENT                               | WAIT_CLASS
---------------------------------------------------------------------------------
    37% | dkh6b1nhnhfcu   | db file sequential read             | User I/O
    32% | 29zz9n1b9sppr   | db file sequential read             | User I/O
     7% |                 | log file sync                       | Commit
     1% | dkh6b1nhnhfcu   | ON CPU                              | ON CPU
     1% | 29zz9n1b9sppr   | ON CPU                              | ON CPU
     0% | 29zz9n1b9sppr   | undo segment extension              | Configuration
     0% | 29zz9n1b9sppr   | flashback buf free by RVWR          | Configuration

--  End of ASH snap 1, end=2012-01-11 21:52:17, seconds=500, samples_taken=499

This time 94 seconds idle with 406 (500-94) second processing time which gives us 0.84 (343/406) commit per second and 392 seconds of wait
(364 “db file sequential read”+28 “log file sync”) wait.

When we look at the sqls below this time we can see 160 (500*%32) of 364 seconds (%43 of active time) is being spent on again index range scan (sqlid 29zz9n1b9sppr) on delete statement again due to the lack of composite index column filtering.

I again believe we can improve the access if we are able to give all columns as filter and maintain index unique scan.


TABLE_OWNER          TABLE_NAME                     INDEX_NAME                     POS# COLUMN_NAME                    DSC
-------------------- ------------------------------ ------------------------------ ---- ------------------------------ ----
APP_OWNER            TABLE_Z                        PK_TABLE_Z                      1    COL1
                                                                                    2    COL2
                                                                                    3    COL3

SQL> @sqlid2 dkh6b1nhnhfcu

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
dkh6b1nhnhfcu  558381466 INSERT INTO APP_OWNER.TABLE_Z where bla bla bla bla

PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
--------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
        0        1        778         15     102566           .0       1065.3      .177031281          5673.4            30.4

-------------------------------------------------
| Id  | Operation                | Name | Cost  |
-------------------------------------------------
|   0 | INSERT STATEMENT         |      |     1 |
|   1 |  LOAD TABLE CONVENTIONAL |      |       |
-------------------------------------------------

SQL> @sqlid2 29zz9n1b9sppr

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- -------------------------------------------------------------------------------------------------------
29zz9n1b9sppr 1453086391 DELETE FROM APP_OWNER.TABLE_Z WHERE ESMP = :1

1 row selected.

PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
---------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
3229062859        4          0         30      21308           .0       1136.6      .642183193          2081.5            68.1

--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT  |                  |       |       |     4 (100)|          |
|   1 |  DELETE           | TABLE_Z          |       |       |            |          |
|*  2 |   INDEX RANGE SCAN| PK_TABLE_Z       |     5 |   135 |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

I think I managed to make it clear about the index range scan (for more information see the bonus material below), Now Lets see what the log file sync is all about and why we are seeing it .
<p

Log File sync event has got definition below in Oracle Documentation.

log file sync

When a user session commits, the session’s redo information needs to be flushed to the redo logfile. The user session will post the LGWR to write the log buffer to the redo log file. When the LGWR has finished writing, it will post the user session.

Wait Time: The wait time includes the writing of the log buffer and the post.

Parameter Description
buffer# The number of the physical buffer in the redo log buffer that needs to be synchronized

Potential causes

Wait Event General Area Possible Causes Look for / Examine
log file sync I/O, over- committing Slow disks that store the online logs

Un-batched commits

Check the disks that house the online redo logs for resource contention. Check the number of transactions (commits + rollbacks) each second, from V$SYSSTAT.

First rule of tuning log file sync is doing less commit which means instead of commiting every dml or every single operation like

begin
for i in 300001..600000 loop
delete from t1 where id1=to_char(i);
commit;
end loop;
end;
/

Doing commits in batch like

begin
for i in 300001..600000 loop
delete from t1 where id1=to_char(i);
end loop;
commit;
end;
/

To prove how you are doing commits, I just get before sql statistics then get sampling of the session then again get after sql statistics and to see if number of user commits statistics during sampling interval are matching with number of increase in the sql executions

—-note to blog reader: this is, how I did,  no rocket science  (ones you arm yourself with right tools) just run the block as whole

select sql_id,executions from v$sql where sql_id in ('29zz9n1b9sppr','ggjkmtv8cxxh3');
@snapper ash=sql_id+event,stats,gather=stw,sinclude=user%commits 20 1 71,192
select sql_id,executions from v$sql where sql_id in ('29zz9n1b9sppr','ggjkmtv8cxxh3');

Results are

SQL> @test1

SQL_ID        EXECUTIONS
------------- ----------
29zz9n1b9sppr      28897
ggjkmtv8cxxh3      28211

Sampling SID 71,192 with interval 100 seconds, taking 1 snapshots...

-- Session Snapper v3.53 by Tanel Poder ( http://blog.tanelpoder.com )

-------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                                                 ,     HDELTA, HDELTA/SEC,    %TIME, GRAPH
-------------------------------------------------------------------------------------------------------------------------------------
     71, USER_XXX  , STAT, user commits                                              ,         12,        .12,
     71, USER_XXX  , TIME, repeated bind elapsed time                                ,       59us,      .59us,      .0%, |          |
     71, USER_XXX  , TIME, parse time elapsed                                        ,      127us,     1.27us,      .0%, |          |
     71, USER_XXX  , TIME, DB CPU                                                    ,   278.96ms,     2.79ms,      .3%, |          |
     71, USER_XXX  , TIME, sql execute elapsed time                                  ,     36.32s,   363.21ms,    36.3%, |@@@@      |
     71, USER_XXX  , TIME, DB time                                                   ,     40.75s,   407.51ms,    40.8%, |@@@@@     |
     71, USER_XXX  , WAIT, log file sync                                             ,      4.42s,    44.22ms,     4.4%, |@         |
     71, USER_XXX  , WAIT, db file sequential read                                   ,     35.04s,   350.43ms,    35.0%, |@@@@      |
     71, USER_XXX  , WAIT, SQL*Net message to client                                 ,      242us,     2.42us,      .0%, |          |
     71, USER_XXX  , WAIT, SQL*Net message from client                               ,     60.11s,   601.05ms,    60.1%, |@@@@@@    |
     71, USER_XXX  , WAIT, SQL*Net more data from client                             ,     3.94ms,    39.44us,      .0%, |          |
    192, USER_XXX  , STAT, user commits                                              ,         12,        .12,
    192, USER_XXX  , TIME, DB CPU                                                    ,   207.97ms,     2.08ms,      .2%, |          |
    192, USER_XXX  , TIME, sql execute elapsed time                                  ,     30.74s,   307.37ms,    30.7%, |@@@@      |
    192, USER_XXX  , TIME, DB time                                                   ,     35.14s,    351.4ms,    35.1%, |@@@@      |
    192, USER_XXX  , WAIT, log file sync                                             ,       4.4s,    43.98ms,     4.4%, |@         |
    192, USER_XXX  , WAIT, db file sequential read                                   ,     29.95s,   299.51ms,    30.0%, |@@@       |
    192, USER_XXX  , WAIT, SQL*Net message to client                                 ,      113us,     1.13us,      .0%, |          |
    192, USER_XXX  , WAIT, SQL*Net message from client                               ,     64.55s,   645.48ms,    64.5%, |@@@@@@@   |
    192, USER_XXX  , WAIT, SQL*Net more data from client                             ,    19.36ms,   193.62us,      .0%, |          |
--  End of Stats snap 1, end=2012-01-12 22:27:59, seconds=100

---------------------------------------------------------------
Active% | SQL_ID          | EVENT
---------------------------------------------------------------
    28% | 29zz9n1b9sppr   | db file sequential read             |
    12% | ggjkmtv8cxxh3   | db file sequential read             |
    11% | 3b8gt3dqtn345   | db file sequential read             |
     9% |                 | log file sync                       |
     9% | 64djbhqv5v93u   | db file sequential read             |
     4% | dkh6b1nhnhfcu   | db file sequential read             |
     1% | 8zt375nzk26rh   | db file sequential read             |
     1% | b513mwk4mf9rt   | db file sequential read             |
     1% | 0h01fy07y1c1p   | db file sequential read             |

--  End of ASH snap 1, end=2012-01-12 22:27:59, seconds=100, samples_taken=99

PL/SQL procedure successfully completed.

SQL_ID        EXECUTIONS
------------- ----------
29zz9n1b9sppr      28909
ggjkmtv8cxxh3      28223

As you can see, both sql increase by 12  (after-before execution times) and both sql show user commits statistics delta as 12 which means  there is a commit per execution that leads log file sync event.

After proving that the app is not doing batch commits,  I wonder if this is something avoidable or not,  in terms of business logic. Doing batch commits  will definitely help to increase your overall throughput plus whole system health.

So in my humble opinion we have two options without me seeing any code logic.

1- Providing unique index access for the delete operations by giving all columns of the index
2- Decreasing the number of commits and try to do batch commit.

For you to test all my actions, I have added a a simple synthetic test case attached to understand how these implementations can help us.

Cases tested.

1-Nonbatch commit range scan
2-Nonbatch commit unique scan

3-Batch commit range scan
4-Batch commit unique scan

Timings for the test are as follows for deleting 100K row sets as you can see

Elapsed Improvement
nonbatch-commit-range 84 0%
nonbatch-commit-unique 47 44%
batch-commit-range 55 34%
batch-commit-unique 35 58%

END OF THE EMAIL


Note to blog reader : (In case they are still reading), I wish you found something interesting about how I used the right tool right time
and how I answered the development questions without talking bollocks but numbers and proofs. One thing I need to admit there were only 2 sessions running this operation which made my life easy.

If you wonder the duration of whole diagnose: It took 15 mins in total to find out what is going on and 1 hour to write the mail but that 1 hour save at least 3-4 unnecessary email ping pong just because nobody can prove their fact with numbers.

Quick Tip: If you are sending any output to anybody use Courier New character set and make everybodies eyes happy.

Session Snapper and its usage can be found at Tanels website
Test case can be found  here

November 18, 2011

On demand AWR Report Generator, for Many Node Cluster

Filed under: Basics, Performance, RAC — coskan @ 11:34 am

update ( 10/10/2012 ): I have changed the script to make work better with hourly requests at it was causing issues. Thanks to Dominic Brooks for the trick
update ( 11/10/2012 ): Added ADDM and Global report generation

Since I got turbo boost (over 1000 hits a day-wooow) from Tanel Poder the Great, better I write something to deserve the fame. This one is a real quick one which made my life easier.

On my second week at current company, I had to cover Exadata for 2 weeks which were completely nightmare due to some problems. I had to work with Oracle Support too much and most of the time they ask AWR reports from 8 nodes for x hour duration. First day I was not armed with anything automatic, so tried to do it manually but it was proper nightmare for 8 nodes. Then I said this is complete rubbish way of wasting my time and I need to automate the process otherwise my life will be miserable. Finally I came up with something after couple of attempts

The script below is the result of this automation attempts.

update ( 11/10/2012 ): –not equal sign may have been corrupted during paste for (next_snap_id”0 AND previous_snap_id”0)

-- Name:    gen_snap.sql
--
-- Purpose: Automating ADDM and AWR and Global RAC report generation 

--Usage:  copy the script your script directory. Edit the parameters section then  run it ,  
---it will generate gen_awr.sql  script . Call that script for generating awr reports
---
--- ALWAYS CHECK gen_awr.sql script before you run !!!!!!!!! 
---
-- @gen_snap.sql  
-- @gen_awr 

-- Versioning:
-- Version    Modified by	Date		What Modified
-- -------    ---------------	----------	--------------
-- 1.0	      Coskan Gundogar	11/10/2012	Initial 


------------PARAMETERS - BEGIN -------------------------------

define _interval=2				   --1 for half hour 2 for 1 hour (30 mins snapshots) / 1 for  hour 2 for 2 hour (60 mins snapshots)
define _database_name='DATABASE_NAME'		 ---- database name 
define _begin_interval_date='111012 07:00:00'	 ----begin date DDMMYY HH24:MI:SS
define _end_interval_date='111012 10:00:00'      ----end date   DDMMYY HH24:MI:SS
define _folder='C:\awr\DATABASE_NAME\111012'  ---report output location
define _option=2 ---1 without ADDM 2 with ADDM 
define _global=1 ---0 if you don't want Global Reports for RAC - It will not generate report for Single Instance with any setting 

------------PARAMETERS - END-------------------------------


SET termout OFF 
host mkdir &_folder
SET termout ON
--host dir &_folder
prompt 
prompt 
prompt GEN_AWR.SQL script generation started
prompt 
SET termout OFF 
SET heading off
SET feedback OFF 
SET echo OFF 
SET linesize 155
SET verify OFF 

spool gen_awr.SQL
SELECT 'set veri off;'||CHR(10)||
'set feedback off;' ||CHR(10)||
'set linesize 1500;' ||CHR(10)||
'prompt '||CHR(10)||
'prompt '||CHR(10)||
'prompt REPORT GENERATION STARTED'||CHR(10)||
'prompt '||CHR(10)||
'SET TIME OFF'||CHR(10)||
'set termout off;'||CHR(10)||
'set long 1000000 pagesize 0 longchunksize 1000;'||CHR(10)||
'COLUMN get_clob format a80'||CHR(10)
FROM dual;
with driver as (
select *
from (
select rownum rn,dbid,snap_id,previous_snap_id,next_snap_id,begin_interval_time  from (
SELECT * FROM (
select instance_number,dbid,snap_id,
LEAD(snap_id, &_interval, 0) OVER (PARTITION BY instance_number ORDER BY snap_id DESC NULLS LAST) previous_snap_id,
LAG(snap_id, &_interval, 0) OVER (PARTITION BY instance_number ORDER BY snap_id DESC NULLS LAST) next_snap_id,
begin_interval_time 
from dba_hist_snapshot 
where dbid=(select dbid from v$database where name='&_database_name')
AND instance_number= (select min(inst_id) from gV$database)
AND begin_interval_time>=TO_DATE('&_begin_interval_date','DDMMYY HH24:MI:SS')-1/24
AND begin_interval_time<TO_DATE('&_end_interval_date','DDMMYY HH24:MI:SS')+1/24
) d WHERE next_snap_id0 AND previous_snap_id0 ---not equal sign may have been corrupted during paste
ORDER BY 3
)  s ) where (
CASE WHEN &_INTERVAL = 2 THEN mod(rn,2)  
WHEN &_INTERVAL=1 THEN 1
END)=1
)
SELECT  
CASE &_option
WHEN 1 THEN 
'spool &_folder\awrrptto_&_database_name'||CHR(95)||TO_CHAR(instance_number)||'_'||TO_CHAR(snap_id)||'_'||TO_CHAR(next_snap_id)||'.html'||CHR(10)|| CHR(10)|| 
'SELECT * FROM TABLE(dbms_workload_repository.awr_report_html('||TO_CHAR(dbid)||','||TO_CHAR(instance_number)||','||TO_CHAR(snap_id)||','||TO_CHAR(next_snap_id)||',8));'||CHR(10)
||CHR(10)||'spool off'
WHEN 2 THEN 
'spool &_folder\awrrptto_&_database_name'||CHR(95)||TO_CHAR(instance_number)||'_'||TO_CHAR(snap_id)||'_'||TO_CHAR(next_snap_id)||'.html'||CHR(10)|| CHR(10)|| 
'SELECT * FROM TABLE(dbms_workload_repository.awr_report_html('||TO_CHAR(dbid)||','||TO_CHAR(instance_number)||','||TO_CHAR(snap_id)||','||TO_CHAR(next_snap_id)||',8));'||CHR(10)
||CHR(10)||'spool off'||CHR(10)||
'BEGIN'||CHR(10)||
'DBMS_ADVISOR.create_task (advisor_name      => ''ADDM'',task_name=> ''ADDM_'||TO_CHAR(snap_id)||'_'||TO_CHAR(next_snap_id)||'_AWR_SNAPSHOT'');'||CHR(10)||
'DBMS_ADVISOR.set_task_parameter (task_name => ''ADDM_'||TO_CHAR(snap_id)||'_'||TO_CHAR(next_snap_id)||'_AWR_SNAPSHOT'',parameter => ''START_SNAPSHOT'',value=> '||TO_CHAR(snap_id)||');'||CHR(10)||
'DBMS_ADVISOR.set_task_parameter (task_name => ''ADDM_'||TO_CHAR(snap_id)||'_'||TO_CHAR(next_snap_id)||'_AWR_SNAPSHOT'',parameter => ''END_SNAPSHOT'', value=> '||TO_CHAR(next_snap_id)||');'||CHR(10)||
'DBMS_ADVISOR.set_task_parameter (task_name => ''ADDM_'||TO_CHAR(snap_id)||'_'||TO_CHAR(next_snap_id)||'_AWR_SNAPSHOT'',parameter => ''INSTANCE'', value=> '||TO_CHAR(instance_number)||');'||CHR(10)||
'DBMS_ADVISOR.set_task_parameter (task_name => ''ADDM_'||TO_CHAR(snap_id)||'_'||TO_CHAR(next_snap_id)||'_AWR_SNAPSHOT'',parameter => ''DB_ID'', value=> '||TO_CHAR(dbid)||');'||CHR(10)||
'DBMS_ADVISOR.execute_task(task_name => ''ADDM_'||TO_CHAR(snap_id)||'_'||TO_CHAR(next_snap_id)||'_AWR_SNAPSHOT'');'||CHR(10)||
'END;'||CHR(10)||
'/'||CHR(10)||
'spool &_folder\addm_&_database_name'||CHR(95)||TO_CHAR(instance_number)||'_'||TO_CHAR(snap_id)||'_'||TO_CHAR(next_snap_id)||'_AWR_SNAPSHOT.txt'||CHR(10)||
'SELECT DBMS_ADVISOR.get_task_report(''ADDM_'||TO_CHAR(snap_id)||'_'||TO_CHAR(next_snap_id)||'_AWR_SNAPSHOT'') AS report'||CHR(10)||
'FROM   dual;'||CHR(10)||
'spool off'||CHR(10)
END
FROM (
select ai.inst_id instance_number,d.dbid,d.snap_id,d.previous_snap_id,d.next_snap_id,d.begin_interval_time from driver d,
gv$database ai
order by 1,3);




SELECT 
CASE &_global 
WHEN 1 then
'set timing off space 1 flush on pause off numwidth 10;' ||CHR(10)||
'set echo off feedback off   newpage 1 recsep off;' ||CHR(10)||
'set trimspool on trimout on define "&" concat "." serveroutput on;'||CHR(10)||
'set underline on;'||CHR(10)||
'set heading off;'||CHR(10)||
'set pagesize 50000;'||CHR(10)||
'set echo off;'||CHR(10)||
'set feedback off;'||CHR(10)||
'set linesize 8000;'||CHR(10)||
'set veri off;'||CHR(10)
WHEN 0 THEN '' 
END 
FROM dual;
with driver as (
select *
from (
select rownum rn,dbid,snap_id,previous_snap_id,next_snap_id,begin_interval_time  from (
SELECT * FROM (
select instance_number,dbid,snap_id,
LEAD(snap_id, &_interval, 0) OVER (PARTITION BY instance_number ORDER BY snap_id DESC NULLS LAST) previous_snap_id,
LAG(snap_id, &_interval, 0) OVER (PARTITION BY instance_number ORDER BY snap_id DESC NULLS LAST) next_snap_id,
begin_interval_time 
from dba_hist_snapshot 
where dbid=(select dbid from v$database where name='&_database_name')
AND instance_number= (select min(inst_number) from gV$active_instances)
AND begin_interval_time>=TO_DATE('&_begin_interval_date','DDMMYY HH24:MI:SS')-&_interval/24
AND begin_interval_time<TO_DATE('&_end_interval_date','DDMMYY HH24:MI:SS')+&_interval/24
) d WHERE next_snap_id0 AND previous_snap_id0   --not equal sign may have been corrupted during paste
 ORDER BY 3
)  s ) where (
CASE WHEN &_INTERVAL = 2 THEN mod(rn,2)  
WHEN &_INTERVAL=1 THEN 1
END)=1
)
SELECT  
CASE &_global 
WHEN 1 then
'spool &_folder\global_awrrptto_&_database_name'||CHR(95)||'_'||TO_CHAR(snap_id)||'_'||TO_CHAR(next_snap_id)||'.html'||CHR(10)|| CHR(10)|| 
'SELECT * FROM TABLE(dbms_workload_repository.awr_global_report_html('||TO_CHAR(dbid)||','''','||TO_CHAR(snap_id)||','||TO_CHAR(next_snap_id)||',8));'||CHR(10)
||CHR(10)||'spool off' 
WHEN 0 THEN '' 
END 
FROM driver;



SELECT
'set termout on' ||CHR(10)||
'SET TIME ON' ||CHR(10)||
'prompt '||CHR(10)||
'prompt '||CHR(10)||
'prompt REPORT GENERATION FINISHED'||CHR(10)||
'prompt '||CHR(10)||
'SET TIME OFF'||CHR(10)
FROM dual;
SET termout ON 

spool off

prompt 
prompt 
prompt GEN_AWR.SQL script generation finished
prompt 
prompt 



!!!To save it click on the view source link on the sourcode upright corner

Nothing rocket science, just using dbms_workload_repository.awr_report_html with some junior level analytic.

All you have to do is give the date, interval (only tested for half hour or one hour intervals)and folder to be saved then rest will be done by the script.

It will create a gen_awr.sql script which you will run and go have a coffee then when you come back your awr reports are ready to be sent.

so the sequence is

@gen_snap  ---assume you save it with this name
@gen_awr 
--coffee break
--zip+upload

I would like to automate it further by doing the same for rac comparison or group of instance but I think I am too lazy or maybe busy with some other non-automatable rubbish :( (I also had some issues with rac comparison report formatting with the DBMS_WORKLOAD_REPOSITORY package so I gave up on that one)

Hope this tiny script save your time as much as it saves mine.

January 31, 2011

Plan stability through Upgrade to 11G- Building a test case

Filed under: Basics, Plan Stability — coskan @ 6:16 pm

This will be the first post of plan stability through 11G upgrade series.
I initially planned to write about bugfixes as second post but change my mind and add some common steps to find the sqls I need to investigate and build a test case for.
The scripts I use here are mostly modified versions of Kerry Osborne and Tanel Poder, so just to avoid taking time of some of you who already knows this stuff, the only interesting part can be a bit further down in the post. I also did not add sqls problematic sqls yet, I kept them for related posts :)

Before I start, I have some assumptions for this and upcoming posts
I am assuming that you took the actions below on the test system as a DBA who knows what he is doing.

1- you cloned/snapped/duplicated the database from production so data is same
2- if you have a cutdown version of prod then you copied all the stats from production (both system and object stats)
3- you did not overwrite the statistics by gathering new stats after step 1 or 2
4- you have the same parameter settings apart from the version related ones.
5- You have optimizer_features_enable parameter set to the version you upgraded
(this is very important and I got my own invalid DBA exception and set this parameter at system level during my own test and caused tester test on the wrong optimizer environment)

You took the actions above, started the database and testing began then you start to get calls from testing/development groups that some of the tasks are running slower than SLAs for that spesific task
Since you are on testing phase you have time to sort the problems and understand what is the cause of the problem.

First things first, we will start by checking what is going on in terms of plans. Here is how “I” do it. For ease of read I did not put content of the scripts. They are all at the bottom of the post

1-Find/get the sqlid or sqltext for the task from developers.

1a-If developer provide sqlid then I call sqlid.sql- modified version of sqlid from Tanel Poder. This script gives the statistics of cursors in memory.

SQL> @sqlid cp7cg4db39a8z

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
cp7cg4db39a8z 1446291743 select from USER.TEST where ID =:1

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 0000000A16BA1338 000000072A73FE20 1026121647        3     117859          3     117858          0         117858    4223100 6820106.35  .57867148  279018595     408403          0               1

1b- If you got the sqltext but not sqlid but the process is reported as still running check with swact.sql script which is calling modified version of Tanel Poders sw.sql for active foreground processes.

Run the sqlid.sql script for the active sql_ids and double check the sqltext

SQL> @swact

    SID USERN STATE   EVENT              SEQ# SEC_IN_STATE         P1         P2         P3 P1TRANSL   SQL_ID         CHN BLCKNG_SID BLCKNG_SID_S BLCKNG_INST
------- ----- ------- ----------------------- ------------ ---------- ---------- ---------- ---------- ------------- ---- ---------- ------------ -----------
    401       WAITING jobq slave wait      32            0          0          0          0                             0            NO HOLDER
    553       WAITING jobq slave wait      32            0          0          0          0                             0            NO HOLDER
    402 COSKA WORKING On CPU / runqueue   145            0 1413697536          1          0            f5cqythck1gfz    0            NOT IN WAIT
    481 COSKA WORKING On CPU / runqueue   271            2          1      51202          1            fuws9wt9d16fq    0            NOT IN WAIT

SQL> @sqlid fuws9wt9d16fq

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
fuws9wt9d16fq 1389402582 select count(*) from dba_source,dba_objects

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 00000002A9B90758 00000002A9C55DF8 2265727165 99999999          1          1          1          0              0  18539.182  20075.237  20.075237      32172       3619          1               1

1c- If session is not active and you could not get the sqlid from developer and start searching for the sqlid in v$sqlarea or dba_hist_sqltext

For this I usually use find_sql.sql from Kerry Osborne.

I needed to edit it to just find the sqlid of the given search key to avoid massive screen output for sqls with multiple child cursors


COSKANDBA@WHQAINT> @find_sql
Enter value for sql_text: select count(*) from v$parameter
Enter value for sql_id: %

SQL_ID         CHILD  PLAN_HASH      EXECS         ETIME     AVG_ETIME SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------- -----------------------------------------
89f82uh6x612p      0  322934900          2           .01           .00 select count(*) from v$parameter where na
                                                                       me = 'cpu_count' and   (isdefault = 'FALS
                                                                       E' or ismodified != 'FALSE')
COSKANDBA@WHQAINT> @find_sqlid
Enter value for sql_text: select count(*) from v$parameter

SQL_ID        SQL_TEXT
------------- ----------------------------------------------------------------------------------------------------
89f82uh6x612p select count(*) from v$parameter where name = 'cpu_count' and   (isdefault = 'FALSE' or ismodified !
              = 'FALSE')

If the sql is not active then I use find_sqlid_h which check dba_hist_sqltext
for historic check which can be dramatically slow if your AWR retention is high or number of unique sqls

2-Check the plan changes for this spesific sql

2a-If you have diagnostics pack license then you can use awr to check the historic plans

I use two scripts for getting the plan changes from awr.

awr_plan_changes.sql from Kerry Osborne which uses dba_hist_sqlstats for given sql_id. The good part of this script is that you can see the average elapsed time and
avg logical io. Take average into consideration but know that they are also dangerous and misleading

SQL> @awr_plan_change
Enter value for sql_id: 2h5nub7zqkj5q

   SNAP_ID   NODE BEGIN_INTERVAL_TIME            SQL_ID        PLAN_HASH_VALUE        EXECS    AVG_ETIME        AVG_LIO
---------- ------ ------------------------------ ------------- --------------- ------------ ------------ --------------
     24906      1 15-JAN-11 14.30.49.673         2h5nub7zqkj5q      3435162931            3       60.352       75,163.0
     24936      1 16-JAN-11 05.30.36.765         2h5nub7zqkj5q                            3      135.858       25,870.0
     24954      1 16-JAN-11 14.30.26.315         2h5nub7zqkj5q      3125217954            3       20.494       74,416.7
     25246      1 22-JAN-11 16.30.05.775         2h5nub7zqkj5q      1903048315            4        4.289       73,862.0
     25273      1 23-JAN-11 06.00.55.152         2h5nub7zqkj5q                            3       77.159      160,780.7
     25280      1 23-JAN-11 09.30.03.439         2h5nub7zqkj5q                            3       23.366       73,862.0
     25283      1 23-JAN-11 11.00.06.489         2h5nub7zqkj5q                            3        5.612       73,862.0
     25292      1 23-JAN-11 15.30.15.880         2h5nub7zqkj5q                            1       11.525       73,862.0
     25321      1 24-JAN-11 06.00.02.214         2h5nub7zqkj5q                            7       11.784       73,862.0
     25328      1 24-JAN-11 09.30.10.651         2h5nub7zqkj5q                            3       24.208       73,869.0
     25417      1 26-JAN-11 06.00.31.983         2h5nub7zqkj5q                            7       17.478       74,686.0
     25475      1 27-JAN-11 11.00.55.155         2h5nub7zqkj5q                            3       25.786       74,845.3
     25523      1 28-JAN-11 11.00.35.584         2h5nub7zqkj5q                            3       24.190       74,947.0
     25561      1 29-JAN-11 06.00.55.227         2h5nub7zqkj5q                            7       99.244       74,902.0
     25568      1 29-JAN-11 09.30.08.835         2h5nub7zqkj5q                            3       18.939       74,914.3
     25571      1 29-JAN-11 11.00.12.085         2h5nub7zqkj5q                            3       26.258       74,922.7

16 rows selected.

SQL>

sqlhist.sql which check uses dba_hist_active_sess_history for given sql_id. This does not have sql statistics for the plan_hash_value but it gives information for the sqls
which are not captured for dba_hist_sqlstats.


SQL> @sqlhist 2h5nub7zqkj5q

   SNAP_ID BEGIN_INTERVAL_TIME            SQL_ID          CHILD_NO PLAN_HASH_VALUE
---------- ------------------------------ ------------- ---------- ---------------
     24891 15-JAN-11 06.00.14.552         2h5nub7zqkj5q          0      3125217954   ****
     24906 15-JAN-11 14.30.49.673         2h5nub7zqkj5q          0      3435162931
     24937 16-JAN-11 06.00.38.585         2h5nub7zqkj5q          0
     24954 16-JAN-11 14.30.26.315         2h5nub7zqkj5q          0      3125217954
     25003 17-JAN-11 15.00.01.148         2h5nub7zqkj5q          0      4093460156
     25051 18-JAN-11 15.00.18.035         2h5nub7zqkj5q          1
     25099 19-JAN-11 15.00.44.108         2h5nub7zqkj5q          0
     25147 20-JAN-11 15.00.53.100         2h5nub7zqkj5q          0
     25225 22-JAN-11 06.00.09.902         2h5nub7zqkj5q          0
     25227 22-JAN-11 07.00.14.473         2h5nub7zqkj5q          0
     25230 22-JAN-11 08.30.19.472         2h5nub7zqkj5q          0
     25231 22-JAN-11 09.00.20.496         2h5nub7zqkj5q          0      1903048315
     25244 22-JAN-11 15.30.03.874         2h5nub7zqkj5q          0
     25246 22-JAN-11 16.30.05.775         2h5nub7zqkj5q          0
     25273 23-JAN-11 06.00.55.152         2h5nub7zqkj5q          0
     25280 23-JAN-11 09.30.03.439         2h5nub7zqkj5q          0
     25283 23-JAN-11 11.00.06.489         2h5nub7zqkj5q          0
     25292 23-JAN-11 15.30.15.880         2h5nub7zqkj5q          0
     25321 24-JAN-11 06.00.02.214         2h5nub7zqkj5q          0
     25328 24-JAN-11 09.30.10.651         2h5nub7zqkj5q          0
     25331 24-JAN-11 11.00.14.273         2h5nub7zqkj5q          0
     25369 25-JAN-11 06.00.56.648         2h5nub7zqkj5q          0
     25376 25-JAN-11 09.30.12.066         2h5nub7zqkj5q          0
     25379 25-JAN-11 11.00.15.482         2h5nub7zqkj5q          0
     25388 25-JAN-11 15.30.32.502         2h5nub7zqkj5q          0
     25417 26-JAN-11 06.00.31.983         2h5nub7zqkj5q          0
     25427 26-JAN-11 11.00.47.332         2h5nub7zqkj5q          0
     25436 26-JAN-11 15.30.07.648         2h5nub7zqkj5q          0
     25465 27-JAN-11 06.00.35.469         2h5nub7zqkj5q          0
     25472 27-JAN-11 09.30.51.194         2h5nub7zqkj5q          0
     25475 27-JAN-11 11.00.55.155         2h5nub7zqkj5q          0
     25484 27-JAN-11 15.30.09.941         2h5nub7zqkj5q          0
     25520 28-JAN-11 09.30.31.090         2h5nub7zqkj5q          0
     25523 28-JAN-11 11.00.35.584         2h5nub7zqkj5q          0
     25532 28-JAN-11 15.30.49.332         2h5nub7zqkj5q          0
     25561 29-JAN-11 06.00.55.227         2h5nub7zqkj5q          0
     25568 29-JAN-11 09.30.08.835         2h5nub7zqkj5q          0
     25571 29-JAN-11 11.00.12.085         2h5nub7zqkj5q          0

38 rows selected.

SQL>

When you look carefully you will see why I needed to check dba_hist_active_sess_stats with another script. The first line is not in awr_plan_change which can be misleading!!!!.

2b-If you have statspack configured to be taken at “Level 6″

then you can write scripts on STATS$SQL_PLAN_USAGE table in the similar manner. The difference is that since statspack does not store sql_id you need to use hash_value for instead of sql_id and in addition to this after 10G
You need to use V$SQL.OLD_HASH_VALUE for the given SQL_ID. I may write a version and put here but not this time because I don’t have it configured:(

3- You found out that your plan is changed after upgrade or you could not find that information due to lack of awr data or statspack but you suspect that it is changed. Now is the time to build the test case.

3a-Build the test case :

To create the test case I choose one of the options below
1- Manual: I use this option when I don’t have sqlid but sqltext to generate a case, which is painful if it has binds. Better ask the case from developer

2- build_bind_vars script from Kerry Osborne: Once you have/find the sqlid then you have a perfect option from Kerry Osborne which “nearly” does everything for you
he has two scripts called build_bind_vars and build_bind_vars2 one is using v$sql_bind_capture for the binded value the other one is using other_xml column for peeked binds from v$sql_plan table . Downside of build_bind_vars is that it uses display_raw function of Greg Rahn which means creating a function in the DB because this function is not there by default

SQL> @sqlid 6asfnp4v15rg3

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
6asfnp4v15rg3  907206115  SELECT DISTINCT TABLE_1.ID FROM USER_1.TABLE_1 TABLE_1, USER_1.TABLE_2
                         TABLE_2 WHERE TABLE_1.ID = TABLE_2.ID AND TABLE_2.SOURCEID = :1 AND TABLE_1.MARKETID = :2 AND TABLE_1.TYPEID
                         = :3

Elapsed: 00:00:00.01

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   1 000000082DF6FB88 00000009498BCE50 2222075559        6          2         12          1        130           1295     16.997     69.267    .069267       1562         13          0               0
   4 000000082DF6FB88 000000084D6B6808 2222075559       15          1         10          1        995           9946     55.993    1168.99    1.16899       6497         63          0               0
   6 000000082DF6FB88 00000009F2DB4B98 2222075559        6          2          7          1        504           5039     22.996     188.12     .18812       3877         34          0               0
   7 000000082DF6FB88 00000006D3B821A0 2222075559        7          1         10          1          9             86      6.999     15.825    .015825        151          2          0               0
   8 000000082DF6FB88 00000006EC670138 2222075559       39        144         11        105      33458         334050   1797.724    7692.61 .073262952     262247        999          0               0
  18 000000082DF6FB88 00000008AED02378 1994644346      176        768          3        804     233085        2326441  12999.013 119810.985 .149018638    2624167      12429          0               0

6 rows selected.

Elapsed: 00:00:00.01
SQL> @e5
Enter value for sql_id: 6asfnp4v15rg3
Enter value for child_number: 18

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------
 SELECT
DISTINCT TABLE_1.ID FROM USER_1.TABLE_1 TABLE_1, USER_1.TABLE_2
TABLE_2 WHERE TABLE_1.ID = TABLE_2.ID AND
TABLE_2.SOURCEID = :1 AND TABLE_1.MARKETID = :2 AND
TABLE_1.TYPEID = :3

Plan hash value: 1994644346

---------------------------------------------------
| Id  | Operation           | Name                |
---------------------------------------------------
|   0 | SELECT STATEMENT    |                     |
|   1 |  SORT UNIQUE NOSORT |                     |
|   2 |   NESTED LOOPS      |                     |
|   3 |    INDEX RANGE SCAN | IDX_TABLE1_09       |
|   4 |    INDEX UNIQUE SCAN| PK_TABLE_2 |
---------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------

   1 - :1 (NUMBER): 66
   2 - :2 (NUMBER): 1281
   3 - :3 (NUMBER): 125

27 rows selected.

Elapsed: 00:00:00.06

SQL> l
  1* SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR(('&sql_id'),&child_number,'BASIC +PEEKED_BINDS'))

SQL> @build_bind_vars
Enter SQL ID ==> 6asfnp4v15rg3
Enter Child Number ==> 18

variable N1 NUMBER
variable N2 NUMBER
variable N3 NUMBER

begin

:N1 := 66;
:N2 := 1281;
:N3 := 125;

end;

/

select /* test_6asfnp4v15rg3 */ /*+ gather_plan_statistics */ DISTINCT TABLE_1.ID
FROM USER_1.TABLE_1 TABLE_1, USER_1.TABLE_2 TABLE_2
WHERE TABLE_1.ID = TABLE_2.ID
AND TABLE_2.SOURCEID = :N1
AND TABLE_1.MARKETID = :N2
AND TABLE_1.TYPEID = :N3;

As you see build_bind_vars uses actual peeked binds however build_bind_vars2 uses v$sql_bind_capture which maybe misleading due to the bind peeking.

SQL> @build_bind_vars2
Enter SQL ID ==> 6asfnp4v15rg3
Enter Child Number ==> 18

variable N1 NUMBER
variable N2 NUMBER
variable N3 NUMBER

begin

:N1 := 66;
:N2 := 1661;
:N3 := 1;

end;

/

select /* test_6asfnp4v15rg3 */ /*+ gather_plan_statistics */ DISTINCT TABLE_1.ID
FROM USER_1.TABLE_1 TABLE_1, USER_1.TABLE_2 TABLE_2
WHERE TABLE_1.ID = TABLE_2.ID
AND TABLE_2.SOURCEID = :N1
AND TABLE_1.MARKETID = :N2
AND TABLE_1.TYPEID = :N3;

Challenge is that these two script use active cursors in the shared pool. what if you need something which is not in the cache but in history.
To sort this problem I also add one more script called build_bind_vars_h to build test case using dba_hist_* tables and again other_xml column of those tables for peeked binds.
For the example below I will use build_bind_vars_h script because our query is not in memory.


SQL> @sqlid 1m7wy721qytx3

no rows selected

no rows selected

SQL> @sqlhist 1m7wy721qytx3

   SNAP_ID BEGIN_INTERVAL_TIME            SQL_ID          CHILD_NO PLAN_HASH_VALUE
---------- ------------------------------ ------------- ---------- ---------------
     24981 17-JAN-11 04.00.28.416         1m7wy721qytx3          0      2266344041
     24982 17-JAN-11 04.30.32.884         1m7wy721qytx3          0      2266344041
     24987 17-JAN-11 07.00.03.043         1m7wy721qytx3          0      2266344041
     24988 17-JAN-11 07.30.10.114         1m7wy721qytx3          0      2266344041
     25029 18-JAN-11 04.00.07.227         1m7wy721qytx3          0      2266344041
     25030 18-JAN-11 04.30.10.894         1m7wy721qytx3          0      2266344041
     25077 19-JAN-11 04.00.56.483         1m7wy721qytx3          0      3561051786 =>after the upgrade
     25078 19-JAN-11 04.31.00.422         1m7wy721qytx3          0      3561051786

8 rows selected.

SQL> @build_bind_vars_h
Enter SQL ID ==> 1m7wy721qytx3
Enter Plan Hash Value ==> 2266344041

variable B1 NUMBER

begin

:B1 := 14544;

end;

/

select /* test 1m7wy721qytx3 */ /*+gather_plan_statistics*/TABLE_1.TICKERID,
TABLE_1.SOURCEID, NVL (MAX (CLOSETIMESTAMP), MAX (STARTTIMESTAMP)) LASTTIMESTAMP, MAX (CLOSE) KEEP (DENSE_RANK LAST ORDER BY
STARTTIMESTAMP) AS LAST, MAX (CLOSESPREAD) KEEP (DENSE_RANK LAST ORDER BY STARTTIMESTAMP) AS LASTSPREAD, MAX (CLOSEUPFRONT) KEEP (DENSE_RANK
LAST ORDER BY STARTTIMESTAMP) AS LASTUPFRONT FROM USER_1.TABLE_1, USER_1.TABLE_2, USER_1.TABLE_3 WHERE TABLE_1.TICKERID = TABLE_2.TICKERID
AND TABLE_2.MARKETID = TABLE_3.MARKETID AND TABLE_2.MARKETID IN :B1 AND TABLE_1.SOURCEID = TABLE_3.SOURCEID AND DURATION =
NUMTODSINTERVAL (1, 'DAY') AND STARTTIMESTAMP < TRUNC (SYSDATE, 'YEAR') - 1 AND STARTTIMESTAMP > TRUNC (SYSDATE, 'YEAR') - 7 GROUP BY
TABLE_1.TICKERID, TABLE_1.SOURCEID;

Once I have the test content available , I copy the test case content two sql files to ease and generalize my tests.

test.sql sql file with /*+gather_plan_statistics*//*identifier*/ hints+comments added so I can gather the runtime statistics and easily find my sql
Sample test.sql

variable N1 NUMBER

begin
:N1 :=  100;
end;
/

select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/*
from T1,T2 where t1.id=t2.id and t1.id&lt;=:N1;

test_e.sql same as test but this time I replace binds with literals and run explain plan for the sql. I put literals because runtime plans can be
different than explain plans for sqls with bind variables.

Sample test_e.sql

explain plan for select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/*
from T1,T2 where t1.id=t2.id and t1.id&lt;=100;

4-Testing the test case with different optimizer_features_enable (OFE) settings

There can be couple of major minor releases between the release you are upgrading from and the release you are upgrading to.
For example If you want to upgrade from 10.2.0.4 to 11.2.0.2 and your plans are changed, it may not be because something is changed on 11.2.0.2 because there are 4 base releases between these two which are 10.2.0.5, 11.1.0.6, 11.2.0.7 and 11.2.0.1. Your plan characteristics may actually changed on one of these releases.

To understand which release my problems actually start I need to test the sqls for all the releases, to automate this process
I wrote two simple scripst like below

build_optimizer_features_test.sql : this script accepts base release as parameter and builds a test environment for all releases by generating another sql called
optimizer_features_test.sql which is just a spool file which sets OFE and runs test.sql (the one you created on step 3b) for each OFE from base and till last value
in v$system_fix_control.optimizer_feature_enable column.

Source of the build_optimizer_features_test.sql

undefine base_optimizer_version
set heading off
set feedback off
define optimizer_feature="&amp;base_optimizer_version"
set term off
spool optimizer_features_test.sql
select 'set timing on' from dual;
select 'set echo off' from dual;
--select 'set autotrace traceonly  statistics' from dual;
select 'spool optimizer_features_test_results.log' from dual;
select
'timing start time_for_ofe_'||optimizer_feature_enable||chr(10)||'set echo on'||chr(10)||'alter session set optimizer_features_enable='''||optimizer_feature_enable||''';'
||chr(10)||'set echo off'||chr(10)||'set term off'||chr(10)
||'@test.sql'||chr(10)||'/'||chr(10)||'set term on'
||chr(10)||'timing stop  time_for_ofe_'||optimizer_feature_enable||chr(10)
||chr(10)||'select * from table(dbms_xplan.display_cursor(null,null,''ALLSTATS LAST''));'||chr(10)
from
(select distinct optimizer_feature_enable from v$system_fix_control
where
to_number(replace(optimizer_feature_enable,'.',''))&gt;=to_number(replace('&amp;optimizer_feature','.',''))
order by to_number(replace(optimizer_feature_enable,'.','')) desc );
select 'set echo off' from dual;
select 'set feedback on' from dual;
select 'set heading on' from dual;
select 'set timing off' from dual;
select 'spool off' from dual;
--select 'set autotrace off' from dual;
spool off
set term on
set heading on
set feedback on

and output from this sql is optimizer_features_test.sql which is like below for base version is 11.1.0.7 and database is running with 11.2.0.1 binary

set timing on

set echo off

spool optimizer_features_test_results.log

timing start time_for_ofe_11.2.0.1
set echo on
alter session set optimizer_features_enable='11.2.0.1';
set echo off
set term off
@test.sql
/
set term on
timing stop  time_for_ofe_11.2.0.1

select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

timing start time_for_ofe_11.1.0.7
set echo on
alter session set optimizer_features_enable='11.1.0.7';
set echo off
set term off
@test.sql
/
set term on
timing stop  time_for_ofe_11.1.0.7

select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

set timing off

spool off

For this example I want to see if my plan is changing behaviour when I upgrade from 10.2.0.4 to 11.2.0.1


SQL> ------------------------------
SQL> ----- BUILD THE TEST CASE ----
SQL> ------------------------------
SQL> @build_optimizer_features_test
Enter value for base_optimizer_version: 10.2.0.4
SQL> ------------------------------
SQL> ----- TEST THE TEST CASE -----
SQL> ------------------------------
SQL> @optimizer_features_test
SQL> alter session set optimizer_features_enable='11.2.0.1';

Session altered.

Elapsed: 00:00:00.02
SQL> set echo off
timing for: time_for_ofe_11.2.0.1
Elapsed: 00:00:00.03

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  1hs7jn4tshzqa, child number 0
-------------------------------------
select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/* from T1,T2
where t1.id=t2.id and t1.id<=:N1

Plan hash value: 1143140472

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |      1 |        |    100 |00:00:00.01 |      10 |       |       |          |
|*  1 |  HASH JOIN                   |        |      1 |     99 |    100 |00:00:00.01 |      10 |   751K|   751K| 1268K (0)|
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |      1 |    100 |    100 |00:00:00.01 |       4 |       |       |          |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |      1 |    100 |    100 |00:00:00.01 |       2 |       |       |          |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |      1 |    100 |    100 |00:00:00.01 |       6 |       |       |          |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |      1 |    100 |    100 |00:00:00.01 |       3 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=:N1)
   5 - access("T2"."ID"<=:N1) 25 rows selected. Elapsed: 00:00:00.02 SQL> alter session set optimizer_features_enable='11.1.0.7';

Session altered.

Elapsed: 00:00:00.01
SQL> set echo off
timing for: time_for_ofe_11.1.0.7
Elapsed: 00:00:00.02

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  1hs7jn4tshzqa, child number 1
-------------------------------------
select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/* from T1,T2
where t1.id=t2.id and t1.id<=:N1

Plan hash value: 1143140472

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |      1 |        |    100 |00:00:00.01 |      10 |       |       |          |
|*  1 |  HASH JOIN                   |        |      1 |     99 |    100 |00:00:00.01 |      10 |   751K|   751K| 1254K (0)|
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |      1 |    100 |    100 |00:00:00.01 |       4 |       |       |          |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |      1 |    100 |    100 |00:00:00.01 |       2 |       |       |          |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |      1 |    100 |    100 |00:00:00.01 |       6 |       |       |          |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |      1 |    100 |    100 |00:00:00.01 |       3 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=:N1)
   5 - access("T2"."ID"<=:N1) 25 rows selected. Elapsed: 00:00:00.01 SQL> alter session set optimizer_features_enable='11.1.0.6';

Session altered.

Elapsed: 00:00:00.01
SQL> set echo off
timing for: time_for_ofe_11.1.0.6
Elapsed: 00:00:00.02

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  1hs7jn4tshzqa, child number 2
-------------------------------------
select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/* from T1,T2
where t1.id=t2.id and t1.id<=:N1

Plan hash value: 1143140472

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |      1 |        |    100 |00:00:00.01 |      10 |       |       |          |
|*  1 |  HASH JOIN                   |        |      1 |     99 |    100 |00:00:00.01 |      10 |   751K|   751K| 1258K (0)|
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |      1 |    100 |    100 |00:00:00.01 |       4 |       |       |          |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |      1 |    100 |    100 |00:00:00.01 |       2 |       |       |          |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |      1 |    100 |    100 |00:00:00.01 |       6 |       |       |          |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |      1 |    100 |    100 |00:00:00.01 |       3 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=:N1)
   5 - access("T2"."ID"<=:N1) 25 rows selected. Elapsed: 00:00:00.02 SQL> alter session set optimizer_features_enable='10.2.0.5';

Session altered.

Elapsed: 00:00:00.02
SQL> set echo off
timing for: time_for_ofe_10.2.0.5
Elapsed: 00:00:00.03

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  1hs7jn4tshzqa, child number 3
-------------------------------------
select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/* from T1,T2
where t1.id=t2.id and t1.id<=:N1

Plan hash value: 1143140472

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |      1 |        |    100 |00:00:00.01 |      10 |       |       |          |
|*  1 |  HASH JOIN                   |        |      1 |     99 |    100 |00:00:00.01 |      10 |   751K|   751K| 1250K (0)|
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |      1 |    100 |    100 |00:00:00.01 |       4 |       |       |          |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |      1 |    100 |    100 |00:00:00.01 |       2 |       |       |          |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |      1 |    100 |    100 |00:00:00.01 |       6 |       |       |          |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |      1 |    100 |    100 |00:00:00.01 |       3 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=:N1)
   5 - access("T2"."ID"<=:N1) 25 rows selected. Elapsed: 00:00:00.01 SQL> alter session set optimizer_features_enable='10.2.0.4';

Session altered.

Elapsed: 00:00:00.02
SQL> set echo off
timing for: time_for_ofe_10.2.0.4
Elapsed: 00:00:00.03

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  1hs7jn4tshzqa, child number 4
-------------------------------------
select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/* from T1,T2
where t1.id=t2.id and t1.id<=:N1

Plan hash value: 1143140472

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |      1 |        |    100 |00:00:00.01 |      10 |       |       |          |
|*  1 |  HASH JOIN                   |        |      1 |     99 |    100 |00:00:00.01 |      10 |   751K|   751K| 1252K (0)|
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |      1 |    100 |    100 |00:00:00.01 |       4 |       |       |          |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |      1 |    100 |    100 |00:00:00.01 |       2 |       |       |          |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |      1 |    100 |    100 |00:00:00.01 |       6 |       |       |          |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |      1 |    100 |    100 |00:00:00.01 |       3 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=:N1)
   5 - access("T2"."ID"<=:N1) 25 rows selected. Elapsed: 00:00:00.02 SQL> @sqlid 1hs7jn4tshzqa

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
1hs7jn4tshzqa  864583370 select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/* from T1,T2 where t1.id=t2.id and t1.id<=:N1 1 row selected.  CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING ---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------    0 00000000926F6F90 00000000926F6B90 1143140472        9          2          5          2          4            200         10      3.435   .0017175         20          0          0               0    1 00000000926F6F90 00000000926F3E90 1143140472        9          2          5          2          4            200          0      3.278    .001639         20          0          0               0    2 00000000926F6F90 0000000094936D28 1143140472        9          2          5          2          4            200         10      3.931   .0019655         20          0          0               0    3 00000000926F6F90 00000000926F1EF0 1143140472        9          2          5          2          4            200          0      3.644    .001822         20          0          0               0    4 00000000926F6F90 00000000926C3148 1143140472        9          2          5          2          4            200          0       2.61    .001305         20          0          0               0 5 rows selected. SQL>

With this test you call test.sql for 2 times for each available OFE value since the given base OFE. There are 5 different cursors for 5 different CBO environment variable and
they all used the same plan. By calling 2 times you see the caching effect and also on 11.2.0.1 and onward you can see the cardinality feedback effect if it kicks in (we will see about that later)

build_optimizer_features_test_e.sql : The only difference is that this scripts calls test_e.sql (the one you create on step 3b) and runs dbms_xplan.display
for the explained plan. I use this one when the sql runs very long time, even I still test with the runtime plan generation
explaining the plan sometimes gives me the rough idea on which OFE my plan may change

sourcecode of the build_optimizer_features_test_e is below

undefine base_optimizer_version
set heading off
set feedback off
define optimizer_feature="&amp;base_optimizer_version"
set term off
spool optimizer_features_test_e.sql
select 'set timing on' from dual;
select 'set echo off' from dual;
select 'spool optimizer_features_test_2_results.log' from dual;
select   'set echo on'||chr(10)||'alter session set optimizer_features_enable='''||optimizer_feature_enable||''';'
||chr(10)||'set echo off'||chr(10)||'@test_e.sql'||chr(10)||
'select * from table(dbms_xplan.display);'||chr(10)
from
(select distinct optimizer_feature_enable from v$system_fix_control
where
to_number(replace(optimizer_feature_enable,'.',''))&gt;=to_number(replace('&amp;optimizer_feature','.',''))
order by to_number(replace(optimizer_feature_enable,'.','')) desc );
select 'set echo off' from dual;
select 'spool off' from dual;
spool off
set term on
set heading on
set feedback on

Same example with the previous one this time assuming I am moving from 11.1.0.6 to 11.2.0.1


SQL> @build_optimizer_features_test_e
Enter value for base_optimizer_version: 11.1.0.6
SQL> @optimizer_features_test_e
SQL> alter session set optimizer_features_enable='11.2.0.1';
Elapsed: 00:00:00.02
SQL> set echo off
Elapsed: 00:00:00.10

Plan hash value: 1143140472

---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |    99 | 24156 |     9  (12)| 00:00:01 |
|*  1 |  HASH JOIN                   |        |    99 | 24156 |     9  (12)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |   100 | 12200 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |   100 |       |     2   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |   100 | 12200 |     4   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |   100 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=100)
   5 - access("T2"."ID"<=100) Elapsed: 00:00:00.08 SQL> alter session set optimizer_features_enable='11.1.0.7';
Elapsed: 00:00:00.01
SQL> set echo off
Elapsed: 00:00:00.01

Plan hash value: 1143140472

---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |    99 | 24156 |     9  (12)| 00:00:01 |
|*  1 |  HASH JOIN                   |        |    99 | 24156 |     9  (12)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |   100 | 12200 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |   100 |       |     2   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |   100 | 12200 |     4   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |   100 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=100)
   5 - access("T2"."ID"<=100) Elapsed: 00:00:00.01 SQL> alter session set optimizer_features_enable='11.1.0.6';
Elapsed: 00:00:00.02
SQL> set echo off
Elapsed: 00:00:00.01

Plan hash value: 1143140472

---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |    99 | 24156 |     9  (12)| 00:00:01 |
|*  1 |  HASH JOIN                   |        |    99 | 24156 |     9  (12)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |   100 | 12200 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |   100 |       |     2   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |   100 | 12200 |     4   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |   100 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=100)
   5 - access("T2"."ID"<=100) Elapsed: 00:00:00.02 SQL>

In summary without building a test case and testing it yourself it is hard to find out what is going on and this post was all about having everything ready before you start the
actual diagnose. Next posts I will only put an output of the these steps and won’t explain what they are.

References Used:
http://kerryosborne.oracle-guy.com/2009/07/creating-test-scripts-with-bind-variables/
http://kerryosborne.oracle-guy.com/2008/10/unstable-plans/

Scripts used in this post

All in one as a zip file- build_test_case.zip

Scipts shared as txt for availability on google docs. (I think I need to move my own domain)
build_bind_vars2

build_bind_vars2 Kerry Osborne version
build_bind_vars
build_bind_vars2 Kerry Osborne version
build_bind_vars_h
build_optimizer_features_test
build_optimizer_features_test_e
find_sqlid
find_sqlid_h
sqlhist
sqlid
sw
sw Tanel Poder version
awr_plan_change

<-Previous Post in Series————————————–  Next Post in Series–>

January 2, 2011

2010 in review from wordpress

Filed under: Basics — coskan @ 10:22 am

–Below is the mail I got from wordpress explaining how this blog performed in  2010 . Thank you all for visiting

 

 

The stats helper monkeys at WordPress.com mulled over how this blog did in 2010, and here’s a high level summary of its overall blog health:

Healthy blog!

The Blog-Health-o-Meter™ reads Wow.

Crunchy numbers

Featured image

The Louvre Museum has 8.5 million visitors per year. This blog was viewed about 140,000 times in 2010. If it were an exhibit at The Louvre Museum, it would take 6 days for that many people to see it.

 

In 2010, there were 27 new posts, growing the total archive of this blog to 145 posts. There were 13 pictures uploaded, taking up a total of 449kb. That’s about a picture per month.

The busiest day of the year was February 3rd with 731 views. The most popular post that day was Working with statspack-part-1a-Diagnosis.

Where did they come from?

The top referring sites in 2010 were blog.tanelpoder.com, forums.oracle.com, google.co.in, google.com, and pythian.com.

Some visitors came searching, mostly for io exception: the network adapter could not establish the connection, the network adapter could not establish the connection, and the network adapter could not establish the connection oracle.

Attractions in 2010

These are the posts and pages that got the most views in 2010.

1

Working with statspack-part-1a-Diagnosis January 2010
14 comments

2

Io Exception: the network adapter could not establish the connection (Enterprise Manager) April 2007
19 comments

3

ORA-01031: insufficient privileges April 2007
16 comments

4

Setting , Resetting/Unsetting Database Parameters March 2007
3 comments

5

Changing the Current Schema March 2007
1 comment

May 27, 2010

Flashback log generation with select statements

Filed under: Basics, Tips — coskan @ 3:50 pm

There is a question on OTN Forums about excessive flashback log generates with select statement . Basically OP says there are flashback logs generated when they do select only.

there are different answers in the thread which can be true or false,  but when I first read the question I immediately think if delayed block cleanout have the similar effect on flashback log generation as well, so I tested.

I initially thought v$sesstats counters might reveal something but no luck and then I checked v$flashback_database_stat for the generated flashback_data_size and I think I hit the jackpot.

HR@ORACOS> select * from v$flashback_database_stat;

BEGIN_TIME        END_TIME          FLASHBACK_DATA    DB_DATA  REDO_DATA ESTIMATED_FLASHBACK_SIZE
----------------- ----------------- -------------- ---------- ---------- ------------------------
20100527 15:32:53 20100527 15:50:16      875266048 1207132160 2038729728                        0
20100527 14:32:50 20100527 15:32:53      248160256  127295488  450139648               1.3215E+10
20100527 13:32:48 20100527 14:32:50       10452992   15646720    4400640               1.5549E+10
20100527 12:32:43 20100527 13:32:48      745693184  948461568 1311620608               2.2789E+10
20100527 11:25:56 20100527 12:32:43     1262026752 1984741376 2358546432               2.7212E+10

HR@ORACOS> set autotrace traceonly statistics
HR@ORACOS>  update base_table_np set y='INVALID';
 commit;

4021808 rows updated.

Statistics
----------------------------------------------------------
       2512  recursive calls
    8341430  db block gets
    4069140  consistent gets
     120569  physical reads
 1908471980  redo size
        848  bytes sent via SQL*Net to client
        793  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
    4021808  rows processed

HR@ORACOS> set autotrace off;
HR@ORACOS> select * from v$flashback_database_stat;

HR@ORACOS>
BEGIN_TIME        END_TIME          FLASHBACK_DATA    DB_DATA  REDO_DATA ESTIMATED_FLASHBACK_SIZE
----------------- ----------------- -------------- ---------- ---------- ------------------------
20100527 15:32:53 20100527 16:00:36     1236664320 2021974016 4019910656                        0
20100527 14:32:50 20100527 15:32:53      248160256  127295488  450139648               1.3215E+10
20100527 13:32:48 20100527 14:32:50       10452992   15646720    4400640               1.5549E+10
20100527 12:32:43 20100527 13:32:48      745693184  948461568 1311620608               2.2789E+10
20100527 11:25:56 20100527 12:32:43     1262026752 1984741376 2358546432               2.7212E+10

HR@ORACOS> set autotrace traceonly statistics
HR@ORACOS> select * from base_table_np;
4021808 rows selected.

Statistics
----------------------------------------------------------
        139  recursive calls
          0  db block gets
      53908  consistent gets
       4404  physical reads
    1652384  redo size							------->DELAYED BLOCKS CLEANOUTS
  175008833  bytes sent via SQL*Net to client
      88996  bytes received via SQL*Net from client
       8045  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
    4021808  rows processed

HR@ORACOS> set autotrace off
HR@ORACOS> select * from v$flashback_database_stat;    ------>CHECK THE increase in FLASHBACK_DATA
HR@ORACOS>
BEGIN_TIME        END_TIME          FLASHBACK_DATA    DB_DATA  REDO_DATA ESTIMATED_FLASHBACK_SIZE
----------------- ----------------- -------------- ---------- ---------- ------------------------
20100527 15:32:53 20100527 16:01:11     1305264128 2054594560 4021728256                        0
20100527 14:32:50 20100527 15:32:53      248160256  127295488  450139648               1.3215E+10
20100527 13:32:48 20100527 14:32:50       10452992   15646720    4400640               1.5549E+10
20100527 12:32:43 20100527 13:32:48      745693184  948461568 1311620608               2.2789E+10
20100527 11:25:56 20100527 12:32:43     1262026752 1984741376 2358546432               2.7212E+10

I am still not %100 convinced because I have no idea what is going on behind the scenes and I don’t have Julian Dyke (ppt file ) wisdom/time to go more deep into the issue and  I stopped here.

If you have something to say about the issue  comments are welcomed.

Tested on 11.2.0.1 with single active instance

May 23, 2010

Blogroll Report 26/03 /2010 – 02/04/2010

Filed under: Basics, Blogroll Report — coskan @ 10:21 pm

<—- Blogroll Report 19/03/2010 – 26/03/2010

1-How, adding another index can cause performance problem ? (AllEqRange can cause wrong index choice)
Vivek Sharma-Cost Based Optimization ! Query Performance Issue after creating a new Index…

2-How to extract sql statements from 10046 tracefile via tkprof?
Hemant K Chitale-Extracting Application / User SQLs from a TraceFile

3-Which one to choose ? Memory Target vs Huge Pages
Ronny Egner-MEMORY_TARGET (SGA_TARGET) or HugePages – which to choose?

4-How does heap block compress statistics increase and what happens when there is space but not at the top of the free space heap?
Jonathan Lewis– Heap Block Compress

5-Where are the database logs located in 10GR2 RAC?
Murali Vallath-Logs in 10GR2 RAC

6-Why to use DBMS_STATS instead of analyze ?
Jonathan Lewis-Analyze this
Jonathan Lewis-Analyze this 2

7-Ways to transfer Oracle files between database when using ASM
Emre Baransel-Transferring Oracle Files Between Databases In ASM

8-How to reclaim unused space in datafiles?
Tim Hall-Reclaiming Unused Space in Datafiles

9-Available Opatch methods in RAC environment
Mohammad Arju-OPatch supported patch methods in RAC environment

10-Performance impact of ATOMIC_REFRESH parameter value for materialized view refreshes
Kubilay Kara-Materialized View Refresh and the ATOMIC_REFRESH parameter

11-How to check whether or not Java privilege attacks have occurred in your database ?
Paul M Wright-Java Forensics in Oracle

12-How to rotate alert logs via ADRCI
James Koopmann-Rotating Oracle Database’s Alert Log with ADRCI

Blogroll Report 02/04 /2010 – 09/04/2010 –>

May 3, 2010

Blogroll Report 19/03 /2010 – 26/03/2010

Filed under: Basics, Blogroll Report — coskan @ 2:39 am

<—- Blogroll Report 12/03/2010 – 19/03/2010

1-Tanel Poder’s Session Snapper v3.10
Tanel Poder-Session Snapper

2-Function Based Index behaviour with cursor_sharing similar or force?
Yu Sun-Demonstrate a function-based index is unable to be used due to cursor_sharing setting

3-How to upgrade database using a Rolling Upgrade With an Existing Physical Standby Database feature of 11G (transient logical standby)
Gavin Soorma-11g Release 2 Rolling Upgrade using Transient Logical Standby database

4-Extra filter in CBO to do short circuit
Alex Fatkulin-Where did the filter came from?

5-What happens when you delete a row where there are more than 2 index and you reach via 1?
Gary Myers-Turning things on their head

6-How to clone a database on the same server using rman duplicate from active database ?
Alejandro Vargas-Cloning A Database On The Same Server Using Rman Duplicate From Active Database

7-Java Vulnaribility that lets hacker to use orapwd to change sys password
Paul M Wright-CREATE SESSION to SYSDBA via Java and orapwd

8-What is the difference between database sessions and processs?
Arup Nanda-Difference between Session and Process

9-What does Parse Calls statistic mean?
Jonathan Lewis-Nutshell-2

10-Difference between deletion from index and table
Jonathan Lewis-Index too big

11-How to create fake height balanced histograms?
Jonathan Lewis-Fake Histograms

12-How does oracle store unique bitmap indexes
Richard Foote-Unique Bitmap Indexes Part I (Unnatural Selection)

13-Using Mapping tables instead of decode or case
Robert Vollman-DECODE/CASE vs. Mapping Tables

14-How to build ASM Clustered file system
Jim Czuprynski-Building an ASM Clustered File System

15-How does dynamic remastering work in Oracle RAC
Riyaj Shamsudeen-RAC object remastering ( Dynamic remastering )

Blogroll Report 26/03 /2010 – 02/04/2010–>

January 27, 2010

Working with statspack-part-1a-Diagnosis

Filed under: Basics, Performance — coskan @ 6:24 pm

Part 1b—->

Note: Post is a bit long and might be boring, if you are already good with statspack, but there is a question at the end if you would like to answer

Yesterday, I decided to look for something to troubleshoot without waiting for any user complaints so I can write something useful here :)
Primary reason for this practice was to get used to interpret statspack reports to find possible bottlenecks
secondary reason was because I was bored about being underutilized.
I always feel comfortable troubleshooting session wide problems, non-historic performance problems or historic problems with Diagnostics Pack,
but I am not that experienced on statspack.
In my current environment I don’t have Diagnostic Pack option so I have to make good friendship with statspack and
the best way to be a good friend is to spend more time with your new friend as I am going to explain how.

Characteristics of the DB I will work on:
daily load is between 08:00 and 19:00
export at 21:00 running for roughly 1 hour
DB backup start 23:00

There are couple of ways to spot a possible bottlenecks without getting individual reports.

—First one is from Tanel Poder and his famous Perfsheet. This tool is already configured and ready to be used with statspack.
The query he uses to generate the report is the same idea behind the statspack delta views of  Tim Gorman. Lets start

first screen is login screen (you need to activate the macros).

Inputs for statspack report is
Data Source- TNS Entry of the DB
Username/Password- username with statspack schema objects read privilege
Which Instance-instance name (lower case or modify the related sql in Queries sheet with upper function )
Which Query-Statspack CPU+Wait+Stats Hourly
From Date-start date
To Date-end date
then press show me. You will get warnings during the graph generation saying PivotTable report will not fit on the sheet … Just Press OK till the warnings end
Your graph might be empty on Office 2007. Just go to the Report Filter and select WAIT from EVENT_TYPES to see any abnormality on wait events.

First Graph is WAIT graph like below. As you see, I was able to spot a possible problem on 20/01/2010 at first try.

Lets make it more visible by adding top 10 filter to Legend Fields Section.

Another option is focus on the date where the abnormality is happened. Lets select only 20th. FYI top 10 filter is still on

Without the top 10 filter. Graph is slightly different but still giving what we need.

Lets Focus on only time interval between 18 and 23

As you see there is an abnormal increase on row cache lock logfile switch and buffer busy

Lets do double-check with system stats. By selecting all hours and available dates and events your graph will be a bit misleading because of very high metrics like below

What can we do ? DB time is something we want to decrease when troubleshooting, so lets focus on DB Time by filtering Legend Fields on DB Time

We have 2 peaks, which is a bit confusing for the initial assumption of problem timing. Lets focus on 20th again

There are two peaks  between 13-15 and 21-23. We have an interesting situation here in terms of DB time. From wait event perspective abnormality is between 20-22 but DB time perspective we have 1 more time slot between 13-15 .

At this point  I quit Perfsheet. I spotted what I needed for initial problem chase now I can go to other options before, gathering the report. (these options are way more easier than statspack report generation thats why I did not directly go to the statspack)

Second option is from Tim Gorman himself. sp_systime which I find very useful in terms of overall response time analysis. ( i needed to change the script to focus between the dates instead of days before sysdate option)

Here is the ouput

First Part is overall response time for that day.


SQL> @sp_systime2
Please enter the ORACLE_SID value: PROD1
Please enter the report begin date: sysdate-8
Please enter the report end date:sysdate-5

                                                                        Total             Cum
Service Seconds % of % of
Day or Wait Name Spent Total Total
------------ -------- ----------------------------------- ------------------- ------- -------
19-JAN       Service  SQL execution                                115,326.24   78.78   78.78
             Wait     db file sequential read                        7,697.33    5.26   84.04
             Wait     db file scattered read                         6,213.01    4.24   88.29
             Service  Recursive SQL execution                        4,638.75    3.17   91.46
Wait Backup: sbtwrite2 2,671.85 1.83 93.28
Wait Backup: sbtbackup 2,081.53 1.42 94.70
             Wait     log file parallel write                        1,423.58    0.97   95.67
             Service  Parsing SQL                                    1,109.49    0.76   96.43
             Wait     db file parallel write                           913.09    0.62   97.06
             Wait     log file sync                                    761.78    0.52   97.58
Wait Datapump dump file I/O 631.13 0.43 98.01
             Wait     Log archive I/O                                  549.08    0.38   98.38

20-JAN       Service  Recursive SQL execution                      255,597.67   65.42   65.42
             Wait     row cache lock                                48,098.17   12.31   77.74
Wait log file switch (checkpoint incompl 41,613.30 10.65 88.39
             Wait     buffer busy waits                             13,474.25    3.45   91.84
             Wait     db file sequential read                        9,788.51    2.51   94.34
             Wait     db file scattered read                         6,831.08    1.75   96.09
Wait Backup: sbtwrite2 3,512.51 0.90 96.99
             Wait     log file parallel write                        1,897.38    0.49   97.48
Wait Datapump dump file I/O 1,674.20 0.43 97.90
             Wait     db file parallel write                         1,389.77    0.36   98.26
             Wait     log file sync                                  1,073.14    0.27   98.53
             Service  Parsing SQL                                      964.58    0.25   98.78

21-JAN       Service  SQL execution                                 33,045.26   44.91   44.91
             Service  Recursive SQL execution                       12,785.72   17.38   62.28
             Wait     db file sequential read                        8,296.99   11.28   73.56
             Wait     db file scattered read                         6,731.96    9.15   82.71
Wait Backup: sbtwrite2 3,340.23 4.54 87.25
             Wait     log file parallel write                        1,856.99    2.52   89.77
             Wait     db file parallel write                         1,359.19    1.85   91.62
             Wait     log file sync                                  1,131.31    1.54   93.16
             Service  Parsing SQL                                    1,011.46    1.37   94.53
Wait enq: TX - row lock contention 861.24 1.17 95.70
             Wait     Log archive I/O                                  686.01    0.93   96.63
Wait Backup: sbtbackup 402.52 0.55 97.18

If you check the top 5 you will easily spot the row cache lock, log file switch and buffer busy waits are different from the other days

Second part of the report is hourly based response time for the days. I truncated the output for 8 hours but the full is here ***

 12:00 Service  SQL execution                                  6,268.92   76.66   76.66
       Wait     db file sequential read                          634.26    7.76   84.42
       Wait     db file scattered read                           344.02    4.21   88.63
       Service  Recursive SQL execution                          230.23    2.82   91.44
       Wait     log file parallel write                          147.58    1.80   93.25
       Wait     log file sync                                    129.94    1.59   94.84
 13:00 Service  SQL execution                                  6,942.33   83.97   83.97
       Wait     db file sequential read                          375.15    4.54   88.51
       Wait     db file scattered read                           207.74    2.51   91.02
       Service  Recursive SQL execution                          200.86    2.43   93.45
Wait enq: TX - row lock contention 162.83 1.97 95.42
       Wait     log file parallel write                           94.08    1.14   96.56
 14:00 Service  Recursive SQL execution                      241,820.91   99.35   99.35
       Wait     db file scattered read                           574.84    0.24   99.59
       Wait     db file sequential read                          515.74    0.21   99.80
       Wait     log file parallel write                           97.09    0.04   99.84
Wait enq: TX - row lock contention 87.19 0.04 99.88
       Service  Parsing SQL                                       81.25    0.03   99.91
 15:00 Service  SQL execution                                  3,620.32   62.09   62.09
       Service  Recursive SQL execution                        1,210.42   20.76   82.86
       Wait     db file sequential read                          328.60    5.64   88.49
       Wait     db file scattered read                           294.73    5.06   93.55
       Service  Parsing SQL                                       73.42    1.26   94.81
Wait enq: TX - row lock contention 72.03 1.24 96.04
 16:00 Service  SQL execution                                  3,667.78   70.86   70.86
       Wait     db file sequential read                          429.87    8.30   79.16
       Service  Recursive SQL execution                          371.24    7.17   86.33
       Wait     db file scattered read                           215.28    4.16   90.49
Wait enq: TX - row lock contention 144.67 2.79 93.29
       Wait     log file parallel write                          100.81    1.95   95.24
 17:00 Service  SQL execution                                  2,904.54   70.63   70.63
       Wait     db file sequential read                          397.42    9.66   80.29
       Service  Recursive SQL execution                          202.13    4.92   85.21
       Wait     db file scattered read                           138.29    3.36   88.57
       Wait     log file sync                                     94.87    2.31   90.88
       Wait     log file parallel write                           90.94    2.21   93.09
 18:00 Service  SQL execution                                  1,791.86   65.17   65.17
       Wait     db file sequential read                          302.02   10.98   76.15
       Service  Recursive SQL execution                          176.04    6.40   82.56
       Wait     db file scattered read                           151.83    5.52   88.08
       Wait     log file parallel write                           80.44    2.93   91.00
       Wait     log file sync                                     63.46    2.31   93.31
 19:00 Service  SQL execution                                  2,261.88   44.86   44.86
       Wait     db file sequential read                          758.41   15.04   59.91
       Wait     db file scattered read                           540.35   10.72   70.62
       Wait     db file parallel write                           272.36    5.40   76.03
       Wait     log file parallel write                          269.47    5.34   81.37
       Wait     log file sequential read                         154.58    3.07   84.44
 20:00 Service  SQL execution                                  3,517.30   73.29   73.29
       Wait     db file sequential read                          467.34    9.74   83.03
       Wait     db file scattered read                           251.47    5.24   88.27
       Wait     log file parallel write                          161.71    3.37   91.64
       Wait     db file parallel write                            92.73    1.93   93.57
       Wait     Log archive I/O                                   90.24    1.88   95.45
 21:00 Service  SQL execution                                  2,875.04   45.72   45.72
Wait Datapump dump file I/O 1,504.28 23.92 69.64
       Wait     db file sequential read                          873.68   13.89   83.53
       Wait     db file scattered read                           417.53    6.64   90.17
       Wait     log file parallel write                          130.15    2.07   92.24
       Wait     db file parallel write                           106.51    1.69   93.93
 22:00 Wait     row cache lock                                48,097.61   41.83   41.83
Wait log file switch (checkpoint incompl 41,594.75 36.17 78.00
       Wait     buffer busy waits                             13,454.89   11.70   89.70
       Service  Recursive SQL execution                        8,958.04    7.79   97.49
       Wait     db file sequential read                        1,485.98    1.29   98.79
       Wait     db file scattered read                           856.65    0.75   99.53
23:00 Wait Backup: sbtwrite2 2,801.73 76.03 76.03
Wait Backup: sbtbackup 352.22 9.56 85.59
       Wait     db file scattered read                           140.16    3.80   89.39
Wait Backup: sbtclose2 93.25 2.53 91.92
       Wait     db file sequential read                           78.52    2.13   94.05
       Service  SQL execution                                     65.77    1.78   95.84

What do we see here ? The graph told us there might be a possible problem between 13-15 and 21-23.

First; focus on 13-15

13:00 Service  SQL execution                                  6,942.33   83.97   83.97
       Wait     db file sequential read                          375.15    4.54   88.51
       Wait     db file scattered read                           207.74    2.51   91.02
       Service  Recursive SQL execution                          200.86    2.43   93.45
Wait enq: TX - row lock contention 162.83 1.97 95.42
       Wait     log file parallel write                           94.08    1.14   96.56
 14:00 Service  Recursive SQL execution                      241,820.91   99.35   99.35
       Wait     db file scattered read                           574.84    0.24   99.59
       Wait     db file sequential read                          515.74    0.21   99.80
       Wait     log file parallel write                           97.09    0.04   99.84
Wait enq: TX - row lock contention 87.19 0.04 99.88
       Service  Parsing SQL                                       81.25    0.03   99.91

Second ; focus on 21-23

21:00 Service  SQL execution                                  2,875.04   45.72   45.72
Wait Datapump dump file I/O 1,504.28 23.92 69.64
       Wait     db file sequential read                          873.68   13.89   83.53
       Wait     db file scattered read                           417.53    6.64   90.17
       Wait     log file parallel write                          130.15    2.07   92.24
       Wait     db file parallel write                           106.51    1.69   93.93
 22:00 Wait     row cache lock                                48,097.61   41.83   41.83
Wait log file switch (checkpoint incompl 41,594.75 36.17 78.00
       Wait     buffer busy waits                             13,454.89   11.70   89.70
       Service  Recursive SQL execution                        8,958.04    7.79   97.49
       Wait     db file sequential read                        1,485.98    1.29   98.79
       Wait     db file scattered read                           856.65    0.75   99.53

Nothing common but tells lots of things. First one spent %99 of response time on Recursive SQL Execution which means no wait but on CPU doing something which we don’t know yet.
Second one spent %41 of its time on row cache lock and %36 on log file switch. Now we are ready for the statspack to see what can we find but
before that I want to show 2 more scripts whose output is can be obtained by Perfsheet as well (like DB time)

First script is Tim Gormans sp_evtrends scripts which shows trends of events. There are two problems; sometimes unnecessarily high number of * character  shown on the screen but comparing it with the seconds waited it can easily be ignored. Another thing is that, if you focus on 1 day sometimes * character graph might be misleading as it is with excel graphs so it is better to focus on 5-10 days

Scripts asks for db_name, begin date, end date and statistics to analyze

SQL> @sp_evtrends
Please enter the ORACLE_SID value: PROD1
Please enter the report begin date: sysdate-8
Please enter the report end date:sysdate-6

Some useful database statistics to search upon:

Name
------------------------------------------------------------
ARCH random i/o
ARCH sequential i/o
ARCH wait for archivelog lock
	....
	....
	....

Daily trends for "row cache lock"...

What statistic do you want to analyze? row cache lock

Day    Statistic Name                         Secs Waited Percentage of total over all days
------ ------------------------------ ------------------- ------------------------------------------------------------
19-JAN row cache lock                                0.00

20-JAN row cache lock                          480,981.73 ************************************************************

Daily/hourly trends for "row cache lock"...

Day    Hour   Statistic Name                         Secs Waited Percentage of total over all hours for each day
------ ------ ------------------------------ ------------------- ------------------------------------------------------------
19-JAN 00:00  row cache lock                                0.00
       01:00  row cache lock                                0.00
       02:00  row cache lock                                0.00
       03:00  row cache lock                                0.00
       04:00  row cache lock                                0.00
       05:00  row cache lock                                0.00
       06:00  row cache lock                                0.00
       07:00  row cache lock                                0.00
       08:00  row cache lock                                0.00
       09:00  row cache lock                                0.00
       10:00  row cache lock                                0.00
       11:00  row cache lock                                0.00
       12:00  row cache lock                                0.00
       13:00  row cache lock                                0.00
       14:00  row cache lock                                0.00
       15:00  row cache lock                                0.00
       16:00  row cache lock                                0.00
       17:00  row cache lock                                0.00
       18:00  row cache lock                                0.00
       19:00  row cache lock                                0.00
       20:00  row cache lock                                0.00
       21:00  row cache lock                                0.00
       22:00  row cache lock                                0.00 ************************************************************
       23:00  row cache lock                                0.00

20-JAN 00:00  row cache lock                                0.00
       01:00  row cache lock                                0.00
       02:00  row cache lock                                0.00
       03:00  row cache lock                                0.00
       04:00  row cache lock                                0.00
       05:00  row cache lock                                0.00
       06:00  row cache lock                                0.00
       07:00  row cache lock                                0.00
       08:00  row cache lock                                0.00
       09:00  row cache lock                                0.00
       10:00  row cache lock                                0.00
       11:00  row cache lock                                0.00
       12:00  row cache lock                                0.00
       13:00  row cache lock                                0.00
       14:00  row cache lock                                0.00
       15:00  row cache lock                                0.00
       16:00  row cache lock                                5.65
       17:00  row cache lock                                0.00
       18:00  row cache lock                                0.00
       19:00  row cache lock                                0.00
       20:00  row cache lock                                0.00
       21:00  row cache lock                                0.00
       22:00  row cache lock                          480,976.09 ************************************************************
       23:00  row cache lock                                0.00

As you see it shows the same number of * for 19th which is wrong when you check with time spent so abnormality can be spotted by both time spent and number of *.

Second script is modified version for sys_time_model statistics called sp_sys_time_trends.

SQL> @sp_sys_time_trends
</a>Please enter the ORACLE_SID value: PROD1
Please enter the report begin date: sysdate-8
Please enter the report end date:sysdate-6

Some useful database statistics to search upon:

Name
------------------------------------------------------------
DB CPU
DB time
Java execution elapsed time
PL/SQL compilation elapsed time
PL/SQL execution elapsed time
RMAN cpu time (backup/restore)
background cpu time
background elapsed time
connection management call elapsed time
failed parse (out of shared memory) elapsed time
failed parse elapsed time
hard parse (bind mismatch) elapsed time
hard parse (sharing criteria) elapsed time
hard parse elapsed time
inbound PL/SQL rpc elapsed time
parse time elapsed
repeated bind elapsed time
sequence load elapsed time
sql execute elapsed time
What statistic do you want to analyze? DB time

Daily trends for "DB time"...

Day    Statistic Name                      VALUE Percentage of total over all days
------ ------------------------------ ---------- ------------------------------------------------------------
19-JAN DB time                        1408469.51 **************

20-JAN DB time                        4797565.14 **********************************************

Daily/hourly trends for "DB time"...

Day    Hour   Statistic Name                      VALUE Percentage of total over all hours for each day
------ ------ ------------------------------ ---------- ------------------------------------------------------------
19-JAN 00:00  DB time                        18267.2492 *
01:00  DB time                        57907.3551 **
02:00  DB time                        64889.8266 ***
03:00  DB time                        37988.8748 **
04:00  DB time                        39093.3842 **
05:00  DB time                        39267.5194 **
06:00  DB time                        56103.5928 **
07:00  DB time                        43312.1147 **
08:00  DB time                        50542.4821 **
09:00  DB time                        62883.2236 ***
10:00  DB time                        89721.9315 ****
11:00  DB time                        86621.3671 ****
12:00  DB time                        78163.5806 ***
13:00  DB time                        64324.9537 ***
14:00  DB time                        78970.2241 ***
15:00  DB time                        78037.7742 ***
16:00  DB time                        78215.1336 ***
17:00  DB time                          69875.99 ***
18:00  DB time                        61075.9801 ***
19:00  DB time                        47658.5181 **
20:00  DB time                         41372.216 **
21:00  DB time                        57569.6918 **
22:00  DB time                        65593.3167 ***
23:00  DB time                        41013.2088 **

20-JAN 00:00  DB time                        39789.4329
01:00  DB time                        76274.4764 *
02:00  DB time                        39564.2615
03:00  DB time                        37327.7534
04:00  DB time                        38174.4725
05:00  DB time                         38335.113
06:00  DB time                        55688.7588 *
07:00  DB time                        39039.2476
08:00  DB time                        45155.1028 *
09:00  DB time                        62292.3892 *
10:00  DB time                        73688.6751 *
11:00  DB time                        86222.6032 *
12:00  DB time                        79400.9817 *
13:00  DB time                        81021.4711 *
14:00  DB time                        2498694.98 *******************************
15:00  DB time                        57824.5047 *
16:00  DB time                        50252.1229 *
17:00  DB time                        39474.4176
18:00  DB time                         25728.465
19:00  DB time                         40950.006 *
20:00  DB time                        45258.7127 *
21:00  DB time                        66186.8379 *
22:00  DB time                        1177471.46 ***************
23:00  DB time                        3748.88236

As you can spot like the excell graph we have DB time increased for 14-15 and 22-23 period

Now It is time to get the statspack report and dig further.
First I want to focus on the row cache problem if I have space I will talk about the recursive cursor problem maybe as second part.
We normally collect 30 minutes but available snaps are like below.

                          Snap
Snap Id   Snap Started    Level
8135 20 Jan 2010 21:05     7
8136 20 Jan 2010 21:35     7
8137 20 Jan 2010 22:28     7
8138 20 Jan 2010 22:58     7

If you look carefully you will see that snap at 22:00 is missing. There must be something wrong on DB which prevented statspack
from gathering but lets assume that you could not spot it for now as I could’t which was a deadly mistake caused increase in trouble shooting time
I generated statspack report between 21:35 and 22:28

Here is the statspack summary (full statspack is at here. SQLs are a bit edited for keeping privacy)

STATSPACK report for

Database    DB Id    Instance     Inst Num Startup Time    Release     RAC
~~~~~~~~ ----------- ------------ -------- --------------- ----------- ---
1814528001 PROD1 1 30-Dec-09 01:59 10.2.0.4.0 NO

Host Name: MACHINE1 Num CPUs: 8 Phys Memory (MB): 32,763
~~~~

Snapshot       Snap Id     Snap Time      Sessions Curs/Sess Comment
~~~~~~~~    ---------- ------------------ -------- --------- -------------------
Begin Snap:       8136 20-Jan-10 21:35:59      128       7.2
  End Snap:       8137 20-Jan-10 22:28:50      189       5.4
   Elapsed:               52.85 (mins)

Cache Sizes                       Begin        End
~~~~~~~~~~~                  ---------- ----------
               Buffer Cache:     4,128M     4,032M  Std Block Size:         8K
           Shared Pool Size:     6,032M     6,128M      Log Buffer:    13,913K

Load Profile                            Per Second       Per Transaction
~~~~~~~~~~~~                       ---------------       ---------------
                  Redo size:            126,609.46            124,954.43
              Logical reads:             10,300.31             10,165.66
              Block changes:                904.39                892.57
             Physical reads:                375.25                370.34
            Physical writes:                 17.54                 17.31
                 User calls:                 15.88                 15.67
Parses: 9.22 9.09
Hard parses: 1.08 1.06
                      Sorts:                 60.58                 59.79
                     Logons:                  0.13                  0.13
                   Executes:                327.42                323.14
               Transactions:                  1.01

  % Blocks changed per Read:    8.78    Recursive Call %:    98.49
 Rollback per transaction %:    1.28       Rows per Sort:    28.80

Instance Efficiency Percentages
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.96 Redo NoWait %: 95.11
            Buffer  Hit   %:   99.73    In-memory Sort %:  100.00
            Library Hit   %:   99.10        Soft Parse %:   88.32
         Execute to Parse %:   97.19         Latch Hit %:   99.91
Parse CPU to Parse Elapsd %: 64.77 % Non-Parse CPU: 99.51

 Shared Pool Statistics        Begin   End
                               ------  ------
             Memory Usage %:   74.42   72.87
    % SQL with executions>1:   33.97   33.40
  % Memory for SQL w/exec>1:   56.11   55.38

Top 5 Timed Events                                                    Avg %Total
~~~~~~~~~~~~~~~~~~                                                   wait   Call
Event                                            Waits    Time (s)   (ms)   Time
----------------------------------------- ------------ ----------- ------ ------
row cache lock                                  16,524      48,098   2911   45.6
log file switch (checkpoint incomplete)         68,137      41,595    610   39.5
buffer busy waits                               13,462      13,455    999   12.8
CPU time                                                     1,472           1.4
db file sequential read                        723,098         446      1     .4
          -------------------------------------------------------------
Host CPU CPUs-8
~~~~~~~~              Load Average
Begin End User System Idle WIO WCPU
                    ------- -------   ------- ------- ------- ------- --------
                                         3.69    2.12   94.19

Instance CPU
~~~~~~~~~~~~
              % of total CPU for Instance:    3.30
              % of busy  CPU for Instance:   56.81
  %DB time waiting for CPU - Resource Mgr:

Memory Statistics                       Begin          End
~~~~~~~~~~~~~~~~~                ------------ ------------
                  Host Mem (MB):     32,762.6     32,762.6
SGA use (MB): 10,240.0 10,240.0
                   PGA use (MB):        776.7        787.4
% Host Mem used for SGA+PGA: 33.6 33.7
          -------------------------------------------------------------

Time Model System Stats DB/Inst: PROD1/prod1 Snaps: 8136-8137
-> Ordered by % of DB time desc, Statistic name

Statistic                                       Time (s) % of DB time
----------------------------------- -------------------- ------------
sql execute elapsed time 101,089.5 100.0
DB CPU                                             821.2           .8
PL/SQL execution elapsed time                      620.9           .6
parse time elapsed                                  22.4           .0
hard parse elapsed time                             21.2           .0
PL/SQL compilation elapsed time                      1.7           .0
connection management call elapsed                   0.2           .0
hard parse (sharing criteria) elaps 0.1 .0
repeated bind elapsed time                           0.0           .0
hard parse (bind mismatch) elapsed                   0.0           .0
failed parse elapsed time                            0.0           .0
sequence load elapsed time                           0.0           .0
DB time                                        101,090.9
background elapsed time                          6,382.7
background cpu time                                 16.6

Wait Events DB/Inst: PROD1/prod1 Snaps: 8380-8382
-> s - second, cs - centisecond, ms - millisecond, us - microsecond
-> %Timeouts: value of 0 indicates value was < .5%. Value of null is truly 0
-> Only events with Total Wait Time (s) >= .001 are shown
-> ordered by Total Wait Time desc, Waits desc (idle events last)

                                                                    Avg
                                                %Time Total Wait   wait    Waits
Event Waits -outs Time (s) (ms) /txn
--------------------------------- ------------ ------ ---------- ------ --------
row cache lock                          16,524     97     48,098   2911      5.1
log file switch (checkpoint incom 68,137 61 41,595 610 21.2
buffer busy waits                       13,462    100     13,455    999      4.2
..
..
..

Wait Event Histogram DB/Inst: PROD1/prod1 Snaps: 8136-8137
-> Total Waits - units: K is 1000, M is 1000000, G is 1000000000
-> % of Waits - column heading: <=1s is truly <1024ms, >1s is truly >=1024ms
-> % of Waits - value: .0 indicates value was <.05%, null is truly 0
-> Ordered by Event (idle events last)

                           Total ----------------- % of Waits ------------------
Event Waits <1ms <2ms <4ms <8ms <16ms <32ms <=1s >1s
-------------------------- ----- ----- ----- ----- ----- ----- ----- ----- -----
row cache lock               16K   2.9                                  .0  97.1
log file switch (checkpoin 66K 21.0 2.7 12.8 2.1 .3 61.1
buffer busy waits            13K    .0                            .0 100.0

As you see row cache locks avg wait time is nearly 3 second with 97% is over 1 second which looks like it is the problem

We already spotted row cache lock but where to look next ?

Documentation says ;

row cache lock

The session is trying to get a data dictionary lock. 

Wait Time: Wait up to 60 seconds.

Parameter	Description
cache id	The CACHE# column value in the V$ROWCACHE view
mode	See "mode"
request	The pipe timer set by the user

In Statspack report we have a section called “Dictionary Cache Stats”. Lets check what we have there

Dictionary Cache Stats DB/Inst: PROD1/prod1 Snaps: 8136-8137
->"Pct Misses" should be very low (< 2% in most cases)
->"Final Usage" is the number of cache entries being used in End Snapshot

Get Pct Scan Pct Mod Final
Cache Requests Miss Reqs Miss Reqs Usage
------------------------- ------------ ------ ------- ----- -------- ----------
dc_awr_control                      18    0.0       0              0          1
dc_database_links                  107    0.0       0              0          1
dc_files                            13    0.0       0              0         13
dc_global_oids                  21,571    0.1       0              2        255
dc_histogram_data               73,396    2.2       0              0      3,904
dc_histogram_defs              279,541    1.4       0              0      7,652
dc_object_grants                   102   11.8       0              0        124
dc_object_ids                1,050,311    0.1       0              5      2,879
dc_objects                      14,052    1.3       0             31      2,605
dc_profiles                         74    0.0       0              0          1
dc_rollback_segments             4,147    0.0       0             64        300
dc_segments                     50,349    0.2       0            121      2,415
dc_sequences                        10   10.0       0             10         13
dc_table_scns                        1  100.0       0              0          0
dc_tablespace_quotas               289    0.0       0              0          9
dc_tablespaces                  66,671    0.0       0              0         11
dc_usernames                    18,888    0.1       0              0         53
dc_users                        99,804    0.0       0              0         71
outstanding_alerts                  26   15.4       0              4         24
          -------------------------------------------------------------

From my old shared pool post I remember that Pct Miss is important about dictionary cache tuning.
I did the comparison with the normal times and the only thing I spot was increase in Pct miss for
dc_object_grants and dc_sequences with less get request but without specific session activity all
I can do is this much for this event. so I moved to logfile switch event to see if I can find something

Definition **** log file switch (checkpoint incomplete)
The log file switch wait indicates that the process is waiting for the log file switch to complete,
but the log file switch is not possible because the checkpoint process for that log file
has not completed. You may see this event when the redo log files are sized too small.
Wait Parameters Wait parameters are not used for log file switch (checkpoint incomplete).
Wait Time One second

This is an event we see occasionally but not with the redo size we generated for that time interval. Lets see how many log switches has been done for that day

SQL> @gen_redo

DAY            COUNT#       MIN#       MAX# DAILY_AVG_MB
---------- ---------- ---------- ---------- ------------
2010-01-26        790      60361      61150        50560
2010-01-25        722      59639      60360        46208
2010-01-24        469      59170      59638        30016
2010-01-23        291      58879      59169        18624
2010-01-22        651      58228      58878        41664
2010-01-21        394      57834      58227        25216
2010-01-20        473      57361      57833        30272

10 rows selected.

Session altered.

DAY        00    01    02    03    04    05    06    07    08    09    10    11    12    13    14    15    16    17    18    19    20    21    22    23
---------- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
20/01/2010    0    50     0     0     0     1    56     0     1     9    17    43    37    27     8     8    14    12     9   101    57    15     7     1
21/01/2010    0    52     0     1     0     0    56     0     1     6    13    12    12    12    17    10     4    11     6   100    58    18     4     1
22/01/2010    0    51     0     0     1     0    58     0     1     3    11    27    14    18    82    25    17    60   104   100    57    18     4     0
23/01/2010    1    51     2     1     0     0    54     1     0     0     0     0     1     0     0     0     1     0     0   100    57    18     3     1
24/01/2010    0    52     1     3     3     3    67     8     7     1     0     1     0     0   136    53    16     1     0    76    20    18     1     2
25/01/2010    0    53     0     0     0     1    59     1     1   120    45    89    40     4    80    17    16    78     6    90    11     8     3     0
26/01/2010    0    53    12    39    13     0    57     0     1    40    28    63    38   142     4    99    59     5    17    82    16    19     2     1

8 rows selected.

looks like nothing specific happened number of log switches looks innocent.

What is next buffer busy looks like the third one with little effect but still can give something ?

Lets check the trends

SQL> @sp_evtrends
Please enter the ORACLE_SID value: PROD1
Please enter the report begin date: sysdate-8
Please enter the report end date:sysdate

Daily trends for "buffer busy waits"...

Day    Statistic Name                         Secs Waited Percentage of total over all days
------ ------------------------------ ------------------- ------------------------------------------------------------
19-JAN buffer busy waits                           127.95

20-JAN buffer busy waits                       134,742.54 **********************************************************

21-JAN buffer busy waits                           205.92

22-JAN buffer busy waits                           553.43

23-JAN buffer busy waits                             0.31

24-JAN buffer busy waits                            32.49

25-JAN buffer busy waits                         1,449.90 *

26-JAN buffer busy waits                         1,131.49

Daily/hourly trends for "DB time"... (truncated)

Day    Hour   Statistic Name                      VALUE Percentage of total over all hours for each day
------ ------ ------------------------------ ---------- ------------------------------------------------------------

20-JAN 00:00  buffer busy waits                             0.00
       01:00  buffer busy waits                             0.00
       02:00  buffer busy waits                             0.00
       03:00  buffer busy waits                             0.00
       04:00  buffer busy waits                             0.00
       05:00  buffer busy waits                             0.00
       06:00  buffer busy waits                             0.00
       07:00  buffer busy waits                             0.00
       08:00  buffer busy waits                             0.00
       09:00  buffer busy waits                             4.86
       10:00  buffer busy waits                             6.03
       11:00  buffer busy waits                            43.35
       12:00  buffer busy waits                            24.02
       13:00  buffer busy waits                            20.27
       14:00  buffer busy waits                            16.45
       15:00  buffer busy waits                            24.45
       16:00  buffer busy waits                            27.31
       17:00  buffer busy waits                            23.86
       18:00  buffer busy waits                             2.78
       19:00  buffer busy waits                             0.24
       20:00  buffer busy waits                             0.00
       21:00  buffer busy waits                             0.00
       22:00  buffer busy waits                       134,548.91 ************************************************************
       23:00  buffer busy waits                             0.00

Looks like normally our DB doesn’t spend time on this event which is good.

We have a nice section in statspack report for buffer busy wait events


Segments by Buffer Busy Waits DB/Inst: PROD1/prod1 Snaps: 8136-8137
-> End Segment Buffer Busy Waits Threshold:       100

                                                                    Buffer
Subobject Obj. Busy Pct
Owner Tablespace Object Name Name Type Waits Total
---------- ---------- -------------------- ------------ ----- ------------ -----
APP_USER USERS ULG TABLE 9,560 100.0
APP_USER USERS PK_ULG INDEX 1 .0
          -------------------------------------------------------------

All of the waits happened on the same object. Before checking I checked trend with another Tim Gorman script sp_buffer_busy (again edited for time interval and output only segments with buffer_busy waits >0)

I won’t put the output but output says this table showed up only 3 more times with wait total less then 30, in 10 days period. If you check the actual statspack for sqls
ULG table is not shown on any of them which is a bit strange again.

Maybe I can find something else, if I get half an hour period report for

		 	Snap
Snap Id   Snap Started    Level
8137 20 Jan 2010 22:28     7
8138 20 Jan 2010 22:58     7

From top 5 wait events

Top 5 Timed Events                                                    Avg %Total
~~~~~~~~~~~~~~~~~~                                                   wait   Call
Event                                            Waits    Time (s)   (ms)   Time
----------------------------------------- ------------ ----------- ------ ------
db file sequential read                        613,887       1,040      2   42.0
db file scattered read                         266,197         853      3   34.5
CPU time                                                       418          16.9
db file parallel write                          45,244          86      2    3.5
log file parallel write                          4,449          26      6    1.1
          -------------------------------------------------------------

Our problem disappeared from top 5 list and DB time=16,656.2 which is very small comparing to previous value 101,090.9.
problem is gone but what has actually happened ?

I am pretty sure that there wasn’t any change on that day. backup ran successfully after 23:00 and export finished before 9:30 so what can this be ?

I know the answer but whoever comes up with a good idea please be my guest to comment. If you are good enough, I promise to buy beer.

Answer will be here as an another post on Friday. If I have time I will also show what happened for the other DB time peak on same day.

Note:
With this post I did not re-invent the wheel but try to show you the available free tools you can use and modify for your own fast troubleshooting. As Tanel says right tool right time. These guys share these tools to help you , my advice is use it if you need the.
My script library consist of many scripts from Tanel Poder, Riyaj Shamsudeen ,Tim Gorman, Kerry Osborne, Dion Cho, Craig Shallahamer and from many others plus my own.
Thanks you very much to all guys who share their tools with us.

How much time I spent for diagnosis? Everything on this page took 10 minutes total to run. Rest is your experience and skills to search on metalink and internet.
for me coming to last point took only 1.5 hour.

List of scripts I used;

Older Posts »

The Silver is the New Black Theme. Create a free website or blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 199 other followers