Coskan’s Approach to Oracle

January 13, 2012

Session Snapper on Duty 2 – (how to support your answers, with proof)

Filed under: Basics, Performance, Tips — coskan @ 10:51 pm

Warning : This long blog is for educational and quality purposes and especially targeting Junior DBAs and DBAs who does not know how to use the Session Snapper
(greatest Oracle Tuning Tool yet) by Tanel Poder . If you feel comfortable with snapper and you think your relationship with your
clients just rocks then there is no need to read the post but you may still give it a go :)

Story started with a long email thread forward by a  developer who is asking question below for a system I am responsible but not feeling perfectly comfortable about how things really work.

Hi Josh (note to reader: Coskan is pronounced as Joshkahn so I use Josh at office)
Any ideas on how we can work with dbas to find out why our database is only committing at a range of
2-30 sec for 100 commits instead of the ideal 100 commits a sec for a third party process?

I answered

From my point of view 

I will be requesting 

1-	Time interval when this process run
2-	If you have SID information when this process run
3-	If it gives same results on UAT as well then can we run the operation while tracing is on (which need Change Request-hassle on production)

Then we can contione to understand what can be the problem

Developer replied

It runs 7/24 and UAT testing is not possible since we can't point the feeds to UAT but username is USER_XXX
if it helps.

Long story short all I understood was in simplest words  they want to be able to process more data in same amount of time,  so I replied with a long mail below

BEGINNING OF THE EMAIL

I did some sampling on your sessions to understand what is happening and I think I can come up with rough conclusions about improvement suggestions.

Lets walk through on what is/maybe happening

This is 500 second sampling of one of your sessions

---Some important Statistics
-------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                                                 ,     HDELTA, HDELTA/SEC,    %TIME, GRAPH
-------------------------------------------------------------------------------------------------------------------------------------
     71, USER_XXX  , STAT, user commits                                              ,        326,        .65,
     71, USER_XXX  , STAT, session logical reads                                     ,      1.36M,      2.71k,

—Wait profile

     71, USER_XXX  , WAIT, buffer busy waits                                         ,     1.18ms,     2.36us,      .0%, |          |
     71, USER_XXX  , WAIT, log file sync                                             ,     36.12s,    72.24ms,     7.2%, |@         |
     71, USER_XXX  , WAIT, db file sequential read                                   ,    276.17s,   552.34ms,    55.2%, |@@@@@@    |
     71, USER_XXX  , WAIT, flashback buf free by RVWR                                ,   403.28ms,   806.55us,      .1%, |          |
     71, USER_XXX  , WAIT, undo segment extension                                    ,    72.93ms,   145.86us,      .0%, |          |
     71, USER_XXX  , WAIT, SQL*Net message to client                                 ,     2.23ms,     4.47us,      .0%, |          |
     71, USER_XXX  , WAIT, SQL*Net message from client                               ,     178.2s,    356.4ms,    35.6%, |@@@@      |
     71, USER_XXX  , WAIT, SQL*Net more data from client                             ,   710.26ms,     1.42ms,      .1%, |          |

—Wait profile breakdown per sql

---------------------------------------------------------------------------------
Active% | SQL_ID          | EVENT                               | WAIT_CLASS
---------------------------------------------------------------------------------
    20% | ggjkmtv8cxxh3   | db file sequential read             | User I/O
    19% | 64djbhqv5v93u   | db file sequential read             | User I/O
    13% | 3b8gt3dqtn345   | db file sequential read             | User I/O
     8% |                 | log file sync                       | Commit
     2% | 8zt375nzk26rh   | db file sequential read             | User I/O
     1% | 8ztz7p814q9u2   | db file sequential read             | User I/O
     1% | ggjkmtv8cxxh3   | ON CPU                              | ON CPU
     1% | 64djbhqv5v93u   | ON CPU                              | ON CPU
     0% | 64djbhqv5v93u   | flashback buf free by RVWR          | Configuration
     0% | 64djbhqv5v93u   | SQL*Net more data from client       | Network

--  End of ASH snap 1, end=2012-01-11 21:51:46, seconds=500, samples_taken=499

When we check the stats, you have done 326 commits in 500 seconds which makes 0.65 commit per second but if we take 180 second of idle events (SQL*Net message to client + SQL*Net message from client) out, it is 326 commits in 320 (500-180) seconds active time which is roughly 1 commit per second.

When we look at the wait profile we can see 276 seconds of “db file sequential read” (physical read) over 320 second ofnon-iddle period and 36  second  “log file sync” which we will come later

If we focus on the “db file sequential read” we can see it is %86 (276/320) of the active time and there are 3 main contributer sqls for this case.

SQLs are below

SQL> @sqlid2 ggjkmtv8cxxh3

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ---------------------------------------------------------------------------------------------------------------
ggjkmtv8cxxh3 3503224323 DELETE FROM TABLE_T WHERE ESMP = :1

1 row selected.

  PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
---------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
4033870845        4          0         21      20219           .0        392.7       .26172575          1198.3            28.1



Plan hash value: 4033870845

-----------------------------------------------------------------------------------------------
| Id  | Operation         | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT  |                           |       |       |     4 (100)|          |
|   1 |  DELETE           | TABLE_T                   |       |       |            |          |
|*  2 |   INDEX RANGE SCAN| PK_TABLE_T                |     2 |    50 |     4   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

SQL> @sqlid2 64djbhqv5v93u

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
64djbhqv5v93u 3059590266 some complex update statement which does not need to be shared

PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
---------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
1856144614        3         26          2     473152           .0         85.4      .270458035           454.4            25.4

Plan hash value: 1856144614

-------------------------------------------------------------------------------------------
| Id  | Operation          | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |                      |       |       |     3 (100)|          |
|   1 |  UPDATE            | TABLE_XXXXXXX        |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN| PK_TABLE_XXXXXXX     |     1 |   192 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------


SQL> @sqlid2 3b8gt3dqtn345

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
3b8gt3dqtn345 1838812293 some complex update statement which does not need to be shared

PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
--------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
861798698        3          0          7      22308           .0         97.5      .113096364           382.6            10.9

Plan hash value: 861798698

------------------------------------------------------------------------------------------------
| Id  | Operation          | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |                           |       |       |     3 (100)|          |
|   1 |  UPDATE            | TABLE_ZZZZ                |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN| PK_TABLE_ZZZZ             |     1 |    69 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

There is nothing much we can do about index unique scans other than changing the table structure but I think there is room for improve on index range scan for sql_id ggjkmtv8cxxh3 for delete statement which is waitig on db file sequential read for  %31 (500*%20 / 320 ) of the active time.

The reason the delete is doing index range scan is because we are trying to access 3 column composite index PK_TABLE_T by only using 1 column which is COL1. If business logic lets you also give other 2 columns as filter as well we will be doing much throughput with less LIO and PIO

TABLE_OWNER          TABLE_NAME                     INDEX_NAME                     POS# COLUMN_NAME
-------------------- ------------------------------ ------------------------------ ---- ------------------------------
APP_OWNER            TABLE_T                        PK_TABLE_T                        1 COL1
                                                                                      2 COL2
                                                                                      3 COL3

While I was doing sampling on above session at the same time I also did sampling on another session which is also having similar behaviour with different sqls like below.

—Some important Statistics

-------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                                                 ,     HDELTA, HDELTA/SEC,    %TIME, GRAPH
-------------------------------------------------------------------------------------------------------------------------------------
    192, USER_XXX  , STAT, user commits                                              ,        343,        .69,
    192, USER_XXX  , STAT, session logical reads                                     ,      2.78M,      5.56k,

—Wait profile

  192, USER_XXX  , WAIT, buffer busy waits                                         ,      702us,      1.4us,      .0%, |          |
  192, USER_XXX  , WAIT, log file switch completion                                ,    94.34ms,   188.68us,      .0%, |          |
  192, USER_XXX  , WAIT, log file sync                                             ,     28.26s,    56.53ms,     5.7%, |@         |
  192, USER_XXX  , WAIT, db file sequential read                                   ,     364.3s,    728.6ms,    72.9%, |@@@@@@@@  |
  192, USER_XXX  , WAIT, flashback buf free by RVWR                                ,   348.79ms,   697.57us,      .1%, |          |
  192, USER_XXX  , WAIT, undo segment extension                                    ,    492.4ms,    984.8us,      .1%, |          |
  192, USER_XXX  , WAIT, SQL*Net message to client                                 ,     1.03ms,     2.07us,      .0%, |          |
  192, USER_XXX  , WAIT, SQL*Net message from client                               ,     94.02s,   188.05ms,    18.8%, |@@        |
  192, USER_XXX  , WAIT, SQL*Net more data from client                             ,   368.33ms,   736.67us,      .1%, |          |

—Wait profile breakdown per sql

---------------------------------------------------------------------------------
Active% | SQL_ID          | EVENT                               | WAIT_CLASS
---------------------------------------------------------------------------------
    37% | dkh6b1nhnhfcu   | db file sequential read             | User I/O
    32% | 29zz9n1b9sppr   | db file sequential read             | User I/O
     7% |                 | log file sync                       | Commit
     1% | dkh6b1nhnhfcu   | ON CPU                              | ON CPU
     1% | 29zz9n1b9sppr   | ON CPU                              | ON CPU
     0% | 29zz9n1b9sppr   | undo segment extension              | Configuration
     0% | 29zz9n1b9sppr   | flashback buf free by RVWR          | Configuration

--  End of ASH snap 1, end=2012-01-11 21:52:17, seconds=500, samples_taken=499

This time 94 seconds idle with 406 (500-94) second processing time which gives us 0.84 (343/406) commit per second and 392 seconds of wait
(364 “db file sequential read”+28 “log file sync”) wait.

When we look at the sqls below this time we can see 160 (500*%32) of 364 seconds (%43 of active time) is being spent on again index range scan (sqlid 29zz9n1b9sppr) on delete statement again due to the lack of composite index column filtering.

I again believe we can improve the access if we are able to give all columns as filter and maintain index unique scan.


TABLE_OWNER          TABLE_NAME                     INDEX_NAME                     POS# COLUMN_NAME                    DSC
-------------------- ------------------------------ ------------------------------ ---- ------------------------------ ----
APP_OWNER            TABLE_Z                        PK_TABLE_Z                      1    COL1
                                                                                    2    COL2
                                                                                    3    COL3


SQL> @sqlid2 dkh6b1nhnhfcu

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
dkh6b1nhnhfcu  558381466 INSERT INTO APP_OWNER.TABLE_Z where bla bla bla bla

PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
--------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
        0        1        778         15     102566           .0       1065.3      .177031281          5673.4            30.4

-------------------------------------------------
| Id  | Operation                | Name | Cost  |
-------------------------------------------------
|   0 | INSERT STATEMENT         |      |     1 |
|   1 |  LOAD TABLE CONVENTIONAL |      |       |
-------------------------------------------------


SQL> @sqlid2 29zz9n1b9sppr

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- -------------------------------------------------------------------------------------------------------
29zz9n1b9sppr 1453086391 DELETE FROM APP_OWNER.TABLE_Z WHERE ESMP = :1

1 row selected.

PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
---------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
3229062859        4          0         30      21308           .0       1136.6      .642183193          2081.5            68.1

--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT  |                  |       |       |     4 (100)|          |
|   1 |  DELETE           | TABLE_Z          |       |       |            |          |
|*  2 |   INDEX RANGE SCAN| PK_TABLE_Z       |     5 |   135 |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

I think I managed to make it clear about the index range scan (for more information see the bonus material below), Now Lets see what the log file sync is all about and why we are seeing it .
<p

Log File sync event has got definition below in Oracle Documentation.

log file sync

When a user session commits, the session’s redo information needs to be flushed to the redo logfile. The user session will post the LGWR to write the log buffer to the redo log file. When the LGWR has finished writing, it will post the user session.

Wait Time: The wait time includes the writing of the log buffer and the post.

Parameter Description
buffer# The number of the physical buffer in the redo log buffer that needs to be synchronized

Potential causes

Wait Event General Area Possible Causes Look for / Examine
log file sync I/O, over- committing Slow disks that store the online logs

Un-batched commits

Check the disks that house the online redo logs for resource contention. Check the number of transactions (commits + rollbacks) each second, from V$SYSSTAT.

First rule of tuning log file sync is doing less commit which means instead of commiting every dml or every single operation like

begin
for i in 300001..600000 loop
delete from t1 where id1=to_char(i);
commit;
end loop;
end;
/

Doing commits in batch like

begin
for i in 300001..600000 loop
delete from t1 where id1=to_char(i);
end loop;
commit;
end;
/

To prove how you are doing commits, I just get before sql statistics then get sampling of the session then again get after sql statistics and to see if number of user commits statistics during sampling interval are matching with number of increase in the sql executions

—-note to blog reader: this is, how I did,  no rocket science  (ones you arm yourself with right tools) just run the block as whole

select sql_id,executions from v$sql where sql_id in ('29zz9n1b9sppr','ggjkmtv8cxxh3');
@snapper ash=sql_id+event,stats,gather=stw,sinclude=user%commits 20 1 71,192
select sql_id,executions from v$sql where sql_id in ('29zz9n1b9sppr','ggjkmtv8cxxh3');

Results are

SQL> @test1

SQL_ID        EXECUTIONS
------------- ----------
29zz9n1b9sppr      28897
ggjkmtv8cxxh3      28211

Sampling SID 71,192 with interval 100 seconds, taking 1 snapshots...

-- Session Snapper v3.53 by Tanel Poder ( http://blog.tanelpoder.com )

-------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                                                 ,     HDELTA, HDELTA/SEC,    %TIME, GRAPH
-------------------------------------------------------------------------------------------------------------------------------------
     71, USER_XXX  , STAT, user commits                                              ,         12,        .12,
     71, USER_XXX  , TIME, repeated bind elapsed time                                ,       59us,      .59us,      .0%, |          |
     71, USER_XXX  , TIME, parse time elapsed                                        ,      127us,     1.27us,      .0%, |          |
     71, USER_XXX  , TIME, DB CPU                                                    ,   278.96ms,     2.79ms,      .3%, |          |
     71, USER_XXX  , TIME, sql execute elapsed time                                  ,     36.32s,   363.21ms,    36.3%, |@@@@      |
     71, USER_XXX  , TIME, DB time                                                   ,     40.75s,   407.51ms,    40.8%, |@@@@@     |
     71, USER_XXX  , WAIT, log file sync                                             ,      4.42s,    44.22ms,     4.4%, |@         |
     71, USER_XXX  , WAIT, db file sequential read                                   ,     35.04s,   350.43ms,    35.0%, |@@@@      |
     71, USER_XXX  , WAIT, SQL*Net message to client                                 ,      242us,     2.42us,      .0%, |          |
     71, USER_XXX  , WAIT, SQL*Net message from client                               ,     60.11s,   601.05ms,    60.1%, |@@@@@@    |
     71, USER_XXX  , WAIT, SQL*Net more data from client                             ,     3.94ms,    39.44us,      .0%, |          |
    192, USER_XXX  , STAT, user commits                                              ,         12,        .12,
    192, USER_XXX  , TIME, DB CPU                                                    ,   207.97ms,     2.08ms,      .2%, |          |
    192, USER_XXX  , TIME, sql execute elapsed time                                  ,     30.74s,   307.37ms,    30.7%, |@@@@      |
    192, USER_XXX  , TIME, DB time                                                   ,     35.14s,    351.4ms,    35.1%, |@@@@      |
    192, USER_XXX  , WAIT, log file sync                                             ,       4.4s,    43.98ms,     4.4%, |@         |
    192, USER_XXX  , WAIT, db file sequential read                                   ,     29.95s,   299.51ms,    30.0%, |@@@       |
    192, USER_XXX  , WAIT, SQL*Net message to client                                 ,      113us,     1.13us,      .0%, |          |
    192, USER_XXX  , WAIT, SQL*Net message from client                               ,     64.55s,   645.48ms,    64.5%, |@@@@@@@   |
    192, USER_XXX  , WAIT, SQL*Net more data from client                             ,    19.36ms,   193.62us,      .0%, |          |
--  End of Stats snap 1, end=2012-01-12 22:27:59, seconds=100

---------------------------------------------------------------
Active% | SQL_ID          | EVENT
---------------------------------------------------------------
    28% | 29zz9n1b9sppr   | db file sequential read             |
    12% | ggjkmtv8cxxh3   | db file sequential read             |
    11% | 3b8gt3dqtn345   | db file sequential read             |
     9% |                 | log file sync                       |
     9% | 64djbhqv5v93u   | db file sequential read             |
     4% | dkh6b1nhnhfcu   | db file sequential read             |
     1% | 8zt375nzk26rh   | db file sequential read             |
     1% | b513mwk4mf9rt   | db file sequential read             |
     1% | 0h01fy07y1c1p   | db file sequential read             |

--  End of ASH snap 1, end=2012-01-12 22:27:59, seconds=100, samples_taken=99

PL/SQL procedure successfully completed.

SQL_ID        EXECUTIONS
------------- ----------
29zz9n1b9sppr      28909
ggjkmtv8cxxh3      28223

As you can see, both sql increase by 12  (after-before execution times) and both sql show user commits statistics delta as 12 which means  there is a commit per execution that leads log file sync event.

After proving that the app is not doing batch commits,  I wonder if this is something avoidable or not,  in terms of business logic. Doing batch commits  will definitely help to increase your overall throughput plus whole system health.

So in my humble opinion we have two options without me seeing any code logic.

1- Providing unique index access for the delete operations by giving all columns of the index
2- Decreasing the number of commits and try to do batch commit.

For you to test all my actions, I have added a a simple synthetic test case attached to understand how these implementations can help us.

Cases tested.

1-Nonbatch commit range scan
2-Nonbatch commit unique scan

3-Batch commit range scan
4-Batch commit unique scan

Timings for the test are as follows for deleting 100K row sets as you can see

Elapsed Improvement
nonbatch-commit-range 84 0%
nonbatch-commit-unique 47 44%
batch-commit-range 55 34%
batch-commit-unique 35 58%

END OF THE EMAIL


Note to blog reader : (In case they are still reading), I wish you found something interesting about how I used the right tool right time
and how I answered the development questions without talking bollocks but numbers and proofs. One thing I need to admit there were only 2 sessions running this operation which made my life easy.

If you wonder the duration of whole diagnose: It took 15 mins in total to find out what is going on and 1 hour to write the mail but that 1 hour save at least 3-4 unnecessary email ping pong just because nobody can prove their fact with numbers.

Quick Tip: If you are sending any output to anybody use Courier New character set and make everybodies eyes happy.

Session Snapper and its usage can be found at Tanels website
Test case can be found  here

November 18, 2011

On demand AWR Report Generator, for Many Node Cluster

Filed under: Basics, Performance, RAC — coskan @ 11:34 am

Since I got turbo boost (over 1000 hits a day-wooow) from Tanel Poder the Great, better I write something to deserve the fame. This one is a real quick one which made my life easier.

On my second week at current company, I had to cover Exadata for 2 weeks which were completely nightmare due to some problems. I had to work with Oracle Support too much and most of the time they ask AWR reports from 8 nodes for x hour duration. First day I was not armed with anything automatic, so tried to do it manually but it was proper nightmare for 8 nodes. Then I said this is complete rubbish way of wasting my time and I need to automate the process otherwise my life will be miserable. Finally I came up with something after couple of attempts

The script below is the result of this automation attempts.

define _interval=1   --1 for half hour 2 for 1 hour 
define _database_name='DBNAME'
define _begin_interval_date='141111 23:00:00'
define _end_interval_date='151111 05:00:00'
define _folder='C:\PERSONAL\awr\instance\EXADATA\141111'


SET termout OFF 
host mkdir &amp;_folder
SET termout ON
host dir &amp;_folder

SET heading off

spool gen_awr.sql
SELECT 'set veri off;' FROM dual;
SELECT 'set feedback off;' FROM dual;
SELECT 'set linesize 1500;' FROM dual;
SELECT 'set termout on;'||CHR(10) FROM dual;
SELECT  'spool &amp;_folder\awrrptto_&amp;_database_name'||CHR(95)||TO_CHAR(instance_number)||'_'||TO_CHAR(previous_snap_id)||'_'||TO_CHAR(snap_id)||'.html'||CHR(10)|| CHR(10)|| 
'SELECT * FROM TABLE(dbms_workload_repository.awr_report_html('||TO_CHAR(dbid)||','||TO_CHAR(instance_number)||','||TO_CHAR(previous_snap_id)||','||TO_CHAR(snap_id)||',8));'||CHR(10)
||CHR(10)||'spool off' 
FROM (
select instance_number,dbid,snap_id,
LEAD(snap_id, &amp;_interval, 0) OVER (PARTITION BY instance_number ORDER BY snap_id DESC NULLS LAST) previous_snap_id,
LAG(snap_id, &amp;_interval, 0) OVER (PARTITION BY instance_number ORDER BY snap_id DESC NULLS LAST) next_snap_id,
begin_interval_time 
from dba_hist_snapshot 
where dbid=(select dbid from v$database where name='&amp;_database_name')
AND instance_number in (select DISTINCT inst_id from gv$database  )
AND begin_interval_time&gt;=TO_DATE('&amp;_begin_interval_date','DDMMYY HH24:MI:SS')-1/24
AND begin_interval_time&lt;TO_DATE('&amp;_end_interval_date','DDMMYY HH24:MI:SS')+1/24
) WHERE next_snap_id&lt;&gt;0 AND previous_snap_id&lt;&gt;0
order by instance_number,begin_interval_time asc;

spool off;


!!!To save it click on the view source link on the sourcode upright corner

Nothing rocket science, just using dbms_workload_repository.awr_report_html with some junior level analytic.

All you have to do is give the date, interval (only tested for half hour or one hour intervals)and folder to be saved then rest will be done by the script.

It will create a gen_awr.sql script which you will run and go have a coffee then when you come back your awr reports are ready to be sent.

so the sequence is

@gen_snap  ---assume you save it with this name
@gen_awr
--coffee break
--zip+upload

I would like to automate it further by doing the same for rac comparison or group of instance but I think I am too lazy or maybe busy with some other non-automatable rubbish :( (I also had some issues with rac comparison report formatting with the DBMS_WORKLOAD_REPOSITORY package so I gave up on that one)

Hope this tiny script save your time as much as it saves mine.

November 11, 2011

Where is the sql_id of active session?

Filed under: Bugs, Performance, Plan Stability, Tips — coskan @ 3:39 pm

In this post I will give information on quick ways to find the sql_id’s of sessions for which you can’t see any entry when you look at v$session.

I usually see this behavior when sql is burning CPU in a nested loop and after a long time of running sql_id column suddenly not updated anymore. When I tweeted about it , ,Tanel Poder answered (If I a recall it right damn twitter does not have a proper search and history) something like ” this happens when oracle does too many recursive calls and loses track of the sql_id” (Tanel, if you are reading, please correct me if I am wrong about what you said)

On Metalink there are two notes I could find

Reason for Apparently Incorrect or Null Sql_id in v$sql [ID 406452.1]

This is caused by an architectural change in the 10g release: from 10.2 onwards, the v$sql columns (sql_id, sql_address and so on) will only show the top level user cursor,  if any (ie it will not display the recursive cursors),.
When using a remote procedure call, such as when a rpc call from a client (eg forms client) requests that a plsql object is executed on the server, there is no top level sql and, consequently, there is no sql_id in v$session.

BUG:5146994 - CANNOT FIND RECURSIVE SQL IN V$SQLAREA FOR A RUNNING SNAPSHOT REFRESH JOB
BUG 5724661 Abstract: WRONG VALUES FOR SQL_ID IN V$SESSION

On Null SQL_ID For Active Sessions In V$Session [ID 1343503.1]
From 10.2 onwards sql_id will only show the top level user cursor ( if any). 

Sometimes,Session does a plsql rpc call ie a client ( eg forms client) can request a plsql object to be executed at the server. In this case there is no top level sql and as such we don't see the sql_id in v$session.

Currently, oracle don't store the current sql thru any v$ views. The only way to find that is thru event 10046.
This is covered in Unpublished bug 5528670: "NULL VALUES FOR SQL_ID, SQL_HASH_VALUE, SQL_ADDRESS AND OTHERS IN V$SESSION", Which was closed as "Not a Bug".

When I have this issues I sometimes get it even on a sql which I run from sql_plus without a pl/sql call so I think limitation on these notes is a bit misleading or I am interpreting the information wrong. Anyway finding the reasons needs too much tracing research bla bla bla which you will never have time to go through when you hit the issue on a production system so best I cut the chase and start giving information how to find the sql_id

Notes says there is no v$ view which is a a bit misleading and notes also say 10046 is the only way to find it which is also another documentation error ignoring method 2 below.

On 10G there are 2 ways that I know will usualls work one of which needs tuning and diagnostics pack license other does not

On 11G there is one more way which again needs tuning and diagnostics pack license.

Method 1- using ASH/AWR data + v$sql combination (10g-11g license needed)

1a- Find the latest not null sql_id column for the given session and serial in v$active_session_history

1b- If you cant find it at first step, due to the reason sql_id information already lost in v$active_session_history, then you need to check similar way using dba_hist_active_sess_history

2- Once you find the sql_id,plan_hash_value double check it on v$sql if number on users_executing column is matching in current v$session then you have what you looking for

—Method-1 on Action

DATABASE1> @swact

    SID USERN STATE   EVENT                        SEQ# SEC_IN_STATE         P1         P2         P3  SQL_ID         CHN
------- ----- ------- ---------------------- ---------- ------------ ---------- ---------- ----------  ------------- ----
   1352 XXWEB WAITING db file scattered read      40906            0         67    2638509         32                       --->sql_id empty
    294 XXWEB WAITING db file scattered read      16727            0         82     234468         32                       --->sql_id empy
   1263 COSKA WORKING On CPU / runqueue              44            0 1413697536          1          0  f5cqythck1gfz    1

DATABASE1> @last_sql 1352 %

SAMPLE_TIME                           SESSION_ID SESSION_SERIAL# SQL_ID		PLAN_HASH_VALUE
------------------------------------- ---------- --------------- -------------  ---------------
31-JAN-11 11.00.35.524                      1352              72 3p1b93fq81x0f  2118159443	   ----> very likely our sql
31-JAN-11 11.00.34.514                      1352              72 3p1b93fq81x0f  2118159443
31-JAN-11 11.00.33.514                      1352              72 3p1b93fq81x0f  2118159443
31-JAN-11 11.00.32.514                      1352              72 3p1b93fq81x0f  2118159443
31-JAN-11 11.00.31.514                      1352              72 3p1b93fq81x0f  2118159443
31-JAN-11 11.00.30.514                      1352              72 3p1b93fq81x0f  2118159443
31-JAN-11 11.00.29.504                      1352              72 3p1b93fq81x0f  2118159443
31-JAN-11 11.00.28.494                      1352              72 3p1b93fq81x0f  2118159443
31-JAN-11 11.00.27.494                      1352              72 3p1b93fq81x0f  2118159443

9 rows selected.

DATABASE1> @last_sql 294

SAMPLE_TIME                            SESSION_ID SESSION_SERIAL# SQL_ID	PLAN_HASH_VALUE
-------------------------------------- ---------- --------------- ------------- ---------------
31-JAN-11 10.50.40.071                        294              39 3p1b93fq81x0f 2118159443	  ----> very likely our sql
31-JAN-11 10.50.39.061                        294              39 3p1b93fq81x0f 2118159443
31-JAN-11 10.50.38.061                        294              39 3p1b93fq81x0f 2118159443
31-JAN-11 10.50.37.061                        294              39 3p1b93fq81x0f 2118159443
31-JAN-11 10.50.36.061                        294              39 3p1b93fq81x0f 2118159443
31-JAN-11 10.50.35.061                        294              39 3p1b93fq81x0f 2118159443
31-JAN-11 10.50.34.051                        294              39 3p1b93fq81x0f 2118159443
31-JAN-11 10.50.33.051                        294              39 3p1b93fq81x0f 2118159443
31-JAN-11 10.50.32.051                        294              39 3p1b93fq81x0f 2118159443

9 rows selected.

DATABASE1>-----LETS CHECK THE SQL

DATABASE1> @sqlid 3p1b93fq81x0f

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- --------------------------------------------------------------------------------------------------------
3p1b93fq81x0f 2894132238 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

 CH#   PLAN_HASH OPT_COST EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS USERS_EXECUTING
----  ---------- -------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- --------------
   0  2118159443   956322          3          1              0 861533.028    5876120 1958.70667   21851873   21739212          2

Elapsed times + users_executing columns saying that these sqls has been running for a very long time and they are %99.999 the ones we are looking for

There are tiny challenges with this approach but not that much challeging.

I used it very much in RAC env with a system where multiple users and child cursors are also running
so if you are careful enough to compare apples with apples (same instance, same plan hash value,right session serial, longer elapsed time cursor) you are very unlikely to be mislead

code used for last_sql is like below (modified a bit for rac aware)


select inst_id,sample_time,session_id,session_serial#,sql_id from (
select inst_id,sample_time,session_id,session_serial#,sql_id from
gv$active_session_history
where
sql_id is not null
 and session_id=&1 and inst_id=&2
 order by 1 desc
) where rownum

Method 2- Oradebug dump error stack (10g-11g no license needed)- Most reliable and certain way it is %100 accurate and good part is it does not need any input from me since Tanel Poder already covered how to go through error stack on his post how to read errorstack output

I tried 10046 and never seen any sql_id information written for the 10 mins period I gathered the trace so I’m not %100 sure if it needs whole call to be finished before it writes the sql_id information. I personally won’t use 10046 for this case.

Method-3- SQL Monitoring (11g onwards-license needed). This is my favourite and its damn quick and %99.9 it works if the sql was running long enough to get monitored and then lost the track. It is damn simple like this piece of code

select distinct inst_id,sid,status,sql_id,sql_plan_hash_value,sql_child_address,sql_exec_id
 from gv$sql_plan_monitor
 where sid in (&1) and status='EXECUTING';

—Method 3 On Action

DATABASE1> @swact

    SID USERN STATE   EVENT                        SEQ# SEC_IN_STATE         P1         P2         P3  SQL_ID         CHN
------- ----- ------- ---------------------- ---------- ------------ ---------- ---------- ----------  ------------- ----
   1352 XXWEB WAITING db file scattered read      40906            0         67    2638509         32                       --->sql_id empty
    294 XXWEB WAITING db file scattered read      16727            0         82     234468         32                       --->sql_id empy
   1263 COSKA WORKING On CPU / runqueue              44            0 1413697536          1          0  f5cqythck1gfz    1

---Check sql_monitoring
SQL> @mon_sql 1352,294

INST_ID  SID STATUS              SQL_ID        SQL_PLAN_HASH_VALUE SQL_CHILD_ADDRES SQL_EXEC_ID
-------	---- ------------------- ------------- ------------------- ---------------- -----------
1        1352 EXECUTING           3p1b93fq81x0f          2118159443 00000002D88378E8 16777216
1        1352 EXECUTING           2kgnch9h5hbkz          1874321290 00000002D6223418 16777216  ---paren pl/sql
1         294 EXECUTING           3p1b93fq81x0f          2118159443 00000002D88378E8 16777217
1         294 EXECUTING           2kgnch9h5hbkz          1874321290 00000002D6223418 16777217

The second sql_id 2kgnch9h5hbkz is from pl_sql parent call which was calling 3p1b93fq81x0f so they both get monitored by v$sql_monitor

Apart from all these again if you have diagnostics pack license you can always use OEM/Grid Control to check back the sessions history but comparing with querying v$sql_monitor it is as slow as driving an electric car when you have ferrari.

As you can see there are more ways then listed if your session lost track of the sql_ids. If you have more ways (especially with tracing -like dump sql if there is an option like that) I would really like to hear.

Update after first comment

Method-4 Using v$open_cursor

First comment on which Michael Dinh mentioned he is using toad open cursors option to hunt down the sqls but he was not sure for the sqlplus.
I had a chance to to check with one of our systems. Looks like v$open_cursor is giving us information which not as clear as the other methods but still can be very
useful for standart edition or users without diagnostics pack and also can be easier than error stack which usually needs a change request on PRD systems.

SQL> @sw 207,980

    SID USERN STATE   EVENT                    SEQ# SEC_IN_STATE SQL_ID         CHN                        P1                        P2                        P3
------- ----- ------- ------------------ ---------- ------------ ------------- ---- ------------------------- ------------------------- -------------------------
    207 XXXXX WORKING On CPU / runqueue       62033            3                    cellhash#=203CB5D5         diskhash#= 349428727      bytes= 16384
    980 XXXXX WORKING On CPU / runqueue         759            1                    cellhash#=D84BEC46         diskhash#=56AC941E       bytes= 16384

SQL> @mon_sql 207,980

   INST_ID        SID STATUS              SQL_ID        SQL_PLAN_HASH_VALUE SQL_CHILD_ADDRES SQL_EXEC_ID
---------- ---------- ------------------- ------------- ------------------- ---------------- -----------
         2        980 EXECUTING           91uc7cvw2rqab          4166729024 000000040D4560F8    33554458
         2        207 EXECUTING           91uc7cvw2rqab          4166729024 000000040D4560F8    33554456

SQL> select * from v$open_cursor where sid in (207,980) order by 2;

SADDR                   SID USER_NAME  ADDRESS          HASH_VALUE SQL_ID        SQL_TEXT LAST_SQL_ SQL_EXEC_ID CURSOR_TYPE
---------------- ---------- ---------- ---------------- ---------- ------------- -------- --------- ----------- -----------------------------------
00000004508BA640        207 XXXXXXX    00000003A72340A0 2292456266 gw771824a86ua Begin PK              33554456 OPEN
00000004508BA640        207 XXXXXXX    00000003E6F0C230 2181403846 5bvw0ka10b566 Begin PK                       OPEN
00000004508BA640        207 XXXXXXX    000000040E81D128 4111302712 6k03snbuhv01s Begin PK                       OPEN
00000004508BA640        207 XXXXXXX    00000003EA88E4F0 1535077424 68u7tx5dryv1h Begin PK                       OPEN
00000004508BA640        207 XXXXXXX    00000004592E3690  914163366 4vs91dcv7u1p6 insert i                       OPEN-RECURSIVE
00000004508BA640        207 XXXXXXX    0000000411FC64C0 1407509578 8nwyam59y9t2a Begin PK                       OPEN
00000004508BA640        207 XXXXXXX    00000004117E1340 2399958818 62h75aq7hsxt2 Begin PK                       OPEN
00000004508BA640        207 XXXXXXX    00000004125C3DC0 4163623243 91uc7cvw2rqab INSERT I              33554456 OPEN-PL/SQL
0000000452AA1BB8        980 XXXXXXX    00000004125C3DC0 4163623243 91uc7cvw2rqab INSERT I              33554458 OPEN-PL/SQL
0000000452AA1BB8        980 XXXXXXX    00000004592E3690  914163366 4vs91dcv7u1p6 insert i                       OPEN-RECURSIVE
0000000452AA1BB8        980 XXXXXXX    000000040E81D128 4111302712 6k03snbuhv01s Begin PK                       OPEN
0000000452AA1BB8        980 XXXXXXX    0000000411FC64C0 1407509578 8nwyam59y9t2a Begin PK                       OPEN
0000000452AA1BB8        980 XXXXXXX    00000003E6F0C230 2181403846 5bvw0ka10b566 Begin PK                       OPEN
0000000452AA1BB8        980 XXXXXXX    00000004117E1340 2399958818 62h75aq7hsxt2 Begin PK                       OPEN
0000000452AA1BB8        980 XXXXXXX    00000003EA88E4F0 1535077424 68u7tx5dryv1h Begin PK                       OPEN
0000000452AA1BB8        980 XXXXXXX    00000003A72340A0 2292456266 gw771824a86ua Begin PK              33554458 OPEN

Once we spot the sqls we need to check all of them one by one on v$sql where if users_executing column (UEX) populated

SQL> @sqlid gw771824a86ua   ---->bingo with two users and high ELAPSED TIMES. this is a pl/sql package calling the sql below

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- --------------------------------------------------------------------------------------------------------------------------------
gw771824a86ua 2292456266 xXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

INS  CH#  UEX  PLAN_HASH   OPT_COST        CPU_MS     ELA_MS    AVG_ELA OFFLOAD IO_SAVED_% IO_ELIG_GB IO_INTER_GB       LIOS       PIOS
--- ---- ---- ---------- ----------  - ---------- ---------- ---------- ------- ---------- ---------- ----------- ---------- ----------
  2    0    2          0          0     412784696  414481976  16579.279 No             .00        .00      58.007 2653518789    3248794
  3    0    0          0          0    3215639.15 3307820.05 1653.91002 No             .00        .00       3.212 1480568056     184192

----
---- Tried for all other SQLS but thet do not have users_executing column populated
----

SQL> @sqlid 91uc7cvw2rqab   -->two users and high ELAPSED TIMES on 4th child

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- -----------------------------------------------------------------------------------------------------------------------------------
91uc7cvw2rqab 4163623243 xXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

INS  CH#  UEX  PLAN_HASH   OPT_COST       CPU_MS     ELA_MS    AVG_ELA OFFLOAD IO_SAVED_% IO_ELIG_GB IO_INTER_GB       LIOS       PIOS
--- ---- ---- ---------- ----------   ---------- ---------- ---------- ------- ---------- ---------- ----------- ---------- ----------
  2    0    0 4166729024      13081   91245882.5 91690487.2 18338.0974 No             .00        .00      11.644 3739546208     657923
  2    1    0 4166729024      13081    120056449  120491925 15061.4907 No             .00        .00      15.889 1250311783     883494
  2    2    0 4166729024      13081   64616814.8 64899824.1 12979.9648 No             .00        .00      10.564  424413836     587066
  2    3    0 4166729024      13081   58282307.7 58454183.2 19484.7277 No             .00        .00       5.442 1809526416     303954
  2    4    2 4166729024      13081   78886102.5 79233558.3 19808.3896 No             .00        .00      13.555 4163654789     756589

Thanks Michael for the information so we have 1 more method which does not need license

August 15, 2011

What happens, when you did not do a load test before release upgrades ?

Filed under: Performance — coskan @ 11:55 pm

This is what happened on 11.2.0.2 upgrade from 10.2.0.5

Picture is taken from production (tool used is for this picture is my recent favourite Database Time Monitor from Dominic Giles – creator of swingbench) today, which was upgraded to 11.2.0.2 on last saturday.

What happens here is I think self-explanatory, our system and underlying I/O is hammered with direct path reads.

Explaining this will not satisfy the customer so question is Why it is happening ?

1- It is obvious that upgrade was not tested good enough

2- Enough research about behaviour change has not been made for 10g to 11g migration

All these are political answers, real technical answer is below

Once I checked the system there were roughly 9 distinct sqlids running with many sessions,  all waiting on the same event.

The sqls were more or less  centralised on querying Users table with filtering email where there is no index and there has never been an index.  Why wasn’t system suffering from this lack of index when we are on 10GR2?

It is because 11G is now using direct path reads whenever it can for full table scans  (for more information check the references at the bottom of this table). This means when we run this query and if the table is eligible for direct path reads it will use direct path reads from disk. When you query with 2-3 sessions this is ok but when you query with 20 sessions then it becomes disaster like here and your disk run queue gets bigger and bigger

What was happening on 10GR2? Because Oracle was using scattered reads and cache when it can we were doing more logical I/O without hammering disk subsystem and this contention never revealed and because this application is third party  no index was created by us even there was unnecessary logical I/O.

Because we only functionality and vendor certifies the production on 11G we just upgraded the database.

Let me replicate the issue with simple test case on 11G by first with default direct path read behaviour and then reverting back to pre 11G behaviour by setting 10949 event at system level (ref: Dion Cho)

Test Case

—Running 20 sessions of simple select each for 10 times. Total 200 runs. (Idea taken from -Andrey S. Nikolaev -)


create table test_direct as select * from dba_source;

---single_thread.sql --- 20 lines of the code below
select  * from test_directpath2 where name like 'TEST%';

---multi_runner.sql
set echo off
set verify off
set head off:
define parallelism="&1"

spool many_threads.sql
select 'host sqlplus hr/hr @single_thread.sql &'
from dba_objects where rownum <= ¶llelism;
spool off
@many_threads.sql

---Running
@multi_runner 20

Default 11G


Pre-10G behaviour

during the issue-Only slight increase during the caching then all LIO

After the issue happened – It lasted only 1 minute

As you can see we have a sudden CPU increased due to high number of logical reads in short time but no long term I/O load,  once oracle caches it burns CPU instead waiting on I/O. Meanwhile I think DB was in maintenance window so resource manager kicked in to avoid hammering CPU–resmgr:cpu quantum event.

Here are the other comparisons between two runs

Session waits – 11G

HR@ORACOS> @swact

    SID USERN STATE   EVENT                                          SEQ# SEC_IN_STATE SQL_ID         CHN                        P1                        P2                        P3 P1TRANSL                                   BLCKNG_SID BLCKNG_SID_S BLCKNG_INST
------- ----- ------- ---------------------------------------- ---------- ------------ ------------- ---- ------------------------- ------------------------- ------------------------- ------------------------------------------ ---------- ------------ -----------
     15 HR    WAITING direct path read                                606            0 ba40q4n8w1myy    0 file number= 10           first dba= 493120         block cnt= 64                                                                   UNKNOWN
    134 HR    WAITING direct path read                                444            0 ba40q4n8w1myy    0 file number= 10           first dba= 493314         block cnt= 62                                                                   UNKNOWN
     75 HR    WAITING direct path read                                483            0 ba40q4n8w1myy    0 file number= 10           first dba= 494656         block cnt= 64                                                                   UNKNOWN
    200 HR    WAITING direct path read                                491            0 ba40q4n8w1myy    0 file number= 10           first dba= 494912         block cnt= 64                                                                   UNKNOWN
    196 HR    WAITING direct path read                                493            0 ba40q4n8w1myy    0 file number= 10           first dba= 494912         block cnt= 64                                                                   UNKNOWN
     12 HR    WAITING direct path read                                504            0 ba40q4n8w1myy    0 file number= 10           first dba= 495234         block cnt= 62                                                                   UNKNOWN
    132 HR    WAITING direct path read                                507            0 ba40q4n8w1myy    0 file number= 10           first dba= 495234         block cnt= 62                                                                   UNKNOWN
     13 HR    WAITING direct path read                                515            0 ba40q4n8w1myy    0 file number= 10           first dba= 495617         block cnt= 63                                                                   UNKNOWN
    201 HR    WAITING direct path read                                532            0 ba40q4n8w1myy    0 file number= 10           first dba= 496192         block cnt= 64                                                                   UNKNOWN
    137 HR    WAITING direct path read                                547            0 ba40q4n8w1myy    0 file number= 10           first dba= 496704         block cnt= 64                                                                   UNKNOWN
    136 HR    WAITING direct path read                                453            0 ba40q4n8w1myy    0 file number= 13           first dba= 100032         block cnt= 64                                                                   UNKNOWN
    202 HR    WAITING direct path read                                464            0 ba40q4n8w1myy    0 file number= 13           first dba= 100354         block cnt= 62                                                                   UNKNOWN
    135 HR    WAITING direct path read                                472            0 ba40q4n8w1myy    0 file number= 13           first dba= 100610         block cnt= 62                                                                   UNKNOWN
    199 HR    WAITING direct path read                                485            0 ba40q4n8w1myy    0 file number= 13           first dba= 101056         block cnt= 64                                                                   UNKNOWN
     71 HR    WAITING direct path read                                502            0 ba40q4n8w1myy    0 file number= 13           first dba= 101506         block cnt= 62                                                                   UNKNOWN
     16 HR    WAITING direct path read                                505            0 ba40q4n8w1myy    0 file number= 13           first dba= 101696         block cnt= 64                                                                   UNKNOWN
     73 HR    WAITING direct path read                                521            0 ba40q4n8w1myy    0 file number= 13           first dba= 102208         block cnt= 64                                                                   UNKNOWN
    133 HR    WAITING direct path read                                591            0 ba40q4n8w1myy    0 file number= 13           first dba= 98946          block cnt= 62                                                                   UNKNOWN
     74 HR    WAITING direct path read                                432            0 ba40q4n8w1myy    0 file number= 13           first dba= 99330          block cnt= 62                                                                   UNKNOWN
     72 HR    WAITING direct path read                                608            0 ba40q4n8w1myy    0 file number= 13           first dba= 99520          block cnt= 64                                                                   UNKNOWN
    192 HR    WORKING On CPU / runqueue                                43            0 90sp00nt0sspn    0 driver id=                #bytes= 1                 0                                                                               NOT IN WAIT
                                                                                                          0x0000000054435000

21 rows selected.

Session waits – Pre 11G


HR@ORACOS> @swact

    SID USERN STATE   EVENT                                          SEQ# SEC_IN_STATE SQL_ID         CHN                        P1                        P2                        P3 P1TRANSL                                   BLCKNG_SID BLCKNG_SID_S BLCKNG_INST
------- ----- ------- ---------------------------------------- ---------- ------------ ------------- ---- ------------------------- ------------------------- ------------------------- ------------------------------------------ ---------- ------------ -----------
    136       WAITING jobq slave wait                                  10            0                  0 0                         0                         0                                                                               NO HOLDER
     20       WAITING jobq slave wait                                  10            0                  0 0                         0                         0                                                                               NO HOLDER
     23 HR    WAITING resmgr:cpu quantum                               53            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
     25 HR    WAITING resmgr:cpu quantum                               60            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
     22 HR    WAITING resmgr:cpu quantum                               53            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    210 HR    WAITING resmgr:cpu quantum                               50            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
     14 HR    WAITING resmgr:cpu quantum                               56            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    208 HR    WAITING resmgr:cpu quantum                               53            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    206 HR    WAITING resmgr:cpu quantum                               54            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    144 HR    WAITING resmgr:cpu quantum                               58            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    141 HR    WAITING resmgr:cpu quantum                               54            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    140 HR    WAITING resmgr:cpu quantum                               50            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
     74 HR    WAITING resmgr:cpu quantum                               55            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
     81 HR    WAITING resmgr:cpu quantum                               56            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
     85 HR    WAITING resmgr:cpu quantum                               56            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    132 HR    WAITING resmgr:cpu quantum                               54            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    137 HR    WAITING resmgr:cpu quantum                               57            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
     73 HR    WORKING On CPU / runqueue                               506            0 90sp00nt0sspn    0 driver id=                #bytes= 1                 0                                                                               NOT IN WAIT
                                                                                                          0x0000000054435000

     80 HR    WORKING On CPU / runqueue                                56            0 ba40q4n8w1myy    0 driver id=                #bytes= 1                 0                                                                               NOT IN WAIT
                                                                                                          0x0000000062657100

    199 HR    WORKING On CPU / runqueue                                52            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             NOT IN WAIT
     82 HR    WORKING On CPU / runqueue                                57            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             NOT IN WAIT
     13 HR    WORKING On CPU / runqueue                                56            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             NOT IN WAIT
    205 HR    WORKING On CPU / runqueue                                54            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             NOT IN WAIT

23 rows selected.

SQL Stats -Pre-11G


HR@ORACOS> @sqlid ba40q4n8w1myy

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
ba40q4n8w1myy  297848798 select  * from test_directpath2 where name like 'TEST%'

       INS  CH#   UEX  PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS
---------- ----  ---- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ----------
         1    0     0 3436878245     2147        200          1        200        200              0      52000  5333438.4  26.667192    2103470    2102605

SQL Stats -Pre-11G  –Check Elapsed times and PIO difference


HR@ORACOS> @sqlid ba40q4n8w1myy

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
ba40q4n8w1myy  297848798 select  * from test_directpath2 where name like 'TEST%'

       INS  CH#  UEX  PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS
---------- ---- ---- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ----------
         1    0    0 3436878245     2147        200          2        200        200              0      16120 102802.028  .51401014    2104800      10513

HR@ORACOS>
Snapper -11G


HR@ORACOS> @sn 10 15
Sampling...

-- Session Snapper v3.10 by Tanel Poder @ E2SN ( http://tech.e2sn.com )

----------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                               ,         DELTA, HDELTA/SEC,    %TIME, GRAPH
----------------------------------------------------------------------------------------------------------------------
     15, HR        , STAT, session logical reads                   ,          3338,      333.8,    ===>check the difference
     15, HR        , STAT, user I/O wait time                      ,          1040,        104,
     15, HR        , STAT, non-idle wait time                      ,          1039,      103.9,
     15, HR        , STAT, non-idle wait count                     ,            53,        5.3,
     15, HR        , STAT, physical read total IO requests         ,            54,        5.4,
     15, HR        , STAT, physical read total multi block requests,            54,        5.4,
     15, HR        , STAT, physical read total bytes               ,      27869184,      2.79M,    ===>check the difference
     15, HR        , STAT, cell physical IO interconnect bytes     ,      27869184,      2.79M,
     15, HR        , STAT, consistent gets                         ,          3338,      333.8,
     15, HR        , STAT, consistent gets direct                  ,          3338,      333.8,
     15, HR        , STAT, physical reads                          ,          3338,      333.8,
     15, HR        , STAT, physical reads direct                   ,          3338,      333.8,
     15, HR        , STAT, physical read IO requests               ,            53,        5.3,
     15, HR        , STAT, physical read bytes                     ,      27344896,      2.73M,
     15, HR        , STAT, file io wait time                       ,          1010,        101,
     15, HR        , STAT, Number of read IOs issued               ,            53,        5.3,
     15, HR        , STAT, no work - consistent read gets          ,          3338,      333.8,
     15, HR        , STAT, table scan rows gotten                  ,        164188,     16.42k,
     15, HR        , STAT, table scan blocks gotten                ,          3338,      333.8,
     15, HR        , TIME, DB CPU                                  ,         70000,        7ms,      .7%, |@         |
     15, HR        , TIME, sql execute elapsed time                ,      10107298,      1.01s,   101.1%, |@@@@@@@@@@|
     15, HR        , TIME, DB time                                 ,      10107298,      1.01s,   101.1%, |@@@@@@@@@@|
     15, HR        , WAIT, direct path read                        ,       9535205,   953.52ms,    95.4%, |@@@@@@@@@@|
--  End of Stats snap 1, end=2011-08-16 00:02:14, seconds=10

-----------------------------------------------------------------------
Active% | SQL_ID          | EVENT                     | WAIT_CLASS
-----------------------------------------------------------------------
    99% | ba40q4n8w1myy   | direct path read          | User I/O
     1% | ba40q4n8w1myy   | ON CPU                    | ON CPU

--  End of ASH snap 1, end=2011-08-16 00:02:14, seconds=10, samples_taken=99

Snapper – Pre-11G  –check logical I/O per second 


HR@ORACOS> @sn 30 192
Sampling...

-- Session Snapper v3.10 by Tanel Poder @ E2SN ( http://tech.e2sn.com )

----------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                               ,         DELTA, HDELTA/SEC,    %TIME, GRAPH
----------------------------------------------------------------------------------------------------------------------
    192, HR        , STAT, opened cursors cumulative               ,            77,       2.57,
    192, HR        , STAT, opened cursors current                  ,            -1,       -.03,
    192, HR        , STAT, user calls                              ,           116,       3.87,
    192, HR        , STAT, pinned cursors current                  ,             1,        .03,
    192, HR        , STAT, session logical reads                   ,        406695,     13.56k,	  ===>check the difference
    192, HR        , STAT, CPU used when call started              ,           300,         10,
    192, HR        , STAT, CPU used by this session                ,           300,         10,
    192, HR        , STAT, DB time                                 ,          1926,       64.2,
    192, HR        , STAT, concurrency wait time                   ,             6,         .2,
    192, HR        , STAT, scheduler wait time                     ,          1471,      49.03,
    192, HR        , STAT, non-idle wait time                      ,          1479,       49.3,
    192, HR        , STAT, non-idle wait count                     ,           201,        6.7,
    192, HR        , STAT, enqueue requests                        ,             1,        .03,
    192, HR        , STAT, enqueue releases                        ,             1,        .03,
    192, HR        , STAT, consistent gets                         ,        406695,     13.56k,
    192, HR        , STAT, consistent gets from cache              ,        406695,     13.56k,
    192, HR        , STAT, consistent gets from cache (fastpath)   ,        406695,     13.56k,
    192, HR        , STAT, calls to kcmgcs                         ,           426,       14.2,
    192, HR        , STAT, calls to get snapshot scn: kcmgss       ,            39,        1.3,
    192, HR        , STAT, no work - consistent read gets          ,        406269,     13.54k,
    192, HR        , STAT, table scans (long tables)               ,            39,        1.3,
    192, HR        , STAT, table scan rows gotten                  ,      24488652,    816.29k,
    192, HR        , STAT, table scan blocks gotten                ,        406269,     13.54k,
    192, HR        , STAT, session cursor cache hits               ,            76,       2.53,
    192, HR        , STAT, parse count (total)                     ,            77,       2.57,
    192, HR        , STAT, execute count                           ,            77,       2.57,
    192, HR        , STAT, bytes sent via SQL*Net to client        ,         24169,     805.63,
    192, HR        , STAT, bytes received via SQL*Net from client  ,         19066,     635.53,
    192, HR        , STAT, SQL*Net roundtrips to/from client       ,            76,       2.53,
    192, HR        , TIME, parse time elapsed                      ,          2510,    83.67us,      .0%, |          |
    192, HR        , TIME, PL/SQL execution elapsed time           ,          4570,   152.33us,      .0%, |          |
    192, HR        , TIME, DB CPU                                  ,       3160000,   105.33ms,    10.5%, |@@        |
    192, HR        , TIME, sql execute elapsed time                ,      19257156,   641.91ms,    64.2%, |@@@@@@@   |
    192, HR        , TIME, DB time                                 ,      19757917,    658.6ms,    65.9%, |@@@@@@@   |
    192, HR        , WAIT, Disk file operations I/O                ,            38,     1.27us,      .0%, |          |
    192, HR        , WAIT, latch: cache buffers chains             ,         68244,     2.27ms,      .2%, |          |
    192, HR        , WAIT, resmgr:cpu quantum                      ,      14801390,   493.38ms,    49.3%, |@@@@@     |
    192, HR        , WAIT, SQL*Net message to client               ,           309,     10.3us,      .0%, |          |
    192, HR        , WAIT, SQL*Net message from client             ,       9924100,    330.8ms,    33.1%, |@@@@      |
    192, HR        , WAIT, events in waitclass Other               ,         11340,      378us,      .0%, |          |
--  End of Stats snap 1, end=2011-08-16 00:20:32, seconds=30

-----------------------------------------------------------------------
Active% | SQL_ID          | EVENT                     | WAIT_CLASS
-----------------------------------------------------------------------
    43% | ba40q4n8w1myy   | resmgr:cpu quantum        | Scheduler
    20% | ba40q4n8w1myy   | ON CPU                    | ON CPU

--  End of ASH snap 1, end=2011-08-16 00:20:32, seconds=30, samples_taken=87

I/O Stats – 11G–high runque size

coskan@coskan-lucid:~$ sar -d 1
Linux 2.6.32-33-generic (coskan-lucid) 	16/08/11 	_x86_64_	(4 CPU)

00:00:55          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
00:00:56       dev8-0    132.00 127200.00    104.00    964.42     21.18    130.15      7.58    100.00
00:00:56      dev8-16      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:00:56      dev8-32    105.00   1680.00     34.00     16.32      0.68      6.38      6.00     63.00
00:00:56      dev8-48      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:00:56      dev8-64     84.00   3240.00   8400.00    138.57     79.66    109.88      9.64     81.00

00:00:56          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
00:00:57       dev8-0    131.00 128320.00     64.00    980.03     22.76    136.87      7.63    100.00
00:00:57      dev8-16      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:00:57      dev8-32    158.00   2640.00      0.00     16.71      0.79      5.00      5.00     79.00
00:00:57      dev8-48      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:00:57      dev8-64     56.00      0.00  13440.00    240.00    144.52   1049.64     17.86    100.00

I/O Stats – PRe-11G  —no runque


00:11:57          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
00:11:58       dev8-0      2.00      0.00     32.00     16.00      0.03     15.00     15.00      3.00
00:11:58      dev8-16      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:11:58      dev8-32     10.00    160.00     64.00     22.40      0.04      4.00      4.00      4.00
00:11:58      dev8-48      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:11:58      dev8-64    275.00   2464.00   4736.00     26.18    144.05    193.24      3.64    100.00

00:11:58          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
00:11:59       dev8-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:11:59      dev8-16      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:11:59      dev8-32      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:11:59      dev8-48      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:11:59      dev8-64    440.00   3344.00   2984.00     14.38    146.16    295.02      2.27    100.00

There are two  solutions in this case one is  creating an index to avoid full table scan or disabling direct path read at system level which is not right at all.
Moral Of the story is TEST YOUR SYSTEM WITH SAME/SIMILAR LOAD BEFORE YOU UPGRADE !!!!!

Being certified on a platform won’t mean it will run fine
If you havent read already, these are reference posts about direct path reads on 11GR2

http://blog.tanelpoder.com/2011/08/11/full-scans-direct-path-reads-and-ora-8103-error-hacking-session-video-here-plus-itunes-podcast-address-2/

http://dioncho.wordpress.com/2009/07/21/disabling-direct-path-read-for-the-serial-full-table-scan-11g/

http://afatkulin.blogspot.com/2009/01/11g-adaptive-direct-path-reads-what-is.html

http://oraclue.com/2009/07/17/direct-path-reads-and-serial-table-scans-in-11g/

http://oracledoug.com/serendipity/index.php?/archives/1321-11g-and-direct-path-reads.html

August 13, 2011

dbms_xplan.display_sql_plan_baseline can also lie !

Filed under: Bugs, CBO, Performance, Plan Stability — coskan @ 6:40 pm

This blog post is about another buggy/funny/misleading behaviour which you MUST know if you are using baselines ( especially in a non-controlled fashion by let oracle manage them
) It looks like a long post but it is actually an easy read since nothing rocket science in the post.

All story started when one of our batches was not using the plan stored in the sql plan baseline repository.
In these cases app support check if the baseline is used by checking v$sql.sql_plan_baseline value.

If it is not used then they check the enabled, accepted baselines by using the statement below as in documentation

dbms_xplan.display_sql_plan_baseline(sql_handle=>’&sql_handle’,plan_name=>’&sql_plan’, format=>’BASIC’));

Once they find the plan_hash_value they are looking for they try to fix it and expect Oracle use the baseline which it should use in theory.

In our case it didn’t. When I checked the plan baseline was created in september 2010 and never actually used since 14 January 2011.
Fun part is some occasions same plan hash value in the baseline is being generated but Oracle is not reporting that baseline is used. But that’s the whole point of baselines having a constant plan hash value. Why oracle is not using the plan baseline even it generates the same plan with same plan hash value

We raised the issue with Oracle and oracle come back with couple of request

1- 10053 trace
2- plan baseline trace( alter session set events ‘trace [SQL_Plan_Management.*]‘;)

We provided the requests and they said if there was a change in the table structure at some point which is causing the plan can’t be reproduced. App team confirmed that there was a structural change to drop global index and recreate it as local on actually the date when the baseline was last used.
The problem is we changed a global index to local index on 14 January 2011 with an index name change, but display_sql_plan_baseline shows us a plan which uses a local index with the current name.This cannot be possible since the baseline is create on September 2010. So what was wrong?

Support could not managed to give me a decent answer or accept that it is a bug so it is time for me to create a test case and reveal the truth why this issue is happening.


HR@ORACOS> ---simple table creation
HR@ORACOS> create table baseline_test as select * from dba_objects;

Table created.

HR@ORACOS> ---index creation
create unique index baseline_test_idx_1
  2  on baseline_test(object_id);

Index created.

HR@ORACOS> ---gather table stats
HR@ORACOS> @gts baseline_test

PL/SQL procedure successfully completed.

HR@ORACOS> HR@ORACOS>
HR@ORACOS>
HR@ORACOS> ---start session level baseline capturing process
HR@ORACOS> alter session set optimizer_capture_sql_plan_baselines=TRUE;

Session altered.

HR@ORACOS> ---run the query below several times for plan capturing happens
HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> @e1

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  910wg4cc9fbwn, child number 0
-------------------------------------
select object_name from baseline_test  where object_id=5 and
data_object_id=10

Plan hash value: 3566649846

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| BASELINE_TEST       |     1 |    32 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | BASELINE_TEST_IDX_1 |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("DATA_OBJECT_ID"=10)
   2 - access("OBJECT_ID"=5)

Note
-----
   - SQL plan baseline SQL_PLAN_g64u8gfpwnfr028052575 used for this statement

25 rows selected.

HR@ORACOS> @sql_plan_baselines2 910wg4cc9fbwn

##################
#active baselines#
##################

   INST_ID   UE SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER SQL_PLAN_BASELINE              SQL_HANDLE                     PLAN_NAME                      ENABLED ACCEPTED FIXED CREATED                   LAST_EXECUTED
---------- ---- ------------- --------------- ------------ ------------------------------ ------------------------------ ------------------------------ ------- -------- ----- ------------------------- -------------------------
         1    0 910wg4cc9fbwn      3566649846            0 SQL_PLAN_g64u8gfpwnfr028052575 SYS_SQL_f313487babca3ae0       SQL_PLAN_g64u8gfpwnfr028052575 YES     YES      NO    13-AUG-11 15.45.42.000000 13-AUG-11 15.45.42.000000

Now it is time to check the plan output


HR@ORACOS> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle=>'&sql_handle',plan_name=>'&sql_plan', format=>'BASIC'));
Enter value for sql_handle: SYS_SQL_f313487babca3ae0
Enter value for sql_plan: SQL_PLAN_g64u8gfpwnfr028052575

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_f313487babca3ae0
SQL text: select object_name from baseline_test  where object_id=5 and
          data_object_id=10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_g64u8gfpwnfr028052575         Plan id: 671425909
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

Plan hash value: 3566649846

-----------------------------------------------------------
| Id  | Operation                   | Name                |
-----------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |
|   1 |  TABLE ACCESS BY INDEX ROWID| BASELINE_TEST       |
|   2 |   INDEX UNIQUE SCAN         | BASELINE_TEST_IDX_1 |
-----------------------------------------------------------

21 rows selected.

Since we have a baseline now we can change the table index structure to replicate the issue


HR@ORACOS> ---now drop the index
HR@ORACOS> drop index baseline_test_idx_1
  2  ;

Index dropped.

HR@ORACOS> create unique index baseline_test_idx_2 on baseline_test(data_object_id,object_id);

Index created.

HR@ORACOS> ---Run it couple of times for baseline generation

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> @e1

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  910wg4cc9fbwn, child number 0
-------------------------------------
select object_name from baseline_test  where object_id=5 and
data_object_id=10

Plan hash value: 3304552195

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| BASELINE_TEST       |     1 |    32 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | BASELINE_TEST_IDX_2 |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DATA_OBJECT_ID"=10 AND "OBJECT_ID"=5)

20 rows selected.

As you can see baseline no baseline is used. Lets check if a baseline captured ?


HR@ORACOS> @sql_plan_baselines2 910wg4cc9fbwn

##################
#active baselines#
##################

   INST_ID   UE SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER SQL_PLAN_BASELINE              SQL_HANDLE                     PLAN_NAME                      ENABLED ACCEPTED FIXED CREATED                   LAST_EXECUTED
---------- ---- ------------- --------------- ------------ ------------------------------ ------------------------------ ------------------------------ ------- -------- ----- ------------------------- -------------------------
         1    0 910wg4cc9fbwn      3304552195            0                                SYS_SQL_f313487babca3ae0       SQL_PLAN_g64u8gfpwnfr028052575 YES     YES      NO    13-AUG-11 15.45.42.000000 13-AUG-11 15.45.42.000000
         1    0 910wg4cc9fbwn      3304552195            0                                SYS_SQL_f313487babca3ae0       SQL_PLAN_g64u8gfpwnfr09db5a957 YES     NO       NO    13-AUG-11 15.49.54.000000

Two baselines are captured in the system, one is accepted which is the one initially created before the index recreation but the other one is not accepted ??

Now it is time to see what the plans of these baselines

HR@ORACOS> --ACCEPTED PLAN
HR@ORACOS> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle=>'&sql_handle',plan_name=>'&sql_plan', format=>'BASIC'));
Enter value for sql_handle: SYS_SQL_f313487babca3ae0
Enter value for sql_plan: SQL_PLAN_g64u8gfpwnfr028052575

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_f313487babca3ae0
SQL text: select object_name from baseline_test  where object_id=5 and
          data_object_id=10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_g64u8gfpwnfr028052575         Plan id: 671425909
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

Plan hash value: 3304552195

-----------------------------------------------------------
| Id  | Operation                   | Name                |
-----------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |
|   1 |  TABLE ACCESS BY INDEX ROWID| BASELINE_TEST       |
|   2 |   INDEX UNIQUE SCAN         | BASELINE_TEST_IDX_2 |
-----------------------------------------------------------

21 rows selected.

HR@ORACOS> ---NON ACCEPTED NEWLY GENERATED PLAN

HR@ORACOS> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle=>'&sql_handle',plan_name=>'&sql_plan', format=>'BASIC'));
Enter value for sql_handle: SYS_SQL_f313487babca3ae0
Enter value for sql_plan: SQL_PLAN_g64u8gfpwnfr09db5a957

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_f313487babca3ae0
SQL text: select object_name from baseline_test  where object_id=5 and
          data_object_id=10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_g64u8gfpwnfr09db5a957         Plan id: 2645928279
Enabled: YES     Fixed: NO      Accepted: NO      Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

Plan hash value: 3304552195

-----------------------------------------------------------
| Id  | Operation                   | Name                |
-----------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |
|   1 |  TABLE ACCESS BY INDEX ROWID| BASELINE_TEST       |
|   2 |   INDEX UNIQUE SCAN         | BASELINE_TEST_IDX_2 |
-----------------------------------------------------------

21 rows selected.

They are same ???? How come the plan can be same with the one which has different index names ??
If you are a careful reader you might have spotted The plan_id is different with same plan hash value

Lets check with ADVANCED option instead BASIC


HR@ORACOS> ----ACCEPTED BASELINE

HR@ORACOS> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle=>'&sql_handle',plan_name=>'&sql_plan', format=>'ADVANCED'));
Enter value for sql_handle: SYS_SQL_f313487babca3ae0
Enter value for sql_plan: SQL_PLAN_g64u8gfpwnfr028052575

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_f313487babca3ae0
SQL text: select object_name from baseline_test  where object_id=5 and
          data_object_id=10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_g64u8gfpwnfr028052575         Plan id: 671425909
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Outline Data from SMB:

  /*+
      BEGIN_OUTLINE_DATA
      INDEX_RS_ASC(@"SEL$1" "BASELINE_TEST"@"SEL$1" ("BASELINE_TEST"."OBJECT_ID"))
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('11.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */
--------------------------------------------------------------------------------

Plan hash value: 3304552195

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |     1 |    32 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BASELINE_TEST       |     1 |    32 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | BASELINE_TEST_IDX_2 |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1 / BASELINE_TEST@SEL$1
   2 - SEL$1 / BASELINE_TEST@SEL$1

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      INDEX_RS_ASC(@"SEL$1" "BASELINE_TEST"@"SEL$1" ("BASELINE_TEST"."DATA_OBJECT_ID"
              "BASELINE_TEST"."OBJECT_ID"))
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('11.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DATA_OBJECT_ID"=10 AND "OBJECT_ID"=5)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "OBJECT_NAME"[VARCHAR2,128]
   2 - "BASELINE_TEST".ROWID[ROWID,10]

68 rows selected.

Did you spot something ? Look carefully because I did not look carefully and wasted 1 day. Once you know how to get the higher detail of information better you should look carefully


HR@ORACOS>---NON ACCEPTED NEWLY GENERATED PLAN
hR@ORACOS> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle=>'&sql_handle',plan_name=>'&sql_plan', format=>'ADVANCED'));
Enter value for sql_handle: SYS_SQL_f313487babca3ae0
Enter value for sql_plan: SQL_PLAN_g64u8gfpwnfr028052575

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_f313487babca3ae0
SQL text: select object_name from baseline_test  where object_id=5 and
          data_object_id=10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_g64u8gfpwnfr028052575         Plan id: 671425909
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Outline Data from SMB:

  /*+
      BEGIN_OUTLINE_DATA
      INDEX_RS_ASC(@"SEL$1" "BASELINE_TEST"@"SEL$1" ("BASELINE_TEST"."OBJECT_ID"))
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('11.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */
--------------------------------------------------------------------------------

Plan hash value: 3304552195

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |     1 |    32 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BASELINE_TEST       |     1 |    32 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | BASELINE_TEST_IDX_2 |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1 / BASELINE_TEST@SEL$1
   2 - SEL$1 / BASELINE_TEST@SEL$1

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      INDEX_RS_ASC(@"SEL$1" "BASELINE_TEST"@"SEL$1" ("BASELINE_TEST"."DATA_OBJECT_ID"
              "BASELINE_TEST"."OBJECT_ID"))
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('11.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DATA_OBJECT_ID"=10 AND "OBJECT_ID"=5)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "OBJECT_NAME"[VARCHAR2,128]
   2 - "BASELINE_TEST".ROWID[ROWID,10]

68 rows selected.

If you haven’t spotted already, the problem with the accepted plan output is the difference between OUTLINE DATA FROM SMB and OUTLINE DATA sections.
There are two outline sections per baseline output. one is OUTLINE DATA FROM SMB and other is OUTLINE DATA

Now you can do line by line comparison and see the difference.

Outline Data from SMB:

/*+
BEGIN_OUTLINE_DATA
INDEX_RS_ASC(@”SEL$1″ “BASELINE_TEST”@”SEL$1″ (“BASELINE_TEST”.”OBJECT_ID”))
OUTLINE_LEAF(@”SEL$1″)
ALL_ROWS
DB_VERSION(’11.2.0.1′)
OPTIMIZER_FEATURES_ENABLE(’11.2.0.1′)
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/

Outline Data
————-

/*+
BEGIN_OUTLINE_DATA
INDEX_RS_ASC(@”SEL$1″ “BASELINE_TEST”@”SEL$1″ (“BASELINE_TEST”.”DATA_OBJECT_ID”
“BASELINE_TEST”.”OBJECT_ID”))
OUTLINE_LEAF(@”SEL$1″)
ALL_ROWS
DB_VERSION(’11.2.0.1′)
OPTIMIZER_FEATURES_ENABLE(’11.2.0.1′)
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/

What happens here is that Oracle only stores the outline information and when you run the display_sql_plan_baseline
Oracle tries to generate the plan output from the outline but problem starts there because it tries to use the index which is not existent on the system anymore so it can’t reproduce the plan and instead of giving any warning or error it generates a plan for you by lying you.

To confirm what we said lets drop the secondary index and see what dbms_xplan.display_sql_plan_baseline will generate for us for the same baseline


HR@ORACOS> ---dropping index
HR@ORACOS>
HR@ORACOS> drop index baseline_test_idx_2;

Index dropped.

------FIRST BASELINE CREATED

HR@ORACOS> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle=>'&sql_handle',plan_name=>'&sql_plan', format=>'BASIC'));
Enter value for sql_handle: SYS_SQL_f313487babca3ae0
Enter value for sql_plan: SQL_PLAN_g64u8gfpwnfr09db5a957

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_f313487babca3ae0
SQL text: select object_name from baseline_test  where object_id=5 and
          data_object_id=10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_g64u8gfpwnfr09db5a957         Plan id: 2645928279
Enabled: YES     Fixed: NO      Accepted: NO      Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

Plan hash value: 2027741861

-------------------------------------------
| Id  | Operation         | Name          |
-------------------------------------------
|   0 | SELECT STATEMENT  |               |
|   1 |  TABLE ACCESS FULL| BASELINE_TEST |
-------------------------------------------

20 rows selected.

As you can see now it generates a new plan for us for the same baseline :) What a liar

What is the output from the valid but not accepted baseline, once we dropped the index it was using


HR@ORACOS> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle=>'&sql_handle',plan_name=>'&sql_plan', format=>'BASIC'));
Enter value for sql_handle: SYS_SQL_f313487babca3ae0
Enter value for sql_plan: SQL_PLAN_g64u8gfpwnfr028052575

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_f313487babca3ae0
SQL text: select object_name from baseline_test  where object_id=5 and
          data_object_id=10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_g64u8gfpwnfr028052575         Plan id: 671425909
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

Plan hash value: 2027741861

-------------------------------------------
| Id  | Operation         | Name          |
-------------------------------------------
|   0 | SELECT STATEMENT  |               |
|   1 |  TABLE ACCESS FULL| BASELINE_TEST |
-------------------------------------------

20 rows selected.

Same issue happened again.

Lets run the same sql to see if it generates and accepts a baseline with full table scan


HR@ORACOS> ---Lets run the sql again to force a plan
HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> @e1

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  910wg4cc9fbwn, child number 0
-------------------------------------
select object_name from baseline_test  where object_id=5 and
data_object_id=10

Plan hash value: 2027741861

-----------------------------------------------------------------------------------
| Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |               |       |       |   217 (100)|          |
|*  1 |  TABLE ACCESS FULL| BASELINE_TEST |     1 |    32 |   217   (1)| 00:00:02 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("OBJECT_ID"=5 AND "DATA_OBJECT_ID"=10))

19 rows selected.

The baseline is generated as below but it is still not accepted.


HR@ORACOS> @sql_plan_baselines2 910wg4cc9fbwn

##################
#active baselines#
##################

   INST_ID   UE SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER SQL_PLAN_BASELINE              SQL_HANDLE                     PLAN_NAME                      ENABLED ACCEPTED FIXED CREATED                   LAST_EXECUTED
---------- ---- ------------- --------------- ------------ ------------------------------ ------------------------------ ------------------------------ ------- -------- ----- ------------------------- -------------------------
         1    0 910wg4cc9fbwn      2027741861            0                                SYS_SQL_f313487babca3ae0       SQL_PLAN_g64u8gfpwnfr028052575 YES     YES      NO    13-AUG-11 15.45.42.000000 13-AUG-11 15.45.42.000000
         1    0 910wg4cc9fbwn      2027741861            0                                SYS_SQL_f313487babca3ae0       SQL_PLAN_g64u8gfpwnfr09db5a957 YES     NO       NO    13-AUG-11 15.49.54.000000
         1    0 910wg4cc9fbwn      2027741861            0                                SYS_SQL_f313487babca3ae0       SQL_PLAN_g64u8gfpwnfr018837053 YES     NO       NO    13-AUG-11 15.59.15.000000

I assume You already understood the issue. This is a proper buggy/dodgy behaviour for me.

Now lets try some workaround to see how we can get over from this situation.

There are 2 ways to get around from this problem and all of them needs DBA input. I have tested all and they are working but you can go ahead and test them. I don’t want to make this blog post longer than it is so I’m only listing the options.

1- Evolve the baseline for the sql_handle so Oracle accepts the plans (will keep the invalid ones in the system)

2- Drop the invalid baselines and evolve the handle (this way you will get rid of the invalid ones)

3- Drop all baselines related with the objects altered and allow capturing to the trick for you. (Less headache than second but you may lose the good plans- still would be my approach for releases )

Even Oracle warns in documentation “SQL plan baselines cannot help in cases where an event has caused irreversible execution plan changes, such as dropping an index.” and also ” if a change in the system (such as a dropped index) causes all accepted plans to become non-reproducible, then the optimizer selects the best-cost plan. “
The did not give much information about the behaviour.

From my point of view there are two major problems which needs to be addressed for this specific table structure change case

1- Instead of showing a plan which is not same as baseline, display_sql_plan_baseline must report an error that stored baseline is not reproduce-able

2- Baselines are not invalidated there must be an invalidation flag which makes the baseline invalidated at very first time CBO can’t reproduce the plan via baseline and set enabled and accepted flag to no. Since the purging of the plans is 53 weeks by
default for non used baselines this is a needed enhancement. If a plan is accepted and not produceable with smaller cost it avoids other plans accepted.

Another problem with baselines is also covered in this clear post by Maxym Kharchenko which can be related with my first point (that very same post series from Maxym  is one of the reasons I did not continue my series since author nearly covered all -but I still have something to write) What happens if both objects are under two schemas like some development groups uses for different testing phases. One drops the index one did not and you have system level capturing on. ?

I think Oracle must address these issues by introducing another complexity for sql signatures which involves schema owner or something similar.

Long story short you might be easily fooled by baselines with the current structure. If I were you I will always check the plans with at least OUTLINE option or go for ADVANCED.

Confession:n I am always using advanced option but as usual I did not pay that much attention to the output at first point :( That was my biggest mistake there which cost 1 day for analysis. knowing and using an option but not paying attention to the output of the option.)

All tests are done on 11.2.0.1

Bonus material

For the ones who likes tracefiles this issue is shown as this

=======TRACEFILE ======

------- START SPM Plan Dump -------
SPM: failed to reproduce the plan using the following info:
  parse_schema name        : HR
  plan_baseline signature  : 17515423071774325472
  plan_baseline plan_id    : 671425909
  plan_baseline hintset    :
    hint num  1 len 37 text: OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
SPM: generated non-matching plan:
----- Explain Plan Dump -----
----- Plan Table -----

============
Plan Table
============
------------------------------------------+-----------------------------------+
| Id  | Operation          | Name         | Rows  | Bytes | Cost  | Time      |
------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT   |              |       |       |   217 |           |
| 1   |  TABLE ACCESS FULL | BASELINE_TEST|     1 |    32 |   217 |  00:00:02 |
------------------------------------------+-----------------------------------+
Predicate Information:
----------------------
1 - filter(("OBJECT_ID"=5 AND "DATA_OBJECT_ID"=10))

Content of other_xml column
===========================
  db_version     : 11.2.0.1
  parse_schema   : HR
  plan_hash      : 2027741861
  plan_hash_2    : 411267155
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
      DB_VERSION('11.2.0.1')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "BASELINE_TEST"@"SEL$1")
    END_OUTLINE_DATA
  */

------- END SPM Plan Dump -------
SPM: couldn't reproduce any enabled+accepted plan so using the cost-based plan, planId = 411267155

June 24, 2011

Plan is same, Wait profile is different ?

Filed under: Performance — coskan @ 10:42 am

Here comes the nice one, What would you check when you are on 3 node RAC where 4 different database are balanced with services and Developer of one of the DBs come and asked you why their newly deployed batch is 3 times slower on PRD comparing to QA box which was snap copied (storage snap) from production box. RAC specs are exactly same plus all parameters are same ?

PRD runtime plan


PROD> @mon_sqlid2  52dr7m8nnx5mk

DBMS_SQLTUNE.REPORT_SQL_MONITOR(SQL_ID=>'52DR7M8NNX5MK')
----------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Monitoring Report

SQL Text
------------------------------
select PKEY, ISIN from BATCH_TB where BATCH = :1 order by PKEY

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  2
 Session             :  BATCH_APP (414:13998)
 SQL ID              :  52dr7m8nnx5mk
 SQL Execution ID    :  33554695
 Execution Started   :  06/14/2011 13:25:52
 First Refresh Time  :  06/14/2011 13:25:56
 Last Refresh Time   :  06/14/2011 13:26:08
 Duration            :  16s
 Module/Action       :  JDBC Thin Client/-
 Service             :  PRD
 Program             :  JDBC Thin Client
 Fetch Calls         :  1

Binds
========================================================================================================================
| Name | Position |  Type  |                                           Value                                           |
========================================================================================================================
| :1   |        1 | NUMBER | 37755                                                                                     |
========================================================================================================================

Global Stats
=========================================================================================
| Elapsed |   Cpu   |    IO    | Application |  Other   | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) | Calls |  Gets  | Reqs | Bytes |
=========================================================================================
|      16 |    0.62 |       16 |        0.01 |     0.03 |     1 |  60425 | 3874 | 469MB |
=========================================================================================

SQL Plan Monitoring Details (Plan Hash Value=4053290152)
====================================================================================================================================================
| Id |      Operation       |   Name   |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read | Read  | Activity |    Activity Detail    |
|    |                      |          | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs | Bytes |   (%)    |      (# samples)      |
====================================================================================================================================================
|  0 | SELECT STATEMENT     |          |         |       |           |        |     1 |          |      |       |          |                       |
|  1 |   SORT ORDER BY      |          |     347 | 13216 |           |        |     1 |          |      |       |          |                       |
|  2 |    TABLE ACCESS FULL | BATCH_TB |     347 | 13215 |        17 |     +1 |     1 |        0 | 2996 | 370MB |   100.00 | direct path read (17) |
====================================================================================================================================================

QA runtime plan



QA> @mon_sqlid2 52dr7m8nnx5mk

DBMS_SQLTUNE.REPORT_SQL_MONITOR(SQL_ID=>'52DR7M8NNX5MK')
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Monitoring Report

SQL Text
------------------------------
select PKEY, ISIN from BATCH_TB where BATCH = :1 order by PKEY

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  3
 Session             :  BATCH_APP (621:853)
 SQL ID              :  52dr7m8nnx5mk
 SQL Execution ID    :  50331656
 Execution Started   :  06/14/2011 11:57:46
 First Refresh Time  :  06/14/2011 11:57:50
 Last Refresh Time   :  06/14/2011 11:57:52
 Duration            :  6s
 Module/Action       :  JDBC Thin Client/-
 Service             :  QA
 Program             :  JDBC Thin Client
 Fetch Calls         :  1

Binds
========================================================================================================================
| Name | Position |  Type  |                                           Value                                           |
========================================================================================================================
| :1   |        1 | NUMBER | 37755                                                                                     |
========================================================================================================================

Global Stats
===========================================================================
| Elapsed |   Cpu   |    IO    | Cluster  | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls |  Gets  | Reqs | Bytes |
===========================================================================
|    6.22 |    1.56 |     4.14 |     0.53 |     1 |  59621 |  966 | 107MB |
===========================================================================

SQL Plan Monitoring Details (Plan Hash Value=4053290152)
=========================================================================================================================================================
| Id |      Operation       |   Name   |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read | Read  | Activity |      Activity Detail       |
|    |                      |          | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs | Bytes |   (%)    |        (# samples)         |
=========================================================================================================================================================
|  0 | SELECT STATEMENT     |          |         |       |           |        |     1 |          |      |       |          |                            |
|  1 |   SORT ORDER BY      |          |     338 | 12544 |           |        |     1 |          |      |       |          |                            |
|  2 |    TABLE ACCESS FULL | BATCH_TB |     338 | 12543 |         6 |     +1 |     1 |        0 |  435 |  47MB |   100.00 | Cpu (2)                    |
|    |                      |          |         |       |           |        |       |          |      |       |          | db file scattered read (4) |
=========================================================================================================================================================

(I hope you like the posh sql monitoring output which I really enjoy to use. I have to admit, I wasn’t the big fan of sql monitoring on GUI-just because of the speed of OEM- till I discovered the sqlplus options like the ones above, and now I can’t stop myself from using it and it is very good alternative to trace which is expensive by all means on PRD )

From the ouput above, you can see PRD database is doing direct path reads and QA database is using db_file_scattered reads and PRD is 3 times slower for each run. Somehow oracle favours doing 11G direct path implementation for full table scan (Links for direct path Doug Burns,Alex Fatkulin, Dion Cho ) on PROD. If you read the links you can see that this has to do something with the memory the buffer cache and small table threshold (for more information about this threshold Charles Hooper-great discussion with Jonathan Lewis on comments, Tanel Poder ).

Initially when I checked the output, I saw something weird which is against our policy for this DB. We configured the user services to run on Node-2 only and somehow on QA query runs on Node-3 (instance id), I double checked this with developer and we found out that on test app server they are not using services as service name but instead they use instance name, so he is able to run query on node-3 on QA. I thought this might be the case since Oracle instance is not highly active in terms of memory usage on node-3 so Oracle might favour using expensive buffer cache reads instead of direct path reads. This was a goodish as initial shot but ended up disappointment because when I test the same query on node-2 results were again same. Wasted 5 important minutes :)

After losing 5 minutes on initial shot, now it is time to be more scientific, Lets check what the “documented” memory parameters are

===================QA Parameters ============================

QA> show parameter sga_

NAME                                 TYPE        VALUE
------------------------------------ ----------- -------------
sga_max_size                         big integer 10G
sga_target                           big integer 8G
QA> show parameter pga_

NAME                                 TYPE        VALUE
------------------------------------ ----------- -------------
pga_aggregate_target                 big integer 6435M
QA> show parameter db_cache_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- -------------
db_cache_size                        big integer 0
QA> show parameter shared_pool

NAME                                 TYPE        VALUE
------------------------------------ ----------- -------------
shared_pool_reserved_size            big integer 70464307
shared_pool_size                     big integer 0
QA> show parameter memory

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------
hi_shared_memory_address             integer     0
memory_max_target                    big integer 0
memory_target                        big integer 0
shared_memory_address                integer     0




===================PRD Parameters ============================

PROD> show parameter sga_

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------------------
sga_max_size                         big integer 10G
sga_target                           big integer 8G
PROD> show parameter pga_

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------------------
pga_aggregate_target                 big integer 6435M

PROD> show parameter db_cache_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------------------
db_cache_size                        big integer 0
PROD> show parameter shared_pool

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------------------
shared_pool_reserved_size            big integer 291923558
shared_pool_size                     big integer 0
PROD> show parameter memory

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------------------
hi_shared_memory_address             integer     0
memory_max_target                    big integer 0
memory_target                        big integer 0
shared_memory_address                integer     0
PROD>


This output told me that we have everything same in terms of static parameters, lets check what the small table threshold —> using Tanel Poders pd script for hidden parameters (needs sys access)


QA> @pd small_table_threshold

NAME                                     VALUE                          ISDEFAULT DESCRIPTION
---------------------------------------- ------------------------------ --------- -------------------------------------------------------
_small_table_threshold                   15598                          TRUE      lower threshold level of table size for direct reads


PROD> @pd small_table_threshold

NAME                                     VALUE                          ISDEFAULT DESCRIPTION
---------------------------------------- ------------------------------ --------- ------------------------------------------------------
_small_table_threshold                   5830                           TRUE      lower threshold level of table size for direct reads

Lets check if the small_table_threshold*5 (check Tanels answer on the oracle_l question ) is bigger than number of blocks of the BATCH_TBS which is probably the decision maker for favouring direct path reads

=============================QA ====================================

---number of blocks of BATCH_TB table 

QA> select blocks from dba_segments where segment_name='BATCH_TB';

    BLOCKS
----------
     60416


---5 times small_table_threshold 

QA> select 15598 * 5 from dual;

   15598*5
----------
     77990



=============================PRD ====================================

---number of blocks of BATCH_TB table 

PRD> select blocks from dba_segments where segment_name='BATCH_TB';

    BLOCKS
----------
     61440


---5 times small_table_threshold 

PRD> select 5830 * 5 from dual;

    5830*5
----------
     29150

As you can see we find the root cause but I don’t think developer really cares about the root cause. All he need is a an answer Why they hell this code is working on QA and how can we avoid this problem ?

To answer his questions we need to look at the formula of small_table_threshold which is %2 of the buffer cache size. Which means our buffer cache is bigger on QA. Lets check

=============================QA ====================================


QA> select 0.02*current_size/8192 from v$sga_dynamic_components 
  2 where component='DEFAULT buffer cache';

0.02*CURRENT_SIZE/8192
----------------------
              15482.88



QA>  select component,current_size/1024/1024 current_size_mb
  2  from v$sga_dynamic_components;

COMPONENT                   CURRENT_SIZE_MB
--------------------------- ---------------
shared pool                            1888
large pool                               32
java pool                               160
streams pool                              0
DEFAULT buffer cache                   6048
KEEP buffer cache                         0
RECYCLE buffer cache                      0
DEFAULT 2K buffer cache                   0
DEFAULT 4K buffer cache                   0
DEFAULT 8K buffer cache                   0
DEFAULT 16K buffer cache                  0
DEFAULT 32K buffer cache                  0
Shared IO Pool                            0
ASM Buffer Cache                          0

14 rows selected.


=============================PRD ====================================


PRD> select 0.02*current_size/8192 from v$sga_dynamic_components
  2  where component='DEFAULT buffer cache';

0.02*CURRENT_SIZE/8192
----------------------
                5324.8

PRD> select component,current_size/1024/1024 current_size_mb
  2  from v$sga_dynamic_components;

COMPONENT                    CURRENT_SIZE_MB
---------------------------- ---------------
shared pool                             5952
large pool                                64
java pool                                 32
streams pool                               0
DEFAULT buffer cache                    2080
KEEP buffer cache                          0
RECYCLE buffer cache                       0
DEFAULT 2K buffer cache                    0
DEFAULT 4K buffer cache                    0
DEFAULT 8K buffer cache                    0
DEFAULT 16K buffer cache                   0
DEFAULT 32K buffer cache                   0
Shared IO Pool                             0
ASM Buffer Cache                           0


As you can see %2 of buffer cache formula is true and PRD buffer cache is 3 times less than QA buffer cache.
Why this can be ? Here is the answer


QA> select count(*) from v$sql;

COUNT(*)
———-
26135

PRD> select count(*) from v$sql;

COUNT(*)
———-
68711

We are running 3 times less number of sqls on QA which leads us having 1.8GB shared pool enough on QA but not enough on PRD.
This causes shrink of buffer cache and makes us think we have visually “SAME” environment but end up realistically “DIFFERENT” environments.

My solution for the problem was creating a better index on the table which needed to be there anyway (cannot touch system wide PRD setting):)
The real and long term solution to avoid these kind of QA-PRD different problems is in my opinion having bare minimums by setting db_cache_size and shared_pool_size
and not letting Oracle to decide to shrink below real production levels.

Hope you enjoyed as much as I enjoyed to diagnose it.

!!Make sure you read all the links in this post they are all informative

Note: I needed to stop my upgrade series but will continue as soon as possible :( Really sorry if someone is waiting an input from me.

mon_sqlid2 script is like follow

 set long 10000000
 set longchunksize 10000000
 select dbms_sqltune.report_sql_monitor (sql_id=>'&1') from dual;

pd script to get hidden parameters (From Tanel Poder)

col pd_name head NAME for a40
col pd_value head VALUE for a30
column pd_descr heading DESCRIPTION format a55 word_wrap

select n.ksppinm pd_name, c.ksppstvl pd_value, ksppstdf isdefault,n.ksppdesc pd_descr
from x$ksppi n, x$ksppcv c
where n.indx=c.indx
and (
   lower(n.ksppinm) like lower('&1') 
   or lower(n.ksppdesc) like lower('&1')
);

March 4, 2011

Plan stability through Upgrade to 11G-Why is my plan changed?-extra nested loop

Filed under: Bugs, Performance, Plan Stability — coskan @ 1:30 pm

After you upgrade to 11G from lower versions you might realize that some of your plans involving nested loops are changed and you started to see extra nested loops.
According to documentation this is because of the “new” implementation for Nested Loop Joins.

Document says
Oracle Database 11g introduces a new implementation for nested loop joins. As a result, execution plans that include nested loops might appear different
than they did in previous releases of Oracle Database. Both the new implementation and the original implementation for nested loop
joins are possible in Oracle Database 11g. So, when analyzing execution plans, it is important to understand that the number of NESTED LOOPS join row sources
might be different.

Oracle Database 11g introduces a new implementation for nested loop joins to reduce overall latency for physical I/O.
When an index or a table block is not in the buffer cache and is needed to process the join, a physical I/O is required.
Oracle Database 11g can batch multiple physical I/O requests and process them using a vector I/O instead of processing them one at a time.

As part of the new implementation for nested loop joins, two NESTED LOOPS join row sources might appear in the execution plan where only one
would have appeared in prior releases. In such cases, Oracle Database allocates one NESTED LOOPS join row source to join the values from the table
on the outer side of the join with the index on the inner side. A second row source is allocated to join the result of the first join,
which includes the rowids stored in the index, with the table on the inner side of the join.

Here are the plan differences between new implementation and old one.

Query of the plan

select /*+ use_nl(t2) full(t2) gather_plan_statistics*//*test_650*/ * from t2,t1
where t1.id=t2.id
and t2.id>650000;

for 10.2.0.4


HR@ORACOS> @ofe
Enter value for ofe: 10.2.0.4

Session altered.

HR@ORACOS> @test_e

Explained.

HR@ORACOS> @e

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1708821971

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |   350K|    78M|   703K  (1)| 00:58:40 |
|   1 |  NESTED LOOPS                |           |   350K|    78M|   703K  (1)| 00:58:40 |
|*  2 |   TABLE ACCESS FULL          | T2        |   350K|    39M|  3446   (1)| 00:00:18 |
|   3 |   TABLE ACCESS BY INDEX ROWID| T1        |     1 |   118 |     2   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | T1_IDX_ID |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T2"."ID">650000)
   4 - access("T1"."ID"="T2"."ID")
       filter("T1"."ID">650000)

18 rows selected.

HR@ORACOS>

For 11.2.0.1


HR@ORACOS> @ofe
Enter value for ofe: 11.2.0.1

Session altered.

HR@ORACOS> @test_e

Explained.

HR@ORACOS> @e

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3217821761

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |   350K|    78M|   703K  (1)| 00:58:40 |
|   1 |  NESTED LOOPS                |           |       |       |            |          |
|   2 |   NESTED LOOPS               |           |   350K|    78M|   703K  (1)| 00:58:40 |
|*  3 |    TABLE ACCESS FULL         | T2        |   350K|    39M|  3446   (1)| 00:00:18 |
|*  4 |    INDEX UNIQUE SCAN         | T1_IDX_ID |     1 |       |     1   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| T1        |     1 |   118 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("T2"."ID">650000)
   4 - access("T1"."ID"="T2"."ID")
       filter("T1"."ID">650000)

19 rows selected.

According to me, this change brought extra complexity to read the plans. NLJ plans prior to 11G are way easier to read and bear in mind this example is just a simple 2 table join.
Once you query has got crazy number of joins it gets harder and harder to read. One thing you can spot is that the cost is “same” which is very good for plan change analysis.
According to my experience, %99 of same cost plan changes are because of this implementation and this made me exclude same cost-different plan_hash_value plans from comparisons.

On the last post, I already gave some information about the controlling hidden CBO parameter _nlj_batching_enabled.
Better I repeat here, this parameter has 2 values, 0 to disable the behaviour, 1 (which is default) to enable the behavior and this parameter is there since 11GR1.

When I interpret the sentence “new implementation for nested loop joins to reduce overall latency for physical I/O”. I expect some improvement with runtime like the effect of bulk collect or changing the array size of your sessions. But is my assumption true ?
I have doubts about this definition because in reality I do not feel any difference as interpreted from the official definition.
Nothing got faster in our production environment related with this change.

To prove my experience, I build a test case which you can find in the bottom of the page.

My test environment is 11.2.0.1 on Ubuntu with 1.5G memory target on a machine with 4 cores.

For this test I have 3 tables all of which have ID columns and have an index on ID columns

T1=100 million rows (I think I was too generous)
T2=1 million rows
T3=50K rows -derived from T2

First query is for small data set which I can do buffer cache read after the initial run to test non phsical I/O performance

select /*+ use_nl(t3) full(t3) gather_plan_statistics*//*test_999550*/ * from t3,t1
where t1.id=t3.id;

Runtime stats and plan statistics for this sql in 11.2.0.1 with _nlj_batching_enabled=0

49999 rows selected.

Elapsed: 00:00:01.51

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      52395  consistent gets
       1842  physical reads
          0  redo size
    1781536  bytes sent via SQL*Net to client
       1612  bytes received via SQL*Net from client
        101  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      49999  rows processed

Plan hash value: 692324935

------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      0 |        |      0 |00:00:00.01 |       0 |      0 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |      1 |  49999 |00:00:00.83 |   52395 |   1842 |
|   2 |   NESTED LOOPS              |           |      1 |  50143 |  99999 |00:00:04.65 |    2396 |    978 |
|   3 |    TABLE ACCESS FULL        | T3        |      1 |  49999 |  49999 |00:00:00.11 |     949 |    849 |
|*  4 |    INDEX UNIQUE SCAN        | T1_IDX_ID |  49999 |      1 |  49999 |00:00:00.16 |    1447 |    129 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."ID"="T3"."ID")

Runtime stats and plan statistics for this sql in 11.2.0.1 with _nlj_batching_enabled=1

select /*+ use_nl(t3) full(t3) gather_plan_statistics*//*test_999550*/ * from t3,t1
  2  where t1.id=t3.id;

49999 rows selected.

Elapsed: 00:00:01.40

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      52395  consistent gets
       1842  physical reads
          0  redo size
    1781536  bytes sent via SQL*Net to client
       1612  bytes received via SQL*Net from client
        101  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      49999  rows processed

Plan hash value: 198103700

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |      1 |        |  49999 |00:00:00.74 |   52395 |   1842 |
|   1 |  NESTED LOOPS                |           |      1 |        |  49999 |00:00:00.74 |   52395 |   1842 |
|   2 |   NESTED LOOPS               |           |      1 |  50143 |  49999 |00:00:00.27 |    2396 |    978 |
|   3 |    TABLE ACCESS FULL         | T3        |      1 |  49999 |  49999 |00:00:00.07 |     949 |    849 |
|*  4 |    INDEX UNIQUE SCAN         | T1_IDX_ID |  49999 |      1 |  49999 |00:00:00.15 |    1447 |    129 |
|   5 |   TABLE ACCESS BY INDEX ROWID| T1        |  49999 |      1 |  49999 |00:00:00.22 |   49999 |    864 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."ID"="T3"."ID")

Second query is for bigger data set which always do some physical read for T2

select /*+ use_nl(t2) full(t2) gather_plan_statistics*//*test_650*/ * from t2,t1
where t1.id=t2.id
and t2.id>650000;

Runtime stats and plan statistics for this sql in 11.2.0.1 with _nlj_batching_enabled=0


select /*+ use_nl(t2) full(t2) gather_plan_statistics*//*test_650*/ * from t2,t1
where t1.id=t2.id
and t2.id>650000;
  2    3

350000 rows selected.

Elapsed: 00:00:10.11

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     377819  consistent gets
      23736  physical reads
          0  redo size
   12606669  bytes sent via SQL*Net to client
       8212  bytes received via SQL*Net from client
        701  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     350000  rows processed

Plan hash value: 1010637246

------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      0 |        |      0 |00:00:00.01 |       0 |      0 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |      1 |    350K|00:00:04.91 |     377K|  23736 |
|   2 |   NESTED LOOPS              |           |      1 |    350K|    700K|00:01:46.31 |   27819 |  17752 |
|*  3 |    TABLE ACCESS FULL        | T2        |      1 |    350K|    350K|00:00:01.39 |   17643 |  16951 |
|*  4 |    INDEX UNIQUE SCAN        | T1_IDX_ID |    350K|      1 |    350K|00:00:01.05 |   10176 |    801 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("T2"."ID">650000)
   4 - access("T1"."ID"="T2"."ID")
       filter("T1"."ID">650000)

Runtime stats and plan statistics for this sql in 11.2.0.1 with _nlj_batching_enabled=1

350000 rows selected.

Elapsed: 00:00:10.00

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     377819  consistent gets
      23736  physical reads
          0  redo size
   12606669  bytes sent via SQL*Net to client
       8212  bytes received via SQL*Net from client
        701  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     350000  rows processed

Plan hash value: 3217821761

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |      1 |        |    350K|00:00:04.67 |     377K|  23736 |
|   1 |  NESTED LOOPS                |           |      1 |        |    350K|00:00:04.67 |     377K|  23736 |
|   2 |   NESTED LOOPS               |           |      1 |    350K|    350K|00:00:02.54 |   27819 |  17752 |
|*  3 |    TABLE ACCESS FULL         | T2        |      1 |    350K|    350K|00:00:01.02 |   17643 |  16951 |
|*  4 |    INDEX UNIQUE SCAN         | T1_IDX_ID |    350K|      1 |    350K|00:00:01.01 |   10176 |    801 |
|   5 |   TABLE ACCESS BY INDEX ROWID| T1        |    350K|      1 |    350K|00:00:01.47 |     350K|   5984 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("T2"."ID">650000)
   4 - access("T1"."ID"="T2"."ID")
       filter("T1"."ID">650000)

On the last post I already gave some information about the controlling parameter _nlj_batching_enabled so I change only this parameter to do my test and “keep OFE 11.2.0.1″

I ran these queries 4*50 times

1- _nlj_batching_enabled=1 (11G behaviour), flush the buffer cache twice and flush the cursor before first run, and run the statement 50 times.
2- _nlj_batching_enabled=1 (11G behaviour), flush the cursor at first run, flush the buffer cache twice before “each run” and run the statement 50 times to force phsical read all the time
3- _nlj_batching_enabled=0 (pre 11G behaviour), flush the buffer cache twice and flush the cursor before first run, and run the statement 50 times.
4- _nlj_batching_enabled=0 (pre 11G behaviour), flush the cursor at first run, flush the buffer cache twice before “each run” and run the statement 50 times to force phsical read all the time

After each run I snap the statistics from v$sql for the related sql and then do the comparison afterwards

Here are the results -( AVG1 includes first run for the step 1 and 3 which does both phsical and buffer cache read first run is not included on other calculations)

New behavior was faster in my Small dataset test.
I only consider median and for buffer cache reads it 10 milisecond less for CPU and 25 miliseconds less for Total ELA.
In Percentage we can say ~%5 for CPU and %6 for Total ELA
When it comes to phsical reads again 10 milisecond less for CPU and 11 miliseconds less for Total ELA
In percentage improvement is ~%4 for CPU and ~%1.5 for Total ELA

I was initially expecting bigger dataset will be again faster because we will do more phsical read but
This time new behaviour is slower for big dataset
For buffer cache reads 20 miliseconds slower for CPU and ~3 miliseconds slower for total ELA
In percentage degredation is ~%1.5 for CPU and ~%0.06 for Total ELA
For phsical reads 40 miliseconds slower for CPU and 7.5 miliseconds slower for total ELA
In percentage degredation is ~%2.5 for CPU and ~%0.14 for Total ELA

When I interpret these results I have the gut feeling that this is not like what is written in the documentation.
It looks like response time wise behavior is nearly same. From this output one  can say either CBO engineers wanted to make it harder to read the plans or maybe they wanted to be more honest to show what was always there.
Better optimizer group write a blog post about it rather than me :)

There is a bug note which also supports the idea that its been there since 9i

Bug 9919556: OFN DB: CUSTOMER INTERFACE PROGRAM TAKING LONGER TIME AFTER 11G R2 UPGRADE

There are also some other bugs related with dumps and wrong results reported for this behavior which is inconsistent with the idea that its been there for a long time.
Bugs disappeared when behavior is disabled and revert back to original so it loooks like has some effect somewhere but not necesarrily on performance as it can be interpreted??

Now its time for bonus material, probably careful eyes already spotted that the plan with _nlj_batching_enabled=0 is different than plan would be generated on 10.2.04 and also
there is a problem with runtime A-time values on 3th step of the plans when _nlj_batching_enabled=0 in 11.2.0.1. They are basically reported way higher than actual elapsed time.

First let me show you how actual pre 11G plan and reported A-Time values are

(to get this plan I flush buffer cache twice,flush the sql from shared pool, set ofe to 10.2.0.4 instead of _nlj_batching_enabled=0 on OFE=11.2.0.1)

Small Dataset


HR@ORACOS> @test

49999 rows selected.

Elapsed: 00:00:01.44

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      52395  consistent gets
       1842  physical reads
          0  redo size
    1781536  bytes sent via SQL*Net to client
       1613  bytes received via SQL*Net from client
        101  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      49999  rows processed

Plan hash value: 225503291

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |      1 |        |  49999 |00:00:00.74 |   52395 |   1842 |
|   1 |  NESTED LOOPS                |           |      1 |  50143 |  49999 |00:00:00.74 |   52395 |   1842 |
|   2 |   TABLE ACCESS FULL          | T3        |      1 |  49999 |  49999 |00:00:00.05 |     949 |    849 |
|   3 |   TABLE ACCESS BY INDEX ROWID| T1        |  49999 |      1 |  49999 |00:00:00.44 |   51446 |    993 |
|*  4 |    INDEX UNIQUE SCAN         | T1_IDX_ID |  49999 |      1 |  49999 |00:00:00.15 |    1447 |    129 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."ID"="T3"."ID")

Big Dataset


HR@ORACOS> @test

350000 rows selected.

Elapsed: 00:00:10.25

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     377819  consistent gets
      23736  physical reads
          0  redo size
   12606669  bytes sent via SQL*Net to client
       8213  bytes received via SQL*Net from client
        701  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     350000  rows processed

Plan hash value: 1708821971

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |      1 |        |    350K|00:00:04.45 |     377K|  23736 |
|   1 |  NESTED LOOPS                |           |      1 |    350K|    350K|00:00:04.45 |     377K|  23736 |
|*  2 |   TABLE ACCESS FULL          | T2        |      1 |    350K|    350K|00:00:01.03 |   17643 |  16951 |
|   3 |   TABLE ACCESS BY INDEX ROWID| T1        |    350K|      1 |    350K|00:00:02.81 |     360K|   6785 |
|*  4 |    INDEX UNIQUE SCAN         | T1_IDX_ID |    350K|      1 |    350K|00:00:01.02 |   10176 |    801 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T2"."ID">650000)
   4 - access("T1"."ID"="T2"."ID")
       filter("T1"."ID">650000)

As you can see timings are now more accurate and Nested Loop step is now changed its location in the plan.
I don’t know why A-time is reported wrong but thanks to the methods I described in earlier methods I can say, What is causing the plan change difference like below


HR@ORACOS> alter session set "_nlj_batching_enabled"=0;  ===> to revert back pre 11G behavior

Session altered.

HR@ORACOS>
HR@ORACOS> @build_optimizer_bugfix_test_e                ====> to find out which bugfix may have caused
Enter value for optimizer_feature_version: 11.1.0.6      ====> I alread found out it changed after 11.1.0.6
HR@ORACOS> @optimizer_bugfix_test_e

PLAN_HASH_VALUE          BUGFIX
-------------------- ----------
225503291               5585313
692324935               6439032
                        4279274
                        4370840
                        4386734
                        4507997
			....
			....
			....

39 rows selected.

HR@ORACOS> @bugfixg
Enter value for bugno: 5585313

     BUGNO      VALUE SQL_FEATURE               DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE
---------- ---------- ------------------------- ---------------------------------------------------------------- ------------------------- -
   5585313          1 QKSFM_EXECUTION_5585313   enable index prefetching for NLJ with unique scan as inner acces 11.1.0.6

Since I know what is causing the plan changes for nested loop now its time for me to go ahead and test if OFE=10.2.0.4 is better for the tests I run above.

Here are the comparison between OFE 11.2.0.1 and OFE 10.2.0.4 (instead of setting _nlj_batching_enabled=0 while OFE is still 11.2.0.1 )

This time 10.2.0.4 results were better for both small and big dataset
for small dataset
For buffer cache reads, 10 miliseconds slower for CPU and ~3 miliseconds slower for total ELA
In percentage degredation is ~%7 for CPU and ~%0.06 for Total ELA
For phsical reads 10 miliseconds slower for CPU and ~19 miliseconds slower for total ELA
In percentage degredation is ~%7 for CPU and ~%8 for Total ELA

for big dataset
For buffer cache reads 70 miliseconds slower for CPU and ~65 miliseconds slower for total ELA
In percentage degredation is ~%5 for CPU and ~%1.3 for Total ELA
For phsical reads 80 miliseconds slower for CPU and 120 miliseconds slower for total ELA
In percentage degredation is ~%4.8 for CPU and ~%2.2 for Total ELA

As a last word I can’t say my tests are best way to measure, however results are not looking consistent with what is written in the official documentation.
I think this is a new feature that will effect your plans after you upgrade your DB to 11G, but it will not necessarily make any perceivable difference to your response times.
In my humble opinion as long as the cost of your queries are same with the new implementation in place,
you can ignore the existence of it and never bother to change the parameter unless support says so.

Please feel free to comment if you have anything to say on this one and if you spot anything wrong in my analysis.

 

Source Code;
Test script
Full Excell Sheet for time comparisons

<–Previous Post in This Series —————- Next Post in this series–>

February 17, 2011

Plan Stability Through Upgrade-Why is my plan changed?-new optimizer parameters

Filed under: CBO, Performance, Plan Stability — coskan @ 8:09 pm

Every Oracle version, developers at Oracle add or change some of the undocumented parameters to do better optimization. Most of the time these new optimizations
works fine but from time to time they have a negative effect for the generated plan which causes post upgrade slowness. When you change OFE (optimizer_features_enable)
from 10.2.0.4 to 11.2.0.1 Oracle changes value of 33 parameters !!! Because they are undocumented, unless an Oracle Scientist reveals what they are actually doing
or note on MOS explaining the parameter, their effects to the plans are not that clear.

To overcome this problem I again wrote couple of simple scripts to automate the process of what is actually changing our plan. They really helped me during the upgrades
and understanding of plans. Again purpose of the post is not about telling deeply how these parameters are effecting plans but how we can find what parameter is effecting.

First script is

build_opt_param_table: It needs to be run as root or a user which has read access to x$ksppi and x$ksppcv tables.
What this script does is, it accepts a base OFE version then creates a table (optimizer_parameters) for all the parameters for given OFE version
and OFE versions above. This script is the prerequsite for other scripts in this post because they will all use optimizer_parameters table.
!!! In the script it creates the public synonym and grants select to dba.

First lets build the optimizer_parameters table

SQL> @build_opt_param_table
Enter value for optimizer_feature_version: 10.2.0.4
SQL> grant select on optimizer_parameters to DBA;

Grant succeeded.

SQL> create public synonym optimize_parameters for optimizer_parameters;

Synonym created.

SQL> select ofe,count(*) from optimizer_parameters group by OFE;

OFE                 COUNT(*)
----------------- ----------
10.2.0.4                2399
10.2.0.5                2399
11.1.0.6                2399
11.1.0.7                2399
11.2.0.1                2399

Once our table is generated we can see the differences between OFE versions

before_after_opt_param: This script is just checking which values are different for given to OFE versions

Changes between 10.2.0.4 to 10.2.0.5

SQL> @before_after_opt_param
Enter value for before_ofe: 10.2.0.4
Enter value for after_ofe: 10.2.0.5

PD_NAME                                            BEFORE_OFE_VALUE     AFTER_OFE_VALUE      DESCRIPTION
-------------------------------------------------- -------------------- -------------------- ----------------------------------------------------
optimizer_features_enable                          10.2.0.4             10.2.0.5             optimizer plan compatibility parameter
_optimizer_undo_cost_change                        10.2.0.4             10.2.0.5             optimizer undo cost change

Changes between 10.2.0.4 to 11.1.0.6

SQL> @before_after_opt_param
Enter value for before_ofe: 10.2.0.4
Enter value for after_ofe: 11.1.0.6

PD_NAME                                            BEFORE_OFE_VALUE     AFTER_OFE_VALUE      DESCRIPTION
-------------------------------------------------- -------------------- -------------------- ----------------------------------------------------------------------------------------------------
optimizer_features_enable                          10.2.0.4             11.1.0.6             optimizer plan compatibility parameter
_replace_virtual_columns                           FALSE                TRUE                 replace expressions with virtual columns
_optimizer_undo_cost_change                        10.2.0.4             11.1.0.6             optimizer undo cost change
_optimizer_null_aware_antijoin                     FALSE                TRUE                 null-aware antijoin parameter
_optimizer_native_full_outer_join                  OFF                  FORCE                execute full outer join using native implementaion
_optimizer_multi_level_push_pred                   FALSE                TRUE                 consider join-predicate pushdown that requires multi-level pushdown to base table
_optimizer_improve_selectivity                     FALSE                TRUE                 improve table and partial overlap join selectivity computation
_optimizer_group_by_placement                      FALSE                TRUE                 consider group-by placement optimization
_optimizer_extended_stats_usage_control            255                  240                  controls the optimizer usage of extended stats
_optimizer_extended_cursor_sharing_rel             NONE                 SIMPLE               optimizer extended cursor sharing for relational operators
_optimizer_extend_jppd_view_types                  FALSE                TRUE                 join pred pushdown on group-by, distinct, semi-/anti-joined view
_optimizer_enable_extended_stats                   FALSE                TRUE                 use extended statistics for selectivity estimation
_optimizer_enable_density_improvements             FALSE                TRUE                 use improved density computation for selectivity estimation
_optimizer_adaptive_cursor_sharing                 FALSE                TRUE                 optimizer adaptive cursor sharing
_nlj_batching_enabled                              0                    1                    enable batching of the RHS IO in NLJ
_first_k_rows_dynamic_proration                    FALSE                TRUE                 enable the use of dynamic proration of join cardinalities
_bloom_pruning_enabled                             FALSE                TRUE                 Enable partition pruning using bloom filtering

17 rows selected.

Changes between 11.1.0.6 to 11.1.0.7

SQL> @before_after_opt_param
Enter value for before_ofe: 11.1.0.6
Enter value for after_ofe: 11.1.0.7

PD_NAME                                            BEFORE_OFE_VALUE     AFTER_OFE_VALUE      DESCRIPTION
-------------------------------------------------- -------------------- -------------------- ----------------------------------------------------------------------------------------------------
optimizer_features_enable                          11.1.0.6             11.1.0.7             optimizer plan compatibility parameter
_optimizer_undo_cost_change                        11.1.0.6             11.1.0.7             optimizer undo cost change
_optimizer_extended_stats_usage_control            240                  224                  controls the optimizer usage of extended stats
_bloom_folding_enabled                             FALSE                TRUE                 Enable folding of bloom filter

Changes between 11.1.0.7 to 11.2.0.1

SQL> @before_after_opt_param
Enter value for before_ofe: 11.1.0.7
Enter value for after_ofe: 11.2.0.1

PD_NAME                                            BEFORE_OFE_VALUE     AFTER_OFE_VALUE      DESCRIPTION
-------------------------------------------------- -------------------- -------------------- ----------------------------------------------------------------------------------------------------
optimizer_features_enable                          11.1.0.7             11.2.0.1             optimizer plan compatibility parameter
_optimizer_use_feedback                            FALSE                TRUE                 optimizer use feedback
_optimizer_use_cbqt_star_transformation            FALSE                TRUE                 use rewritten star transformation using cbqt framework
_optimizer_unnest_disjunctive_subq                 FALSE                TRUE                 Unnesting of disjunctive subqueries (TRUE/FALSE)
_optimizer_unnest_corr_set_subq                    FALSE                TRUE                 Unnesting of correlated set subqueries (TRUE/FALSE)
_optimizer_undo_cost_change                        11.1.0.7             11.2.0.1             optimizer undo cost change
_optimizer_try_st_before_jppd                      FALSE                TRUE                 try Star Transformation before Join Predicate Push Down
_optimizer_table_expansion                         FALSE                TRUE                 consider table expansion transformation
_optimizer_join_factorization                      FALSE                TRUE                 use join factorization transformation
_optimizer_fast_pred_transitivity                  FALSE                TRUE                 use fast algorithm to generate transitive predicates
_optimizer_fast_access_pred_analysis               FALSE                TRUE                 use fast algorithm to traverse predicates for physical optimizer
_optimizer_eliminate_filtering_join                FALSE                TRUE                 optimizer filtering join elimination enabled
_optimizer_distinct_placement                      FALSE                TRUE                 consider distinct placement optimization
_optimizer_distinct_agg_transform                  FALSE                TRUE                 Transforms Distinct Aggregates to non-distinct aggregates
_optimizer_connect_by_elim_dups                    FALSE                TRUE                 allow connect by to eliminate duplicates from input
_optimizer_coalesce_subqueries                     FALSE                TRUE                 consider coalescing of subqueries optimization
_connect_by_use_union_all                          OLD_PLAN_MODE        TRUE                 use union all for connect by
_and_pruning_enabled                               FALSE                TRUE                 allow partition pruning based on multiple mechanisms
_aggregation_optimization_settings                 32                   0                    settings for aggregation optimizations

19 rows selected.

As you might expect changes between minor releases are very small numbers but major release changes really effects the way CBO works.

Next script is for using the information above as a test case. They have the similar mind as bugfix test builders.

build_optimizer_param_test: accepts before and after OFE. you can make the test by changing parameters one by one do the test and revert back to the original.
Even it accepts the OFE as before and after you can reverse the order like after can be 10.2.0.4 and before can be 11.2.0.1. What the script will do is
set OFE to 11.2.0.1 and try each parameter by reverting them back to 10.2.0.4 values. Generated scripts is optimizer_parameter_test.sql and test.sql is the
script it calls for tests.

build_optimizer_param_test_e: Same as above but this time it is for explain plan only and generates optimizer_parameter_test_e.sql and uses test_e.sql

Now its time to play with the scripts

After the upgrade one of the massive queries again start to go crazy. This time I have the chance to build a test case.
We have two tables monthly partitioned on ASOF date. Once we use ansi “full outer join” syntax Oracle was not doing partition pruning
after upgrade from 10.2.0.4 to 11.2.0.1.

The full test case can be seen here full_outer_join_test_case.sql.

The test_e.sql is like below

EXPLAIN PLAN
   FOR
      SELECT T1.asof asof,
             T1.t1_id,
             t1.t1_data,
             T2.t2_data
        FROM    table_1 t1
             FULL OUTER JOIN
                (SELECT *
                   FROM table_2
                  WHERE table_2.asof = TO_DATE ('15Sep10', 'DDMONYY')) t2
             ON t1.asof = t2.asof
       WHERE t1.asof = TO_DATE ('15Sep10', 'DDMONYY');

Lets see how our test case behaves with different OFE settings


HR@ORACOS> @test_e   =>default OFE is 11.2.0.1
HR@ORACOS> @e

Plan hash value: 1183327479

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |          |  1991K|  7644M| 16343   (1)| 00:01:22 |       |       |
|*  1 |  VIEW                                 | VW_FOJ_0 |  1991K|  7644M| 16343   (1)| 00:01:22 |       |       |
|*  2 |   HASH JOIN FULL OUTER                |          |  1991K|  4348M| 16343   (1)| 00:01:22 |       |       |
|   3 |    PARTITION RANGE SINGLE             |          |     1 |  2011 |     1   (0)| 00:00:01 |   KEY |   KEY |
|   4 |     VIEW                              |          |     1 |  2011 |     1   (0)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS BY LOCAL INDEX ROWID| T2       |     1 |  2011 |     1   (0)| 00:00:01 |   KEY |   KEY |
|*  6 |       INDEX SKIP SCAN                 | TEST2_PK |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|   7 |    PARTITION RANGE ALL                |          |  1991K|   529M| 16329   (1)| 00:01:22 |     1 |    28 |
|   8 |     TABLE ACCESS FULL                 | T1       |  1991K|   529M| 16329   (1)| 00:01:22 |     1 |    28 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
   2 - access("T1"."ASOF"="T2"."ASOF")
   6 - access("T2"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
       filter("T2"."ASOF"=TO_DATE('15Sep10','DDMONYY'))

HR@ORACOS> alter session set optimizer_features_enable='10.2.0.4';

Session altered.
HR@ORACOS> @test_e

Explained.
HR@ORACOS> @e

Plan hash value: 2581814110

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |          |     2 |  8052 |     1 (100)| 00:00:01 |       |       |
|   1 |  VIEW                                  |          |     2 |  8052 |     1 (100)| 00:00:01 |       |       |
|   2 |   UNION-ALL                            |          |       |       |            |          |       |       |
|*  3 |    HASH JOIN OUTER                     |          |     1 |  4035 |     1 (100)| 00:00:01 |       |       |
|   4 |     PARTITION RANGE SINGLE             |          |     1 |  2024 |     0   (0)| 00:00:01 |   KEY |   KEY |
|   5 |      TABLE ACCESS BY LOCAL INDEX ROWID | T1       |     1 |  2024 |     0   (0)| 00:00:01 |   KEY |   KEY |
|*  6 |       INDEX SKIP SCAN                  | TEST1_PK |     1 |       |     0   (0)| 00:00:01 |   KEY |   KEY |
|   7 |     PARTITION RANGE SINGLE             |          |     1 |  2011 |     0   (0)| 00:00:01 |   KEY |   KEY |
|   8 |      TABLE ACCESS BY LOCAL INDEX ROWID | T2       |     1 |  2011 |     0   (0)| 00:00:01 |   KEY |   KEY |
|*  9 |       INDEX SKIP SCAN                  | TEST2_PK |     1 |       |     0   (0)| 00:00:01 |   KEY |   KEY |
|* 10 |    FILTER                              |          |       |       |            |          |       |       |
|  11 |     NESTED LOOPS ANTI                  |          |     1 |  2020 |     0   (0)| 00:00:01 |       |       |
|  12 |      PARTITION RANGE SINGLE            |          |     1 |  2011 |     0   (0)| 00:00:01 |   KEY |   KEY |
|  13 |       TABLE ACCESS BY LOCAL INDEX ROWID| T2       |     1 |  2011 |     0   (0)| 00:00:01 |   KEY |   KEY |
|* 14 |        INDEX SKIP SCAN                 | TEST2_PK |     1 |       |     0   (0)| 00:00:01 |   KEY |   KEY |
|  15 |      PARTITION RANGE SINGLE            |          |     1 |     9 |     0   (0)| 00:00:01 |   KEY |   KEY |
|* 16 |       INDEX FULL SCAN                  | TEST1_PK |     1 |     9 |     0   (0)| 00:00:01 |   KEY |   KEY |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("T1"."ASOF"="T2"."ASOF"(+))
   6 - access("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
       filter("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
   9 - access("T2"."ASOF"(+)=TO_DATE('15Sep10','DDMONYY'))
       filter("T2"."ASOF"(+)=TO_DATE('15Sep10','DDMONYY'))
  10 - filter(CAST(NULL AS DATE)=TO_DATE('15Sep10','DDMONYY'))
  14 - access("T2"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
       filter("T2"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
  16 - access("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
       filter("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))

As you can clearly see, partition pruning is not happening on 11.2.0.1 and query suddenly becomes killer for the system and slower for end user.

Lets see what is changed in terms of parameters between the releases?


HR@ORACOS> @before_after_opt_param
Enter value for before_ofe: 10.2.0.4
Enter value for after_ofe: 11.2.0.1

PD_NAME                                            BEFORE_OFE_VALUE     AFTER_OFE_VALUE      DESCRIPTION
-------------------------------------------------- -------------------- -------------------- ----------------------------------------------------------------------------------------------------
optimizer_features_enable                          10.2.0.4             11.2.0.1             optimizer plan compatibility parameter
_replace_virtual_columns                           FALSE                TRUE                 replace expressions with virtual columns
_optimizer_use_feedback                            FALSE                TRUE                 optimizer use feedback
_optimizer_use_cbqt_star_transformation            FALSE                TRUE                 use rewritten star transformation using cbqt framework
_optimizer_unnest_disjunctive_subq                 FALSE                TRUE                 Unnesting of disjunctive subqueries (TRUE/FALSE)
_optimizer_unnest_corr_set_subq                    FALSE                TRUE                 Unnesting of correlated set subqueries (TRUE/FALSE)
_optimizer_undo_cost_change                        10.2.0.4             11.2.0.1             optimizer undo cost change
_optimizer_try_st_before_jppd                      FALSE                TRUE                 try Star Transformation before Join Predicate Push Down
_optimizer_table_expansion                         FALSE                TRUE                 consider table expansion transformation
_optimizer_null_aware_antijoin                     FALSE                TRUE                 null-aware antijoin parameter
_optimizer_native_full_outer_join                  OFF                  FORCE                execute full outer join using native implementaion
_optimizer_multi_level_push_pred                   FALSE                TRUE                 consider join-predicate pushdown that requires multi-level pushdown to base table
_optimizer_join_factorization                      FALSE                TRUE                 use join factorization transformation
_optimizer_improve_selectivity                     FALSE                TRUE                 improve table and partial overlap join selectivity computation
_optimizer_group_by_placement                      FALSE                TRUE                 consider group-by placement optimization
_optimizer_fast_pred_transitivity                  FALSE                TRUE                 use fast algorithm to generate transitive predicates
_optimizer_fast_access_pred_analysis               FALSE                TRUE                 use fast algorithm to traverse predicates for physical optimizer
_optimizer_extended_stats_usage_control            255                  224                  controls the optimizer usage of extended stats
_optimizer_extended_cursor_sharing_rel             NONE                 SIMPLE               optimizer extended cursor sharing for relational operators
_optimizer_extend_jppd_view_types                  FALSE                TRUE                 join pred pushdown on group-by, distinct, semi-/anti-joined view
_optimizer_enable_extended_stats                   FALSE                TRUE                 use extended statistics for selectivity estimation
_optimizer_enable_density_improvements             FALSE                TRUE                 use improved density computation for selectivity estimation
_optimizer_eliminate_filtering_join                FALSE                TRUE                 optimizer filtering join elimination enabled
_optimizer_distinct_placement                      FALSE                TRUE                 consider distinct placement optimization
_optimizer_distinct_agg_transform                  FALSE                TRUE                 Transforms Distinct Aggregates to non-distinct aggregates
_optimizer_connect_by_elim_dups                    FALSE                TRUE                 allow connect by to eliminate duplicates from input
_optimizer_coalesce_subqueries                     FALSE                TRUE                 consider coalescing of subqueries optimization
_optimizer_adaptive_cursor_sharing                 FALSE                TRUE                 optimizer adaptive cursor sharing
_nlj_batching_enabled                              0                    1                    enable batching of the RHS IO in NLJ
_first_k_rows_dynamic_proration                    FALSE                TRUE                 enable the use of dynamic proration of join cardinalities
_connect_by_use_union_all                          OLD_PLAN_MODE        TRUE                 use union all for connect by
_bloom_pruning_enabled                             FALSE                TRUE                 Enable partition pruning using bloom filtering
_bloom_folding_enabled                             FALSE                TRUE                 Enable folding of bloom filter
_and_pruning_enabled                               FALSE                TRUE                 allow partition pruning based on multiple mechanisms
_aggregation_optimization_settings                 32                   0                    settings for aggregation optimizations

35 rows selected.

35 parameters changed and we already now bugfix is not our problem.

Lets build our test case and run it. (check the time, only 0.1 seconds to understand which parameter is driving my problem)

!!! I am testing 11.2.0.1 parameters when I am on 10.2.0.4


HR@ORACOS> @build_optimizer_param_test_e
Enter value for before_ofe: 10.2.0.4
Enter value for after_ofe: 11.2.0.1

HR@ORACOS> @optimizer_param_test_e

PD_NAME                                            PD_VALUE      PLAN_HASH_VALUE
-------------------------------------------------- ---------- ------------------
_optimizer_native_full_outer_join                  FORCE              1183327479
_aggregation_optimization_settings                 0                  2581814110
_optimizer_use_cbqt_star_transformation            TRUE
_optimizer_unnest_disjunctive_subq                 TRUE
_optimizer_unnest_corr_set_subq                    TRUE
_optimizer_try_st_before_jppd                      TRUE
_optimizer_table_expansion                         TRUE
_optimizer_null_aware_antijoin                     TRUE
_optimizer_multi_level_push_pred                   TRUE
_optimizer_join_factorization                      TRUE
_optimizer_improve_selectivity                     TRUE
_optimizer_group_by_placement                      TRUE
_optimizer_fast_pred_transitivity                  TRUE
_optimizer_fast_access_pred_analysis               TRUE
_optimizer_extended_stats_usage_control            224
_optimizer_extended_cursor_sharing_rel             SIMPLE
_optimizer_extend_jppd_view_types                  TRUE
_optimizer_enable_extended_stats                   TRUE
_optimizer_enable_density_improvements             TRUE
_optimizer_eliminate_filtering_join                TRUE
_optimizer_distinct_placement                      TRUE
_optimizer_distinct_agg_transform                  TRUE
_optimizer_connect_by_elim_dups                    TRUE
_optimizer_coalesce_subqueries                     TRUE
_optimizer_adaptive_cursor_sharing                 TRUE
_nlj_batching_enabled                              1
_first_k_rows_dynamic_proration                    TRUE
_connect_by_use_union_all                          TRUE
_bloom_pruning_enabled                             TRUE
_bloom_folding_enabled                             TRUE
_and_pruning_enabled                               TRUE
_replace_virtual_columns                           TRUE
_optimizer_use_feedback                            TRUE

33 rows selected.

Elapsed: 00:00:00.01

Session altered.

Elapsed: 00:00:00.01

Our plan suddenly reverts back to the wrong plan when _optimizer_native_full_outer_join is force.

Lets turn it off and see the effect to confirm our test environment is right

HR@ORACOS> alter session set "_optimizer_native_full_outer_join"=OFF;

Session altered.

HR@ORACOS> @test_e

Explained.

HR@ORACOS> @e

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2581814110

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |          |     2 |  8052 |     4  (25)| 00:00:01 |       |       |
|   1 |  VIEW                                  |          |     2 |  8052 |     4  (25)| 00:00:01 |       |       |
|   2 |   UNION-ALL                            |          |       |       |            |          |       |       |
|*  3 |    HASH JOIN OUTER                     |          |     1 |  4035 |     3  (34)| 00:00:01 |       |       |
|   4 |     PARTITION RANGE SINGLE             |          |     1 |  2024 |     1   (0)| 00:00:01 |   KEY |   KEY |
|   5 |      TABLE ACCESS BY LOCAL INDEX ROWID | T1       |     1 |  2024 |     1   (0)| 00:00:01 |   KEY |   KEY |
|*  6 |       INDEX SKIP SCAN                  | TEST1_PK |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|   7 |     PARTITION RANGE SINGLE             |          |     1 |  2011 |     1   (0)| 00:00:01 |   KEY |   KEY |
|   8 |      TABLE ACCESS BY LOCAL INDEX ROWID | T2       |     1 |  2011 |     1   (0)| 00:00:01 |   KEY |   KEY |
|*  9 |       INDEX SKIP SCAN                  | TEST2_PK |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 10 |    FILTER                              |          |       |       |            |          |       |       |
|  11 |     NESTED LOOPS ANTI                  |          |     1 |  2020 |     1   (0)| 00:00:01 |       |       |
|  12 |      PARTITION RANGE SINGLE            |          |     1 |  2011 |     1   (0)| 00:00:01 |   KEY |   KEY |
|  13 |       TABLE ACCESS BY LOCAL INDEX ROWID| T2       |     1 |  2011 |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 14 |        INDEX SKIP SCAN                 | TEST2_PK |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|  15 |      PARTITION RANGE SINGLE            |          |     1 |     9 |     0   (0)| 00:00:01 |   KEY |   KEY |
|* 16 |       INDEX FULL SCAN                  | TEST1_PK |     1 |     9 |     0   (0)| 00:00:01 |   KEY |   KEY |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("T1"."ASOF"="T2"."ASOF"(+))
   6 - access("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
       filter("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
   9 - access("T2"."ASOF"(+)=TO_DATE('15Sep10','DDMONYY'))
       filter("T2"."ASOF"(+)=TO_DATE('15Sep10','DDMONYY'))
  10 - filter(CAST(NULL AS DATE)=TO_DATE('15Sep10','DDMONYY'))
  14 - access("T2"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
       filter("T2"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
  16 - access("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
       filter("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))

37 rows selected.

Perfect, in less then 5 minutes we spot the difference . Actually this was the issue which started everything for this series.
I spent 1 full day on the real sql and 10053 traces and suddenly I checked the sql carefully again (another “invalid dba exception”, should have done before )
and it reminded me Christian Antognini’s post about native full outer join support (Being a blog reader is good thing !!!)
then I checked the parameter and immediatelly realized that it was my issue.

After I automate all the things I have tried for this issue now I really feel confident for %99 of the issues caused during upgrades with bugfix and optimizer_parameter changes. 10053 with many joins makes me feel really dumb :)
Thank god at least for the upgrades I don’t have to go through it again.

For this particular case because I was ready with all the information I needed, I raised the issue with Oracle and got response back on the “same day” that
there is a bug with a patch 9287401 available, and once I patched it worked like charm.

Unfortunatelly, not eveything is clear as daylight like on this example:( There can be combination of parameters effecting the generation of the plan.
Controlled environment like the one above (one parameter at a time) only reveals simple issue. We may need to run non_controlled version
to see how things can differ when we keep the parameter at the value we set.

Non controlled version for the script is (sorry for the dodgy naming)

build_optimizer_param_test_2 : Same logic this time we keep the parameter with the setting we set.
build_optimizer_param_test_e_2: Same logic this time we keep the parameter with the setting we set.

There is also one more thing I discovered, you need to re-connect after your tests because there is no “alter session reset” option for session level parameters.
When you explicitly set a parameter, changing OFE does not adjust that parameter so better refresh your session environment by re-connect

Lets see how non controlled test case work.

This time I don’t have a test case but a real case where things went wrong after 11.2.0.1 upgrade from 10.2.0.4.
A report query suddenly started to run like a dog. From 10 minutes to 10 hours. I went through all bugfixes and it did not help again.

Plan differences for versions were like below.

For 11.2.0.1

SQL> @test_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 769921106

----------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |                              | 43364 |    28M|       |   795K (22)| 00:28:06 |       |       |
|   1 |  SORT ORDER BY                              |                              | 43364 |    28M|    30M|   795K (22)| 00:28:06 |       |       |
|*  2 |   HASH JOIN RIGHT OUTER                     |                              | 43364 |    28M|       |   790K (22)| 00:27:54 |       |       |
|   3 |    VIEW                                     |                              | 17518 |   975K|       |    62  (25)| 00:00:01 |       |       |
|*  4 |     HASH JOIN                               |                              | 17518 |   615K|       |    62  (25)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS FULL                      | STATIC_CLASSIFICATIONS       |   658 | 14476 |       |     3   (0)| 00:00:01 |       |       |
|*  6 |      TABLE ACCESS FULL                      | ORGS_CLASSIFICATIONS         | 17518 |   239K|       |    57  (23)| 00:00:01 |       |       |
|*  7 |    HASH JOIN RIGHT OUTER                    |                              | 43364 |    26M|       |   790K (22)| 00:27:54 |       |       |
|   8 |     VIEW                                    |                              |  4841 |  1096K|       |   760K (23)| 00:26:51 |       |       |
|   9 |      HASH GROUP BY                          |                              |  4841 |   808K|       |   760K (23)| 00:26:51 |       |       |
|* 10 |       HASH JOIN                             |                              |  4841 |   808K|       |   760K (23)| 00:26:51 |       |       |
|  11 |        TABLE ACCESS FULL                    | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 12 |        HASH JOIN                            |                              |  4841 |   756K|  2368K|   760K (23)| 00:26:51 |       |       |
|  13 |         TABLE ACCESS FULL                   | ORGS                         | 41094 |  1886K|       |   229  (12)| 00:00:01 |       |       |
|* 14 |         HASH JOIN                           |                              |   219K|    23M|    20M|   759K (23)| 00:26:49 |       |       |
|* 15 |          HASH JOIN                          |                              |   219K|    17M|       |   738K (23)| 00:26:04 |       |       |
|  16 |           PARTITION RANGE SINGLE            |                              | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 17 |            TABLE ACCESS BY LOCAL INDEX ROWID| FC_TBL_XXXXXXXXXX_CURVE      | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 18 |             INDEX RANGE SCAN                | FC_TBL_XXXXXXXXXX_CURVE_PK   |    48 |       |       |    34   (6)| 00:00:01 |    34 |    34 |
|  19 |             TABLE ACCESS BY INDEX ROWID     | CLIENTS                      |     1 |    20 |       |     2   (0)| 00:00:01 |       |       |
|* 20 |              INDEX UNIQUE SCAN              | CLIENTS_NAME_UNQ             |     1 |       |       |     1   (0)| 00:00:01 |       |       |
|  21 |           PARTITION RANGE ALL               |                              |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  22 |            TABLE ACCESS FULL                | FC_TBL_XXXXXXXXXX_TERMS      |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  23 |          PARTITION LIST SINGLE              |                              |  3215K|    88M|       | 17265  (15)| 00:00:37 |   KEY |   KEY |
|* 24 |           TABLE ACCESS FULL                 | INSTRUMENTS                  |  3215K|    88M|       | 17265  (15)| 00:00:37 |    34 |    34 |
|* 25 |     HASH JOIN                               |                              | 43364 |    16M|       | 29859   (2)| 00:01:04 |       |       |
|  26 |      TABLE ACCESS FULL                      | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 27 |      HASH JOIN                              |                              | 43364 |    16M|  2736K| 29853   (2)| 00:01:04 |       |       |
|  28 |       TABLE ACCESS FULL                     | ORGS                         | 41094 |  2247K|       |   229  (12)| 00:00:01 |       |       |
|  29 |       VIEW                                  |                              | 43364 |    13M|       | 29230   (2)| 00:01:02 |       |       |
|  30 |        HASH GROUP BY                        |                              | 43364 |  4319K|  4968K| 29230   (2)| 00:01:02 |       |       |
|  31 |         NESTED LOOPS                        |                              |       |       |       |            |          |       |       |
|  32 |          NESTED LOOPS                       |                              | 43364 |  4319K|       | 28309   (2)| 00:01:00 |       |       |
|  33 |           VIEW                              | VW_GBF_14                    |  9021 |   669K|       |  1065  (12)| 00:00:03 |       |       |
|  34 |            HASH GROUP BY                    |                              |  9021 |   916K|  1072K|  1065  (12)| 00:00:03 |       |       |
|* 35 |             HASH JOIN                       |                              |  9021 |   916K|       |   868  (13)| 00:00:02 |       |       |
|* 36 |              HASH JOIN                      |                              |  9021 |   643K|       |   736  (13)| 00:00:02 |       |       |
|* 37 |               INDEX RANGE SCAN              | TBL_METRICS_CONFIG_UNIQUE    |     2 |    44 |       |     1   (0)| 00:00:01 |       |       |
|  38 |               PARTITION RANGE SINGLE        |                              | 17495 |   871K|       |   733  (13)| 00:00:02 |    34 |    34 |
|* 39 |                TABLE ACCESS FULL            | TBL_YYYYYYYYY_METRICS        | 17495 |   871K|       |   733  (13)| 00:00:02 |    34 |    34 |
|  40 |              TABLE ACCESS FULL              | TBL_TTTTT_TIER_TTTTTTTT_CONV | 42010 |  1271K|       |   127  (10)| 00:00:01 |       |       |
|* 41 |           INDEX RANGE SCAN                  | TBL_YYY_SPREADS_UNIQUE       |     5 |       |       |     2   (0)| 00:00:01 |       |       |
|  42 |          TABLE ACCESS BY GLOBAL INDEX ROWID | TBL_YYYYYYYYY_SPREADS        |     5 |   130 |       |     3   (0)| 00:00:01 | ROWID | ROWID |
----------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ORGS"."PKEY"="CLASSIFICATIONS"."ORG"(+))
   4 - access("ORGS_CLASSIFICATIONS"."SECTOR"="STATIC_CLASSIFICATIONS"."PKEY")
   6 - filter("ORGS_CLASSIFICATIONS"."PROVIDER"='ICB')
   7 - access("YYYYYYYY_CURVE"."ASOF"="CURVES"."ASOF"(+) AND "YYYYYYYY_CURVE"."ENTITY"="CURVES"."ISSUER"(+) AND
              "YYYYYYYY_CURVE"."TIER"="CURVES"."TIER"(+) AND "YYYYYYYY_CURVE"."RUNNING_COUPON"="CURVES"."RUNNING_COUPON"(+))
  10 - access("O"."COUNTRY"="C"."ID")
  12 - access("I"."ISSUER"="O"."PKEY" AND "I"."CCY"="O"."DEFAULTCCY" AND "I"."DOCCLAUSE"="O"."DEFAULTDOCCLAUSE")
  14 - access("TERM"."INSTRUMENT"="I"."PKEY")
  15 - access("CURVE"."ROWKEY"="TERM"."CURVE_KEY")
  17 - filter("CURVE"."RUNNING_COUPON" IS NOT NULL AND "CURVE"."SOURCE"='LEGACY' AND "CURVE"."OWNER"= (SELECT "CLIENTS"."PKEY" FROM
              "CORE"."CLIENTS" "CLIENTS" WHERE "CLIENTS"."NAME"='Xxxxxxxx'))
  18 - access("CURVE"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  20 - access("CLIENTS"."NAME"='Xxxxxxxx')
  24 - filter("I"."DOCCLAUSE" IS NOT NULL)
  25 - access("ORGS"."COUNTRY"="STATIC_COUNTRIES"."ID")
  27 - access("YYYYYYYY_CURVE"."ENTITY"="ORGS"."PKEY")
  35 - access("M"."TRADING_CONV_ID"="C"."TBL_E_T_TRADING_CONV_ID")
  36 - access("M"."RANGE"="VALUE")
  37 - access("PROPERTY"='PercentileInTblReport')
  39 - filter("M"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  41 - access("ITEM_1"="BA"."YYYYYYYY_METRICS_ID")

74 rows selected.

For 10.2.0.4

SQL> alter session set optimizer_features_enable='10.2.0.4';

Session altered.

SQL> @test_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3853593815

--------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                              | 43364 |    28M|       |   794K (22)| 00:28:04 |       |       |
|   1 |  SORT ORDER BY                            |                              | 43364 |    28M|    61M|   794K (22)| 00:28:04 |       |       |
|*  2 |   HASH JOIN RIGHT OUTER                   |                              | 43364 |    28M|       |   789K (22)| 00:27:52 |       |       |
|   3 |    VIEW                                   |                              | 17518 |   975K|       |    62  (25)| 00:00:01 |       |       |
|*  4 |     HASH JOIN                             |                              | 17518 |   615K|       |    62  (25)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS FULL                    | STATIC_CLASSIFICATIONS       |   658 | 14476 |       |     3   (0)| 00:00:01 |       |       |
|*  6 |      TABLE ACCESS FULL                    | ORGS_CLASSIFICATIONS         | 17518 |   239K|       |    57  (23)| 00:00:01 |       |       |
|*  7 |    HASH JOIN RIGHT OUTER                  |                              | 43364 |    26M|       |   789K (22)| 00:27:52 |       |       |
|   8 |     VIEW                                  |                              |  1770 |   401K|       |   759K (23)| 00:26:49 |       |       |
|   9 |      HASH GROUP BY                        |                              |  1770 |   295K|       |   759K (23)| 00:26:49 |       |       |
|* 10 |       HASH JOIN                           |                              |  1770 |   295K|       |   759K (23)| 00:26:49 |       |       |
|  11 |        TABLE ACCESS FULL                  | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 12 |        HASH JOIN                          |                              |  1770 |   276K|       |   759K (23)| 00:26:49 |       |       |
|  13 |         PARTITION RANGE SINGLE            |                              | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 14 |          TABLE ACCESS BY LOCAL INDEX ROWID| FC_TBL_XXXXXXXXXX_CURVE      | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 15 |           INDEX RANGE SCAN                | FC_TBL_XXXXXXXXXX_CURVE_PK   |    48 |       |       |    34   (6)| 00:00:01 |    34 |    34 |
|  16 |           TABLE ACCESS BY INDEX ROWID     | CLIENTS                      |     1 |    20 |       |     2   (0)| 00:00:01 |       |       |
|* 17 |            INDEX UNIQUE SCAN              | CLIENTS_NAME_UNQ             |     1 |       |       |     1   (0)| 00:00:01 |       |       |
|* 18 |         HASH JOIN                         |                              |  2066K|   201M|       |   758K (23)| 00:26:48 |       |       |
|* 19 |          HASH JOIN                        |                              | 14382 |  1067K|  2368K| 20817  (15)| 00:00:45 |       |       |
|  20 |           TABLE ACCESS FULL               | ORGS                         | 41094 |  1886K|       |   229  (12)| 00:00:01 |       |       |
|  21 |           PARTITION LIST SINGLE           |                              |  3215K|    88M|       | 17213  (15)| 00:00:37 |   KEY |   KEY |
|  22 |            TABLE ACCESS FULL              | INSTRUMENTS                  |  3215K|    88M|       | 17213  (15)| 00:00:37 |    34 |    34 |
|  23 |          PARTITION RANGE ALL              |                              |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  24 |           TABLE ACCESS FULL               | FC_TBL_XXXXXXXXXX_TERMS      |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|* 25 |     HASH JOIN                             |                              | 43364 |    16M|       | 29935   (2)| 00:01:04 |       |       |
|  26 |      TABLE ACCESS FULL                    | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 27 |      HASH JOIN                            |                              | 43364 |    16M|  2736K| 29929   (2)| 00:01:04 |       |       |
|  28 |       TABLE ACCESS FULL                   | ORGS                         | 41094 |  2247K|       |   229  (12)| 00:00:01 |       |       |
|  29 |       VIEW                                |                              | 43364 |    13M|       | 29306   (2)| 00:01:03 |       |       |
|  30 |        HASH GROUP BY                      |                              | 43364 |  5759K|    12M| 29306   (2)| 00:01:03 |       |       |
|  31 |         TABLE ACCESS BY GLOBAL INDEX ROWID| TBL_YYYYYYYYY_SPREADS        |     5 |   160 |       |     3   (0)| 00:00:01 | ROWID | ROWID |
|  32 |          NESTED LOOPS                     |                              | 43364 |  5759K|       | 28118   (2)| 00:01:00 |       |       |
|* 33 |           HASH JOIN                       |                              |  9021 |   916K|       |   874  (14)| 00:00:02 |       |       |
|* 34 |            HASH JOIN                      |                              |  9021 |   643K|       |   742  (14)| 00:00:02 |       |       |
|* 35 |             INDEX RANGE SCAN              | TBL_METRICS_CONFIG_UNIQUE    |     2 |    44 |       |     1   (0)| 00:00:01 |       |       |
|  36 |             PARTITION RANGE SINGLE        |                              | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|* 37 |              TABLE ACCESS FULL            | TBL_YYYYYYYYY_METRICS        | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|  38 |            TABLE ACCESS FULL              | TBL_TTTTT_TIER_TTTTTTTT_CONV | 42010 |  1271K|       |   127  (10)| 00:00:01 |       |       |
|* 39 |           INDEX RANGE SCAN                | TBL_YYY_SPREADS_UNIQUE       |     5 |       |       |     2   (0)| 00:00:01 |       |       |
--------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ORGS"."PKEY"="CLASSIFICATIONS"."ORG"(+))
   4 - access("ORGS_CLASSIFICATIONS"."SECTOR"="STATIC_CLASSIFICATIONS"."PKEY")
   6 - filter("ORGS_CLASSIFICATIONS"."PROVIDER"='ICB')
   7 - access("YYYYYYYY_CURVE"."ASOF"="CURVES"."ASOF"(+) AND "YYYYYYYY_CURVE"."ENTITY"="CURVES"."ISSUER"(+) AND
              "YYYYYYYY_CURVE"."TIER"="CURVES"."TIER"(+) AND "YYYYYYYY_CURVE"."RUNNING_COUPON"="CURVES"."RUNNING_COUPON"(+))
  10 - access("O"."COUNTRY"="C"."ID")
  12 - access("CURVE"."ROWKEY"="TERM"."CURVE_KEY")
  14 - filter("CURVE"."RUNNING_COUPON" IS NOT NULL AND "CURVE"."SOURCE"='LEGACY' AND "CURVE"."OWNER"= (SELECT "CLIENTS"."PKEY" FROM
              "CORE"."CLIENTS" "CLIENTS" WHERE "CLIENTS"."NAME"='Xxxxxxxx'))
  15 - access("CURVE"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  17 - access("CLIENTS"."NAME"='Xxxxxxxx')
  18 - access("TERM"."INSTRUMENT"="I"."PKEY")
  19 - access("I"."ISSUER"="O"."PKEY" AND "I"."CCY"="O"."DEFAULTCCY" AND "I"."DOCCLAUSE"="O"."DEFAULTDOCCLAUSE")
  25 - access("ORGS"."COUNTRY"="STATIC_COUNTRIES"."ID")
  27 - access("YYYYYYYY_CURVE"."ENTITY"="ORGS"."PKEY")
  33 - access("M"."TRADING_CONV_ID"="C"."TBL_E_T_TRADING_CONV_ID")
  34 - access("M"."RANGE"="VALUE")
  35 - access("PROPERTY"='PercentileInTblReport')
  37 - filter("M"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  39 - access("M"."YYYYYYYY_METRICS_ID"="BA"."YYYYYYYY_METRICS_ID")

70 rows selected.

The plan_hash_value I was chasing for is 3853593815

I ran the controlled version and the results were like below.

SQL> @optimizer_param_test_e

PD_NAME                                            PD_VALUE      PLAN_HASH_VALUE
-------------------------------------------------- ---------- ------------------
_nlj_batching_enabled                              0                    49955995
_aggregation_optimization_settings                 32                  769921106
_optimizer_use_cbqt_star_transformation            FALSE
_optimizer_unnest_disjunctive_subq                 FALSE
_optimizer_unnest_corr_set_subq                    FALSE
_optimizer_try_st_before_jppd                      FALSE
_optimizer_table_expansion                         FALSE
_optimizer_null_aware_antijoin                     FALSE
_optimizer_native_full_outer_join                  OFF
_optimizer_multi_level_push_pred                   FALSE
_optimizer_join_factorization                      FALSE
_optimizer_improve_selectivity                     FALSE
_optimizer_fast_pred_transitivity                  FALSE
_optimizer_fast_access_pred_analysis               FALSE
_replace_virtual_columns                           FALSE
_optimizer_use_feedback                            FALSE
_and_pruning_enabled                               FALSE
_bloom_folding_enabled                             FALSE
_optimizer_extended_cursor_sharing_rel             NONE
_optimizer_extend_jppd_view_types                  FALSE
_optimizer_enable_density_improvements             FALSE
_optimizer_eliminate_filtering_join                FALSE
_optimizer_distinct_placement                      FALSE
_optimizer_distinct_agg_transform                  FALSE
_optimizer_connect_by_elim_dups                    FALSE
_optimizer_coalesce_subqueries                     FALSE
_optimizer_adaptive_cursor_sharing                 FALSE
_first_k_rows_dynamic_proration                    FALSE
_connect_by_use_union_all                          OLD_PLAN_M
                                                   ODE

_bloom_pruning_enabled                             FALSE
_optimizer_group_by_placement                      FALSE              1801558718
_optimizer_enable_extended_stats                   FALSE              1803847571
_optimizer_extended_stats_usage_control            255

It did not help me for the right plan, and to be honest I wasn’t expecting that it will help in this case because of the extra nexted loop which is controlled
by “_nlj_batching_enabled”.

I again tried it with the non_controlled version of the test builder to see if the parameters above will still effect.

SQL&gt; @build_optimizer_param_test_e_2
Enter value for before_ofe: 11.2.0.1
Enter value for after_ofe: 10.2.0.4

SQL&gt; @optimizer_param_test_e_2

PD_NAME PD_VALUE PLAN_HASH_VALUE
-------------------------------------------------- ---------- ------------------
_replace_virtual_columns FALSE 769921106
_optimizer_use_feedback FALSE
_optimizer_use_cbqt_star_transformation FALSE
_optimizer_unnest_disjunctive_subq FALSE
_optimizer_unnest_corr_set_subq FALSE
_optimizer_try_st_before_jppd FALSE
_optimizer_table_expansion FALSE
_optimizer_null_aware_antijoin FALSE
_optimizer_native_full_outer_join OFF
_optimizer_multi_level_push_pred FALSE
_optimizer_join_factorization FALSE
_optimizer_improve_selectivity FALSE
_optimizer_group_by_placement FALSE 1801558718 =&gt;change
_optimizer_fast_pred_transitivity FALSE
_optimizer_fast_access_pred_analysis FALSE
_optimizer_extended_stats_usage_control 255 1340272812 =&gt;change
_optimizer_extended_cursor_sharing_rel NONE
_optimizer_extend_jppd_view_types FALSE
_optimizer_enable_extended_stats FALSE
_optimizer_enable_density_improvements FALSE
_optimizer_eliminate_filtering_join FALSE
_optimizer_distinct_placement FALSE
_optimizer_distinct_agg_transform FALSE
_optimizer_connect_by_elim_dups FALSE
_optimizer_coalesce_subqueries FALSE
_optimizer_adaptive_cursor_sharing FALSE
_nlj_batching_enabled 0 3853593815 =&gt;change
_first_k_rows_dynamic_proration FALSE
_connect_by_use_union_all OLD_PLAN_M
ODE

_bloom_pruning_enabled FALSE
_bloom_folding_enabled FALSE
_and_pruning_enabled FALSE
_aggregation_optimization_settings 32

For non controlled version the order of the output above is the order of insert to the table so from first row to the last row test builder
reverts the parameter back to 10.2.0.4 value and goes on with the new parameter without changing it back to 11.2.0.1 value .
Till “_optimizer_group_by_placement” we have 11.2.0.1 default plan. After that one, we see another change on “_optimizer_extended_stats_usage_control”
and when nested loop behaviour reverts back with “_nlj_batching_enabled “ to the original we find our right plan. So possibilities drop down from 35 to 3.
If we are lucky enough combination of these 3 parameters will give us the right plan.

First start with extra nested loop controller “_nlj_batching_enabled” and we revert it back to pre-11g behavior

SQL> alter session set "_nlj_batching_enabled"=0;

Session altered.

SQL> @test_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 49955995

----------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |                              | 43364 |    28M|       |   795K (22)| 00:28:06 |       |       |
|   1 |  SORT ORDER BY                              |                              | 43364 |    28M|    30M|   795K (22)| 00:28:06 |       |       |
|*  2 |   HASH JOIN RIGHT OUTER                     |                              | 43364 |    28M|       |   790K (22)| 00:27:54 |       |       |
|   3 |    VIEW                                     |                              | 17518 |   975K|       |    62  (25)| 00:00:01 |       |       |
|*  4 |     HASH JOIN                               |                              | 17518 |   615K|       |    62  (25)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS FULL                      | STATIC_CLASSIFICATIONS       |   658 | 14476 |       |     3   (0)| 00:00:01 |       |       |
|*  6 |      TABLE ACCESS FULL                      | ORGS_CLASSIFICATIONS         | 17518 |   239K|       |    57  (23)| 00:00:01 |       |       |
|*  7 |    HASH JOIN RIGHT OUTER                    |                              | 43364 |    26M|       |   790K (22)| 00:27:54 |       |       |
|   8 |     VIEW                                    |                              |  4841 |  1096K|       |   760K (23)| 00:26:51 |       |       |
|   9 |      HASH GROUP BY                          |                              |  4841 |   808K|       |   760K (23)| 00:26:51 |       |       |
|* 10 |       HASH JOIN                             |                              |  4841 |   808K|       |   760K (23)| 00:26:51 |       |       |
|  11 |        TABLE ACCESS FULL                    | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 12 |        HASH JOIN                            |                              |  4841 |   756K|  2368K|   760K (23)| 00:26:51 |       |       |
|  13 |         TABLE ACCESS FULL                   | ORGS                         | 41094 |  1886K|       |   229  (12)| 00:00:01 |       |       |
|* 14 |         HASH JOIN                           |                              |   219K|    23M|    20M|   759K (23)| 00:26:49 |       |       |
|* 15 |          HASH JOIN                          |                              |   219K|    17M|       |   738K (23)| 00:26:04 |       |       |
|  16 |           PARTITION RANGE SINGLE            |                              | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 17 |            TABLE ACCESS BY LOCAL INDEX ROWID| FC_TBL_XXXXXXXXXX_CURVE      | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 18 |             INDEX RANGE SCAN                | FC_TBL_XXXXXXXXXX_CURVE_PK   |    48 |       |       |    34   (6)| 00:00:01 |    34 |    34 |
|  19 |             TABLE ACCESS BY INDEX ROWID     | CLIENTS                      |     1 |    20 |       |     2   (0)| 00:00:01 |       |       |
|* 20 |              INDEX UNIQUE SCAN              | CLIENTS_NAME_UNQ             |     1 |       |       |     1   (0)| 00:00:01 |       |       |
|  21 |           PARTITION RANGE ALL               |                              |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  22 |            TABLE ACCESS FULL                | FC_TBL_XXXXXXXXXX_TERMS      |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  23 |          PARTITION LIST SINGLE              |                              |  3215K|    88M|       | 17265  (15)| 00:00:37 |   KEY |   KEY |
|* 24 |           TABLE ACCESS FULL                 | INSTRUMENTS                  |  3215K|    88M|       | 17265  (15)| 00:00:37 |    34 |    34 |
|* 25 |     HASH JOIN                               |                              | 43364 |    16M|       | 29859   (2)| 00:01:04 |       |       |
|  26 |      TABLE ACCESS FULL                      | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 27 |      HASH JOIN                              |                              | 43364 |    16M|  2736K| 29853   (2)| 00:01:04 |       |       |
|  28 |       TABLE ACCESS FULL                     | ORGS                         | 41094 |  2247K|       |   229  (12)| 00:00:01 |       |       |
|  29 |       VIEW                                  |                              | 43364 |    13M|       | 29230   (2)| 00:01:02 |       |       |
|  30 |        HASH GROUP BY                        |                              | 43364 |  4319K|  4968K| 29230   (2)| 00:01:02 |       |       |
|  31 |         TABLE ACCESS BY GLOBAL INDEX ROWID  | TBL_YYYYYYYYY_SPREADS        |     5 |   130 |       |     3   (0)| 00:00:01 | ROWID | ROWID |
|  32 |          NESTED LOOPS                       |                              | 43364 |  4319K|       | 28309   (2)| 00:01:00 |       |       |
|  33 |           VIEW                              | VW_GBF_14                    |  9021 |   669K|       |  1065  (12)| 00:00:03 |       |       |
|  34 |            HASH GROUP BY                    |                              |  9021 |   916K|  1072K|  1065  (12)| 00:00:03 |       |       |
|* 35 |             HASH JOIN                       |                              |  9021 |   916K|       |   868  (13)| 00:00:02 |       |       |
|* 36 |              HASH JOIN                      |                              |  9021 |   643K|       |   736  (13)| 00:00:02 |       |       |
|* 37 |               INDEX RANGE SCAN              | TBL_METRICS_CONFIG_UNIQUE    |     2 |    44 |       |     1   (0)| 00:00:01 |       |       |
|  38 |               PARTITION RANGE SINGLE        |                              | 17495 |   871K|       |   733  (13)| 00:00:02 |    34 |    34 |
|* 39 |                TABLE ACCESS FULL            | TBL_YYYYYYYYY_METRICS        | 17495 |   871K|       |   733  (13)| 00:00:02 |    34 |    34 |
|  40 |              TABLE ACCESS FULL              | TBL_TTTTT_TIER_TTTTTTTT_CONV | 42010 |  1271K|       |   127  (10)| 00:00:01 |       |       |
|* 41 |           INDEX RANGE SCAN                  | TBL_YYY_SPREADS_UNIQUE       |     5 |       |       |     2   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ORGS"."PKEY"="CLASSIFICATIONS"."ORG"(+))
   4 - access("ORGS_CLASSIFICATIONS"."SECTOR"="STATIC_CLASSIFICATIONS"."PKEY")
   6 - filter("ORGS_CLASSIFICATIONS"."PROVIDER"='ICB')
   7 - access("YYYYYYYY_CURVE"."ASOF"="CURVES"."ASOF"(+) AND "YYYYYYYY_CURVE"."ENTITY"="CURVES"."ISSUER"(+) AND
              "YYYYYYYY_CURVE"."TIER"="CURVES"."TIER"(+) AND "YYYYYYYY_CURVE"."RUNNING_COUPON"="CURVES"."RUNNING_COUPON"(+))
  10 - access("O"."COUNTRY"="C"."ID")
  12 - access("I"."ISSUER"="O"."PKEY" AND "I"."CCY"="O"."DEFAULTCCY" AND "I"."DOCCLAUSE"="O"."DEFAULTDOCCLAUSE")
  14 - access("TERM"."INSTRUMENT"="I"."PKEY")
  15 - access("CURVE"."ROWKEY"="TERM"."CURVE_KEY")
  17 - filter("CURVE"."RUNNING_COUPON" IS NOT NULL AND "CURVE"."SOURCE"='LEGACY' AND "CURVE"."OWNER"= (SELECT "CLIENTS"."PKEY" FROM
              "CORE"."CLIENTS" "CLIENTS" WHERE "CLIENTS"."NAME"='Xxxxxxxx'))
  18 - access("CURVE"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  20 - access("CLIENTS"."NAME"='Xxxxxxxx')
  24 - filter("I"."DOCCLAUSE" IS NOT NULL)
  25 - access("ORGS"."COUNTRY"="STATIC_COUNTRIES"."ID")
  27 - access("YYYYYYYY_CURVE"."ENTITY"="ORGS"."PKEY")
  35 - access("M"."TRADING_CONV_ID"="C"."TBL_E_T_TRADING_CONV_ID")
  36 - access("M"."RANGE"="VALUE")
  37 - access("PROPERTY"='PercentileInTblReport')
  39 - filter("M"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  41 - access("ITEM_1"="BA"."YYYYYYYY_METRICS_ID")

73 rows selected.

Now its time to go with the “_optimizer_group_by_placement”. This is about “consider group-by placement optimization” and We have many group bys in our query
as inner query. I was at first under the impression that this is about “_optimizer_extend_jppd_view_types” like Dion Cho explained here
and Maria Colgan (she is also a great presented please do not miss her sessions) from optimizer group explained here
but actually that parameter is not changing the behavior. I wish there were more information about “_optimizer_group_by_placement” but there aren’t :(

SQL> alter session set "_optimizer_group_by_placement"=FALSE;

Session altered.

SQL> @test_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3419104514

----------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |                              | 43364 |    28M|       |   795K (22)| 00:28:06 |       |       |
|   1 |  SORT ORDER BY                              |                              | 43364 |    28M|    30M|   795K (22)| 00:28:06 |       |       |
|*  2 |   HASH JOIN RIGHT OUTER                     |                              | 43364 |    28M|       |   790K (22)| 00:27:54 |       |       |
|   3 |    VIEW                                     |                              | 17518 |   975K|       |    62  (25)| 00:00:01 |       |       |
|*  4 |     HASH JOIN                               |                              | 17518 |   615K|       |    62  (25)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS FULL                      | STATIC_CLASSIFICATIONS       |   658 | 14476 |       |     3   (0)| 00:00:01 |       |       |
|*  6 |      TABLE ACCESS FULL                      | ORGS_CLASSIFICATIONS         | 17518 |   239K|       |    57  (23)| 00:00:01 |       |       |
|*  7 |    HASH JOIN RIGHT OUTER                    |                              | 43364 |    26M|       |   790K (22)| 00:27:54 |       |       |
|   8 |     VIEW                                    |                              |  4841 |  1096K|       |   760K (23)| 00:26:51 |       |       |
|   9 |      HASH GROUP BY                          |                              |  4841 |   808K|       |   760K (23)| 00:26:51 |       |       |
|* 10 |       HASH JOIN                             |                              |  4841 |   808K|       |   760K (23)| 00:26:51 |       |       |
|  11 |        TABLE ACCESS FULL                    | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 12 |        HASH JOIN                            |                              |  4841 |   756K|  2368K|   760K (23)| 00:26:51 |       |       |
|  13 |         TABLE ACCESS FULL                   | ORGS                         | 41094 |  1886K|       |   229  (12)| 00:00:01 |       |       |
|* 14 |         HASH JOIN                           |                              |   219K|    23M|    20M|   759K (23)| 00:26:49 |       |       |
|* 15 |          HASH JOIN                          |                              |   219K|    17M|       |   738K (23)| 00:26:04 |       |       |
|  16 |           PARTITION RANGE SINGLE            |                              | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 17 |            TABLE ACCESS BY LOCAL INDEX ROWID| FC_TBL_XXXXXXXXXX_CURVE      | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 18 |             INDEX RANGE SCAN                | FC_TBL_XXXXXXXXXX_CURVE_PK   |    48 |       |       |    34   (6)| 00:00:01 |    34 |    34 |
|  19 |             TABLE ACCESS BY INDEX ROWID     | CLIENTS                      |     1 |    20 |       |     2   (0)| 00:00:01 |       |       |
|* 20 |              INDEX UNIQUE SCAN              | CLIENTS_NAME_UNQ             |     1 |       |       |     1   (0)| 00:00:01 |       |       |
|  21 |           PARTITION RANGE ALL               |                              |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  22 |            TABLE ACCESS FULL                | FC_TBL_XXXXXXXXXX_TERMS      |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  23 |          PARTITION LIST SINGLE              |                              |  3215K|    88M|       | 17265  (15)| 00:00:37 |   KEY |   KEY |
|* 24 |           TABLE ACCESS FULL                 | INSTRUMENTS                  |  3215K|    88M|       | 17265  (15)| 00:00:37 |    34 |    34 |
|* 25 |     HASH JOIN                               |                              | 43364 |    16M|       | 29883   (2)| 00:01:04 |       |       |
|  26 |      TABLE ACCESS FULL                      | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 27 |      HASH JOIN                              |                              | 43364 |    16M|  2736K| 29877   (2)| 00:01:04 |       |       |
|  28 |       TABLE ACCESS FULL                     | ORGS                         | 41094 |  2247K|       |   229  (12)| 00:00:01 |       |       |
|  29 |       VIEW                                  |                              | 43364 |    13M|       | 29254   (2)| 00:01:02 |       |       |
|  30 |        HASH GROUP BY                        |                              | 43364 |  5505K|  6208K| 29254   (2)| 00:01:02 |       |       |
|  31 |         TABLE ACCESS BY GLOBAL INDEX ROWID  | TBL_YYYYYYYYY_SPREADS        |     5 |   130 |       |     3   (0)| 00:00:01 | ROWID | ROWID |
|  32 |          NESTED LOOPS                       |                              | 43364 |  5505K|       | 28118   (2)| 00:01:00 |       |       |
|* 33 |           HASH JOIN                         |                              |  9021 |   916K|       |   874  (14)| 00:00:02 |       |       |
|* 34 |            HASH JOIN                        |                              |  9021 |   643K|       |   742  (14)| 00:00:02 |       |       |
|* 35 |             INDEX RANGE SCAN                | TBL_METRICS_CONFIG_UNIQUE    |     2 |    44 |       |     1   (0)| 00:00:01 |       |       |
|  36 |             PARTITION RANGE SINGLE          |                              | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|* 37 |              TABLE ACCESS FULL              | TBL_YYYYYYYYY_METRICS        | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|  38 |            TABLE ACCESS FULL                | TBL_TTTTT_TIER_TTTTTTTT_CONV | 42010 |  1271K|       |   127  (10)| 00:00:01 |       |       |
|* 39 |           INDEX RANGE SCAN                  | TBL_YYY_SPREADS_UNIQUE       |     5 |       |       |     2   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ORGS"."PKEY"="CLASSIFICATIONS"."ORG"(+))
   4 - access("ORGS_CLASSIFICATIONS"."SECTOR"="STATIC_CLASSIFICATIONS"."PKEY")
   6 - filter("ORGS_CLASSIFICATIONS"."PROVIDER"='ICB')
   7 - access("YYYYYYYY_CURVE"."ASOF"="CURVES"."ASOF"(+) AND "YYYYYYYY_CURVE"."ENTITY"="CURVES"."ISSUER"(+) AND
              "YYYYYYYY_CURVE"."TIER"="CURVES"."TIER"(+) AND "YYYYYYYY_CURVE"."RUNNING_COUPON"="CURVES"."RUNNING_COUPON"(+))
  10 - access("O"."COUNTRY"="C"."ID")
  12 - access("I"."ISSUER"="O"."PKEY" AND "I"."CCY"="O"."DEFAULTCCY" AND "I"."DOCCLAUSE"="O"."DEFAULTDOCCLAUSE")
  14 - access("TERM"."INSTRUMENT"="I"."PKEY")
  15 - access("CURVE"."ROWKEY"="TERM"."CURVE_KEY")
  17 - filter("CURVE"."RUNNING_COUPON" IS NOT NULL AND "CURVE"."SOURCE"='LEGACY' AND "CURVE"."OWNER"= (SELECT "CLIENTS"."PKEY" FROM
              "CORE"."CLIENTS" "CLIENTS" WHERE "CLIENTS"."NAME"='Xxxxxxxx'))
  18 - access("CURVE"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  20 - access("CLIENTS"."NAME"='Xxxxxxxx')
  24 - filter("I"."DOCCLAUSE" IS NOT NULL)
  25 - access("ORGS"."COUNTRY"="STATIC_COUNTRIES"."ID")
  27 - access("YYYYYYYY_CURVE"."ENTITY"="ORGS"."PKEY")
  33 - access("M"."TRADING_CONV_ID"="C"."TBL_E_T_TRADING_CONV_ID")
  34 - access("M"."RANGE"="VALUE")
  35 - access("PROPERTY"='PercentileInTblReport')
  37 - filter("M"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  39 - access("M"."YYYYYYYY_METRICS_ID"="BA"."YYYYYYYY_METRICS_ID")

71 rows selected.

Still not helping with the exact plan I am looking for. Last try will be “_optimizer_extended_stats_usage_control”
This parameter is normally not available on 10.2.0.4 so probably setting it to 255 actually means disabling it.
Even we don’t have any extended stats on our database somehow it effects optimizer. Lets see how it effects

SQL> alter session set "_optimizer_extended_stats_usage_control"=255;

Session altered.

SQL> @test_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3853593815

--------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                              | 43364 |    28M|       |   794K (22)| 00:28:04 |       |       |
|   1 |  SORT ORDER BY                            |                              | 43364 |    28M|    30M|   794K (22)| 00:28:04 |       |       |
|*  2 |   HASH JOIN RIGHT OUTER                   |                              | 43364 |    28M|       |   789K (22)| 00:27:52 |       |       |
|   3 |    VIEW                                   |                              | 17518 |   975K|       |    62  (25)| 00:00:01 |       |       |
|*  4 |     HASH JOIN                             |                              | 17518 |   615K|       |    62  (25)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS FULL                    | STATIC_CLASSIFICATIONS       |   658 | 14476 |       |     3   (0)| 00:00:01 |       |       |
|*  6 |      TABLE ACCESS FULL                    | ORGS_CLASSIFICATIONS         | 17518 |   239K|       |    57  (23)| 00:00:01 |       |       |
|*  7 |    HASH JOIN RIGHT OUTER                  |                              | 43364 |    26M|       |   789K (22)| 00:27:52 |       |       |
|   8 |     VIEW                                  |                              |  1683 |   381K|       |   759K (23)| 00:26:49 |       |       |
|   9 |      HASH GROUP BY                        |                              |  1683 |   281K|       |   759K (23)| 00:26:49 |       |       |
|* 10 |       HASH JOIN                           |                              |  1683 |   281K|       |   759K (23)| 00:26:49 |       |       |
|  11 |        TABLE ACCESS FULL                  | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 12 |        HASH JOIN                          |                              |  1683 |   262K|       |   759K (23)| 00:26:49 |       |       |
|  13 |         PARTITION RANGE SINGLE            |                              | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 14 |          TABLE ACCESS BY LOCAL INDEX ROWID| FC_TBL_XXXXXXXXXX_CURVE      | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 15 |           INDEX RANGE SCAN                | FC_TBL_XXXXXXXXXX_CURVE_PK   |    48 |       |       |    34   (6)| 00:00:01 |    34 |    34 |
|  16 |           TABLE ACCESS BY INDEX ROWID     | CLIENTS                      |     1 |    20 |       |     2   (0)| 00:00:01 |       |       |
|* 17 |            INDEX UNIQUE SCAN              | CLIENTS_NAME_UNQ             |     1 |       |       |     1   (0)| 00:00:01 |       |       |
|* 18 |         HASH JOIN                         |                              |  1964K|   191M|       |   758K (23)| 00:26:48 |       |       |
|* 19 |          HASH JOIN                        |                              | 14382 |  1067K|  2368K| 20870  (16)| 00:00:45 |       |       |
|  20 |           TABLE ACCESS FULL               | ORGS                         | 41094 |  1886K|       |   229  (12)| 00:00:01 |       |       |
|  21 |           PARTITION LIST SINGLE           |                              |  3215K|    88M|       | 17265  (15)| 00:00:37 |   KEY |   KEY |
|* 22 |            TABLE ACCESS FULL              | INSTRUMENTS                  |  3215K|    88M|       | 17265  (15)| 00:00:37 |    34 |    34 |
|  23 |          PARTITION RANGE ALL              |                              |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  24 |           TABLE ACCESS FULL               | FC_TBL_XXXXXXXXXX_TERMS      |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|* 25 |     HASH JOIN                             |                              | 43364 |    16M|       | 29883   (2)| 00:01:04 |       |       |
|  26 |      TABLE ACCESS FULL                    | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 27 |      HASH JOIN                            |                              | 43364 |    16M|  2736K| 29877   (2)| 00:01:04 |       |       |
|  28 |       TABLE ACCESS FULL                   | ORGS                         | 41094 |  2247K|       |   229  (12)| 00:00:01 |       |       |
|  29 |       VIEW                                |                              | 43364 |    13M|       | 29254   (2)| 00:01:02 |       |       |
|  30 |        HASH GROUP BY                      |                              | 43364 |  5505K|  6208K| 29254   (2)| 00:01:02 |       |       |
|  31 |         TABLE ACCESS BY GLOBAL INDEX ROWID| TBL_YYYYYYYYY_SPREADS        |     5 |   130 |       |     3   (0)| 00:00:01 | ROWID | ROWID |
|  32 |          NESTED LOOPS                     |                              | 43364 |  5505K|       | 28118   (2)| 00:01:00 |       |       |
|* 33 |           HASH JOIN                       |                              |  9021 |   916K|       |   874  (14)| 00:00:02 |       |       |
|* 34 |            HASH JOIN                      |                              |  9021 |   643K|       |   742  (14)| 00:00:02 |       |       |
|* 35 |             INDEX RANGE SCAN              | TBL_METRICS_CONFIG_UNIQUE    |     2 |    44 |       |     1   (0)| 00:00:01 |       |       |
|  36 |             PARTITION RANGE SINGLE        |                              | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|* 37 |              TABLE ACCESS FULL            | TBL_YYYYYYYYY_METRICS        | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|  38 |            TABLE ACCESS FULL              | TBL_TTTTT_TIER_TTTTTTTT_CONV | 42010 |  1271K|       |   127  (10)| 00:00:01 |       |       |
|* 39 |           INDEX RANGE SCAN                | TBL_YYY_SPREADS_UNIQUE       |     5 |       |       |     2   (0)| 00:00:01 |       |       |
--------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ORGS"."PKEY"="CLASSIFICATIONS"."ORG"(+))
   4 - access("ORGS_CLASSIFICATIONS"."SECTOR"="STATIC_CLASSIFICATIONS"."PKEY")
   6 - filter("ORGS_CLASSIFICATIONS"."PROVIDER"='ICB')
   7 - access("YYYYYYYY_CURVE"."ASOF"="CURVES"."ASOF"(+) AND "YYYYYYYY_CURVE"."ENTITY"="CURVES"."ISSUER"(+) AND
              "YYYYYYYY_CURVE"."TIER"="CURVES"."TIER"(+) AND "YYYYYYYY_CURVE"."RUNNING_COUPON"="CURVES"."RUNNING_COUPON"(+))
  10 - access("O"."COUNTRY"="C"."ID")
  12 - access("CURVE"."ROWKEY"="TERM"."CURVE_KEY")
  14 - filter("CURVE"."RUNNING_COUPON" IS NOT NULL AND "CURVE"."SOURCE"='LEGACY' AND "CURVE"."OWNER"= (SELECT "CLIENTS"."PKEY" FROM
              "CORE"."CLIENTS" "CLIENTS" WHERE "CLIENTS"."NAME"='Xxxxxxxx'))
  15 - access("CURVE"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  17 - access("CLIENTS"."NAME"='Xxxxxxxx')
  18 - access("TERM"."INSTRUMENT"="I"."PKEY")
  19 - access("I"."ISSUER"="O"."PKEY" AND "I"."CCY"="O"."DEFAULTCCY" AND "I"."DOCCLAUSE"="O"."DEFAULTDOCCLAUSE")
  22 - filter("I"."DOCCLAUSE" IS NOT NULL)
  25 - access("ORGS"."COUNTRY"="STATIC_COUNTRIES"."ID")
  27 - access("YYYYYYYY_CURVE"."ENTITY"="ORGS"."PKEY")
  33 - access("M"."TRADING_CONV_ID"="C"."TBL_E_T_TRADING_CONV_ID")
  34 - access("M"."RANGE"="VALUE")
  35 - access("PROPERTY"='PercentileInTblReport')
  37 - filter("M"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  39 - access("M"."YYYYYYYY_METRICS_ID"="BA"."YYYYYYYY_METRICS_ID")

71 rows selected.

As you can see after the 3th parameter we have our plan back

One more thing to show is that how will it be with extra nested loop (preperation for the next post :) ).

SQL> alter session set "_nlj_batching_enabled"=1;

Session altered.

SQL> @tesT_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1340272812

---------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                              | 43364 |    28M|       |   794K (22)| 00:28:04 |       |       |
|   1 |  SORT ORDER BY                             |                              | 43364 |    28M|    30M|   794K (22)| 00:28:04 |       |       |
|*  2 |   HASH JOIN RIGHT OUTER                    |                              | 43364 |    28M|       |   789K (22)| 00:27:52 |       |       |
|   3 |    VIEW                                    |                              | 17518 |   975K|       |    62  (25)| 00:00:01 |       |       |
|*  4 |     HASH JOIN                              |                              | 17518 |   615K|       |    62  (25)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS FULL                     | STATIC_CLASSIFICATIONS       |   658 | 14476 |       |     3   (0)| 00:00:01 |       |       |
|*  6 |      TABLE ACCESS FULL                     | ORGS_CLASSIFICATIONS         | 17518 |   239K|       |    57  (23)| 00:00:01 |       |       |
|*  7 |    HASH JOIN RIGHT OUTER                   |                              | 43364 |    26M|       |   789K (22)| 00:27:52 |       |       |
|   8 |     VIEW                                   |                              |  1683 |   381K|       |   759K (23)| 00:26:49 |       |       |
|   9 |      HASH GROUP BY                         |                              |  1683 |   281K|       |   759K (23)| 00:26:49 |       |       |
|* 10 |       HASH JOIN                            |                              |  1683 |   281K|       |   759K (23)| 00:26:49 |       |       |
|  11 |        TABLE ACCESS FULL                   | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 12 |        HASH JOIN                           |                              |  1683 |   262K|       |   759K (23)| 00:26:49 |       |       |
|  13 |         PARTITION RANGE SINGLE             |                              | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 14 |          TABLE ACCESS BY LOCAL INDEX ROWID | FC_TBL_XXXXXXXXXX_CURVE      | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 15 |           INDEX RANGE SCAN                 | FC_TBL_XXXXXXXXXX_CURVE_PK   |    48 |       |       |    34   (6)| 00:00:01 |    34 |    34 |
|  16 |           TABLE ACCESS BY INDEX ROWID      | CLIENTS                      |     1 |    20 |       |     2   (0)| 00:00:01 |       |       |
|* 17 |            INDEX UNIQUE SCAN               | CLIENTS_NAME_UNQ             |     1 |       |       |     1   (0)| 00:00:01 |       |       |
|* 18 |         HASH JOIN                          |                              |  1964K|   191M|       |   758K (23)| 00:26:48 |       |       |
|* 19 |          HASH JOIN                         |                              | 14382 |  1067K|  2368K| 20870  (16)| 00:00:45 |       |       |
|  20 |           TABLE ACCESS FULL                | ORGS                         | 41094 |  1886K|       |   229  (12)| 00:00:01 |       |       |
|  21 |           PARTITION LIST SINGLE            |                              |  3215K|    88M|       | 17265  (15)| 00:00:37 |   KEY |   KEY |
|* 22 |            TABLE ACCESS FULL               | INSTRUMENTS                  |  3215K|    88M|       | 17265  (15)| 00:00:37 |    34 |    34 |
|  23 |          PARTITION RANGE ALL               |                              |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  24 |           TABLE ACCESS FULL                | FC_TBL_XXXXXXXXXX_TERMS      |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|* 25 |     HASH JOIN                              |                              | 43364 |    16M|       | 29883   (2)| 00:01:04 |       |       |
|  26 |      TABLE ACCESS FULL                     | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 27 |      HASH JOIN                             |                              | 43364 |    16M|  2736K| 29877   (2)| 00:01:04 |       |       |
|  28 |       TABLE ACCESS FULL                    | ORGS                         | 41094 |  2247K|       |   229  (12)| 00:00:01 |       |       |
|  29 |       VIEW                                 |                              | 43364 |    13M|       | 29254   (2)| 00:01:02 |       |       |
|  30 |        HASH GROUP BY                       |                              | 43364 |  5505K|  6208K| 29254   (2)| 00:01:02 |       |       |
|  31 |         NESTED LOOPS                       |                              |       |       |       |            |          |       |       |
|  32 |          NESTED LOOPS                      |                              | 43364 |  5505K|       | 28118   (2)| 00:01:00 |       |       |
|* 33 |           HASH JOIN                        |                              |  9021 |   916K|       |   874  (14)| 00:00:02 |       |       |
|* 34 |            HASH JOIN                       |                              |  9021 |   643K|       |   742  (14)| 00:00:02 |       |       |
|* 35 |             INDEX RANGE SCAN               | TBL_METRICS_CONFIG_UNIQUE    |     2 |    44 |       |     1   (0)| 00:00:01 |       |       |
|  36 |             PARTITION RANGE SINGLE         |                              | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|* 37 |              TABLE ACCESS FULL             | TBL_YYYYYYYYY_METRICS        | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|  38 |            TABLE ACCESS FULL               | TBL_TTTTT_TIER_TTTTTTTT_CONV | 42010 |  1271K|       |   127  (10)| 00:00:01 |       |       |
|* 39 |           INDEX RANGE SCAN                 | TBL_YYY_SPREADS_UNIQUE       |     5 |       |       |     2   (0)| 00:00:01 |       |       |
|  40 |          TABLE ACCESS BY GLOBAL INDEX ROWID| TBL_YYYYYYYYY_SPREADS        |     5 |   130 |       |     3   (0)| 00:00:01 | ROWID | ROWID |
---------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ORGS"."PKEY"="CLASSIFICATIONS"."ORG"(+))
   4 - access("ORGS_CLASSIFICATIONS"."SECTOR"="STATIC_CLASSIFICATIONS"."PKEY")
   6 - filter("ORGS_CLASSIFICATIONS"."PROVIDER"='ICB')
   7 - access("YYYYYYYY_CURVE"."ASOF"="CURVES"."ASOF"(+) AND "YYYYYYYY_CURVE"."ENTITY"="CURVES"."ISSUER"(+) AND
              "YYYYYYYY_CURVE"."TIER"="CURVES"."TIER"(+) AND "YYYYYYYY_CURVE"."RUNNING_COUPON"="CURVES"."RUNNING_COUPON"(+))
  10 - access("O"."COUNTRY"="C"."ID")
  12 - access("CURVE"."ROWKEY"="TERM"."CURVE_KEY")
  14 - filter("CURVE"."RUNNING_COUPON" IS NOT NULL AND "CURVE"."SOURCE"='LEGACY' AND "CURVE"."OWNER"= (SELECT "CLIENTS"."PKEY" FROM
              "CORE"."CLIENTS" "CLIENTS" WHERE "CLIENTS"."NAME"='Xxxxxxxx'))
  15 - access("CURVE"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  17 - access("CLIENTS"."NAME"='Xxxxxxxx')
  18 - access("TERM"."INSTRUMENT"="I"."PKEY")
  19 - access("I"."ISSUER"="O"."PKEY" AND "I"."CCY"="O"."DEFAULTCCY" AND "I"."DOCCLAUSE"="O"."DEFAULTDOCCLAUSE")
  22 - filter("I"."DOCCLAUSE" IS NOT NULL)
  25 - access("ORGS"."COUNTRY"="STATIC_COUNTRIES"."ID")
  27 - access("YYYYYYYY_CURVE"."ENTITY"="ORGS"."PKEY")
  33 - access("M"."TRADING_CONV_ID"="C"."TBL_E_T_TRADING_CONV_ID")
  34 - access("M"."RANGE"="VALUE")
  35 - access("PROPERTY"='PercentileInTblReport')
  37 - filter("M"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  39 - access("M"."YYYYYYYY_METRICS_ID"="BA"."YYYYYYYY_METRICS_ID")

72 rows selected.

This plan is the plan I would like to see once I upgraded the DB but due to the new optimizations I saw something completelly different.
Once I disabled the right ones we have a solution to generate the right plan.

If I have time to deal with it (maybe I have in the future) I could show you the differences on 10053 file for this case, but it can only happen when I finish this series.

This post and the previous posts about bugfixes are just for you to give better understanding of what is effecting your plan change
and how you can feedback oracle support with the finding you found out. The non-scientific methods here are for automating your researches and saving you some time and
giving you the idea where to look when you look at the 10053 file if you want to dig further.
Normally shortest way to create the plan you are looking after is Using OFE and Plan Baselines together and job will be done for %99 of the cases.

!!!!!These are again undocumented parameters They can be solution to wrong results or they can be cause of wrong results
please do a research on MOS and ask Oracle support before you use them

Scripts:

All in one
full_outer_join_test_case.sql
build_opt_param_table.sql
before_after_opt_param.sql
build_optimizer_param_test.sql
build_optimizer_param_test_e.sql
build_optimizer_param_test_2.sql
build_optimizer_param_test_e_2.sql

<–Previous Post in This Series —————-     Next Post in this series–>

February 15, 2011

Plan Stability Through Upgrade-Why is my plan changed?-bugfixes-2

Filed under: Bugs, CBO, Performance, Plan Stability — coskan @ 12:01 am

This post is a bonus post where things can get complicated with my approach for finding bugfixes , if you are already upgraded.

On a database where we upgraded from 10.2.0.4 to 11.2.0.1 , We had a sql reported to have changed its response time on 11.2.0.1 .(At this point I need to say I’m sorry that I can’t share the SQL and PLANS because of data privacy. Tried to edit the sql and plans but query is massive . )

Interestingly when I run the optimizer_features_test for the versions between 10.2.0.4 to 11.2.0.1
only change was extra nested loop. Cost is same and plan generated for 10.2.0.4 was not same as pre-upgrade version

Runtime statistics .


SQL> @sqlid b290wxr1j2a2b

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
b290wxr1j2a2b 3272681547 SELECT   /*+gather_plan_statistics*//*test*/q.rowkey, i.NAME instrument,                                  NVL ((q.enddate - to_date('01Oct10',

1 row selected.

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 00000002BCC6C048 00000002BCC6A0A0   83708128     2022          1          1          1          2            121 1255771.09 1323564.07 1323.56407   62856727      12621          1               0
   1 00000002BCC6C048 00000002BCC2A3C8 1845562615      889          1          1          1          2            121   1847.719   2327.507   2.327507       2889        306          1               0
   2 00000002BCC6C048 00000002BCC24D78   83708128     2022          2          1          2          4            242 2310945.68 2318713.43 1159.35672  125714199        103          2               0
   3 00000002BCC6C048 00000002BCBCEFD0 3790468002     2022          2          1          2          4            242 2349948.75 2361862.35 1180.93118  125714497        424          2               0
   4 00000002BCC6C048 00000002BDDAF630  427381853     2022          2          1          2          4            242 2308517.05 2319381.53 1159.69077  125713318         26          2               0
   5 00000002BCC6C048 00000002DA577EB8  427381853     2022          2          1          2          4            242 2294214.23 2300615.11 1150.30755  125713315          0          2               0

6 rows selected.
Child-Version Match is like below

Child	OFE
=================
0-1	11.2.0.1
2	11.1.0.7
3	11.1.0.6
4	10.2.0.5
5	10.2.0.4

Child number 1 on 11.2.0.1 runs with cardinality feedback and uses right plan for OR with extra nested loop make up and drops the response time from 1323 seconds to 2 seconds
Other

So what is changed and how can I catch it if changing OFE is not working ? Its pretty obvious that we have a stats problem but this query involves 10 different table
and its not always easy to gather stats on tables where all other queries are running fine on. I needed to find a solution.

When I check the query it has an OR and the new plans basically do not generate a plan for OR. I modified the query to use union all instead of OR I can get the same pre-upgrade plan which is interesting
because stats are still same so how come Oracle generates right plan when I use union all and wrong plan when I use OR ? There must be something causing this behaviour.

Since Changing OFE did not change anything in terms of response time we may look into the situation where 8.0 bugfix suddenly appeared on 11.2.0.1 but were not active on 10.2.0.4

How can we spot the bugfix ? You can use the bugfix builder that generates script to test bugfixes for all version below the given version

Script is build_optimizer_bugfix_test_below_e: Same thing but creates a script for all the bugfixes for the versions below the given version

Source code of the script

undefine optimizer_feature_version
set heading off
set feedback off
define optimizer_feature=&quot;&amp;optimizer_feature_version&quot;
set term off
spool optimizer_bugfix_test_e.sql
select 'set feedback off' from dual;
select 'set heading off' from dual;
select 'set term off' from dual;
select 'set echo off' from dual;
select 'spool optimizer_bugfix_test_results_e.log' from dual;
select 'drop table  bugfix_table purge;' from dual;
select 'create  table bugfix_table (bugfix number,plan_hash_value varchar2(20));' from dual;
select 'alter session set optimizer_features_enable=''&amp;optimizer_feature'';' from dual;
select   'alter session set &quot;_fix_control&quot;='''||bugno||decode(value,1,':OFF''',':ON''')||'; '
||chr(10)||'@test_e.sql'||chr(10)||
'insert into bugfix_table values ('||bugno||',(select substr(plan_table_output,18,12) from (select * from table(dbms_xplan.display)) where plan_table_output  like ''Plan hash value:%''));'||chr(10)||
'alter session set &quot;_fix_control&quot;='''||bugno||decode(value,1,':ON''',':OFF''')||'; '
from v$system_fix_control
where
to_number(replace(optimizer_feature_enable,'.',''))&lt;to_number(replace('&amp;optimizer_feature','.',''))
and value in (1,0) order by bugno;
select 'set term on' from dual;
select 'set feedback on' from dual;
select 'set heading on' from dual;
select 'break on plan_hash_value' from dual;
select 'select plan_hash_value,bugfix from bugfix_table order by 1;' from dual;
select 'set feedback on' from dual;
select 'set heading on' from dual;
select 'spool off' from dual;
set feedback on
set heading on
set term on
spool off

When I run it


SQL&gt; @build_optimizer_bugfix_test_below_e
Enter value for optimizer_feature_version: 10.2.0.4
SQL&gt; @optimizer_bugfix_test_e

PLAN_HASH_VALUE          BUGFIX
-------------------- ----------
1186576438              5129233
                        3120429
427381853               1403283
                        2320291
                        2324795
                        2492766
                        2660592
                        2663857
			.......
			.......
76 rows selected.

Two bugfixes gives us the plan we are looking for

SQL&gt; @bugfixg
Enter value for bugno: 5129233

     BUGNO      VALUE SQL_FEATURE               DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE       EVENT IS_DEFAULT
---------- ---------- ------------------------- ---------------------------------------------------------------- ------------------------- ---------- ----------
   5129233          1 QKSFM_CBO_5129233         peek at any foldable exprssion during pruning                    10.2.0.3                           0          1

1 row selected.

SQL&gt; @bugfixg
Enter value for bugno: 3120429

     BUGNO      VALUE SQL_FEATURE               DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE       EVENT IS_DEFAULT
---------- ---------- ------------------------- ---------------------------------------------------------------- ------------------------- ---------- ----------
   3120429          1 QKSFM_ACCESS_PATH_3120429 account for join key sparsity in computing NL index access cost  10.1.0.3                           0          1

1 row selected.

how can we sure which one is our problem ? Both of them enables us to run with the right plan
At this point only option is to go and check on a 10.2.0.4 database to see what these fixes were


SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL> @bugfixg
Enter value for bugno: 5129233

     BUGNO      VALUE DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE       EVENT IS_DEFAULT
---------- ---------- ---------------------------------------------------------------- ------------------------- ---------- ----------
   5129233          1 peek at any foldable exprssion during pruning                    10.2.0.3                           0          1

SQL> @bugfixg
Enter value for bugno: 3120429

no rows selected

As you can see 5129233 is again enabled on 10.2.0.4 but bugfix 3120429 does not exist and not enabled/disabled on a database running on 10.2.0.4 binaries.

Even both of them are solving the issue when we are running on 11.2.0.1, in my opinion controlling bugfix is 3120429.
Thanks to the reasonable developer who accepted to change the code using union all instead of or we did not need to enable disable a bugfix.

Since we find two bugs lets test them (I add comments to force new parse instead of flushing the sql for fresh runtime stats so sqlids are different )

When bugfix 3120429 is disabled

SQL> @sqlid 3798wumm6mwsq

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
3798wumm6mwsq 3865703190 SELECT   /*+gather_plan_statistics*//*test_2*/q.rowkey, i.NAME instrument,                                  NVL ((q.enddate - to_date('01Oct10',

1 row selected.

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 00000002DC1DABC8 00000002DC968950 4098949197      531          2          1          2          2              0     1305.8     1332.1     .66605        207          3          2               0
   1 00000002DC1DABC8 00000002DC6182A0 4098949197      531          2          1          2          2              0   1328.799   1336.494    .668247        207          0          2               0
   2 00000002DC1DABC8 00000002DC76E5E8 3700866780      451          2          1          2          2              0   1353.794   1369.258    .684629        214          0          2               0
   3 00000002DC1DABC8 00000002DCCAAFF8 2856793267      451          2          1          2          2              0   1244.811   1251.411   .6257055        214          0          2               0
   4 00000002DC1DABC8 00000002A5F40D20 1186576438      481          2          1          2          2              0    898.864    906.962    .453481        190          0          2               0

When bugfix 5129233 is disabled


SQL> @sqlid bh6a9xtq27hba

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
bh6a9xtq27hba 1814282602 SELECT   /*+gather_plan_statistics*//*test_3*/q.rowkey, i.NAME instrument,                                  NVL ((q.enddate - to_date('01Oct10',

1 row selected.

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 00000002DC6F9150 00000002DC955228  986578726      519          2          1          2          2              0    1177.82   1183.933   .5919665        183          0          2               0
   1 00000002DC6F9150 00000002A5F400F0  986578726      519          2          1          2          2              0   1261.808   1268.259   .6341295        183          0          2               0
   2 00000002DC6F9150 00000002DCBB2DE0 3700866780      445          2          1          2          2              0   1506.771   1533.008    .766504        214          0          2               0
   3 00000002DC6F9150 00000002DCBB29C0 2856793267      445          2          1          2          2              0   1305.801   1317.385   .6586925        214          0          2               0
   4 00000002DC6F9150 00000002DC81E000 1186576438      474          2          1          2          2              0    940.858    945.413   .4727065        190          0          2               0

On 11.2.0.1 and 11.1.0.7 plan is changed depending on the bugfix but for 10.2.0.4, 10.2.0.5 and 11.1.0.6 plan is constant for changing different bugfixes.
You can also see previous cardinality feedback plan is also slower then bugfix disabled version.

We can dig further but its not the point of this post and Since I managed to take the response time back to where it was from 1323 seconds to 0.5 seconds I can stop here.

As you can see we still spot a difference after upgrade even time travelling between releases by altering OFE did not help us at first point.

This post was to explain complicated issue and things you can face with my approach.

I also find it very usefull to have v$system_fix_control table snapped before you upgrade/patch and compared afterwards, so you won’t get shocked with newly added previos versions related bugs.

Scripts Used;

build_optimizer_bugfix_test_below_e

<-Previos Post in Series ——————————-  Next Post in Series–>

February 14, 2011

Plan Stability Through Upgrade-Why is my plan changed?-bugfixes-1

Filed under: Bugs, CBO, Performance, Plan Stability — coskan @ 12:00 pm

Before I start the actual post let me clarify something, these post series will be about using practical ways like I did on this post to understand
what is the cause of the post upgrade performance changes. I am trying to be more practical than scientific because of the nature of the production support DBA job.
When you are a production DBA, you deal with many issues/databases at the same time and time spent for a solution is very important for you to be able to move to the second problem.
Normally if you are a scientist, oracle enthusiastic then you get 10053 trace try to understand the data try to find out why optimizer did something on Xth step but in reality when it comes to upgrade issues
this is really time consuming and not very practical. You really need time/experience/knowledge to understand a 10053 trace especially for massive queries which has gazillion joins in it.
You can still understand what is going on from prediction and runtime statistics of a plan but it is not the answer you can give to the developers.
If you go and explain them all the things you spot in execution plan output and 10053 trace All they will say to you is “DB is upgraded and upgrade sucks because previous release was better why are we upgrading”
so you need a bit bigger proof to make them accept the issue and workarounds. I also want to add that, my priority for these series is not focusing on content of sqls
but giving you the practical technique to stabilize their plans during upgrades.

Now back to post, You followed the steps in first post and you found out that your plan is changed, now it is time to find out why it is changed
when the only optimizer version is changed. One of the possible reasons is bugfixes in the optimizer.

a-Brief introduction to bugfixes.
Information below is taken from MOS Note – Init.ora Parameter “_FIX_CONTROL” [Hidden] Reference Note [ID 567171.1];

Support and Historical Notes for "_FIX_CONTROL"

  <strong>_FIX_CONTROL</strong> is a special hidden parameter introduced in 10.2.0.2 which can be used to enable or disable certain bug fixes.
  This is typically used in the optimizer area to allow specific fixes to be turned on or off as some fixes can have a significant
  effect on execution plans. Often the default setting (ON or OFF) for a fix is determined by the
  value of the OPTIMIZER_FEATURES_ENABLE parameter.  A full list of bug numbers which are  controlled by fix control can be found in the V$ views:
    - GV$SYSTEM_FIX_CONTROL
    - GV$SESSION_FIX_CONTROL

  The general format for this parameter is the form: _FIX_CONTROL='bug_number:ON|OFF'

  eg:  alter session set "_fix_control"='4969880:ON';
      or in the init.ora
       _fix_control='4969880:ON'

  Multiple values can be specified:
  eg:	alter session set "_fix_control"='4969880:ON','5483301:off';

  Note that fixes that use _FIX_CONTROL in one release may not use it in other releases so any settings should be reviewed when
  upgrading / downgrading or applying a patch set. Also some fix controls have numeric settings rather than ON or OFF.

In summary in every release of Oracle Database (( including patches) ) there are bugfixes and since 10.2.0.2 their availability can be controlled by
this “undocumented” parameter. Some of these bugfixes are on by default some of them are not depending on the optimizer version+oracle binary(check extra test case at the bottom). you are running.
Bugfixes gives us fine grained control. If the optimizer feature you are running causing problems on some queries because of a bugfix then you can still stay on
the version you want but disable the fixes, having said that you still need aproval from ORacle Support if it is safe or not to enable disable bug.

Here are the number of bugfixes by releases taken from 11.2.0.2 base release

SQL&gt; select optimizer_feature_enable,count(*) from v$system_fix_control group by optimizer_feature_enable order by 1 asc;

OPTIMIZER_FEATURE_ENABLE    COUNT(*)
------------------------- ----------
10.1.0                             3
10.1.0.3                           1
10.1.0.5                           2
10.2.0.1                           7
10.2.0.2                          12
10.2.0.3                          11
10.2.0.4                          73
10.2.0.5                         111
11.1.0.6                          39
11.1.0.7                          19
11.2.0.1                          65
11.2.0.2                         105
8.0.0                             52
8.1.6                              1
8.1.7                              2
9.2.0                              8
9.2.0.8                            3
37

And this is from 10.2.0.4 PSU 2

SQL&gt; select optimizer_feature_enable,count(*) from v$system_fix_control group by optimizer_feature_enable order by 1 asc;

OPTIMIZER_FEATURE_ENABLE    COUNT(*)
------------------------- ----------
10.1.0                             1
10.1.0.5                           1
10.2.0.1                           4
10.2.0.2                          13
10.2.0.3                          10
10.2.0.4                          68
9.2.0                              2
9.2.0.8                            7
9

There are couple of things you can spot here. Despite being higher version Oracle sometimes waits to put a bugfix of lower versions.
There are bugfixes from 10.1.0.3 and 8.0.0 and 8.1.6 8.1.7 which are on 11.2.0.2 relase but not on 10.2.0.4 release which is interesting.
In general do not think that bugfixes are added gradually. Oracle can put/make switchable a bugfix from a very old release to a brand new release. There is one more thing to bear in mind not %100 of bugfixes works with value 1 or 0. There are couple of them with values over 1 probably related with thresholds. In this post I will not pay attention to them.

 

After this brief information now it is time for the actual reason of this post.
As you recall, I shared an optimizer_features_test builder on the first post and now with the same logic I have 2 more builder scripts
to build your bugfix test for versions. After you identify under which OFE your plan behaviour changed, all you need to do is to pass
that version level to the scripts below and let them create test cases by enabling/disabling the bugfixes spesific to that version.

build_optimizer_bugfix_test.sql: This is for runtime bugfix test_case generator. All it does is create optimizer_bugfix_test.sql by using v$system_fix_control_table.
It gets each bugfix from that table for given OFE version and enable/disable them depending on their default value then run the test.sql and disable/enable the bugfix back again.
So we have a controlled environment where we test 1 bugfix at a time. It captures every plan_hash_value into a table and select them once all bugfix numbers tested.
If you are lucky enough or in other words if the problem is caused by a bugfix Oracle will revert the plan back to the original once you are changing that bugfixes value.
There is a tiny little problem with this approach, you cannot run 600 second query for 103 bugfixes. This is mostly for very quick queries.
The code of the sql is like below


undefine optimizer_feature_version
set heading off
set feedback off
define optimizer_feature="&optimizer_feature_version"
set term off
spool optimizer_bugfix_test.sql
select 'set feedback off' from dual;
select 'set heading off' from dual;
select 'set term off' from dual;
select 'set echo off' from dual;
select 'spool optimizer_bugfix_test_results.log' from dual;
select 'drop table  bugfix_table purge;' from dual;
select 'create  table bugfix_gtt (bugfix number,plan_hash_value varchar2(20));' from dual;
select 'alter session set optimizer_features_enable=''&optimizer_feature'';' from dual;
select   'alter session set "_fix_control"='''||bugno||decode(value,1,':OFF''',':ON''')||'; '
||chr(10)||'@test.sql'||chr(10)||
'insert into bugfix_table values ('||bugno||',(select substr(plan_table_output,18,12) from (select * from table(dbms_xplan.display_cursor)) where plan_table_output  like ''Plan hash value:%''));'||chr(10)||
'alter session set "_fix_control"='''||bugno||decode(value,1,':ON''',':OFF''')||'; '
from v$system_fix_control
where optimizer_feature_enable='&optimizer_feature' and value in (1,0) order by bugno;
select 'set term on' from dual;
select 'set feedback on' from dual;
select 'set heading on' from dual;
select 'break on plan_hash_value' from dual;
select 'select plan_hash_value,bugfix from bugfix_table order by 1;' from dual;
select 'spool off' from dual;
spool off
set feedback on
set heading on
set term on

If you are sure your plan change behaviour also taking place when you explain the plan, better you run the builder below.

build_optimizer_bugfix_test_e.sql: This is same idea like above only difference is, this time it generates optimizer_features_test_e.sql which explains plans by runnint test_e.sql for every bugfix.
Instead of dbms_xplan.display_cursor I use dbms_xplan.display. Output of the optimizer_features_test_e.sql is same as optimizer_features_test.sql

source code of the builder

undefine optimizer_feature_version
set heading off
set feedback off
define optimizer_feature="&optimizer_feature_version"
set term off
spool optimizer_bugfix_test_e.sql
select 'set feedback off' from dual;
select 'set heading off' from dual;
select 'set term off' from dual;
select 'set echo off' from dual;
select 'spool optimizer_bugfix_test_results_e.log' from dual;
select 'drop table  bugfix_table purge;' from dual;
select 'create  table bugfix_gtt (bugfix number,plan_hash_value varchar2(20));' from dual;
select 'alter session set optimizer_features_enable=''&optimizer_feature'';' from dual;
select   'alter session set "_fix_control"='''||bugno||decode(value,1,':OFF''',':ON''')||'; '
||chr(10)||'@test_e.sql'||chr(10)||
'insert into bugfix_table values ('||bugno||',(select substr(plan_table_output,18,12) from (select * from table(dbms_xplan.display)) where plan_table_output  like ''Plan hash value:%''));'||chr(10)||
'alter session set "_fix_control"='''||bugno||decode(value,1,':ON''',':OFF''')||'; '
from v$system_fix_control
where optimizer_feature_enable='&optimizer_feature' and value in (1,0) order by bugno;
select 'set term on' from dual;
select 'set feedback on' from dual;
select 'set heading on' from dual;
select 'break on plan_hash_value' from dual;
select 'select plan_hash_value,bugfix from bugfix_table order by 1;' from dual;
select 'set feedback on' from dual;
select 'set heading on' from dual;
select 'spool off' from dual;
set feedback on
set heading on
set term on
spool off

Both of the queries drop and create the table that they insert plan and bugfix numbers. You may want to implement it different way.

Let me show you how they work in two different examples for two of the queries.

This example is for a sql which started to run slow after upgrade from 11.2.0.1 to 11.2.0.2.

first check if the plan is changed


SQL> @sqlhist 1m7wy721qytx3

   SNAP_ID BEGIN_INTERVAL_TIME            SQL_ID          CHILD_NO PLAN_HASH_VALUE
---------- ------------------------------ ------------- ---------- ---------------
     25077 19-JAN-11 04.00.56.483         1m7wy721qytx3          0      3561051786
     25078 19-JAN-11 04.31.00.422         1m7wy721qytx3          0      3561051786
     25269 23-JAN-11 04.00.47.349         1m7wy721qytx3          0      2266344041 =>upgrade
     25317 24-JAN-11 04.00.53.276         1m7wy721qytx3          0      2266344041
     25365 25-JAN-11 04.00.43.525         1m7wy721qytx3          0      2266344041
     25366 25-JAN-11 04.30.47.456         1m7wy721qytx3          0      2266344041
     25413 26-JAN-11 04.00.22.604         1m7wy721qytx3          0      2266344041
     25414 26-JAN-11 04.30.25.833         1m7wy721qytx3          0      2266344041
     25461 27-JAN-11 04.00.18.101         1m7wy721qytx3          0      2266344041
     25462 27-JAN-11 04.30.24.777         1m7wy721qytx3          0      2266344041
     25509 28-JAN-11 04.00.05.631         1m7wy721qytx3          0      2266344041
     25510 28-JAN-11 04.30.10.558         1m7wy721qytx3          0      2266344041
     25557 29-JAN-11 04.00.41.775         1m7wy721qytx3          0      2266344041
     25558 29-JAN-11 04.30.46.453         1m7wy721qytx3          0      2266344041

Plan hash value: 3561051786

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                 | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                  |                      |       |       |  1082K(100)|          |       |       |
|   1 |  MERGE                           | TABLE_XXXXXXXXXXX    |       |       |            |          |       |       |
|   2 |   VIEW                           |                      |       |       |            |          |       |       |
|   3 |    NESTED LOOPS OUTER            |                      |     1 |   286 |  1082K  (1)| 01:44:15 |       |       |
|   4 |     VIEW                         |                      |     1 |    78 |  1082K  (1)| 01:44:15 |       |       |
|   5 |      SORT GROUP BY               |                      |     1 |    75 |  1082K  (1)| 01:44:15 |       |       |
|   6 |       FILTER                     |                      |       |       |            |          |       |       |
|   7 |        NESTED LOOPS              |                      |   237 | 17775 |  1082K  (1)| 01:44:15 |       |       |
|   8 |         HASH JOIN                |                      |   237 | 15168 |  1081K  (1)| 01:44:12 |       |       |
|   9 |          INDEX RANGE SCAN        | INDEX_TABLE_3        |     1 |     9 |     2   (0)| 00:00:01 |       |       |
|  10 |          PARTITION RANGE ITERATOR|                      |  1948 |   104K|  1081K  (1)| 01:44:12 |   KEY |   KEY |
|  11 |           INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX    |  1948 |   104K|  1081K  (1)| 01:44:12 |   KEY |   KEY |
|  12 |         INDEX RANGE SCAN         | INDEX_TABLE_2        |     1 |    11 |     2   (0)| 00:00:01 |       |       |
|  13 |     TABLE ACCESS BY INDEX ROWID  | TABLE_XXXXXXXXXXX    |     1 |   208 |     2   (0)| 00:00:01 |       |       |
|  14 |      INDEX UNIQUE SCAN           | PK_TABLE_XXXXXXXXXXX |     1 |       |     1   (0)| 00:00:01 |       |       |
-------------------------------------------------------------------------------------------------------------------------

Plan hash value: 2266344041

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                 | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                  |                      |       |       |  1081K(100)|          |       |       |
|   1 |  MERGE                           | TABLE_XXXXXXXXXXX    |       |       |            |          |       |       |
|   2 |   VIEW                           |                      |       |       |            |          |       |       |
|   3 |    NESTED LOOPS OUTER            |                      |     1 |   282 |  1081K  (1)| 01:44:11 |       |       |
|   4 |     VIEW                         |                      |     1 |    78 |  1081K  (1)| 01:44:11 |       |       |
|   5 |      SORT GROUP BY               |                      |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|   6 |       FILTER                     |                      |       |       |            |          |       |       |
|   7 |        NESTED LOOPS              |                      |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|   8 |         NESTED LOOPS             |                      |     1 |    64 |  1081K  (1)| 01:44:11 |       |       |
|   9 |          INDEX RANGE SCAN        | INDEX_TABLE_3        |     1 |     9 |     2   (0)| 00:00:01 |       |       |
|  10 |          PARTITION RANGE ITERATOR|                      |     1 |    55 |  1081K  (1)| 01:44:11 |   KEY |   KEY |
|  11 |           INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX    |     1 |    55 |  1081K  (1)| 01:44:11 |   KEY |   KEY |
|  12 |         INDEX RANGE SCAN         | INDEX_TABLE_2        |     1 |    11 |     2   (0)| 00:00:01 |       |       |
|  13 |     TABLE ACCESS BY INDEX ROWID  | TABLE_XXXXXXXXXXX    |     1 |   204 |     2   (0)| 00:00:01 |       |       |
|  14 |      INDEX UNIQUE SCAN           | PK_TABLE_XXXXXXXXXXX |     1 |       |     1   (0)| 00:00:01 |       |       |
-------------------------------------------------------------------------------------------------------------------------

Check the 8th step on the plans which makes the diffrence. Oracle suddenly started to choose nested loop instead of hash and estimated rows also drop down to 1.

Lets build the test case

SQL> @build_bind_vars_h
Enter SQL ID ==> 1m7wy721qytx3
Enter Plan Hash Value ==> 2266344041

variable B1 NUMBER

begin

:B1 := 14544;

end;

/

MERGE INTO MXS.TABLE_XXXXXXXXXXX TP USING ( select /* test 1m7wy721qytx3 */ /*+gather_plan_statistics*/TABLE_1.ID,
TABLE_1.SOURCEID, NVL (MAX (CLOSETIMESTAMP), MAX (STARTTIMESTAMP)) LASTTIMESTAMP, MAX (CLOSE) KEEP (DENSE_RANK LAST ORDER BY
STARTTIMESTAMP) AS LAST, MAX (CLOSESPREAD) KEEP (DENSE_RANK LAST ORDER BY STARTTIMESTAMP) AS LASTSPREAD, MAX (CLOSEUPFRONT) KEEP (DENSE_RANK
LAST ORDER BY STARTTIMESTAMP) AS LASTUPFRONT FROM USR1.TABLE_1, USR1.TABLE_2, USR1.TABLE_3 WHERE TABLE_1.ID = TABLE_2.TABLE_2.D
AND TABLE_2.MARKETID =TABLE_3.MARKETID AND TABLE_2.MARKETID IN :B1 AND TABLE_1.SOURCEID =TABLE_3.SOURCEID AND DURATION =
NUMTODSINTERVAL (1, 'DAY') AND STARTTIMESTAMP < TRUNC (SYSDATE, 'YEAR') - 1 AND STARTTIMESTAMP > TRUNC (SYSDATE, 'YEAR') - 7 GROUP BY
TABLE_1.ID, TABLE_1.SOURCEID) TI ON (TP.TABLE_2.D = TI.TABLE_2.D AND TP.SOURCEID = TI.SOURCEID) WHEN MATCHED THEN UPDATE
SET TP.LASTYEARCLOSE = TI.LAST, TP.LASTYEARCLOSETIMESTAMP = TI.LASTTIMESTAMP, TP.LASTYEARCLOSESPREAD = TI.LASTSPREAD,
TP.LASTYEARCLOSEUPFRONT = TI.LASTUPFRONT WHEN NOT MATCHED THEN INSERT (TP.TABLE_2.D, TP.SOURCEID, TP.LASTYEARCLOSE,
TP.LASTYEARCLOSETIMESTAMP, TP.LASTYEARCLOSESPREAD, TP.LASTYEARCLOSEUPFRONT) VALUES (TI.TABLE_2.D, TI.SOURCEID, TI.LAST, TI.LASTTIMESTAMP,
TI.LASTSPREAD, TI.LASTUPFRONT);

Since this is a merge statement I needed to remove the merge part.

test.sql is like below

variable B1 NUMBER

begin

:B1 := 14544;

end;

/
select /* test 1m7wy721qytx3 */ /*+gather_plan_statistics*/TABLE_1.ID,
TABLE_1.SOURCEID, NVL (MAX (CLOSETIMESTAMP), MAX (STARTTIMESTAMP)) LASTTIMESTAMP, MAX (CLOSE) KEEP (DENSE_RANK LAST ORDER BY
STARTTIMESTAMP) AS LAST, MAX (CLOSESPREAD) KEEP (DENSE_RANK LAST ORDER BY STARTTIMESTAMP) AS LASTSPREAD, MAX (CLOSEUPFRONT) KEEP (DENSE_RANK
LAST ORDER BY STARTTIMESTAMP) AS LASTUPFRONT FROM USR1.TABLE_1, USR1.TABLE_2, USR1.TABLE_3 WHERE TABLE_1.ID = TABLE_2.TABLE_2.D
AND TABLE_2.MARKETID =TABLE_3.MARKETID AND TABLE_2.MARKETID IN :B1 AND TABLE_1.SOURCEID =TABLE_3.SOURCEID AND DURATION =
NUMTODSINTERVAL (1, 'DAY') AND STARTTIMESTAMP < TRUNC (SYSDATE, 'YEAR') - 1 AND STARTTIMESTAMP > TRUNC (SYSDATE, 'YEAR') - 7 GROUP BY
TABLE_1.ID, TABLE_1.SOURCEID

test_e.sql is like below


explain plan for select /* test 1m7wy721qytx3 */ /*+gather_plan_statistics*/TABLE_1.ID,
TABLE_1.SOURCEID, NVL (MAX (CLOSETIMESTAMP), MAX (STARTTIMESTAMP)) LASTTIMESTAMP, MAX (CLOSE) KEEP (DENSE_RANK LAST ORDER BY
STARTTIMESTAMP) AS LAST, MAX (CLOSESPREAD) KEEP (DENSE_RANK LAST ORDER BY STARTTIMESTAMP) AS LASTSPREAD, MAX (CLOSEUPFRONT) KEEP (DENSE_RANK
LAST ORDER BY STARTTIMESTAMP) AS LASTUPFRONT FROM USR1.TABLE_1, USR1.TABLE_2, USR1.TABLE_3 WHERE TABLE_1.ID = TABLE_2.TABLE_2.D
AND TABLE_2.MARKETID =TABLE_3.MARKETID AND TABLE_2.MARKETID IN 14544 AND TABLE_1.SOURCEID =TABLE_3.SOURCEID AND DURATION =
NUMTODSINTERVAL (1, 'DAY') AND STARTTIMESTAMP < TRUNC (SYSDATE, 'YEAR') - 1 AND STARTTIMESTAMP > TRUNC (SYSDATE, 'YEAR') - 7 GROUP BY
TABLE_1.ID, TABLE_1.SOURCEID;

build the optimizer test to see if its OFE related

SQL> @build_optimizer_features_test
Enter value for base_optimizer_version: 11.2.0.1

run the test for different OFE settings

SQL> @optimizer_features_test

SQL> alter session set optimizer_features_enable='11.2.0.1';

Session altered.

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  c5y47k1zmrb22, child number 0
-------------------------------------
-------------------------------------

Plan hash value: 2561977778

------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |      1 |        |    320K|00:21:18.17 |      25M|     16M|       |       |          |
|   1 |  SORT GROUP BY               |                   |      1 |      1 |    320K|00:21:18.17 |      25M|     16M|    73M|  6303K|   65M (0)|
|*  2 |   FILTER                     |                   |      1 |        |   1579K|01:37:13.89 |      25M|     16M|       |       |          |
|   3 |    NESTED LOOPS              |                   |      1 |      1 |   1579K|01:37:11.10 |      25M|     16M|       |       |          |
|   4 |     NESTED LOOPS             |                   |      1 |      1 |   5446K|00:15:20.03 |      16M|     16M|       |       |          |
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |       |       |          |
|   6 |      PARTITION RANGE ITERATOR|                   |      1 |      1 |   5446K|00:15:10.67 |      16M|     16M|       |       |          |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |      1 |      1 |   5446K|00:15:01.04 |      16M|     16M|       |       |          |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |   5446K|      1 |   1579K|00:05:19.24 |    9518K|  34458 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   5 - access("MARKET"."MARKETID"=:B1)
   7 - filter(("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7
              AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND "TABLE_2.NTERVAL"."TABLE_2.D">0 AND "TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID"))
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=:B1)

##########Second Run Cardinality feedback kicked still same plan with different cardinality on the last step

Plan hash value: 2561977778

------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |      1 |        |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|   1 |  SORT GROUP BY               |                   |      1 |      1 |      0 |00:00:00.01 |       0 |      0 |    73M|  6303K|   65M (0)|
|*  2 |   FILTER                     |                   |      1 |        |   1444K|00:22:15.31 |      24M|     15M|       |       |          |
|   3 |    NESTED LOOPS              |                   |      1 |      1 |   1444K|00:22:12.87 |      24M|     15M|       |       |          |
|   4 |     NESTED LOOPS             |                   |      1 |      1 |   4894K|00:10:27.91 |      15M|     15M|       |       |          |
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |       |       |          |
|   6 |      PARTITION RANGE ITERATOR|                   |      1 |      1 |   4894K|00:10:19.78 |      15M|     15M|       |       |          |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |      1 |      1 |   4894K|00:10:11.63 |      15M|     15M|       |       |          |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |   4894K|      1 |   1444K|00:00:53.00 |    8678K|   2859 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   5 - access("MARKET"."MARKETID"=:B1)
   7 - filter(("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7
              AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND "TABLE_2.NTERVAL"."TABLE_2.D">0 AND "TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID"))
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=:B1)

Note
-----
   - cardinality feedback used for this statement

SQL> alter session set optimizer_features_enable='11.2.0.1';

Session altered.

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  c5y47k1zmrb22, child number 2
-------------------------------------

Plan hash value: 2179327633

------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |      1 |        |    320K|00:21:24.30 |      25M|     16M|       |       |          |
|   1 |  SORT GROUP BY               |                   |      1 |      1 |    320K|00:21:24.30 |      25M|     16M|    73M|  6303K|   65M (0)|
|*  2 |   FILTER                     |                   |      1 |        |   1579K|00:20:19.75 |      25M|     16M|       |       |          |
|   3 |    NESTED LOOPS              |                   |      1 |    234 |   1579K|00:20:17.00 |      25M|     16M|       |       |          |
|*  4 |     HASH JOIN                |                   |      1 |    234 |   5446K|00:09:47.69 |      16M|     16M|  1452K|  1452K|  372K (0)|
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |       |       |          |
|   6 |      PARTITION RANGE ITERATOR|                   |      1 |   1948 |     20M|00:21:13.17 |      16M|     16M|       |       |          |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |      1 |   1948 |     20M|00:20:39.18 |      16M|     16M|       |       |          |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |   5446K|      1 |   1579K|00:00:51.19 |    9518K|   3216 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   4 - access("TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID")
   5 - access("MARKET"."MARKETID"=:B1)
   7 - filter(("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7
              AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND "TABLE_2.NTERVAL"."TABLE_2.D">0))
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=:B1)

##########Second Run Cardinality feedback kicked still same plan with different cardinality on the last step

Plan hash value: 2179327633

------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |      1 |        |    320K|00:17:51.32 |      25M|     16M|       |       |          |
|   1 |  SORT GROUP BY               |                   |      1 |      1 |    320K|00:17:51.32 |      25M|     16M|    73M|  6303K|   65M (0)|
|*  2 |   FILTER                     |                   |      1 |        |   1579K|00:19:42.92 |      25M|     16M|       |       |          |
|   3 |    NESTED LOOPS              |                   |      1 |    234 |   1579K|00:19:40.24 |      25M|     16M|       |       |          |
|*  4 |     HASH JOIN                |                   |      1 |    234 |   5446K|00:10:22.24 |      16M|     16M|  1452K|  1452K|  419K (0)|
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |       |       |          |
|   6 |      PARTITION RANGE ITERATOR|                   |      1 |   1948 |     20M|00:21:33.34 |      16M|     16M|       |       |          |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |      1 |   1948 |     20M|00:20:59.31 |      16M|     16M|       |       |          |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |   5446K|      1 |   1579K|00:00:54.86 |    9518K|   3161 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   4 - access("TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID")
   5 - access("MARKET"."MARKETID"=:B1)
   7 - filter(("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7
              AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND "TABLE_2.NTERVAL"."TABLE_2.D">0))
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=:B1)

Note
-----
   - cardinality feedback used for this statement

As you see cardinality feedback kicked two times with same plan generated with different cardinalities.
When you check the new 11.2.0.2 plan, when it uses nested loop cardinality suddenly drop to 1 which is a bit suspicios.

Runtime sql statistics

SQL> @sqlid c5y47k1zmrb22

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 000000075E5B37D0 00000009C5E1FEC8 2561977778  1081582          1          1          1        642         320375 633798.646 2034600.72 2034.60072   25789778   16298873          1               0
   1 000000075E5B37D0 00000009022773E0 2561977778  1081582          1          1          1        643         320375 607056.710 1424804.27 1424.80427   25136188   16034118          1               0
   2 000000075E5B37D0 0000000731957330 2179327633  1082314          1          1          1        642         320375 516728.448 1284256.36 1284.25636   25789785   16266663          1               0
   3 000000075E5B37D0 00000008735DB1A8 2179327633  1082314          1          1          1        642         320375 512717.052 1315325.55 1315.32555   25789675   16266877          1               0

Clearly 11.2.0.1 has got better response times. Because the runtime is too long better try our chance if the behaviour is same for explain plan with literals as well

Build the explain plan test case

SQL> @build_optimizer_features_test_e
Enter value for base_optimizer_version: 11.2.0.1

Lets run the test

SQL> @optimizer_features_test_e
SQL> alter session set optimizer_features_enable='11.2.0.2';

Session altered.

SQL> set echo off

Explained.

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2561977778

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|   1 |  SORT GROUP BY               |                   |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|*  2 |   FILTER                     |                   |       |       |            |          |       |       |
|   3 |    NESTED LOOPS              |                   |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|   4 |     NESTED LOOPS             |                   |     1 |    64 |  1081K  (1)| 01:44:11 |       |       |
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |     1 |     9 |     2   (0)| 00:00:01 |       |       |
|   6 |      PARTITION RANGE ITERATOR|                   |     1 |    55 |  1081K  (1)| 01:44:11 |   KEY |   KEY |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |     1 |    55 |  1081K  (1)| 01:44:11 |   KEY |   KEY |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |     1 |    11 |     2   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   5 - access("MARKET"."MARKETID"=14544)
   7 - filter("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND
              "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7 AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND
              "TABLE_2.NTERVAL"."TABLE_2.D">0 AND "TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID")
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=14544)

25 rows selected.

SQL> alter session set optimizer_features_enable='11.2.0.1';

Session altered.

SQL> set echo off

Explained.

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2179327633

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |     1 |    75 |  1082K  (1)| 01:44:15 |       |       |
|   1 |  SORT GROUP BY               |                   |     1 |    75 |  1082K  (1)| 01:44:15 |       |       |
|*  2 |   FILTER                     |                   |       |       |            |          |       |       |
|   3 |    NESTED LOOPS              |                   |   234 | 17550 |  1082K  (1)| 01:44:15 |       |       |
|*  4 |     HASH JOIN                |                   |   234 | 14976 |  1081K  (1)| 01:44:12 |       |       |
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |     1 |     9 |     2   (0)| 00:00:01 |       |       |
|   6 |      PARTITION RANGE ITERATOR|                   |  1948 |   104K|  1081K  (1)| 01:44:12 |   KEY |   KEY |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |  1948 |   104K|  1081K  (1)| 01:44:12 |   KEY |   KEY |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |     1 |    11 |     2   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   4 - access("TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID")
   5 - access("MARKET"."MARKETID"=14544)
   7 - filter("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND
              "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7 AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND
              "TABLE_2.NTERVAL"."TABLE_2.D">0)
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=14544)

26 rows selected.

Plans behaviour is same so there is no need for me to run it for every bugfix better I use explain plan testing

Its time to build the bugfix test

SQL> @build_optimizer_bugfix_test_e
Enter value for optimizer_feature_version: 11.2.0.2

Time to rock

SQL> @optimizer_bugfix_test_e		=>output is truncated and there is break on plan_hash_value

PLAN_HASH_VALUE          BUGFIX
-------------------- ----------
2179327633              7641601
2561977778              4926618
                        5982893
                        6086930
                        6236862
                        6408301
                        6472966
			.......
			.......
			.......

103 rows selected.

As you see bugfix 7641601 is the guilty guy for us and we found this info under 1 minute. Once we disable it it reverts back to the 11.2.0.1 plan

Lets see what this bug is. Bugfixg is a script where I check bugfix definition from v$system_fix_control

SQL> @bugfixg
Enter value for bugno: 7641601

     BUGNO      VALUE SQL_FEATURE               DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE       EVENT IS_DEFAULT
---------- ---------- ------------------------- ---------------------------------------------------------------- ------------------------- ---------- ----------
   7641601          1 QKSFM_CARDINALITY_7641601 treat a single value column as having a frequency histogram      11.2.0.2                           0          1

This one looks like a new very interesting bugfix which needs more research from Jonathan Lewis, I don’t feel that much comfortable to understand how it works :)

One more thing to prove how enabling disabling the bugfix works

SQL> alter session set "_fix_control"='7641601:OFF';

Session altered.

SQL> @test_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2179327633

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |     1 |    75 |  1082K  (1)| 01:44:15 |       |       |
|   1 |  SORT GROUP BY               |                   |     1 |    75 |  1082K  (1)| 01:44:15 |       |       |
|*  2 |   FILTER                     |                   |       |       |            |          |       |       |
|   3 |    NESTED LOOPS              |                   |   234 | 17550 |  1082K  (1)| 01:44:15 |       |       |
|*  4 |     HASH JOIN                |                   |   234 | 14976 |  1081K  (1)| 01:44:12 |       |       |
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |     1 |     9 |     2   (0)| 00:00:01 |       |       |
|   6 |      PARTITION RANGE ITERATOR|                   |  1948 |   104K|  1081K  (1)| 01:44:12 |   KEY |   KEY |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |  1948 |   104K|  1081K  (1)| 01:44:12 |   KEY |   KEY |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |     1 |    11 |     2   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   4 - access("TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID")
   5 - access("MARKET"."MARKETID"=14544)
   7 - filter("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND
              "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7 AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND
              "TABLE_2.NTERVAL"."TABLE_2.D">0)
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=14544)

26 rows selected.

SQL> alter session set "_fix_control"='7641601:ON';

Session altered.

SQL> @test_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2561977778

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|   1 |  SORT GROUP BY               |                   |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|*  2 |   FILTER                     |                   |       |       |            |          |       |       |
|   3 |    NESTED LOOPS              |                   |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|   4 |     NESTED LOOPS             |                   |     1 |    64 |  1081K  (1)| 01:44:11 |       |       |
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |     1 |     9 |     2   (0)| 00:00:01 |       |       |
|   6 |      PARTITION RANGE ITERATOR|                   |     1 |    55 |  1081K  (1)| 01:44:11 |   KEY |   KEY |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |     1 |    55 |  1081K  (1)| 01:44:11 |   KEY |   KEY |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |     1 |    11 |     2   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   5 - access("MARKET"."MARKETID"=14544)
   7 - filter("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND
              "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7 AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND
              "TABLE_2.NTERVAL"."TABLE_2.D">0 AND "TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID")
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=14544)

25 rows selected.

I think research for this particular sql is now finished. We found out the problem. We have enough info to stabilize this sql. Rest is checking the effect of statistics with
dynamic sampling and also raising the issue with Oracle which I will also explain later.

I want to share another example with the bugfix I mentioned in this post.

I will skip the initial steps and paste only the output. The query normally runs under 3 seconds in worst condition and has got many binds so I would like to use
build_optimizer_bugfix_test instead of explain plan.

Different optimizer behavior

SQL> @optimizer_features_test
SQL> alter session set optimizer_features_enable='11.2.0.2';

Session altered.

SQL> set echo off
timing for: time_for_ofe_11.2.0.2
Elapsed: 00:00:03.01

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  64xn5ccrd1f9n, child number 0
-------------------------------------

Plan hash value: 2063817604

---------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                             |      1 |        |     50 |00:00:00.09 |    5308 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TABLE_1_XXXXXXXXXXXX        |      1 |     12 |     50 |00:00:00.09 |    5308 |
|*  2 |   INDEX SKIP SCAN           | IDX_TABLE_1_XXXXXXXXXXXX_05 |      1 |     12 |     50 |00:00:00.45 |    5272 |
---------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("TABLE_1_XXXXXXXXXXXX"."SOURCEID"=:N51)
       filter(("TABLE_1_XXXXXXXXXXXX"."SOURCEID"=:N51 AND
              INTERNAL_FUNCTION("TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID")))

32 rows selected.

Elapsed: 00:00:00.04
SQL> alter session set optimizer_features_enable='11.2.0.1';

Session altered.

Elapsed: 00:00:00.01
Session altered.

Elapsed: 00:00:00.01
SQL> set echo off
timing for: time_for_ofe_11.2.0.1
Elapsed: 00:00:00.11

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  64xn5ccrd1f9n, child number 1
-------------------------------------

Plan hash value: 1927123218

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                         |      1 |        |     50 |00:00:00.01 |     113 |
|   1 |  INLIST ITERATOR             |                         |      1 |        |     50 |00:00:00.01 |     113 |
|   2 |   TABLE ACCESS BY INDEX ROWID| TABLE_1_XXXXXXXXXXXX    |     50 |     12 |     50 |00:00:00.01 |     113 |
|*  3 |    INDEX UNIQUE SCAN         | PK_TABLE_1_XXXXXXXXXXXX |     50 |     12 |     50 |00:00:00.01 |      78 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access((("TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N1 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N2 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N3 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N4 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N5 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N6 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N7 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N8 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N9 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N10 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N11 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N12 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N13 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N14 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N15 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N16 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N17 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N18 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N19 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N20 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N21 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N22 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N23 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N24 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N25 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N26 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N27 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N28 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N29 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N30 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N31 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N32 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N33 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N34 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N35 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N36 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N37 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N38 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N39 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N40 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N41 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N42 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N43 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N44 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N45 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N46 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N47 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N48 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N49 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N50)) AND
              "TABLE_1_XXXXXXXXXXXX"."SOURCEID"=:N51)

SQL> @build_optimizer_bugfix_test
Enter value for optimizer_feature_version: 11.2.0.2
SQL> @optimizer_bugfix_test

PLAN_HASH_VALUE          BUGFIX
-------------------- ----------
1927123218              8893626
2063817604              4926618
                        5982893
                        6086930
                        6236862
                        6408301
                        6472966
                        6754080
                        7215982
                        7277732
                        7284269
                        7641601
			.......
			.......
			.......
			.......

103 rows selected.

In under 5 minutes I found out which bugfix is causing the issue

SQL> @bugfixg
Enter value for bugno: 8893626

     BUGNO      VALUE SQL_FEATURE               DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE       EVENT IS_DEFAULT
---------- ---------- ------------------------- ---------------------------------------------------------------- ------------------------- ---------- ----------
   8893626          1 QKSFM_ACCESS_PATH_8893626 apply index filter selectivity during skip scan costing          11.2.0.2                           0          1

In summary it is not black magic to spot what is causing the plan behaviour change when it comes to bugfixes.
Once you find out you are ready to take control of your plans again. Once you have this valuable information you can get faster responses from Oracle Support on your issue as well.
You will also look cool to your developers when

You can ask, what if my plan is not changing only on 1 bugfix? In that case I would check for the plan I am looking for
and if it exists I still assume thats my bugfix if it does not exist I will follow the upcoming steps.

I will post one more example for the issue I explained above when very old bugfixes suddenly appear on new version.

References Used:
Init.ora Parameter “_FIX_CONTROL” [Hidden] Reference Note [ID 567171.1];

Scripts Used:
All in one zip file

build_optimizer_bugfix_test.sql
build_optimizer_bugfix_test_e.sql
bugfixg.sql

<-Previous Post in Series ——————————-   Next Post in Series–>

Older Posts »

Theme: Silver is the New Black. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 415 other followers