Coskan’s Approach to Oracle

August 15, 2011

What happens, when you did not do a load test before release upgrades ?

Filed under: Performance — coskan @ 11:55 pm

This is what happened on 11.2.0.2 upgrade from 10.2.0.5

Picture is taken from production (tool used is for this picture is my recent favourite Database Time Monitor from Dominic Giles – creator of swingbench) today, which was upgraded to 11.2.0.2 on last saturday.

What happens here is I think self-explanatory, our system and underlying I/O is hammered with direct path reads.

Explaining this will not satisfy the customer so question is Why it is happening ?

1- It is obvious that upgrade was not tested good enough

2- Enough research about behaviour change has not been made for 10g to 11g migration

All these are political answers, real technical answer is below

Once I checked the system there were roughly 9 distinct sqlids running with many sessions,  all waiting on the same event.

The sqls were more or less  centralised on querying Users table with filtering email where there is no index and there has never been an index.  Why wasn’t system suffering from this lack of index when we are on 10GR2?

It is because 11G is now using direct path reads whenever it can for full table scans  (for more information check the references at the bottom of this table). This means when we run this query and if the table is eligible for direct path reads it will use direct path reads from disk. When you query with 2-3 sessions this is ok but when you query with 20 sessions then it becomes disaster like here and your disk run queue gets bigger and bigger

What was happening on 10GR2? Because Oracle was using scattered reads and cache when it can we were doing more logical I/O without hammering disk subsystem and this contention never revealed and because this application is third party  no index was created by us even there was unnecessary logical I/O.

Because we only functionality and vendor certifies the production on 11G we just upgraded the database.

Let me replicate the issue with simple test case on 11G by first with default direct path read behaviour and then reverting back to pre 11G behaviour by setting 10949 event at system level (ref: Dion Cho)

Test Case

—Running 20 sessions of simple select each for 10 times. Total 200 runs. (Idea taken from -Andrey S. Nikolaev -)


create table test_direct as select * from dba_source;

---single_thread.sql --- 20 lines of the code below
select  * from test_directpath2 where name like 'TEST%';

---multi_runner.sql
set echo off
set verify off
set head off:
define parallelism="&1"

spool many_threads.sql
select 'host sqlplus hr/hr @single_thread.sql &'
from dba_objects where rownum <= ¶llelism;
spool off
@many_threads.sql

---Running
@multi_runner 20

Default 11G


Pre-10G behaviour

during the issue-Only slight increase during the caching then all LIO

After the issue happened – It lasted only 1 minute

As you can see we have a sudden CPU increased due to high number of logical reads in short time but no long term I/O load,  once oracle caches it burns CPU instead waiting on I/O. Meanwhile I think DB was in maintenance window so resource manager kicked in to avoid hammering CPU–resmgr:cpu quantum event.

Here are the other comparisons between two runs

Session waits – 11G

HR@ORACOS> @swact

    SID USERN STATE   EVENT                                          SEQ# SEC_IN_STATE SQL_ID         CHN                        P1                        P2                        P3 P1TRANSL                                   BLCKNG_SID BLCKNG_SID_S BLCKNG_INST
------- ----- ------- ---------------------------------------- ---------- ------------ ------------- ---- ------------------------- ------------------------- ------------------------- ------------------------------------------ ---------- ------------ -----------
     15 HR    WAITING direct path read                                606            0 ba40q4n8w1myy    0 file number= 10           first dba= 493120         block cnt= 64                                                                   UNKNOWN
    134 HR    WAITING direct path read                                444            0 ba40q4n8w1myy    0 file number= 10           first dba= 493314         block cnt= 62                                                                   UNKNOWN
     75 HR    WAITING direct path read                                483            0 ba40q4n8w1myy    0 file number= 10           first dba= 494656         block cnt= 64                                                                   UNKNOWN
    200 HR    WAITING direct path read                                491            0 ba40q4n8w1myy    0 file number= 10           first dba= 494912         block cnt= 64                                                                   UNKNOWN
    196 HR    WAITING direct path read                                493            0 ba40q4n8w1myy    0 file number= 10           first dba= 494912         block cnt= 64                                                                   UNKNOWN
     12 HR    WAITING direct path read                                504            0 ba40q4n8w1myy    0 file number= 10           first dba= 495234         block cnt= 62                                                                   UNKNOWN
    132 HR    WAITING direct path read                                507            0 ba40q4n8w1myy    0 file number= 10           first dba= 495234         block cnt= 62                                                                   UNKNOWN
     13 HR    WAITING direct path read                                515            0 ba40q4n8w1myy    0 file number= 10           first dba= 495617         block cnt= 63                                                                   UNKNOWN
    201 HR    WAITING direct path read                                532            0 ba40q4n8w1myy    0 file number= 10           first dba= 496192         block cnt= 64                                                                   UNKNOWN
    137 HR    WAITING direct path read                                547            0 ba40q4n8w1myy    0 file number= 10           first dba= 496704         block cnt= 64                                                                   UNKNOWN
    136 HR    WAITING direct path read                                453            0 ba40q4n8w1myy    0 file number= 13           first dba= 100032         block cnt= 64                                                                   UNKNOWN
    202 HR    WAITING direct path read                                464            0 ba40q4n8w1myy    0 file number= 13           first dba= 100354         block cnt= 62                                                                   UNKNOWN
    135 HR    WAITING direct path read                                472            0 ba40q4n8w1myy    0 file number= 13           first dba= 100610         block cnt= 62                                                                   UNKNOWN
    199 HR    WAITING direct path read                                485            0 ba40q4n8w1myy    0 file number= 13           first dba= 101056         block cnt= 64                                                                   UNKNOWN
     71 HR    WAITING direct path read                                502            0 ba40q4n8w1myy    0 file number= 13           first dba= 101506         block cnt= 62                                                                   UNKNOWN
     16 HR    WAITING direct path read                                505            0 ba40q4n8w1myy    0 file number= 13           first dba= 101696         block cnt= 64                                                                   UNKNOWN
     73 HR    WAITING direct path read                                521            0 ba40q4n8w1myy    0 file number= 13           first dba= 102208         block cnt= 64                                                                   UNKNOWN
    133 HR    WAITING direct path read                                591            0 ba40q4n8w1myy    0 file number= 13           first dba= 98946          block cnt= 62                                                                   UNKNOWN
     74 HR    WAITING direct path read                                432            0 ba40q4n8w1myy    0 file number= 13           first dba= 99330          block cnt= 62                                                                   UNKNOWN
     72 HR    WAITING direct path read                                608            0 ba40q4n8w1myy    0 file number= 13           first dba= 99520          block cnt= 64                                                                   UNKNOWN
    192 HR    WORKING On CPU / runqueue                                43            0 90sp00nt0sspn    0 driver id=                #bytes= 1                 0                                                                               NOT IN WAIT
                                                                                                          0x0000000054435000

21 rows selected.

Session waits – Pre 11G


HR@ORACOS> @swact

    SID USERN STATE   EVENT                                          SEQ# SEC_IN_STATE SQL_ID         CHN                        P1                        P2                        P3 P1TRANSL                                   BLCKNG_SID BLCKNG_SID_S BLCKNG_INST
------- ----- ------- ---------------------------------------- ---------- ------------ ------------- ---- ------------------------- ------------------------- ------------------------- ------------------------------------------ ---------- ------------ -----------
    136       WAITING jobq slave wait                                  10            0                  0 0                         0                         0                                                                               NO HOLDER
     20       WAITING jobq slave wait                                  10            0                  0 0                         0                         0                                                                               NO HOLDER
     23 HR    WAITING resmgr:cpu quantum                               53            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
     25 HR    WAITING resmgr:cpu quantum                               60            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
     22 HR    WAITING resmgr:cpu quantum                               53            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    210 HR    WAITING resmgr:cpu quantum                               50            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
     14 HR    WAITING resmgr:cpu quantum                               56            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    208 HR    WAITING resmgr:cpu quantum                               53            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    206 HR    WAITING resmgr:cpu quantum                               54            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    144 HR    WAITING resmgr:cpu quantum                               58            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    141 HR    WAITING resmgr:cpu quantum                               54            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    140 HR    WAITING resmgr:cpu quantum                               50            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
     74 HR    WAITING resmgr:cpu quantum                               55            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
     81 HR    WAITING resmgr:cpu quantum                               56            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
     85 HR    WAITING resmgr:cpu quantum                               56            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    132 HR    WAITING resmgr:cpu quantum                               54            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
    137 HR    WAITING resmgr:cpu quantum                               57            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             UNKNOWN
     73 HR    WORKING On CPU / runqueue                               506            0 90sp00nt0sspn    0 driver id=                #bytes= 1                 0                                                                               NOT IN WAIT
                                                                                                          0x0000000054435000

     80 HR    WORKING On CPU / runqueue                                56            0 ba40q4n8w1myy    0 driver id=                #bytes= 1                 0                                                                               NOT IN WAIT
                                                                                                          0x0000000062657100

    199 HR    WORKING On CPU / runqueue                                52            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             NOT IN WAIT
     82 HR    WORKING On CPU / runqueue                                57            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             NOT IN WAIT
     13 HR    WORKING On CPU / runqueue                                56            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             NOT IN WAIT
    205 HR    WORKING On CPU / runqueue                                54            0 ba40q4n8w1myy    0 location= 3               = 0                       = 0                                                                             NOT IN WAIT

23 rows selected.

SQL Stats -Pre-11G


HR@ORACOS> @sqlid ba40q4n8w1myy

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
ba40q4n8w1myy  297848798 select  * from test_directpath2 where name like 'TEST%'

       INS  CH#   UEX  PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS
---------- ----  ---- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ----------
         1    0     0 3436878245     2147        200          1        200        200              0      52000  5333438.4  26.667192    2103470    2102605

SQL Stats -Pre-11G  –Check Elapsed times and PIO difference


HR@ORACOS> @sqlid ba40q4n8w1myy

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
ba40q4n8w1myy  297848798 select  * from test_directpath2 where name like 'TEST%'

       INS  CH#  UEX  PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS
---------- ---- ---- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ----------
         1    0    0 3436878245     2147        200          2        200        200              0      16120 102802.028  .51401014    2104800      10513

HR@ORACOS>

Snapper -11G


HR@ORACOS> @sn 10 15
Sampling...

-- Session Snapper v3.10 by Tanel Poder @ E2SN ( http://tech.e2sn.com )

----------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                               ,         DELTA, HDELTA/SEC,    %TIME, GRAPH
----------------------------------------------------------------------------------------------------------------------
     15, HR        , STAT, session logical reads                   ,          3338,      333.8,    ===>check the difference
     15, HR        , STAT, user I/O wait time                      ,          1040,        104,
     15, HR        , STAT, non-idle wait time                      ,          1039,      103.9,
     15, HR        , STAT, non-idle wait count                     ,            53,        5.3,
     15, HR        , STAT, physical read total IO requests         ,            54,        5.4,
     15, HR        , STAT, physical read total multi block requests,            54,        5.4,
     15, HR        , STAT, physical read total bytes               ,      27869184,      2.79M,    ===>check the difference
     15, HR        , STAT, cell physical IO interconnect bytes     ,      27869184,      2.79M,
     15, HR        , STAT, consistent gets                         ,          3338,      333.8,
     15, HR        , STAT, consistent gets direct                  ,          3338,      333.8,
     15, HR        , STAT, physical reads                          ,          3338,      333.8,
     15, HR        , STAT, physical reads direct                   ,          3338,      333.8,
     15, HR        , STAT, physical read IO requests               ,            53,        5.3,
     15, HR        , STAT, physical read bytes                     ,      27344896,      2.73M,
     15, HR        , STAT, file io wait time                       ,          1010,        101,
     15, HR        , STAT, Number of read IOs issued               ,            53,        5.3,
     15, HR        , STAT, no work - consistent read gets          ,          3338,      333.8,
     15, HR        , STAT, table scan rows gotten                  ,        164188,     16.42k,
     15, HR        , STAT, table scan blocks gotten                ,          3338,      333.8,
     15, HR        , TIME, DB CPU                                  ,         70000,        7ms,      .7%, |@         |
     15, HR        , TIME, sql execute elapsed time                ,      10107298,      1.01s,   101.1%, |@@@@@@@@@@|
     15, HR        , TIME, DB time                                 ,      10107298,      1.01s,   101.1%, |@@@@@@@@@@|
     15, HR        , WAIT, direct path read                        ,       9535205,   953.52ms,    95.4%, |@@@@@@@@@@|
--  End of Stats snap 1, end=2011-08-16 00:02:14, seconds=10

-----------------------------------------------------------------------
Active% | SQL_ID          | EVENT                     | WAIT_CLASS
-----------------------------------------------------------------------
    99% | ba40q4n8w1myy   | direct path read          | User I/O
     1% | ba40q4n8w1myy   | ON CPU                    | ON CPU

--  End of ASH snap 1, end=2011-08-16 00:02:14, seconds=10, samples_taken=99

Snapper – Pre-11G  –check logical I/O per second 


HR@ORACOS> @sn 30 192
Sampling...

-- Session Snapper v3.10 by Tanel Poder @ E2SN ( http://tech.e2sn.com )

----------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                               ,         DELTA, HDELTA/SEC,    %TIME, GRAPH
----------------------------------------------------------------------------------------------------------------------
    192, HR        , STAT, opened cursors cumulative               ,            77,       2.57,
    192, HR        , STAT, opened cursors current                  ,            -1,       -.03,
    192, HR        , STAT, user calls                              ,           116,       3.87,
    192, HR        , STAT, pinned cursors current                  ,             1,        .03,
    192, HR        , STAT, session logical reads                   ,        406695,     13.56k,	  ===>check the difference
    192, HR        , STAT, CPU used when call started              ,           300,         10,
    192, HR        , STAT, CPU used by this session                ,           300,         10,
    192, HR        , STAT, DB time                                 ,          1926,       64.2,
    192, HR        , STAT, concurrency wait time                   ,             6,         .2,
    192, HR        , STAT, scheduler wait time                     ,          1471,      49.03,
    192, HR        , STAT, non-idle wait time                      ,          1479,       49.3,
    192, HR        , STAT, non-idle wait count                     ,           201,        6.7,
    192, HR        , STAT, enqueue requests                        ,             1,        .03,
    192, HR        , STAT, enqueue releases                        ,             1,        .03,
    192, HR        , STAT, consistent gets                         ,        406695,     13.56k,
    192, HR        , STAT, consistent gets from cache              ,        406695,     13.56k,
    192, HR        , STAT, consistent gets from cache (fastpath)   ,        406695,     13.56k,
    192, HR        , STAT, calls to kcmgcs                         ,           426,       14.2,
    192, HR        , STAT, calls to get snapshot scn: kcmgss       ,            39,        1.3,
    192, HR        , STAT, no work - consistent read gets          ,        406269,     13.54k,
    192, HR        , STAT, table scans (long tables)               ,            39,        1.3,
    192, HR        , STAT, table scan rows gotten                  ,      24488652,    816.29k,
    192, HR        , STAT, table scan blocks gotten                ,        406269,     13.54k,
    192, HR        , STAT, session cursor cache hits               ,            76,       2.53,
    192, HR        , STAT, parse count (total)                     ,            77,       2.57,
    192, HR        , STAT, execute count                           ,            77,       2.57,
    192, HR        , STAT, bytes sent via SQL*Net to client        ,         24169,     805.63,
    192, HR        , STAT, bytes received via SQL*Net from client  ,         19066,     635.53,
    192, HR        , STAT, SQL*Net roundtrips to/from client       ,            76,       2.53,
    192, HR        , TIME, parse time elapsed                      ,          2510,    83.67us,      .0%, |          |
    192, HR        , TIME, PL/SQL execution elapsed time           ,          4570,   152.33us,      .0%, |          |
    192, HR        , TIME, DB CPU                                  ,       3160000,   105.33ms,    10.5%, |@@        |
    192, HR        , TIME, sql execute elapsed time                ,      19257156,   641.91ms,    64.2%, |@@@@@@@   |
    192, HR        , TIME, DB time                                 ,      19757917,    658.6ms,    65.9%, |@@@@@@@   |
    192, HR        , WAIT, Disk file operations I/O                ,            38,     1.27us,      .0%, |          |
    192, HR        , WAIT, latch: cache buffers chains             ,         68244,     2.27ms,      .2%, |          |
    192, HR        , WAIT, resmgr:cpu quantum                      ,      14801390,   493.38ms,    49.3%, |@@@@@     |
    192, HR        , WAIT, SQL*Net message to client               ,           309,     10.3us,      .0%, |          |
    192, HR        , WAIT, SQL*Net message from client             ,       9924100,    330.8ms,    33.1%, |@@@@      |
    192, HR        , WAIT, events in waitclass Other               ,         11340,      378us,      .0%, |          |
--  End of Stats snap 1, end=2011-08-16 00:20:32, seconds=30

-----------------------------------------------------------------------
Active% | SQL_ID          | EVENT                     | WAIT_CLASS
-----------------------------------------------------------------------
    43% | ba40q4n8w1myy   | resmgr:cpu quantum        | Scheduler
    20% | ba40q4n8w1myy   | ON CPU                    | ON CPU

--  End of ASH snap 1, end=2011-08-16 00:20:32, seconds=30, samples_taken=87

I/O Stats – 11G–high runque size

coskan@coskan-lucid:~$ sar -d 1
Linux 2.6.32-33-generic (coskan-lucid) 	16/08/11 	_x86_64_	(4 CPU)

00:00:55          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
00:00:56       dev8-0    132.00 127200.00    104.00    964.42     21.18    130.15      7.58    100.00
00:00:56      dev8-16      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:00:56      dev8-32    105.00   1680.00     34.00     16.32      0.68      6.38      6.00     63.00
00:00:56      dev8-48      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:00:56      dev8-64     84.00   3240.00   8400.00    138.57     79.66    109.88      9.64     81.00

00:00:56          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
00:00:57       dev8-0    131.00 128320.00     64.00    980.03     22.76    136.87      7.63    100.00
00:00:57      dev8-16      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:00:57      dev8-32    158.00   2640.00      0.00     16.71      0.79      5.00      5.00     79.00
00:00:57      dev8-48      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:00:57      dev8-64     56.00      0.00  13440.00    240.00    144.52   1049.64     17.86    100.00

I/O Stats – PRe-11G  —no runque


00:11:57          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
00:11:58       dev8-0      2.00      0.00     32.00     16.00      0.03     15.00     15.00      3.00
00:11:58      dev8-16      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:11:58      dev8-32     10.00    160.00     64.00     22.40      0.04      4.00      4.00      4.00
00:11:58      dev8-48      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:11:58      dev8-64    275.00   2464.00   4736.00     26.18    144.05    193.24      3.64    100.00

00:11:58          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
00:11:59       dev8-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:11:59      dev8-16      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:11:59      dev8-32      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:11:59      dev8-48      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
00:11:59      dev8-64    440.00   3344.00   2984.00     14.38    146.16    295.02      2.27    100.00

There are two  solutions in this case one is  creating an index to avoid full table scan or disabling direct path read at system level which is not right at all.
Moral Of the story is TEST YOUR SYSTEM WITH SAME/SIMILAR LOAD BEFORE YOU UPGRADE !!!!!

Being certified on a platform won’t mean it will run fine
If you havent read already, these are reference posts about direct path reads on 11GR2

http://blog.tanelpoder.com/2011/08/11/full-scans-direct-path-reads-and-ora-8103-error-hacking-session-video-here-plus-itunes-podcast-address-2/

http://dioncho.wordpress.com/2009/07/21/disabling-direct-path-read-for-the-serial-full-table-scan-11g/

http://afatkulin.blogspot.com/2009/01/11g-adaptive-direct-path-reads-what-is.html

http://oraclue.com/2009/07/17/direct-path-reads-and-serial-table-scans-in-11g/

http://oracledoug.com/serendipity/index.php?/archives/1321-11g-and-direct-path-reads.html

August 13, 2011

dbms_xplan.display_sql_plan_baseline can also lie !

Filed under: Bugs, CBO, Performance, Plan Stability — coskan @ 6:40 pm

This blog post is about another buggy/funny/misleading behaviour which you MUST know if you are using baselines ( especially in a non-controlled fashion by let oracle manage them
) It looks like a long post but it is actually an easy read since nothing rocket science in the post.

All story started when one of our batches was not using the plan stored in the sql plan baseline repository.
In these cases app support check if the baseline is used by checking v$sql.sql_plan_baseline value.

If it is not used then they check the enabled, accepted baselines by using the statement below as in documentation

dbms_xplan.display_sql_plan_baseline(sql_handle=>’&sql_handle’,plan_name=>’&sql_plan’, format=>’BASIC’));

Once they find the plan_hash_value they are looking for they try to fix it and expect Oracle use the baseline which it should use in theory.

In our case it didn’t. When I checked the plan baseline was created in september 2010 and never actually used since 14 January 2011.
Fun part is some occasions same plan hash value in the baseline is being generated but Oracle is not reporting that baseline is used. But that’s the whole point of baselines having a constant plan hash value. Why oracle is not using the plan baseline even it generates the same plan with same plan hash value

We raised the issue with Oracle and oracle come back with couple of request

1- 10053 trace
2- plan baseline trace( alter session set events ‘trace [SQL_Plan_Management.*]‘;)

We provided the requests and they said if there was a change in the table structure at some point which is causing the plan can’t be reproduced. App team confirmed that there was a structural change to drop global index and recreate it as local on actually the date when the baseline was last used.
The problem is we changed a global index to local index on 14 January 2011 with an index name change, but display_sql_plan_baseline shows us a plan which uses a local index with the current name.This cannot be possible since the baseline is create on September 2010. So what was wrong?

Support could not managed to give me a decent answer or accept that it is a bug so it is time for me to create a test case and reveal the truth why this issue is happening.


HR@ORACOS> ---simple table creation
HR@ORACOS> create table baseline_test as select * from dba_objects;

Table created.

HR@ORACOS> ---index creation
create unique index baseline_test_idx_1
  2  on baseline_test(object_id);

Index created.

HR@ORACOS> ---gather table stats
HR@ORACOS> @gts baseline_test

PL/SQL procedure successfully completed.

HR@ORACOS> HR@ORACOS>
HR@ORACOS>
HR@ORACOS> ---start session level baseline capturing process
HR@ORACOS> alter session set optimizer_capture_sql_plan_baselines=TRUE;

Session altered.

HR@ORACOS> ---run the query below several times for plan capturing happens
HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> @e1

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  910wg4cc9fbwn, child number 0
-------------------------------------
select object_name from baseline_test  where object_id=5 and
data_object_id=10

Plan hash value: 3566649846

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| BASELINE_TEST       |     1 |    32 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | BASELINE_TEST_IDX_1 |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("DATA_OBJECT_ID"=10)
   2 - access("OBJECT_ID"=5)

Note
-----
   - SQL plan baseline SQL_PLAN_g64u8gfpwnfr028052575 used for this statement

25 rows selected.

HR@ORACOS> @sql_plan_baselines2 910wg4cc9fbwn

##################
#active baselines#
##################

   INST_ID   UE SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER SQL_PLAN_BASELINE              SQL_HANDLE                     PLAN_NAME                      ENABLED ACCEPTED FIXED CREATED                   LAST_EXECUTED
---------- ---- ------------- --------------- ------------ ------------------------------ ------------------------------ ------------------------------ ------- -------- ----- ------------------------- -------------------------
         1    0 910wg4cc9fbwn      3566649846            0 SQL_PLAN_g64u8gfpwnfr028052575 SYS_SQL_f313487babca3ae0       SQL_PLAN_g64u8gfpwnfr028052575 YES     YES      NO    13-AUG-11 15.45.42.000000 13-AUG-11 15.45.42.000000

Now it is time to check the plan output


HR@ORACOS> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle=>'&sql_handle',plan_name=>'&sql_plan', format=>'BASIC'));
Enter value for sql_handle: SYS_SQL_f313487babca3ae0
Enter value for sql_plan: SQL_PLAN_g64u8gfpwnfr028052575

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_f313487babca3ae0
SQL text: select object_name from baseline_test  where object_id=5 and
          data_object_id=10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_g64u8gfpwnfr028052575         Plan id: 671425909
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

Plan hash value: 3566649846

-----------------------------------------------------------
| Id  | Operation                   | Name                |
-----------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |
|   1 |  TABLE ACCESS BY INDEX ROWID| BASELINE_TEST       |
|   2 |   INDEX UNIQUE SCAN         | BASELINE_TEST_IDX_1 |
-----------------------------------------------------------

21 rows selected.

Since we have a baseline now we can change the table index structure to replicate the issue


HR@ORACOS> ---now drop the index
HR@ORACOS> drop index baseline_test_idx_1
  2  ;

Index dropped.

HR@ORACOS> create unique index baseline_test_idx_2 on baseline_test(data_object_id,object_id);

Index created.

HR@ORACOS> ---Run it couple of times for baseline generation

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> @e1

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  910wg4cc9fbwn, child number 0
-------------------------------------
select object_name from baseline_test  where object_id=5 and
data_object_id=10

Plan hash value: 3304552195

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| BASELINE_TEST       |     1 |    32 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | BASELINE_TEST_IDX_2 |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DATA_OBJECT_ID"=10 AND "OBJECT_ID"=5)

20 rows selected.

As you can see baseline no baseline is used. Lets check if a baseline captured ?


HR@ORACOS> @sql_plan_baselines2 910wg4cc9fbwn

##################
#active baselines#
##################

   INST_ID   UE SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER SQL_PLAN_BASELINE              SQL_HANDLE                     PLAN_NAME                      ENABLED ACCEPTED FIXED CREATED                   LAST_EXECUTED
---------- ---- ------------- --------------- ------------ ------------------------------ ------------------------------ ------------------------------ ------- -------- ----- ------------------------- -------------------------
         1    0 910wg4cc9fbwn      3304552195            0                                SYS_SQL_f313487babca3ae0       SQL_PLAN_g64u8gfpwnfr028052575 YES     YES      NO    13-AUG-11 15.45.42.000000 13-AUG-11 15.45.42.000000
         1    0 910wg4cc9fbwn      3304552195            0                                SYS_SQL_f313487babca3ae0       SQL_PLAN_g64u8gfpwnfr09db5a957 YES     NO       NO    13-AUG-11 15.49.54.000000

Two baselines are captured in the system, one is accepted which is the one initially created before the index recreation but the other one is not accepted ??

Now it is time to see what the plans of these baselines

HR@ORACOS> --ACCEPTED PLAN
HR@ORACOS> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle=>'&sql_handle',plan_name=>'&sql_plan', format=>'BASIC'));
Enter value for sql_handle: SYS_SQL_f313487babca3ae0
Enter value for sql_plan: SQL_PLAN_g64u8gfpwnfr028052575

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_f313487babca3ae0
SQL text: select object_name from baseline_test  where object_id=5 and
          data_object_id=10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_g64u8gfpwnfr028052575         Plan id: 671425909
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

Plan hash value: 3304552195

-----------------------------------------------------------
| Id  | Operation                   | Name                |
-----------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |
|   1 |  TABLE ACCESS BY INDEX ROWID| BASELINE_TEST       |
|   2 |   INDEX UNIQUE SCAN         | BASELINE_TEST_IDX_2 |
-----------------------------------------------------------

21 rows selected.

HR@ORACOS> ---NON ACCEPTED NEWLY GENERATED PLAN

HR@ORACOS> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle=>'&sql_handle',plan_name=>'&sql_plan', format=>'BASIC'));
Enter value for sql_handle: SYS_SQL_f313487babca3ae0
Enter value for sql_plan: SQL_PLAN_g64u8gfpwnfr09db5a957

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_f313487babca3ae0
SQL text: select object_name from baseline_test  where object_id=5 and
          data_object_id=10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_g64u8gfpwnfr09db5a957         Plan id: 2645928279
Enabled: YES     Fixed: NO      Accepted: NO      Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

Plan hash value: 3304552195

-----------------------------------------------------------
| Id  | Operation                   | Name                |
-----------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |
|   1 |  TABLE ACCESS BY INDEX ROWID| BASELINE_TEST       |
|   2 |   INDEX UNIQUE SCAN         | BASELINE_TEST_IDX_2 |
-----------------------------------------------------------

21 rows selected.

They are same ???? How come the plan can be same with the one which has different index names ??
If you are a careful reader you might have spotted The plan_id is different with same plan hash value

Lets check with ADVANCED option instead BASIC


HR@ORACOS> ----ACCEPTED BASELINE

HR@ORACOS> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle=>'&sql_handle',plan_name=>'&sql_plan', format=>'ADVANCED'));
Enter value for sql_handle: SYS_SQL_f313487babca3ae0
Enter value for sql_plan: SQL_PLAN_g64u8gfpwnfr028052575

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_f313487babca3ae0
SQL text: select object_name from baseline_test  where object_id=5 and
          data_object_id=10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_g64u8gfpwnfr028052575         Plan id: 671425909
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Outline Data from SMB:

  /*+
      BEGIN_OUTLINE_DATA
      INDEX_RS_ASC(@"SEL$1" "BASELINE_TEST"@"SEL$1" ("BASELINE_TEST"."OBJECT_ID"))
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('11.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */
--------------------------------------------------------------------------------

Plan hash value: 3304552195

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |     1 |    32 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BASELINE_TEST       |     1 |    32 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | BASELINE_TEST_IDX_2 |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1 / BASELINE_TEST@SEL$1
   2 - SEL$1 / BASELINE_TEST@SEL$1

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      INDEX_RS_ASC(@"SEL$1" "BASELINE_TEST"@"SEL$1" ("BASELINE_TEST"."DATA_OBJECT_ID"
              "BASELINE_TEST"."OBJECT_ID"))
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('11.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DATA_OBJECT_ID"=10 AND "OBJECT_ID"=5)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "OBJECT_NAME"[VARCHAR2,128]
   2 - "BASELINE_TEST".ROWID[ROWID,10]

68 rows selected.

Did you spot something ? Look carefully because I did not look carefully and wasted 1 day. Once you know how to get the higher detail of information better you should look carefully


HR@ORACOS>---NON ACCEPTED NEWLY GENERATED PLAN
hR@ORACOS> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle=>'&sql_handle',plan_name=>'&sql_plan', format=>'ADVANCED'));
Enter value for sql_handle: SYS_SQL_f313487babca3ae0
Enter value for sql_plan: SQL_PLAN_g64u8gfpwnfr028052575

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_f313487babca3ae0
SQL text: select object_name from baseline_test  where object_id=5 and
          data_object_id=10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_g64u8gfpwnfr028052575         Plan id: 671425909
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Outline Data from SMB:

  /*+
      BEGIN_OUTLINE_DATA
      INDEX_RS_ASC(@"SEL$1" "BASELINE_TEST"@"SEL$1" ("BASELINE_TEST"."OBJECT_ID"))
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('11.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */
--------------------------------------------------------------------------------

Plan hash value: 3304552195

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |     1 |    32 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BASELINE_TEST       |     1 |    32 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | BASELINE_TEST_IDX_2 |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1 / BASELINE_TEST@SEL$1
   2 - SEL$1 / BASELINE_TEST@SEL$1

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      INDEX_RS_ASC(@"SEL$1" "BASELINE_TEST"@"SEL$1" ("BASELINE_TEST"."DATA_OBJECT_ID"
              "BASELINE_TEST"."OBJECT_ID"))
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('11.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DATA_OBJECT_ID"=10 AND "OBJECT_ID"=5)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "OBJECT_NAME"[VARCHAR2,128]
   2 - "BASELINE_TEST".ROWID[ROWID,10]

68 rows selected.

If you haven’t spotted already, the problem with the accepted plan output is the difference between OUTLINE DATA FROM SMB and OUTLINE DATA sections.
There are two outline sections per baseline output. one is OUTLINE DATA FROM SMB and other is OUTLINE DATA

Now you can do line by line comparison and see the difference.

Outline Data from SMB:

/*+
BEGIN_OUTLINE_DATA
INDEX_RS_ASC(@”SEL$1″ “BASELINE_TEST”@”SEL$1″ (“BASELINE_TEST”.”OBJECT_ID”))
OUTLINE_LEAF(@”SEL$1″)
ALL_ROWS
DB_VERSION(’11.2.0.1′)
OPTIMIZER_FEATURES_ENABLE(’11.2.0.1′)
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/

Outline Data
————-

/*+
BEGIN_OUTLINE_DATA
INDEX_RS_ASC(@”SEL$1″ “BASELINE_TEST”@”SEL$1″ (“BASELINE_TEST”.”DATA_OBJECT_ID”
“BASELINE_TEST”.”OBJECT_ID”))
OUTLINE_LEAF(@”SEL$1″)
ALL_ROWS
DB_VERSION(’11.2.0.1′)
OPTIMIZER_FEATURES_ENABLE(’11.2.0.1′)
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/

What happens here is that Oracle only stores the outline information and when you run the display_sql_plan_baseline
Oracle tries to generate the plan output from the outline but problem starts there because it tries to use the index which is not existent on the system anymore so it can’t reproduce the plan and instead of giving any warning or error it generates a plan for you by lying you.

To confirm what we said lets drop the secondary index and see what dbms_xplan.display_sql_plan_baseline will generate for us for the same baseline


HR@ORACOS> ---dropping index
HR@ORACOS>
HR@ORACOS> drop index baseline_test_idx_2;

Index dropped.

------FIRST BASELINE CREATED

HR@ORACOS> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle=>'&sql_handle',plan_name=>'&sql_plan', format=>'BASIC'));
Enter value for sql_handle: SYS_SQL_f313487babca3ae0
Enter value for sql_plan: SQL_PLAN_g64u8gfpwnfr09db5a957

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_f313487babca3ae0
SQL text: select object_name from baseline_test  where object_id=5 and
          data_object_id=10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_g64u8gfpwnfr09db5a957         Plan id: 2645928279
Enabled: YES     Fixed: NO      Accepted: NO      Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

Plan hash value: 2027741861

-------------------------------------------
| Id  | Operation         | Name          |
-------------------------------------------
|   0 | SELECT STATEMENT  |               |
|   1 |  TABLE ACCESS FULL| BASELINE_TEST |
-------------------------------------------

20 rows selected.

As you can see now it generates a new plan for us for the same baseline :) What a liar

What is the output from the valid but not accepted baseline, once we dropped the index it was using


HR@ORACOS> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle=>'&sql_handle',plan_name=>'&sql_plan', format=>'BASIC'));
Enter value for sql_handle: SYS_SQL_f313487babca3ae0
Enter value for sql_plan: SQL_PLAN_g64u8gfpwnfr028052575

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_f313487babca3ae0
SQL text: select object_name from baseline_test  where object_id=5 and
          data_object_id=10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_g64u8gfpwnfr028052575         Plan id: 671425909
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

Plan hash value: 2027741861

-------------------------------------------
| Id  | Operation         | Name          |
-------------------------------------------
|   0 | SELECT STATEMENT  |               |
|   1 |  TABLE ACCESS FULL| BASELINE_TEST |
-------------------------------------------

20 rows selected.

Same issue happened again.

Lets run the same sql to see if it generates and accepts a baseline with full table scan


HR@ORACOS> ---Lets run the sql again to force a plan
HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> select object_name from baseline_test  where object_id=5 and data_object_id=10;

no rows selected

HR@ORACOS> @e1

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  910wg4cc9fbwn, child number 0
-------------------------------------
select object_name from baseline_test  where object_id=5 and
data_object_id=10

Plan hash value: 2027741861

-----------------------------------------------------------------------------------
| Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |               |       |       |   217 (100)|          |
|*  1 |  TABLE ACCESS FULL| BASELINE_TEST |     1 |    32 |   217   (1)| 00:00:02 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("OBJECT_ID"=5 AND "DATA_OBJECT_ID"=10))

19 rows selected.

The baseline is generated as below but it is still not accepted.


HR@ORACOS> @sql_plan_baselines2 910wg4cc9fbwn

##################
#active baselines#
##################

   INST_ID   UE SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER SQL_PLAN_BASELINE              SQL_HANDLE                     PLAN_NAME                      ENABLED ACCEPTED FIXED CREATED                   LAST_EXECUTED
---------- ---- ------------- --------------- ------------ ------------------------------ ------------------------------ ------------------------------ ------- -------- ----- ------------------------- -------------------------
         1    0 910wg4cc9fbwn      2027741861            0                                SYS_SQL_f313487babca3ae0       SQL_PLAN_g64u8gfpwnfr028052575 YES     YES      NO    13-AUG-11 15.45.42.000000 13-AUG-11 15.45.42.000000
         1    0 910wg4cc9fbwn      2027741861            0                                SYS_SQL_f313487babca3ae0       SQL_PLAN_g64u8gfpwnfr09db5a957 YES     NO       NO    13-AUG-11 15.49.54.000000
         1    0 910wg4cc9fbwn      2027741861            0                                SYS_SQL_f313487babca3ae0       SQL_PLAN_g64u8gfpwnfr018837053 YES     NO       NO    13-AUG-11 15.59.15.000000

I assume You already understood the issue. This is a proper buggy/dodgy behaviour for me.

Now lets try some workaround to see how we can get over from this situation.

There are 2 ways to get around from this problem and all of them needs DBA input. I have tested all and they are working but you can go ahead and test them. I don’t want to make this blog post longer than it is so I’m only listing the options.

1- Evolve the baseline for the sql_handle so Oracle accepts the plans (will keep the invalid ones in the system)

2- Drop the invalid baselines and evolve the handle (this way you will get rid of the invalid ones)

3- Drop all baselines related with the objects altered and allow capturing to the trick for you. (Less headache than second but you may lose the good plans- still would be my approach for releases )

Even Oracle warns in documentation “SQL plan baselines cannot help in cases where an event has caused irreversible execution plan changes, such as dropping an index.” and also ” if a change in the system (such as a dropped index) causes all accepted plans to become non-reproducible, then the optimizer selects the best-cost plan. “
The did not give much information about the behaviour.

From my point of view there are two major problems which needs to be addressed for this specific table structure change case

1- Instead of showing a plan which is not same as baseline, display_sql_plan_baseline must report an error that stored baseline is not reproduce-able

2- Baselines are not invalidated there must be an invalidation flag which makes the baseline invalidated at very first time CBO can’t reproduce the plan via baseline and set enabled and accepted flag to no. Since the purging of the plans is 53 weeks by
default for non used baselines this is a needed enhancement. If a plan is accepted and not produceable with smaller cost it avoids other plans accepted.

Another problem with baselines is also covered in this clear post by Maxym Kharchenko which can be related with my first point (that very same post series from Maxym  is one of the reasons I did not continue my series since author nearly covered all -but I still have something to write) What happens if both objects are under two schemas like some development groups uses for different testing phases. One drops the index one did not and you have system level capturing on. ?

I think Oracle must address these issues by introducing another complexity for sql signatures which involves schema owner or something similar.

Long story short you might be easily fooled by baselines with the current structure. If I were you I will always check the plans with at least OUTLINE option or go for ADVANCED.

Confession:n I am always using advanced option but as usual I did not pay that much attention to the output at first point :( That was my biggest mistake there which cost 1 day for analysis. knowing and using an option but not paying attention to the output of the option.)

All tests are done on 11.2.0.1

Bonus material

For the ones who likes tracefiles this issue is shown as this

=======TRACEFILE ======

------- START SPM Plan Dump -------
SPM: failed to reproduce the plan using the following info:
  parse_schema name        : HR
  plan_baseline signature  : 17515423071774325472
  plan_baseline plan_id    : 671425909
  plan_baseline hintset    :
    hint num  1 len 37 text: OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
SPM: generated non-matching plan:
----- Explain Plan Dump -----
----- Plan Table -----

============
Plan Table
============
------------------------------------------+-----------------------------------+
| Id  | Operation          | Name         | Rows  | Bytes | Cost  | Time      |
------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT   |              |       |       |   217 |           |
| 1   |  TABLE ACCESS FULL | BASELINE_TEST|     1 |    32 |   217 |  00:00:02 |
------------------------------------------+-----------------------------------+
Predicate Information:
----------------------
1 - filter(("OBJECT_ID"=5 AND "DATA_OBJECT_ID"=10))

Content of other_xml column
===========================
  db_version     : 11.2.0.1
  parse_schema   : HR
  plan_hash      : 2027741861
  plan_hash_2    : 411267155
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
      DB_VERSION('11.2.0.1')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "BASELINE_TEST"@"SEL$1")
    END_OUTLINE_DATA
  */

------- END SPM Plan Dump -------
SPM: couldn't reproduce any enabled+accepted plan so using the cost-based plan, planId = 411267155

June 24, 2011

Plan is same, Wait profile is different ?

Filed under: Performance — coskan @ 10:42 am

Here comes the nice one, What would you check when you are on 3 node RAC where 4 different database are balanced with services and Developer of one of the DBs come and asked you why their newly deployed batch is 3 times slower on PRD comparing to QA box which was snap copied (storage snap) from production box. RAC specs are exactly same plus all parameters are same ?

PRD runtime plan


PROD> @mon_sqlid2  52dr7m8nnx5mk

DBMS_SQLTUNE.REPORT_SQL_MONITOR(SQL_ID=>'52DR7M8NNX5MK')
----------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Monitoring Report

SQL Text
------------------------------
select PKEY, ISIN from BATCH_TB where BATCH = :1 order by PKEY

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  2
 Session             :  BATCH_APP (414:13998)
 SQL ID              :  52dr7m8nnx5mk
 SQL Execution ID    :  33554695
 Execution Started   :  06/14/2011 13:25:52
 First Refresh Time  :  06/14/2011 13:25:56
 Last Refresh Time   :  06/14/2011 13:26:08
 Duration            :  16s
 Module/Action       :  JDBC Thin Client/-
 Service             :  PRD
 Program             :  JDBC Thin Client
 Fetch Calls         :  1

Binds
========================================================================================================================
| Name | Position |  Type  |                                           Value                                           |
========================================================================================================================
| :1   |        1 | NUMBER | 37755                                                                                     |
========================================================================================================================

Global Stats
=========================================================================================
| Elapsed |   Cpu   |    IO    | Application |  Other   | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) | Calls |  Gets  | Reqs | Bytes |
=========================================================================================
|      16 |    0.62 |       16 |        0.01 |     0.03 |     1 |  60425 | 3874 | 469MB |
=========================================================================================

SQL Plan Monitoring Details (Plan Hash Value=4053290152)
====================================================================================================================================================
| Id |      Operation       |   Name   |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read | Read  | Activity |    Activity Detail    |
|    |                      |          | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs | Bytes |   (%)    |      (# samples)      |
====================================================================================================================================================
|  0 | SELECT STATEMENT     |          |         |       |           |        |     1 |          |      |       |          |                       |
|  1 |   SORT ORDER BY      |          |     347 | 13216 |           |        |     1 |          |      |       |          |                       |
|  2 |    TABLE ACCESS FULL | BATCH_TB |     347 | 13215 |        17 |     +1 |     1 |        0 | 2996 | 370MB |   100.00 | direct path read (17) |
====================================================================================================================================================

QA runtime plan



QA> @mon_sqlid2 52dr7m8nnx5mk

DBMS_SQLTUNE.REPORT_SQL_MONITOR(SQL_ID=>'52DR7M8NNX5MK')
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Monitoring Report

SQL Text
------------------------------
select PKEY, ISIN from BATCH_TB where BATCH = :1 order by PKEY

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  3
 Session             :  BATCH_APP (621:853)
 SQL ID              :  52dr7m8nnx5mk
 SQL Execution ID    :  50331656
 Execution Started   :  06/14/2011 11:57:46
 First Refresh Time  :  06/14/2011 11:57:50
 Last Refresh Time   :  06/14/2011 11:57:52
 Duration            :  6s
 Module/Action       :  JDBC Thin Client/-
 Service             :  QA
 Program             :  JDBC Thin Client
 Fetch Calls         :  1

Binds
========================================================================================================================
| Name | Position |  Type  |                                           Value                                           |
========================================================================================================================
| :1   |        1 | NUMBER | 37755                                                                                     |
========================================================================================================================

Global Stats
===========================================================================
| Elapsed |   Cpu   |    IO    | Cluster  | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls |  Gets  | Reqs | Bytes |
===========================================================================
|    6.22 |    1.56 |     4.14 |     0.53 |     1 |  59621 |  966 | 107MB |
===========================================================================

SQL Plan Monitoring Details (Plan Hash Value=4053290152)
=========================================================================================================================================================
| Id |      Operation       |   Name   |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read | Read  | Activity |      Activity Detail       |
|    |                      |          | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs | Bytes |   (%)    |        (# samples)         |
=========================================================================================================================================================
|  0 | SELECT STATEMENT     |          |         |       |           |        |     1 |          |      |       |          |                            |
|  1 |   SORT ORDER BY      |          |     338 | 12544 |           |        |     1 |          |      |       |          |                            |
|  2 |    TABLE ACCESS FULL | BATCH_TB |     338 | 12543 |         6 |     +1 |     1 |        0 |  435 |  47MB |   100.00 | Cpu (2)                    |
|    |                      |          |         |       |           |        |       |          |      |       |          | db file scattered read (4) |
=========================================================================================================================================================

(I hope you like the posh sql monitoring output which I really enjoy to use. I have to admit, I wasn’t the big fan of sql monitoring on GUI-just because of the speed of OEM- till I discovered the sqlplus options like the ones above, and now I can’t stop myself from using it and it is very good alternative to trace which is expensive by all means on PRD )

From the ouput above, you can see PRD database is doing direct path reads and QA database is using db_file_scattered reads and PRD is 3 times slower for each run. Somehow oracle favours doing 11G direct path implementation for full table scan (Links for direct path Doug Burns,Alex Fatkulin, Dion Cho ) on PROD. If you read the links you can see that this has to do something with the memory the buffer cache and small table threshold (for more information about this threshold Charles Hooper-great discussion with Jonathan Lewis on comments, Tanel Poder ).

Initially when I checked the output, I saw something weird which is against our policy for this DB. We configured the user services to run on Node-2 only and somehow on QA query runs on Node-3 (instance id), I double checked this with developer and we found out that on test app server they are not using services as service name but instead they use instance name, so he is able to run query on node-3 on QA. I thought this might be the case since Oracle instance is not highly active in terms of memory usage on node-3 so Oracle might favour using expensive buffer cache reads instead of direct path reads. This was a goodish as initial shot but ended up disappointment because when I test the same query on node-2 results were again same. Wasted 5 important minutes :)

After losing 5 minutes on initial shot, now it is time to be more scientific, Lets check what the “documented” memory parameters are

===================QA Parameters ============================

QA> show parameter sga_

NAME                                 TYPE        VALUE
------------------------------------ ----------- -------------
sga_max_size                         big integer 10G
sga_target                           big integer 8G
QA> show parameter pga_

NAME                                 TYPE        VALUE
------------------------------------ ----------- -------------
pga_aggregate_target                 big integer 6435M
QA> show parameter db_cache_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- -------------
db_cache_size                        big integer 0
QA> show parameter shared_pool

NAME                                 TYPE        VALUE
------------------------------------ ----------- -------------
shared_pool_reserved_size            big integer 70464307
shared_pool_size                     big integer 0
QA> show parameter memory

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------
hi_shared_memory_address             integer     0
memory_max_target                    big integer 0
memory_target                        big integer 0
shared_memory_address                integer     0




===================PRD Parameters ============================

PROD> show parameter sga_

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------------------
sga_max_size                         big integer 10G
sga_target                           big integer 8G
PROD> show parameter pga_

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------------------
pga_aggregate_target                 big integer 6435M

PROD> show parameter db_cache_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------------------
db_cache_size                        big integer 0
PROD> show parameter shared_pool

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------------------
shared_pool_reserved_size            big integer 291923558
shared_pool_size                     big integer 0
PROD> show parameter memory

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------------------
hi_shared_memory_address             integer     0
memory_max_target                    big integer 0
memory_target                        big integer 0
shared_memory_address                integer     0
PROD>


This output told me that we have everything same in terms of static parameters, lets check what the small table threshold —> using Tanel Poders pd script for hidden parameters (needs sys access)


QA> @pd small_table_threshold

NAME                                     VALUE                          ISDEFAULT DESCRIPTION
---------------------------------------- ------------------------------ --------- -------------------------------------------------------
_small_table_threshold                   15598                          TRUE      lower threshold level of table size for direct reads


PROD> @pd small_table_threshold

NAME                                     VALUE                          ISDEFAULT DESCRIPTION
---------------------------------------- ------------------------------ --------- ------------------------------------------------------
_small_table_threshold                   5830                           TRUE      lower threshold level of table size for direct reads

Lets check if the small_table_threshold*5 (check Tanels answer on the oracle_l question ) is bigger than number of blocks of the BATCH_TBS which is probably the decision maker for favouring direct path reads

=============================QA ====================================

---number of blocks of BATCH_TB table 

QA> select blocks from dba_segments where segment_name='BATCH_TB';

    BLOCKS
----------
     60416


---5 times small_table_threshold 

QA> select 15598 * 5 from dual;

   15598*5
----------
     77990



=============================PRD ====================================

---number of blocks of BATCH_TB table 

PRD> select blocks from dba_segments where segment_name='BATCH_TB';

    BLOCKS
----------
     61440


---5 times small_table_threshold 

PRD> select 5830 * 5 from dual;

    5830*5
----------
     29150

As you can see we find the root cause but I don’t think developer really cares about the root cause. All he need is a an answer Why they hell this code is working on QA and how can we avoid this problem ?

To answer his questions we need to look at the formula of small_table_threshold which is %2 of the buffer cache size. Which means our buffer cache is bigger on QA. Lets check

=============================QA ====================================


QA> select 0.02*current_size/8192 from v$sga_dynamic_components 
  2 where component='DEFAULT buffer cache';

0.02*CURRENT_SIZE/8192
----------------------
              15482.88



QA>  select component,current_size/1024/1024 current_size_mb
  2  from v$sga_dynamic_components;

COMPONENT                   CURRENT_SIZE_MB
--------------------------- ---------------
shared pool                            1888
large pool                               32
java pool                               160
streams pool                              0
DEFAULT buffer cache                   6048
KEEP buffer cache                         0
RECYCLE buffer cache                      0
DEFAULT 2K buffer cache                   0
DEFAULT 4K buffer cache                   0
DEFAULT 8K buffer cache                   0
DEFAULT 16K buffer cache                  0
DEFAULT 32K buffer cache                  0
Shared IO Pool                            0
ASM Buffer Cache                          0

14 rows selected.


=============================PRD ====================================


PRD> select 0.02*current_size/8192 from v$sga_dynamic_components
  2  where component='DEFAULT buffer cache';

0.02*CURRENT_SIZE/8192
----------------------
                5324.8

PRD> select component,current_size/1024/1024 current_size_mb
  2  from v$sga_dynamic_components;

COMPONENT                    CURRENT_SIZE_MB
---------------------------- ---------------
shared pool                             5952
large pool                                64
java pool                                 32
streams pool                               0
DEFAULT buffer cache                    2080
KEEP buffer cache                          0
RECYCLE buffer cache                       0
DEFAULT 2K buffer cache                    0
DEFAULT 4K buffer cache                    0
DEFAULT 8K buffer cache                    0
DEFAULT 16K buffer cache                   0
DEFAULT 32K buffer cache                   0
Shared IO Pool                             0
ASM Buffer Cache                           0


As you can see %2 of buffer cache formula is true and PRD buffer cache is 3 times less than QA buffer cache.
Why this can be ? Here is the answer

QA> select count(*) from v$sql;

COUNT(*)
———-
26135

PRD> select count(*) from v$sql;

COUNT(*)
———-
68711

We are running 3 times less number of sqls on QA which leads us having 1.8GB shared pool enough on QA but not enough on PRD.
This causes shrink of buffer cache and makes us think we have visually “SAME” environment but end up realistically “DIFFERENT” environments.

My solution for the problem was creating a better index on the table which needed to be there anyway (cannot touch system wide PRD setting):)
The real and long term solution to avoid these kind of QA-PRD different problems is in my opinion having bare minimums by setting db_cache_size and shared_pool_size
and not letting Oracle to decide to shrink below real production levels.

Hope you enjoyed as much as I enjoyed to diagnose it.

!!Make sure you read all the links in this post they are all informative

Note: I needed to stop my upgrade series but will continue as soon as possible :( Really sorry if someone is waiting an input from me.

mon_sqlid2 script is like follow

 set long 10000000
 set longchunksize 10000000
 select dbms_sqltune.report_sql_monitor (sql_id=>'&1') from dual;

pd script to get hidden parameters (From Tanel Poder)

col pd_name head NAME for a40
col pd_value head VALUE for a30
column pd_descr heading DESCRIPTION format a55 word_wrap

select n.ksppinm pd_name, c.ksppstvl pd_value, ksppstdf isdefault,n.ksppdesc pd_descr
from x$ksppi n, x$ksppcv c
where n.indx=c.indx
and (
   lower(n.ksppinm) like lower('&1') 
   or lower(n.ksppdesc) like lower('&1')
);

March 4, 2011

Plan stability through Upgrade to 11G-Why is my plan changed?-extra nested loop

Filed under: Bugs, Performance, Plan Stability — coskan @ 1:30 pm

After you upgrade to 11G from lower versions you might realize that some of your plans involving nested loops are changed and you started to see extra nested loops.
According to documentation this is because of the “new” implementation for Nested Loop Joins.

Document says
Oracle Database 11g introduces a new implementation for nested loop joins. As a result, execution plans that include nested loops might appear different
than they did in previous releases of Oracle Database. Both the new implementation and the original implementation for nested loop
joins are possible in Oracle Database 11g. So, when analyzing execution plans, it is important to understand that the number of NESTED LOOPS join row sources
might be different.

Oracle Database 11g introduces a new implementation for nested loop joins to reduce overall latency for physical I/O.
When an index or a table block is not in the buffer cache and is needed to process the join, a physical I/O is required.
Oracle Database 11g can batch multiple physical I/O requests and process them using a vector I/O instead of processing them one at a time.

As part of the new implementation for nested loop joins, two NESTED LOOPS join row sources might appear in the execution plan where only one
would have appeared in prior releases. In such cases, Oracle Database allocates one NESTED LOOPS join row source to join the values from the table
on the outer side of the join with the index on the inner side. A second row source is allocated to join the result of the first join,
which includes the rowids stored in the index, with the table on the inner side of the join.

Here are the plan differences between new implementation and old one.

Query of the plan

select /*+ use_nl(t2) full(t2) gather_plan_statistics*//*test_650*/ * from t2,t1
where t1.id=t2.id
and t2.id>650000;

for 10.2.0.4


HR@ORACOS> @ofe
Enter value for ofe: 10.2.0.4

Session altered.

HR@ORACOS> @test_e

Explained.

HR@ORACOS> @e

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1708821971

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |   350K|    78M|   703K  (1)| 00:58:40 |
|   1 |  NESTED LOOPS                |           |   350K|    78M|   703K  (1)| 00:58:40 |
|*  2 |   TABLE ACCESS FULL          | T2        |   350K|    39M|  3446   (1)| 00:00:18 |
|   3 |   TABLE ACCESS BY INDEX ROWID| T1        |     1 |   118 |     2   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | T1_IDX_ID |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T2"."ID">650000)
   4 - access("T1"."ID"="T2"."ID")
       filter("T1"."ID">650000)

18 rows selected.

HR@ORACOS>

For 11.2.0.1


HR@ORACOS> @ofe
Enter value for ofe: 11.2.0.1

Session altered.

HR@ORACOS> @test_e

Explained.

HR@ORACOS> @e

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3217821761

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |   350K|    78M|   703K  (1)| 00:58:40 |
|   1 |  NESTED LOOPS                |           |       |       |            |          |
|   2 |   NESTED LOOPS               |           |   350K|    78M|   703K  (1)| 00:58:40 |
|*  3 |    TABLE ACCESS FULL         | T2        |   350K|    39M|  3446   (1)| 00:00:18 |
|*  4 |    INDEX UNIQUE SCAN         | T1_IDX_ID |     1 |       |     1   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| T1        |     1 |   118 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("T2"."ID">650000)
   4 - access("T1"."ID"="T2"."ID")
       filter("T1"."ID">650000)

19 rows selected.

According to me, this change brought extra complexity to read the plans. NLJ plans prior to 11G are way easier to read and bear in mind this example is just a simple 2 table join.
Once you query has got crazy number of joins it gets harder and harder to read. One thing you can spot is that the cost is “same” which is very good for plan change analysis.
According to my experience, %99 of same cost plan changes are because of this implementation and this made me exclude same cost-different plan_hash_value plans from comparisons.

On the last post, I already gave some information about the controlling hidden CBO parameter _nlj_batching_enabled.
Better I repeat here, this parameter has 2 values, 0 to disable the behaviour, 1 (which is default) to enable the behavior and this parameter is there since 11GR1.

When I interpret the sentence “new implementation for nested loop joins to reduce overall latency for physical I/O”. I expect some improvement with runtime like the effect of bulk collect or changing the array size of your sessions. But is my assumption true ?
I have doubts about this definition because in reality I do not feel any difference as interpreted from the official definition.
Nothing got faster in our production environment related with this change.

To prove my experience, I build a test case which you can find in the bottom of the page.

My test environment is 11.2.0.1 on Ubuntu with 1.5G memory target on a machine with 4 cores.

For this test I have 3 tables all of which have ID columns and have an index on ID columns

T1=100 million rows (I think I was too generous)
T2=1 million rows
T3=50K rows -derived from T2

First query is for small data set which I can do buffer cache read after the initial run to test non phsical I/O performance

select /*+ use_nl(t3) full(t3) gather_plan_statistics*//*test_999550*/ * from t3,t1
where t1.id=t3.id;

Runtime stats and plan statistics for this sql in 11.2.0.1 with _nlj_batching_enabled=0

49999 rows selected.

Elapsed: 00:00:01.51

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      52395  consistent gets
       1842  physical reads
          0  redo size
    1781536  bytes sent via SQL*Net to client
       1612  bytes received via SQL*Net from client
        101  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      49999  rows processed

Plan hash value: 692324935

------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      0 |        |      0 |00:00:00.01 |       0 |      0 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |      1 |  49999 |00:00:00.83 |   52395 |   1842 |
|   2 |   NESTED LOOPS              |           |      1 |  50143 |  99999 |00:00:04.65 |    2396 |    978 |
|   3 |    TABLE ACCESS FULL        | T3        |      1 |  49999 |  49999 |00:00:00.11 |     949 |    849 |
|*  4 |    INDEX UNIQUE SCAN        | T1_IDX_ID |  49999 |      1 |  49999 |00:00:00.16 |    1447 |    129 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."ID"="T3"."ID")

Runtime stats and plan statistics for this sql in 11.2.0.1 with _nlj_batching_enabled=1

select /*+ use_nl(t3) full(t3) gather_plan_statistics*//*test_999550*/ * from t3,t1
  2  where t1.id=t3.id;

49999 rows selected.

Elapsed: 00:00:01.40

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      52395  consistent gets
       1842  physical reads
          0  redo size
    1781536  bytes sent via SQL*Net to client
       1612  bytes received via SQL*Net from client
        101  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      49999  rows processed

Plan hash value: 198103700

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |      1 |        |  49999 |00:00:00.74 |   52395 |   1842 |
|   1 |  NESTED LOOPS                |           |      1 |        |  49999 |00:00:00.74 |   52395 |   1842 |
|   2 |   NESTED LOOPS               |           |      1 |  50143 |  49999 |00:00:00.27 |    2396 |    978 |
|   3 |    TABLE ACCESS FULL         | T3        |      1 |  49999 |  49999 |00:00:00.07 |     949 |    849 |
|*  4 |    INDEX UNIQUE SCAN         | T1_IDX_ID |  49999 |      1 |  49999 |00:00:00.15 |    1447 |    129 |
|   5 |   TABLE ACCESS BY INDEX ROWID| T1        |  49999 |      1 |  49999 |00:00:00.22 |   49999 |    864 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."ID"="T3"."ID")

Second query is for bigger data set which always do some physical read for T2

select /*+ use_nl(t2) full(t2) gather_plan_statistics*//*test_650*/ * from t2,t1
where t1.id=t2.id
and t2.id>650000;

Runtime stats and plan statistics for this sql in 11.2.0.1 with _nlj_batching_enabled=0


select /*+ use_nl(t2) full(t2) gather_plan_statistics*//*test_650*/ * from t2,t1
where t1.id=t2.id
and t2.id>650000;
  2    3

350000 rows selected.

Elapsed: 00:00:10.11

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     377819  consistent gets
      23736  physical reads
          0  redo size
   12606669  bytes sent via SQL*Net to client
       8212  bytes received via SQL*Net from client
        701  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     350000  rows processed

Plan hash value: 1010637246

------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      0 |        |      0 |00:00:00.01 |       0 |      0 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |      1 |    350K|00:00:04.91 |     377K|  23736 |
|   2 |   NESTED LOOPS              |           |      1 |    350K|    700K|00:01:46.31 |   27819 |  17752 |
|*  3 |    TABLE ACCESS FULL        | T2        |      1 |    350K|    350K|00:00:01.39 |   17643 |  16951 |
|*  4 |    INDEX UNIQUE SCAN        | T1_IDX_ID |    350K|      1 |    350K|00:00:01.05 |   10176 |    801 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("T2"."ID">650000)
   4 - access("T1"."ID"="T2"."ID")
       filter("T1"."ID">650000)

Runtime stats and plan statistics for this sql in 11.2.0.1 with _nlj_batching_enabled=1

350000 rows selected.

Elapsed: 00:00:10.00

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     377819  consistent gets
      23736  physical reads
          0  redo size
   12606669  bytes sent via SQL*Net to client
       8212  bytes received via SQL*Net from client
        701  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     350000  rows processed

Plan hash value: 3217821761

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |      1 |        |    350K|00:00:04.67 |     377K|  23736 |
|   1 |  NESTED LOOPS                |           |      1 |        |    350K|00:00:04.67 |     377K|  23736 |
|   2 |   NESTED LOOPS               |           |      1 |    350K|    350K|00:00:02.54 |   27819 |  17752 |
|*  3 |    TABLE ACCESS FULL         | T2        |      1 |    350K|    350K|00:00:01.02 |   17643 |  16951 |
|*  4 |    INDEX UNIQUE SCAN         | T1_IDX_ID |    350K|      1 |    350K|00:00:01.01 |   10176 |    801 |
|   5 |   TABLE ACCESS BY INDEX ROWID| T1        |    350K|      1 |    350K|00:00:01.47 |     350K|   5984 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("T2"."ID">650000)
   4 - access("T1"."ID"="T2"."ID")
       filter("T1"."ID">650000)

On the last post I already gave some information about the controlling parameter _nlj_batching_enabled so I change only this parameter to do my test and “keep OFE 11.2.0.1″

I ran these queries 4*50 times

1- _nlj_batching_enabled=1 (11G behaviour), flush the buffer cache twice and flush the cursor before first run, and run the statement 50 times.
2- _nlj_batching_enabled=1 (11G behaviour), flush the cursor at first run, flush the buffer cache twice before “each run” and run the statement 50 times to force phsical read all the time
3- _nlj_batching_enabled=0 (pre 11G behaviour), flush the buffer cache twice and flush the cursor before first run, and run the statement 50 times.
4- _nlj_batching_enabled=0 (pre 11G behaviour), flush the cursor at first run, flush the buffer cache twice before “each run” and run the statement 50 times to force phsical read all the time

After each run I snap the statistics from v$sql for the related sql and then do the comparison afterwards

Here are the results -( AVG1 includes first run for the step 1 and 3 which does both phsical and buffer cache read first run is not included on other calculations)

New behavior was faster in my Small dataset test.
I only consider median and for buffer cache reads it 10 milisecond less for CPU and 25 miliseconds less for Total ELA.
In Percentage we can say ~%5 for CPU and %6 for Total ELA
When it comes to phsical reads again 10 milisecond less for CPU and 11 miliseconds less for Total ELA
In percentage improvement is ~%4 for CPU and ~%1.5 for Total ELA

I was initially expecting bigger dataset will be again faster because we will do more phsical read but
This time new behaviour is slower for big dataset
For buffer cache reads 20 miliseconds slower for CPU and ~3 miliseconds slower for total ELA
In percentage degredation is ~%1.5 for CPU and ~%0.06 for Total ELA
For phsical reads 40 miliseconds slower for CPU and 7.5 miliseconds slower for total ELA
In percentage degredation is ~%2.5 for CPU and ~%0.14 for Total ELA

When I interpret these results I have the gut feeling that this is not like what is written in the documentation.
It looks like response time wise behavior is nearly same. From this output one  can say either CBO engineers wanted to make it harder to read the plans or maybe they wanted to be more honest to show what was always there.
Better optimizer group write a blog post about it rather than me :)

There is a bug note which also supports the idea that its been there since 9i

Bug 9919556: OFN DB: CUSTOMER INTERFACE PROGRAM TAKING LONGER TIME AFTER 11G R2 UPGRADE

There are also some other bugs related with dumps and wrong results reported for this behavior which is inconsistent with the idea that its been there for a long time.
Bugs disappeared when behavior is disabled and revert back to original so it loooks like has some effect somewhere but not necesarrily on performance as it can be interpreted??

Now its time for bonus material, probably careful eyes already spotted that the plan with _nlj_batching_enabled=0 is different than plan would be generated on 10.2.04 and also
there is a problem with runtime A-time values on 3th step of the plans when _nlj_batching_enabled=0 in 11.2.0.1. They are basically reported way higher than actual elapsed time.

First let me show you how actual pre 11G plan and reported A-Time values are

(to get this plan I flush buffer cache twice,flush the sql from shared pool, set ofe to 10.2.0.4 instead of _nlj_batching_enabled=0 on OFE=11.2.0.1)

Small Dataset


HR@ORACOS> @test

49999 rows selected.

Elapsed: 00:00:01.44

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      52395  consistent gets
       1842  physical reads
          0  redo size
    1781536  bytes sent via SQL*Net to client
       1613  bytes received via SQL*Net from client
        101  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      49999  rows processed

Plan hash value: 225503291

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |      1 |        |  49999 |00:00:00.74 |   52395 |   1842 |
|   1 |  NESTED LOOPS                |           |      1 |  50143 |  49999 |00:00:00.74 |   52395 |   1842 |
|   2 |   TABLE ACCESS FULL          | T3        |      1 |  49999 |  49999 |00:00:00.05 |     949 |    849 |
|   3 |   TABLE ACCESS BY INDEX ROWID| T1        |  49999 |      1 |  49999 |00:00:00.44 |   51446 |    993 |
|*  4 |    INDEX UNIQUE SCAN         | T1_IDX_ID |  49999 |      1 |  49999 |00:00:00.15 |    1447 |    129 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."ID"="T3"."ID")

Big Dataset


HR@ORACOS> @test

350000 rows selected.

Elapsed: 00:00:10.25

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     377819  consistent gets
      23736  physical reads
          0  redo size
   12606669  bytes sent via SQL*Net to client
       8213  bytes received via SQL*Net from client
        701  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     350000  rows processed

Plan hash value: 1708821971

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |      1 |        |    350K|00:00:04.45 |     377K|  23736 |
|   1 |  NESTED LOOPS                |           |      1 |    350K|    350K|00:00:04.45 |     377K|  23736 |
|*  2 |   TABLE ACCESS FULL          | T2        |      1 |    350K|    350K|00:00:01.03 |   17643 |  16951 |
|   3 |   TABLE ACCESS BY INDEX ROWID| T1        |    350K|      1 |    350K|00:00:02.81 |     360K|   6785 |
|*  4 |    INDEX UNIQUE SCAN         | T1_IDX_ID |    350K|      1 |    350K|00:00:01.02 |   10176 |    801 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T2"."ID">650000)
   4 - access("T1"."ID"="T2"."ID")
       filter("T1"."ID">650000)

As you can see timings are now more accurate and Nested Loop step is now changed its location in the plan.
I don’t know why A-time is reported wrong but thanks to the methods I described in earlier methods I can say, What is causing the plan change difference like below


HR@ORACOS> alter session set "_nlj_batching_enabled"=0;  ===> to revert back pre 11G behavior

Session altered.

HR@ORACOS>
HR@ORACOS> @build_optimizer_bugfix_test_e                ====> to find out which bugfix may have caused
Enter value for optimizer_feature_version: 11.1.0.6      ====> I alread found out it changed after 11.1.0.6
HR@ORACOS> @optimizer_bugfix_test_e

PLAN_HASH_VALUE          BUGFIX
-------------------- ----------
225503291               5585313
692324935               6439032
                        4279274
                        4370840
                        4386734
                        4507997
			....
			....
			....

39 rows selected.

HR@ORACOS> @bugfixg
Enter value for bugno: 5585313

     BUGNO      VALUE SQL_FEATURE               DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE
---------- ---------- ------------------------- ---------------------------------------------------------------- ------------------------- -
   5585313          1 QKSFM_EXECUTION_5585313   enable index prefetching for NLJ with unique scan as inner acces 11.1.0.6

Since I know what is causing the plan changes for nested loop now its time for me to go ahead and test if OFE=10.2.0.4 is better for the tests I run above.

Here are the comparison between OFE 11.2.0.1 and OFE 10.2.0.4 (instead of setting _nlj_batching_enabled=0 while OFE is still 11.2.0.1 )

This time 10.2.0.4 results were better for both small and big dataset
for small dataset
For buffer cache reads, 10 miliseconds slower for CPU and ~3 miliseconds slower for total ELA
In percentage degredation is ~%7 for CPU and ~%0.06 for Total ELA
For phsical reads 10 miliseconds slower for CPU and ~19 miliseconds slower for total ELA
In percentage degredation is ~%7 for CPU and ~%8 for Total ELA

for big dataset
For buffer cache reads 70 miliseconds slower for CPU and ~65 miliseconds slower for total ELA
In percentage degredation is ~%5 for CPU and ~%1.3 for Total ELA
For phsical reads 80 miliseconds slower for CPU and 120 miliseconds slower for total ELA
In percentage degredation is ~%4.8 for CPU and ~%2.2 for Total ELA

As a last word I can’t say my tests are best way to measure, however results are not looking consistent with what is written in the official documentation.
I think this is a new feature that will effect your plans after you upgrade your DB to 11G, but it will not necessarily make any perceivable difference to your response times.
In my humble opinion as long as the cost of your queries are same with the new implementation in place,
you can ignore the existence of it and never bother to change the parameter unless support says so.

Please feel free to comment if you have anything to say on this one and if you spot anything wrong in my analysis.

 

Source Code;
Test script
Full Excell Sheet for time comparisons

<–Previous Post in This Series —————- Next Post in this series–>

February 17, 2011

Plan Stability Through Upgrade-Why is my plan changed?-new optimizer parameters

Filed under: CBO, Performance, Plan Stability — coskan @ 8:09 pm

Every Oracle version, developers at Oracle add or change some of the undocumented parameters to do better optimization. Most of the time these new optimizations
works fine but from time to time they have a negative effect for the generated plan which causes post upgrade slowness. When you change OFE (optimizer_features_enable)
from 10.2.0.4 to 11.2.0.1 Oracle changes value of 33 parameters !!! Because they are undocumented, unless an Oracle Scientist reveals what they are actually doing
or note on MOS explaining the parameter, their effects to the plans are not that clear.

To overcome this problem I again wrote couple of simple scripts to automate the process of what is actually changing our plan. They really helped me during the upgrades
and understanding of plans. Again purpose of the post is not about telling deeply how these parameters are effecting plans but how we can find what parameter is effecting.

First script is

build_opt_param_table: It needs to be run as root or a user which has read access to x$ksppi and x$ksppcv tables.
What this script does is, it accepts a base OFE version then creates a table (optimizer_parameters) for all the parameters for given OFE version
and OFE versions above. This script is the prerequsite for other scripts in this post because they will all use optimizer_parameters table.
!!! In the script it creates the public synonym and grants select to dba.

First lets build the optimizer_parameters table

SQL> @build_opt_param_table
Enter value for optimizer_feature_version: 10.2.0.4
SQL> grant select on optimizer_parameters to DBA;

Grant succeeded.

SQL> create public synonym optimize_parameters for optimizer_parameters;

Synonym created.

SQL> select ofe,count(*) from optimizer_parameters group by OFE;

OFE                 COUNT(*)
----------------- ----------
10.2.0.4                2399
10.2.0.5                2399
11.1.0.6                2399
11.1.0.7                2399
11.2.0.1                2399

Once our table is generated we can see the differences between OFE versions

before_after_opt_param: This script is just checking which values are different for given to OFE versions

Changes between 10.2.0.4 to 10.2.0.5

SQL> @before_after_opt_param
Enter value for before_ofe: 10.2.0.4
Enter value for after_ofe: 10.2.0.5

PD_NAME                                            BEFORE_OFE_VALUE     AFTER_OFE_VALUE      DESCRIPTION
-------------------------------------------------- -------------------- -------------------- ----------------------------------------------------
optimizer_features_enable                          10.2.0.4             10.2.0.5             optimizer plan compatibility parameter
_optimizer_undo_cost_change                        10.2.0.4             10.2.0.5             optimizer undo cost change

Changes between 10.2.0.4 to 11.1.0.6

SQL> @before_after_opt_param
Enter value for before_ofe: 10.2.0.4
Enter value for after_ofe: 11.1.0.6

PD_NAME                                            BEFORE_OFE_VALUE     AFTER_OFE_VALUE      DESCRIPTION
-------------------------------------------------- -------------------- -------------------- ----------------------------------------------------------------------------------------------------
optimizer_features_enable                          10.2.0.4             11.1.0.6             optimizer plan compatibility parameter
_replace_virtual_columns                           FALSE                TRUE                 replace expressions with virtual columns
_optimizer_undo_cost_change                        10.2.0.4             11.1.0.6             optimizer undo cost change
_optimizer_null_aware_antijoin                     FALSE                TRUE                 null-aware antijoin parameter
_optimizer_native_full_outer_join                  OFF                  FORCE                execute full outer join using native implementaion
_optimizer_multi_level_push_pred                   FALSE                TRUE                 consider join-predicate pushdown that requires multi-level pushdown to base table
_optimizer_improve_selectivity                     FALSE                TRUE                 improve table and partial overlap join selectivity computation
_optimizer_group_by_placement                      FALSE                TRUE                 consider group-by placement optimization
_optimizer_extended_stats_usage_control            255                  240                  controls the optimizer usage of extended stats
_optimizer_extended_cursor_sharing_rel             NONE                 SIMPLE               optimizer extended cursor sharing for relational operators
_optimizer_extend_jppd_view_types                  FALSE                TRUE                 join pred pushdown on group-by, distinct, semi-/anti-joined view
_optimizer_enable_extended_stats                   FALSE                TRUE                 use extended statistics for selectivity estimation
_optimizer_enable_density_improvements             FALSE                TRUE                 use improved density computation for selectivity estimation
_optimizer_adaptive_cursor_sharing                 FALSE                TRUE                 optimizer adaptive cursor sharing
_nlj_batching_enabled                              0                    1                    enable batching of the RHS IO in NLJ
_first_k_rows_dynamic_proration                    FALSE                TRUE                 enable the use of dynamic proration of join cardinalities
_bloom_pruning_enabled                             FALSE                TRUE                 Enable partition pruning using bloom filtering

17 rows selected.

Changes between 11.1.0.6 to 11.1.0.7

SQL> @before_after_opt_param
Enter value for before_ofe: 11.1.0.6
Enter value for after_ofe: 11.1.0.7

PD_NAME                                            BEFORE_OFE_VALUE     AFTER_OFE_VALUE      DESCRIPTION
-------------------------------------------------- -------------------- -------------------- ----------------------------------------------------------------------------------------------------
optimizer_features_enable                          11.1.0.6             11.1.0.7             optimizer plan compatibility parameter
_optimizer_undo_cost_change                        11.1.0.6             11.1.0.7             optimizer undo cost change
_optimizer_extended_stats_usage_control            240                  224                  controls the optimizer usage of extended stats
_bloom_folding_enabled                             FALSE                TRUE                 Enable folding of bloom filter

Changes between 11.1.0.7 to 11.2.0.1

SQL> @before_after_opt_param
Enter value for before_ofe: 11.1.0.7
Enter value for after_ofe: 11.2.0.1

PD_NAME                                            BEFORE_OFE_VALUE     AFTER_OFE_VALUE      DESCRIPTION
-------------------------------------------------- -------------------- -------------------- ----------------------------------------------------------------------------------------------------
optimizer_features_enable                          11.1.0.7             11.2.0.1             optimizer plan compatibility parameter
_optimizer_use_feedback                            FALSE                TRUE                 optimizer use feedback
_optimizer_use_cbqt_star_transformation            FALSE                TRUE                 use rewritten star transformation using cbqt framework
_optimizer_unnest_disjunctive_subq                 FALSE                TRUE                 Unnesting of disjunctive subqueries (TRUE/FALSE)
_optimizer_unnest_corr_set_subq                    FALSE                TRUE                 Unnesting of correlated set subqueries (TRUE/FALSE)
_optimizer_undo_cost_change                        11.1.0.7             11.2.0.1             optimizer undo cost change
_optimizer_try_st_before_jppd                      FALSE                TRUE                 try Star Transformation before Join Predicate Push Down
_optimizer_table_expansion                         FALSE                TRUE                 consider table expansion transformation
_optimizer_join_factorization                      FALSE                TRUE                 use join factorization transformation
_optimizer_fast_pred_transitivity                  FALSE                TRUE                 use fast algorithm to generate transitive predicates
_optimizer_fast_access_pred_analysis               FALSE                TRUE                 use fast algorithm to traverse predicates for physical optimizer
_optimizer_eliminate_filtering_join                FALSE                TRUE                 optimizer filtering join elimination enabled
_optimizer_distinct_placement                      FALSE                TRUE                 consider distinct placement optimization
_optimizer_distinct_agg_transform                  FALSE                TRUE                 Transforms Distinct Aggregates to non-distinct aggregates
_optimizer_connect_by_elim_dups                    FALSE                TRUE                 allow connect by to eliminate duplicates from input
_optimizer_coalesce_subqueries                     FALSE                TRUE                 consider coalescing of subqueries optimization
_connect_by_use_union_all                          OLD_PLAN_MODE        TRUE                 use union all for connect by
_and_pruning_enabled                               FALSE                TRUE                 allow partition pruning based on multiple mechanisms
_aggregation_optimization_settings                 32                   0                    settings for aggregation optimizations

19 rows selected.

As you might expect changes between minor releases are very small numbers but major release changes really effects the way CBO works.

Next script is for using the information above as a test case. They have the similar mind as bugfix test builders.

build_optimizer_param_test: accepts before and after OFE. you can make the test by changing parameters one by one do the test and revert back to the original.
Even it accepts the OFE as before and after you can reverse the order like after can be 10.2.0.4 and before can be 11.2.0.1. What the script will do is
set OFE to 11.2.0.1 and try each parameter by reverting them back to 10.2.0.4 values. Generated scripts is optimizer_parameter_test.sql and test.sql is the
script it calls for tests.

build_optimizer_param_test_e: Same as above but this time it is for explain plan only and generates optimizer_parameter_test_e.sql and uses test_e.sql

Now its time to play with the scripts

After the upgrade one of the massive queries again start to go crazy. This time I have the chance to build a test case.
We have two tables monthly partitioned on ASOF date. Once we use ansi “full outer join” syntax Oracle was not doing partition pruning
after upgrade from 10.2.0.4 to 11.2.0.1.

The full test case can be seen here full_outer_join_test_case.sql.

The test_e.sql is like below

EXPLAIN PLAN
   FOR
      SELECT T1.asof asof,
             T1.t1_id,
             t1.t1_data,
             T2.t2_data
        FROM    table_1 t1
             FULL OUTER JOIN
                (SELECT *
                   FROM table_2
                  WHERE table_2.asof = TO_DATE ('15Sep10', 'DDMONYY')) t2
             ON t1.asof = t2.asof
       WHERE t1.asof = TO_DATE ('15Sep10', 'DDMONYY');

Lets see how our test case behaves with different OFE settings


HR@ORACOS> @test_e   =>default OFE is 11.2.0.1
HR@ORACOS> @e

Plan hash value: 1183327479

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |          |  1991K|  7644M| 16343   (1)| 00:01:22 |       |       |
|*  1 |  VIEW                                 | VW_FOJ_0 |  1991K|  7644M| 16343   (1)| 00:01:22 |       |       |
|*  2 |   HASH JOIN FULL OUTER                |          |  1991K|  4348M| 16343   (1)| 00:01:22 |       |       |
|   3 |    PARTITION RANGE SINGLE             |          |     1 |  2011 |     1   (0)| 00:00:01 |   KEY |   KEY |
|   4 |     VIEW                              |          |     1 |  2011 |     1   (0)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS BY LOCAL INDEX ROWID| T2       |     1 |  2011 |     1   (0)| 00:00:01 |   KEY |   KEY |
|*  6 |       INDEX SKIP SCAN                 | TEST2_PK |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|   7 |    PARTITION RANGE ALL                |          |  1991K|   529M| 16329   (1)| 00:01:22 |     1 |    28 |
|   8 |     TABLE ACCESS FULL                 | T1       |  1991K|   529M| 16329   (1)| 00:01:22 |     1 |    28 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
   2 - access("T1"."ASOF"="T2"."ASOF")
   6 - access("T2"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
       filter("T2"."ASOF"=TO_DATE('15Sep10','DDMONYY'))

HR@ORACOS> alter session set optimizer_features_enable='10.2.0.4';

Session altered.
HR@ORACOS> @test_e

Explained.
HR@ORACOS> @e

Plan hash value: 2581814110

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |          |     2 |  8052 |     1 (100)| 00:00:01 |       |       |
|   1 |  VIEW                                  |          |     2 |  8052 |     1 (100)| 00:00:01 |       |       |
|   2 |   UNION-ALL                            |          |       |       |            |          |       |       |
|*  3 |    HASH JOIN OUTER                     |          |     1 |  4035 |     1 (100)| 00:00:01 |       |       |
|   4 |     PARTITION RANGE SINGLE             |          |     1 |  2024 |     0   (0)| 00:00:01 |   KEY |   KEY |
|   5 |      TABLE ACCESS BY LOCAL INDEX ROWID | T1       |     1 |  2024 |     0   (0)| 00:00:01 |   KEY |   KEY |
|*  6 |       INDEX SKIP SCAN                  | TEST1_PK |     1 |       |     0   (0)| 00:00:01 |   KEY |   KEY |
|   7 |     PARTITION RANGE SINGLE             |          |     1 |  2011 |     0   (0)| 00:00:01 |   KEY |   KEY |
|   8 |      TABLE ACCESS BY LOCAL INDEX ROWID | T2       |     1 |  2011 |     0   (0)| 00:00:01 |   KEY |   KEY |
|*  9 |       INDEX SKIP SCAN                  | TEST2_PK |     1 |       |     0   (0)| 00:00:01 |   KEY |   KEY |
|* 10 |    FILTER                              |          |       |       |            |          |       |       |
|  11 |     NESTED LOOPS ANTI                  |          |     1 |  2020 |     0   (0)| 00:00:01 |       |       |
|  12 |      PARTITION RANGE SINGLE            |          |     1 |  2011 |     0   (0)| 00:00:01 |   KEY |   KEY |
|  13 |       TABLE ACCESS BY LOCAL INDEX ROWID| T2       |     1 |  2011 |     0   (0)| 00:00:01 |   KEY |   KEY |
|* 14 |        INDEX SKIP SCAN                 | TEST2_PK |     1 |       |     0   (0)| 00:00:01 |   KEY |   KEY |
|  15 |      PARTITION RANGE SINGLE            |          |     1 |     9 |     0   (0)| 00:00:01 |   KEY |   KEY |
|* 16 |       INDEX FULL SCAN                  | TEST1_PK |     1 |     9 |     0   (0)| 00:00:01 |   KEY |   KEY |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("T1"."ASOF"="T2"."ASOF"(+))
   6 - access("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
       filter("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
   9 - access("T2"."ASOF"(+)=TO_DATE('15Sep10','DDMONYY'))
       filter("T2"."ASOF"(+)=TO_DATE('15Sep10','DDMONYY'))
  10 - filter(CAST(NULL AS DATE)=TO_DATE('15Sep10','DDMONYY'))
  14 - access("T2"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
       filter("T2"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
  16 - access("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
       filter("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))

As you can clearly see, partition pruning is not happening on 11.2.0.1 and query suddenly becomes killer for the system and slower for end user.

Lets see what is changed in terms of parameters between the releases?


HR@ORACOS> @before_after_opt_param
Enter value for before_ofe: 10.2.0.4
Enter value for after_ofe: 11.2.0.1

PD_NAME                                            BEFORE_OFE_VALUE     AFTER_OFE_VALUE      DESCRIPTION
-------------------------------------------------- -------------------- -------------------- ----------------------------------------------------------------------------------------------------
optimizer_features_enable                          10.2.0.4             11.2.0.1             optimizer plan compatibility parameter
_replace_virtual_columns                           FALSE                TRUE                 replace expressions with virtual columns
_optimizer_use_feedback                            FALSE                TRUE                 optimizer use feedback
_optimizer_use_cbqt_star_transformation            FALSE                TRUE                 use rewritten star transformation using cbqt framework
_optimizer_unnest_disjunctive_subq                 FALSE                TRUE                 Unnesting of disjunctive subqueries (TRUE/FALSE)
_optimizer_unnest_corr_set_subq                    FALSE                TRUE                 Unnesting of correlated set subqueries (TRUE/FALSE)
_optimizer_undo_cost_change                        10.2.0.4             11.2.0.1             optimizer undo cost change
_optimizer_try_st_before_jppd                      FALSE                TRUE                 try Star Transformation before Join Predicate Push Down
_optimizer_table_expansion                         FALSE                TRUE                 consider table expansion transformation
_optimizer_null_aware_antijoin                     FALSE                TRUE                 null-aware antijoin parameter
_optimizer_native_full_outer_join                  OFF                  FORCE                execute full outer join using native implementaion
_optimizer_multi_level_push_pred                   FALSE                TRUE                 consider join-predicate pushdown that requires multi-level pushdown to base table
_optimizer_join_factorization                      FALSE                TRUE                 use join factorization transformation
_optimizer_improve_selectivity                     FALSE                TRUE                 improve table and partial overlap join selectivity computation
_optimizer_group_by_placement                      FALSE                TRUE                 consider group-by placement optimization
_optimizer_fast_pred_transitivity                  FALSE                TRUE                 use fast algorithm to generate transitive predicates
_optimizer_fast_access_pred_analysis               FALSE                TRUE                 use fast algorithm to traverse predicates for physical optimizer
_optimizer_extended_stats_usage_control            255                  224                  controls the optimizer usage of extended stats
_optimizer_extended_cursor_sharing_rel             NONE                 SIMPLE               optimizer extended cursor sharing for relational operators
_optimizer_extend_jppd_view_types                  FALSE                TRUE                 join pred pushdown on group-by, distinct, semi-/anti-joined view
_optimizer_enable_extended_stats                   FALSE                TRUE                 use extended statistics for selectivity estimation
_optimizer_enable_density_improvements             FALSE                TRUE                 use improved density computation for selectivity estimation
_optimizer_eliminate_filtering_join                FALSE                TRUE                 optimizer filtering join elimination enabled
_optimizer_distinct_placement                      FALSE                TRUE                 consider distinct placement optimization
_optimizer_distinct_agg_transform                  FALSE                TRUE                 Transforms Distinct Aggregates to non-distinct aggregates
_optimizer_connect_by_elim_dups                    FALSE                TRUE                 allow connect by to eliminate duplicates from input
_optimizer_coalesce_subqueries                     FALSE                TRUE                 consider coalescing of subqueries optimization
_optimizer_adaptive_cursor_sharing                 FALSE                TRUE                 optimizer adaptive cursor sharing
_nlj_batching_enabled                              0                    1                    enable batching of the RHS IO in NLJ
_first_k_rows_dynamic_proration                    FALSE                TRUE                 enable the use of dynamic proration of join cardinalities
_connect_by_use_union_all                          OLD_PLAN_MODE        TRUE                 use union all for connect by
_bloom_pruning_enabled                             FALSE                TRUE                 Enable partition pruning using bloom filtering
_bloom_folding_enabled                             FALSE                TRUE                 Enable folding of bloom filter
_and_pruning_enabled                               FALSE                TRUE                 allow partition pruning based on multiple mechanisms
_aggregation_optimization_settings                 32                   0                    settings for aggregation optimizations

35 rows selected.

35 parameters changed and we already now bugfix is not our problem.

Lets build our test case and run it. (check the time, only 0.1 seconds to understand which parameter is driving my problem)

!!! I am testing 11.2.0.1 parameters when I am on 10.2.0.4


HR@ORACOS> @build_optimizer_param_test_e
Enter value for before_ofe: 10.2.0.4
Enter value for after_ofe: 11.2.0.1

HR@ORACOS> @optimizer_param_test_e

PD_NAME                                            PD_VALUE      PLAN_HASH_VALUE
-------------------------------------------------- ---------- ------------------
_optimizer_native_full_outer_join                  FORCE              1183327479
_aggregation_optimization_settings                 0                  2581814110
_optimizer_use_cbqt_star_transformation            TRUE
_optimizer_unnest_disjunctive_subq                 TRUE
_optimizer_unnest_corr_set_subq                    TRUE
_optimizer_try_st_before_jppd                      TRUE
_optimizer_table_expansion                         TRUE
_optimizer_null_aware_antijoin                     TRUE
_optimizer_multi_level_push_pred                   TRUE
_optimizer_join_factorization                      TRUE
_optimizer_improve_selectivity                     TRUE
_optimizer_group_by_placement                      TRUE
_optimizer_fast_pred_transitivity                  TRUE
_optimizer_fast_access_pred_analysis               TRUE
_optimizer_extended_stats_usage_control            224
_optimizer_extended_cursor_sharing_rel             SIMPLE
_optimizer_extend_jppd_view_types                  TRUE
_optimizer_enable_extended_stats                   TRUE
_optimizer_enable_density_improvements             TRUE
_optimizer_eliminate_filtering_join                TRUE
_optimizer_distinct_placement                      TRUE
_optimizer_distinct_agg_transform                  TRUE
_optimizer_connect_by_elim_dups                    TRUE
_optimizer_coalesce_subqueries                     TRUE
_optimizer_adaptive_cursor_sharing                 TRUE
_nlj_batching_enabled                              1
_first_k_rows_dynamic_proration                    TRUE
_connect_by_use_union_all                          TRUE
_bloom_pruning_enabled                             TRUE
_bloom_folding_enabled                             TRUE
_and_pruning_enabled                               TRUE
_replace_virtual_columns                           TRUE
_optimizer_use_feedback                            TRUE

33 rows selected.

Elapsed: 00:00:00.01

Session altered.

Elapsed: 00:00:00.01

Our plan suddenly reverts back to the wrong plan when _optimizer_native_full_outer_join is force.

Lets turn it off and see the effect to confirm our test environment is right

HR@ORACOS> alter session set "_optimizer_native_full_outer_join"=OFF;

Session altered.

HR@ORACOS> @test_e

Explained.

HR@ORACOS> @e

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2581814110

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |          |     2 |  8052 |     4  (25)| 00:00:01 |       |       |
|   1 |  VIEW                                  |          |     2 |  8052 |     4  (25)| 00:00:01 |       |       |
|   2 |   UNION-ALL                            |          |       |       |            |          |       |       |
|*  3 |    HASH JOIN OUTER                     |          |     1 |  4035 |     3  (34)| 00:00:01 |       |       |
|   4 |     PARTITION RANGE SINGLE             |          |     1 |  2024 |     1   (0)| 00:00:01 |   KEY |   KEY |
|   5 |      TABLE ACCESS BY LOCAL INDEX ROWID | T1       |     1 |  2024 |     1   (0)| 00:00:01 |   KEY |   KEY |
|*  6 |       INDEX SKIP SCAN                  | TEST1_PK |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|   7 |     PARTITION RANGE SINGLE             |          |     1 |  2011 |     1   (0)| 00:00:01 |   KEY |   KEY |
|   8 |      TABLE ACCESS BY LOCAL INDEX ROWID | T2       |     1 |  2011 |     1   (0)| 00:00:01 |   KEY |   KEY |
|*  9 |       INDEX SKIP SCAN                  | TEST2_PK |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 10 |    FILTER                              |          |       |       |            |          |       |       |
|  11 |     NESTED LOOPS ANTI                  |          |     1 |  2020 |     1   (0)| 00:00:01 |       |       |
|  12 |      PARTITION RANGE SINGLE            |          |     1 |  2011 |     1   (0)| 00:00:01 |   KEY |   KEY |
|  13 |       TABLE ACCESS BY LOCAL INDEX ROWID| T2       |     1 |  2011 |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 14 |        INDEX SKIP SCAN                 | TEST2_PK |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|  15 |      PARTITION RANGE SINGLE            |          |     1 |     9 |     0   (0)| 00:00:01 |   KEY |   KEY |
|* 16 |       INDEX FULL SCAN                  | TEST1_PK |     1 |     9 |     0   (0)| 00:00:01 |   KEY |   KEY |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("T1"."ASOF"="T2"."ASOF"(+))
   6 - access("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
       filter("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
   9 - access("T2"."ASOF"(+)=TO_DATE('15Sep10','DDMONYY'))
       filter("T2"."ASOF"(+)=TO_DATE('15Sep10','DDMONYY'))
  10 - filter(CAST(NULL AS DATE)=TO_DATE('15Sep10','DDMONYY'))
  14 - access("T2"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
       filter("T2"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
  16 - access("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))
       filter("T1"."ASOF"=TO_DATE('15Sep10','DDMONYY'))

37 rows selected.

Perfect, in less then 5 minutes we spot the difference . Actually this was the issue which started everything for this series.
I spent 1 full day on the real sql and 10053 traces and suddenly I checked the sql carefully again (another “invalid dba exception”, should have done before )
and it reminded me Christian Antognini’s post about native full outer join support (Being a blog reader is good thing !!!)
then I checked the parameter and immediatelly realized that it was my issue.

After I automate all the things I have tried for this issue now I really feel confident for %99 of the issues caused during upgrades with bugfix and optimizer_parameter changes. 10053 with many joins makes me feel really dumb :)
Thank god at least for the upgrades I don’t have to go through it again.

For this particular case because I was ready with all the information I needed, I raised the issue with Oracle and got response back on the “same day” that
there is a bug with a patch 9287401 available, and once I patched it worked like charm.

Unfortunatelly, not eveything is clear as daylight like on this example:( There can be combination of parameters effecting the generation of the plan.
Controlled environment like the one above (one parameter at a time) only reveals simple issue. We may need to run non_controlled version
to see how things can differ when we keep the parameter at the value we set.

Non controlled version for the script is (sorry for the dodgy naming)

build_optimizer_param_test_2 : Same logic this time we keep the parameter with the setting we set.
build_optimizer_param_test_e_2: Same logic this time we keep the parameter with the setting we set.

There is also one more thing I discovered, you need to re-connect after your tests because there is no “alter session reset” option for session level parameters.
When you explicitly set a parameter, changing OFE does not adjust that parameter so better refresh your session environment by re-connect

Lets see how non controlled test case work.

This time I don’t have a test case but a real case where things went wrong after 11.2.0.1 upgrade from 10.2.0.4.
A report query suddenly started to run like a dog. From 10 minutes to 10 hours. I went through all bugfixes and it did not help again.

Plan differences for versions were like below.

For 11.2.0.1

SQL> @test_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 769921106

----------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |                              | 43364 |    28M|       |   795K (22)| 00:28:06 |       |       |
|   1 |  SORT ORDER BY                              |                              | 43364 |    28M|    30M|   795K (22)| 00:28:06 |       |       |
|*  2 |   HASH JOIN RIGHT OUTER                     |                              | 43364 |    28M|       |   790K (22)| 00:27:54 |       |       |
|   3 |    VIEW                                     |                              | 17518 |   975K|       |    62  (25)| 00:00:01 |       |       |
|*  4 |     HASH JOIN                               |                              | 17518 |   615K|       |    62  (25)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS FULL                      | STATIC_CLASSIFICATIONS       |   658 | 14476 |       |     3   (0)| 00:00:01 |       |       |
|*  6 |      TABLE ACCESS FULL                      | ORGS_CLASSIFICATIONS         | 17518 |   239K|       |    57  (23)| 00:00:01 |       |       |
|*  7 |    HASH JOIN RIGHT OUTER                    |                              | 43364 |    26M|       |   790K (22)| 00:27:54 |       |       |
|   8 |     VIEW                                    |                              |  4841 |  1096K|       |   760K (23)| 00:26:51 |       |       |
|   9 |      HASH GROUP BY                          |                              |  4841 |   808K|       |   760K (23)| 00:26:51 |       |       |
|* 10 |       HASH JOIN                             |                              |  4841 |   808K|       |   760K (23)| 00:26:51 |       |       |
|  11 |        TABLE ACCESS FULL                    | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 12 |        HASH JOIN                            |                              |  4841 |   756K|  2368K|   760K (23)| 00:26:51 |       |       |
|  13 |         TABLE ACCESS FULL                   | ORGS                         | 41094 |  1886K|       |   229  (12)| 00:00:01 |       |       |
|* 14 |         HASH JOIN                           |                              |   219K|    23M|    20M|   759K (23)| 00:26:49 |       |       |
|* 15 |          HASH JOIN                          |                              |   219K|    17M|       |   738K (23)| 00:26:04 |       |       |
|  16 |           PARTITION RANGE SINGLE            |                              | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 17 |            TABLE ACCESS BY LOCAL INDEX ROWID| FC_TBL_XXXXXXXXXX_CURVE      | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 18 |             INDEX RANGE SCAN                | FC_TBL_XXXXXXXXXX_CURVE_PK   |    48 |       |       |    34   (6)| 00:00:01 |    34 |    34 |
|  19 |             TABLE ACCESS BY INDEX ROWID     | CLIENTS                      |     1 |    20 |       |     2   (0)| 00:00:01 |       |       |
|* 20 |              INDEX UNIQUE SCAN              | CLIENTS_NAME_UNQ             |     1 |       |       |     1   (0)| 00:00:01 |       |       |
|  21 |           PARTITION RANGE ALL               |                              |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  22 |            TABLE ACCESS FULL                | FC_TBL_XXXXXXXXXX_TERMS      |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  23 |          PARTITION LIST SINGLE              |                              |  3215K|    88M|       | 17265  (15)| 00:00:37 |   KEY |   KEY |
|* 24 |           TABLE ACCESS FULL                 | INSTRUMENTS                  |  3215K|    88M|       | 17265  (15)| 00:00:37 |    34 |    34 |
|* 25 |     HASH JOIN                               |                              | 43364 |    16M|       | 29859   (2)| 00:01:04 |       |       |
|  26 |      TABLE ACCESS FULL                      | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 27 |      HASH JOIN                              |                              | 43364 |    16M|  2736K| 29853   (2)| 00:01:04 |       |       |
|  28 |       TABLE ACCESS FULL                     | ORGS                         | 41094 |  2247K|       |   229  (12)| 00:00:01 |       |       |
|  29 |       VIEW                                  |                              | 43364 |    13M|       | 29230   (2)| 00:01:02 |       |       |
|  30 |        HASH GROUP BY                        |                              | 43364 |  4319K|  4968K| 29230   (2)| 00:01:02 |       |       |
|  31 |         NESTED LOOPS                        |                              |       |       |       |            |          |       |       |
|  32 |          NESTED LOOPS                       |                              | 43364 |  4319K|       | 28309   (2)| 00:01:00 |       |       |
|  33 |           VIEW                              | VW_GBF_14                    |  9021 |   669K|       |  1065  (12)| 00:00:03 |       |       |
|  34 |            HASH GROUP BY                    |                              |  9021 |   916K|  1072K|  1065  (12)| 00:00:03 |       |       |
|* 35 |             HASH JOIN                       |                              |  9021 |   916K|       |   868  (13)| 00:00:02 |       |       |
|* 36 |              HASH JOIN                      |                              |  9021 |   643K|       |   736  (13)| 00:00:02 |       |       |
|* 37 |               INDEX RANGE SCAN              | TBL_METRICS_CONFIG_UNIQUE    |     2 |    44 |       |     1   (0)| 00:00:01 |       |       |
|  38 |               PARTITION RANGE SINGLE        |                              | 17495 |   871K|       |   733  (13)| 00:00:02 |    34 |    34 |
|* 39 |                TABLE ACCESS FULL            | TBL_YYYYYYYYY_METRICS        | 17495 |   871K|       |   733  (13)| 00:00:02 |    34 |    34 |
|  40 |              TABLE ACCESS FULL              | TBL_TTTTT_TIER_TTTTTTTT_CONV | 42010 |  1271K|       |   127  (10)| 00:00:01 |       |       |
|* 41 |           INDEX RANGE SCAN                  | TBL_YYY_SPREADS_UNIQUE       |     5 |       |       |     2   (0)| 00:00:01 |       |       |
|  42 |          TABLE ACCESS BY GLOBAL INDEX ROWID | TBL_YYYYYYYYY_SPREADS        |     5 |   130 |       |     3   (0)| 00:00:01 | ROWID | ROWID |
----------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ORGS"."PKEY"="CLASSIFICATIONS"."ORG"(+))
   4 - access("ORGS_CLASSIFICATIONS"."SECTOR"="STATIC_CLASSIFICATIONS"."PKEY")
   6 - filter("ORGS_CLASSIFICATIONS"."PROVIDER"='ICB')
   7 - access("YYYYYYYY_CURVE"."ASOF"="CURVES"."ASOF"(+) AND "YYYYYYYY_CURVE"."ENTITY"="CURVES"."ISSUER"(+) AND
              "YYYYYYYY_CURVE"."TIER"="CURVES"."TIER"(+) AND "YYYYYYYY_CURVE"."RUNNING_COUPON"="CURVES"."RUNNING_COUPON"(+))
  10 - access("O"."COUNTRY"="C"."ID")
  12 - access("I"."ISSUER"="O"."PKEY" AND "I"."CCY"="O"."DEFAULTCCY" AND "I"."DOCCLAUSE"="O"."DEFAULTDOCCLAUSE")
  14 - access("TERM"."INSTRUMENT"="I"."PKEY")
  15 - access("CURVE"."ROWKEY"="TERM"."CURVE_KEY")
  17 - filter("CURVE"."RUNNING_COUPON" IS NOT NULL AND "CURVE"."SOURCE"='LEGACY' AND "CURVE"."OWNER"= (SELECT "CLIENTS"."PKEY" FROM
              "CORE"."CLIENTS" "CLIENTS" WHERE "CLIENTS"."NAME"='Xxxxxxxx'))
  18 - access("CURVE"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  20 - access("CLIENTS"."NAME"='Xxxxxxxx')
  24 - filter("I"."DOCCLAUSE" IS NOT NULL)
  25 - access("ORGS"."COUNTRY"="STATIC_COUNTRIES"."ID")
  27 - access("YYYYYYYY_CURVE"."ENTITY"="ORGS"."PKEY")
  35 - access("M"."TRADING_CONV_ID"="C"."TBL_E_T_TRADING_CONV_ID")
  36 - access("M"."RANGE"="VALUE")
  37 - access("PROPERTY"='PercentileInTblReport')
  39 - filter("M"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  41 - access("ITEM_1"="BA"."YYYYYYYY_METRICS_ID")

74 rows selected.

For 10.2.0.4

SQL> alter session set optimizer_features_enable='10.2.0.4';

Session altered.

SQL> @test_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3853593815

--------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                              | 43364 |    28M|       |   794K (22)| 00:28:04 |       |       |
|   1 |  SORT ORDER BY                            |                              | 43364 |    28M|    61M|   794K (22)| 00:28:04 |       |       |
|*  2 |   HASH JOIN RIGHT OUTER                   |                              | 43364 |    28M|       |   789K (22)| 00:27:52 |       |       |
|   3 |    VIEW                                   |                              | 17518 |   975K|       |    62  (25)| 00:00:01 |       |       |
|*  4 |     HASH JOIN                             |                              | 17518 |   615K|       |    62  (25)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS FULL                    | STATIC_CLASSIFICATIONS       |   658 | 14476 |       |     3   (0)| 00:00:01 |       |       |
|*  6 |      TABLE ACCESS FULL                    | ORGS_CLASSIFICATIONS         | 17518 |   239K|       |    57  (23)| 00:00:01 |       |       |
|*  7 |    HASH JOIN RIGHT OUTER                  |                              | 43364 |    26M|       |   789K (22)| 00:27:52 |       |       |
|   8 |     VIEW                                  |                              |  1770 |   401K|       |   759K (23)| 00:26:49 |       |       |
|   9 |      HASH GROUP BY                        |                              |  1770 |   295K|       |   759K (23)| 00:26:49 |       |       |
|* 10 |       HASH JOIN                           |                              |  1770 |   295K|       |   759K (23)| 00:26:49 |       |       |
|  11 |        TABLE ACCESS FULL                  | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 12 |        HASH JOIN                          |                              |  1770 |   276K|       |   759K (23)| 00:26:49 |       |       |
|  13 |         PARTITION RANGE SINGLE            |                              | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 14 |          TABLE ACCESS BY LOCAL INDEX ROWID| FC_TBL_XXXXXXXXXX_CURVE      | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 15 |           INDEX RANGE SCAN                | FC_TBL_XXXXXXXXXX_CURVE_PK   |    48 |       |       |    34   (6)| 00:00:01 |    34 |    34 |
|  16 |           TABLE ACCESS BY INDEX ROWID     | CLIENTS                      |     1 |    20 |       |     2   (0)| 00:00:01 |       |       |
|* 17 |            INDEX UNIQUE SCAN              | CLIENTS_NAME_UNQ             |     1 |       |       |     1   (0)| 00:00:01 |       |       |
|* 18 |         HASH JOIN                         |                              |  2066K|   201M|       |   758K (23)| 00:26:48 |       |       |
|* 19 |          HASH JOIN                        |                              | 14382 |  1067K|  2368K| 20817  (15)| 00:00:45 |       |       |
|  20 |           TABLE ACCESS FULL               | ORGS                         | 41094 |  1886K|       |   229  (12)| 00:00:01 |       |       |
|  21 |           PARTITION LIST SINGLE           |                              |  3215K|    88M|       | 17213  (15)| 00:00:37 |   KEY |   KEY |
|  22 |            TABLE ACCESS FULL              | INSTRUMENTS                  |  3215K|    88M|       | 17213  (15)| 00:00:37 |    34 |    34 |
|  23 |          PARTITION RANGE ALL              |                              |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  24 |           TABLE ACCESS FULL               | FC_TBL_XXXXXXXXXX_TERMS      |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|* 25 |     HASH JOIN                             |                              | 43364 |    16M|       | 29935   (2)| 00:01:04 |       |       |
|  26 |      TABLE ACCESS FULL                    | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 27 |      HASH JOIN                            |                              | 43364 |    16M|  2736K| 29929   (2)| 00:01:04 |       |       |
|  28 |       TABLE ACCESS FULL                   | ORGS                         | 41094 |  2247K|       |   229  (12)| 00:00:01 |       |       |
|  29 |       VIEW                                |                              | 43364 |    13M|       | 29306   (2)| 00:01:03 |       |       |
|  30 |        HASH GROUP BY                      |                              | 43364 |  5759K|    12M| 29306   (2)| 00:01:03 |       |       |
|  31 |         TABLE ACCESS BY GLOBAL INDEX ROWID| TBL_YYYYYYYYY_SPREADS        |     5 |   160 |       |     3   (0)| 00:00:01 | ROWID | ROWID |
|  32 |          NESTED LOOPS                     |                              | 43364 |  5759K|       | 28118   (2)| 00:01:00 |       |       |
|* 33 |           HASH JOIN                       |                              |  9021 |   916K|       |   874  (14)| 00:00:02 |       |       |
|* 34 |            HASH JOIN                      |                              |  9021 |   643K|       |   742  (14)| 00:00:02 |       |       |
|* 35 |             INDEX RANGE SCAN              | TBL_METRICS_CONFIG_UNIQUE    |     2 |    44 |       |     1   (0)| 00:00:01 |       |       |
|  36 |             PARTITION RANGE SINGLE        |                              | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|* 37 |              TABLE ACCESS FULL            | TBL_YYYYYYYYY_METRICS        | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|  38 |            TABLE ACCESS FULL              | TBL_TTTTT_TIER_TTTTTTTT_CONV | 42010 |  1271K|       |   127  (10)| 00:00:01 |       |       |
|* 39 |           INDEX RANGE SCAN                | TBL_YYY_SPREADS_UNIQUE       |     5 |       |       |     2   (0)| 00:00:01 |       |       |
--------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ORGS"."PKEY"="CLASSIFICATIONS"."ORG"(+))
   4 - access("ORGS_CLASSIFICATIONS"."SECTOR"="STATIC_CLASSIFICATIONS"."PKEY")
   6 - filter("ORGS_CLASSIFICATIONS"."PROVIDER"='ICB')
   7 - access("YYYYYYYY_CURVE"."ASOF"="CURVES"."ASOF"(+) AND "YYYYYYYY_CURVE"."ENTITY"="CURVES"."ISSUER"(+) AND
              "YYYYYYYY_CURVE"."TIER"="CURVES"."TIER"(+) AND "YYYYYYYY_CURVE"."RUNNING_COUPON"="CURVES"."RUNNING_COUPON"(+))
  10 - access("O"."COUNTRY"="C"."ID")
  12 - access("CURVE"."ROWKEY"="TERM"."CURVE_KEY")
  14 - filter("CURVE"."RUNNING_COUPON" IS NOT NULL AND "CURVE"."SOURCE"='LEGACY' AND "CURVE"."OWNER"= (SELECT "CLIENTS"."PKEY" FROM
              "CORE"."CLIENTS" "CLIENTS" WHERE "CLIENTS"."NAME"='Xxxxxxxx'))
  15 - access("CURVE"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  17 - access("CLIENTS"."NAME"='Xxxxxxxx')
  18 - access("TERM"."INSTRUMENT"="I"."PKEY")
  19 - access("I"."ISSUER"="O"."PKEY" AND "I"."CCY"="O"."DEFAULTCCY" AND "I"."DOCCLAUSE"="O"."DEFAULTDOCCLAUSE")
  25 - access("ORGS"."COUNTRY"="STATIC_COUNTRIES"."ID")
  27 - access("YYYYYYYY_CURVE"."ENTITY"="ORGS"."PKEY")
  33 - access("M"."TRADING_CONV_ID"="C"."TBL_E_T_TRADING_CONV_ID")
  34 - access("M"."RANGE"="VALUE")
  35 - access("PROPERTY"='PercentileInTblReport')
  37 - filter("M"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  39 - access("M"."YYYYYYYY_METRICS_ID"="BA"."YYYYYYYY_METRICS_ID")

70 rows selected.

The plan_hash_value I was chasing for is 3853593815

I ran the controlled version and the results were like below.

SQL> @optimizer_param_test_e

PD_NAME                                            PD_VALUE      PLAN_HASH_VALUE
-------------------------------------------------- ---------- ------------------
_nlj_batching_enabled                              0                    49955995
_aggregation_optimization_settings                 32                  769921106
_optimizer_use_cbqt_star_transformation            FALSE
_optimizer_unnest_disjunctive_subq                 FALSE
_optimizer_unnest_corr_set_subq                    FALSE
_optimizer_try_st_before_jppd                      FALSE
_optimizer_table_expansion                         FALSE
_optimizer_null_aware_antijoin                     FALSE
_optimizer_native_full_outer_join                  OFF
_optimizer_multi_level_push_pred                   FALSE
_optimizer_join_factorization                      FALSE
_optimizer_improve_selectivity                     FALSE
_optimizer_fast_pred_transitivity                  FALSE
_optimizer_fast_access_pred_analysis               FALSE
_replace_virtual_columns                           FALSE
_optimizer_use_feedback                            FALSE
_and_pruning_enabled                               FALSE
_bloom_folding_enabled                             FALSE
_optimizer_extended_cursor_sharing_rel             NONE
_optimizer_extend_jppd_view_types                  FALSE
_optimizer_enable_density_improvements             FALSE
_optimizer_eliminate_filtering_join                FALSE
_optimizer_distinct_placement                      FALSE
_optimizer_distinct_agg_transform                  FALSE
_optimizer_connect_by_elim_dups                    FALSE
_optimizer_coalesce_subqueries                     FALSE
_optimizer_adaptive_cursor_sharing                 FALSE
_first_k_rows_dynamic_proration                    FALSE
_connect_by_use_union_all                          OLD_PLAN_M
                                                   ODE

_bloom_pruning_enabled                             FALSE
_optimizer_group_by_placement                      FALSE              1801558718
_optimizer_enable_extended_stats                   FALSE              1803847571
_optimizer_extended_stats_usage_control            255

It did not help me for the right plan, and to be honest I wasn’t expecting that it will help in this case because of the extra nexted loop which is controlled
by “_nlj_batching_enabled”.

I again tried it with the non_controlled version of the test builder to see if the parameters above will still effect.

SQL&gt; @build_optimizer_param_test_e_2
Enter value for before_ofe: 11.2.0.1
Enter value for after_ofe: 10.2.0.4

SQL&gt; @optimizer_param_test_e_2

PD_NAME PD_VALUE PLAN_HASH_VALUE
-------------------------------------------------- ---------- ------------------
_replace_virtual_columns FALSE 769921106
_optimizer_use_feedback FALSE
_optimizer_use_cbqt_star_transformation FALSE
_optimizer_unnest_disjunctive_subq FALSE
_optimizer_unnest_corr_set_subq FALSE
_optimizer_try_st_before_jppd FALSE
_optimizer_table_expansion FALSE
_optimizer_null_aware_antijoin FALSE
_optimizer_native_full_outer_join OFF
_optimizer_multi_level_push_pred FALSE
_optimizer_join_factorization FALSE
_optimizer_improve_selectivity FALSE
_optimizer_group_by_placement FALSE 1801558718 =&gt;change
_optimizer_fast_pred_transitivity FALSE
_optimizer_fast_access_pred_analysis FALSE
_optimizer_extended_stats_usage_control 255 1340272812 =&gt;change
_optimizer_extended_cursor_sharing_rel NONE
_optimizer_extend_jppd_view_types FALSE
_optimizer_enable_extended_stats FALSE
_optimizer_enable_density_improvements FALSE
_optimizer_eliminate_filtering_join FALSE
_optimizer_distinct_placement FALSE
_optimizer_distinct_agg_transform FALSE
_optimizer_connect_by_elim_dups FALSE
_optimizer_coalesce_subqueries FALSE
_optimizer_adaptive_cursor_sharing FALSE
_nlj_batching_enabled 0 3853593815 =&gt;change
_first_k_rows_dynamic_proration FALSE
_connect_by_use_union_all OLD_PLAN_M
ODE

_bloom_pruning_enabled FALSE
_bloom_folding_enabled FALSE
_and_pruning_enabled FALSE
_aggregation_optimization_settings 32

For non controlled version the order of the output above is the order of insert to the table so from first row to the last row test builder
reverts the parameter back to 10.2.0.4 value and goes on with the new parameter without changing it back to 11.2.0.1 value .
Till “_optimizer_group_by_placement” we have 11.2.0.1 default plan. After that one, we see another change on “_optimizer_extended_stats_usage_control”
and when nested loop behaviour reverts back with “_nlj_batching_enabled “ to the original we find our right plan. So possibilities drop down from 35 to 3.
If we are lucky enough combination of these 3 parameters will give us the right plan.

First start with extra nested loop controller “_nlj_batching_enabled” and we revert it back to pre-11g behavior

SQL> alter session set "_nlj_batching_enabled"=0;

Session altered.

SQL> @test_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 49955995

----------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |                              | 43364 |    28M|       |   795K (22)| 00:28:06 |       |       |
|   1 |  SORT ORDER BY                              |                              | 43364 |    28M|    30M|   795K (22)| 00:28:06 |       |       |
|*  2 |   HASH JOIN RIGHT OUTER                     |                              | 43364 |    28M|       |   790K (22)| 00:27:54 |       |       |
|   3 |    VIEW                                     |                              | 17518 |   975K|       |    62  (25)| 00:00:01 |       |       |
|*  4 |     HASH JOIN                               |                              | 17518 |   615K|       |    62  (25)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS FULL                      | STATIC_CLASSIFICATIONS       |   658 | 14476 |       |     3   (0)| 00:00:01 |       |       |
|*  6 |      TABLE ACCESS FULL                      | ORGS_CLASSIFICATIONS         | 17518 |   239K|       |    57  (23)| 00:00:01 |       |       |
|*  7 |    HASH JOIN RIGHT OUTER                    |                              | 43364 |    26M|       |   790K (22)| 00:27:54 |       |       |
|   8 |     VIEW                                    |                              |  4841 |  1096K|       |   760K (23)| 00:26:51 |       |       |
|   9 |      HASH GROUP BY                          |                              |  4841 |   808K|       |   760K (23)| 00:26:51 |       |       |
|* 10 |       HASH JOIN                             |                              |  4841 |   808K|       |   760K (23)| 00:26:51 |       |       |
|  11 |        TABLE ACCESS FULL                    | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 12 |        HASH JOIN                            |                              |  4841 |   756K|  2368K|   760K (23)| 00:26:51 |       |       |
|  13 |         TABLE ACCESS FULL                   | ORGS                         | 41094 |  1886K|       |   229  (12)| 00:00:01 |       |       |
|* 14 |         HASH JOIN                           |                              |   219K|    23M|    20M|   759K (23)| 00:26:49 |       |       |
|* 15 |          HASH JOIN                          |                              |   219K|    17M|       |   738K (23)| 00:26:04 |       |       |
|  16 |           PARTITION RANGE SINGLE            |                              | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 17 |            TABLE ACCESS BY LOCAL INDEX ROWID| FC_TBL_XXXXXXXXXX_CURVE      | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 18 |             INDEX RANGE SCAN                | FC_TBL_XXXXXXXXXX_CURVE_PK   |    48 |       |       |    34   (6)| 00:00:01 |    34 |    34 |
|  19 |             TABLE ACCESS BY INDEX ROWID     | CLIENTS                      |     1 |    20 |       |     2   (0)| 00:00:01 |       |       |
|* 20 |              INDEX UNIQUE SCAN              | CLIENTS_NAME_UNQ             |     1 |       |       |     1   (0)| 00:00:01 |       |       |
|  21 |           PARTITION RANGE ALL               |                              |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  22 |            TABLE ACCESS FULL                | FC_TBL_XXXXXXXXXX_TERMS      |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  23 |          PARTITION LIST SINGLE              |                              |  3215K|    88M|       | 17265  (15)| 00:00:37 |   KEY |   KEY |
|* 24 |           TABLE ACCESS FULL                 | INSTRUMENTS                  |  3215K|    88M|       | 17265  (15)| 00:00:37 |    34 |    34 |
|* 25 |     HASH JOIN                               |                              | 43364 |    16M|       | 29859   (2)| 00:01:04 |       |       |
|  26 |      TABLE ACCESS FULL                      | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 27 |      HASH JOIN                              |                              | 43364 |    16M|  2736K| 29853   (2)| 00:01:04 |       |       |
|  28 |       TABLE ACCESS FULL                     | ORGS                         | 41094 |  2247K|       |   229  (12)| 00:00:01 |       |       |
|  29 |       VIEW                                  |                              | 43364 |    13M|       | 29230   (2)| 00:01:02 |       |       |
|  30 |        HASH GROUP BY                        |                              | 43364 |  4319K|  4968K| 29230   (2)| 00:01:02 |       |       |
|  31 |         TABLE ACCESS BY GLOBAL INDEX ROWID  | TBL_YYYYYYYYY_SPREADS        |     5 |   130 |       |     3   (0)| 00:00:01 | ROWID | ROWID |
|  32 |          NESTED LOOPS                       |                              | 43364 |  4319K|       | 28309   (2)| 00:01:00 |       |       |
|  33 |           VIEW                              | VW_GBF_14                    |  9021 |   669K|       |  1065  (12)| 00:00:03 |       |       |
|  34 |            HASH GROUP BY                    |                              |  9021 |   916K|  1072K|  1065  (12)| 00:00:03 |       |       |
|* 35 |             HASH JOIN                       |                              |  9021 |   916K|       |   868  (13)| 00:00:02 |       |       |
|* 36 |              HASH JOIN                      |                              |  9021 |   643K|       |   736  (13)| 00:00:02 |       |       |
|* 37 |               INDEX RANGE SCAN              | TBL_METRICS_CONFIG_UNIQUE    |     2 |    44 |       |     1   (0)| 00:00:01 |       |       |
|  38 |               PARTITION RANGE SINGLE        |                              | 17495 |   871K|       |   733  (13)| 00:00:02 |    34 |    34 |
|* 39 |                TABLE ACCESS FULL            | TBL_YYYYYYYYY_METRICS        | 17495 |   871K|       |   733  (13)| 00:00:02 |    34 |    34 |
|  40 |              TABLE ACCESS FULL              | TBL_TTTTT_TIER_TTTTTTTT_CONV | 42010 |  1271K|       |   127  (10)| 00:00:01 |       |       |
|* 41 |           INDEX RANGE SCAN                  | TBL_YYY_SPREADS_UNIQUE       |     5 |       |       |     2   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ORGS"."PKEY"="CLASSIFICATIONS"."ORG"(+))
   4 - access("ORGS_CLASSIFICATIONS"."SECTOR"="STATIC_CLASSIFICATIONS"."PKEY")
   6 - filter("ORGS_CLASSIFICATIONS"."PROVIDER"='ICB')
   7 - access("YYYYYYYY_CURVE"."ASOF"="CURVES"."ASOF"(+) AND "YYYYYYYY_CURVE"."ENTITY"="CURVES"."ISSUER"(+) AND
              "YYYYYYYY_CURVE"."TIER"="CURVES"."TIER"(+) AND "YYYYYYYY_CURVE"."RUNNING_COUPON"="CURVES"."RUNNING_COUPON"(+))
  10 - access("O"."COUNTRY"="C"."ID")
  12 - access("I"."ISSUER"="O"."PKEY" AND "I"."CCY"="O"."DEFAULTCCY" AND "I"."DOCCLAUSE"="O"."DEFAULTDOCCLAUSE")
  14 - access("TERM"."INSTRUMENT"="I"."PKEY")
  15 - access("CURVE"."ROWKEY"="TERM"."CURVE_KEY")
  17 - filter("CURVE"."RUNNING_COUPON" IS NOT NULL AND "CURVE"."SOURCE"='LEGACY' AND "CURVE"."OWNER"= (SELECT "CLIENTS"."PKEY" FROM
              "CORE"."CLIENTS" "CLIENTS" WHERE "CLIENTS"."NAME"='Xxxxxxxx'))
  18 - access("CURVE"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  20 - access("CLIENTS"."NAME"='Xxxxxxxx')
  24 - filter("I"."DOCCLAUSE" IS NOT NULL)
  25 - access("ORGS"."COUNTRY"="STATIC_COUNTRIES"."ID")
  27 - access("YYYYYYYY_CURVE"."ENTITY"="ORGS"."PKEY")
  35 - access("M"."TRADING_CONV_ID"="C"."TBL_E_T_TRADING_CONV_ID")
  36 - access("M"."RANGE"="VALUE")
  37 - access("PROPERTY"='PercentileInTblReport')
  39 - filter("M"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  41 - access("ITEM_1"="BA"."YYYYYYYY_METRICS_ID")

73 rows selected.

Now its time to go with the “_optimizer_group_by_placement”. This is about “consider group-by placement optimization” and We have many group bys in our query
as inner query. I was at first under the impression that this is about “_optimizer_extend_jppd_view_types” like Dion Cho explained here
and Maria Colgan (she is also a great presented please do not miss her sessions) from optimizer group explained here
but actually that parameter is not changing the behavior. I wish there were more information about “_optimizer_group_by_placement” but there aren’t :(

SQL> alter session set "_optimizer_group_by_placement"=FALSE;

Session altered.

SQL> @test_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3419104514

----------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |                              | 43364 |    28M|       |   795K (22)| 00:28:06 |       |       |
|   1 |  SORT ORDER BY                              |                              | 43364 |    28M|    30M|   795K (22)| 00:28:06 |       |       |
|*  2 |   HASH JOIN RIGHT OUTER                     |                              | 43364 |    28M|       |   790K (22)| 00:27:54 |       |       |
|   3 |    VIEW                                     |                              | 17518 |   975K|       |    62  (25)| 00:00:01 |       |       |
|*  4 |     HASH JOIN                               |                              | 17518 |   615K|       |    62  (25)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS FULL                      | STATIC_CLASSIFICATIONS       |   658 | 14476 |       |     3   (0)| 00:00:01 |       |       |
|*  6 |      TABLE ACCESS FULL                      | ORGS_CLASSIFICATIONS         | 17518 |   239K|       |    57  (23)| 00:00:01 |       |       |
|*  7 |    HASH JOIN RIGHT OUTER                    |                              | 43364 |    26M|       |   790K (22)| 00:27:54 |       |       |
|   8 |     VIEW                                    |                              |  4841 |  1096K|       |   760K (23)| 00:26:51 |       |       |
|   9 |      HASH GROUP BY                          |                              |  4841 |   808K|       |   760K (23)| 00:26:51 |       |       |
|* 10 |       HASH JOIN                             |                              |  4841 |   808K|       |   760K (23)| 00:26:51 |       |       |
|  11 |        TABLE ACCESS FULL                    | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 12 |        HASH JOIN                            |                              |  4841 |   756K|  2368K|   760K (23)| 00:26:51 |       |       |
|  13 |         TABLE ACCESS FULL                   | ORGS                         | 41094 |  1886K|       |   229  (12)| 00:00:01 |       |       |
|* 14 |         HASH JOIN                           |                              |   219K|    23M|    20M|   759K (23)| 00:26:49 |       |       |
|* 15 |          HASH JOIN                          |                              |   219K|    17M|       |   738K (23)| 00:26:04 |       |       |
|  16 |           PARTITION RANGE SINGLE            |                              | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 17 |            TABLE ACCESS BY LOCAL INDEX ROWID| FC_TBL_XXXXXXXXXX_CURVE      | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 18 |             INDEX RANGE SCAN                | FC_TBL_XXXXXXXXXX_CURVE_PK   |    48 |       |       |    34   (6)| 00:00:01 |    34 |    34 |
|  19 |             TABLE ACCESS BY INDEX ROWID     | CLIENTS                      |     1 |    20 |       |     2   (0)| 00:00:01 |       |       |
|* 20 |              INDEX UNIQUE SCAN              | CLIENTS_NAME_UNQ             |     1 |       |       |     1   (0)| 00:00:01 |       |       |
|  21 |           PARTITION RANGE ALL               |                              |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  22 |            TABLE ACCESS FULL                | FC_TBL_XXXXXXXXXX_TERMS      |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  23 |          PARTITION LIST SINGLE              |                              |  3215K|    88M|       | 17265  (15)| 00:00:37 |   KEY |   KEY |
|* 24 |           TABLE ACCESS FULL                 | INSTRUMENTS                  |  3215K|    88M|       | 17265  (15)| 00:00:37 |    34 |    34 |
|* 25 |     HASH JOIN                               |                              | 43364 |    16M|       | 29883   (2)| 00:01:04 |       |       |
|  26 |      TABLE ACCESS FULL                      | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 27 |      HASH JOIN                              |                              | 43364 |    16M|  2736K| 29877   (2)| 00:01:04 |       |       |
|  28 |       TABLE ACCESS FULL                     | ORGS                         | 41094 |  2247K|       |   229  (12)| 00:00:01 |       |       |
|  29 |       VIEW                                  |                              | 43364 |    13M|       | 29254   (2)| 00:01:02 |       |       |
|  30 |        HASH GROUP BY                        |                              | 43364 |  5505K|  6208K| 29254   (2)| 00:01:02 |       |       |
|  31 |         TABLE ACCESS BY GLOBAL INDEX ROWID  | TBL_YYYYYYYYY_SPREADS        |     5 |   130 |       |     3   (0)| 00:00:01 | ROWID | ROWID |
|  32 |          NESTED LOOPS                       |                              | 43364 |  5505K|       | 28118   (2)| 00:01:00 |       |       |
|* 33 |           HASH JOIN                         |                              |  9021 |   916K|       |   874  (14)| 00:00:02 |       |       |
|* 34 |            HASH JOIN                        |                              |  9021 |   643K|       |   742  (14)| 00:00:02 |       |       |
|* 35 |             INDEX RANGE SCAN                | TBL_METRICS_CONFIG_UNIQUE    |     2 |    44 |       |     1   (0)| 00:00:01 |       |       |
|  36 |             PARTITION RANGE SINGLE          |                              | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|* 37 |              TABLE ACCESS FULL              | TBL_YYYYYYYYY_METRICS        | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|  38 |            TABLE ACCESS FULL                | TBL_TTTTT_TIER_TTTTTTTT_CONV | 42010 |  1271K|       |   127  (10)| 00:00:01 |       |       |
|* 39 |           INDEX RANGE SCAN                  | TBL_YYY_SPREADS_UNIQUE       |     5 |       |       |     2   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ORGS"."PKEY"="CLASSIFICATIONS"."ORG"(+))
   4 - access("ORGS_CLASSIFICATIONS"."SECTOR"="STATIC_CLASSIFICATIONS"."PKEY")
   6 - filter("ORGS_CLASSIFICATIONS"."PROVIDER"='ICB')
   7 - access("YYYYYYYY_CURVE"."ASOF"="CURVES"."ASOF"(+) AND "YYYYYYYY_CURVE"."ENTITY"="CURVES"."ISSUER"(+) AND
              "YYYYYYYY_CURVE"."TIER"="CURVES"."TIER"(+) AND "YYYYYYYY_CURVE"."RUNNING_COUPON"="CURVES"."RUNNING_COUPON"(+))
  10 - access("O"."COUNTRY"="C"."ID")
  12 - access("I"."ISSUER"="O"."PKEY" AND "I"."CCY"="O"."DEFAULTCCY" AND "I"."DOCCLAUSE"="O"."DEFAULTDOCCLAUSE")
  14 - access("TERM"."INSTRUMENT"="I"."PKEY")
  15 - access("CURVE"."ROWKEY"="TERM"."CURVE_KEY")
  17 - filter("CURVE"."RUNNING_COUPON" IS NOT NULL AND "CURVE"."SOURCE"='LEGACY' AND "CURVE"."OWNER"= (SELECT "CLIENTS"."PKEY" FROM
              "CORE"."CLIENTS" "CLIENTS" WHERE "CLIENTS"."NAME"='Xxxxxxxx'))
  18 - access("CURVE"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  20 - access("CLIENTS"."NAME"='Xxxxxxxx')
  24 - filter("I"."DOCCLAUSE" IS NOT NULL)
  25 - access("ORGS"."COUNTRY"="STATIC_COUNTRIES"."ID")
  27 - access("YYYYYYYY_CURVE"."ENTITY"="ORGS"."PKEY")
  33 - access("M"."TRADING_CONV_ID"="C"."TBL_E_T_TRADING_CONV_ID")
  34 - access("M"."RANGE"="VALUE")
  35 - access("PROPERTY"='PercentileInTblReport')
  37 - filter("M"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  39 - access("M"."YYYYYYYY_METRICS_ID"="BA"."YYYYYYYY_METRICS_ID")

71 rows selected.

Still not helping with the exact plan I am looking for. Last try will be “_optimizer_extended_stats_usage_control”
This parameter is normally not available on 10.2.0.4 so probably setting it to 255 actually means disabling it.
Even we don’t have any extended stats on our database somehow it effects optimizer. Lets see how it effects

SQL> alter session set "_optimizer_extended_stats_usage_control"=255;

Session altered.

SQL> @test_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3853593815

--------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                              | 43364 |    28M|       |   794K (22)| 00:28:04 |       |       |
|   1 |  SORT ORDER BY                            |                              | 43364 |    28M|    30M|   794K (22)| 00:28:04 |       |       |
|*  2 |   HASH JOIN RIGHT OUTER                   |                              | 43364 |    28M|       |   789K (22)| 00:27:52 |       |       |
|   3 |    VIEW                                   |                              | 17518 |   975K|       |    62  (25)| 00:00:01 |       |       |
|*  4 |     HASH JOIN                             |                              | 17518 |   615K|       |    62  (25)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS FULL                    | STATIC_CLASSIFICATIONS       |   658 | 14476 |       |     3   (0)| 00:00:01 |       |       |
|*  6 |      TABLE ACCESS FULL                    | ORGS_CLASSIFICATIONS         | 17518 |   239K|       |    57  (23)| 00:00:01 |       |       |
|*  7 |    HASH JOIN RIGHT OUTER                  |                              | 43364 |    26M|       |   789K (22)| 00:27:52 |       |       |
|   8 |     VIEW                                  |                              |  1683 |   381K|       |   759K (23)| 00:26:49 |       |       |
|   9 |      HASH GROUP BY                        |                              |  1683 |   281K|       |   759K (23)| 00:26:49 |       |       |
|* 10 |       HASH JOIN                           |                              |  1683 |   281K|       |   759K (23)| 00:26:49 |       |       |
|  11 |        TABLE ACCESS FULL                  | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 12 |        HASH JOIN                          |                              |  1683 |   262K|       |   759K (23)| 00:26:49 |       |       |
|  13 |         PARTITION RANGE SINGLE            |                              | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 14 |          TABLE ACCESS BY LOCAL INDEX ROWID| FC_TBL_XXXXXXXXXX_CURVE      | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 15 |           INDEX RANGE SCAN                | FC_TBL_XXXXXXXXXX_CURVE_PK   |    48 |       |       |    34   (6)| 00:00:01 |    34 |    34 |
|  16 |           TABLE ACCESS BY INDEX ROWID     | CLIENTS                      |     1 |    20 |       |     2   (0)| 00:00:01 |       |       |
|* 17 |            INDEX UNIQUE SCAN              | CLIENTS_NAME_UNQ             |     1 |       |       |     1   (0)| 00:00:01 |       |       |
|* 18 |         HASH JOIN                         |                              |  1964K|   191M|       |   758K (23)| 00:26:48 |       |       |
|* 19 |          HASH JOIN                        |                              | 14382 |  1067K|  2368K| 20870  (16)| 00:00:45 |       |       |
|  20 |           TABLE ACCESS FULL               | ORGS                         | 41094 |  1886K|       |   229  (12)| 00:00:01 |       |       |
|  21 |           PARTITION LIST SINGLE           |                              |  3215K|    88M|       | 17265  (15)| 00:00:37 |   KEY |   KEY |
|* 22 |            TABLE ACCESS FULL              | INSTRUMENTS                  |  3215K|    88M|       | 17265  (15)| 00:00:37 |    34 |    34 |
|  23 |          PARTITION RANGE ALL              |                              |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  24 |           TABLE ACCESS FULL               | FC_TBL_XXXXXXXXXX_TERMS      |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|* 25 |     HASH JOIN                             |                              | 43364 |    16M|       | 29883   (2)| 00:01:04 |       |       |
|  26 |      TABLE ACCESS FULL                    | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 27 |      HASH JOIN                            |                              | 43364 |    16M|  2736K| 29877   (2)| 00:01:04 |       |       |
|  28 |       TABLE ACCESS FULL                   | ORGS                         | 41094 |  2247K|       |   229  (12)| 00:00:01 |       |       |
|  29 |       VIEW                                |                              | 43364 |    13M|       | 29254   (2)| 00:01:02 |       |       |
|  30 |        HASH GROUP BY                      |                              | 43364 |  5505K|  6208K| 29254   (2)| 00:01:02 |       |       |
|  31 |         TABLE ACCESS BY GLOBAL INDEX ROWID| TBL_YYYYYYYYY_SPREADS        |     5 |   130 |       |     3   (0)| 00:00:01 | ROWID | ROWID |
|  32 |          NESTED LOOPS                     |                              | 43364 |  5505K|       | 28118   (2)| 00:01:00 |       |       |
|* 33 |           HASH JOIN                       |                              |  9021 |   916K|       |   874  (14)| 00:00:02 |       |       |
|* 34 |            HASH JOIN                      |                              |  9021 |   643K|       |   742  (14)| 00:00:02 |       |       |
|* 35 |             INDEX RANGE SCAN              | TBL_METRICS_CONFIG_UNIQUE    |     2 |    44 |       |     1   (0)| 00:00:01 |       |       |
|  36 |             PARTITION RANGE SINGLE        |                              | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|* 37 |              TABLE ACCESS FULL            | TBL_YYYYYYYYY_METRICS        | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|  38 |            TABLE ACCESS FULL              | TBL_TTTTT_TIER_TTTTTTTT_CONV | 42010 |  1271K|       |   127  (10)| 00:00:01 |       |       |
|* 39 |           INDEX RANGE SCAN                | TBL_YYY_SPREADS_UNIQUE       |     5 |       |       |     2   (0)| 00:00:01 |       |       |
--------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ORGS"."PKEY"="CLASSIFICATIONS"."ORG"(+))
   4 - access("ORGS_CLASSIFICATIONS"."SECTOR"="STATIC_CLASSIFICATIONS"."PKEY")
   6 - filter("ORGS_CLASSIFICATIONS"."PROVIDER"='ICB')
   7 - access("YYYYYYYY_CURVE"."ASOF"="CURVES"."ASOF"(+) AND "YYYYYYYY_CURVE"."ENTITY"="CURVES"."ISSUER"(+) AND
              "YYYYYYYY_CURVE"."TIER"="CURVES"."TIER"(+) AND "YYYYYYYY_CURVE"."RUNNING_COUPON"="CURVES"."RUNNING_COUPON"(+))
  10 - access("O"."COUNTRY"="C"."ID")
  12 - access("CURVE"."ROWKEY"="TERM"."CURVE_KEY")
  14 - filter("CURVE"."RUNNING_COUPON" IS NOT NULL AND "CURVE"."SOURCE"='LEGACY' AND "CURVE"."OWNER"= (SELECT "CLIENTS"."PKEY" FROM
              "CORE"."CLIENTS" "CLIENTS" WHERE "CLIENTS"."NAME"='Xxxxxxxx'))
  15 - access("CURVE"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  17 - access("CLIENTS"."NAME"='Xxxxxxxx')
  18 - access("TERM"."INSTRUMENT"="I"."PKEY")
  19 - access("I"."ISSUER"="O"."PKEY" AND "I"."CCY"="O"."DEFAULTCCY" AND "I"."DOCCLAUSE"="O"."DEFAULTDOCCLAUSE")
  22 - filter("I"."DOCCLAUSE" IS NOT NULL)
  25 - access("ORGS"."COUNTRY"="STATIC_COUNTRIES"."ID")
  27 - access("YYYYYYYY_CURVE"."ENTITY"="ORGS"."PKEY")
  33 - access("M"."TRADING_CONV_ID"="C"."TBL_E_T_TRADING_CONV_ID")
  34 - access("M"."RANGE"="VALUE")
  35 - access("PROPERTY"='PercentileInTblReport')
  37 - filter("M"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  39 - access("M"."YYYYYYYY_METRICS_ID"="BA"."YYYYYYYY_METRICS_ID")

71 rows selected.

As you can see after the 3th parameter we have our plan back

One more thing to show is that how will it be with extra nested loop (preperation for the next post :) ).

SQL> alter session set "_nlj_batching_enabled"=1;

Session altered.

SQL> @tesT_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1340272812

---------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                              | 43364 |    28M|       |   794K (22)| 00:28:04 |       |       |
|   1 |  SORT ORDER BY                             |                              | 43364 |    28M|    30M|   794K (22)| 00:28:04 |       |       |
|*  2 |   HASH JOIN RIGHT OUTER                    |                              | 43364 |    28M|       |   789K (22)| 00:27:52 |       |       |
|   3 |    VIEW                                    |                              | 17518 |   975K|       |    62  (25)| 00:00:01 |       |       |
|*  4 |     HASH JOIN                              |                              | 17518 |   615K|       |    62  (25)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS FULL                     | STATIC_CLASSIFICATIONS       |   658 | 14476 |       |     3   (0)| 00:00:01 |       |       |
|*  6 |      TABLE ACCESS FULL                     | ORGS_CLASSIFICATIONS         | 17518 |   239K|       |    57  (23)| 00:00:01 |       |       |
|*  7 |    HASH JOIN RIGHT OUTER                   |                              | 43364 |    26M|       |   789K (22)| 00:27:52 |       |       |
|   8 |     VIEW                                   |                              |  1683 |   381K|       |   759K (23)| 00:26:49 |       |       |
|   9 |      HASH GROUP BY                         |                              |  1683 |   281K|       |   759K (23)| 00:26:49 |       |       |
|* 10 |       HASH JOIN                            |                              |  1683 |   281K|       |   759K (23)| 00:26:49 |       |       |
|  11 |        TABLE ACCESS FULL                   | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 12 |        HASH JOIN                           |                              |  1683 |   262K|       |   759K (23)| 00:26:49 |       |       |
|  13 |         PARTITION RANGE SINGLE             |                              | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 14 |          TABLE ACCESS BY LOCAL INDEX ROWID | FC_TBL_XXXXXXXXXX_CURVE      | 10667 |   604K|       |   353   (3)| 00:00:01 |    34 |    34 |
|* 15 |           INDEX RANGE SCAN                 | FC_TBL_XXXXXXXXXX_CURVE_PK   |    48 |       |       |    34   (6)| 00:00:01 |    34 |    34 |
|  16 |           TABLE ACCESS BY INDEX ROWID      | CLIENTS                      |     1 |    20 |       |     2   (0)| 00:00:01 |       |       |
|* 17 |            INDEX UNIQUE SCAN               | CLIENTS_NAME_UNQ             |     1 |       |       |     1   (0)| 00:00:01 |       |       |
|* 18 |         HASH JOIN                          |                              |  1964K|   191M|       |   758K (23)| 00:26:48 |       |       |
|* 19 |          HASH JOIN                         |                              | 14382 |  1067K|  2368K| 20870  (16)| 00:00:45 |       |       |
|  20 |           TABLE ACCESS FULL                | ORGS                         | 41094 |  1886K|       |   229  (12)| 00:00:01 |       |       |
|  21 |           PARTITION LIST SINGLE            |                              |  3215K|    88M|       | 17265  (15)| 00:00:37 |   KEY |   KEY |
|* 22 |            TABLE ACCESS FULL               | INSTRUMENTS                  |  3215K|    88M|       | 17265  (15)| 00:00:37 |    34 |    34 |
|  23 |          PARTITION RANGE ALL               |                              |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|  24 |           TABLE ACCESS FULL                | FC_TBL_XXXXXXXXXX_TERMS      |   255M|  6345M|       |   717K (21)| 00:25:19 |     1 |    40 |
|* 25 |     HASH JOIN                              |                              | 43364 |    16M|       | 29883   (2)| 00:01:04 |       |       |
|  26 |      TABLE ACCESS FULL                     | STATIC_COUNTRIES             |   239 |  2629 |       |     2   (0)| 00:00:01 |       |       |
|* 27 |      HASH JOIN                             |                              | 43364 |    16M|  2736K| 29877   (2)| 00:01:04 |       |       |
|  28 |       TABLE ACCESS FULL                    | ORGS                         | 41094 |  2247K|       |   229  (12)| 00:00:01 |       |       |
|  29 |       VIEW                                 |                              | 43364 |    13M|       | 29254   (2)| 00:01:02 |       |       |
|  30 |        HASH GROUP BY                       |                              | 43364 |  5505K|  6208K| 29254   (2)| 00:01:02 |       |       |
|  31 |         NESTED LOOPS                       |                              |       |       |       |            |          |       |       |
|  32 |          NESTED LOOPS                      |                              | 43364 |  5505K|       | 28118   (2)| 00:01:00 |       |       |
|* 33 |           HASH JOIN                        |                              |  9021 |   916K|       |   874  (14)| 00:00:02 |       |       |
|* 34 |            HASH JOIN                       |                              |  9021 |   643K|       |   742  (14)| 00:00:02 |       |       |
|* 35 |             INDEX RANGE SCAN               | TBL_METRICS_CONFIG_UNIQUE    |     2 |    44 |       |     1   (0)| 00:00:01 |       |       |
|  36 |             PARTITION RANGE SINGLE         |                              | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|* 37 |              TABLE ACCESS FULL             | TBL_YYYYYYYYY_METRICS        | 17495 |   871K|       |   739  (14)| 00:00:02 |    34 |    34 |
|  38 |            TABLE ACCESS FULL               | TBL_TTTTT_TIER_TTTTTTTT_CONV | 42010 |  1271K|       |   127  (10)| 00:00:01 |       |       |
|* 39 |           INDEX RANGE SCAN                 | TBL_YYY_SPREADS_UNIQUE       |     5 |       |       |     2   (0)| 00:00:01 |       |       |
|  40 |          TABLE ACCESS BY GLOBAL INDEX ROWID| TBL_YYYYYYYYY_SPREADS        |     5 |   130 |       |     3   (0)| 00:00:01 | ROWID | ROWID |
---------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ORGS"."PKEY"="CLASSIFICATIONS"."ORG"(+))
   4 - access("ORGS_CLASSIFICATIONS"."SECTOR"="STATIC_CLASSIFICATIONS"."PKEY")
   6 - filter("ORGS_CLASSIFICATIONS"."PROVIDER"='ICB')
   7 - access("YYYYYYYY_CURVE"."ASOF"="CURVES"."ASOF"(+) AND "YYYYYYYY_CURVE"."ENTITY"="CURVES"."ISSUER"(+) AND
              "YYYYYYYY_CURVE"."TIER"="CURVES"."TIER"(+) AND "YYYYYYYY_CURVE"."RUNNING_COUPON"="CURVES"."RUNNING_COUPON"(+))
  10 - access("O"."COUNTRY"="C"."ID")
  12 - access("CURVE"."ROWKEY"="TERM"."CURVE_KEY")
  14 - filter("CURVE"."RUNNING_COUPON" IS NOT NULL AND "CURVE"."SOURCE"='LEGACY' AND "CURVE"."OWNER"= (SELECT "CLIENTS"."PKEY" FROM
              "CORE"."CLIENTS" "CLIENTS" WHERE "CLIENTS"."NAME"='Xxxxxxxx'))
  15 - access("CURVE"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  17 - access("CLIENTS"."NAME"='Xxxxxxxx')
  18 - access("TERM"."INSTRUMENT"="I"."PKEY")
  19 - access("I"."ISSUER"="O"."PKEY" AND "I"."CCY"="O"."DEFAULTCCY" AND "I"."DOCCLAUSE"="O"."DEFAULTDOCCLAUSE")
  22 - filter("I"."DOCCLAUSE" IS NOT NULL)
  25 - access("ORGS"."COUNTRY"="STATIC_COUNTRIES"."ID")
  27 - access("YYYYYYYY_CURVE"."ENTITY"="ORGS"."PKEY")
  33 - access("M"."TRADING_CONV_ID"="C"."TBL_E_T_TRADING_CONV_ID")
  34 - access("M"."RANGE"="VALUE")
  35 - access("PROPERTY"='PercentileInTblReport')
  37 - filter("M"."ASOF"=TO_DATE(' 2010-10-18 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  39 - access("M"."YYYYYYYY_METRICS_ID"="BA"."YYYYYYYY_METRICS_ID")

72 rows selected.

This plan is the plan I would like to see once I upgraded the DB but due to the new optimizations I saw something completelly different.
Once I disabled the right ones we have a solution to generate the right plan.

If I have time to deal with it (maybe I have in the future) I could show you the differences on 10053 file for this case, but it can only happen when I finish this series.

This post and the previous posts about bugfixes are just for you to give better understanding of what is effecting your plan change
and how you can feedback oracle support with the finding you found out. The non-scientific methods here are for automating your researches and saving you some time and
giving you the idea where to look when you look at the 10053 file if you want to dig further.
Normally shortest way to create the plan you are looking after is Using OFE and Plan Baselines together and job will be done for %99 of the cases.

!!!!!These are again undocumented parameters They can be solution to wrong results or they can be cause of wrong results
please do a research on MOS and ask Oracle support before you use them

Scripts:

All in one
full_outer_join_test_case.sql
build_opt_param_table.sql
before_after_opt_param.sql
build_optimizer_param_test.sql
build_optimizer_param_test_e.sql
build_optimizer_param_test_2.sql
build_optimizer_param_test_e_2.sql

<–Previous Post in This Series —————-     Next Post in this series–>

February 15, 2011

Plan Stability Through Upgrade-Why is my plan changed?-bugfixes-2

Filed under: Bugs, CBO, Performance, Plan Stability — coskan @ 12:01 am

This post is a bonus post where things can get complicated with my approach for finding bugfixes , if you are already upgraded.

On a database where we upgraded from 10.2.0.4 to 11.2.0.1 , We had a sql reported to have changed its response time on 11.2.0.1 .(At this point I need to say I’m sorry that I can’t share the SQL and PLANS because of data privacy. Tried to edit the sql and plans but query is massive . )

Interestingly when I run the optimizer_features_test for the versions between 10.2.0.4 to 11.2.0.1
only change was extra nested loop. Cost is same and plan generated for 10.2.0.4 was not same as pre-upgrade version

Runtime statistics .


SQL> @sqlid b290wxr1j2a2b

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
b290wxr1j2a2b 3272681547 SELECT   /*+gather_plan_statistics*//*test*/q.rowkey, i.NAME instrument,                                  NVL ((q.enddate - to_date('01Oct10',

1 row selected.

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 00000002BCC6C048 00000002BCC6A0A0   83708128     2022          1          1          1          2            121 1255771.09 1323564.07 1323.56407   62856727      12621          1               0
   1 00000002BCC6C048 00000002BCC2A3C8 1845562615      889          1          1          1          2            121   1847.719   2327.507   2.327507       2889        306          1               0
   2 00000002BCC6C048 00000002BCC24D78   83708128     2022          2          1          2          4            242 2310945.68 2318713.43 1159.35672  125714199        103          2               0
   3 00000002BCC6C048 00000002BCBCEFD0 3790468002     2022          2          1          2          4            242 2349948.75 2361862.35 1180.93118  125714497        424          2               0
   4 00000002BCC6C048 00000002BDDAF630  427381853     2022          2          1          2          4            242 2308517.05 2319381.53 1159.69077  125713318         26          2               0
   5 00000002BCC6C048 00000002DA577EB8  427381853     2022          2          1          2          4            242 2294214.23 2300615.11 1150.30755  125713315          0          2               0

6 rows selected.
Child-Version Match is like below

Child	OFE
=================
0-1	11.2.0.1
2	11.1.0.7
3	11.1.0.6
4	10.2.0.5
5	10.2.0.4

Child number 1 on 11.2.0.1 runs with cardinality feedback and uses right plan for OR with extra nested loop make up and drops the response time from 1323 seconds to 2 seconds
Other

So what is changed and how can I catch it if changing OFE is not working ? Its pretty obvious that we have a stats problem but this query involves 10 different table
and its not always easy to gather stats on tables where all other queries are running fine on. I needed to find a solution.

When I check the query it has an OR and the new plans basically do not generate a plan for OR. I modified the query to use union all instead of OR I can get the same pre-upgrade plan which is interesting
because stats are still same so how come Oracle generates right plan when I use union all and wrong plan when I use OR ? There must be something causing this behaviour.

Since Changing OFE did not change anything in terms of response time we may look into the situation where 8.0 bugfix suddenly appeared on 11.2.0.1 but were not active on 10.2.0.4

How can we spot the bugfix ? You can use the bugfix builder that generates script to test bugfixes for all version below the given version

Script is build_optimizer_bugfix_test_below_e: Same thing but creates a script for all the bugfixes for the versions below the given version

Source code of the script

undefine optimizer_feature_version
set heading off
set feedback off
define optimizer_feature=&quot;&amp;optimizer_feature_version&quot;
set term off
spool optimizer_bugfix_test_e.sql
select 'set feedback off' from dual;
select 'set heading off' from dual;
select 'set term off' from dual;
select 'set echo off' from dual;
select 'spool optimizer_bugfix_test_results_e.log' from dual;
select 'drop table  bugfix_table purge;' from dual;
select 'create  table bugfix_table (bugfix number,plan_hash_value varchar2(20));' from dual;
select 'alter session set optimizer_features_enable=''&amp;optimizer_feature'';' from dual;
select   'alter session set &quot;_fix_control&quot;='''||bugno||decode(value,1,':OFF''',':ON''')||'; '
||chr(10)||'@test_e.sql'||chr(10)||
'insert into bugfix_table values ('||bugno||',(select substr(plan_table_output,18,12) from (select * from table(dbms_xplan.display)) where plan_table_output  like ''Plan hash value:%''));'||chr(10)||
'alter session set &quot;_fix_control&quot;='''||bugno||decode(value,1,':ON''',':OFF''')||'; '
from v$system_fix_control
where
to_number(replace(optimizer_feature_enable,'.',''))&lt;to_number(replace('&amp;optimizer_feature','.',''))
and value in (1,0) order by bugno;
select 'set term on' from dual;
select 'set feedback on' from dual;
select 'set heading on' from dual;
select 'break on plan_hash_value' from dual;
select 'select plan_hash_value,bugfix from bugfix_table order by 1;' from dual;
select 'set feedback on' from dual;
select 'set heading on' from dual;
select 'spool off' from dual;
set feedback on
set heading on
set term on
spool off

When I run it


SQL&gt; @build_optimizer_bugfix_test_below_e
Enter value for optimizer_feature_version: 10.2.0.4
SQL&gt; @optimizer_bugfix_test_e

PLAN_HASH_VALUE          BUGFIX
-------------------- ----------
1186576438              5129233
                        3120429
427381853               1403283
                        2320291
                        2324795
                        2492766
                        2660592
                        2663857
			.......
			.......
76 rows selected.

Two bugfixes gives us the plan we are looking for

SQL&gt; @bugfixg
Enter value for bugno: 5129233

     BUGNO      VALUE SQL_FEATURE               DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE       EVENT IS_DEFAULT
---------- ---------- ------------------------- ---------------------------------------------------------------- ------------------------- ---------- ----------
   5129233          1 QKSFM_CBO_5129233         peek at any foldable exprssion during pruning                    10.2.0.3                           0          1

1 row selected.

SQL&gt; @bugfixg
Enter value for bugno: 3120429

     BUGNO      VALUE SQL_FEATURE               DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE       EVENT IS_DEFAULT
---------- ---------- ------------------------- ---------------------------------------------------------------- ------------------------- ---------- ----------
   3120429          1 QKSFM_ACCESS_PATH_3120429 account for join key sparsity in computing NL index access cost  10.1.0.3                           0          1

1 row selected.

how can we sure which one is our problem ? Both of them enables us to run with the right plan
At this point only option is to go and check on a 10.2.0.4 database to see what these fixes were


SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL> @bugfixg
Enter value for bugno: 5129233

     BUGNO      VALUE DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE       EVENT IS_DEFAULT
---------- ---------- ---------------------------------------------------------------- ------------------------- ---------- ----------
   5129233          1 peek at any foldable exprssion during pruning                    10.2.0.3                           0          1

SQL> @bugfixg
Enter value for bugno: 3120429

no rows selected

As you can see 5129233 is again enabled on 10.2.0.4 but bugfix 3120429 does not exist and not enabled/disabled on a database running on 10.2.0.4 binaries.

Even both of them are solving the issue when we are running on 11.2.0.1, in my opinion controlling bugfix is 3120429.
Thanks to the reasonable developer who accepted to change the code using union all instead of or we did not need to enable disable a bugfix.

Since we find two bugs lets test them (I add comments to force new parse instead of flushing the sql for fresh runtime stats so sqlids are different )

When bugfix 3120429 is disabled

SQL> @sqlid 3798wumm6mwsq

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
3798wumm6mwsq 3865703190 SELECT   /*+gather_plan_statistics*//*test_2*/q.rowkey, i.NAME instrument,                                  NVL ((q.enddate - to_date('01Oct10',

1 row selected.

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 00000002DC1DABC8 00000002DC968950 4098949197      531          2          1          2          2              0     1305.8     1332.1     .66605        207          3          2               0
   1 00000002DC1DABC8 00000002DC6182A0 4098949197      531          2          1          2          2              0   1328.799   1336.494    .668247        207          0          2               0
   2 00000002DC1DABC8 00000002DC76E5E8 3700866780      451          2          1          2          2              0   1353.794   1369.258    .684629        214          0          2               0
   3 00000002DC1DABC8 00000002DCCAAFF8 2856793267      451          2          1          2          2              0   1244.811   1251.411   .6257055        214          0          2               0
   4 00000002DC1DABC8 00000002A5F40D20 1186576438      481          2          1          2          2              0    898.864    906.962    .453481        190          0          2               0

When bugfix 5129233 is disabled


SQL> @sqlid bh6a9xtq27hba

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
bh6a9xtq27hba 1814282602 SELECT   /*+gather_plan_statistics*//*test_3*/q.rowkey, i.NAME instrument,                                  NVL ((q.enddate - to_date('01Oct10',

1 row selected.

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 00000002DC6F9150 00000002DC955228  986578726      519          2          1          2          2              0    1177.82   1183.933   .5919665        183          0          2               0
   1 00000002DC6F9150 00000002A5F400F0  986578726      519          2          1          2          2              0   1261.808   1268.259   .6341295        183          0          2               0
   2 00000002DC6F9150 00000002DCBB2DE0 3700866780      445          2          1          2          2              0   1506.771   1533.008    .766504        214          0          2               0
   3 00000002DC6F9150 00000002DCBB29C0 2856793267      445          2          1          2          2              0   1305.801   1317.385   .6586925        214          0          2               0
   4 00000002DC6F9150 00000002DC81E000 1186576438      474          2          1          2          2              0    940.858    945.413   .4727065        190          0          2               0

On 11.2.0.1 and 11.1.0.7 plan is changed depending on the bugfix but for 10.2.0.4, 10.2.0.5 and 11.1.0.6 plan is constant for changing different bugfixes.
You can also see previous cardinality feedback plan is also slower then bugfix disabled version.

We can dig further but its not the point of this post and Since I managed to take the response time back to where it was from 1323 seconds to 0.5 seconds I can stop here.

As you can see we still spot a difference after upgrade even time travelling between releases by altering OFE did not help us at first point.

This post was to explain complicated issue and things you can face with my approach.

I also find it very usefull to have v$system_fix_control table snapped before you upgrade/patch and compared afterwards, so you won’t get shocked with newly added previos versions related bugs.

Scripts Used;

build_optimizer_bugfix_test_below_e

<-Previos Post in Series ——————————-  Next Post in Series–>

February 14, 2011

Plan Stability Through Upgrade-Why is my plan changed?-bugfixes-1

Filed under: Bugs, CBO, Performance, Plan Stability — coskan @ 12:00 pm

Before I start the actual post let me clarify something, these post series will be about using practical ways like I did on this post to understand
what is the cause of the post upgrade performance changes. I am trying to be more practical than scientific because of the nature of the production support DBA job.
When you are a production DBA, you deal with many issues/databases at the same time and time spent for a solution is very important for you to be able to move to the second problem.
Normally if you are a scientist, oracle enthusiastic then you get 10053 trace try to understand the data try to find out why optimizer did something on Xth step but in reality when it comes to upgrade issues
this is really time consuming and not very practical. You really need time/experience/knowledge to understand a 10053 trace especially for massive queries which has gazillion joins in it.
You can still understand what is going on from prediction and runtime statistics of a plan but it is not the answer you can give to the developers.
If you go and explain them all the things you spot in execution plan output and 10053 trace All they will say to you is “DB is upgraded and upgrade sucks because previous release was better why are we upgrading”
so you need a bit bigger proof to make them accept the issue and workarounds. I also want to add that, my priority for these series is not focusing on content of sqls
but giving you the practical technique to stabilize their plans during upgrades.

Now back to post, You followed the steps in first post and you found out that your plan is changed, now it is time to find out why it is changed
when the only optimizer version is changed. One of the possible reasons is bugfixes in the optimizer.

a-Brief introduction to bugfixes.
Information below is taken from MOS Note – Init.ora Parameter “_FIX_CONTROL” [Hidden] Reference Note [ID 567171.1];

Support and Historical Notes for "_FIX_CONTROL"

  <strong>_FIX_CONTROL</strong> is a special hidden parameter introduced in 10.2.0.2 which can be used to enable or disable certain bug fixes.
  This is typically used in the optimizer area to allow specific fixes to be turned on or off as some fixes can have a significant
  effect on execution plans. Often the default setting (ON or OFF) for a fix is determined by the
  value of the OPTIMIZER_FEATURES_ENABLE parameter.  A full list of bug numbers which are  controlled by fix control can be found in the V$ views:
    - GV$SYSTEM_FIX_CONTROL
    - GV$SESSION_FIX_CONTROL

  The general format for this parameter is the form: _FIX_CONTROL='bug_number:ON|OFF'

  eg:  alter session set "_fix_control"='4969880:ON';
      or in the init.ora
       _fix_control='4969880:ON'

  Multiple values can be specified:
  eg:	alter session set "_fix_control"='4969880:ON','5483301:off';

  Note that fixes that use _FIX_CONTROL in one release may not use it in other releases so any settings should be reviewed when
  upgrading / downgrading or applying a patch set. Also some fix controls have numeric settings rather than ON or OFF.

In summary in every release of Oracle Database (( including patches) ) there are bugfixes and since 10.2.0.2 their availability can be controlled by
this “undocumented” parameter. Some of these bugfixes are on by default some of them are not depending on the optimizer version+oracle binary(check extra test case at the bottom). you are running.
Bugfixes gives us fine grained control. If the optimizer feature you are running causing problems on some queries because of a bugfix then you can still stay on
the version you want but disable the fixes, having said that you still need aproval from ORacle Support if it is safe or not to enable disable bug.

Here are the number of bugfixes by releases taken from 11.2.0.2 base release

SQL&gt; select optimizer_feature_enable,count(*) from v$system_fix_control group by optimizer_feature_enable order by 1 asc;

OPTIMIZER_FEATURE_ENABLE    COUNT(*)
------------------------- ----------
10.1.0                             3
10.1.0.3                           1
10.1.0.5                           2
10.2.0.1                           7
10.2.0.2                          12
10.2.0.3                          11
10.2.0.4                          73
10.2.0.5                         111
11.1.0.6                          39
11.1.0.7                          19
11.2.0.1                          65
11.2.0.2                         105
8.0.0                             52
8.1.6                              1
8.1.7                              2
9.2.0                              8
9.2.0.8                            3
37

And this is from 10.2.0.4 PSU 2

SQL&gt; select optimizer_feature_enable,count(*) from v$system_fix_control group by optimizer_feature_enable order by 1 asc;

OPTIMIZER_FEATURE_ENABLE    COUNT(*)
------------------------- ----------
10.1.0                             1
10.1.0.5                           1
10.2.0.1                           4
10.2.0.2                          13
10.2.0.3                          10
10.2.0.4                          68
9.2.0                              2
9.2.0.8                            7
9

There are couple of things you can spot here. Despite being higher version Oracle sometimes waits to put a bugfix of lower versions.
There are bugfixes from 10.1.0.3 and 8.0.0 and 8.1.6 8.1.7 which are on 11.2.0.2 relase but not on 10.2.0.4 release which is interesting.
In general do not think that bugfixes are added gradually. Oracle can put/make switchable a bugfix from a very old release to a brand new release. There is one more thing to bear in mind not %100 of bugfixes works with value 1 or 0. There are couple of them with values over 1 probably related with thresholds. In this post I will not pay attention to them.

 

After this brief information now it is time for the actual reason of this post.
As you recall, I shared an optimizer_features_test builder on the first post and now with the same logic I have 2 more builder scripts
to build your bugfix test for versions. After you identify under which OFE your plan behaviour changed, all you need to do is to pass
that version level to the scripts below and let them create test cases by enabling/disabling the bugfixes spesific to that version.

build_optimizer_bugfix_test.sql: This is for runtime bugfix test_case generator. All it does is create optimizer_bugfix_test.sql by using v$system_fix_control_table.
It gets each bugfix from that table for given OFE version and enable/disable them depending on their default value then run the test.sql and disable/enable the bugfix back again.
So we have a controlled environment where we test 1 bugfix at a time. It captures every plan_hash_value into a table and select them once all bugfix numbers tested.
If you are lucky enough or in other words if the problem is caused by a bugfix Oracle will revert the plan back to the original once you are changing that bugfixes value.
There is a tiny little problem with this approach, you cannot run 600 second query for 103 bugfixes. This is mostly for very quick queries.
The code of the sql is like below


undefine optimizer_feature_version
set heading off
set feedback off
define optimizer_feature="&optimizer_feature_version"
set term off
spool optimizer_bugfix_test.sql
select 'set feedback off' from dual;
select 'set heading off' from dual;
select 'set term off' from dual;
select 'set echo off' from dual;
select 'spool optimizer_bugfix_test_results.log' from dual;
select 'drop table  bugfix_table purge;' from dual;
select 'create  table bugfix_gtt (bugfix number,plan_hash_value varchar2(20));' from dual;
select 'alter session set optimizer_features_enable=''&optimizer_feature'';' from dual;
select   'alter session set "_fix_control"='''||bugno||decode(value,1,':OFF''',':ON''')||'; '
||chr(10)||'@test.sql'||chr(10)||
'insert into bugfix_table values ('||bugno||',(select substr(plan_table_output,18,12) from (select * from table(dbms_xplan.display_cursor)) where plan_table_output  like ''Plan hash value:%''));'||chr(10)||
'alter session set "_fix_control"='''||bugno||decode(value,1,':ON''',':OFF''')||'; '
from v$system_fix_control
where optimizer_feature_enable='&optimizer_feature' and value in (1,0) order by bugno;
select 'set term on' from dual;
select 'set feedback on' from dual;
select 'set heading on' from dual;
select 'break on plan_hash_value' from dual;
select 'select plan_hash_value,bugfix from bugfix_table order by 1;' from dual;
select 'spool off' from dual;
spool off
set feedback on
set heading on
set term on

If you are sure your plan change behaviour also taking place when you explain the plan, better you run the builder below.

build_optimizer_bugfix_test_e.sql: This is same idea like above only difference is, this time it generates optimizer_features_test_e.sql which explains plans by runnint test_e.sql for every bugfix.
Instead of dbms_xplan.display_cursor I use dbms_xplan.display. Output of the optimizer_features_test_e.sql is same as optimizer_features_test.sql

source code of the builder

undefine optimizer_feature_version
set heading off
set feedback off
define optimizer_feature="&optimizer_feature_version"
set term off
spool optimizer_bugfix_test_e.sql
select 'set feedback off' from dual;
select 'set heading off' from dual;
select 'set term off' from dual;
select 'set echo off' from dual;
select 'spool optimizer_bugfix_test_results_e.log' from dual;
select 'drop table  bugfix_table purge;' from dual;
select 'create  table bugfix_gtt (bugfix number,plan_hash_value varchar2(20));' from dual;
select 'alter session set optimizer_features_enable=''&optimizer_feature'';' from dual;
select   'alter session set "_fix_control"='''||bugno||decode(value,1,':OFF''',':ON''')||'; '
||chr(10)||'@test_e.sql'||chr(10)||
'insert into bugfix_table values ('||bugno||',(select substr(plan_table_output,18,12) from (select * from table(dbms_xplan.display)) where plan_table_output  like ''Plan hash value:%''));'||chr(10)||
'alter session set "_fix_control"='''||bugno||decode(value,1,':ON''',':OFF''')||'; '
from v$system_fix_control
where optimizer_feature_enable='&optimizer_feature' and value in (1,0) order by bugno;
select 'set term on' from dual;
select 'set feedback on' from dual;
select 'set heading on' from dual;
select 'break on plan_hash_value' from dual;
select 'select plan_hash_value,bugfix from bugfix_table order by 1;' from dual;
select 'set feedback on' from dual;
select 'set heading on' from dual;
select 'spool off' from dual;
set feedback on
set heading on
set term on
spool off

Both of the queries drop and create the table that they insert plan and bugfix numbers. You may want to implement it different way.

Let me show you how they work in two different examples for two of the queries.

This example is for a sql which started to run slow after upgrade from 11.2.0.1 to 11.2.0.2.

first check if the plan is changed


SQL> @sqlhist 1m7wy721qytx3

   SNAP_ID BEGIN_INTERVAL_TIME            SQL_ID          CHILD_NO PLAN_HASH_VALUE
---------- ------------------------------ ------------- ---------- ---------------
     25077 19-JAN-11 04.00.56.483         1m7wy721qytx3          0      3561051786
     25078 19-JAN-11 04.31.00.422         1m7wy721qytx3          0      3561051786
     25269 23-JAN-11 04.00.47.349         1m7wy721qytx3          0      2266344041 =>upgrade
     25317 24-JAN-11 04.00.53.276         1m7wy721qytx3          0      2266344041
     25365 25-JAN-11 04.00.43.525         1m7wy721qytx3          0      2266344041
     25366 25-JAN-11 04.30.47.456         1m7wy721qytx3          0      2266344041
     25413 26-JAN-11 04.00.22.604         1m7wy721qytx3          0      2266344041
     25414 26-JAN-11 04.30.25.833         1m7wy721qytx3          0      2266344041
     25461 27-JAN-11 04.00.18.101         1m7wy721qytx3          0      2266344041
     25462 27-JAN-11 04.30.24.777         1m7wy721qytx3          0      2266344041
     25509 28-JAN-11 04.00.05.631         1m7wy721qytx3          0      2266344041
     25510 28-JAN-11 04.30.10.558         1m7wy721qytx3          0      2266344041
     25557 29-JAN-11 04.00.41.775         1m7wy721qytx3          0      2266344041
     25558 29-JAN-11 04.30.46.453         1m7wy721qytx3          0      2266344041

Plan hash value: 3561051786

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                 | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                  |                      |       |       |  1082K(100)|          |       |       |
|   1 |  MERGE                           | TABLE_XXXXXXXXXXX    |       |       |            |          |       |       |
|   2 |   VIEW                           |                      |       |       |            |          |       |       |
|   3 |    NESTED LOOPS OUTER            |                      |     1 |   286 |  1082K  (1)| 01:44:15 |       |       |
|   4 |     VIEW                         |                      |     1 |    78 |  1082K  (1)| 01:44:15 |       |       |
|   5 |      SORT GROUP BY               |                      |     1 |    75 |  1082K  (1)| 01:44:15 |       |       |
|   6 |       FILTER                     |                      |       |       |            |          |       |       |
|   7 |        NESTED LOOPS              |                      |   237 | 17775 |  1082K  (1)| 01:44:15 |       |       |
|   8 |         HASH JOIN                |                      |   237 | 15168 |  1081K  (1)| 01:44:12 |       |       |
|   9 |          INDEX RANGE SCAN        | INDEX_TABLE_3        |     1 |     9 |     2   (0)| 00:00:01 |       |       |
|  10 |          PARTITION RANGE ITERATOR|                      |  1948 |   104K|  1081K  (1)| 01:44:12 |   KEY |   KEY |
|  11 |           INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX    |  1948 |   104K|  1081K  (1)| 01:44:12 |   KEY |   KEY |
|  12 |         INDEX RANGE SCAN         | INDEX_TABLE_2        |     1 |    11 |     2   (0)| 00:00:01 |       |       |
|  13 |     TABLE ACCESS BY INDEX ROWID  | TABLE_XXXXXXXXXXX    |     1 |   208 |     2   (0)| 00:00:01 |       |       |
|  14 |      INDEX UNIQUE SCAN           | PK_TABLE_XXXXXXXXXXX |     1 |       |     1   (0)| 00:00:01 |       |       |
-------------------------------------------------------------------------------------------------------------------------

Plan hash value: 2266344041

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                 | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                  |                      |       |       |  1081K(100)|          |       |       |
|   1 |  MERGE                           | TABLE_XXXXXXXXXXX    |       |       |            |          |       |       |
|   2 |   VIEW                           |                      |       |       |            |          |       |       |
|   3 |    NESTED LOOPS OUTER            |                      |     1 |   282 |  1081K  (1)| 01:44:11 |       |       |
|   4 |     VIEW                         |                      |     1 |    78 |  1081K  (1)| 01:44:11 |       |       |
|   5 |      SORT GROUP BY               |                      |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|   6 |       FILTER                     |                      |       |       |            |          |       |       |
|   7 |        NESTED LOOPS              |                      |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|   8 |         NESTED LOOPS             |                      |     1 |    64 |  1081K  (1)| 01:44:11 |       |       |
|   9 |          INDEX RANGE SCAN        | INDEX_TABLE_3        |     1 |     9 |     2   (0)| 00:00:01 |       |       |
|  10 |          PARTITION RANGE ITERATOR|                      |     1 |    55 |  1081K  (1)| 01:44:11 |   KEY |   KEY |
|  11 |           INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX    |     1 |    55 |  1081K  (1)| 01:44:11 |   KEY |   KEY |
|  12 |         INDEX RANGE SCAN         | INDEX_TABLE_2        |     1 |    11 |     2   (0)| 00:00:01 |       |       |
|  13 |     TABLE ACCESS BY INDEX ROWID  | TABLE_XXXXXXXXXXX    |     1 |   204 |     2   (0)| 00:00:01 |       |       |
|  14 |      INDEX UNIQUE SCAN           | PK_TABLE_XXXXXXXXXXX |     1 |       |     1   (0)| 00:00:01 |       |       |
-------------------------------------------------------------------------------------------------------------------------

Check the 8th step on the plans which makes the diffrence. Oracle suddenly started to choose nested loop instead of hash and estimated rows also drop down to 1.

Lets build the test case

SQL> @build_bind_vars_h
Enter SQL ID ==> 1m7wy721qytx3
Enter Plan Hash Value ==> 2266344041

variable B1 NUMBER

begin

:B1 := 14544;

end;

/

MERGE INTO MXS.TABLE_XXXXXXXXXXX TP USING ( select /* test 1m7wy721qytx3 */ /*+gather_plan_statistics*/TABLE_1.ID,
TABLE_1.SOURCEID, NVL (MAX (CLOSETIMESTAMP), MAX (STARTTIMESTAMP)) LASTTIMESTAMP, MAX (CLOSE) KEEP (DENSE_RANK LAST ORDER BY
STARTTIMESTAMP) AS LAST, MAX (CLOSESPREAD) KEEP (DENSE_RANK LAST ORDER BY STARTTIMESTAMP) AS LASTSPREAD, MAX (CLOSEUPFRONT) KEEP (DENSE_RANK
LAST ORDER BY STARTTIMESTAMP) AS LASTUPFRONT FROM USR1.TABLE_1, USR1.TABLE_2, USR1.TABLE_3 WHERE TABLE_1.ID = TABLE_2.TABLE_2.D
AND TABLE_2.MARKETID =TABLE_3.MARKETID AND TABLE_2.MARKETID IN :B1 AND TABLE_1.SOURCEID =TABLE_3.SOURCEID AND DURATION =
NUMTODSINTERVAL (1, 'DAY') AND STARTTIMESTAMP < TRUNC (SYSDATE, 'YEAR') - 1 AND STARTTIMESTAMP > TRUNC (SYSDATE, 'YEAR') - 7 GROUP BY
TABLE_1.ID, TABLE_1.SOURCEID) TI ON (TP.TABLE_2.D = TI.TABLE_2.D AND TP.SOURCEID = TI.SOURCEID) WHEN MATCHED THEN UPDATE
SET TP.LASTYEARCLOSE = TI.LAST, TP.LASTYEARCLOSETIMESTAMP = TI.LASTTIMESTAMP, TP.LASTYEARCLOSESPREAD = TI.LASTSPREAD,
TP.LASTYEARCLOSEUPFRONT = TI.LASTUPFRONT WHEN NOT MATCHED THEN INSERT (TP.TABLE_2.D, TP.SOURCEID, TP.LASTYEARCLOSE,
TP.LASTYEARCLOSETIMESTAMP, TP.LASTYEARCLOSESPREAD, TP.LASTYEARCLOSEUPFRONT) VALUES (TI.TABLE_2.D, TI.SOURCEID, TI.LAST, TI.LASTTIMESTAMP,
TI.LASTSPREAD, TI.LASTUPFRONT);

Since this is a merge statement I needed to remove the merge part.

test.sql is like below

variable B1 NUMBER

begin

:B1 := 14544;

end;

/
select /* test 1m7wy721qytx3 */ /*+gather_plan_statistics*/TABLE_1.ID,
TABLE_1.SOURCEID, NVL (MAX (CLOSETIMESTAMP), MAX (STARTTIMESTAMP)) LASTTIMESTAMP, MAX (CLOSE) KEEP (DENSE_RANK LAST ORDER BY
STARTTIMESTAMP) AS LAST, MAX (CLOSESPREAD) KEEP (DENSE_RANK LAST ORDER BY STARTTIMESTAMP) AS LASTSPREAD, MAX (CLOSEUPFRONT) KEEP (DENSE_RANK
LAST ORDER BY STARTTIMESTAMP) AS LASTUPFRONT FROM USR1.TABLE_1, USR1.TABLE_2, USR1.TABLE_3 WHERE TABLE_1.ID = TABLE_2.TABLE_2.D
AND TABLE_2.MARKETID =TABLE_3.MARKETID AND TABLE_2.MARKETID IN :B1 AND TABLE_1.SOURCEID =TABLE_3.SOURCEID AND DURATION =
NUMTODSINTERVAL (1, 'DAY') AND STARTTIMESTAMP < TRUNC (SYSDATE, 'YEAR') - 1 AND STARTTIMESTAMP > TRUNC (SYSDATE, 'YEAR') - 7 GROUP BY
TABLE_1.ID, TABLE_1.SOURCEID

test_e.sql is like below


explain plan for select /* test 1m7wy721qytx3 */ /*+gather_plan_statistics*/TABLE_1.ID,
TABLE_1.SOURCEID, NVL (MAX (CLOSETIMESTAMP), MAX (STARTTIMESTAMP)) LASTTIMESTAMP, MAX (CLOSE) KEEP (DENSE_RANK LAST ORDER BY
STARTTIMESTAMP) AS LAST, MAX (CLOSESPREAD) KEEP (DENSE_RANK LAST ORDER BY STARTTIMESTAMP) AS LASTSPREAD, MAX (CLOSEUPFRONT) KEEP (DENSE_RANK
LAST ORDER BY STARTTIMESTAMP) AS LASTUPFRONT FROM USR1.TABLE_1, USR1.TABLE_2, USR1.TABLE_3 WHERE TABLE_1.ID = TABLE_2.TABLE_2.D
AND TABLE_2.MARKETID =TABLE_3.MARKETID AND TABLE_2.MARKETID IN 14544 AND TABLE_1.SOURCEID =TABLE_3.SOURCEID AND DURATION =
NUMTODSINTERVAL (1, 'DAY') AND STARTTIMESTAMP < TRUNC (SYSDATE, 'YEAR') - 1 AND STARTTIMESTAMP > TRUNC (SYSDATE, 'YEAR') - 7 GROUP BY
TABLE_1.ID, TABLE_1.SOURCEID;

build the optimizer test to see if its OFE related

SQL> @build_optimizer_features_test
Enter value for base_optimizer_version: 11.2.0.1

run the test for different OFE settings

SQL> @optimizer_features_test

SQL> alter session set optimizer_features_enable='11.2.0.1';

Session altered.

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  c5y47k1zmrb22, child number 0
-------------------------------------
-------------------------------------

Plan hash value: 2561977778

------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |      1 |        |    320K|00:21:18.17 |      25M|     16M|       |       |          |
|   1 |  SORT GROUP BY               |                   |      1 |      1 |    320K|00:21:18.17 |      25M|     16M|    73M|  6303K|   65M (0)|
|*  2 |   FILTER                     |                   |      1 |        |   1579K|01:37:13.89 |      25M|     16M|       |       |          |
|   3 |    NESTED LOOPS              |                   |      1 |      1 |   1579K|01:37:11.10 |      25M|     16M|       |       |          |
|   4 |     NESTED LOOPS             |                   |      1 |      1 |   5446K|00:15:20.03 |      16M|     16M|       |       |          |
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |       |       |          |
|   6 |      PARTITION RANGE ITERATOR|                   |      1 |      1 |   5446K|00:15:10.67 |      16M|     16M|       |       |          |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |      1 |      1 |   5446K|00:15:01.04 |      16M|     16M|       |       |          |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |   5446K|      1 |   1579K|00:05:19.24 |    9518K|  34458 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   5 - access("MARKET"."MARKETID"=:B1)
   7 - filter(("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7
              AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND "TABLE_2.NTERVAL"."TABLE_2.D">0 AND "TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID"))
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=:B1)

##########Second Run Cardinality feedback kicked still same plan with different cardinality on the last step

Plan hash value: 2561977778

------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |      1 |        |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|   1 |  SORT GROUP BY               |                   |      1 |      1 |      0 |00:00:00.01 |       0 |      0 |    73M|  6303K|   65M (0)|
|*  2 |   FILTER                     |                   |      1 |        |   1444K|00:22:15.31 |      24M|     15M|       |       |          |
|   3 |    NESTED LOOPS              |                   |      1 |      1 |   1444K|00:22:12.87 |      24M|     15M|       |       |          |
|   4 |     NESTED LOOPS             |                   |      1 |      1 |   4894K|00:10:27.91 |      15M|     15M|       |       |          |
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |       |       |          |
|   6 |      PARTITION RANGE ITERATOR|                   |      1 |      1 |   4894K|00:10:19.78 |      15M|     15M|       |       |          |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |      1 |      1 |   4894K|00:10:11.63 |      15M|     15M|       |       |          |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |   4894K|      1 |   1444K|00:00:53.00 |    8678K|   2859 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   5 - access("MARKET"."MARKETID"=:B1)
   7 - filter(("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7
              AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND "TABLE_2.NTERVAL"."TABLE_2.D">0 AND "TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID"))
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=:B1)

Note
-----
   - cardinality feedback used for this statement

SQL> alter session set optimizer_features_enable='11.2.0.1';

Session altered.

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  c5y47k1zmrb22, child number 2
-------------------------------------

Plan hash value: 2179327633

------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |      1 |        |    320K|00:21:24.30 |      25M|     16M|       |       |          |
|   1 |  SORT GROUP BY               |                   |      1 |      1 |    320K|00:21:24.30 |      25M|     16M|    73M|  6303K|   65M (0)|
|*  2 |   FILTER                     |                   |      1 |        |   1579K|00:20:19.75 |      25M|     16M|       |       |          |
|   3 |    NESTED LOOPS              |                   |      1 |    234 |   1579K|00:20:17.00 |      25M|     16M|       |       |          |
|*  4 |     HASH JOIN                |                   |      1 |    234 |   5446K|00:09:47.69 |      16M|     16M|  1452K|  1452K|  372K (0)|
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |       |       |          |
|   6 |      PARTITION RANGE ITERATOR|                   |      1 |   1948 |     20M|00:21:13.17 |      16M|     16M|       |       |          |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |      1 |   1948 |     20M|00:20:39.18 |      16M|     16M|       |       |          |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |   5446K|      1 |   1579K|00:00:51.19 |    9518K|   3216 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   4 - access("TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID")
   5 - access("MARKET"."MARKETID"=:B1)
   7 - filter(("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7
              AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND "TABLE_2.NTERVAL"."TABLE_2.D">0))
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=:B1)

##########Second Run Cardinality feedback kicked still same plan with different cardinality on the last step

Plan hash value: 2179327633

------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |      1 |        |    320K|00:17:51.32 |      25M|     16M|       |       |          |
|   1 |  SORT GROUP BY               |                   |      1 |      1 |    320K|00:17:51.32 |      25M|     16M|    73M|  6303K|   65M (0)|
|*  2 |   FILTER                     |                   |      1 |        |   1579K|00:19:42.92 |      25M|     16M|       |       |          |
|   3 |    NESTED LOOPS              |                   |      1 |    234 |   1579K|00:19:40.24 |      25M|     16M|       |       |          |
|*  4 |     HASH JOIN                |                   |      1 |    234 |   5446K|00:10:22.24 |      16M|     16M|  1452K|  1452K|  419K (0)|
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |       |       |          |
|   6 |      PARTITION RANGE ITERATOR|                   |      1 |   1948 |     20M|00:21:33.34 |      16M|     16M|       |       |          |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |      1 |   1948 |     20M|00:20:59.31 |      16M|     16M|       |       |          |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |   5446K|      1 |   1579K|00:00:54.86 |    9518K|   3161 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   4 - access("TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID")
   5 - access("MARKET"."MARKETID"=:B1)
   7 - filter(("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7
              AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND "TABLE_2.NTERVAL"."TABLE_2.D">0))
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=:B1)

Note
-----
   - cardinality feedback used for this statement

As you see cardinality feedback kicked two times with same plan generated with different cardinalities.
When you check the new 11.2.0.2 plan, when it uses nested loop cardinality suddenly drop to 1 which is a bit suspicios.

Runtime sql statistics

SQL> @sqlid c5y47k1zmrb22

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 000000075E5B37D0 00000009C5E1FEC8 2561977778  1081582          1          1          1        642         320375 633798.646 2034600.72 2034.60072   25789778   16298873          1               0
   1 000000075E5B37D0 00000009022773E0 2561977778  1081582          1          1          1        643         320375 607056.710 1424804.27 1424.80427   25136188   16034118          1               0
   2 000000075E5B37D0 0000000731957330 2179327633  1082314          1          1          1        642         320375 516728.448 1284256.36 1284.25636   25789785   16266663          1               0
   3 000000075E5B37D0 00000008735DB1A8 2179327633  1082314          1          1          1        642         320375 512717.052 1315325.55 1315.32555   25789675   16266877          1               0

Clearly 11.2.0.1 has got better response times. Because the runtime is too long better try our chance if the behaviour is same for explain plan with literals as well

Build the explain plan test case

SQL> @build_optimizer_features_test_e
Enter value for base_optimizer_version: 11.2.0.1

Lets run the test

SQL> @optimizer_features_test_e
SQL> alter session set optimizer_features_enable='11.2.0.2';

Session altered.

SQL> set echo off

Explained.

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2561977778

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|   1 |  SORT GROUP BY               |                   |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|*  2 |   FILTER                     |                   |       |       |            |          |       |       |
|   3 |    NESTED LOOPS              |                   |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|   4 |     NESTED LOOPS             |                   |     1 |    64 |  1081K  (1)| 01:44:11 |       |       |
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |     1 |     9 |     2   (0)| 00:00:01 |       |       |
|   6 |      PARTITION RANGE ITERATOR|                   |     1 |    55 |  1081K  (1)| 01:44:11 |   KEY |   KEY |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |     1 |    55 |  1081K  (1)| 01:44:11 |   KEY |   KEY |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |     1 |    11 |     2   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   5 - access("MARKET"."MARKETID"=14544)
   7 - filter("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND
              "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7 AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND
              "TABLE_2.NTERVAL"."TABLE_2.D">0 AND "TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID")
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=14544)

25 rows selected.

SQL> alter session set optimizer_features_enable='11.2.0.1';

Session altered.

SQL> set echo off

Explained.

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2179327633

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |     1 |    75 |  1082K  (1)| 01:44:15 |       |       |
|   1 |  SORT GROUP BY               |                   |     1 |    75 |  1082K  (1)| 01:44:15 |       |       |
|*  2 |   FILTER                     |                   |       |       |            |          |       |       |
|   3 |    NESTED LOOPS              |                   |   234 | 17550 |  1082K  (1)| 01:44:15 |       |       |
|*  4 |     HASH JOIN                |                   |   234 | 14976 |  1081K  (1)| 01:44:12 |       |       |
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |     1 |     9 |     2   (0)| 00:00:01 |       |       |
|   6 |      PARTITION RANGE ITERATOR|                   |  1948 |   104K|  1081K  (1)| 01:44:12 |   KEY |   KEY |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |  1948 |   104K|  1081K  (1)| 01:44:12 |   KEY |   KEY |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |     1 |    11 |     2   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   4 - access("TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID")
   5 - access("MARKET"."MARKETID"=14544)
   7 - filter("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND
              "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7 AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND
              "TABLE_2.NTERVAL"."TABLE_2.D">0)
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=14544)

26 rows selected.

Plans behaviour is same so there is no need for me to run it for every bugfix better I use explain plan testing

Its time to build the bugfix test

SQL> @build_optimizer_bugfix_test_e
Enter value for optimizer_feature_version: 11.2.0.2

Time to rock

SQL> @optimizer_bugfix_test_e		=>output is truncated and there is break on plan_hash_value

PLAN_HASH_VALUE          BUGFIX
-------------------- ----------
2179327633              7641601
2561977778              4926618
                        5982893
                        6086930
                        6236862
                        6408301
                        6472966
			.......
			.......
			.......

103 rows selected.

As you see bugfix 7641601 is the guilty guy for us and we found this info under 1 minute. Once we disable it it reverts back to the 11.2.0.1 plan

Lets see what this bug is. Bugfixg is a script where I check bugfix definition from v$system_fix_control

SQL> @bugfixg
Enter value for bugno: 7641601

     BUGNO      VALUE SQL_FEATURE               DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE       EVENT IS_DEFAULT
---------- ---------- ------------------------- ---------------------------------------------------------------- ------------------------- ---------- ----------
   7641601          1 QKSFM_CARDINALITY_7641601 treat a single value column as having a frequency histogram      11.2.0.2                           0          1

This one looks like a new very interesting bugfix which needs more research from Jonathan Lewis, I don’t feel that much comfortable to understand how it works :)

One more thing to prove how enabling disabling the bugfix works

SQL> alter session set "_fix_control"='7641601:OFF';

Session altered.

SQL> @test_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2179327633

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |     1 |    75 |  1082K  (1)| 01:44:15 |       |       |
|   1 |  SORT GROUP BY               |                   |     1 |    75 |  1082K  (1)| 01:44:15 |       |       |
|*  2 |   FILTER                     |                   |       |       |            |          |       |       |
|   3 |    NESTED LOOPS              |                   |   234 | 17550 |  1082K  (1)| 01:44:15 |       |       |
|*  4 |     HASH JOIN                |                   |   234 | 14976 |  1081K  (1)| 01:44:12 |       |       |
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |     1 |     9 |     2   (0)| 00:00:01 |       |       |
|   6 |      PARTITION RANGE ITERATOR|                   |  1948 |   104K|  1081K  (1)| 01:44:12 |   KEY |   KEY |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |  1948 |   104K|  1081K  (1)| 01:44:12 |   KEY |   KEY |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |     1 |    11 |     2   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   4 - access("TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID")
   5 - access("MARKET"."MARKETID"=14544)
   7 - filter("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND
              "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7 AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND
              "TABLE_2.NTERVAL"."TABLE_2.D">0)
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=14544)

26 rows selected.

SQL> alter session set "_fix_control"='7641601:ON';

Session altered.

SQL> @test_e

Explained.

SQL> @e

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2561977778

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|   1 |  SORT GROUP BY               |                   |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|*  2 |   FILTER                     |                   |       |       |            |          |       |       |
|   3 |    NESTED LOOPS              |                   |     1 |    75 |  1081K  (1)| 01:44:11 |       |       |
|   4 |     NESTED LOOPS             |                   |     1 |    64 |  1081K  (1)| 01:44:11 |       |       |
|*  5 |      INDEX RANGE SCAN        | INDEX_TABLE_3     |     1 |     9 |     2   (0)| 00:00:01 |       |       |
|   6 |      PARTITION RANGE ITERATOR|                   |     1 |    55 |  1081K  (1)| 01:44:11 |   KEY |   KEY |
|*  7 |       INDEX FAST FULL SCAN   | PK_TABLE_1_XXXXXX |     1 |    55 |  1081K  (1)| 01:44:11 |   KEY |   KEY |
|*  8 |     INDEX RANGE SCAN         | INDEX_TABLE_2     |     1 |    11 |     2   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC(SYSDATE@!,'fmyear')-1>TRUNC(SYSDATE@!,'fmyear')-7)
   5 - access("MARKET"."MARKETID"=14544)
   7 - filter("DURATION"=INTERVAL'+000000001 00:00:00.000000000' DAY(9) TO SECOND(9) AND
              "STARTTIMESTAMP">TRUNC(SYSDATE@!,'fmyear')-7 AND "STARTTIMESTAMP"<TRUNC(SYSDATE@!,'fmyear')-1 AND
              "TABLE_2.NTERVAL"."TABLE_2.D">0 AND "TABLE_2.NTERVAL"."SOURCEID"="MARKET"."SOURCEID")
   8 - access("TABLE_2.NTERVAL"."TABLE_2.D"="TABLE_2.."TABLE_2.D" AND "TABLE_2.."MARKETID"=14544)

25 rows selected.

I think research for this particular sql is now finished. We found out the problem. We have enough info to stabilize this sql. Rest is checking the effect of statistics with
dynamic sampling and also raising the issue with Oracle which I will also explain later.

I want to share another example with the bugfix I mentioned in this post.

I will skip the initial steps and paste only the output. The query normally runs under 3 seconds in worst condition and has got many binds so I would like to use
build_optimizer_bugfix_test instead of explain plan.

Different optimizer behavior

SQL> @optimizer_features_test
SQL> alter session set optimizer_features_enable='11.2.0.2';

Session altered.

SQL> set echo off
timing for: time_for_ofe_11.2.0.2
Elapsed: 00:00:03.01

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  64xn5ccrd1f9n, child number 0
-------------------------------------

Plan hash value: 2063817604

---------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                             |      1 |        |     50 |00:00:00.09 |    5308 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TABLE_1_XXXXXXXXXXXX        |      1 |     12 |     50 |00:00:00.09 |    5308 |
|*  2 |   INDEX SKIP SCAN           | IDX_TABLE_1_XXXXXXXXXXXX_05 |      1 |     12 |     50 |00:00:00.45 |    5272 |
---------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("TABLE_1_XXXXXXXXXXXX"."SOURCEID"=:N51)
       filter(("TABLE_1_XXXXXXXXXXXX"."SOURCEID"=:N51 AND
              INTERNAL_FUNCTION("TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID")))

32 rows selected.

Elapsed: 00:00:00.04
SQL> alter session set optimizer_features_enable='11.2.0.1';

Session altered.

Elapsed: 00:00:00.01
Session altered.

Elapsed: 00:00:00.01
SQL> set echo off
timing for: time_for_ofe_11.2.0.1
Elapsed: 00:00:00.11

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  64xn5ccrd1f9n, child number 1
-------------------------------------

Plan hash value: 1927123218

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                         |      1 |        |     50 |00:00:00.01 |     113 |
|   1 |  INLIST ITERATOR             |                         |      1 |        |     50 |00:00:00.01 |     113 |
|   2 |   TABLE ACCESS BY INDEX ROWID| TABLE_1_XXXXXXXXXXXX    |     50 |     12 |     50 |00:00:00.01 |     113 |
|*  3 |    INDEX UNIQUE SCAN         | PK_TABLE_1_XXXXXXXXXXXX |     50 |     12 |     50 |00:00:00.01 |      78 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access((("TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N1 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N2 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N3 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N4 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N5 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N6 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N7 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N8 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N9 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N10 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N11 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N12 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N13 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N14 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N15 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N16 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N17 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N18 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N19 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N20 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N21 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N22 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N23 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N24 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N25 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N26 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N27 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N28 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N29 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N30 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N31 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N32 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N33 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N34 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N35 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N36 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N37 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N38 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N39 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N40 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N41 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N42 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N43 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N44 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N45 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N46 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N47 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N48 OR
              "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N49 OR "TABLE_1_XXXXXXXXXXXX"."TABLE_1_XXXXXXXXXXXX_ID"=:N50)) AND
              "TABLE_1_XXXXXXXXXXXX"."SOURCEID"=:N51)

SQL> @build_optimizer_bugfix_test
Enter value for optimizer_feature_version: 11.2.0.2
SQL> @optimizer_bugfix_test

PLAN_HASH_VALUE          BUGFIX
-------------------- ----------
1927123218              8893626
2063817604              4926618
                        5982893
                        6086930
                        6236862
                        6408301
                        6472966
                        6754080
                        7215982
                        7277732
                        7284269
                        7641601
			.......
			.......
			.......
			.......

103 rows selected.

In under 5 minutes I found out which bugfix is causing the issue

SQL> @bugfixg
Enter value for bugno: 8893626

     BUGNO      VALUE SQL_FEATURE               DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE       EVENT IS_DEFAULT
---------- ---------- ------------------------- ---------------------------------------------------------------- ------------------------- ---------- ----------
   8893626          1 QKSFM_ACCESS_PATH_8893626 apply index filter selectivity during skip scan costing          11.2.0.2                           0          1

In summary it is not black magic to spot what is causing the plan behaviour change when it comes to bugfixes.
Once you find out you are ready to take control of your plans again. Once you have this valuable information you can get faster responses from Oracle Support on your issue as well.
You will also look cool to your developers when

You can ask, what if my plan is not changing only on 1 bugfix? In that case I would check for the plan I am looking for
and if it exists I still assume thats my bugfix if it does not exist I will follow the upcoming steps.

I will post one more example for the issue I explained above when very old bugfixes suddenly appear on new version.

References Used:
Init.ora Parameter “_FIX_CONTROL” [Hidden] Reference Note [ID 567171.1];

Scripts Used:
All in one zip file

build_optimizer_bugfix_test.sql
build_optimizer_bugfix_test_e.sql
bugfixg.sql

<-Previous Post in Series ——————————-   Next Post in Series–>

January 31, 2011

Plan stability through Upgrade to 11G- Building a test case

Filed under: Basics, Plan Stability — coskan @ 6:16 pm

This will be the first post of plan stability through 11G upgrade series.
I initially planned to write about bugfixes as second post but change my mind and add some common steps to find the sqls I need to investigate and build a test case for.
The scripts I use here are mostly modified versions of Kerry Osborne and Tanel Poder, so just to avoid taking time of some of you who already knows this stuff, the only interesting part can be a bit further down in the post. I also did not add sqls problematic sqls yet, I kept them for related posts :)

Before I start, I have some assumptions for this and upcoming posts
I am assuming that you took the actions below on the test system as a DBA who knows what he is doing.

1- you cloned/snapped/duplicated the database from production so data is same
2- if you have a cutdown version of prod then you copied all the stats from production (both system and object stats)
3- you did not overwrite the statistics by gathering new stats after step 1 or 2
4- you have the same parameter settings apart from the version related ones.
5- You have optimizer_features_enable parameter set to the version you upgraded
(this is very important and I got my own invalid DBA exception and set this parameter at system level during my own test and caused tester test on the wrong optimizer environment)

You took the actions above, started the database and testing began then you start to get calls from testing/development groups that some of the tasks are running slower than SLAs for that spesific task
Since you are on testing phase you have time to sort the problems and understand what is the cause of the problem.

First things first, we will start by checking what is going on in terms of plans. Here is how “I” do it. For ease of read I did not put content of the scripts. They are all at the bottom of the post

1-Find/get the sqlid or sqltext for the task from developers.

1a-If developer provide sqlid then I call sqlid.sql- modified version of sqlid from Tanel Poder. This script gives the statistics of cursors in memory.

SQL> @sqlid cp7cg4db39a8z

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
cp7cg4db39a8z 1446291743 select from USER.TEST where ID =:1

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 0000000A16BA1338 000000072A73FE20 1026121647        3     117859          3     117858          0         117858    4223100 6820106.35  .57867148  279018595     408403          0               1

1b- If you got the sqltext but not sqlid but the process is reported as still running check with swact.sql script which is calling modified version of Tanel Poders sw.sql for active foreground processes.

Run the sqlid.sql script for the active sql_ids and double check the sqltext

SQL> @swact

    SID USERN STATE   EVENT              SEQ# SEC_IN_STATE         P1         P2         P3 P1TRANSL   SQL_ID         CHN BLCKNG_SID BLCKNG_SID_S BLCKNG_INST
------- ----- ------- ----------------------- ------------ ---------- ---------- ---------- ---------- ------------- ---- ---------- ------------ -----------
    401       WAITING jobq slave wait      32            0          0          0          0                             0            NO HOLDER
    553       WAITING jobq slave wait      32            0          0          0          0                             0            NO HOLDER
    402 COSKA WORKING On CPU / runqueue   145            0 1413697536          1          0            f5cqythck1gfz    0            NOT IN WAIT
    481 COSKA WORKING On CPU / runqueue   271            2          1      51202          1            fuws9wt9d16fq    0            NOT IN WAIT

SQL> @sqlid fuws9wt9d16fq

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
fuws9wt9d16fq 1389402582 select count(*) from dba_source,dba_objects

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 00000002A9B90758 00000002A9C55DF8 2265727165 99999999          1          1          1          0              0  18539.182  20075.237  20.075237      32172       3619          1               1

1c- If session is not active and you could not get the sqlid from developer and start searching for the sqlid in v$sqlarea or dba_hist_sqltext

For this I usually use find_sql.sql from Kerry Osborne.

I needed to edit it to just find the sqlid of the given search key to avoid massive screen output for sqls with multiple child cursors


COSKANDBA@WHQAINT> @find_sql
Enter value for sql_text: select count(*) from v$parameter
Enter value for sql_id: %

SQL_ID         CHILD  PLAN_HASH      EXECS         ETIME     AVG_ETIME SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------- -----------------------------------------
89f82uh6x612p      0  322934900          2           .01           .00 select count(*) from v$parameter where na
                                                                       me = 'cpu_count' and   (isdefault = 'FALS
                                                                       E' or ismodified != 'FALSE')
COSKANDBA@WHQAINT> @find_sqlid
Enter value for sql_text: select count(*) from v$parameter

SQL_ID        SQL_TEXT
------------- ----------------------------------------------------------------------------------------------------
89f82uh6x612p select count(*) from v$parameter where name = 'cpu_count' and   (isdefault = 'FALSE' or ismodified !
              = 'FALSE')

If the sql is not active then I use find_sqlid_h which check dba_hist_sqltext
for historic check which can be dramatically slow if your AWR retention is high or number of unique sqls

2-Check the plan changes for this spesific sql

2a-If you have diagnostics pack license then you can use awr to check the historic plans

I use two scripts for getting the plan changes from awr.

awr_plan_changes.sql from Kerry Osborne which uses dba_hist_sqlstats for given sql_id. The good part of this script is that you can see the average elapsed time and
avg logical io. Take average into consideration but know that they are also dangerous and misleading

SQL> @awr_plan_change
Enter value for sql_id: 2h5nub7zqkj5q

   SNAP_ID   NODE BEGIN_INTERVAL_TIME            SQL_ID        PLAN_HASH_VALUE        EXECS    AVG_ETIME        AVG_LIO
---------- ------ ------------------------------ ------------- --------------- ------------ ------------ --------------
     24906      1 15-JAN-11 14.30.49.673         2h5nub7zqkj5q      3435162931            3       60.352       75,163.0
     24936      1 16-JAN-11 05.30.36.765         2h5nub7zqkj5q                            3      135.858       25,870.0
     24954      1 16-JAN-11 14.30.26.315         2h5nub7zqkj5q      3125217954            3       20.494       74,416.7
     25246      1 22-JAN-11 16.30.05.775         2h5nub7zqkj5q      1903048315            4        4.289       73,862.0
     25273      1 23-JAN-11 06.00.55.152         2h5nub7zqkj5q                            3       77.159      160,780.7
     25280      1 23-JAN-11 09.30.03.439         2h5nub7zqkj5q                            3       23.366       73,862.0
     25283      1 23-JAN-11 11.00.06.489         2h5nub7zqkj5q                            3        5.612       73,862.0
     25292      1 23-JAN-11 15.30.15.880         2h5nub7zqkj5q                            1       11.525       73,862.0
     25321      1 24-JAN-11 06.00.02.214         2h5nub7zqkj5q                            7       11.784       73,862.0
     25328      1 24-JAN-11 09.30.10.651         2h5nub7zqkj5q                            3       24.208       73,869.0
     25417      1 26-JAN-11 06.00.31.983         2h5nub7zqkj5q                            7       17.478       74,686.0
     25475      1 27-JAN-11 11.00.55.155         2h5nub7zqkj5q                            3       25.786       74,845.3
     25523      1 28-JAN-11 11.00.35.584         2h5nub7zqkj5q                            3       24.190       74,947.0
     25561      1 29-JAN-11 06.00.55.227         2h5nub7zqkj5q                            7       99.244       74,902.0
     25568      1 29-JAN-11 09.30.08.835         2h5nub7zqkj5q                            3       18.939       74,914.3
     25571      1 29-JAN-11 11.00.12.085         2h5nub7zqkj5q                            3       26.258       74,922.7

16 rows selected.

SQL>

sqlhist.sql which check uses dba_hist_active_sess_history for given sql_id. This does not have sql statistics for the plan_hash_value but it gives information for the sqls
which are not captured for dba_hist_sqlstats.


SQL> @sqlhist 2h5nub7zqkj5q

   SNAP_ID BEGIN_INTERVAL_TIME            SQL_ID          CHILD_NO PLAN_HASH_VALUE
---------- ------------------------------ ------------- ---------- ---------------
     24891 15-JAN-11 06.00.14.552         2h5nub7zqkj5q          0      3125217954   ****
     24906 15-JAN-11 14.30.49.673         2h5nub7zqkj5q          0      3435162931
     24937 16-JAN-11 06.00.38.585         2h5nub7zqkj5q          0
     24954 16-JAN-11 14.30.26.315         2h5nub7zqkj5q          0      3125217954
     25003 17-JAN-11 15.00.01.148         2h5nub7zqkj5q          0      4093460156
     25051 18-JAN-11 15.00.18.035         2h5nub7zqkj5q          1
     25099 19-JAN-11 15.00.44.108         2h5nub7zqkj5q          0
     25147 20-JAN-11 15.00.53.100         2h5nub7zqkj5q          0
     25225 22-JAN-11 06.00.09.902         2h5nub7zqkj5q          0
     25227 22-JAN-11 07.00.14.473         2h5nub7zqkj5q          0
     25230 22-JAN-11 08.30.19.472         2h5nub7zqkj5q          0
     25231 22-JAN-11 09.00.20.496         2h5nub7zqkj5q          0      1903048315
     25244 22-JAN-11 15.30.03.874         2h5nub7zqkj5q          0
     25246 22-JAN-11 16.30.05.775         2h5nub7zqkj5q          0
     25273 23-JAN-11 06.00.55.152         2h5nub7zqkj5q          0
     25280 23-JAN-11 09.30.03.439         2h5nub7zqkj5q          0
     25283 23-JAN-11 11.00.06.489         2h5nub7zqkj5q          0
     25292 23-JAN-11 15.30.15.880         2h5nub7zqkj5q          0
     25321 24-JAN-11 06.00.02.214         2h5nub7zqkj5q          0
     25328 24-JAN-11 09.30.10.651         2h5nub7zqkj5q          0
     25331 24-JAN-11 11.00.14.273         2h5nub7zqkj5q          0
     25369 25-JAN-11 06.00.56.648         2h5nub7zqkj5q          0
     25376 25-JAN-11 09.30.12.066         2h5nub7zqkj5q          0
     25379 25-JAN-11 11.00.15.482         2h5nub7zqkj5q          0
     25388 25-JAN-11 15.30.32.502         2h5nub7zqkj5q          0
     25417 26-JAN-11 06.00.31.983         2h5nub7zqkj5q          0
     25427 26-JAN-11 11.00.47.332         2h5nub7zqkj5q          0
     25436 26-JAN-11 15.30.07.648         2h5nub7zqkj5q          0
     25465 27-JAN-11 06.00.35.469         2h5nub7zqkj5q          0
     25472 27-JAN-11 09.30.51.194         2h5nub7zqkj5q          0
     25475 27-JAN-11 11.00.55.155         2h5nub7zqkj5q          0
     25484 27-JAN-11 15.30.09.941         2h5nub7zqkj5q          0
     25520 28-JAN-11 09.30.31.090         2h5nub7zqkj5q          0
     25523 28-JAN-11 11.00.35.584         2h5nub7zqkj5q          0
     25532 28-JAN-11 15.30.49.332         2h5nub7zqkj5q          0
     25561 29-JAN-11 06.00.55.227         2h5nub7zqkj5q          0
     25568 29-JAN-11 09.30.08.835         2h5nub7zqkj5q          0
     25571 29-JAN-11 11.00.12.085         2h5nub7zqkj5q          0

38 rows selected.

SQL>

When you look carefully you will see why I needed to check dba_hist_active_sess_stats with another script. The first line is not in awr_plan_change which can be misleading!!!!.

2b-If you have statspack configured to be taken at “Level 6″

then you can write scripts on STATS$SQL_PLAN_USAGE table in the similar manner. The difference is that since statspack does not store sql_id you need to use hash_value for instead of sql_id and in addition to this after 10G
You need to use V$SQL.OLD_HASH_VALUE for the given SQL_ID. I may write a version and put here but not this time because I don’t have it configured:(

3- You found out that your plan is changed after upgrade or you could not find that information due to lack of awr data or statspack but you suspect that it is changed. Now is the time to build the test case.

3a-Build the test case :

To create the test case I choose one of the options below
1- Manual: I use this option when I don’t have sqlid but sqltext to generate a case, which is painful if it has binds. Better ask the case from developer

2- build_bind_vars script from Kerry Osborne: Once you have/find the sqlid then you have a perfect option from Kerry Osborne which “nearly” does everything for you
he has two scripts called build_bind_vars and build_bind_vars2 one is using v$sql_bind_capture for the binded value the other one is using other_xml column for peeked binds from v$sql_plan table . Downside of build_bind_vars is that it uses display_raw function of Greg Rahn which means creating a function in the DB because this function is not there by default

SQL> @sqlid 6asfnp4v15rg3

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
6asfnp4v15rg3  907206115  SELECT DISTINCT TABLE_1.ID FROM USER_1.TABLE_1 TABLE_1, USER_1.TABLE_2
                         TABLE_2 WHERE TABLE_1.ID = TABLE_2.ID AND TABLE_2.SOURCEID = :1 AND TABLE_1.MARKETID = :2 AND TABLE_1.TYPEID
                         = :3

Elapsed: 00:00:00.01

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   1 000000082DF6FB88 00000009498BCE50 2222075559        6          2         12          1        130           1295     16.997     69.267    .069267       1562         13          0               0
   4 000000082DF6FB88 000000084D6B6808 2222075559       15          1         10          1        995           9946     55.993    1168.99    1.16899       6497         63          0               0
   6 000000082DF6FB88 00000009F2DB4B98 2222075559        6          2          7          1        504           5039     22.996     188.12     .18812       3877         34          0               0
   7 000000082DF6FB88 00000006D3B821A0 2222075559        7          1         10          1          9             86      6.999     15.825    .015825        151          2          0               0
   8 000000082DF6FB88 00000006EC670138 2222075559       39        144         11        105      33458         334050   1797.724    7692.61 .073262952     262247        999          0               0
  18 000000082DF6FB88 00000008AED02378 1994644346      176        768          3        804     233085        2326441  12999.013 119810.985 .149018638    2624167      12429          0               0

6 rows selected.

Elapsed: 00:00:00.01
SQL> @e5
Enter value for sql_id: 6asfnp4v15rg3
Enter value for child_number: 18

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------
 SELECT
DISTINCT TABLE_1.ID FROM USER_1.TABLE_1 TABLE_1, USER_1.TABLE_2
TABLE_2 WHERE TABLE_1.ID = TABLE_2.ID AND
TABLE_2.SOURCEID = :1 AND TABLE_1.MARKETID = :2 AND
TABLE_1.TYPEID = :3

Plan hash value: 1994644346

---------------------------------------------------
| Id  | Operation           | Name                |
---------------------------------------------------
|   0 | SELECT STATEMENT    |                     |
|   1 |  SORT UNIQUE NOSORT |                     |
|   2 |   NESTED LOOPS      |                     |
|   3 |    INDEX RANGE SCAN | IDX_TABLE1_09       |
|   4 |    INDEX UNIQUE SCAN| PK_TABLE_2 |
---------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------

   1 - :1 (NUMBER): 66
   2 - :2 (NUMBER): 1281
   3 - :3 (NUMBER): 125

27 rows selected.

Elapsed: 00:00:00.06

SQL> l
  1* SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR(('&sql_id'),&child_number,'BASIC +PEEKED_BINDS'))

SQL> @build_bind_vars
Enter SQL ID ==> 6asfnp4v15rg3
Enter Child Number ==> 18

variable N1 NUMBER
variable N2 NUMBER
variable N3 NUMBER

begin

:N1 := 66;
:N2 := 1281;
:N3 := 125;

end;

/

select /* test_6asfnp4v15rg3 */ /*+ gather_plan_statistics */ DISTINCT TABLE_1.ID
FROM USER_1.TABLE_1 TABLE_1, USER_1.TABLE_2 TABLE_2
WHERE TABLE_1.ID = TABLE_2.ID
AND TABLE_2.SOURCEID = :N1
AND TABLE_1.MARKETID = :N2
AND TABLE_1.TYPEID = :N3;

As you see build_bind_vars uses actual peeked binds however build_bind_vars2 uses v$sql_bind_capture which maybe misleading due to the bind peeking.

SQL> @build_bind_vars2
Enter SQL ID ==> 6asfnp4v15rg3
Enter Child Number ==> 18

variable N1 NUMBER
variable N2 NUMBER
variable N3 NUMBER

begin

:N1 := 66;
:N2 := 1661;
:N3 := 1;

end;

/

select /* test_6asfnp4v15rg3 */ /*+ gather_plan_statistics */ DISTINCT TABLE_1.ID
FROM USER_1.TABLE_1 TABLE_1, USER_1.TABLE_2 TABLE_2
WHERE TABLE_1.ID = TABLE_2.ID
AND TABLE_2.SOURCEID = :N1
AND TABLE_1.MARKETID = :N2
AND TABLE_1.TYPEID = :N3;

Challenge is that these two script use active cursors in the shared pool. what if you need something which is not in the cache but in history.
To sort this problem I also add one more script called build_bind_vars_h to build test case using dba_hist_* tables and again other_xml column of those tables for peeked binds.
For the example below I will use build_bind_vars_h script because our query is not in memory.


SQL> @sqlid 1m7wy721qytx3

no rows selected

no rows selected

SQL> @sqlhist 1m7wy721qytx3

   SNAP_ID BEGIN_INTERVAL_TIME            SQL_ID          CHILD_NO PLAN_HASH_VALUE
---------- ------------------------------ ------------- ---------- ---------------
     24981 17-JAN-11 04.00.28.416         1m7wy721qytx3          0      2266344041
     24982 17-JAN-11 04.30.32.884         1m7wy721qytx3          0      2266344041
     24987 17-JAN-11 07.00.03.043         1m7wy721qytx3          0      2266344041
     24988 17-JAN-11 07.30.10.114         1m7wy721qytx3          0      2266344041
     25029 18-JAN-11 04.00.07.227         1m7wy721qytx3          0      2266344041
     25030 18-JAN-11 04.30.10.894         1m7wy721qytx3          0      2266344041
     25077 19-JAN-11 04.00.56.483         1m7wy721qytx3          0      3561051786 =>after the upgrade
     25078 19-JAN-11 04.31.00.422         1m7wy721qytx3          0      3561051786

8 rows selected.

SQL> @build_bind_vars_h
Enter SQL ID ==> 1m7wy721qytx3
Enter Plan Hash Value ==> 2266344041

variable B1 NUMBER

begin

:B1 := 14544;

end;

/

select /* test 1m7wy721qytx3 */ /*+gather_plan_statistics*/TABLE_1.TICKERID,
TABLE_1.SOURCEID, NVL (MAX (CLOSETIMESTAMP), MAX (STARTTIMESTAMP)) LASTTIMESTAMP, MAX (CLOSE) KEEP (DENSE_RANK LAST ORDER BY
STARTTIMESTAMP) AS LAST, MAX (CLOSESPREAD) KEEP (DENSE_RANK LAST ORDER BY STARTTIMESTAMP) AS LASTSPREAD, MAX (CLOSEUPFRONT) KEEP (DENSE_RANK
LAST ORDER BY STARTTIMESTAMP) AS LASTUPFRONT FROM USER_1.TABLE_1, USER_1.TABLE_2, USER_1.TABLE_3 WHERE TABLE_1.TICKERID = TABLE_2.TICKERID
AND TABLE_2.MARKETID = TABLE_3.MARKETID AND TABLE_2.MARKETID IN :B1 AND TABLE_1.SOURCEID = TABLE_3.SOURCEID AND DURATION =
NUMTODSINTERVAL (1, 'DAY') AND STARTTIMESTAMP < TRUNC (SYSDATE, 'YEAR') - 1 AND STARTTIMESTAMP > TRUNC (SYSDATE, 'YEAR') - 7 GROUP BY
TABLE_1.TICKERID, TABLE_1.SOURCEID;

Once I have the test content available , I copy the test case content two sql files to ease and generalize my tests.

test.sql sql file with /*+gather_plan_statistics*//*identifier*/ hints+comments added so I can gather the runtime statistics and easily find my sql
Sample test.sql

variable N1 NUMBER

begin
:N1 :=  100;
end;
/

select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/*
from T1,T2 where t1.id=t2.id and t1.id&lt;=:N1;

test_e.sql same as test but this time I replace binds with literals and run explain plan for the sql. I put literals because runtime plans can be
different than explain plans for sqls with bind variables.

Sample test_e.sql

explain plan for select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/*
from T1,T2 where t1.id=t2.id and t1.id&lt;=100;

4-Testing the test case with different optimizer_features_enable (OFE) settings

There can be couple of major minor releases between the release you are upgrading from and the release you are upgrading to.
For example If you want to upgrade from 10.2.0.4 to 11.2.0.2 and your plans are changed, it may not be because something is changed on 11.2.0.2 because there are 4 base releases between these two which are 10.2.0.5, 11.1.0.6, 11.2.0.7 and 11.2.0.1. Your plan characteristics may actually changed on one of these releases.

To understand which release my problems actually start I need to test the sqls for all the releases, to automate this process
I wrote two simple scripst like below

build_optimizer_features_test.sql : this script accepts base release as parameter and builds a test environment for all releases by generating another sql called
optimizer_features_test.sql which is just a spool file which sets OFE and runs test.sql (the one you created on step 3b) for each OFE from base and till last value
in v$system_fix_control.optimizer_feature_enable column.

Source of the build_optimizer_features_test.sql

undefine base_optimizer_version
set heading off
set feedback off
define optimizer_feature="&amp;base_optimizer_version"
set term off
spool optimizer_features_test.sql
select 'set timing on' from dual;
select 'set echo off' from dual;
--select 'set autotrace traceonly  statistics' from dual;
select 'spool optimizer_features_test_results.log' from dual;
select
'timing start time_for_ofe_'||optimizer_feature_enable||chr(10)||'set echo on'||chr(10)||'alter session set optimizer_features_enable='''||optimizer_feature_enable||''';'
||chr(10)||'set echo off'||chr(10)||'set term off'||chr(10)
||'@test.sql'||chr(10)||'/'||chr(10)||'set term on'
||chr(10)||'timing stop  time_for_ofe_'||optimizer_feature_enable||chr(10)
||chr(10)||'select * from table(dbms_xplan.display_cursor(null,null,''ALLSTATS LAST''));'||chr(10)
from
(select distinct optimizer_feature_enable from v$system_fix_control
where
to_number(replace(optimizer_feature_enable,'.',''))&gt;=to_number(replace('&amp;optimizer_feature','.',''))
order by to_number(replace(optimizer_feature_enable,'.','')) desc );
select 'set echo off' from dual;
select 'set feedback on' from dual;
select 'set heading on' from dual;
select 'set timing off' from dual;
select 'spool off' from dual;
--select 'set autotrace off' from dual;
spool off
set term on
set heading on
set feedback on

and output from this sql is optimizer_features_test.sql which is like below for base version is 11.1.0.7 and database is running with 11.2.0.1 binary

set timing on

set echo off

spool optimizer_features_test_results.log

timing start time_for_ofe_11.2.0.1
set echo on
alter session set optimizer_features_enable='11.2.0.1';
set echo off
set term off
@test.sql
/
set term on
timing stop  time_for_ofe_11.2.0.1

select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

timing start time_for_ofe_11.1.0.7
set echo on
alter session set optimizer_features_enable='11.1.0.7';
set echo off
set term off
@test.sql
/
set term on
timing stop  time_for_ofe_11.1.0.7

select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

set timing off

spool off

For this example I want to see if my plan is changing behaviour when I upgrade from 10.2.0.4 to 11.2.0.1


SQL> ------------------------------
SQL> ----- BUILD THE TEST CASE ----
SQL> ------------------------------
SQL> @build_optimizer_features_test
Enter value for base_optimizer_version: 10.2.0.4
SQL> ------------------------------
SQL> ----- TEST THE TEST CASE -----
SQL> ------------------------------
SQL> @optimizer_features_test
SQL> alter session set optimizer_features_enable='11.2.0.1';

Session altered.

Elapsed: 00:00:00.02
SQL> set echo off
timing for: time_for_ofe_11.2.0.1
Elapsed: 00:00:00.03

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  1hs7jn4tshzqa, child number 0
-------------------------------------
select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/* from T1,T2
where t1.id=t2.id and t1.id<=:N1

Plan hash value: 1143140472

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |      1 |        |    100 |00:00:00.01 |      10 |       |       |          |
|*  1 |  HASH JOIN                   |        |      1 |     99 |    100 |00:00:00.01 |      10 |   751K|   751K| 1268K (0)|
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |      1 |    100 |    100 |00:00:00.01 |       4 |       |       |          |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |      1 |    100 |    100 |00:00:00.01 |       2 |       |       |          |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |      1 |    100 |    100 |00:00:00.01 |       6 |       |       |          |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |      1 |    100 |    100 |00:00:00.01 |       3 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=:N1)
   5 - access("T2"."ID"<=:N1) 25 rows selected. Elapsed: 00:00:00.02 SQL> alter session set optimizer_features_enable='11.1.0.7';

Session altered.

Elapsed: 00:00:00.01
SQL> set echo off
timing for: time_for_ofe_11.1.0.7
Elapsed: 00:00:00.02

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  1hs7jn4tshzqa, child number 1
-------------------------------------
select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/* from T1,T2
where t1.id=t2.id and t1.id<=:N1

Plan hash value: 1143140472

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |      1 |        |    100 |00:00:00.01 |      10 |       |       |          |
|*  1 |  HASH JOIN                   |        |      1 |     99 |    100 |00:00:00.01 |      10 |   751K|   751K| 1254K (0)|
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |      1 |    100 |    100 |00:00:00.01 |       4 |       |       |          |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |      1 |    100 |    100 |00:00:00.01 |       2 |       |       |          |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |      1 |    100 |    100 |00:00:00.01 |       6 |       |       |          |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |      1 |    100 |    100 |00:00:00.01 |       3 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=:N1)
   5 - access("T2"."ID"<=:N1) 25 rows selected. Elapsed: 00:00:00.01 SQL> alter session set optimizer_features_enable='11.1.0.6';

Session altered.

Elapsed: 00:00:00.01
SQL> set echo off
timing for: time_for_ofe_11.1.0.6
Elapsed: 00:00:00.02

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  1hs7jn4tshzqa, child number 2
-------------------------------------
select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/* from T1,T2
where t1.id=t2.id and t1.id<=:N1

Plan hash value: 1143140472

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |      1 |        |    100 |00:00:00.01 |      10 |       |       |          |
|*  1 |  HASH JOIN                   |        |      1 |     99 |    100 |00:00:00.01 |      10 |   751K|   751K| 1258K (0)|
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |      1 |    100 |    100 |00:00:00.01 |       4 |       |       |          |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |      1 |    100 |    100 |00:00:00.01 |       2 |       |       |          |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |      1 |    100 |    100 |00:00:00.01 |       6 |       |       |          |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |      1 |    100 |    100 |00:00:00.01 |       3 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=:N1)
   5 - access("T2"."ID"<=:N1) 25 rows selected. Elapsed: 00:00:00.02 SQL> alter session set optimizer_features_enable='10.2.0.5';

Session altered.

Elapsed: 00:00:00.02
SQL> set echo off
timing for: time_for_ofe_10.2.0.5
Elapsed: 00:00:00.03

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  1hs7jn4tshzqa, child number 3
-------------------------------------
select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/* from T1,T2
where t1.id=t2.id and t1.id<=:N1

Plan hash value: 1143140472

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |      1 |        |    100 |00:00:00.01 |      10 |       |       |          |
|*  1 |  HASH JOIN                   |        |      1 |     99 |    100 |00:00:00.01 |      10 |   751K|   751K| 1250K (0)|
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |      1 |    100 |    100 |00:00:00.01 |       4 |       |       |          |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |      1 |    100 |    100 |00:00:00.01 |       2 |       |       |          |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |      1 |    100 |    100 |00:00:00.01 |       6 |       |       |          |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |      1 |    100 |    100 |00:00:00.01 |       3 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=:N1)
   5 - access("T2"."ID"<=:N1) 25 rows selected. Elapsed: 00:00:00.01 SQL> alter session set optimizer_features_enable='10.2.0.4';

Session altered.

Elapsed: 00:00:00.02
SQL> set echo off
timing for: time_for_ofe_10.2.0.4
Elapsed: 00:00:00.03

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  1hs7jn4tshzqa, child number 4
-------------------------------------
select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/* from T1,T2
where t1.id=t2.id and t1.id<=:N1

Plan hash value: 1143140472

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |      1 |        |    100 |00:00:00.01 |      10 |       |       |          |
|*  1 |  HASH JOIN                   |        |      1 |     99 |    100 |00:00:00.01 |      10 |   751K|   751K| 1252K (0)|
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |      1 |    100 |    100 |00:00:00.01 |       4 |       |       |          |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |      1 |    100 |    100 |00:00:00.01 |       2 |       |       |          |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |      1 |    100 |    100 |00:00:00.01 |       6 |       |       |          |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |      1 |    100 |    100 |00:00:00.01 |       3 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=:N1)
   5 - access("T2"."ID"<=:N1) 25 rows selected. Elapsed: 00:00:00.02 SQL> @sqlid 1hs7jn4tshzqa

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
1hs7jn4tshzqa  864583370 select /*+gather_plan_statistics*/ /*test_0jvvdaw4kxpqw*/* from T1,T2 where t1.id=t2.id and t1.id<=:N1 1 row selected.  CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING ---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------    0 00000000926F6F90 00000000926F6B90 1143140472        9          2          5          2          4            200         10      3.435   .0017175         20          0          0               0    1 00000000926F6F90 00000000926F3E90 1143140472        9          2          5          2          4            200          0      3.278    .001639         20          0          0               0    2 00000000926F6F90 0000000094936D28 1143140472        9          2          5          2          4            200         10      3.931   .0019655         20          0          0               0    3 00000000926F6F90 00000000926F1EF0 1143140472        9          2          5          2          4            200          0      3.644    .001822         20          0          0               0    4 00000000926F6F90 00000000926C3148 1143140472        9          2          5          2          4            200          0       2.61    .001305         20          0          0               0 5 rows selected. SQL>

With this test you call test.sql for 2 times for each available OFE value since the given base OFE. There are 5 different cursors for 5 different CBO environment variable and
they all used the same plan. By calling 2 times you see the caching effect and also on 11.2.0.1 and onward you can see the cardinality feedback effect if it kicks in (we will see about that later)

build_optimizer_features_test_e.sql : The only difference is that this scripts calls test_e.sql (the one you create on step 3b) and runs dbms_xplan.display
for the explained plan. I use this one when the sql runs very long time, even I still test with the runtime plan generation
explaining the plan sometimes gives me the rough idea on which OFE my plan may change

sourcecode of the build_optimizer_features_test_e is below

undefine base_optimizer_version
set heading off
set feedback off
define optimizer_feature="&amp;base_optimizer_version"
set term off
spool optimizer_features_test_e.sql
select 'set timing on' from dual;
select 'set echo off' from dual;
select 'spool optimizer_features_test_2_results.log' from dual;
select   'set echo on'||chr(10)||'alter session set optimizer_features_enable='''||optimizer_feature_enable||''';'
||chr(10)||'set echo off'||chr(10)||'@test_e.sql'||chr(10)||
'select * from table(dbms_xplan.display);'||chr(10)
from
(select distinct optimizer_feature_enable from v$system_fix_control
where
to_number(replace(optimizer_feature_enable,'.',''))&gt;=to_number(replace('&amp;optimizer_feature','.',''))
order by to_number(replace(optimizer_feature_enable,'.','')) desc );
select 'set echo off' from dual;
select 'spool off' from dual;
spool off
set term on
set heading on
set feedback on

Same example with the previous one this time assuming I am moving from 11.1.0.6 to 11.2.0.1


SQL> @build_optimizer_features_test_e
Enter value for base_optimizer_version: 11.1.0.6
SQL> @optimizer_features_test_e
SQL> alter session set optimizer_features_enable='11.2.0.1';
Elapsed: 00:00:00.02
SQL> set echo off
Elapsed: 00:00:00.10

Plan hash value: 1143140472

---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |    99 | 24156 |     9  (12)| 00:00:01 |
|*  1 |  HASH JOIN                   |        |    99 | 24156 |     9  (12)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |   100 | 12200 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |   100 |       |     2   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |   100 | 12200 |     4   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |   100 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=100)
   5 - access("T2"."ID"<=100) Elapsed: 00:00:00.08 SQL> alter session set optimizer_features_enable='11.1.0.7';
Elapsed: 00:00:00.01
SQL> set echo off
Elapsed: 00:00:00.01

Plan hash value: 1143140472

---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |    99 | 24156 |     9  (12)| 00:00:01 |
|*  1 |  HASH JOIN                   |        |    99 | 24156 |     9  (12)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |   100 | 12200 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |   100 |       |     2   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |   100 | 12200 |     4   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |   100 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=100)
   5 - access("T2"."ID"<=100) Elapsed: 00:00:00.01 SQL> alter session set optimizer_features_enable='11.1.0.6';
Elapsed: 00:00:00.02
SQL> set echo off
Elapsed: 00:00:00.01

Plan hash value: 1143140472

---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |    99 | 24156 |     9  (12)| 00:00:01 |
|*  1 |  HASH JOIN                   |        |    99 | 24156 |     9  (12)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |   100 | 12200 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | PAR_I1 |   100 |       |     2   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2     |   100 | 12200 |     4   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | CHI_I1 |   100 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."ID"="T2"."ID")
   3 - access("T1"."ID"<=100)
   5 - access("T2"."ID"<=100) Elapsed: 00:00:00.02 SQL>

In summary without building a test case and testing it yourself it is hard to find out what is going on and this post was all about having everything ready before you start the
actual diagnose. Next posts I will only put an output of the these steps and won’t explain what they are.

References Used:
http://kerryosborne.oracle-guy.com/2009/07/creating-test-scripts-with-bind-variables/
http://kerryosborne.oracle-guy.com/2008/10/unstable-plans/

Scripts used in this post

All in one as a zip file- build_test_case.zip

Scipts shared as txt for availability on google docs. (I think I need to move my own domain)
build_bind_vars2

build_bind_vars2 Kerry Osborne version
build_bind_vars
build_bind_vars2 Kerry Osborne version
build_bind_vars_h
build_optimizer_features_test
build_optimizer_features_test_e
find_sqlid
find_sqlid_h
sqlhist
sqlid
sw
sw Tanel Poder version
awr_plan_change

<-Previous Post in Series————————————–  Next Post in Series–>

January 26, 2011

Plan stability through Upgrade to 11G-Introduction

Filed under: Performance, Plan Stability, Upgrade — coskan @ 12:31 am

What are the common reasons to upgrade a database ?

1- Version Support is ending or already ended
2- You hit a bug and support does not give you one-off patch solution and ask you to upgrade
3- There are new features, you think you may benefit from (reduced storage, new dr options, new development/management features)
4- Your management feels like they are dynamic and they should adapt new versions asap
5- You/End users are not happy with the current performance of DB and marketing of new version “guarantees” %XX better performance

Apart from the number 5, which always depends on architecture of your application, none of them is something users of your database really care. They don’t care what level of support you have,how much compression ration you have, how fast and how small your backups are, how solid your HA solution is, how easy you develop the code with new features,how easy DBAs manage the database, how shiny the management tool , how many bugs been sorted with new version or what your manager thinks, it is and it will always be the response time they experience is, what they only care. If it is fast enough they will be happy, if it is slow they are not and they won’t be happy.

When I search on web for the experiences people had after Oracle version upgrades, it was always performance what they most complained. From 8i to 9i , 8i to 10g, 9i to 10g people always been promised by marketing that the performance would be better but they usually experienced the opposite. Performance degradation was usually related with the enhancements for sql execution plan optimizations. Sometimes it was because of bugfixes,sometimes, it was plan enhancements for better optimizations, sometimes it was new features oracle intruduced (CBO,automatic statistics gatherin, system stats etc) or sometimes it was because of lack of understanding of optimizer (thanks Jonathan Lewis for his perfect book – Cost Based Oracle Fundamentals and his blog for letting readers understand how it works) there was always something and some of your plans was changed and performance of your db went down right after upgrade.

For plan stability users tried using hints which needs maintenance, high level of understanding and code changes or outlines which were a bit complex to use.
To sort this plan stability complexity, Guys at Oracle first came up with a solution called sql profiles in 10G but but it was cost option and not everybody had chance to use it.
Oracle needed a license free, easy to use and solid solution for plan stability and with 11G Oracle finally managed to give these options to users by  addition of  new “sql plan baselines” feature.

I recently did a major DB upgrade from 10.2.4 to 11.2.0.1 and despite all the performance problems we had during the testing phase, apart from 2 issues which are sorted in 10 minutes time we did not have any issue after the actual production upgrade. How did we managed to do this ?

1- we planned the upgrade in detail
2- we tested good and long enough
3- we caught all the problems during the test phase and sorted them all by plan baselines and with some of the techniques I will explain later.
4- We react quick enough on the first day after the upgrade. By help of plan baselines , We did not waste seconds to sort the problem.

Are sql plan baselines perfect solution ? Like all the software it is not perfect sometimes can get very hard to track down, but it is good enough to give what your end users want at first place, which is “same level of performance they experienced 1 day ago”. As I said before, they don’t care what you did that Saturday so you should put “giving them same level of performance they had and paid before upgrade” to the top of your priority list,

Using plan baselines looks like a bit like cheating for a DBA who likes to understand what the problem is, however with magical parameter optimizer_features_enable, (which is like a time machine that lets you have a journey between versions) you will always have time to understand what the problem was, so instead of becoming obsessive about sorting/understanding the actual problem at first place, it is better first creating the plan baseline to stabilize the system then start investigating.

By this way you could also avoid pressure of managers hanging around your des trying to understand “what the hell is going on”. (they won’t understand what is going on when you tell them anyway, so why bother explaining. ) the only thing they care is the number of calls reporting performance issues from end-users and big bosses. if the number of calls goes down they will be happy so give them what they want to see/hear. A stabile, upgraded database.

As you already understood from the topic and  reading nothing technical in the post , this post was an intruduction to plan stability through upgrade series I am starting.
During the series, I will try to explain how you can avoid upgrade related performance problems to make your dream upgrade and become hero :)

Current outline is like below. When I have time I will write them down one by one and link them here.

1-Introduction
2-Building the test
3-Why is my plan changed?-bugfixes : how you can find which bug fix may caused your plan change
4-Why is my plan changed?-new optimizer parameters : how you can find which parameter change/addition may caused your plan change
5-Why is my plan changed?-extra nested loop : what is the new nested loop step you will see after 11G upgrade
6-Why is my plan changed?-Auto adjusted Dynamic Sampling : I will try to explain how auto adjusted dynamic sampling can effect stability of your parallel queries
7-Why is my plan changed?-adaptive cursor sharing : I will talk a “little” about adaptive cursor sharing which may cause different plans for binded sqls after upgrade
8-Opening plan change case on MOS-SQLT : I will try to save the time you spend with Oracle Support when you raise a call for post upgrade performance degredation
9-Plan Baselines-Introduction : What are plan baselines they how they work
10-Plan Baselines-Using SQL Tuning sets : How to create plan baselines from tuning set ?
11-Plan Baselines-Using SQL Cache : How to create plan baselines from SQL Cache ?
12-Plan Baselines-Moving Baselines : How to move your plan baselines between database ?
13-Plan Baselines-Faking Baselines : How to fake the plan baseline?s
14-Plan Baselines-Capturing Baselines : How to capture baselines?
15-Plan Baselines-Management : How to manage your baselines?
16-Testing Statistics with Pending Stats : I’ll go through how you can use pending statistics during upgrades
17-Comparing Statistics : I’ll explain comparing the statistics
18-Cardinality Feedback Feature : I’ll go through new built in cardinality feedback feature which may cause problems
19-Where is the sqlid of active session ? : I’ll show you how you can find what your sql_id when it is null
20-Testing hintless database : I’ll explain how you can get rid of hints
21-Upgrade Day/Week : What needs to be ready for smooth upgrade ?
22-Before after analysis-mining problems : How you can spot possible problems comparing tuning sets
23-Before after analysis-graphs to sell : Using perfsheet to sell your work
24-Further Reading : Compilation of References I used during series and some helpfull links
25-Tools used : Index of the tools I used during series

I hope, I can write at least 3 post each week and finish series in 8-6 weeks. I nearly have all the material ready, which need a bit more polishing, I hope I won’t be to exhausted by lovely wordpress gui which is very annoying when it comes formatting.

January 24, 2011

Analysing Temp usage on 11GR2 when Temp space is not released

Filed under: Bugs — coskan @ 2:47 pm

We had an interesting issue on our 3 node cluster about the temp usage. One night one of the critical batch operation which never used high temp suddenly hit “ORA-01652: unable to extend temp segment by string in tablespace string tips”. Once we check the system we saw some “inactive sessions” were using very high amount of temp like below which were allocating most of the temp space causing batch operations fail even with very small temp request.

—The picture of temp usage right after the issue

PROD1> @temp_usageg

#####################################################################
#######################GLOBAL TEMP USAGE#############################
#####################################################################

   INST_ID TABLESPACE         MB_TOTAL     MB_USED    MB_FREE
---------- --------------- ----------- ----------- ----------
         3 TEMP              22528.000       7.000      22521
         2 TEMP              22528.000      56.000      22472
         1 TEMP              22528.000   19102.000       3426

#####################################################################
#######################GLOBAL TEMP USERS-over 10MB ##################
#####################################################################

   INST_ID SID_SERIAL USERNAME     OSUSER     SPID   SERVICE            MB_USED TABLESPACE
---------- ---------- ------------ ---------- ------ -------------------------- ---------------
         1 620,4322   USER_1      osusr1   21712  WEB_USER_SERVICE         16.000 TEMP
         1 41,35454   USER_1      osusr1   20334  WEB_USER_SERVICE         19.000 TEMP
         1 529,21034  USER_1      osusr1   5776   WEB_USER_SERVICE         20.000 TEMP
         1 538,54672  USER_1      osusr1   30383  WEB_USER_SERVICE         27.000 TEMP
         1 1258,36144 USER_1      osusr1   5745   WEB_USER_SERVICE         31.000 TEMP
         1 1351,56539 USER_2       root      29910  WEB_USER_SERVICE         67.000 TEMP
         1 1113,59480 USER_2       root      11397  WEB_USER_SERVICE         78.000 TEMP
         1 30,4102    USER_2       root      15244  WEB_USER_SERVICE         78.000 TEMP
         1 892,14168  USER_2       root      19971  WEB_USER_SERVICE         79.000 TEMP
         1 699,56288  USER_1      osusr1   11240  WEB_USER_SERVICE        668.000 TEMP
         1 1501,55691 USER_1      osusr1   11236  WEB_USER_SERVICE        685.000 TEMP
         1 1396,7154  USER_1      osusr1   12526  WEB_USER_SERVICE        808.000 TEMP
         1 233,46662  USER_1      osusr1   11334  WEB_USER_SERVICE        810.000 TEMP
         1 1004,14127 USER_1      osusr1   24287  WEB_USER_SERVICE        832.000 TEMP
         1 1196,820   USER_1      osusr1   12530  WEB_USER_SERVICE        868.000 TEMP
         1 1121,1990  USER_1      osusr1   24290  WEB_USER_SERVICE        893.000 TEMP
         1 297,2940   USER_1      osusr1   8910   WEB_USER_SERVICE        903.000 TEMP
         1 294,39051  USER_1      osusr1   15116  WEB_USER_SERVICE        924.000 TEMP
         1 37,46739   USER_1      osusr1   15118  WEB_USER_SERVICE        932.000 TEMP
         1 414,31439  USER_1      osusr1   8877   WEB_USER_SERVICE        935.000 TEMP
         1 913,40598  USER_1      osusr1   9710   WEB_USER_SERVICE       1064.000 TEMP
         1 710,44883  USER_1      osusr1   9703   WEB_USER_SERVICE       1115.000 TEMP
         1 147,16701  USER_1      osusr1   9295   WEB_USER_SERVICE       1137.000 TEMP
         1 1446,25613 USER_1      osusr1   12372  WEB_USER_SERVICE       1139.000 TEMP
         1 721,5152   USER_1      osusr1   12728  WEB_USER_SERVICE       1183.000 TEMP
         1 1295,31011 USER_1      osusr1   32078  WEB_USER_SERVICE       1208.000 TEMP
         1 14,60566   USER_1      osusr1   8236   WEB_USER_SERVICE       1209.000 TEMP
         1 29,64998   USER_1      osusr1   9292   WEB_USER_SERVICE       1318.000 TEMP

28 rows selected.

PROD1> check active sessions

PROD1> @swact

    SID STATE   EVENT                                     SEQ# SEC_IN_STATE         P1         P2         P3 SQL_ID         CHN
------- ------- ---------------------------------------- ----- ------------ ---------- ---------- ---------- ------------- ----
   1082 WAITING Streams AQ: waiting for messages in the  34587            1       8560 2.2692E+10          5
    390 WAITING wait for unread message on broadcast cha 52505            0 2.2570E+10 2.3036E+10          0 2b064ybzkwf1y    0
    888 WORKING On CPU / runqueue                        10058            0 1413697536          1          0 dgcp1b9y4h89a    0

PROD1> --none of the sessions were active

According to my Oracle knowledge, this should not has happened once the operation finishes (commit rollback) ,Oracle should has released the space. All of these sessions were all specific to one product group and service. Question was if this was a permanent problem which we never realized or something was going wrong ?
I checked back my old spool files for temporary checks (thanks to Tanel Poder who thought us how to spool every action- check init.sql from his script library) and saw that sessions with same username and service name were also holding big temp back in september and November like below. (These checks were probably done when we hit temp issue on node-1 or when I was checking the script so I did not pay attention if those sessions were active or not (this can be counted as invalid dba exception for me :))

From november – before 11.2.0.1 upgrade from 10.2.0.4


PROD3> @temp_usageg

#####################################################################
#######################GLOBAL TEMP USAGE#############################
#####################################################################

   INST_ID TABLESPACE         MB_TOTAL     MB_USED    MB_FREE
---------- --------------- ----------- ----------- ----------
         1 TEMP              20480.000     119.000      20361
         2 TEMP              20480.000        .000      20480
         3 TEMP              20480.000    8029.000      12451 ==>right node

#####################################################################
#######################GLOBAL TEMP USERS-over 10MB ##################
#####################################################################

   INST_ID SID_SERIAL USERNAME   OSUSER     SPID   PROGRAM             MB_USED TABLESPACE
---------- ---------- ---------- ---------- ------ --------------------------- ------------

         3 484,26686  USER_1       osusr1   6798    WEB_USER_SERVICE        109.000 TEMP
         3 898,53562  USER_1       osusr1   7272    WEB_USER_SERVICE        127.000 TEMP
         3 986,27409  USER_1       osusr1   5644    WEB_USER_SERVICE        147.000 TEMP
         3 771,4399   USER_1       osusr1   23613   WEB_USER_SERVICE        148.000 TEMP
         3 946,18709  USER_1       osusr1   12065   WEB_USER_SERVICE        163.000 TEMP
         3 378,20452  USER_1       osusr1   5646    WEB_USER_SERVICE        163.000 TEMP
         3 1007,23777 USER_1       osusr1   23413   WEB_USER_SERVICE        200.000 TEMP
         3 391,37179  USER_1       osusr1   9245    WEB_USER_SERVICE        701.000 TEMP
	 3 768,23493  USER_1       osusr1   12829   WEB_USER_SERVICE        904.000 TEMP
         3 607,13902  USER_1       osusr1   26307   WEB_USER_SERVICE       1000.000 TEMP
         3 475,32196  USER_1       osusr1   8460    WEB_USER_SERVICE       1010.000 TEMP
         3 593,58443  USER_1       osusr1   31126   WEB_USER_SERVICE       1063.000 TEMP
         3 879,51688  USER_1       osusr1   12831   WEB_USER_SERVICE       1113.000 TEMP
         3 664,40877  USER_1       osusr1   8470    WEB_USER_SERVICE       1136.000 TEMP

18 rows selected.

From October – before 11.2.0.1 upgrade from 10.2.0.4


PROD3> @temp_usage

TABLESPACE         MB_TOTAL     MB_USED    MB_FREE
--------------- ----------- ----------- ----------
TEMP              20480.000    7899.000      12581

#####################################################################
#######################GLOBAL TEMP USERS-over 10MB ##################
#####################################################################

SID_SERIAL USERNAME        OSUSER          SPID    PROGRAM             MB_USED TABLESPACE
---------- --------------- --------------- ------- --------------------------- -----------

463,42814  USER_1            osusr1        1447     WEB_USER_SERVICE         17.000 TEMP
1008,59864 USER_1            osusr1        16932    WEB_USER_SERVICE         17.000 TEMP
594,37070  USER_1            osusr1        6451     WEB_USER_SERVICE         20.000 TEMP
512,40923  USER_1            osusr1        16934    WEB_USER_SERVICE         25.000 TEMP
789,37294  USER_1            osusr1        9483     WEB_USER_SERVICE        731.000 TEMP
893,63174  USER_1            osusr1        6018     WEB_USER_SERVICE        815.000 TEMP
583,54040  USER_1            osusr1        23859    WEB_USER_SERVICE        933.000 TEMP
675,37736  USER_1            osusr1        9487     WEB_USER_SERVICE        984.000 TEMP
737,58648  USER_1            osusr1        25274    WEB_USER_SERVICE       1035.000 TEMP
831,46843  USER_1            osusr1        25272    WEB_USER_SERVICE       1062.000 TEMP
973,64549  USER_1            osusr1        10270    WEB_USER_SERVICE       1109.000 TEMP
774,18708  USER_1            osusr1        6625     WEB_USER_SERVICE       1133.000 TEMP

As you can see USER_1 from WEB_USER_SERVICE on the 3th (preferred) node was always using high temp.

Next question is why we did not have this issue before on Node-1? Once we check in detail we saw services of this product group were running on the wrong node (relocated due to a failure just before change freeze) which is our main OLTP and batch node. We already knew the issue (thanks to our service check script) but it was change freeze time and we did not have chance to move service to the right nodes. When I check with the product developer, I have been told they usually reboot their web servers every 3 weeks for maintenance. This was the obvious answer to the question why we did not have the issue before. We never run this service on the OLTP node and we never let web servers hit the limits even when they are running on their own node.

Since we gave answer to our curiosity questions, now it was time for the real question “Why was Oracle not releasing the temp space back ?” On 11GR2 we have a nice column called temp_space_allocated in V$ACTIVE_SESSION_HISTORY and DBA_HIST_ACTIVE_SESS_HISTORY tables and I am lucky that we have diagnostics pack license.

Let see what the trend is and see if some sqls are the main cause. The output below is taken when the issue happened.


SQL> select sample_time,event,sql_id,is_sqlid_current,temp_space_allocated/1024/1024 TEMP_MB
 1 from v$active_session_history where session_id=29;

SAMPLE_TIME             EVENT                     SQL_ID        I TEMP_MB
----------------------- ------------------------- ------------- - -----
04-JAN-11 10.18.09.591                            akgmrn2bww2p5 Y  1319
04-JAN-11 09.50.31.125  db file sequential read   73adfqhtsd909 Y  1319
04-JAN-11 09.46.29.488                            0gaghgkkb4qyc Y  1318
04-JAN-11 08.58.14.930  db file sequential read   73adfqhtsd909 Y  1318
04-JAN-11 08.13.09.845                            0gaghgkkb4qyc Y  1318
04-JAN-11 07.46.01.586  db file sequential read   ds7zkf3vhtms0 Y  1318
04-JAN-11 07.33.05.592  row cache lock            4tcbsv32h1xh9 Y  1317
04-JAN-11 06.49.49.772  db file sequential read   9s48k84m7kt36 Y  1317
04-JAN-11 06.49.48.772  db file sequential read   734fasjm6rfu8 Y  1316
04-JAN-11 06.49.47.762  db file sequential read   734fasjm6rfu8 Y  1316
04-JAN-11 06.49.45.762  db file sequential read   9s48k84m7kt36 Y  1315
04-JAN-11 06.49.44.762  log file sync                           Y  1314
04-JAN-11 06.18.09.894                            0gaghgkkb4qyc Y  1314
04-JAN-11 06.03.21.508                            akgmrn2bww2p5 Y  1314
04-JAN-11 04.18.20.232                            11chc5v8rkusd Y  1313
04-JAN-11 02.28.14.858                            73adfqhtsd909 Y  1312
04-JAN-11 02.08.13.726                            54hw5asg19hpt Y  1312
04-JAN-11 01.00.25.381                            akgmrn2bww2p5 Y  1312
04-JAN-11 00.58.15.042  db file sequential read   9s48k84m7kt36 Y  1312
04-JAN-11 00.00.02.040  direct path read temp     9tgtnsr31xsv5 Y  1312
03-JAN-11 23.52.33.856                            ds7zkf3vhtms0 Y  1301
03-JAN-11 22.33.14.607                            g7sy7a6ynn9r1 Y  1295
03-JAN-11 22.30.07.101  direct path read          9tgtnsr31xsv5 Y  1295

Because we did not sort the issue yet I also have the current picture of our system with a better script which accepts session sid and serial and uses lag function for the temp_space_allocated differences between sample times. (output is truncated)


PROD3> set echo on
PROD3> @temp_hist 603 8375
DBA_HIST_ACTIVE_SESS_HISTORY

PROD3>  select sample_time,session_id,session_serial#,sql_id,temp_space_allocated/1024/1024 temp_mb,
  2   temp_space_allocated/1024/1024-lag(temp_space_allocated/1024/1024,1,0) over (order by sample_time) as temp_diff
  3   from dba_hist_active_sess_history
  4   --from v$active_session_history
  5   where
  6   session_id=&1
  7   and session_serial#=&2
  8   order by sample_time asc
  9   /

SAMPLE_TIME               SESSION_ID SESSION_SERIAL# SQL_ID          TEMP_MB  TEMP_DIFF
------------------------- ---------- --------------- ------------- --------- ----------
22-JAN-11 13.00.51.214           603            8375                       1          1
22-JAN-11 15.14.29.430           603            8375 29kz43xq2s5xc         1          0
22-JAN-11 15.14.39.450           603            8375 734fasjm6rfu8         3          2
22-JAN-11 15.14.49.470           603            8375 734fasjm6rfu8         8          5
22-JAN-11 15.14.59.490           603            8375 734fasjm6rfu8        13          5
22-JAN-11 15.15.09.520           603            8375 9s48k84m7kt36        18          5
22-JAN-11 15.15.19.540           603            8375 734fasjm6rfu8        23          5
22-JAN-11 15.15.29.567           603            8375 9s48k84m7kt36        29          6
22-JAN-11 16.35.50.566           603            8375 bx5zxp1a5wb55        29          0
22-JAN-11 19.45.45.831           603            8375 73adfqhtsd909        32          3
22-JAN-11 21.40.40.613           603            8375 g7sy7a6ynn9r1        32          0
23-JAN-11 06.30.35.544           603            8375 ds7zkf3vhtms0        33          1
23-JAN-11 15.18.37.539           603            8375 9d618ka6h2hvz        35          2
23-JAN-11 15.18.47.583           603            8375 2du821ps20a4c        35          0
23-JAN-11 15.18.57.593           603            8375 29kz43xq2s5xc        35          0
23-JAN-11 15.19.07.623           603            8375                      39          4
23-JAN-11 15.19.17.643           603            8375 734fasjm6rfu8        44          5
23-JAN-11 15.19.27.663           603            8375 ds7zkf3vhtms0        49          5
23-JAN-11 15.19.37.683           603            8375 734fasjm6rfu8        55          6
23-JAN-11 15.19.47.713           603            8375 9s48k84m7kt36        61          6
23-JAN-11 15.19.57.733           603            8375 7902sa7n37sv9        62          1
23-JAN-11 22.00.39.071           603            8375 basj9jchbmu4q        63          1
24-JAN-11 05.00.45.895           603            8375 apmkt8bvrpyk0        70          7
24-JAN-11 13.00.11.180           603            8375 9tgtnsr31xsv5        86         16

24 rows selected.

ACTIVE_SESS_HIST

PROD3>  select sample_time,session_id,session_serial#,sql_id,temp_space_allocated/1024/1024 temp_mb,
  2   temp_space_allocated/1024/1024-lag(temp_space_allocated/1024/1024,1,0) over (order by sample_time) as temp_diff
  3   --from dba_hist_active_sess_history
  4   from v$active_session_history
  5   where
  6   session_id=&1
  7   and session_serial#=&2
  8   order by sample_time asc
  9   /

SAMPLE_TIME               SESSION_ID SESSION_SERIAL# SQL_ID          TEMP_MB  TEMP_DIFF
------------------------- ---------- --------------- ------------- --------- ----------
23-JAN-11 05.30.39.251           603            8375                      33         33
23-JAN-11 06.00.35.429           603            8375 73adfqhtsd909        33          0
23-JAN-11 06.10.45.871           603            8375 g7sy7a6ynn9r1        33          0
23-JAN-11 06.30.35.544           603            8375 ds7zkf3vhtms0        33          0
23-JAN-11 06.40.50.971           603            8375                      33          0
23-JAN-11 07.00.40.613           603            8375 g7sy7a6ynn9r1        33          0
23-JAN-11 07.01.21.713           603            8375 ds7zkf3vhtms0        33          0
23-JAN-11 07.10.45.941           603            8375 1rsg7wva9pyxd        33          0
23-JAN-11 08.00.45.805           603            8375 73adfqhtsd909        33          0
23-JAN-11 08.15.35.931           603            8375 ds7zkf3vhtms0        33          0
23-JAN-11 09.15.39.009           603            8375                      34          1
23-JAN-11 14.15.50.763           603            8375 g7sy7a6ynn9r1        34          0
23-JAN-11 14.20.35.388           603            8375 g7sy7a6ynn9r1        34          0
23-JAN-11 14.45.45.800           603            8375 4awcvxgvdauy6        35          1
23-JAN-11 15.15.39.123           603            8375 6ubavp8u2k7xa        35          0
23-JAN-11 15.18.36.539           603            8375 d042b4z3hnra4        35          0
23-JAN-11 15.18.37.539           603            8375 9d618ka6h2hvz        35          0
23-JAN-11 15.18.38.539           603            8375 4vwy3jp3wmmz5        35          0
23-JAN-11 15.18.39.539           603            8375 4vwy3jp3wmmz5        35          0
23-JAN-11 15.18.40.549           603            8375 5jf1jgqnp993z        35          0
23-JAN-11 15.18.41.549           603            8375 5jf1jgqnp993z        35          0
23-JAN-11 15.18.42.549           603            8375 37fxhfs1p44p6        35          0
23-JAN-11 15.18.43.549           603            8375 29kz43xq2s5xc        35          0
23-JAN-11 15.18.44.559           603            8375 2du821ps20a4c        35          0
23-JAN-11 15.18.45.573           603            8375 5jf1jgqnp993z        35          0
23-JAN-11 15.18.46.573           603            8375 5jf1jgqnp993z        35          0
23-JAN-11 15.18.47.583           603            8375 2du821ps20a4c        35          0
23-JAN-11 15.18.48.583           603            8375 67kpwgzpd4qam        35          0
23-JAN-11 15.18.49.583           603            8375 5jf1jgqnp993z        35          0
23-JAN-11 15.18.50.583           603            8375 7htn6x6s336f7        35          0
23-JAN-11 15.18.51.583           603            8375 29kz43xq2s5xc        35          0
23-JAN-11 15.18.52.593           603            8375 29kz43xq2s5xc        35          0
23-JAN-11 15.18.53.593           603            8375 29kz43xq2s5xc        35          0
23-JAN-11 15.18.54.593           603            8375 29kz43xq2s5xc        35          0
23-JAN-11 15.18.55.593           603            8375 29kz43xq2s5xc        35          0
23-JAN-11 15.18.56.593           603            8375 29kz43xq2s5xc        35          0
23-JAN-11 15.18.57.593           603            8375 29kz43xq2s5xc        35          0
23-JAN-11 15.18.58.603           603            8375 2du821ps20a4c        35          0
23-JAN-11 15.18.59.603           603            8375 9s48k84m7kt36        35          0
23-JAN-11 15.19.00.603           603            8375 9s48k84m7kt36        36          1

.....
.....
.....
129 rows selected.

PROD3>

Bear in mind relying on DBA_HIST data, for detecting the sqls which caused big jumps, will not be that realistic because of the sample frequency so it is better to focus on v$active_session_history.

When we look at the v$active_session_history it like there is no obvious reason. There are some sqlids which certainly increase the temp usage but not big jumps.I did one more check before I raise the question to Oracle support and checked what type of segments these TEMP segments are ? To do this I tweaked the temp_usage.sql and add segtype column to the scripts.


PROD1> @temp_usageg

#####################################################################
#######################GLOBAL TEMP USAGE#############################
#####################################################################

   INST_ID TABLESPACE         MB_TOTAL     MB_USED    MB_FREE
---------- --------------- ----------- ----------- ----------
         3 TEMP              22528.000       7.000      22521
         2 TEMP              22528.000      56.000      22472
         1 TEMP              22528.000   19102.000       3426

#####################################################################
#######################GLOBAL TEMP USERS-over 10MB###################
#####################################################################

   INST_ID SID_SERIAL USERNAME        OSUSER          SPID   SERVICE		   SEGTYPE	MB_USED TABLESPACE
---------- ---------- --------------- --------------- ------ ---------------------------------- ---------------
         1 620,4322   USER_1         osusr1        21712  WEB_USER_SERVICE       LOB_DATA     16.000 TEMP
         1 41,35454   USER_1         osusr1        20334  WEB_USER_SERVICE       LOB_DATA     19.000 TEMP
         1 529,21034  USER_1         osusr1        5776   WEB_USER_SERVICE       LOB_DATA     20.000 TEMP
         1 538,54672  USER_1         osusr1        30383  WEB_USER_SERVICE       LOB_DATA     27.000 TEMP
         1 1258,36144 USER_1         osusr1        5745   WEB_USER_SERVICE       LOB_DATA     31.000 TEMP
         1 1351,56539 USER_2          root           29910  WEB_USER_SERVICE       LOB_DATA     67.000 TEMP
         1 1113,59480 USER_2          root           11397  WEB_USER_SERVICE       LOB_DATA     78.000 TEMP
         1 30,4102    USER_2          root           15244  WEB_USER_SERVICE       LOB_DATA     78.000 TEMP
         1 892,14168  USER_2          root           19971  WEB_USER_SERVICE       LOB_DATA     79.000 TEMP
         1 699,56288  USER_1         osusr1        11240  WEB_USER_SERVICE       LOB_DATA    668.000 TEMP
         1 1501,55691 USER_1         osusr1        11236  WEB_USER_SERVICE       LOB_DATA    685.000 TEMP
         1 1396,7154  USER_1         osusr1        12526  WEB_USER_SERVICE       LOB_DATA    808.000 TEMP
         1 233,46662  USER_1         osusr1        11334  WEB_USER_SERVICE       LOB_DATA    810.000 TEMP
         1 1004,14127 USER_1         osusr1        24287  WEB_USER_SERVICE       LOB_DATA    832.000 TEMP
         1 1196,820   USER_1         osusr1        12530  WEB_USER_SERVICE       LOB_DATA    868.000 TEMP
         1 1121,1990  USER_1         osusr1        24290  WEB_USER_SERVICE       LOB_DATA    893.000 TEMP
         1 297,2940   USER_1         osusr1        8910   WEB_USER_SERVICE       LOB_DATA    903.000 TEMP
         1 294,39051  USER_1         osusr1        15116  WEB_USER_SERVICE       LOB_DATA    924.000 TEMP
         1 37,46739   USER_1         osusr1        15118  WEB_USER_SERVICE       LOB_DATA    932.000 TEMP
         1 414,31439  USER_1         osusr1        8877   WEB_USER_SERVICE       LOB_DATA    935.000 TEMP
         1 913,40598  USER_1         osusr1        9710   WEB_USER_SERVICE       LOB_DATA   1064.000 TEMP
         1 710,44883  USER_1         osusr1        9703   WEB_USER_SERVICE       LOB_DATA   1115.000 TEMP
         1 147,16701  USER_1         osusr1        9295   WEB_USER_SERVICE       LOB_DATA   1137.000 TEMP
         1 1446,25613 USER_1         osusr1        12372  WEB_USER_SERVICE       LOB_DATA   1139.000 TEMP
         1 721,5152   USER_1         osusr1        12728  WEB_USER_SERVICE       LOB_DATA   1183.000 TEMP
         1 1295,31011 USER_1         osusr1        32078  WEB_USER_SERVICE       LOB_DATA   1208.000 TEMP
         1 14,60566   USER_1         osusr1        8236   WEB_USER_SERVICE       LOB_DATA   1209.000 TEMP
         1 29,64998   USER_1         osusr1        9292   WEB_USER_SERVICE       LOB_DATA   1318.000 TEMP

28 rows selected.

My initial thought was temp tablespaces but I was wrong, it looks like all temp allocators which do not release the space are LOB segments. I also checked the sqls in the ash trend and it reveals sql_id’s which has got lob operations (some of them are packages). Double checked with developer and got the answer that they use dbms_lob.

Now it is time to look for bug, First hit on metalink “How to Release the Temp LOB Space and Avoid Hitting ORA-1652 [ID 802897.1]“. It says issue is for releases 8.1.5.0 to 11.1.0.7 but we are on 11.2.0.1. This document says there are 2 workarounds for versions above 10.2.0.4, first workaround is basically use
dbms_lob.freetemporary(a) but still drop the connection which is not a workaround at all.

Second solution is using the event 60025 to force session to freed space. The event information is like below

Cause
Temp LOB segments used for temporary LOBs are deleted only on session exit which may lead to large amounts of memory being held across multiple sessions.

Action
Setting this event will cause temporary LOB segments to be freed when there are no active temporary LOBs in the session. Setting this event will have a significant performance impact as it can cause temporary lob segments to be allocated and deleted many times during a session rather than once per session. Use this event only when temporary LOB segment memory use is an issue.

When I try the event it works (not immediately but when you call the allocation again) like below. I raised a call with Oracle before we implement the event change (probably with logon trigger) and still working with them to see if there is a one-off patch for the issue. If Oracle could not come up with one-off patch we will use the event.


---Before the session runs

14:23:00 SYS@ORACOS> @temp_usage

#####################################################################
#######################LOCAL TEMP USAGE#############################
#####################################################################

TABLESPACE         MB_TOTAL     MB_USED    MB_FREE
--------------- ----------- ----------- ----------
TEMP               2697.000        .000       2697

#####################################################################
#######################LOCAL TEMP USERS#############################
#####################################################################

no rows selected

---Session runs the dbms_lob

HR@ORACOS> @temp_test
14:23:10 HR@ORACOS>
14:23:10 HR@ORACOS> DECLARE
14:23:10   2  a clob;
14:23:10   3  BEGIN
14:23:10   4  dbms_lob.createtemporary(a, TRUE,dbms_lob.call);
14:23:10   5  END;
14:23:10   6  /

PL/SQL procedure successfully completed.

14:23:10 HR@ORACOS>
14:23:10 HR@ORACOS> commit;

Commit complete.

14:23:10 HR@ORACOS>
14:23:10 HR@ORACOS> ---check the temp usage on another session
14:23:10 HR@ORACOS> exec dbms_lock.sleep(15);

---monitoring session

14:23:01 SYS@ORACOS> @temp_usage

#####################################################################
#######################LOCAL TEMP USAGE#############################
#####################################################################

TABLESPACE         MB_TOTAL     MB_USED    MB_FREE
--------------- ----------- ----------- ----------
TEMP               2697.000       1.000       2696

#####################################################################
#######################LOCAL TEMP USERS#############################
#####################################################################

SID_SERIAL USERNAME          OSUSER          SPID  SERVICE_NAME        MB_USED TABLESPACE      STATEMENTS
---------- ----------------- --------------- --------------------- ----------- --------------- ----------
200,7236   HR                coskan          22101 SYS$USERS             1.000 TEMP                     1

---session activates the event

14:23:25 HR@ORACOS> ---enable the event
14:23:25 HR@ORACOS> alter session set events '60025 trace name context forever';

Session altered.

14:23:25 HR@ORACOS>
14:23:25 HR@ORACOS> ---check the temp usage on another session
14:23:25 HR@ORACOS> exec dbms_lock.sleep(15);

---monitoring session

14:23:15 SYS@ORACOS> @temp_usage

#####################################################################
#######################LOCAL TEMP USAGE#############################
#####################################################################

TABLESPACE         MB_TOTAL     MB_USED    MB_FREE
--------------- ----------- ----------- ----------
TEMP               2697.000       1.000       2696

#####################################################################
#######################LOCAL TEMP USERS#############################
#####################################################################

SID_SERIAL USERNAME          OSUSER          SPID   SERVICE_NAME        MB_USED TABLESPACE      STATEMENTS
---------- ----------------- --------------- ------ --------------- ----------- --------------- ----------
200,7236   HR                coskan          22101  SYS$USERS             1.000 TEMP                     1

---session activates the event and run the dbms_lob again

14:23:40 HR@ORACOS> ---re run the procedure
14:23:40 HR@ORACOS> DECLARE
14:23:40   2  a clob;
14:23:40   3  BEGIN
14:23:40   4  dbms_lob.createtemporary(a, TRUE,dbms_lob.call);
14:23:40   5  END;
14:23:40   6  /

PL/SQL procedure successfully completed.

14:23:40 HR@ORACOS>
14:23:40 HR@ORACOS> commit;

Commit complete.

14:23:40 HR@ORACOS>
14:23:40 HR@ORACOS> ---check the temp usage on another session
14:23:40 HR@ORACOS> exec dbms_lock.sleep(15);

--monitoring session
14:23:29 SYS@ORACOS> @temp_usage

#####################################################################
#######################LOCAL TEMP USAGE#############################
#####################################################################

TABLESPACE         MB_TOTAL     MB_USED    MB_FREE
--------------- ----------- ----------- ----------
TEMP               2697.000        .000       2697

#####################################################################
#######################LOCAL TEMP USERS#############################
#####################################################################

no rows selected

no rows selected

14:23:44 SYS@ORACOS>

In summary With this post I walk through how you can diagnose a temp issue and how new temp_allocated column of v$session and history tables can be very helpful during your diagnose.
Hope it will help some of you guys.

Scripts I used for this post

temp_usageg.sql for global temp usage info in RAC
temp_usage.sql for global temp usage info
temp_history.sqlto get historic information for a spesific sid,serial

Information about bug for sqlid column of V$TEMPSEG_USAGE / v$sort_usage by Kerry Osborne

Update 08/10/2012

Related info on another blog is also here for usage of WMSYS.WM_CONCAT and temp allocation

 

http://asanga-pradeep.blogspot.co.uk/2012/10/use-of-wmconcat-can-exhaust-temporary.html

« Newer PostsOlder Posts »

The Silver is the New Black Theme Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 193 other followers