Coskan’s Approach to Oracle

February 15, 2011

Plan Stability Through Upgrade-Why is my plan changed?-bugfixes-2

Filed under: Bugs, CBO, Performance, Plan Stability — coskan @ 12:01 am

This post is a bonus post where things can get complicated with my approach for finding bugfixes , if you are already upgraded.

On a database where we upgraded from 10.2.0.4 to 11.2.0.1 , We had a sql reported to have changed its response time on 11.2.0.1 .(At this point I need to say I’m sorry that I can’t share the SQL and PLANS because of data privacy. Tried to edit the sql and plans but query is massive . )

Interestingly when I run the optimizer_features_test for the versions between 10.2.0.4 to 11.2.0.1
only change was extra nested loop. Cost is same and plan generated for 10.2.0.4 was not same as pre-upgrade version

Runtime statistics .


SQL> @sqlid b290wxr1j2a2b

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
b290wxr1j2a2b 3272681547 SELECT   /*+gather_plan_statistics*//*test*/q.rowkey, i.NAME instrument,                                  NVL ((q.enddate - to_date('01Oct10',

1 row selected.

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 00000002BCC6C048 00000002BCC6A0A0   83708128     2022          1          1          1          2            121 1255771.09 1323564.07 1323.56407   62856727      12621          1               0
   1 00000002BCC6C048 00000002BCC2A3C8 1845562615      889          1          1          1          2            121   1847.719   2327.507   2.327507       2889        306          1               0
   2 00000002BCC6C048 00000002BCC24D78   83708128     2022          2          1          2          4            242 2310945.68 2318713.43 1159.35672  125714199        103          2               0
   3 00000002BCC6C048 00000002BCBCEFD0 3790468002     2022          2          1          2          4            242 2349948.75 2361862.35 1180.93118  125714497        424          2               0
   4 00000002BCC6C048 00000002BDDAF630  427381853     2022          2          1          2          4            242 2308517.05 2319381.53 1159.69077  125713318         26          2               0
   5 00000002BCC6C048 00000002DA577EB8  427381853     2022          2          1          2          4            242 2294214.23 2300615.11 1150.30755  125713315          0          2               0

6 rows selected.
Child-Version Match is like below

Child	OFE
=================
0-1	11.2.0.1
2	11.1.0.7
3	11.1.0.6
4	10.2.0.5
5	10.2.0.4

Child number 1 on 11.2.0.1 runs with cardinality feedback and uses right plan for OR with extra nested loop make up and drops the response time from 1323 seconds to 2 seconds
Other

So what is changed and how can I catch it if changing OFE is not working ? Its pretty obvious that we have a stats problem but this query involves 10 different table
and its not always easy to gather stats on tables where all other queries are running fine on. I needed to find a solution.

When I check the query it has an OR and the new plans basically do not generate a plan for OR. I modified the query to use union all instead of OR I can get the same pre-upgrade plan which is interesting
because stats are still same so how come Oracle generates right plan when I use union all and wrong plan when I use OR ? There must be something causing this behaviour.

Since Changing OFE did not change anything in terms of response time we may look into the situation where 8.0 bugfix suddenly appeared on 11.2.0.1 but were not active on 10.2.0.4

How can we spot the bugfix ? You can use the bugfix builder that generates script to test bugfixes for all version below the given version

Script is build_optimizer_bugfix_test_below_e: Same thing but creates a script for all the bugfixes for the versions below the given version

Source code of the script

undefine optimizer_feature_version
set heading off
set feedback off
define optimizer_feature="&optimizer_feature_version"
set term off
spool optimizer_bugfix_test_e.sql
select 'set feedback off' from dual;
select 'set heading off' from dual;
select 'set term off' from dual;
select 'set echo off' from dual;
select 'spool optimizer_bugfix_test_results_e.log' from dual;
select 'drop table  bugfix_table purge;' from dual;
select 'create  table bugfix_table (bugfix number,plan_hash_value varchar2(20));' from dual;
select 'alter session set optimizer_features_enable=''&optimizer_feature'';' from dual;
select   'alter session set "_fix_control"='''||bugno||decode(value,1,':OFF''',':ON''')||'; '
||chr(10)||'@test_e.sql'||chr(10)||
'insert into bugfix_table values ('||bugno||',(select substr(plan_table_output,18,12) from (select * from table(dbms_xplan.display)) where plan_table_output  like ''Plan hash value:%''));'||chr(10)||
'alter session set "_fix_control"='''||bugno||decode(value,1,':ON''',':OFF''')||'; '
from v$system_fix_control
where
to_number(replace(optimizer_feature_enable,'.',''))<to_number(replace('&optimizer_feature','.',''))
and value in (1,0) order by bugno;
select 'set term on' from dual;
select 'set feedback on' from dual;
select 'set heading on' from dual;
select 'break on plan_hash_value' from dual;
select 'select plan_hash_value,bugfix from bugfix_table order by 1;' from dual;
select 'set feedback on' from dual;
select 'set heading on' from dual;
select 'spool off' from dual;
set feedback on
set heading on
set term on
spool off

When I run it


SQL> @build_optimizer_bugfix_test_below_e
Enter value for optimizer_feature_version: 10.2.0.4
SQL> @optimizer_bugfix_test_e

PLAN_HASH_VALUE          BUGFIX
-------------------- ----------
1186576438              5129233
                        3120429
427381853               1403283
                        2320291
                        2324795
                        2492766
                        2660592
                        2663857
			.......
			.......
76 rows selected.

Two bugfixes gives us the plan we are looking for

SQL> @bugfixg
Enter value for bugno: 5129233

     BUGNO      VALUE SQL_FEATURE               DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE       EVENT IS_DEFAULT
---------- ---------- ------------------------- ---------------------------------------------------------------- ------------------------- ---------- ----------
   5129233          1 QKSFM_CBO_5129233         peek at any foldable exprssion during pruning                    10.2.0.3                           0          1

1 row selected.

SQL> @bugfixg
Enter value for bugno: 3120429

     BUGNO      VALUE SQL_FEATURE               DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE       EVENT IS_DEFAULT
---------- ---------- ------------------------- ---------------------------------------------------------------- ------------------------- ---------- ----------
   3120429          1 QKSFM_ACCESS_PATH_3120429 account for join key sparsity in computing NL index access cost  10.1.0.3                           0          1

1 row selected.

how can we sure which one is our problem ? Both of them enables us to run with the right plan
At this point only option is to go and check on a 10.2.0.4 database to see what these fixes were


SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL> @bugfixg
Enter value for bugno: 5129233

     BUGNO      VALUE DESCRIPTION                                                      OPTIMIZER_FEATURE_ENABLE       EVENT IS_DEFAULT
---------- ---------- ---------------------------------------------------------------- ------------------------- ---------- ----------
   5129233          1 peek at any foldable exprssion during pruning                    10.2.0.3                           0          1

SQL> @bugfixg
Enter value for bugno: 3120429

no rows selected

As you can see 5129233 is again enabled on 10.2.0.4 but bugfix 3120429 does not exist and not enabled/disabled on a database running on 10.2.0.4 binaries.

Even both of them are solving the issue when we are running on 11.2.0.1, in my opinion controlling bugfix is 3120429.
Thanks to the reasonable developer who accepted to change the code using union all instead of or we did not need to enable disable a bugfix.

Since we find two bugs lets test them (I add comments to force new parse instead of flushing the sql for fresh runtime stats so sqlids are different )

When bugfix 3120429 is disabled

SQL> @sqlid 3798wumm6mwsq

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
3798wumm6mwsq 3865703190 SELECT   /*+gather_plan_statistics*//*test_2*/q.rowkey, i.NAME instrument,                                  NVL ((q.enddate - to_date('01Oct10',

1 row selected.

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 00000002DC1DABC8 00000002DC968950 4098949197      531          2          1          2          2              0     1305.8     1332.1     .66605        207          3          2               0
   1 00000002DC1DABC8 00000002DC6182A0 4098949197      531          2          1          2          2              0   1328.799   1336.494    .668247        207          0          2               0
   2 00000002DC1DABC8 00000002DC76E5E8 3700866780      451          2          1          2          2              0   1353.794   1369.258    .684629        214          0          2               0
   3 00000002DC1DABC8 00000002DCCAAFF8 2856793267      451          2          1          2          2              0   1244.811   1251.411   .6257055        214          0          2               0
   4 00000002DC1DABC8 00000002A5F40D20 1186576438      481          2          1          2          2              0    898.864    906.962    .453481        190          0          2               0

When bugfix 5129233 is disabled


SQL> @sqlid bh6a9xtq27hba

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
bh6a9xtq27hba 1814282602 SELECT   /*+gather_plan_statistics*//*test_3*/q.rowkey, i.NAME instrument,                                  NVL ((q.enddate - to_date('01Oct10',

1 row selected.

 CH# PARENT_HANDLE    OBJECT_HANDLE     PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS    FETCHES ROWS_PROCESSED     CPU_MS     ELA_MS    AVG_ELA       LIOS       PIOS      SORTS USERS_EXECUTING
---- ---------------- ---------------- ---------- -------- ---------- ---------- ---------- ---------- -------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------
   0 00000002DC6F9150 00000002DC955228  986578726      519          2          1          2          2              0    1177.82   1183.933   .5919665        183          0          2               0
   1 00000002DC6F9150 00000002A5F400F0  986578726      519          2          1          2          2              0   1261.808   1268.259   .6341295        183          0          2               0
   2 00000002DC6F9150 00000002DCBB2DE0 3700866780      445          2          1          2          2              0   1506.771   1533.008    .766504        214          0          2               0
   3 00000002DC6F9150 00000002DCBB29C0 2856793267      445          2          1          2          2              0   1305.801   1317.385   .6586925        214          0          2               0
   4 00000002DC6F9150 00000002DC81E000 1186576438      474          2          1          2          2              0    940.858    945.413   .4727065        190          0          2               0

On 11.2.0.1 and 11.1.0.7 plan is changed depending on the bugfix but for 10.2.0.4, 10.2.0.5 and 11.1.0.6 plan is constant for changing different bugfixes.
You can also see previous cardinality feedback plan is also slower then bugfix disabled version.

We can dig further but its not the point of this post and Since I managed to take the response time back to where it was from 1323 seconds to 0.5 seconds I can stop here.

As you can see we still spot a difference after upgrade even time travelling between releases by altering OFE did not help us at first point.

This post was to explain complicated issue and things you can face with my approach.

I also find it very usefull to have v$system_fix_control table snapped before you upgrade/patch and compared afterwards, so you won’t get shocked with newly added previos versions related bugs.

Scripts Used;

build_optimizer_bugfix_test_below_e

<-Previos Post in Series ——————————-  Next Post in Series–>

About these ads

2 Comments »

  1. [...] <–Previous Post in This Series —————-     Next Post in this series–> [...]

    Pingback by Plan Stability Through Upgrade-Why is my plan changed?-new optimizer parameters « Coskan’s Approach to Oracle — February 17, 2011 @ 8:11 pm

  2. [...] by Plan Stability Through Upgrade-Why is my plan changed?-bugfixes-2 « Coskan’s Approach to… — February 15, 2011 @ 12:03 am [...]

    Pingback by Plan Stability Through Upgrade-Why is my plan changed?-bugfixes-1 « Coskan’s Approach to Oracle — February 17, 2011 @ 8:19 pm


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Theme: Silver is the New Black. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 206 other followers

%d bloggers like this: