Coskan’s Approach to Oracle

May 27, 2009

What if Vendor’s code was wrong ?

Filed under: Basics — coskan @ 12:32 pm

One of my biggest problem as a permanent worker is dealing with third party product databases. The main problem with them is, there is no one on site in case of a problem, calling someone from vendor  is a cost option which most of the companies try to avoid from.
Second problem is that, it is really hard to know, how that application works at database level. When you ask the guy who uses the program, they start to  talk business language with me and my brain, most of the time struggles to understand anything other  than 0 and 1.

Problems do not finish here, I dont know how it works at your company but most of the time when there is a major upgrade on the software they send a piece of DB upgrade script and you run it.

Please be honest, how many of you check , before you run this min 1000 lines of sql code ???

I hope I am wrong, but  probably most of you check it after it causes a problem on the DEV server especially when you are at a big enterprise which has lots of these third party products.  If it doesn’t cause a problem then you run it on prod.

Lets say you got a VM server or slow development server and most of the time users are already crying for slow performance on DEV system when they compare it with PRD so when it runs slow on DEV probably DEV system will be blamed instead of the batch upgrade code.  That is exactly how the incident occurred on one of our systems.

One of my colleague ran the batch job which took over 5  hours and they (him and the business guy) decided this slowness is because of VM DEV box.  They believed that PRD box will do this in  less than 2 hours which was acceptable for business.

Mistakes so far

Mistake 1- DBA did not check the code.

Mistake 2- DBA did not check the response time

Mistake 3- Common beliefs (slow DEV) directed both business and dba wrongly.

After they start upgrade on PRD, it was 5 hours so far when I decided to involve to the problem.  What I saw was was a fancy update with a more fancy execution plan.

it was something like below

SQL> create table test as select * from dba_tables;

Table created.

SQL> create table test2 as select * from test;

Table created.

SQL>
SQL> explain plan for
  2  update test t
  3  set OWNER=(select owner from test2 where owner=t.owner and table_name=t.table_name),
  4   TABLE_NAME=(select TABLE_NAME from test2 where owner=t.owner and table_name=t.table_name),
  5   TABLESPACE_NAME=(select TABLESPACE_NAME from test2 where owner=t.owner and table_name=t.table_name),
  6   CLUSTER_NAME=(select CLUSTER_NAME from test2 where owner=t.owner and table_name=t.table_name),
  7   IOT_NAME=(select IOT_NAME from test2 where owner=t.owner and table_name=t.table_name),
  8   STATUS=(select STATUS from test2 where owner=t.owner and table_name=t.table_name),
  9   PCT_FREE=(select PCT_FREE from test2 where owner=t.owner and table_name=t.table_name),
 10   INI_TRANS=(select INI_TRANS from test2 where owner=t.owner and table_name=t.table_name),
 11   MAX_TRANS=(select MAX_TRANS from test2 where owner=t.owner and table_name=t.table_name),
 12   NEXT_EXTENT=(select NEXT_EXTENT from test2 where owner=t.owner and table_name=t.table_name),
 13   PCT_INCREASE=(select PCT_INCREASE  from test2  where owner=t.owner and table_name=t.table_name),
 14   AVG_SPACE=(select AVG_SPACE from test2 where owner=t.owner and table_name=t.table_name),
 15   INSTANCES=(select INSTANCES from test2 where owner=t.owner and table_name=t.table_name),
 16   IOT_TYPE=(select IOT_TYPE from test2 where owner=t.owner and table_name=t.table_name),
 17   COMPRESS_FOR=(select COMPRESS_FOR from test2 where owner=t.owner and table_name=t.table_name),
 18   READ_ONLY=(select READ_ONLY from test2 where owner=t.owner and table_name=t.table_name),
 19   DROPPED=(select DROPPED from test2 where owner=t.owner and table_name=t.table_name);

Explained.

SQL> select * from table (dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------
Plan hash value: 1747411015

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |       |  2350 |   461K|    23   (0)| 00:00:01 |
|   1 |  UPDATE            | TEST  |       |       |            |          |
|   2 |   TABLE ACCESS FULL| TEST  |  2350 |   461K|    23   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| TEST2 |     1 |    34 |    23   (0)| 00:00:01 |
|*  4 |   TABLE ACCESS FULL| TEST2 |     1 |    34 |    23   (0)| 00:00:01 |
|*  5 |   TABLE ACCESS FULL| TEST2 |     1 |    51 |    23   (0)| 00:00:01 |
|*  6 |   TABLE ACCESS FULL| TEST2 |     1 |    51 |    23   (0)| 00:00:01 |
|*  7 |   TABLE ACCESS FULL| TEST2 |     1 |    51 |    23   (0)| 00:00:01 |
|*  8 |   TABLE ACCESS FULL| TEST2 |     1 |    40 |    23   (0)| 00:00:01 |
|*  9 |   TABLE ACCESS FULL| TEST2 |     1 |    47 |    23   (0)| 00:00:01 |
|* 10 |   TABLE ACCESS FULL| TEST2 |     1 |    47 |    23   (0)| 00:00:01 |
|* 11 |   TABLE ACCESS FULL| TEST2 |     1 |    47 |    23   (0)| 00:00:01 |
|* 12 |   TABLE ACCESS FULL| TEST2 |     1 |    47 |    23   (0)| 00:00:01 |
|* 13 |   TABLE ACCESS FULL| TEST2 |     1 |    47 |    23   (0)| 00:00:01 |
|* 14 |   TABLE ACCESS FULL| TEST2 |     1 |    47 |    23   (0)| 00:00:01 |
|* 15 |   TABLE ACCESS FULL| TEST2 |     1 |    41 |    23   (0)| 00:00:01 |
|* 16 |   TABLE ACCESS FULL| TEST2 |     1 |    42 |    23   (0)| 00:00:01 |
|* 17 |   TABLE ACCESS FULL| TEST2 |     1 |    45 |    23   (0)| 00:00:01 |
|* 18 |   TABLE ACCESS FULL| TEST2 |     1 |    37 |    23   (0)| 00:00:01 |
|* 19 |   TABLE ACCESS FULL| TEST2 |     1 |    37 |    23   (0)| 00:00:01 |
----------------------------------------------------------------------------

There wasn’t any index no stats nothing. Tables were created and updated and the developer was unaware from the syntax below.  (I know that Oracle is intelligent enough to cope with the dodgy one but as least developer should write more readable code)

SQL> explain plan for
  2  update test t
  3  set (OWNER,
  4  TABLE_NAME,
  5  TABLESPACE_NAME,
  6  CLUSTER_NAME,
  7  IOT_NAME,
  8  STATUS,
  9  PCT_FREE,
 10  INI_TRANS,
 11  MAX_TRANS,
 12  NEXT_EXTENT,
 13  PCT_INCREASE,
 14  AVG_SPACE,
 15  INSTANCES,
 16  IOT_TYPE,
 17  COMPRESS_FOR,
 18  READ_ONLY,
 19  DROPPED)=(select OWNER,
 20  TABLE_NAME,
 21  TABLESPACE_NAME,
 22  CLUSTER_NAME,
 23  IOT_NAME,
 24  STATUS,
 25  PCT_FREE,
 26  INI_TRANS,
 27  MAX_TRANS,
 28  NEXT_EXTENT,
 29  PCT_INCREASE,
 30  AVG_SPACE,
 31  INSTANCES,
 32  IOT_TYPE,
 33  COMPRESS_FOR,
 34  READ_ONLY,
 35  DROPPED from test2 where table_name=t.table_name)  ;

Explained.

SQL> select * from table (dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------
Plan hash value: 1507865077

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |       |  2350 |   461K|    23   (0)| 00:00:01 |
|   1 |  UPDATE            | TEST  |       |       |            |          |
|   2 |   TABLE ACCESS FULL| TEST  |  2350 |   461K|    23   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| TEST2 |    24 |  4824 |    23   (0)| 00:00:01 |
----------------------------------------------------------------------------

I asked to interrupt the running update and create index and gather stats before this update. Guess what, everything run under 1 minute.

But my colleague was against this idea that this script needs to be re-sent to vendor and must come back to us. I did not have time to deal with this discussion and left him to solve however he wants to as long as he doesn’t waste my resource and time.  Finally vendor approved what we did and the problem solved.

At our team meeting, I said this is unacceptable and anything running over 1 hour on OLTP DEV (no matter how the DEV configuration is ) systems deserves to be checked for performance initially by DBA, and thank god at least  this is accepted.

IMHO, as long as DBA doesn’t change any logical thing on the code, he should have the freedom of modifying the code, because ,sending code to vendor for review, is just easy way to get rid of dirt.  Vendor doesn’t have our data,  doesn’t have my stats, doesn’t have our configuration etc etc, and probably they are trying to write an application which will work on every RDBMS (deadliest mistake).

There is also a security related issue with these kind of third party codes but it is another issue …

Moral of the story for me is check everything before your run, during running don’t leave it monitor it, if you haven’t run it on DEV / TST ask for timings and  don’t trust word of anybody without any proof.

2 Comments »

  1. I totally agree with your observations especially with a common belief that slow run on development is fine, and this won’t happen in production because it is a bigger box. In our environment we do have policy to review codes if they are reviewable like SQL scripts before execution. Surprisingly, many vendors’ SQL scripts are written with potential issues (no index or index tablespace, cryptic table names like B1004 or H629, etc.) which can easily be remediated. And monitoring the executions is definitely a must. We often repeat the executions on development to ensure the consistent results. Your ending of “Don’t trust word of anybody without any proof” is very true. Thanks.

    Comment by ittichai — May 27, 2009 @ 2:13 pm

  2. I had a similar thing happen with an upgrade to a vendor supplied product I was running once. In the test instance the upgrade was taking about 2 days to run. I did some looking, added some indexes, gathered some stats and it ran in 5 hours.

    Comment by Jeremy — May 27, 2009 @ 7:02 pm


RSS feed for comments on this post. TrackBack URI

Leave a comment

Blog at WordPress.com.