Coskan’s Approach to Oracle

May 27, 2009

What if Vendor’s code was wrong ?

Filed under: Basics — coskan @ 12:32 pm

One of my biggest problem as a permanent worker is dealing with third party product databases. The main problem with them is, there is no one on site in case of a problem, calling someone from vendor  is a cost option which most of the companies try to avoid from.
Second problem is that, it is really hard to know, how that application works at database level. When you ask the guy who uses the program, they start to  talk business language with me and my brain, most of the time struggles to understand anything other  than 0 and 1.

Problems do not finish here, I dont know how it works at your company but most of the time when there is a major upgrade on the software they send a piece of DB upgrade script and you run it.

Please be honest, how many of you check , before you run this min 1000 lines of sql code ???

I hope I am wrong, but  probably most of you check it after it causes a problem on the DEV server especially when you are at a big enterprise which has lots of these third party products.  If it doesn’t cause a problem then you run it on prod.

Lets say you got a VM server or slow development server and most of the time users are already crying for slow performance on DEV system when they compare it with PRD so when it runs slow on DEV probably DEV system will be blamed instead of the batch upgrade code.  That is exactly how the incident occurred on one of our systems.

One of my colleague ran the batch job which took over 5  hours and they (him and the business guy) decided this slowness is because of VM DEV box.  They believed that PRD box will do this in  less than 2 hours which was acceptable for business.

Mistakes so far

Mistake 1- DBA did not check the code.

Mistake 2- DBA did not check the response time

Mistake 3- Common beliefs (slow DEV) directed both business and dba wrongly.

After they start upgrade on PRD, it was 5 hours so far when I decided to involve to the problem.  What I saw was was a fancy update with a more fancy execution plan.

it was something like below

SQL> create table test as select * from dba_tables;

Table created.

SQL> create table test2 as select * from test;

Table created.

SQL>
SQL> explain plan for
  2  update test t
  3  set OWNER=(select owner from test2 where owner=t.owner and table_name=t.table_name),
  4   TABLE_NAME=(select TABLE_NAME from test2 where owner=t.owner and table_name=t.table_name),
  5   TABLESPACE_NAME=(select TABLESPACE_NAME from test2 where owner=t.owner and table_name=t.table_name),
  6   CLUSTER_NAME=(select CLUSTER_NAME from test2 where owner=t.owner and table_name=t.table_name),
  7   IOT_NAME=(select IOT_NAME from test2 where owner=t.owner and table_name=t.table_name),
  8   STATUS=(select STATUS from test2 where owner=t.owner and table_name=t.table_name),
  9   PCT_FREE=(select PCT_FREE from test2 where owner=t.owner and table_name=t.table_name),
 10   INI_TRANS=(select INI_TRANS from test2 where owner=t.owner and table_name=t.table_name),
 11   MAX_TRANS=(select MAX_TRANS from test2 where owner=t.owner and table_name=t.table_name),
 12   NEXT_EXTENT=(select NEXT_EXTENT from test2 where owner=t.owner and table_name=t.table_name),
 13   PCT_INCREASE=(select PCT_INCREASE  from test2  where owner=t.owner and table_name=t.table_name),
 14   AVG_SPACE=(select AVG_SPACE from test2 where owner=t.owner and table_name=t.table_name),
 15   INSTANCES=(select INSTANCES from test2 where owner=t.owner and table_name=t.table_name),
 16   IOT_TYPE=(select IOT_TYPE from test2 where owner=t.owner and table_name=t.table_name),
 17   COMPRESS_FOR=(select COMPRESS_FOR from test2 where owner=t.owner and table_name=t.table_name),
 18   READ_ONLY=(select READ_ONLY from test2 where owner=t.owner and table_name=t.table_name),
 19   DROPPED=(select DROPPED from test2 where owner=t.owner and table_name=t.table_name);

Explained.

SQL> select * from table (dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------
Plan hash value: 1747411015

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |       |  2350 |   461K|    23   (0)| 00:00:01 |
|   1 |  UPDATE            | TEST  |       |       |            |          |
|   2 |   TABLE ACCESS FULL| TEST  |  2350 |   461K|    23   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| TEST2 |     1 |    34 |    23   (0)| 00:00:01 |
|*  4 |   TABLE ACCESS FULL| TEST2 |     1 |    34 |    23   (0)| 00:00:01 |
|*  5 |   TABLE ACCESS FULL| TEST2 |     1 |    51 |    23   (0)| 00:00:01 |
|*  6 |   TABLE ACCESS FULL| TEST2 |     1 |    51 |    23   (0)| 00:00:01 |
|*  7 |   TABLE ACCESS FULL| TEST2 |     1 |    51 |    23   (0)| 00:00:01 |
|*  8 |   TABLE ACCESS FULL| TEST2 |     1 |    40 |    23   (0)| 00:00:01 |
|*  9 |   TABLE ACCESS FULL| TEST2 |     1 |    47 |    23   (0)| 00:00:01 |
|* 10 |   TABLE ACCESS FULL| TEST2 |     1 |    47 |    23   (0)| 00:00:01 |
|* 11 |   TABLE ACCESS FULL| TEST2 |     1 |    47 |    23   (0)| 00:00:01 |
|* 12 |   TABLE ACCESS FULL| TEST2 |     1 |    47 |    23   (0)| 00:00:01 |
|* 13 |   TABLE ACCESS FULL| TEST2 |     1 |    47 |    23   (0)| 00:00:01 |
|* 14 |   TABLE ACCESS FULL| TEST2 |     1 |    47 |    23   (0)| 00:00:01 |
|* 15 |   TABLE ACCESS FULL| TEST2 |     1 |    41 |    23   (0)| 00:00:01 |
|* 16 |   TABLE ACCESS FULL| TEST2 |     1 |    42 |    23   (0)| 00:00:01 |
|* 17 |   TABLE ACCESS FULL| TEST2 |     1 |    45 |    23   (0)| 00:00:01 |
|* 18 |   TABLE ACCESS FULL| TEST2 |     1 |    37 |    23   (0)| 00:00:01 |
|* 19 |   TABLE ACCESS FULL| TEST2 |     1 |    37 |    23   (0)| 00:00:01 |
----------------------------------------------------------------------------

There wasn’t any index no stats nothing. Tables were created and updated and the developer was unaware from the syntax below.  (I know that Oracle is intelligent enough to cope with the dodgy one but as least developer should write more readable code)

SQL> explain plan for
  2  update test t
  3  set (OWNER,
  4  TABLE_NAME,
  5  TABLESPACE_NAME,
  6  CLUSTER_NAME,
  7  IOT_NAME,
  8  STATUS,
  9  PCT_FREE,
 10  INI_TRANS,
 11  MAX_TRANS,
 12  NEXT_EXTENT,
 13  PCT_INCREASE,
 14  AVG_SPACE,
 15  INSTANCES,
 16  IOT_TYPE,
 17  COMPRESS_FOR,
 18  READ_ONLY,
 19  DROPPED)=(select OWNER,
 20  TABLE_NAME,
 21  TABLESPACE_NAME,
 22  CLUSTER_NAME,
 23  IOT_NAME,
 24  STATUS,
 25  PCT_FREE,
 26  INI_TRANS,
 27  MAX_TRANS,
 28  NEXT_EXTENT,
 29  PCT_INCREASE,
 30  AVG_SPACE,
 31  INSTANCES,
 32  IOT_TYPE,
 33  COMPRESS_FOR,
 34  READ_ONLY,
 35  DROPPED from test2 where table_name=t.table_name)  ;

Explained.

SQL> select * from table (dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------
Plan hash value: 1507865077

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |       |  2350 |   461K|    23   (0)| 00:00:01 |
|   1 |  UPDATE            | TEST  |       |       |            |          |
|   2 |   TABLE ACCESS FULL| TEST  |  2350 |   461K|    23   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| TEST2 |    24 |  4824 |    23   (0)| 00:00:01 |
----------------------------------------------------------------------------

I asked to interrupt the running update and create index and gather stats before this update. Guess what, everything run under 1 minute.

But my colleague was against this idea that this script needs to be re-sent to vendor and must come back to us. I did not have time to deal with this discussion and left him to solve however he wants to as long as he doesn’t waste my resource and time.  Finally vendor approved what we did and the problem solved.

At our team meeting, I said this is unacceptable and anything running over 1 hour on OLTP DEV (no matter how the DEV configuration is ) systems deserves to be checked for performance initially by DBA, and thank god at least  this is accepted.

IMHO, as long as DBA doesn’t change any logical thing on the code, he should have the freedom of modifying the code, because ,sending code to vendor for review, is just easy way to get rid of dirt.  Vendor doesn’t have our data,  doesn’t have my stats, doesn’t have our configuration etc etc, and probably they are trying to write an application which will work on every RDBMS (deadliest mistake).

There is also a security related issue with these kind of third party codes but it is another issue …

Moral of the story for me is check everything before your run, during running don’t leave it monitor it, if you haven’t run it on DEV / TST ask for timings and  don’t trust word of anybody without any proof.

May 20, 2009

TOP by Christian Antognini

Filed under: Book Reviews, Diary, Performance — coskan @ 4:29 pm

I finally finished “reading” of Troubleshooting Oracle Performance by Christian Antognini.

The book is already reviewed as a blog post by many other bloggers like Carry Millsap, Jonathan Lewis (they  also wrote forewords for the book), Jason Arneil and  Tonguc Yilmaz.

Now it is my turn to say something about this invaluable book.

First of  all this review is based on what I read because I read the book mostly when I am on train and I did not try %90 of what is written on the book yet. After I go over the book again and try all the scripts,  I will write more posts about the book which is going to be, what I learned from TOP by Christian Antognini series.

The book is  very well structured and ordered,  you don’t loose yourself by references to upcoming topics that much.  It is not written by a native English speaker,  so you don’t struggle with language,  it is simple and easy to understand as much as it can be.  Everything he explained is supported by a proof. The best part of this book among other two performance related bibles (Cost Based Oracle Fundamentals and Optimizing Oracle Performance which are also very much recommended) you never get lost from the math. I graduated as an Computer Engineer but sometimes I really struggle to understand where all those numbers come from when I look at the other books.   the way Christian explains the Math in TOP is very understandableand there is no way you can complain about too much math in this book.

Apart from all these general readability issues, TOP gives you a method for designing a better application and a method to solve most of the common problems on Oracle Databases.  It covers all available supported Oracle versions, so you also find chance to see the differences on Oracle Troubleshooting and optimal design from 9i to 11g.

What TOP doesn’t give,  is magic. If you expect magical internal ways for troubleshooting Oracle, this book is not right for you. You need to look other definitive books :)

Christian covered nearly everything ,  The only problem is that ,he doest give step by step approach like Tanel does. It would be excellent, if Christian add one more chapter and give step by step approach in case of a problem. Although Last chapter does something smilar , I still think it is a bit small.

I have a dream that Christian and Tanel work on a book together :) It would be great to have a  book which covers nearly everything you can face while troubleshooting Oracle. Coincidentally their masterclasses at UKOUG were one after the other and during reading this book I attended Tanel’s class,and I found the chance to compare them twice, My conclusion is,  I think their work deserves to be combined. Just a feedback from audience :)

Last word about this book it is excellent piece of work. I recommend it to everyone who wants to learn Troubleshooting Oracle Performance and optimal Oracle design for better performance.

Many  Thanks goes to  Christian for sharing his work with community.

May 5, 2009

Disk backup to Network Drive on Windows with RMAN

Filed under: Backup/Recovery — coskan @ 4:38 pm

We are currently working on moving our databases to different geographic locations and try to look for the best approach. One of the suggested ideas was mapping the target server drive and backup directly to mapped network drive on the target server and get rid of copy step after the backup.

After the initial trials, every time RMAN failed to write to the network drive with the error stack below.

ORA-19504: failed to create file "X:\DB_T686079212_S20_P1"
ORA-27040: file create error, unable to create file
OSD-04002: unable to open file
O/S-Error: (OS 3) The system cannot find the path specified.

or


ORA-19504: failed to create file "\\SERVERNAME\DB_T686079212_S20_P1"
ORA-27040: file create error, unable to create file
OSD-04002: unable to open file
O/S-Error: (OS 3) The system cannot find the path specified.

Metalink note 145843.1 came up with a solution

Basically all you need to do is update Oracle Services (both listener and Database) to run with Domain Account which has access on both servers with Administrator Privilege.

After you update the services ;
1- Restart the services

2-Use UNC naming instead of using drive letter on your script. (UNC naming part is not mentioned in the note)

3- Run the RMAN backup

Worked fine on my case.

Theme: Silver is the New Black. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 206 other followers