Coskan’s Approach to Oracle

April 25, 2012

Carlos Sierra started blogging

Filed under: Diary — coskan @ 11:34 am

Just a quick post for letting people know that Carlos Sierra started blogging in his Oracle SQL Tuning Notepad. He is the developer of many tools like “coe_xplain.sql, followed by profiler.sql, Trace Analyzer (TRCANLZR), SQLTXPLAIN (SQLT) and SQL Health-Check (SQLHC). ”  which are available in MOS.

Never seen or talk him personally but I’m big fan and user of his tools for a long time and I’m sure he will write plenty of tuning stories. I would like to say welcome to Oracle Blogosphere and wish him good luck with his blogging journey

Advertisement

January 29, 2010

Working with statspack-part-1b-Solution/Answer/another Question

Filed under: Diary, Performance — coskan @ 6:58 pm

<—-Part 1a

In part 1a I was trying to find a problem to troubleshoot and I spotted 2 possible problems and drilled down into one of them and stuck. Below is the list of what I did till I stuck and started to spend time on doing more research than it actually needs because I forgot the most important and easy step which I will explain in a moment.

What I did so far,

  • found the possible bottleneck on graph
  • cross-checked the bottleneck with DB time
  • checked the statspack and found “row cache lock” “log file switch (checkpoint incomplete)”, “buffer busy” waits as queue time
  • checked if Dictionary Cache statistics gives clue – No luck -at least for me
  • checked if DB did more redo than redo size can handle- redo activity was way lower than actual office hours
  • checked the segments for buffer busy waits and found only 1 segment caused all the wait but buffer busy waits and row cache lock are not related are they ?

What else I have in my hand ? I have a skipped snapshot slot. One of the statspack snaps could not be gathered.

What should I have done, before going and reading and learning more (which wasn’t a waste of time for learning but will be a waste of time in pressure situation) about those top events ?

Answer was I should have checked the alert log especially when my information is minimum with this historic problems. I was on call that week and I “assumed” that nothing was wrong with DB because I did not get called. Assumptions without proofs can be big problem and big time waster for your troubleshooting practices. After 4 hours of searching on the net for causes of row cache lock (which is not a common event), I decided to check alert log because of that skipped snapshot. What I found ?

Wed Jan 20 21:47:09 2010
Errors in file j:\data\oracle\trace\prod01\bdump\prod01_arc1_7976.trc:
ora-19816: Message 19816 not found; No message file for product=RDBMS, facility=ORA; arguments: [db_recovery_file_dest]
ora-27044: Message 27044 not found; No message file for product=RDBMS, facility=ORA
OSD-04008: WriteFile() failure, unable to write to file
O/S-Error: (OS 112) There is not enough space on the disk.
....
....
....
....
Wed Jan 20 22:28:48 2010
kcrrdmx: Successful archiving of previously failed ORL
Archiver process freed from errors. No longer stopped
Wed Jan 20 22:28:49 2010

As you see, I found that the database was hanged because the disk was full . Why was it full ? Looks like a mistake maybe not. It might be because we do exports on the same disk with db_file_recovery_area and probably One of my team mates took extra export during the day and forgot to delete it. With the redo amount generated that day it is unlikely that we can fill the logical limit and we did not get any %85 warning on that day on the alert log. Why did not I get called, because the issue lasted 45 minutes and probably technical guys did not check Big Brother monitoring that time:) I have no clue how it solved

I asked myself what is it to do with “row cache lock” why not “log file switch (archiving needed)”

Definition of the expected wait event fits with the alert log so it should be that event or should it ?

log file switch (archiving needed)

Waiting for a log switch because the log that the LGWR will be switching into has not been archived yet. Check the alert file to make sure that archiving has not stopped due to a failed archive write. To speed archiving, consider adding more archive processes or putting the archive files on striped disks.

Wait Time: 1 second

Parameters: None

Let’s check how it was;

SQL> @sp_evtrends
Please enter the ORACLE_SID value: prod01
Please enter the report begin date: sysdate-11
Please enter the report end date:sysdate-8

Some useful database statistics to search upon:
------------------------------------------------------------
        ARCH random i/o
        ARCH sequential i/o
	....
	.....
	.....
What statistic do you want to analyze? log file switch (archiving needed)

...
...
...
Daily trends for "log file switch (archiving needed)"...

Day    Hour   Statistic Name                         Secs Waited Percentage of total over all hours for each day
------ ------ ------------------------------ ------------------- ------------------------------------------------
17:00  log file switch (archiving needed)                0.00
18:00  log file switch (archiving needed)                0.00
19:00  log file switch (archiving needed)                0.00
20:00  log file switch (archiving needed)                0.00
21:00  log file switch (archiving needed)                0.00
22:00  log file switch (archiving needed)                0.00
23:00  log file switch (archiving needed)                0.00

How come ? Alert log says I was getting “log file switch (archiving needed)” but why my system event doesn’t show it? Lets check cumulative wait on this event since server started (12/30/2009)

SQL> select * from v$system_event where event like 'log file switch (archiving needed)';

EVENT                                                            TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT TIME_WAITED_MICRO   EVENT_ID WAIT_CLASS_ID WAIT_CLASS# WAIT_CLASS
---------------------------------------------------------------- ----------- -------------- ----------- ------------ ----------------- ---------- ------------- ----------- ---------------
log file switch (archiving needed)                                    132442          80465     8062918        60.88        8.0629E+10  681532028    3290255840           2 Configuration

It says my DB waited on this event before but when ?

SQL> @sp_event_delta2 "log file switch (archiving needed)"

   SNAP_ID SNAP_TIME         EVENT                                                            TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED_MICRO
---------- ----------------- ---------------------------------------------------------------- ----------- -------------- -----------------
      7517 20100108 06:07:52 log file switch (archiving needed)                                        27              5           7022081
      7766 20100113 06:18:58 log file switch (archiving needed)                                         4              0            194101
      7831 20100114 13:49:06 log file switch (archiving needed)                                        16              0           5062268
      7892 20100115 21:05:16 log file switch (archiving needed)                                    132395          80460        8.0617E+10

I have 4 more day and time slots for this event and looks like on 15/01/2010 I had a similar issue. Lets check further for that day.

SQL> @sp_systime2
Please enter the ORACLE_SID value: prod01
Please enter the report begin date: sysdate-15
Please enter the report end date:sysdate-13

                                                                        Total             Cum
             Service                                                  Seconds    % of    % of
Day          or Wait  Name                                              Spent   Total   Total
------------ -------- ----------------------------------- ------------------- ------- -------

14-JAN       Service  SQL execution                                 32,871.29   39.24   39.24
             Wait     db file sequential read                       14,875.81   17.76   56.99
             Service  Recursive SQL execution                       13,203.42   15.76   72.75
             Wait     log file parallel write                        3,381.76    4.04   76.79
             Wait     db file scattered read                         3,069.28    3.66   80.45
             Wait     log file sync                                  2,651.77    3.17   83.62
             Wait     db file parallel write                         2,330.67    2.78   86.40
             Wait     Backup: sbtwrite2                              2,323.22    2.77   89.17
             Wait     enq: TX - row lock contention                  1,515.07    1.81   90.98
             Wait     Log archive I/O                                1,387.98    1.66   92.64
             Service  Parsing SQL                                    1,138.64    1.36   94.00
             Wait     log file sequential read                         814.92    0.97   94.97

15-JAN       Wait     log file switch (archiving needed)            80,616.90   31.14   31.14
             Wait     buffer busy waits                             41,401.25   15.99   47.13
             Wait     row cache lock                                39,873.16   15.40   62.53
             Service  SQL execution                                 37,210.97   14.37   76.90
             Wait     db file sequential read                       15,267.63    5.90   82.79
             Service  Recursive SQL execution                       13,166.82    5.09   87.88
             Wait     db file scattered read                         5,490.75    2.12   90.00
             Wait     log file parallel write                        5,157.94    1.99   91.99
             Wait     log file sync                                  3,793.75    1.47   93.46
             Wait     db file parallel write                         3,447.89    1.33   94.79
             Wait     enq: TX - row lock contention                  3,175.70    1.23   96.02
             Wait     Log archive I/O                                2,231.86    0.86   96.88

....
....
...
                                                                              Total             Cum
                   Service                                                  Seconds    % of    % of
Day          Hour  or Wait  Name                                              Spent   Total   Total
------------ ----- -------- ----------------------------------- ------------------- ------- -------
15-JAN      19:00 Service  SQL execution                                  2,193.28   34.66   34.66
                   Wait     db file sequential read                        1,423.59   22.50   57.16
                   Wait     log file parallel write                          543.65    8.59   65.75
                   Wait     db file parallel write                           445.41    7.04   72.79
                   Wait     log file sequential read                         344.37    5.44   78.24
                   Wait     db file scattered read                           304.99    4.82   83.06
         *** 21:00 Wait     log file switch (archiving needed)            80,616.90   48.82   48.82
                   Wait     buffer busy waits                             41,336.81   25.03   73.86
                   Wait     row cache lock                                39,873.16   24.15   98.01
                   Service  SQL execution                                  1,637.31    0.99   99.00
                   Wait     db file sequential read                          359.08    0.22   99.22
                   Service  Recursive SQL execution                          325.17    0.20   99.41
             22:00 Service  SQL execution                                  3,763.74   53.76   53.76
                   Wait     db file sequential read                        1,475.86   21.08   74.84
                   Wait     db file scattered read                           947.84   13.54   88.38
                   Wait     db file parallel write                           313.39    4.48   92.86
                   Service  Recursive SQL execution                          273.72    3.91   96.77

Between 20-21 (attention!! there is no 20:00 data which means problem is not between 21-22 but 20-21)
log file switch (archiving needed) is 48 percent of our response time. Which is very expectable if there was an archiver hang
problem on that day.

Lets cross-check if there was a snapshot at 20

SQL> @sp_stats_snap sysdate-14 sysdate-13

SNAP_ID SNAP_TIME
---------- -----------------
...
...
      7890 20100115 19:19:18
      7891 20100115 19:49:18
      7892 20100115 21:05:16  ----where is 20:00
      7893 20100115 21:35:18
      7894 20100115 22:05:18
      7895 20100115 22:35:18
      7896 20100115 23:05:18
      7897 20100115 23:35:18

The reason our report shows the waits between 21-22 because query checks the time when the snapshot gathered so it assumes those happened at 21,
Tim wrote the queries for hourly interval so it sometimes mislead you in this kind of situations be carefull and pay more attention.

Alert log also says the problem started around 20:05 and ended at 21.

Fri Jan 15 20:00:45 2010
Errors in file j:\data\oracle\trace\prod01\bdump\prod01_arc1_7976.trc:
ORA-19816: Message 19816 not found; No message file for product=RDBMS, facility=ORA; arguments: [db_recovery_file_dest]
ORA-27044: Message 27044 not found; No message file for product=RDBMS, facility=ORA
OSD-04008: WriteFile() failure, unable to write to file
O/S-Error: (OS 112) There is not enough space on the disk.
...
...
...
...
kcrrdmx: Successful archiving of previously failed ORL
Archiver process freed from errors. No longer stopped
Fri Jan 15 21:05:15 2010

So for the exactly same issue (maybe not ) our DB did not cry for “log file switch (archiving needed)” why ?

I am thinking about possible answers but I need to test meanwhile maybe you want to share your ideas
before I wrote the 3th part for my answer?

Today I kept my promise about giving the reason but I have to close an answering post with another question
which is the best part of troubleshooting, it never ends.

Moral of the story, “Check the alert log”

January 8, 2010

Object without object_id

Filed under: Diary — coskan @ 4:44 pm

This is probably not new to most of you who checked the definition of DBA_OBJECTS view, but it was new to me, and I wanted to share.

Normally in documentation DBA_OBJECTS is referenced to ALL_OBJECTS view as all other DBA_XXX views, and OBJECT_ID column in ALL_OBJECTS view is NOT_NULL like below, but DBA_OBJECTS does not have any information about nullability

What I was trying was creating a test table created with CTAS from dba_objects  and tried to create a primary key on OBJECT_ID but I couldn’t because of having nulls. When I checked I found that DB_LINKS don’t have object_id.

 
 
SQL> desc all_objects;
           Name                            Null?    Type
           ------------------------------- -------- ------------------
    1      OWNER                           NOT NULL VARCHAR2(30)
    2      OBJECT_NAME                     NOT NULL VARCHAR2(30)
    3      SUBOBJECT_NAME                           VARCHAR2(30)
    4      OBJECT_ID                       NOT NULL NUMBER
    5      DATA_OBJECT_ID                           NUMBER
    6      OBJECT_TYPE                              VARCHAR2(19)
    7      CREATED                         NOT NULL DATE
    8      LAST_DDL_TIME                   NOT NULL DATE
    9      TIMESTAMP                                VARCHAR2(19)
   10      STATUS                                   VARCHAR2(7)
   11      TEMPORARY                                VARCHAR2(1)
   12      GENERATED                                VARCHAR2(1)
   13      SECONDARY                                VARCHAR2(1)
   14      NAMESPACE                       NOT NULL NUMBER
   15      EDITION_NAME                             VARCHAR2(30)

SQL> desc dba_objects
           Name                            Null?    Type
           ------------------------------- -------- ------------------
    1      OWNER                                    VARCHAR2(30)
    2      OBJECT_NAME                              VARCHAR2(128)
    3      SUBOBJECT_NAME                           VARCHAR2(30)
    4      OBJECT_ID                                NUMBER
    5      DATA_OBJECT_ID                           NUMBER
    6      OBJECT_TYPE                              VARCHAR2(19)
    7      CREATED                                  DATE
    8      LAST_DDL_TIME                            DATE
    9      TIMESTAMP                                VARCHAR2(19)
   10      STATUS                                   VARCHAR2(7)
   11      TEMPORARY                                VARCHAR2(1)
   12      GENERATED                                VARCHAR2(1)
   13      SECONDARY                                VARCHAR2(1)
   14      NAMESPACE                                NUMBER
   15      EDITION_NAME                             VARCHAR2(30)
 

When I check the definition of DBA_OBJECTS, I saw that DB_LINKS are added with a union to the view without an object_id

 
.....
.....
.....
union all
select u.name, l.name, NULL, to_number(null), to_number(null),
       'DATABASE LINK',
       l.ctime, to_date(null), NULL, 'VALID','N','N', 'N', NULL, NULL
from sys.link$ l, sys.user$ u
where l.owner# = u.user#;
 

I check ALL_OBJECTS again and could not find any DB_LINK listed in  ALL_OBJECTS. In my understanding DB_LINKS are not even objects, at least they don’t deserve to have an OBJECT_ID from Oracle’s point.

December 31, 2009

Summary for 2009, Plans for 2010

Filed under: Diary — coskan @ 5:44 pm

2009  was a bit silent year for me I did not find interesting topics to write, but I read too many interesting and new topics as you follow from blogroll report posts and book reviews. I think I did a good job with blogroll reports  by them. I now have a very nice indexed Oracle Blogosphere for half of 2009. Blog reached over quarter million hits (when I say 250 000 it is not looking that big 🙂 )  with average 400 visits a day.   By your support,  blog provided over 2500 gallons of clear water for people around the world,  I want to thank you all who visited and supported the social vibe link at the bottom of the page.

From career perspective, In 2009 I mostly focused on performance tuning and backup recovery.  I fully understood how Oracle Engine works in terms of performance management. I quit using TOAD and started to use sqlplus for %90 of my daily routine. I love  Toad but i can’t stand slowness of it and after Tanel Poders script library plus my own scripts I find it far more easy to use sqlplus.

Because of dealing with performance,  I started to be  more  obsessive about anything running slow and this obsessiveness made me quit Windows OS  start using Ubuntu 9.10 Karmic Koala as desktop. After 3 months of usage, I am quite happy about my decision. By Running Ubuntu on  my new Quad Code CPU and 8GB ram desktop I can say that I now have my dream speed and lab. Migration was not painful as much as I expected I can do 95% + more of the things I was able to do on windows,  I  wish I have the same option with the laptop I am using for job but no chance.

I also quit firefox and sage rss reader and start using chrome as browser  and google reader as reader with igoogle I highly recommend this combo to all of you.

For networking , I am still a bit shy when it comes to socializing at user group events. Normally I am a very social bloke, but I have a problem of introducing myself at meetings 😦  it is something i need to work on harder.

I always had a fear of giving  presentation even in Turkish and to sort  this problem, I got a course for presentation skills in 2009 and I learned all the tricks of a good presentation. I also did a presentation at the course and  I proved to myself that I am able to do presentation in English.  Now I need a good topic to prepare a presentation to test my new skills.

I think thats about it for 2009 summary, and  now it is time to talk about 2010 plans.

For the blog, I am planning to share more information in this blog from the sources I follow and read and from my own researches on top of them. I  want to write a tool maybe a plugin for swingbench to generate common performance problems for new DBAs to test their troubleshooting skills. I also want to write  possible crash  recovery scenarios series.

For Oracle knowledge, I plan to learn Streams and Analytic functions. It is a shame for me that I never really cared much about these two.

For the books,  I plan to read the ones below and I hope upcoming Apple Tablet will be affordable price to be my ebook reader.  I wish I will finish at least half of them.

  • Practical Oracle 8i from Jonathan Lewis– I am in the middle of this book and what I discovered so far is that most of the information Jonathan shares from his blog is based on this book. I can say it is a must.
  • Expert Oracle Practices Oracle Database Administration from the Oak_Table– I read two chapters so far and to be honest it is not as good as I expected but book is in alpha stage so my thoughts can still change.
  • Scaling Oracle 8i by James Morle
  • SQL Tuning by Dan Tow
  • Relational Database Index Design and the Optimizers by Tapio Lahdenmaki and Michael Lech
  • RMAN Recipes for Oracle Database 11g by Darl Kuhn , Sam R. Alapati , Arup Nanda
  • Oracle Data Guard 11G Handbook by Larry Carpenter , Joseph Meeks , Charles Kim , Bill Burke , Sonya Carothers , Joydip Kundu , Mike Smith , Nitin Vengurlekar
  • Linux Recipes for Oracle DBAs by Darl Kuhn, Charles Kim, Bernard Lopuz
  • Secrets of the Oracle Database by Norbert Debes
  • Oracle Automatic Storage Management: Under-the-Hood & Practical Deployment Guide by Nitin Vengurlekar , Murali Vallath , Rich Long
  • Pro Oracle Database 10g RAC on Linux: Installation, Administration, and Performance by Julian Dyke , Steve Shaw

I never be active enough at Oracle Forums and Oracle-L and they are also on top of my to-do list this year.

For linux desktop , I need to get used to use vi for every kind of text editing and because I don’t have system admin group luxury at home, I need to learn more about managing linux as a system admin.

I will start to give internal presentations to my team mates if it works maybe you can watch a presentation of me after second half of the year.

Well these are my plans of 2010 and I hope next year on 31st of December I will still be here and telling you that all of my plans went well.

Wish you and for the ones you love,  a very good year.

December 1, 2009

UKOUG 2009- My Agenda

Filed under: Diary — coskan @ 5:32 pm

Everybody is posting about their agenda so  I think  it is my turn.

I’m pretty sure , I will come up with something to share with you.

 

10:05 – 12:05 – Practical Oracle Capacity Planning – Tanel Poder

12:35 – 13:35 – The Oracle Wait Interface is Useless (sometimes) James Morle-Tanel Poder

13:50 – 14:50 – Parallel Processing -Christian Antognini  or Latch and Mutex Contention Troubleshooting- Tanel Poder

15:15 – 16:00  Anatomy of a SQL Tuning Session Wolfgang Breitling

 

With this agenda, I have to miss  “How to backup and recover enormous databases”  by Husnu Sensoy  and “Good indexing-Show CBO where to go” by Piet de Visser (I hope I can catch his presentation at some of the SIGs,   according to me he is one of the best presenters of Oracle World I watched so far very entertaining and teaching) so I will be a bit sorry at the end of the day. I really hate to choose between parallel sessions 🙂

Hope to catch some of you there.

November 30, 2009

Do you check your script library?

Filed under: Diary, Security — coskan @ 5:52 pm

I’m wondering how do you secure your own script library ?

If you love to do things by calling scripts  isn’t there a big one way trust for security of your own machine ?

Have you ever think of  your own machine hacked by a  hacker who knows how to be invisible and how to write dangerous  sqls as well ?

what happens if your basic v$session query scripts  suddenly tries to drop, deletes or updates something ?

I start to double check every script I run but still worried and want to hear about your solutions ?

November 26, 2009

ORA-02291: integrity constraint (RMAN.RLH_F1) violated

Filed under: Backup/Recovery, Diary — coskan @ 4:38 pm

On one of our boxes, we start to get the error stack below when we want to do a crosscheck with catalog database in use.

 

RMAN>crosscheck archivelog all;

released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=357 devtype=DISK
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of crosscheck command on ORA_DISK_1 channel at 11/26/2009 15:08:05
ORA-19633: control file record 8403 is out of sync with recovery catalog

RMAN>resync catalog; 

starting full resync of recovery catalog
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03008: error while performing automatic resync of recovery catalog
ORA-02291: integrity constraint (RMAN.RLH_F1) violated - parent key not found

I unregistered the DB and register it but after it tries to resync the catalog during registration, error came up like below .

RMAN>unregister database;

database unregistered from the recovery catalog

RMAN>register database
database registered in recovery catalog
starting full resync of recovery catalog
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03008: error while performing automatic resync of recovery catalog
ORA-02291: integrity constraint (RMAN.RLH_F1) violated - parent key not found

 

I could not find a metalink note, but a working solution from forums.oracle.com

Basically, you need to run unregister it by running undocumented internal (Update: I did not check metalink it is documented in 1058332.6)!!!!! DBMS_RCVCAT.UNREGISTERDATABASE procedure on RMAN catalog DB. (ask Oracle support before you start doing something)

 

 

SQL> select 'EXEC DBMS_RCVCAT.UNREGISTERDATABASE('||DB_KEY||','||DBID||');'
3   command, resetlogs_time
4  from rc_database
5  where name = '&db_name';
Enter value for db_name: XXXXXX

COMMAND                                                      RESETLOGS_TIME
------------------------------------------------------------ -----------------
EXEC DBMS_RCVCAT.UNREGISTERDATABASE(5626305,201074392);      20071130 11:57:12

 

 

Run the command if the resetlogs time is right and it will unregister your database from catalog.

Re-register your DB ( RMAN>register database) , if you are lucky as me it will register and sync the catalog without a problem.

September 10, 2009

Oracle Performance Firefighting by Craig Shallahamer

Filed under: Book Reviews, Diary, Performance — coskan @ 3:42 pm

Today I am going to write about an amazing book Oracle Performance Firefighting by Craig Shallahamer, which was released in July 2009. After all the papers I read from Craig, my fingers were crossed for the release date and I ordered the book for our company immediately after release.  The reason I was very excited about the book is that  I always feel uncomfortable with overall system performance tuning and this is the area where Craig looks  very good at.

During the review, I will try to cover what this book promises and gives us by going through  all the chapters one by one. For overall, this book is not for sql tuning  It is about how to find the problem and giving possible other solutions for common problems ( in addition to tuning your code as a first option.) by giving the information about how Oracle works in most problematic areas like buffer cache, shared pool, redo, undo and especially  latches and mutexes.

Before starting to go through chapters, I want to say that the book is very well organized  and the way Craig teaches avoids boredom about numbers and math.   I could not  leave the book even on holiday.  The way he covers the very hard to understand topics brings curiosity about the next topic.  One more thing is the number of usage of the word OraPub (his company) kept minimum.  While reading  Optimizing Oracle Performance by Cary Millsap and Jeffrey Holt, I really hated the word Hotsos because  it was nearly on every page and that was very annoying. If you are curious about this problem it is not happening that much in this book.

Lets start with going through chapters (Chapters with * are the ones that really needs attention)

1- Methods and madness: This is the introduction to how to be a firefighter from both methodic and holistic perspective. Craig gives information  What you need during Oracle firefighting and how you can build a methodology . He covers the basics of analysing the system as a whole under the name of  OraPub 3 Circle Analysis.  When I first read about the OraPub 3 Circle Analysis I did not expect much but after a few chapters I admit that it works quite fine.  Most interesting topic to me was how to write the summary of your firefighting story.  That is something I need to work on.

2- Listening to Oracle’s Pain: This chapter tries to explain how to understand what Oracle is crying for by using Oracle Wait Interface and ORTA (oracle response time analysis). It covers how Oracle kernel developers instrumented their code for creating Oracle Wait Interface. It wasnt something I did not know but still nice to remember most them.

3- Serialization Control (*): This chapter is where this book starts to rock about explaining what is going on in Oracle by telling how Latches Mutexes and Locks work and differ from each other.  After finishing this chapter I believed that the book already worth the money I paid and there were 6 more chapters to go. Craig definitely knows how Oracle works and better than just knowing he knows how to explain it like latches for dummies.  It is really hard not to remember or not understand what he is explaining.

4- Idenfying and Understanding Operating System Contention: This chapter tries to cover how to use unix/linux tool to gather information from operating system to do the operating system circle of OraPub 3 Circle Analysis. Apart from using tool it is very good about the way Craig teaches about how to talk and convince Network, OS and Storage guys in case of a problem without making their nerves. I personally liked very much the word Craig uses like “there is a memory pressure” instead of “paging” 🙂

5- Oracle Performance Diagnosis: This chapter is going deep into how to gather overall performance related data from Oracle and how to interpret the data you gathered without going to compulsive tuning disorder. He covers the wait event myths section which was very interesting and he tries explain why profiling is not “always” the best approach. Making the most out of DBMS_MONITOR  tool is also covered well. Craig also gives internals about ASH (Active Session History). How it works why it is good and how to use it. Craig also gives the simple trick about how to learn Oracle internals which is reading Oracle patents also search oracle in (http://www.faqs.org/patents/asnl/op).

6- Oracle Buffer Cache Internals(*): Till I read this book, I never dreamed  about that, one day I will really fully understand and be able to explain to others, how  Oracle buffer cache, shared pool and latches works, despite all official documentation and the books I read so far. This and upcoming 2 chapters  gave me this opportunity and I want to thank Craig very much for explaining these topics this well. Diagrams and way he explains are totally amazing.  Chapter is not just explaining buffer cache it is also how to tune it.  Knowing the internals of Buffer Cache can be the only reason to have the book because it is nearly the main thing for optimal performance. I really like too much about latch and enqueue subtopics.

7- Oracle Shared Pool Internals(*): This is another reason that makes this book unique.  I already wrote my own compilation about shared pool in a blog post but this chapter taught me many more things like lathes and mutexes in shared pool , In memory undo   and how to tune them. This chapter is also a must to learn to do proper performance firefighting.

8- Oracle Redo Management Internals(*): In this chapter Craig gives how redo management works in Oracle and how to tune redo related issues which is again very important performance related topic.  He goes over  every redo related problem and gives possible solutions to them. Learning the dangerous commit_write parameter usage was  completely new to me.

9- Oracle Performance Analysis(*): In this chapter every topic so far is integrated with each other in  a possible problem scenario. Before going into the scenario Craig cover response service and queue time calculations and prepares us to scenario. In this chapter Craig starts an Overall system performance tuning in OraPub three circle analysis by analysing Oracle OS and Application together. He goes into the analysis three times and this gives reader the clues when and where  to or not to stop tuning.  This chapter totally depends on math and to be honest it is really easy to understands

Now it is time to talk about what I don’t like about the book. Craig introduces and explains too many internal parameters and he gives them as an option in case there is a problem. I loved to learn them because it is very helpful when you talk with Oracle support and also they are part of understanding Oracle but I prefer he mentioned more about not to use them before asking Oracle . He says not to use them couple of times but still not enough to me.   Second thing is that we don’t  have option to download the test case codes Craig mentions apart from OraPub monitoring kit.  Third there is no index at the end of the book.Although I don’t use indexes that much but still nice to have.  One last thing because it is only available from OraPub you might wait a bit if you are ordering outside of US like me, I wish there was an e-book option which I always find it easy but probably it is because of avoiding pricy.

After negative part lets come to the conclusion. This book is very informative very well written and in my humble opinion, a must on every DBA’s desk and I suggest it to every DBA. If you are experienced and read too many performance tuning books and feel comfortable with performance tuning but still struggling to understand latches mutexes buffer cache shared pool  or overall performance tuning and if you want to fully understand them Oracle Performance Firefighting is totally for you. If you are a new for Oracle Performance Tuning,  this book is a must to understand how Oracle works but please be careful about hidden/unsupported parameters.

My only regret is not to order 2 copies one for company and one for me. 🙂 I hope you will like the book as much as I liked.

Many thank to Craig Shallahamer for bringing  deep knowledge and experience about Oracle Performance tuning and sharing in this very special book.

July 2, 2009

UKOUG DBMS SIG July 2009

Filed under: Diary — coskan @ 12:06 pm

According to me, the best part of being in UK as a DBA is, to have the opportunity to join UKOUG events.   I started to join SIG events this year and DBMS SIG July 2009 was the third and the best one so far, I joined as a delegate.

Here is my review of presentations and the event.

Upgrading Oracle Estate by Phill Brown.

This was a lucky presentation for me, because it was related with Oracle Failsafe, which we are currently having problems with. Phill explained the issues they faced during upgrading Oracle Failsafe, Oracle Database and filestructure  on windows clusters at the same time at one of their clients .  My highlights from the presentation are  some metalink notes like   Failsafe Errors (108442.1)  , Oracle DB and Windows memory (46001.11)  and 10G Agents on Failsafe (330072.1).   I asked Phill if they tried to configure DB Console apart from grid configuration but he said they did not try it. The problem we are having with Oracle Failsafe is, if you have your Oracle Home on local disks, DB Console is having problems with Oracle Failsafe, because its configuration repository sits in Oracle Home and for that reason Oracle suggest to put Oracle Home to network drive but that causes other problems to us.  I wish they found a solution during this project but I am not lucky enough 😦

Oracle Support June Update by Phil Davies

This was the second time I listened this presentation series from Phil and again it was very helpful.  I think  He is from Oracle support and his summary of available patches and patch bundles are very good for the ones who doesn’t have time to look available patches from metalink.
First news is that 10.2.0.5 will be terminal patch release and probably will be available in 2009 Christmas .
He mentioned that one of his clients having a very fast growing repository problem after 10.2.0.5 EM Grid Control release. He warned us not to collect everything available in templates, and  instead collect what you actually need, to not have this problem.

He gave a good list of available patches, bugs and Here are the ones caught my eye at the first place

Generic Support Status Notes  (strongly recommended to keep an eye on  notes below)

  • For 11.1.0   Note id  454507.1
  • For 10.2.0   Note id  316900.1
  • For 10.1.0   Note id  263719.1
  • For 9.2         Note id  189908.1

Dataguard Merged Patches

11.1.0.7
Physical/redo Note 7676737.8
Logical Standby Note 7628387.8
Data Guard Broker Note 7628357.8
10.2.0.4
Logical standby Note 7937113.8
Physical Bundle #1 Note 7936993.8
Broker Recommended Note 7936793.8
10.2.0.3
Physical/Redo Transport Patch 6081547
Logical Standby/Logminer   Patch 6081550
Data Guard Broker Patch 6048286
RMAN 10.2.0.3 Bundle Patch 6081556

11.1.0.7

  • Physical/redo Note 7676737.8
  • Logical Standby Note 7628387.8
  • Data Guard Broker Note 7628357.8

10.2.0.4

  • Logical standby Note 7937113.8
  • Physical Bundle #1 Note 7936993.8
  • Broker Recommended Note 7936793.8

10.2.0.3

  • Physical/Redo Transport Patch 6081547
  • Logical Standby/Logminer   Patch 6081550
  • Data Guard Broker Patch 6048286
  • RMAN 10.2.0.3 Bundle Patch 6081556

Recommended Patch Bundles  Note 756388.1

Bits and Pieces
This time slot is arranged for free talking about experiences of delegates and it was very successful.  Chris Dunscombe from SIG committee  did a small presentation about the problem they had with Delayed Cleanout and ORA-1555 problem.  End of his presentation there was a perfect discussion between Joel Goodman and Jonathan Lewis.  I was sitting in the middle  and they were at right and left edge of the room.  It was like watching Wimbledon Final from the middle of Center Court.  I think  Jonathan Lewis will cover this issue but I still plan to write a post to cover it more detailed.
DB Links Master Class Part 1 by Joel Goodman

This was the second time I watched  this part of the Master Class . I was in Edinburgh DBA SIG when I first watched  this presentation  and it was just after AOT by Tanel Poder,  so I couldn’t not focus that much.
This time I listened better and I plan to review it as another post, after listening the second part .

Graphing AWR Data in Excel by David Kurtz

This was presentation of this blog post.  His approach has the smilar idea of PerfSheet of Tanel Poder which is using excel instead of getting lost in a pool of  the metric values. If you ask me I am happy with PerfSheet.

Lunch

Best lunch I had at any SIG so far 🙂

Row Migration can Aggravate Contention on Cache Buffer Chain Latch.

This is again presentation for David’s another blog post.  This presentation was about using method from the previous presentation to catch the abnormality. I suggest you to read the blog post for more info. I think missing part of the story was, how come he got the idea to check row migration after detecting the abnormality.  David said it was idea of the DBA which does not depend on any hot block analysis.

How to Read Your Statspack/AWR Report  by Jonathan Lewis

Perfect as previous 3 presentations I watched from Jonathan Lewis.  He covered 5 Statspack reports in 1 hour time which were brought by delegates. It was very nice and a bit hard  to watch him while he is doing his job. Although  This session was very helpful,  I think you need to have 20+ years of Oracle Experience with supporting Math Degree to catch the possible problems as fast and as right as  he does.  To cover more you need to start with this blog post of him and read all 11 of his  series.  One good advice I caught, do not create index in last update date columns this will cause high redo generation because of index leaf splits if the row has high updates.

I found chance to ask Jonathan’s advices for  DBA who started after 9i (when features of Oracle DB and the Documents is too big comparing to v5) . Here are his advices

1- Read concepts guide and Admin guide, min of 2 times to cover the topics. Try the thing you learn

2- Spend time on OTN forums. You don’t have to reply just try to solve other peoples problems. Don’t give up and chase the problems which means have the enthusiasm to wake up at 3am 🙂

3- Do not focus on internals because 99% of the time they are useless

4- I asked him how many hours he sleeps, his answer was 6.  1 more than I thought 🙂

Thanks to the SIG committee and sponsors (Guardian)  for this nice event.  I hope one they I will be brave enough to do a presentation in English at one of these SIGS.

May 20, 2009

TOP by Christian Antognini

Filed under: Book Reviews, Diary, Performance — coskan @ 4:29 pm

I finally finished “reading” of Troubleshooting Oracle Performance by Christian Antognini.

The book is already reviewed as a blog post by many other bloggers like Carry Millsap, Jonathan Lewis (they  also wrote forewords for the book), Jason Arneil and  Tonguc Yilmaz.

Now it is my turn to say something about this invaluable book.

First of  all this review is based on what I read because I read the book mostly when I am on train and I did not try %90 of what is written on the book yet. After I go over the book again and try all the scripts,  I will write more posts about the book which is going to be, what I learned from TOP by Christian Antognini series.

The book is  very well structured and ordered,  you don’t loose yourself by references to upcoming topics that much.  It is not written by a native English speaker,  so you don’t struggle with language,  it is simple and easy to understand as much as it can be.  Everything he explained is supported by a proof. The best part of this book among other two performance related bibles (Cost Based Oracle Fundamentals and Optimizing Oracle Performance which are also very much recommended) you never get lost from the math. I graduated as an Computer Engineer but sometimes I really struggle to understand where all those numbers come from when I look at the other books.   the way Christian explains the Math in TOP is very understandableand there is no way you can complain about too much math in this book.

Apart from all these general readability issues, TOP gives you a method for designing a better application and a method to solve most of the common problems on Oracle Databases.  It covers all available supported Oracle versions, so you also find chance to see the differences on Oracle Troubleshooting and optimal design from 9i to 11g.

What TOP doesn’t give,  is magic. If you expect magical internal ways for troubleshooting Oracle, this book is not right for you. You need to look other definitive books 🙂

Christian covered nearly everything ,  The only problem is that ,he doest give step by step approach like Tanel does. It would be excellent, if Christian add one more chapter and give step by step approach in case of a problem. Although Last chapter does something smilar , I still think it is a bit small.

I have a dream that Christian and Tanel work on a book together 🙂 It would be great to have a  book which covers nearly everything you can face while troubleshooting Oracle. Coincidentally their masterclasses at UKOUG were one after the other and during reading this book I attended Tanel’s class,and I found the chance to compare them twice, My conclusion is,  I think their work deserves to be combined. Just a feedback from audience 🙂

Last word about this book it is excellent piece of work. I recommend it to everyone who wants to learn Troubleshooting Oracle Performance and optimal Oracle design for better performance.

Many  Thanks goes to  Christian for sharing his work with community.

Older Posts »

Create a free website or blog at WordPress.com.