Feed on
Posts
Comments

It is sunny Saturday for London and I just stopped struggling with our central machine which has Quest Big Brother (pretty handy tool with less price then HP Open View)and an Oracle Database for our centralized checks running on it .

Before stopping by office, All I wanted was using internet banking securely, but could not ignore the errors on BB Screen for not being entitled as an invalid DBA :)

When I checked the BB Screen everything was in unchecked status. At first I thought Oracle instance was down. I checked the instance and it was alive so I logged on the machine to see whats going on. It is a Windows 2003 Box with enough hardware. Memory was fully utilized and there were 180 NMUPM.EXE process explorer running with 11MB memory usage each. (I use to see windows processes instead of task manager so I can investigate more deeply about a running process. ) Process Explorer was saying that it was an Oracle Enterprise manager related process. This was the first time I saw this process. I searched through metalink and saw that NMUPM.EXE sometimes utilize CPU 99 percent and sometimes spawns too many processes. CPU utilization is caused because of .Net Framework installation NoteID:203224.1 (not my case), spawning is caused by “pdh.dll (Performance Data Helper) found in $ORACLE_HOME/bin which contains APIs to collect performance data from Windows Environments. It is spawned after EM usage Note ID:436138.1. There have been other bugs in the past about nmupm performance based on other dlls being registered with pdh.”

The solution was removing this pdh.ddl and restarting the DB Console service.

The steps I tried to solve the problem;

1- Tried to stop DB Console service. (did not stop)

2- Killed the DB Console Service but it did not helped to kill the other NMUPM.EXE processes.

3- Tried to kill NMUPM.exe but got bored of one by one killing limitation of both task manager and process explorer.

4- Tried to restart eh machine from remote desktop. It did not work. Machine was alive and I lost my option to log on the machine via remote desktop. Tried to stop services by remote services administration option of windows but did not worked

5- Tried to do shutdown command for remote machine like below. It said the machine is shutdown in progress :(

shutdown -r -m \\uklondmeyregdu1

5- I dont have option to go server room and reset server via button so I Did some googling for how to see / kill processes running on remote machines. What I found is magical and very very useful command line tools named tasklist and taskkill. I checked the running processes by tasklist like below

tasklist /s \\REMOTESERVER /u DOMAIN\USERNAME

After I saw that processes are still running I ran the best command ever (for me)on windows, TASKKILL, which allows multiple killing with one command by just giving the image name

taskkill /s \\REMOTESERVER /im nmupm.exe /u DOMAIN\USERNAME

It killed all the nmupm.exe processes and restart kicked at the end.

I feel embarrassed about now knowing or searching for this two commands especially the taskkill before. Maybe you know it for ages but it is worth to stay at office and write on this sunny London Saturday.

In conclusion I still hate to run Oracle on windows :) however, it is really good to know how to use Windows effectively.

Database Version : 10.2.0.3 Windows

Resources Used;

Metalink Note ID:436138.1.

Windows Reference

Today I got the error below when I try to run crosscheck command for one of our databases

It was weird to get ORA-19554: error allocating device, device type: SBT_TAPE, device name: error for this operation because it says device_type=disk and I was pretty sure that default device type is configured to disk because we are not doing tape backups. when I check with show all the default device type was disk as shown in the output below.

What was the reason. At first, I was thinking having default compressed backup set on disk channels causes this problem but After a search I found this thread on Oracle Forums I saw that The line which says

CONFIGURE CHANNEL DEVICE TYPE ‘SBT_TAPE’ MAXPIECESIZE 4 G;

causes this problem even if you set default device type. RMAN configuration file was updated through enterprise manager. When you set 4GB backup piece through manager it updates configuration file for both.

When you remove this configuration with the command below it will let you to do maintenance over disk channels.

CONFIGURE CHANNEL DEVICE TYPE ‘SBT_TAPE’ CLEAR;

Crosscheck command now works through disk channel

Edited: If you dont want to remove this parameter and still want to do crosscheck with disk channels you should try the following command.  (if you doing this over gui you have to write this command instead of the one produced by EM)

ALLOCATE CHANNEL FOR MAINTENANCE DEVICE TYPE DISK;

CROSSCHECK BACKUPSET;

Thanks to Michael T. Dihn for his comment.

Database Version is 10.2.0.2

Tonguc who is a milestone for my Oracle career, tagged me in his blog. What I understand from tagging is, after you give 8 information about yourlife, you tag someone else but I won’t tag anyone because I am always the last stop of forwarding :)

Here are my secrets.

1- I was born in Turkey years ago. Nobody in my life related with IT and to be honest, I touched computer for the first time in University when I was 18. My childhood dream was to be a James Bond, but now I am a DBA with only one gadget named Oracle… Maybe it is the only magical gadget I should need to be James Bond :)

2- if I am not dealing with databases, My life is mostly in gym . If I have to make a choice I should choose to go gym. I am addicted to endorphine and gym makes me secrete endorphine more than the databases make.

3- I am the fan of Galatasaray the only football team of Turkey which is known by real success, not by transfers :)

4- I moved to London this year and I think I will live here forever. Yes, I love dark weather.

5- I will engage with my girlfriend on this Saturday. Sometimes you have to disable your firewall against marriage :) My old release Coskan_28_m will be replaced by Coskan_28_m_m (male_married) in first quarter of 2008.

6- I still cannot use English effectively , but as you see, this is not secret, it is just a confess to myself :) Anyway, I could not write in Turkish too. None of my teachers gave good grade to my essays :(((.

I dream a world on which only demonstrations talks on blogs :)

7- I was an ex fan of Definitive Oracle Guru till 2006 the year that I started to test what I read.

8- Sometimes I think to sell my non-existent ferrari, but I think, first I have to be a monk and after that get permission from my future wife to go Tailand for education. (I started to think like a married man I think I am ready :) )

Thank you for following this blog. It is nice to be read by people around the world. I was’t expecting over 50000 hit in one year but I reached despite my English :) I hope you you get something from this blog.

I hit this ORA-02020 error for the first time today and it helped me to learn the truth behind the DB_LINK (I have to confess that I did not read before ). The truth is,” if you access a database link in a session, then the link remains open until you close the session. A link is open in the sense that a process is active on each of the remote databases accessed through the link.”

My case was;

I created a cursor and run a statement which uses all the DB_LINKS on the server. The number of db_links I used was more than 30. The code was getting the ORA-02020:too many database links in use error.

The definition and the solution of the error was simple

Cause: The current session has exceeded the INIT.ORA open_links maximum.
Action: Increase the open_links limit, or free up some open links by committing or rolling back the transaction and canceling open cursors that reference remote databases.

I checked the open_links parameter (I was unaware that this parameter was existed) and saw that parameter had the default value which was 4 and cannot be modified online. I took the advised action and added commit clause before the cursor gets the new value for the db_link. This solution worked very well.

By the way Document says that, I can close a database link session with “alter session close database link XXXX” clause but it did not worked as I expected. You still need to do commit or rollback before closing a db_link session to do that.  Also  after commit, the session  stays  on the remote machine but the sessions are put on list to be closed when a new db_link opened. (list works last in first out (before commit) fashion as can be seen from demonstration ).

Cole Listing 9

The only benefit of alter session close database link is closing a link with knowledge otherwise LIFO will work after commit.

 References Used;

 Oracle® Database Administrator’s Guide 10g Release 2 (10.2)

Altering Without Privileges

There was a mail in Turkish Oracle Users group mailing list about “why a users can still alter a table despite all privileges has been revoked (only connect privilege is left). My answer was to double check if there is any privilege was left by selecting DBA_TAB_PRIVS for that user, but the best answer send by Yasin Baskan (Oracle Today). He claimed and proved that, if the objects are created/owned by that user he can still alter the objects without any privilege. Below is how he proved his claim and a little addition from me (privileges on different schema test).

First create a user with connect and dba privileges.

usercreate

Create tables with that user under his own schema;

create table

Create tables with that user under HR schema;

create table on hr

Revoke DBA privilage from user; (the costst user is must disconnect after revoke to connect with new privs)

revoke

Test ddl on table ;

ddl

He is still altering but can he create new ?

new

What about the table he created on HR schema, can he alter it ?He couldn’t as I expected because he doesn’t have access to that schema.

hrddl

What privileges user has ? (Checked by find_all_privs.sql script written by Pete Finnegan ). CONNECT and CREATE SESSION

privs

What if other one creates a table on coststs schema. He still has access  as you might guess

otheruser

The privileges comes from schema owning and this is the expected behaviour. You can not disable this option, so  you must think different ways  such as creating different user or moving tables under another schema, to prevent a user from  accessing tablesunder its schem.

Special thanks goes to Yasin  for this entry.

It has been over 2 months since my last entry, but i have reasons to be offline, like major location and job changes. I moved to UK (London) 2 months ago and joined to a huge organization. I am not just an Oracle DBA anymore, so you should see some SQL Server posts in this blog, please do not panic when you see them :) I am still in an endless love with Oracle.

First of all, I am sorry that I can’t write any adventures about UKOUG because one of my colleagues already arranged his attendance when I joined the company. I hope I will write my own adventures next year.

Now its time to write technical stuff. Today we faced “ORA-1652: unable to extend temp segment”. One of my team members asked my opinion about “why the user getting this error on a normal tablespace instead of TEMP tablespace”. He said, the user was trying to create a table by using “create table xxx as select” clause. At first I thought that this should not happen, but when I re-think, I realized that, this is behaviour is like index rebuild. “Create table as select” must be different then “insert into” clause. In my opinion, It first uses temp segments on a permanent tablespace and if it is successful then it converts the segments as permanent. (Updated: Metalink note 181132.1 Thanks YAS) Lets see how

User HR using USERS tablespace as default tablespace and TEMP as temporary tablespace;

tablespace

Size of the source table T1 is 264MB;

tablesize

Available free space on USERS tablespace is;

free space

Auto Extend option on USERS tablespace is off;

autoextend

So I don’t have enough space for new table T2. Lets see what error is raised with different clauses.

When I try to create table T2 from T1 I got error 01652 which is about using temp segment on default tablespace USERS.

unable to extent temp segment

When I try to insert data to T2 from T1 I got error 01653 which is about cannot allocate normal extent on default tablespace. (where 1=2 syntax is just for creating the same table structure without data)
normal segments

Lets try it about with index rebuild. As you might know you need free space with the same size of index when you to rebuld it.

First set auto extend on with 16MB extents;

Then create an index on T1;

create index

Size of the index is 43MB;

index size

Set auto extent off while free space on USERS tablespace is 6 MB;


autoext

When I tried to rebuild the index I expect to get temp segment error ORA-10252 on normal USERS tablespace, because it is really a temp usage.

rebuild

Moral of the story is, temp segments are not always located on TEMP tablespaces.

updated (11/12/2007): Taken from metalink note id 181132.1

4. Temporary Segments for Permanent Segments Creation
—————————————————–
Besides sort operations, there are other SQL operations, which also require
temporary segments:
–> CREATE PRIMARY/UNIQUE KEY CONSTRAINT
–> ALTER TABLE … ENABLE PRIMARY/UNIQUE CONSTRAINT
–> CREATE TABLE STORAGE (MINEXTENTS>1)
–> CREATE TABLE AS SELECT
    –> The CTAS creates a data segment in the target tablespace and marks this
        segment as temporary in dictionary. On completion, the dictionary type
        is changed from temporary to table. In addition, if the SELECT performs
        a SORT operation, temporary space may be used as for a standard select.
    –> For a Parallel CTAS statement, each slave builds its own data segment
        (marked as temporary in the dictionary) from the row source which feeds
        it.
        Similarly, for Parallel Direct Load or Parallel Insert, each slave
        process creates its own single temporary segment to load data into.
–> CREATE PARTITION TABLE
–> ALTER TABLE … SPLIT PARTITION
–> CREATE SNAPSHOT
–> CREATE INDEX
    The CREATE INDEX statement, after sorting the index values, builds a
    temporary segment in the INDEX tablespace; once the index is completely
    built, the segment type is changed to INDEX.
–> ALTER INDEX REBUILD
    During an index rebuild, besides the temporary segments used to store
    partial sort (segments built in the user’s default TEMPORARY tablespace),
    Oracle uses a segment which is defined as a temporary segment until the
    rebuild is complete.
    Once this segment is fully populated, the old index can be dropped and the
    temporary segment is redefined as a permanent segment with the index name.
    The new version of the index, currently a temporary segment, resides in the
    tablespace where the index is required. Note that the old index segment that
    is to be dropped is itself converted to a temporary segment first (like drop
    ping a table). Therefore, an index rebuild involves three temporary segments,
    one of which is a sort segment, that all may be located in different tablespaces.
–> DROP TABLE
    Oracle first converts the segment to a temporary segment, and starts
    cleaning up the now temporary segments extents. If the drop is interrupted,
    the temporary segment is cleaned up by SMON. If the SMON is interrupted by a
    shutdown abort, this may cause serious problem, and the total time to
    cleanup is increased.


Nowadays, I am working on performance tuning guide of 10gR2 and I found it is worthy to write about shared pool management after reading the metalink note Note:396940.1 Troubleshooting and Diagnosing ORA-4031 Error.

Note: This blog entry is my summary of the references at the bottom and to be honest I wrote this long entry for myself for not to get lost between those references again :)

Shared pool is used to cache different types of data such as textual and executable forms of PL/SQL blocks and SQL statements, dictionary cache data, and other data. If you use shared pool effectively you can reduce resource consumption in at least four ways

  1. Parse overhead is avoided if the SQL statement is already in the shared pool. This saves CPU resources on the host and elapsed time for the end user.
  2. Latching resource usage is significantly reduced, which results in greater scalability.
  3. Shared pool memory requirements are reduced, because all applications use the same pool of SQL statements and dictionary resources.
  4. I/O resources are saved, because dictionary elements that are in the shared pool do not require disk access.

Main components of shared pool are library cache (executable forms of SQL cursors, PL/SQL programs, and Java classes.) and the dictionary cache (usernames, segment information, profile data, tablespace information, and sequence numbers. ).The library cache stores the executable (parsed or compiled) form of recently referenced SQL and PL/SQL code. The dictionary cache stores data referenced from the data dictionary. This caches are managed by LRU algorithm to “age out” memory structures that have not been reused over time. Allocation of memory from the shared pool is performed in chunks. This allows large objects (over 5k) to be loaded into the cache without requiring a single contiguous area, hence reducing the possibility of running out of enough contiguous memory due to fragmentation. Starting with 9i The Shared Pool divide its shared memory areas into subpools. Each subpool will have Free List Buckets (containing pointers to memory chunks within the subpool ) and , memory structure entries, and LRU list. This architecture is designed to to increase the throughput of shared pool in that now each subpool is protected by a Pool child latch. This means there is no longer contention in the Shared Pool for a single latch as in earlier versions.

Infrequently, Java, PL/SQL, or SQL cursors may make allocations out of the shared pool that are larger than 5k then Oracle must search for and free enough memory to satisfy this request. This operation could conceivably hold the latch resource for detectable periods of time, causing minor disruption to other concurrent attempts at memory allocation. To allow these allocations to occur most efficiently, Oracle segregates a small amount of the shared pool. This memory is used if the shared pool does not have enough space. The segregated area of the shared pool is called the reserved pool which is also divided into subpools. Smaller objects will not fragment the reserved list, helping to ensure the reserved list will have large contiguous chunks of memory. Once the memory allocated from the reserved list is freed, it returns to the reserved list.

By using automatic shared memory management (ASMM) option available with 10G, which is activated by setting SGA_TARGET parameter with a value greater than 0, shared pool size is started to be managed by Oracle, under the limits of SGA_TARGET and other SGA components.

After these explanations lets start to explain how to manage shared pool with ASMM.

1- Using Shared Pool Effectively

avoid hard parsing by

  • using bind variables instead of literal values in your queries The script below can be used to find sqls which use literals
    SQL> SELECT substr(sql_text,1,40) “SQL”,
    2 count(*) ,
    3 sum(executions) “TotExecs”;
    4 FROM v$sqlarea
    5 WHERE executions < 5
    6 GROUP BY substr(sql_text,1,40)
    7 HAVING count(*) > 30
    8 ORDER BY 2
    9 ;no rows selected
  • Standardizing naming conventions for bind variables and spacing conventions for SQL statements and PL/SQL blocks.
  • Because they are stored, Consider using stored procedures whenever possible
  • Avoiding users from change the optimization approach and goal for their individual sessions.
  • Reducing the number of entries in the dictionary cache by explicitly qualifying the segment owner, rather than using public synonyms or try to connect to the database through a single user ID, rather than individual user IDs because Reducing the number of distinct userIDs also reduces the load on the dictionary cache.SELECT employee_id FROM hr.employees WHERE department_id = :dept_id;
  • Using PL/SQL packages when your system has thousands of users, each with individual user sign-on and public synonyms because a package is executed as the owner, rather than the caller, which reduces the dictionary cache load considerably.
  • Avoid performing DDLs in peak hours because ddl operations invalidate the dependent SQLs and cause hard parsing when the statement called again.
  • Cache the sequence numbers by using CACHE keyword of CREATE/ALTER SEQUENCE clause to reduce the frequency of dictionary cache locks,
  • Try to avoid closing of rapidly executed cursors
  • Check for hash values which maps different literals. The query below should return no rows otherwise there is possibility for a bug
    SQL> SELECT hash_value, count(*)
    2 FROM v$sqlarea
    3 GROUP BY hash_value
    4 HAVING count(*) > 5
    5 ;
    no rows selected

3- Identify which sqls are using lots of shared memory

SQL> SELECT substr(sql_text,1,20) “Stmt”, count(*),
2 sum(sharable_mem) “Mem”,
3 sum(users_opening) “Open”,
4 sum(executions) “Exec”
5 FROM v$sql
6 GROUP BY substr(sql_text,1,20)
7 HAVING sum(sharable_mem) > 1426063 –%10 of Shared Pool Size
8 ;

Stmt COUNT(*) Mem Open Exec
/* OracleOEM */ SEL 18 1445971 2 54

This should show if there are similar literal statements, or multiple versions of a statements which account for a large portion of the memory in the shared pool.

4- Identify which allocations causing shared pool to be aged out

spool ageout.txtSELECT *FROM x$ksmlru

WHERE ksmlrnum>0

;

spool off;

This select returns no more than 10 rows and then erases the contents of the X$KSMLRU table so be sure to SPOOL the output. The X$KSMLRU table shows which memory allocations have caused the MOST memory chunks to be thrown out of the shared pool since it was last queried. This is sometimes useful to help identify sessions or statements which are continually causing space to be requested.

5- Why are there multiple child cursors.

V$SQL_SHARED_CURSOR explains why a particular child cursor is not shared with existing child cursors. Each column identifies a specific reason why the cursor cannot be shared.

SELECT SA.SQL_TEXT,SA.VERSION_COUNT,SS.*FROM V$SQLAREA SA,V$SQL_SHARED_CURSOR SSWHERE SA.ADDRESS=SS.ADDRESS

AND SA.VERSION_COUNT > 5

ORDER BY SA.VERSION_COUNT ;

6- Monitor Shared Pool sizing operations

You can see the shrinking and growing operations from V$SGA_RESIZE_OPS dynamic view and you can guess why there is need for this operations by focusing the sql at the sizing operation times.

SQL> select to_char(end_time, ‘dd-Mon-yyyy hh24:mi’) end, oper_type, initial_size,
2 target_size, final_size from V$SGA_RESIZE_OPS
3 where component=’shared pool
4 order by end;

END OPER_TYPE INITIAL_SIZE TARGET_SIZE FINAL_SIZE
12-Sep-2007 19:05 STATIC 0 134217728 134217728
12-Sep-2007 22:01 SHRINK 134217728 130023424 130023424
13-Sep-2007 11:35 SHRINK 130023424 125829120 125829120
13-Sep-2007 11:36 SHRINK 125829120 121634816 121634816
13-Sep-2007 22:08 GROW 121634816 125829120 125829120
13-Sep-2007 22:09 GROW 125829120 130023424 130023424
13-Sep-2007 22:10 GROW 130023424 134217728 134217728
13-Sep-2007 22:12 GROW 134217728 138412032 138412032
14-Sep-2007 09:49 GROW 138412032 142606336 142606336
14-Sep-2007 16:13 GROW 142606336 146800640 146800640

10 rows selected.

7- Minimum Size of Shared Pool

Current size of the shared pool;

SQL> select bytes from v$sgainfo where name=’Shared Pool Size’;BYTES

——————–

138412032

You can use the sizig advices from the view v$shared_pool_advice. This view displays information about estimated parse time in the shared pool for different pool sizes and the sizes range from %10 to %200 of current shared pool size. This can give you idea for sizing SGA and obliquely shared pool by the help of ASMM.

select * from V$SHARED_POOL_ADVICE;

Suggested minimum shared pool size;

SQL> select cr_shared_pool_size,
2 sum_obj_size, sum_sql_size,
3 sum_user_size,
4 (sum_obj_size + sum_sql_size+sum_user_size)* 1.3 min_shared_pool
5 from (select sum(sharable_mem) sum_obj_size
6 from v$db_object_cache where type<> ‘CURSOR’),
7 (select sum(sharable_mem) sum_sql_size from v$sqlarea),
8 (select sum(250*users_opening) sum_user_size from v$sqlarea),
9 (select to_Number(b.ksppstvl) cr_shared_pool_size
10 from x$ksppi a, x$ksppcv b, x$ksppsv c
11 where a.indx = b.indx and a.indx = c.indx
12 and a.ksppinm =’__shared_pool_size’ );

CR_SHARED_POOL_SIZE SUM_OBJ_SIZE SUM_SQL_SIZE SUM_USER_SIZE MIN_SHARED_POOL
146800640 9520659 25660770 11750 45751132,7

You should set the suggested minimum shared pool size to avoid shrinking operation of ASMM

SQL> alter system set shared_pool_size=73M;System altered.

8- How much free memory in SGA is available for shared pool and how to interpret the free memory

First of all find the free memory in shared pool. If you have free memory you should relax but if you don’t have go to the step below

SQL> SELECT * FROM V$SGASTAT
2 WHERE NAME = ‘FREE MEMORY’
3 AND POOL = ‘SHARED POOL’;no rows selected

The X$KSMSP view shows the breakdown of memory in the SGA. You can run this query to build trend information on memory usage in the SGA. Remember, the ‘free’ class in this query is not specific to the Shared Pool, but is across the SGA. Dont use the script below when db is under load. Check out Jonathan Lewis’s experiences on this view from here

SQL> SELECT KSMCHCLS CLASS, COUNT(KSMCHCLS) NUM, SUM(KSMCHSIZ) SIZ,
2 To_char( ((SUM(KSMCHSIZ)/COUNT(KSMCHCLS)/1024)),’999,999.00′)||’k’ “AVG SIZE”
3 FROM X$KSMSP GROUP BY KSMCHCLS;

CLASS NUM SIZ AVG SIZE
freeabl 19010 34519404 1.77k
recr 23581 24967956 1.03k
R-freea 68 1632 .02k
perm 22 39801268 1,766.75k
R-free 34 7238192 207.90k
free 2389 36075980 14.75k

Watch for trends using these guidelines:

a) if ‘free’ memory is low (less than 5mb or so) you may need to increase the shared_pool_size and shared_pool_reserved_size. You should expect ‘free’ memory to increase and decrease over time. Seeing trends where ‘free’ memory decreases consistently is not necessarily a problem, but seeing consistent spikes up and down could be a problem.
b) if ‘freeable’ or ‘perm’ memory continually grows then it is possible you are seeing a memory bug.
c) if ‘freeabl’ and ‘recr’ memory classes are always huge, this indicates that you have a lot of cursor info stored that is not releasing.
d) if ‘free’ memory is huge but you are still getting 4031 errors, the problem is likely reloads and invalids in the library cache causing fragmentation.
!!!!!!!!!! Note says that this query can hang database on HP platforms

To see the free memory chunks detailed use the script below

SQL> select KSMCHIDX “SubPool”, ’sga heap(’||KSMCHIDX||’,0)’sga_heap,ksmchcom ChunkComment,
2 decode(round(ksmchsiz/1000),0,’0-1K’, 1,’1-2K’, 2,’2-3K’,3,’3-4K’,
3 4,’4-5K’,5,’5-6k’,6,’6-7k’,7,’7-8k’,8,
4 ‘8-9k’, 9,’9-10k’,'> 10K’) “size”,
5 count(*),ksmchcls Status, sum(ksmchsiz) Bytes
6 from x$ksmsp
7 where KSMCHCOM = ‘free memory’
8 group by ksmchidx, ksmchcls,
9 ’sga heap(’||KSMCHIDX||’,0)’,ksmchcom, ksmchcls,decode(round(ksmchsiz/1000),0,’0-1K’,
10 1,’1-2K’, 2,’2-3K’, 3,’3-4K’,4,’4-5K’,5,’5-6k’,6,
11 ‘6-7k’,7,’7-8k’,8,’8-9k’, 9,’9-10k’,'> 10K’);

SubPool SGA_HEAP CHUNKCOMMENT size COUNT(*) STATUS BYTES
1 sga heap(1,0) free memory > 10K 34 R-free 7238192
1 sga heap(1,0) free memory 3-4K 2 free 6284
1 sga heap(1,0) free memory > 10K 241 free 35707400
1 sga heap(1,0) free memory 8-9k 1 free 7712
1 sga heap(1,0) free memory 2-3K 4 free 6752
1 sga heap(1,0) free memory 0-1K 2090 free 133288
1 sga heap(1,0) free memory 9-10k 21 free 188676
1 sga heap(1,0) free memory 1-2K 30 free 25868

8 rows selected.

If you see lack of large chunks it is possible that you can face with ORA-04031 in near future.
9- Is library_cache or dictionary_cache utilization satisfactory ?

The statistics below is based since the start of the instance. You should take interval statistics to interpret these values for performance issues .

  • Library Cache Stats
SQL> SELECT NAMESPACE, PINS, PINHITS, RELOADS, INVALIDATIONS
2 FROM V$LIBRARYCACHE
3 ORDER BY NAMESPACE;

NAMESPACE PINS PINHITS RELOADS INVALIDATIONS
BODY 72782 72582 49 0
CLUSTER 1175 1161 3 0
INDEX 2800 2023 42 0
JAVA DATA 0 0 0 0
JAVA RESOURCE 0 0 0 0
JAVA SOURCE 0 0 0 0
OBJECT 0 0 0 0
PIPE 0 0 0 0
SQL AREA 563349 541678 2069 342
TABLE/PROCEDURE 175850 165318 2005 0
TRIGGER 6923 6802 34 0

11 rows selected.

High invalidations indicates that there is parsing problem with the namespace and high reloads indicates that there is a sizing problem which causes aging out.

  • Library cache hit ratio;
SQL> SELECT SUM(PINHITS)/SUM(PINS) FROM V$LIBRARYCACHE;,SUM(PINHITS)/SUM(PINS)———————-,95558088

low hit ratio is an indication of a sizing or caching problem

  • Dictionary cache stats
SQL> SELECT PARAMETER
2 , SUM(GETS)
3 , SUM(GETMISSES)
4 , 100*SUM(GETS - GETMISSES) / SUM(GETS) PCT_SUCC_GETS
5 , SUM(MODIFICATIONS) UPDATES
6 FROM V$ROWCACHE
7 WHERE GETS > 0
8 GROUP BY PARAMETER;

PARAMETER SUM(GETS) SUM(GETMISSES) PCT_SUCC_GETS UPDATES
dc_constraints 99 35 64,6464646 99
dc_tablespaces 90104 14 99,9844624 0
dc_tablespace_quotas 13 3 76,9230769 0
dc_awr_control 1351 2 99,8519615 121
dc_object_grants 867 174 79,9307958 0
dc_histogram_data 52053 6181 88,1255643 3047
dc_rollback_segments 55098 92 99,8330248 263
dc_sequences 100 27 73 100
dc_usernames 6632 33 99,5024125 0
dc_segments 23404 2466 89,4633396 331
dc_objects 37434 3776 89,9129134 358
dc_histogram_defs 65987 16796 74,5465016 3280
dc_table_scns 8 8 0 0
dc_users 171638 105 99,9388247 0
outstanding_alerts 1674 58 96,5352449 66
dc_files 80 10 87,5 0
dc_object_ids 134005 2646 98,0254468 123
dc_global_oids 52337 185 99,6465216 0
dc_profiles 1962 4 99,7961264 0

19 rows selected.

High updates with low pct_succ_gets can be a clue of performance problems when accessing that dictionary object. For frequently accessed dictionary caches, the ratio of total GETMISSES to total GETS should be less than 10% or 15%, depending on the application. If this ratio is higher and every previous control is OK then you should consider to increase the shared pool size

  • Dictionary cache hit ratio;
SQL> SELECT (SUM(GETS - GETMISSES - FIXED)) / SUM(GETS) “ROW CACHE” FROM V$ROWCACHE;ROW CACHE

——————–,

9516921886454345524

Low hit ratio is an indication of a sizing problem.

10- Are there any objects candidate for library cache pinning ?

Having objects pinned will reduce fragmentation and changes of encountering the ORA-04031 error. Objects causing a large number of other objects been flushed out from the shared pool are candidates to be pinned into the shared pool using dbms_shared_pool.keep procedure. You can check the x$ksmlru fixed table to see the candidates. This table keeps track of the objects and the corresponding number of objects flushed out of the shared pool to allocate space for the load. These objects are stored and flushed out based on the Least Recently Used (LRU) algorithm. Because this is a fixed table, once you query the table, Oracle will automatically reset the table so first insert the contents to temporary table like below,

CREATE TABLE LRU_TMP AS SELECT * FROM X$KSMLRU;

and on regular intervals issue

INSERT INTO LRU_TMP SELECT * FROM X$KSMLRU;

Use the LRU_TMP table for analysis. You can use a query below to see more information on candidate code in the library cache.

SELECT USERNAME, KSMLRCOM, KSMLRHON, KSMLRNUM, KSMLRSIZ, SQL_TEXT
FROM V$SQLAREA A, LRU_TMP K, V$SESSION S
WHERE KSMLRSIZ > 3000
AND A.ADDRESS=S.SQL_ADDRESS AND A.HASH_VALUE = S.SQL_HASH_VALUE
AND SADDR=KSMLRSES;

You can see the candidates to pin from the query below

COL STORED_OBJECT FORMAT A40;
COL SQ_EXECUTIONS FORMAT 999,999;
SELECT /*+ ORDERED USE_HASH(D) USE_HASH(C) */
O.KGLNAOWN||’.'||O.KGLNAOBJ STORED_OBJECT,
SUM(C.KGLHDEXC) SQL_EXECUTIONS
FROM SYS.X$KGLOB O, SYS.X$KGLRD D, SYS.X$KGLCURSOR C
WHERE
O.INST_ID = USERENV(’INSTANCE’) AND
D.INST_ID = USERENV(’INSTANCE’) AND
C.INST_ID = USERENV(’INSTANCE’) AND
O.KGLOBTYP IN (7, 8, 9, 11, 12) AND
D.KGLHDCDR = O.KGLHDADR AND
C.KGLHDPAR = D.KGLRDHDL
GROUP BY O.KGLNAOWN, O.KGLNAOBJ
HAVING SUM(C.KGLHDEXC) > 0
ORDER BY 2 DESC;

You should pin objects you find immediatelly after the each restart of instance. You can pin the object by DBMS_SHARED_POOL package like below

EXECUTE DBMS_SHARED_POOL.KEEP(OWNER.TRIGGER, ‘R’)

11- Is my Reserved Area sized properly?

An ORA-04031 error referencing a large failed requests indicates the Reserved Area is too fragmented.

SQL> col free_space for 999,999,999,999 head “TOTAL FREE”
SQL> col avg_free_size for 999,999,999,999 head “AVERAGE|CHUNK SIZE
SQL> col free_count for 999,999,999,999 head “COUNT”
SQL> col request_misses for 999,999,999,999 head “REQUEST|MISSES
SQL> col request_failures for 999,999,999,999 head “REQUEST|FAILURES”
SQL> col max_free_size for 999,999,999,999 head “LARGEST CHUNK”
SQL> select free_space, avg_free_size, free_count, max_free_size, request_misses, request_failures
2 from v$shared_pool_reserved;

TOTAL FREE AVERAGE
CHUNK SIZE
COUNT LARGEST CHUNK REQUEST
MISSES
REQUEST
FAILURES
7,238,192 212,888 34 212,888 0 0

The reserved pool is small when:

REQUEST_FAILURES > 0 (and increasing)

The DBA should Increase shared_pool_reserved_size and shared_pool_size together.

It is possible that too much memory has been allocated to the reserved list.
If:

REQUEST_MISS = 0 or not increasing
FREE_MEMORY = > 50% of shared_pool_reserved_size minimum

The DBA should Decrease shared_pool_reserved_size

You should also use hidden and unsupported parameter “_shared_pool_reserved_pct” to control reserved pool. This parameter controls the allocated percentage of shared pool for reserved pool. By default it is %5 of the shared pool and if you use ASMM for memory management you can set this value higher like 10 to allocate reserved pool dynamically. When you set the parameter you will see the shared_pool_reserved_size parameter will be adjusted to the new setting.

The parameter can not be modified when instance is started. You can use the query below to see the current value

SQL> select a.ksppinm “Parameter”,
2 b.ksppstvl “Session Value”,
3 c.ksppstvl “Instance Value”
4 from sys.x$ksppi a, sys.x$ksppcv b, sys.x$ksppsv c
5 where a.indx = b.indx and a.indx = c.indx
6 and a.ksppinm = ‘_shared_pool_reserved_pct’;

Parameter Session Value Instance Value
_shared_pool_reserved_pct 10 10

12-Is there any fragmentation in shared pool?

The primary problem that occurs is that free memory in the shared pool becomes fragmented into small pieces over time. Any attempt to allocate a large piece of memory in the shared pool will cause large amount of objects in the library cache to be flushed out and may result in an ORA-04031 out of shared memory error. But how to understand the fragmentation ?

  • Occurrence of ORA-04031 error. Before this error signalled, memory is freed from unnecessary objects and merged. This error only occurs when there is still not a large enough contiguous piece of free memory after this cleaning process. There may be very large amounts of total free memory in the shared pool, but just not enough contiguous memory.
  • Using X$KSMLRU internal fixed table. We told about this view before about its usage for tracking age out operations, it also can be used to identify what is causing the large allocations. KSMLRSIZ column of this table shows the amount of contiguous memory being allocated. Values over around 5K start to be a problem, values over 10K are a serious problem, and values over 20K are very serious problems. Anything less then 5K should not be a problem. Again be careful to save spool the result when you query this table
    select * from x$ksmlru where ksmlrsiz > 5000;

After finding the result you should do the followings to correct fragmentation

  • Keep object by pinning them as we discussed above
  • Use bind variables as we discussed before
  • Eliminate large anonymous PL/SQL block. Large anonymous PL/SQL blocks should be turned into small anonymous PL/SQL blocks that call packaged functions. The packages should be ‘kept’ in memory. To view candidates
    select sql_text from v$sqlarea
    where command_type=47 — command type for anonymous block
    and length(sql_text) > 500;

Fallacies about solving shared pool fragmentation

  • Free memory in shared pool prevents fragmentation. This is not true because Free memory is more properly thought of as ‘wasted memory’. You would rather see this value be low than very high. In fact, a high value of free memory is sometimes a symptom that a lot of objects have been aged out of the shared pool and therefore the system is experiencing fragmentation problems.
  • Flushing shared pool frequently solves fragmentation and improves performance. This is also incorrect because Executing this statement causes a big spike in performance and does nothing to improve fragmentation. You lost your cached cursors when you flush and they will hard parsed next time with high CPU consumption.

13- Using related database parameters

  • CURSOR_SHARING: Setting this parameter to smilar can solve your hard parse problems caused by using literals but can have side effects mostly on DSS environments and systems which uses stored outlines.
  • CURSOR_SPACE_FOR_TIME: This parameter specifies whether a cursor can be deallocated from the library cache to make room for a new SQL statement. CURSOR_SPACE_FOR_TIME has the following values meanings:
    • If CURSOR_SPACE_FOR_TIME is set to false (the default), then a cursor can be deallocated from the library cache regardless of whether application cursors associated with its SQL statement are open. In this case, Oracle must verify that the cursor containing the SQL statement is in the library cache.
    • If CURSOR_SPACE_FOR_TIME is set to true, then a cursor can be deallocated only when all application cursors associated with its statement are closed. In this case, Oracle need not verify that a cursor is in the cache, because it cannot be deallocated while an application cursor associated with it is open.

You must be sure that the shared pool is large enough for the work load otherwise performance will be badly affected and ORA-4031 eventually signalled.

  • OPEN_CURSORS: This parameter sets the upper bound for the number of cursor that a session can have open and if you size it correctly, cached cursors can be stay opened and won’t have to be closed to let new cursor open
  • PROCESSES / SESSIONS: You can review the high water mark for Sessions and Processes in the V$RESOURCE_LIMIT view. If the hard-coded values for these parameters are much higher than the high water mark information, consider decreasing the parameter settings to free up some memory in the Shared Pool for other uses.
  • SESSION_CACHED_CURSORS: When a cursor is closed, Oracle divorces all association between the session and the library cache state. If no other session has the same cursor opened, the library cache object and its heaps are unpinned and available for an LRU operation. The parameter SESSION_CACHED_CURSORS controls the number of cursors “soft” closed, much like the cached PL/SQL cursors. Oracle checks the library cache to determine whether more than three parse requests have been issued on a given statement. If so, then Oracle assumes that the session cursor associated with the statement should be cached and moves the cursor into the session cursor cache. Subsequent requests to parse that SQL statement by the same session then find the cursor in the session cursor cache.To determine whether the session cursor cache is sufficiently large for your instance, you can examine the session statistic session cursor cache hits in the V$SYSSTAT view. This statistic counts the number of times a parse call found a cursor in the session cursor cache. If this statistic is a relatively low percentage of the total parse call count for the session, then consider setting SESSION_CACHED_CURSORS to a larger value. Steve Adams also wrote usefully queries to find the usage and the maximum cacheable cursors.session_cursor_cache.sql

This was a long article and if you see anything wrong or suspicious please feel free to comment for correction

All of the queries are tested on Oracle 10.2.0.3 for Windows

Code Depot of The Queries (All scripts are taken from metalink notes and official documentation)
References :

Oracle® Database Performance Tuning Guide 10g Release 2 (10.2)

Metalink Notes

Note:396940.1 Troubleshooting and Diagnosing ORA-4031 Error

Note:146599.1 Diagnosing and Resolving Error ORA-04031

Note:61623.1 Resolving Shared Pool Fragmentation In Oracle7

Note:62143.1 Understanding and Tuning the Shared Pool

Note:1012047.6 How To Pin Objects in Your Shared Pool

Note:274496.1 ora-7445 and ora-4031 in 9.2.0.5 and 10g if SESSION_CACHED_CURSORS is used

www.ixora.com Oracle Advanced Performance Tuning Scripts

Mailing list threads from Oracle-l

http://www.freelists.org/archives/oracle-l/08-2007/msg00975.html

While searching through the metalink I saw an article Doc ID: 372996.1 “Using RMAN to Restore and Recover a Database When the Repository and Spfile/Init.ora Files Are Also Lost”. I wondered, how I can do it, but it only references to an internal article DocID 60545.1 “How to Extract Controlfiles, Datafiles, and Archived Logs from RMAN Backupsets.” with an advice of raising a server request.

The only thing I want is just try the case without disturbing Oracle Support , so I asked the help of google and it helped me as it always do.

Now its time to share what I learned, but please do, what Metalink says ” Raise a Service Request with Oracle Support Services“, when you are in this worst scenario.

First of all, I took a database backup of my test server DABAK, while controlfile autobackup option of. As it is written in official documents, RMAN took controlfile and spfile backups automatically because I backup datafile 1 (SYSTEM) .

backup1.jpg

Here is the backup list.

backup11.jpg

Then I move the spfileDABAK.ora, initDABAK.ora and forced the database to start without parameter file like below.

backuo2.jpg

After finishing the first part of the recovery, I moved into the second step ,which is restoring controlfile from backup pieces by calling undocumented dbms_restore package in a PL/SQL block.

restore_foundationsql.txt

DECLARE
v_dev varchar2(50); — device type allocated for restore
v_done boolean; — has the controlfile been fully extracted yet
type t_fileTable is table of varchar2(255)
index by binary_integer;
v_fileTable t_fileTable; — Stores the backuppiece names
v_maxPieces number:=1; — Number of backuppieces in backupsetBEGIN
– Initialise the filetable & number of backup pieces in the backupset
– This section of code MUST be edited to reflect the customer’s available
– backupset before the procedure is compiled and run. In this example, the
– backupset consists of 4 pieces:v_fileTable(1):=’C:\backup_test\ORA_DF631909845_S144_P1_C1′; –controlfile
v_fileTable(2):=‘C:\backup_test\ORA_DF631909818_S143_P1_C1′; –spfile
v_maxPieces:=2; —number of backup pieces

– Allocate a device. If the backuppiece is on disk, specify type=>null if it is on tape then spesify ’sbt_tape’

v_dev:=sys.dbms_backup_restore.deviceAllocate(type=>null, ident=>’d1′);

– Begin the restore conversation
sys.dbms_backup_restore.restoreSetDatafile;

– Specify where the controlfile is to be recreated
sys.dbms_backup_restore.restoreControlfileTo(cfname=>‘D:\oracle\oradata\DABAK_TEST\control01.ctl’); –ctrlfile

sys.dbms_backup_restore.restorespfileto(’D:\oracle\oradata\DABAK_TEST\spfile’); –spfile location

– Restore the controlfile
FOR i IN 1..v_maxPieces LOOP
sys.dbms_backup_restore.restoreBackupPiece(done=>v_done, handle=>v_fileTable(i), params=>null);
IF v_done THEN
GOTO all_done;
END IF;
END LOOP;

<<all_done>>
– Deallocate the device
sys.dbms_backup_restore.deviceDeallocate;

END;
/

The only things you must change in the script are backup locations and restore locations . (I don’t know the author of the script so I am sorry for not mentioning the author). I changed them and start the script but the error stack below raised. (Script will ask you a value you can just enter 1 and continue)

backup5.jpg
I ‘m sure that, sp file is in the backup set, so I decided to comment out spfile line of dbms_restore and start the script for only controlfile.

backup7.jpg

As you see it worked without any error.
After restoring controlfile, I commented out controlfile line and restored the spfile.

backup8.jpg

Note that, Mine was just a test database and I reached my goal without any support but you must consult Oracle Support before trying this solution !!!!

For further steps of this recovery you can visit the reference site.

References Used:

Advenced Recovery Article

While digging into Oracle 10gR2 Backup and Recovery Advanced User’s Guide document I came across a nice utility called orakill. This utility can be called “kill -9 of Windows for Oracle”. As you know Unix operating systems based on processes that fork another processes and you can see all of the working Oracle processes by using “ps -ef”, unlike unix, windows is thread based and Oracle can be seen working as only one process called oracle.exe. If you want to see the threads you must your programs like process explorer. If you are in a position to kill an oracle thread from OS on windows you can use orakill utility instead of killing oracle.exe

You might ask why should I use orakill ? The answer is given by www.oracleutilities.com.

1. The alter system statement will not clear any locks that exist. Instead, the session will remain connected until it times out, then the session is killed and the locks are released. The orakill command will kill the thread and the locks instantly.  (Check out the comments of Yasin about this case  )

2. A DBA may be unable to gain access to a SQL prompt due to a runaway query consuming all database resources. In this case, the session can be killed without ever logging in to the database.

You can use the syntax below when you are in a situation like above.

orakill utility usage

Thread number can be obtained by SPID column of v$process table. If you are unable to query this tables you can use QuickSlice from Windows.

Here is quick demo to show how to use orakill utility.

orakill utility demo

It is interesting that it is only mentioned in 1 book of Online documentation ???

References Used :

www.oracle.utilities.com

Oracle® Database Backup and Recovery Advanced User’s Guide
10g Release 2 (10.2)

GSM Giant Turkcell has a  very nice internship program about Oracle. Hasan Tonguc, who is working very hard to widen True Oracle Usage  in Turkey, is leading to this program. Hasan and his collegues prepare a schedule and teach Oracle to the newbies in both theorical and practical way. One of the interns,  Bilal Hatipoglu shares  his experiences about this internship in his own blog (in Turkish) . After digging his blog,  I started to think to go back his ages and try to participate  this internship program.  If I had a chance to learn Oracle that much at the beginning of my twenties, I might be nominated for ACE award now.  Bilal’s blog is a fruit of real internship and I wish every student has a chance to participate this kind of fully utilized internships.  Thanks Hasan and his friends for teaching Oracle to newbies , encouraging them to share what they learn and I wish good luck to Bilal  in his professional life.

Older Posts »