Warning: This long blog post is for educational purposes, especially targeting junior DBAs and DBAs who do not know how to use Session Snapper
(the greatest Oracle tuning tool yet) by Tanel Poder. If you feel comfortable with Snapper and you think your relationship with your
clients just rocks, then there is no need to read this post, but you may still give it a go 🙂
The story started with a long email thread forwarded by a developer, asking the question below about a system I am responsible for but do not feel perfectly comfortable with in terms of how things really work.
Hi Josh (note to reader: Coskan is pronounced Joshkahn, so I use Josh at the office). Any ideas on how we can work with the DBAs to find out why our database is only committing at a rate of 100 commits per 2-30 seconds, instead of the ideal 100 commits a second, for a third-party process?
I answered
From my point of view I will be requesting:
1- The time interval when this process runs
2- The SID information for when this process runs, if you have it
3- If it gives the same results on UAT as well, whether we can run the operation while tracing is on (which needs a Change Request, a hassle on production)
Then we can continue to understand what the problem can be.
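Note to blog reader: the tracing I had in mind for point 3 is plain extended SQL trace. A minimal sketch, assuming you already know the target SID and serial# (the 71/1234 values below are made up):

-- Hypothetical SID and serial#: enable extended SQL trace (waits + binds)
exec dbms_monitor.session_trace_enable(session_id => 71, serial_num => 1234, waits => TRUE, binds => TRUE);

-- ... let the process run long enough to capture representative work ...

exec dbms_monitor.session_trace_disable(session_id => 71, serial_num => 1234);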
Developer replied
It runs 24/7 and UAT testing is not possible since we can't point the feeds to UAT, but the username is USER_XXX if it helps.
Long story short, all I understood was that, in the simplest words, they want to be able to process more data in the same amount of time, so I replied with the long mail below.
BEGINNING OF THE EMAIL
I did some sampling on your sessions to understand what is happening, and I think I can come up with rough conclusions and improvement suggestions.
Let's walk through what is (or may be) happening.
This is a 500-second sampling of one of your sessions.
---Some important Statistics
-------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                               ,     HDELTA, HDELTA/SEC,    %TIME, GRAPH
-------------------------------------------------------------------------------------------------------------------------------------
     71, USER_XXX  , STAT, user commits                            ,        326,        .65,
     71, USER_XXX  , STAT, session logical reads                   ,      1.36M,      2.71k,
—Wait profile
     71, USER_XXX  , WAIT, buffer busy waits                       ,     1.18ms,     2.36us,      .0%, |          |
     71, USER_XXX  , WAIT, log file sync                           ,     36.12s,    72.24ms,     7.2%, |@         |
     71, USER_XXX  , WAIT, db file sequential read                 ,    276.17s,   552.34ms,    55.2%, |@@@@@@    |
     71, USER_XXX  , WAIT, flashback buf free by RVWR              ,   403.28ms,   806.55us,      .1%, |          |
     71, USER_XXX  , WAIT, undo segment extension                  ,    72.93ms,   145.86us,      .0%, |          |
     71, USER_XXX  , WAIT, SQL*Net message to client               ,     2.23ms,     4.47us,      .0%, |          |
     71, USER_XXX  , WAIT, SQL*Net message from client             ,     178.2s,    356.4ms,    35.6%, |@@@@      |
     71, USER_XXX  , WAIT, SQL*Net more data from client           ,   710.26ms,     1.42ms,      .1%, |          |
—Wait profile breakdown per sql
---------------------------------------------------------------------------------
Active% | SQL_ID        | EVENT                         | WAIT_CLASS
---------------------------------------------------------------------------------
    20% | ggjkmtv8cxxh3 | db file sequential read       | User I/O
    19% | 64djbhqv5v93u | db file sequential read       | User I/O
    13% | 3b8gt3dqtn345 | db file sequential read       | User I/O
     8% |               | log file sync                 | Commit
     2% | 8zt375nzk26rh | db file sequential read       | User I/O
     1% | 8ztz7p814q9u2 | db file sequential read       | User I/O
     1% | ggjkmtv8cxxh3 | ON CPU                        | ON CPU
     1% | 64djbhqv5v93u | ON CPU                        | ON CPU
     0% | 64djbhqv5v93u | flashback buf free by RVWR    | Configuration
     0% | 64djbhqv5v93u | SQL*Net more data from client | Network

--  End of ASH snap 1, end=2012-01-11 21:51:46, seconds=500, samples_taken=499
When we check the stats, you have done 326 commits in 500 seconds, which makes 0.65 commits per second; but if we take the 180 seconds of idle events (SQL*Net message to client + SQL*Net message from client) out, it is 326 commits in 320 (500-180) seconds of active time, which is roughly 1 commit per second.
When we look at the wait profile we can see 276 seconds of "db file sequential read" (physical read) over the 320-second non-idle period, and 36 seconds of "log file sync", which we will come back to later.
If we focus on the "db file sequential read", we can see it is 86% (276/320) of the active time, and there are 3 main contributor SQLs for this case.
SQLs are below
SQL> @sqlid2 ggjkmtv8cxxh3

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ---------------------------------------------------------------
ggjkmtv8cxxh3 3503224323 DELETE FROM TABLE_T WHERE ESMP = :1

1 row selected.

 PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
---------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
4033870845        4          0         21      20219           .0        392.7       .26172575          1198.3            28.1

Plan hash value: 4033870845

-----------------------------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT   |            |       |       |     4 (100)|          |
|   1 |  DELETE            | TABLE_T    |       |       |            |          |
|*  2 |   INDEX RANGE SCAN | PK_TABLE_T |     2 |    50 |     4   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
SQL> @sqlid2 64djbhqv5v93u

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ---------------------------------------------------------------
64djbhqv5v93u 3059590266 some complex update statement which does not need to be shared

 PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
---------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
1856144614        3         26          2     473152           .0         85.4      .270458035           454.4            25.4

Plan hash value: 1856144614

-------------------------------------------------------------------------------------------
| Id  | Operation           | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT    |                  |       |       |     3 (100)|          |
|   1 |  UPDATE             | TABLE_XXXXXXX    |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN | PK_TABLE_XXXXXXX |     1 |   192 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
SQL> @sqlid2 3b8gt3dqtn345

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ---------------------------------------------------------------
3b8gt3dqtn345 1838812293 some complex update statement which does not need to be shared

 PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
---------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
 861798698        3          0          7      22308           .0         97.5      .113096364           382.6            10.9

Plan hash value: 861798698

------------------------------------------------------------------------------------------------
| Id  | Operation           | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT    |               |       |       |     3 (100)|          |
|   1 |  UPDATE             | TABLE_ZZZZ    |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN | PK_TABLE_ZZZZ |     1 |    69 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
There is not much we can do about the index unique scans other than changing the table structure, but I think there is room for improvement on the index range scan of sql_id ggjkmtv8cxxh3, the delete statement, which is waiting on db file sequential read for 31% (500*20% / 320) of the active time.
The reason the delete is doing an index range scan is that we are trying to access the 3-column composite index PK_TABLE_T using only 1 column, which is COL1. If the business logic lets you give the other 2 columns as filters as well, we will get much more throughput with less LIO and PIO (see the sketch after the index listing below).
TABLE_OWNER          TABLE_NAME                     INDEX_NAME                     POS# COLUMN_NAME
-------------------- ------------------------------ ------------------------------ ---- ------------------------------
APP_OWNER            TABLE_T                        PK_TABLE_T                        1 COL1
                                                                                      2 COL2
                                                                                      3 COL3
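To illustrate the point, a hypothetical sketch only (I am assuming ESMP maps to COL1, and that the application can actually supply values for COL2 and COL3):

-- Current form: only the leading column of PK_TABLE_T is supplied,
-- so Oracle can do no better than an INDEX RANGE SCAN
DELETE FROM TABLE_T WHERE COL1 = :1;

-- Suggested form: all three columns of the composite primary key are
-- supplied, which allows an INDEX UNIQUE SCAN (at most one row per probe)
DELETE FROM TABLE_T WHERE COL1 = :1 AND COL2 = :2 AND COL3 = :3;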
While I was doing the sampling on the session above, I also did sampling on another session at the same time, which shows similar behaviour with different SQLs, as below.
—Some important Statistics
-------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                               ,     HDELTA, HDELTA/SEC,    %TIME, GRAPH
-------------------------------------------------------------------------------------------------------------------------------------
    192, USER_XXX  , STAT, user commits                            ,        343,        .69,
    192, USER_XXX  , STAT, session logical reads                   ,      2.78M,      5.56k,
—Wait profile
    192, USER_XXX  , WAIT, buffer busy waits                       ,      702us,      1.4us,      .0%, |          |
    192, USER_XXX  , WAIT, log file switch completion              ,    94.34ms,   188.68us,      .0%, |          |
    192, USER_XXX  , WAIT, log file sync                           ,     28.26s,    56.53ms,     5.7%, |@         |
    192, USER_XXX  , WAIT, db file sequential read                 ,     364.3s,    728.6ms,    72.9%, |@@@@@@@@  |
    192, USER_XXX  , WAIT, flashback buf free by RVWR              ,   348.79ms,   697.57us,      .1%, |          |
    192, USER_XXX  , WAIT, undo segment extension                  ,    492.4ms,    984.8us,      .1%, |          |
    192, USER_XXX  , WAIT, SQL*Net message to client               ,     1.03ms,     2.07us,      .0%, |          |
    192, USER_XXX  , WAIT, SQL*Net message from client             ,     94.02s,   188.05ms,    18.8%, |@@        |
    192, USER_XXX  , WAIT, SQL*Net more data from client           ,   368.33ms,   736.67us,      .1%, |          |
—Wait profile breakdown per sql
---------------------------------------------------------------------------------
Active% | SQL_ID        | EVENT                         | WAIT_CLASS
---------------------------------------------------------------------------------
    37% | dkh6b1nhnhfcu | db file sequential read       | User I/O
    32% | 29zz9n1b9sppr | db file sequential read       | User I/O
     7% |               | log file sync                 | Commit
     1% | dkh6b1nhnhfcu | ON CPU                        | ON CPU
     1% | 29zz9n1b9sppr | ON CPU                        | ON CPU
     0% | 29zz9n1b9sppr | undo segment extension        | Configuration
     0% | 29zz9n1b9sppr | flashback buf free by RVWR    | Configuration

--  End of ASH snap 1, end=2012-01-11 21:52:17, seconds=500, samples_taken=499
This time we have 94 seconds of idle time and 406 (500-94) seconds of processing time, which gives us 0.84 (343/406) commits per second and 392 seconds of wait (364 seconds "db file sequential read" + 28 seconds "log file sync").
When we look at the SQLs below, this time we can see that 160 seconds (500*32%) of the 364 seconds (43% of the active time) is being spent on an index range scan again (sql_id 29zz9n1b9sppr), on a delete statement, again due to the lack of composite index column filtering.
I again believe we can improve the access if we are able to give all columns as filters and get an index unique scan.
TABLE_OWNER          TABLE_NAME                     INDEX_NAME                     POS# COLUMN_NAME                    DSC
-------------------- ------------------------------ ------------------------------ ---- ------------------------------ ----
APP_OWNER            TABLE_Z                        PK_TABLE_Z                        1 COL1
                                                                                      2 COL2
                                                                                      3 COL3
SQL> @sqlid2 dkh6b1nhnhfcu

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ---------------------------------------------------------------
dkh6b1nhnhfcu  558381466 INSERT INTO APP_OWNER.TABLE_Z where bla bla bla bla

PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
--------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
        0        1        778         15     102566           .0       1065.3      .177031281          5673.4            30.4

-------------------------------------------------
| Id  | Operation                | Name | Cost  |
-------------------------------------------------
|   0 | INSERT STATEMENT         |      |     1 |
|   1 |  LOAD TABLE CONVENTIONAL |      |       |
-------------------------------------------------

SQL> @sqlid2 29zz9n1b9sppr

SQL_ID        HASH_VALUE SQL_TEXT
------------- ---------- ---------------------------------------------------------------
29zz9n1b9sppr 1453086391 DELETE FROM APP_OWNER.TABLE_Z WHERE ESMP = :1

1 row selected.

 PLAN_HASH OPT_COST     PARSES   H_PARSES EXECUTIONS AVG_FETCH_PE  AVG_ROWS_PE AVG_ELA_SECONDS      AVG_LIO_PE      AVG_PIO_PE
---------- -------- ---------- ---------- ---------- ------------ ------------ --------------- --------------- ---------------
3229062859        4          0         30      21308           .0       1136.6      .642183193          2081.5            68.1

--------------------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT   |            |       |       |     4 (100)|          |
|   1 |  DELETE            | TABLE_Z    |       |       |            |          |
|*  2 |   INDEX RANGE SCAN | PK_TABLE_Z |     5 |   135 |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
I think I managed to make the index range scan issue clear (for more information see the bonus material below). Now let's see what the log file sync event is all about and why we are seeing it.
The log file sync event has the definition below in the Oracle documentation.
log file sync
When a user session commits, the session’s redo information needs to be flushed to the redo logfile. The user session will post the LGWR to write the log buffer to the redo log file. When the LGWR has finished writing, it will post the user session.
Wait Time: The wait time includes the writing of the log buffer and the post.
Parameter | Description
----------|---------------------------------------------------------------------------------------
buffer#   | The number of the physical buffer in the redo log buffer that needs to be synchronized
Potential causes
Wait Event    | General Area         | Possible Causes                                            | Look for / Examine
--------------|----------------------|------------------------------------------------------------|-------------------
log file sync | I/O, over-committing | Slow disks that store the online logs; un-batched commits | Check the disks that house the online redo logs for resource contention. Check the number of transactions (commits + rollbacks) each second, from V$SYSSTAT.
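A quick way to check that commit rate (a minimal sketch; the values in V$SYSSTAT are cumulative since instance startup, so sample twice over a known interval and take the delta):

-- System-wide commit/rollback counters; diff two samples for a per-second rate
select name, value
  from v$sysstat
 where name in ('user commits', 'user rollbacks');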
The first rule of tuning log file sync is doing fewer commits, which means that instead of committing after every DML or every single operation, like
begin
  for i in 300001..600000 loop
    delete from t1 where id1 = to_char(i);
    commit;
  end loop;
end;
/
we should do commits in batch, like
begin
  for i in 300001..600000 loop
    delete from t1 where id1 = to_char(i);
  end loop;
  commit;
end;
/
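Note to blog reader: if a single commit at the very end is not acceptable (the transaction holds locks and undo for longer), a common middle ground is committing every N rows. A minimal sketch, with the batch size of 10000 being an arbitrary assumption:

begin
  for i in 300001..600000 loop
    delete from t1 where id1 = to_char(i);
    -- commit once per 10000 deletes instead of once per row
    if mod(i, 10000) = 0 then
      commit;
    end if;
  end loop;
  commit; -- pick up the remainder
end;
/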
To prove how you are doing commits, I simply got the SQL execution statistics before, then sampled the sessions, then got the SQL execution statistics after, to see if the user commits statistic during the sampling interval matches the increase in SQL executions.
(Note to blog reader: this is how I did it; no rocket science once you arm yourself with the right tools. Just run the block below as a whole.)
select sql_id, executions from v$sql where sql_id in ('29zz9n1b9sppr','ggjkmtv8cxxh3');

@snapper ash=sql_id+event,stats,gather=stw,sinclude=user%commits 20 1 71,192

select sql_id, executions from v$sql where sql_id in ('29zz9n1b9sppr','ggjkmtv8cxxh3');
Results are
SQL> @test1

SQL_ID        EXECUTIONS
------------- ----------
29zz9n1b9sppr      28897
ggjkmtv8cxxh3      28211

Sampling SID 71,192 with interval 100 seconds, taking 1 snapshots...

-- Session Snapper v3.53 by Tanel Poder ( http://blog.tanelpoder.com )

-------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                               ,     HDELTA, HDELTA/SEC,    %TIME, GRAPH
-------------------------------------------------------------------------------------------------------------------------------------
     71, USER_XXX  , STAT, user commits                            ,         12,        .12,
     71, USER_XXX  , TIME, repeated bind elapsed time              ,       59us,      .59us,      .0%, |          |
     71, USER_XXX  , TIME, parse time elapsed                      ,      127us,     1.27us,      .0%, |          |
     71, USER_XXX  , TIME, DB CPU                                  ,   278.96ms,     2.79ms,      .3%, |          |
     71, USER_XXX  , TIME, sql execute elapsed time                ,     36.32s,   363.21ms,    36.3%, |@@@@      |
     71, USER_XXX  , TIME, DB time                                 ,     40.75s,   407.51ms,    40.8%, |@@@@@     |
     71, USER_XXX  , WAIT, log file sync                           ,      4.42s,    44.22ms,     4.4%, |@         |
     71, USER_XXX  , WAIT, db file sequential read                 ,     35.04s,   350.43ms,    35.0%, |@@@@      |
     71, USER_XXX  , WAIT, SQL*Net message to client               ,      242us,     2.42us,      .0%, |          |
     71, USER_XXX  , WAIT, SQL*Net message from client             ,     60.11s,   601.05ms,    60.1%, |@@@@@@    |
     71, USER_XXX  , WAIT, SQL*Net more data from client           ,     3.94ms,    39.44us,      .0%, |          |
    192, USER_XXX  , STAT, user commits                            ,         12,        .12,
    192, USER_XXX  , TIME, DB CPU                                  ,   207.97ms,     2.08ms,      .2%, |          |
    192, USER_XXX  , TIME, sql execute elapsed time                ,     30.74s,   307.37ms,    30.7%, |@@@@      |
    192, USER_XXX  , TIME, DB time                                 ,     35.14s,    351.4ms,    35.1%, |@@@@      |
    192, USER_XXX  , WAIT, log file sync                           ,       4.4s,    43.98ms,     4.4%, |@         |
    192, USER_XXX  , WAIT, db file sequential read                 ,     29.95s,   299.51ms,    30.0%, |@@@       |
    192, USER_XXX  , WAIT, SQL*Net message to client               ,      113us,     1.13us,      .0%, |          |
    192, USER_XXX  , WAIT, SQL*Net message from client             ,     64.55s,   645.48ms,    64.5%, |@@@@@@@   |
    192, USER_XXX  , WAIT, SQL*Net more data from client           ,    19.36ms,   193.62us,      .0%, |          |
--  End of Stats snap 1, end=2012-01-12 22:27:59, seconds=100

---------------------------------------------------------------
Active% | SQL_ID        | EVENT
---------------------------------------------------------------
    28% | 29zz9n1b9sppr | db file sequential read
    12% | ggjkmtv8cxxh3 | db file sequential read
    11% | 3b8gt3dqtn345 | db file sequential read
     9% |               | log file sync
     9% | 64djbhqv5v93u | db file sequential read
     4% | dkh6b1nhnhfcu | db file sequential read
     1% | 8zt375nzk26rh | db file sequential read
     1% | b513mwk4mf9rt | db file sequential read
     1% | 0h01fy07y1c1p | db file sequential read
--  End of ASH snap 1, end=2012-01-12 22:27:59, seconds=100, samples_taken=99

PL/SQL procedure successfully completed.

SQL_ID        EXECUTIONS
------------- ----------
29zz9n1b9sppr      28909
ggjkmtv8cxxh3      28223
As you can see, both SQLs increased by 12 executions (after minus before), and both sessions show a user commits statistics delta of 12, which means there is a commit per execution, and that is what leads to the log file sync event.
Having proven that the app is not doing batch commits, I wonder whether this is avoidable or not in terms of business logic. Doing batch commits will definitely help to increase your overall throughput, plus the health of the whole system.
So, in my humble opinion, without me seeing any code logic, we have two options:
1- Providing unique index access for the delete operations by giving all columns of the index
2- Decreasing the number of commits by doing batch commits.
For you to test all my findings, I have attached a simple synthetic test case to show how these implementations can help us.
Cases tested:
1- Non-batch commit, range scan
2- Non-batch commit, unique scan
3- Batch commit, range scan
4- Batch commit, unique scan
Timings for the tests, deleting 100K-row sets, are as follows:
                       | Elapsed | Improvement
-----------------------|---------|------------
nonbatch-commit-range  |      84 |          0%
nonbatch-commit-unique |      47 |         44%
batch-commit-range     |      55 |         34%
batch-commit-unique    |      35 |         58%
END OF THE EMAIL
Note to blog reader (in case you are still reading): I hope you found something interesting about how I used the right tool at the right time,
and how I answered the development questions not by talking bollocks but with numbers and proof. One thing I need to admit: there were only 2 sessions running this operation, which made my life easy.
If you wonder about the duration of the whole diagnosis: it took 15 minutes in total to find out what is going on and 1 hour to write the mail, but that 1 hour saved at least 3-4 rounds of unnecessary email ping-pong that happen just because nobody can prove their point with numbers.
Quick tip: if you are sending any output to anybody, use the Courier New font and make everybody's eyes happy.
Session Snapper and its usage can be found at Tanel's website
Test case can be found here
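For readers who cannot reach the test case, here is a minimal sketch of what cases 1 and 4 could look like. This is not my attached script: the table T1 and column ID1 come from the blocks above, while ID2 and ID3 are hypothetical columns standing in for the rest of a 3-column composite primary key, and the table would need to be re-created between cases.

-- Hypothetical setup: 3-column composite PK, 100K rows
create table t1 (id1 varchar2(10), id2 number, id3 number,
                 constraint pk_t1 primary key (id1, id2, id3));

insert into t1
select to_char(rownum), mod(rownum, 10), 1
  from dual connect by level <= 100000;
commit;

set timing on

-- Case 1: non-batch commit + range scan (leading PK column only)
begin
  for i in 1..100000 loop
    delete from t1 where id1 = to_char(i);
    commit;
  end loop;
end;
/

-- Case 4: batch commit + unique scan (all PK columns, single commit)
begin
  for i in 1..100000 loop
    delete from t1 where id1 = to_char(i) and id2 = mod(i, 10) and id3 = 1;
  end loop;
  commit;
end;
/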
Enjoyed the post. Favourite part: “…how I answered the development questions without talking bollocks but numbers and proofs”
Comment by ezuall — January 18, 2012 @ 10:07 am
Awesome post. I am still trying to learn snapper (awesome tool), but I had one question for you.
When you sampled this session for 500 seconds, what was your snapper input? Was it something like below?
@snapper ash=sql_id+event+wait_class,stat,gather=tsw 500 1 SID
Or did you take 500 snapshots of 1 second each?
@snapper ash=sql_id+event+wait_class,stat,gather=tsw 1 500 SID
Either way, during those 500 seconds, does snapper go to sleep for most of the time, or does it work on CPU or something?
Comment by irfan — January 30, 2012 @ 2:26 pm
Thank you for your comments
I used
@snapper ash,stats 500 1 SID
which is 1 snapshot of 500 seconds.
Snapper does not sleep; it usually gathers 1 snap per second. If you check the code you can see the whole idea behind it.
Comment by coskan — February 2, 2012 @ 10:44 pm
Hi Coskan,
I found your methodical explanation of this commit issue to be a perfect example of what Cary Millsap said: "Why guess when you can know?". I like this blog article very much, even though committing once, outside the loop, is the thing I would have suggested immediately if I had been asked to help solve this issue.
Best Regards
Comment by hourim — March 24, 2012 @ 4:15 pm