Oracle AWR Report
Oracle Statspack has evolved into the Automatic Workload Repository (AWR).
- AWR collects database statistics every 60 minutes (on the hour) out of the box (this is configurable), and this data is maintained for a week and then purged.
- The Oracle database uses AWR for problem detection and analysis as well as for self-tuning. The AWR collects a number of different statistics, including wait events, time model statistics, active session history statistics, various system- and session-level statistics, object usage statistics, and information on the most resource-intensive SQL statements.
- To properly collect database statistics, set the initialization parameter STATISTICS_LEVEL to TYPICAL (the default) or ALL.
- The AWR consists of a number of tables owned by the SYS schema and typically stored in the SYSAUX tablespace (currently no method exists that I know of to move these objects to another tablespace).
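The 60-minute interval and one-week retention mentioned above can be changed with the DBMS_WORKLOAD_REPOSITORY package; a sketch, where the 30-day/30-minute values are purely illustrative:

```sql
-- Check the current AWR settings (interval and retention)
SELECT snap_interval, retention FROM dba_hist_wr_control;

-- Example: keep 30 days of snapshots, taken every 30 minutes
-- (both parameters are specified in minutes)
BEGIN
  DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(
    retention => 43200,   -- 30 days * 24 hours * 60 minutes
    interval  => 30);
END;
/
```

Longer retention costs more space in SYSAUX, so check the tablespace before extending it significantly.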
Recommendations before getting an AWR Report:
- Collect Multiple AWR Reports:
It’s always good to have two Oracle AWR Reports: one from a good time (when the database was performing well) and one from when performance was poor. The DBA can then compare the good and bad reports to find the culprit.
- Stick to a Particular Time:
“The database is performing slowly” is not enough to resolve a performance issue. You need a specific window, such as “the database was slow yesterday from 1 PM to 4 PM.” The DBA can then generate a report covering exactly those three hours.
- Split Large Oracle AWR Reports into Smaller Reports:
Instead of one Oracle AWR Report covering a long period, such as four hours, generate four reports of one hour each. This makes it easier to isolate the problem.
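To produce the hourly reports, first list the available snapshots, then run the standard report script once per adjacent snapshot pair; a sketch (the snapshot IDs are illustrative):

```sql
-- List recent snapshots so you can pick hourly begin/end pairs
SELECT snap_id, begin_interval_time, end_interval_time
FROM   dba_hist_snapshot
ORDER  BY snap_id;

-- Then run the standard report script once per hour-long pair,
-- e.g. for snapshots 100-101, 101-102, 102-103, 103-104:
-- SQL> @?/rdbms/admin/awrrpt.sql
```

The awrrpt.sql script prompts for the begin and end snapshot IDs and the output format (text or HTML).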
Interpreting the Oracle AWR Report Output:
The Header Information
- The first section of the Oracle AWR Report contains information about the database itself, including information on the database name, ID, release number, and host.
- Following this is information on when the snapshot started and ended and how many sessions were active.
- Always check to ensure the snapshot interval is what you were trying to measure, and make sure that it’s a long enough interval to make it representative.
- The Cache Sizes section shows the values of the Buffer Cache (DB_CACHE_SIZE), Shared Pool Size (SHARED_POOL_SIZE), Std Block Size (DB_BLOCK_SIZE), and the Log Buffer (LOG_BUFFER).
The Load Profile
- The next portion of the report output, following the basic information, provides per-second and per-transaction statistics for the Load Profile.
- This section is excellent for monitoring throughput and load variations on your system.
- As the load on the system increases, you will see larger numbers per second.
- As you tune the system for maximum efficiency, you will usually see lower numbers for the per-transaction statistic.
Some important things to know about Load Profile:
- An increase in Redo size, Block changes, and % Blocks changed per Read indicates increased DML (INSERT/UPDATE/DELETE) activity.
- A hard parse occurs when a SQL statement is executed and is not currently in the shared pool. A hard parse rate greater than 100/second could indicate that bind variables are not being used effectively, that the CURSOR_SHARING initialization parameter should be used, or that you have a shared pool–sizing problem.
- DB time(s): The amount of time Oracle has spent performing database user calls. Note that it does not include background processes.
- DB CPU(s): The amount of CPU time spent on user calls. Like DB time, it does not include background processes. The underlying time-model statistic is recorded in microseconds, though the Load Profile displays seconds.
- Redo size: The amount of redo generated. For example, the table above shows an average transaction generating about 3,129.5 bytes of redo, at a rate of around 240,895.4 bytes of redo per second.
- Logical reads: The sum of Consistent Gets and DB Block Gets.
- Block changes: The number of blocks modified during the sample interval.
- Physical reads: The number of block requests that caused a physical I/O operation.
- Physical writes: The number of physical writes performed.
- User calls: The number of user calls generated.
- Parses: The combined total of hard and soft parses.
- Hard parses: Parses requiring a completely new parse and execution plan for the SQL statement. These consume both latches and shared pool space.
- Soft parses: A soft parse reuses a previous hard parse and hence consumes far fewer resources.
- Sorts: The number of sorts performed.
- Logons: The number of logons during the interval.
- Executes: The number of SQL executions.
- Transactions: The number of transactions per second.
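As a sanity check, the per-second and per-transaction columns are linked by the transaction rate; dividing the two redo figures quoted above approximates transactions per second (the numbers come from the example report):

```sql
-- redo bytes per second / redo bytes per transaction ~ transactions per second
SELECT ROUND(240895.4 / 3129.5, 1) AS approx_txn_per_sec FROM dual;
-- roughly 77 transactions per second
```

The same cross-check works for any Load Profile statistic that shows both columns.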
- The Instance Efficiency section shows information for many of the common hit ratios.
- Hit ratios (when monitored regularly) are a great alert mechanism for potential problems, both general and specific, such as bad SQL that has recently been introduced into the system.
Things to know about Instance Efficiency:
- Here the Buffer Hit % is a very important figure: the percentage of buffer requests satisfied because the buffer was already in memory and no physical I/O was needed.
- Unfortunately, if you have unselective indexes that are frequently accessed, that will drive your hit ratio higher, which can be a misleading indication of good performance to some DBAs.
- A lower library hit ratio usually indicates that SQL is being pushed out of the shared pool early (could be due to a shared pool that is too small). A lower ratio could also indicate that bind variables are not used or some other issue is causing SQL not to be reused.
- An In-Memory Sort % of less than 95 percent in an OLTP system warrants attention; in OLTP, you really don’t want to do disk sorts. Setting the MEMORY_TARGET or PGA_AGGREGATE_TARGET (or SORT_AREA_SIZE in previous versions) initialization parameter appropriately effectively eliminates this problem.
- A soft parse ratio that is less than 80 percent indicates that SQL is not being reused and needs to be investigated.
- A Latch Hit % of less than 99 percent is usually a big problem. Finding the specific latch will lead you to solving this issue.
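For reference, the Buffer Hit % shown in this section can also be computed directly from V$SYSSTAT using the standard formula; a sketch (these statistic names exist in recent Oracle versions, but verify them on your release):

```sql
-- Buffer cache hit ratio: 1 - physical reads / (db block gets + consistent gets)
SELECT ROUND((1 - phy.value / (dbg.value + con.value)) * 100, 2) AS buffer_hit_pct
FROM   v$sysstat phy, v$sysstat dbg, v$sysstat con
WHERE  phy.name = 'physical reads cache'
AND    dbg.name = 'db block gets from cache'
AND    con.name = 'consistent gets from cache';
```

Note that V$SYSSTAT values are cumulative since instance startup, whereas the AWR report computes the ratio over the snapshot interval.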
Top 5 wait events:
- We can consider the “Instance Efficiency” section the proactive-tuning part of the report and “Top Wait Events” the reactive-tuning part.
- The Top Wait Events section of the AWR report is probably the most revealing section in the entire report when you are trying to eliminate bottlenecks quickly on your system.
- In the preceding example, the db file sequential read event shows an incredible number of waits (almost 400,000,000, which, assuming an 8K block size, is roughly 3.2 TB of data read) and an incredible amount of time spent waiting on those reads.
- For this report, the wait event was due to a badly written SQL statement that used too many indexes and read many more blocks than it should have.
- Slow disk access contributed to the poor performance because of the large number of blocks being read, but the real issue was the SQL itself (as usual); in that case, the SQL query is what needs tuning.
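The data-volume figure quoted above is simple arithmetic: the wait count multiplied by the block size. A quick check, using the numbers from the example:

```sql
-- 400,000,000 single-block reads * 8 KB (8192 bytes) per block
SELECT ROUND(400000000 * 8192 / POWER(10, 12), 2) AS tb_read FROM dual;
-- about 3.28 TB in decimal units (roughly 2.98 TiB in binary units)
```

Since db file sequential read waits are single-block reads, each wait corresponds to one block fetched from disk.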
Top SQL Statements
- The most resource-intensive SQL statements in the database are listed next, in separate subsections each sorted in descending order: by CPU, Elapsed Time, Buffer Gets (or Gets), Disk Reads (or Reads), Executions, Parse Calls, Sharable Memory, and Version Count.
- Once you’ve identified the top events, drill down to see what SQL and PL/SQL are consuming the majority of those resources. On the “Main Report” section, click the “SQL Statistics” link.
- In the “SQL Statistics” section, click the “SQL ordered by ??” link that most closely relates to the wait event you identified in the “Top 5 Timed Foreground Events” section.
- Generally, we look first at the “SQL ordered by Elapsed Time” section; it helps identify which SQL statements are taking a long time to execute.
- Along with Elapsed Time, look at Executions and Elapsed Time per Exec (s) when analyzing a particular SQL statement.
- For example, a query with few Executions but a high Elapsed Time per Exec (s) is a good candidate for troubleshooting or optimization.
- In the report above, you can see the first query has the maximum Elapsed Time but zero Executions, so it needs investigating.
- An important point: if Executions is 0, it does not mean the query is not executing. It may still have been running when the AWR report was taken, so its completion was not captured in the report.
- Apart from that, we can also check the “SQL ordered by CPU Time” section. Analyzing it identifies the queries causing a high load on the system, i.e., consuming a large amount of resources.
- Consult with the developers to tune those SQL statements: run EXPLAIN PLAN or the SQL Tuning Advisor. Also check at the OS level (Linux) with the “top” command to see which processes are consuming the most CPU.
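The same “SQL ordered by Elapsed Time” information can also be pulled straight from the AWR history views, which is handy when comparing many intervals; a sketch against DBA_HIST_SQLSTAT (the snapshot IDs are placeholders, and FETCH FIRST requires 12c or later):

```sql
-- Top SQL by elapsed time between two snapshots (IDs 100 and 104 are examples)
SELECT sql_id,
       SUM(executions_delta)                   AS executions,
       ROUND(SUM(elapsed_time_delta) / 1e6, 1) AS elapsed_sec  -- deltas are in microseconds
FROM   dba_hist_sqlstat
WHERE  snap_id > 100 AND snap_id <= 104
GROUP  BY sql_id
ORDER  BY elapsed_sec DESC
FETCH FIRST 10 ROWS ONLY;
```

With the sql_id in hand, the full statement text is available from DBA_HIST_SQLTEXT.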