11gR2 For AIX 磁盘I/O100%案例分析

数据库反应比较慢,性能比较差.通过查看AIX操作系统可以看到I/O繁忙度是100%.

#节点1:

Disk Busy% KBPS TPS KB-Read KB-Writ
Total 100.0 531.4K 8911.0 530.2K 1237.9
#节点2:
Disk Busy% KBPS TPS KB-Read KB-Writ
Total 100.0 407.0K 6624.0 406.7K 289.2

而从CPU上的等待来看,是比较正常的.都还有空闲.

#节点1:

CPU User% Kern% Wait% Idle%
ALL 12.4 3.6 7.7 76.3
#节点2:
CPU User% Kern% Wait% Idle%
ALL 38.6 6.3 0.3 54.8

通过收集高峰时间段AWR报告进行分析.

Event Waits Time(s) Avg wait (ms) % DB time Wait Class
direct path read 1,650,817 173,726 105 78.47 User I/O
db file sequential read 1,480,058 20,644 14 9.32 User I/O
DB CPU 8,211 3.71
gc buffer busy acquire 385,913 5,620 15 2.54 Cluster
read by other session 134,434 4,988 37 2.25 User I/O

在短短的一个小时内,direct path read等待事件所等待的事件居然达到了173726秒.占整个等待的78%.
direct path read较高的可能原因有:

1.大量的磁盘排序操作,无法在排序区中完成排序,需要利用temp表空间进行排序.
2.SQL语句的并行查询。
3.预读操作。
4.大表的全表扫描.在Oracle11g,全表扫描会选择直接路径读取或者是通过高速缓冲区串行扫描读取.在Oracle10g中.默认扫描大表的时候 会选择通过缓存串行扫描.在11g中,通过直接路径读取或者是缓存读取这个决定是取决于表的大小、缓冲区高速缓存大小和其他各种统计信息.直接路径读取的 速度要比db scattered reads要快,而且影响要小,因为它避免了闩锁的产生.


Instance Efficiency Percentages (Target 100%)

Buffer Nowait %: 99.90 Redo NoWait %: 99.98
Buffer Hit %: 99.65 In-memory Sort %: 100.00
Library Hit %: 93.73 Soft Parse %: 95.19
Execute to Parse %: 98.91 Latch Hit %: 98.72
Parse CPU to Parse Elapsd %: 18.62 % Non-Parse CPU: 99.58

从上面实例的命中率来看,In-memory Sort%是100%的,也就证明都是在内存排序的.
In-memory Sort Ratio = sorts (memory) / [sorts (disk) + sorts (memory)].

接着可以看下面显示640个long table结果有596个table扫描方式采取了direct read.

table scans (direct read) 596 0.17 0.00
table scans (long tables) 640 0.18 0.00

接着细化一下做了一个20分钟的ash报告:

Top User Events

Event Event Class % Event Avg Active Sessions
direct path read User I/O 79.58 78.74
db file sequential read User I/O 6.83 6.76
gc buffer busy acquire Cluster 3.96 3.92
read by other session User I/O 2.46 2.43
CPU + Wait for CPU CPU 2.24 2.22

这里可以看到等待事件和AWR的等待其实是差不多的.

Top SQL with Top Events

SQL ID Planhash Sampled # of Executions % Activity Event % Event Top Row Source % RwSrc SQL Text
ghucxq7mx0026 2084660322 56 44.08 direct path read 43.74 TABLE ACCESS – FULL 43.74 SELECT count(*) AS rown FROM (…
26ksq990a8h6p 2084660322 14 11.84 direct path read 11.76 TABLE ACCESS – FULL 11.76 SELECT count(*) AS rown FROM (…
2yk7suhd556z5 966453952 32 10.25 direct path read 10.17 TABLE ACCESS – FULL 10.17 SELECT * FROM (SELECT pagetabl…
5jn3hpftrpz3u 976875450 26 4.45 gc buffer busy acquire 1.85 TABLE ACCESS – FULL 1.85 SELECT count(*) AS rown FROM (…
read by other session 1.70 TABLE ACCESS – FULL 1.70
7agg4p5xpbk9x 920334740 4 3.09 direct path read 2.99 TABLE ACCESS – FULL 2.99 SELECT XZQH, XB, CSRQ, ZJCX, S…

可以看到前面三条SQL语句全部在做TABLE ACCESS FULL的操作,每当做这个操作的时候,都回产生direct path read的等待

接着查询drv_log,drivinglicense,frm_Department这三个表.可以看 到并行度是1.并没有使用并行.这里需要说明一下的是default的,如果对该表执行过alter table parallel操作的话,Degree字段会变成Default,而Default的值等于参数 parallel_threads_per_cpu*cpu_count.

SQL> select TABLE_NAME,DEGREE from dba_tables where table_name in ('DRV_LOG','DRIVINGLICENSE','FRM_DEPARTMENT');

TABLE_NAME                     DEGREE
------------------------------ --------------------
FRM_DEPARTMENT                       1
DRIVINGLICENSE                       1
DRV_LOG                              1
DRIVINGLICENSE                       1
DRV_LOG                              1
FRM_DEPARTMENT                       1
DRV_LOG                              1
DRIVINGLICENSE                       1
FRM_DEPARTMENT                       1
FRM_DEPARTMENT                       1

综合上述现象,I/O 100%的问题主要是应用问题.出现大量的全表扫描大表,现在Oracle 11g和以前10g不一样, 11g会根据表的大小、缓冲区高速缓存大小和其他各种统计信息选择使用直接路径读,它会认为可能比10g中的数据文件散列读(db file scattered reads)速度更快.而且还会减少闩锁的产生.

分享到: 更多

Post a Comment

Your email is never published nor shared. Required fields are marked *