版权声明:本文为Buddy Yuan原创文章,转载请注明出处。原文地址:Oracle 12c新特性 Attribute Clustering及其对Index查询带来的性能改善
Oracle 12c中有一个新功能,叫Attribute Clustering,这个特性可以帮助我们改善一些索引查询性能。之前我们学习过索引的聚簇因子,如果索引的聚簇因子很高的话,当我们使用索引扫描1000行的数据,可能需要访问不同的数据块中的每一行数据。这本来也没什么,但是如果这个语句频繁的执行,则极大的影响数据库的效率,会消耗更多的逻辑读和CPU。如果我们把表的聚簇。我们来做个小试验验证一下新的功能Attribute Clustering。
1、首先我们创建一个表,插入100万数据。并在id列上创建索引。
SQL> create table testcluster(id number,name varchar2(30)); Table created. SQL> insert into testcluster select dbms_random.value(1,1000000),dbms_random.string('Y',trunc(dbms_random.value(1,10))) from dual connect by level <=1000000; 1000000 rows created. SQL> commit; Commit complete. SQL> create index idx_t1 on testcluster(id); Index created.
2.收集统计信息,查看聚簇因子。可以看到当前的聚簇因子是999809。
SQL> exec dbms_stats.gather_table_stats(ownname=>'A1',tabname=>'TESTCLUSTER',method_opt=>'for all columns size auto',cascade=>true,estimate_percent=>100);
SQL> select index_name,clustering_factor from dba_indexes where index_name='IDX_T1';
INDEX_NAME CLUSTERING_FACTOR
------------------------------ -----------------
IDX_T1 999809
3、执行索引扫描,检索1000行数据。此时我们可以看到逻辑读是1019,如果频繁执行这个SQL,将会消耗系统的CPU。
SQL> set autotrace traceonly
SQL> select * from testcluster t where t.id between 11000 and 12000;
947 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2344056994
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1002 | 28056 | 1009 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| TESTCLUSTER | 1002 | 28056 | 1009 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_T1 | 1002 | | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T"."ID">=11000 AND "T"."ID"<=12000)
Statistics
----------------------------------------------------------
9 recursive calls
6 db block gets
1019 consistent gets
0 physical reads
1000 redo size
44135 bytes sent via SQL*Net to client
1117 bytes received via SQL*Net from client
65 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
947 rows processed
4.在12c之前,我们解决这个问题的做法是重新CTAS这张表,并在select的时候按照id进行排序,在创建索引。那么在12c当中我们就是可以简单的使用语法add clustering by linear,在move table online就能实现了。
SQL> alter table TESTCLUSTER add clustering by linear order(id);
Table altered.
SQL> alter table TESTCLUSTER move online;
Table altered.
SQL> select * from testcluster t where t.id between 11000 and 12000;
947 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2344056994
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1002 | 28056 | 12 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| TESTCLUSTER | 1002 | 28056 | 12 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_T1 | 1002 | | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T"."ID">=11000 AND "T"."ID"<=12000)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
141 consistent gets
0 physical reads
0 redo size
44135 bytes sent via SQL*Net to client
1117 bytes received via SQL*Net from client
65 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
947 rows processed
我们可以发现逻辑读从1019下降到141,很大程度的改善了性能。大约提升了7.5倍的性能。此时我们在看下聚簇因子已经变成了4571。
SQL> select index_name,STATUS,CLUSTERING_FACTOR from dba_indexes where index_name='IDX_T1';
INDEX_NAME STATUS CLUSTERING_FACTOR
------------------------------ -------- -----------------
IDX_T1 VALID 4571
如果我们用12c之前的方法来实现。我们重建创建表和索引,其实效果是差不多的。
SQL> create table TESTCLUSTER_1 as select * from TESTCLUSTER order by id;
Table created.
SQL> create index idx_t1_1 on testcluster_1(id);
Index created.
SQL> exec dbms_stats.gather_table_stats(ownname=>'A1',tabname=>'TESTCLUSTER',method_opt=>'for all columns size auto',cascade=>true,estimate_percent=>100);
PL/SQL procedure successfully completed.
SQL> select index_name,clustering_factor from dba_indexes where index_name='IDX_T1_1';
INDEX_NAME CLUSTERING_FACTOR
------------------------------ -----------------
IDX_T1_1 4570
虽然这么做可以显著的帮助我们提升某些执行频次高的语句的性能,但是我们仍然要考虑其他列在表上查询的影响。当然我演示的只是linear ordering,,还有一种方式叫interleaved ordering。两者的区别可以参考官方文档的图。
当然我们还可以在建表的时候使用,如下所示:
linear ordering CREATE TABLE sales ( prod_id NUMBER(6) NOT NULL, cust_id NUMBER NOT NULL, time_id DATE NOT NULL, channel_id CHAR(1) NOT NULL, promo_id NUMBER(6) NOT NULL, quantity_sold NUMBER(3) NOT NULL, amount_sold NUMBER(10,2) NOT NULL ) CLUSTERING BY LINEAR ORDER (cust_id, prod_id); Interleaved Ordering CREATE TABLE sales ( prod_id NUMBER(6) NOT NULL, cust_id NUMBER NOT NULL, time_id DATE NOT NULL, channel_id CHAR(1) NOT NULL, promo_id NUMBER(6) NOT NULL, quantity_sold NUMBER(3) NOT NULL, amount_sold NUMBER(10,2) NOT NULL ) CLUSTERING BY INTERLEAVED ORDER (time_id, prod_id);
如果我们要修改可以使用modify CLUSTERING,删除可以使用drop CLUSTERING。
ALTER TABLE sales MODIFY CLUSTERING YES ON DATA MOVEMENT; ALTER TABLE sales DROP CLUSTERING;
更多详细的细节及操作方法可以参考官方文档:http://devel.hotpilot.cz/ora-12cR1-lin-64-inst/E50529_01/DWHSG/attcluster.htm#CHDBGCEE
Post a Comment