Oracle 12c新特性 Attribute Clustering及其对Index查询带来的性能改善

版权声明:本文为Buddy Yuan原创文章,转载请注明出处。原文地址:Oracle 12c新特性 Attribute Clustering及其对Index查询带来的性能改善
Oracle 12c中有一个新功能,叫Attribute Clustering,这个特性可以帮助我们改善一些索引查询性能。之前我们学习过索引的聚簇因子,如果索引的聚簇因子很高的话,当我们使用索引扫描1000行的数据,可能需要访问不同的数据块中的每一行数据。这本来也没什么,但是如果这个语句频繁的执行,则极大的影响数据库的效率,会消耗更多的逻辑读和CPU。如果我们把表的聚簇。我们来做个小试验验证一下新的功能Attribute Clustering。

1、首先我们创建一个表,插入100万数据。并在id列上创建索引。

SQL> create table testcluster(id number,name varchar2(30));
Table created.

SQL> insert into testcluster select dbms_random.value(1,1000000),dbms_random.string('Y',trunc(dbms_random.value(1,10)))  from dual connect by level <=1000000; 1000000 rows created. 

SQL> commit;
Commit complete.

SQL> create index idx_t1 on testcluster(id);
Index created.

2.收集统计信息,查看聚簇因子。可以看到当前的聚簇因子是999809。

SQL>  exec dbms_stats.gather_table_stats(ownname=>'A1',tabname=>'TESTCLUSTER',method_opt=>'for all columns size auto',cascade=>true,estimate_percent=>100);

SQL> select index_name,clustering_factor from dba_indexes where index_name='IDX_T1';

INDEX_NAME		       CLUSTERING_FACTOR
------------------------------ -----------------
IDX_T1					  999809

3、执行索引扫描,检索1000行数据。此时我们可以看到逻辑读是1019,如果频繁执行这个SQL,将会消耗系统的CPU。

SQL> set autotrace traceonly  
SQL> select * from testcluster t where t.id between 11000 and 12000;

947 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2344056994

---------------------------------------------------------------------------------------------------
| Id  | Operation			    | Name	  | Rows  | Bytes | Cost (%CPU)| Time	  |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		    |		  |  1002 | 28056 |  1009   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TESTCLUSTER |  1002 | 28056 |  1009   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN		    | IDX_T1	  |  1002 |	  |	7   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T"."ID">=11000 AND "T"."ID"<=12000)


Statistics
----------------------------------------------------------
	  9  recursive calls
	  6  db block gets
       1019  consistent gets
	  0  physical reads
       1000  redo size
      44135  bytes sent via SQL*Net to client
       1117  bytes received via SQL*Net from client
	 65  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	947  rows processed

4.在12c之前,我们解决这个问题的做法是重新CTAS这张表,并在select的时候按照id进行排序,在创建索引。那么在12c当中我们就是可以简单的使用语法add clustering by linear,在move table online就能实现了。

SQL> alter table TESTCLUSTER add clustering by linear order(id);
Table altered.

SQL> alter table TESTCLUSTER move online;
Table altered.

SQL> select * from testcluster t where t.id between 11000 and 12000;

947 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2344056994

---------------------------------------------------------------------------------------------------
| Id  | Operation			    | Name	  | Rows  | Bytes | Cost (%CPU)| Time	  |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		    |		  |  1002 | 28056 |    12   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TESTCLUSTER |  1002 | 28056 |    12   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN		    | IDX_T1	  |  1002 |	  |	7   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T"."ID">=11000 AND "T"."ID"<=12000)


Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
	141  consistent gets
	  0  physical reads
	  0  redo size
      44135  bytes sent via SQL*Net to client
       1117  bytes received via SQL*Net from client
	 65  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	947  rows processed

我们可以发现逻辑读从1019下降到141,很大程度的改善了性能。大约提升了7.5倍的性能。此时我们在看下聚簇因子已经变成了4571。

SQL> select index_name,STATUS,CLUSTERING_FACTOR from dba_indexes where index_name='IDX_T1';

INDEX_NAME		       STATUS	CLUSTERING_FACTOR
------------------------------ -------- -----------------
IDX_T1			       VALID		     4571

如果我们用12c之前的方法来实现。我们重建创建表和索引,其实效果是差不多的。

SQL> create table TESTCLUSTER_1 as select * from TESTCLUSTER order by id;
Table created.

SQL> create index idx_t1_1 on testcluster_1(id);
Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>'A1',tabname=>'TESTCLUSTER',method_opt=>'for all columns size auto',cascade=>true,estimate_percent=>100);
PL/SQL procedure successfully completed.


SQL> select index_name,clustering_factor from dba_indexes where index_name='IDX_T1_1';
INDEX_NAME		       CLUSTERING_FACTOR
------------------------------ -----------------
IDX_T1_1				    4570

虽然这么做可以显著的帮助我们提升某些执行频次高的语句的性能,但是我们仍然要考虑其他列在表上查询的影响。当然我演示的只是linear ordering,,还有一种方式叫interleaved ordering。两者的区别可以参考官方文档的图。

当然我们还可以在建表的时候使用,如下所示:

 linear ordering
CREATE TABLE sales (
  prod_id        NUMBER(6) NOT NULL,
  cust_id        NUMBER NOT NULL,
  time_id        DATE NOT NULL,
  channel_id     CHAR(1) NOT NULL,
  promo_id       NUMBER(6) NOT NULL,
  quantity_sold  NUMBER(3) NOT NULL,
  amount_sold    NUMBER(10,2) NOT NULL
)
CLUSTERING
  BY LINEAR ORDER (cust_id, prod_id);

Interleaved Ordering
  CREATE TABLE sales (
  prod_id        NUMBER(6) NOT NULL,
  cust_id        NUMBER NOT NULL,
  time_id        DATE NOT NULL,
  channel_id     CHAR(1) NOT NULL,
  promo_id       NUMBER(6) NOT NULL,
  quantity_sold  NUMBER(3) NOT NULL,
  amount_sold    NUMBER(10,2) NOT NULL
)
CLUSTERING
  BY INTERLEAVED ORDER (time_id, prod_id);

如果我们要修改可以使用modify CLUSTERING,删除可以使用drop CLUSTERING。

ALTER TABLE sales MODIFY CLUSTERING YES ON DATA MOVEMENT;
ALTER TABLE sales DROP CLUSTERING;

更多详细的细节及操作方法可以参考官方文档:http://devel.hotpilot.cz/ora-12cR1-lin-64-inst/E50529_01/DWHSG/attcluster.htm#CHDBGCEE

分享到: 更多

Post a Comment

Your email is never published nor shared. Required fields are marked *