给一个客户做Oracle 10g升级,最后一步运行utlrp.sql编译无效对象的时候hang住了,而且hang了很久,开始以为等一会就没有事了,后来查询无效对象的个数一直都没有发生变化,于是就开始着手寻找原因.
1.查询v$session,查看等待事件和SQL_ID.在这之前我查看了我运行utlrp.sql的sid.
SQL> select sid,event,sql_id from v$session; SID EVENT SQL_ID ---------- ---------------------------------------------------------------- ------------- 463 jobq slave wait 481 jobq slave wait 509 jobq slave wait 515 jobq slave wait 523 Streams AQ: qmn slave idle wait 524 os thread startup 73hrt07s8adn5 525 Streams AQ: waiting for time management or cleanup tasks 4gd6b1r53yt88 526 jobq slave wait 534 Streams AQ: qmn coordinator idle wait 536 SQL*Net message to client 2h0tzf68rbxzz 537 rdbms ipc message 539 rdbms ipc message 4gd6b1r53yt88 540 rdbms ipc message 541 rdbms ipc message 542 smon timer 543 rdbms ipc message 544 rdbms ipc message 545 rdbms ipc message 546 rdbms ipc message 547 rdbms ipc message 548 rdbms ipc message 549 rdbms ipc message 550 rdbms ipc message 551 rdbms ipc message 552 rdbms ipc message 553 rdbms ipc message 554 rdbms ipc message 555 pmon timer 28 rows selected.
2.根据sql_id来查询正在执行的SQL语法,此时发现正在创建一个索引,而且并行度是128.开始怀疑是并行度的问题.
SQL> select SQL_FULLTEXT from v$sql where sql_id='73hrt07s8adn5'; SQL_FULLTEXT -------------------------------------------------------------------------------- CREATE INDEX utl_recomp_comp_idx1 ON utl_recomp_compiled(obj#) PARALLEL 128
3.查看parallel_max_servers参数,发现设置的值很大.修改完参数为4之后,再次运行1分钟就结束了.
SQL> show parameter parallel_max_servers NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ parallel_max_servers integer 485 SQL> SQL> alter system set parallel_max_servers=4 scope=both; System altered. SQL> SQL> @?/rdbms/admin/utlrp.sql TIMESTAMP -------------------------------------------------------------------------------- COMP_TIMESTAMP UTLRP_BGN 2012-03-06 22:04:37 DOC> The following PL/SQL block invokes UTL_RECOMP to recompile invalid DOC> objects in the database. Recompilation time is proportional to the DOC> number of invalid objects in the database, so this command may take DOC> a long time to execute on a database with a large number of invalid DOC> objects. DOC> DOC> Use the following queries to track recompilation progress: DOC> DOC> 1. Query returning the number of invalid objects remaining. This DOC> number should decrease with time. DOC> SELECT COUNT(*) FROM obj$ WHERE status IN (4, 5, 6); DOC> DOC> 2. Query returning the number of objects compiled so far. This number DOC> should increase with time. DOC> SELECT COUNT(*) FROM UTL_RECOMP_COMPILED; DOC> DOC> This script automatically chooses serial or parallel recompilation DOC> based on the number of CPUs available (parameter cpu_count) multiplied DOC> by the number of threads per CPU (parameter parallel_threads_per_cpu). DOC> On RAC, this number is added across all RAC nodes. DOC> DOC> UTL_RECOMP uses DBMS_SCHEDULER to create jobs for parallel DOC> recompilation. Jobs are created without instance affinity so that they DOC> can migrate across RAC nodes. Use the following queries to verify DOC> whether UTL_RECOMP jobs are being created and run correctly: DOC> DOC> 1. Query showing jobs created by UTL_RECOMP DOC> SELECT job_name FROM dba_scheduler_jobs DOC> WHERE job_name like 'UTL_RECOMP_SLAVE_%'; DOC> DOC> 2. Query showing UTL_RECOMP jobs that are running DOC> SELECT job_name FROM dba_scheduler_running_jobs DOC> WHERE job_name like 'UTL_RECOMP_SLAVE_%'; DOC># PL/SQL procedure successfully completed. TIMESTAMP -------------------------------------------------------------------------------- COMP_TIMESTAMP UTLRP_END 2012-03-06 22:05:06 PL/SQL procedure successfully completed. DOC> The following query reports the number of objects that have compiled DOC> with errors (objects that compile with errors have status set to 3 in DOC> obj$). If the number is higher than expected, please examine the error DOC> messages reported with each object (using SHOW ERRORS) to see if they DOC> point to system misconfiguration or resource constraints that must be DOC> fixed before attempting to recompile these objects. DOC># OBJECTS WITH ERRORS ------------------- 0 DOC> The following query reports the number of errors caught during DOC> recompilation. If this number is non-zero, please query the error DOC> messages in the table UTL_RECOMP_ERRORS to see if any of these errors DOC> are due to misconfiguration or resource constraints that must be DOC> fixed before objects can compile successfully. DOC># ERRORS DURING RECOMPILATION --------------------------- 0 PL/SQL procedure successfully completed.
这里因为客户将parallel_max_servers参数设置成485才导致出现了执行缓慢.在这里介绍一下这个参数.parallel_max_servers是指设置并行执行可以使用的最大进程数量.根据官方文档里面的公式介绍:
PARALLEL_MAX_SERVERS |
CPU_COUNT x PARALLEL_THREADS_PER_CPU x (2 if PGA_AGGREGATE_TARGET > 0; otherwise 1) x 5 |
上面的公式还涉及到另外两个参数,一个是cpu_count,这个参数其实是物理cpu的个数,另外一个参数parallel_threads_per_cpu是依赖于操作系统的值,通常这个值是2.该参数值描述一个cpu在执行并行操作的时候,可以使用的线程或者进程数量.举个例子:假设我的物理cpu是4个.并且我设置了pga_aggregate_target的值.则这个parallel_max_server的数量就应该是4*2*2*5=80,在开始创建数据库的时候会自动计算出来并进行设置.
参考文档:http://docs.oracle.com/cd/B19306_01/server.102/b14223/usingpe.htm#sthref2310
Post a Comment