一次丧心病狂的PLS-00306错误分析

在做database replay预处理的时候遇到一个问题,初略的看,是执行包调用内部(函数/存储过程)出现的错误。感觉是一个典型的传参的错误。

SQL> exec dbms_workload_replay.PROCESS_CAPTURE('AAA');
BEGIN dbms_workload_replay.PROCESS_CAPTURE('AAA'); END; 

*
ERROR at line 1:
ORA-06550: line 1, column 15:
PLS-00306: wrong number or types of arguments in call to 'ADD_CAPTURE'
ORA-06550: line 1, column 9:
PL/SQL: Statement ignored
ORA-06512: at "SYS.DBMS_WORKLOAD_REPLAY", line 2301
ORA-06512: at line 1

这种问题,一般我们都会做10046 trace进一步看调用了那个函数引起的问题。从trace中看到是在运行dbms_wrr_internal.add_capture出的问题。

BEGIN   :1 := dbms_wrr_internal.add_capture( :name,                                        :db_id,                                        :db_name, 
                              :db_version,                                        'tmp',                                        '/tmp', 
                  'TRUE',                                        :status,                                        NULL, 
NULL,                                        :stime,                                        NULL,                                        :sscn, 
                           :dflt_action,                                        NULL,                                        NULL, 
             :wid,                                        :internal_stime                                      ); END; 
END OF STMT 
PARSE #11529215044979176000:c=0,e=540,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=0,tim=434859905841 
===================== 
PARSING IN CURSOR #11529215044979172808 len=8 dep=1 uid=0 oct=45 lid=0 tim=434859906846 hv=2761672982 ad='0' sqlid='8sst43uk9rk8q' 
ROLLBACK 
END OF STMT 
PARSE #11529215044979172808:c=0,e=12,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=0,plh=0,tim=434859906844 
XCTEND rlbk=1, rd_only=1, tim=434859906921 
EXEC #11529215044979172808:c=0,e=33,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=0,plh=0,tim=434859906939 
CLOSE #11529215044979172808:c=0,e=2,dep=1,type=3,tim=434859906960 
PARSE #11529215044979172808:c=0,e=6,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=0,plh=0,tim=434859907039 
XCTEND rlbk=1, rd_only=1, tim=434859907062 
EXEC #11529215044979172808:c=0,e=15,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=0,plh=0,tim=434859907074 
CLOSE #11529215044979172808:c=0,e=1,dep=1,type=3,tim=434859907091 
EXEC #11529215044981457272:c=10000,e=77698,p=0,cr=60,cu=29,mis=0,r=0,dep=0,og=1,plh=0,tim=434859907137 
ERROR #11529215044981457272:err=6550 tim=434859907154

SQL> desc dbms_wrr_internal 
FUNCTION ADD_CAPTURE RETURNS NUMBER 
Argument Name                  Type                    In/Out Default? 
------------------------------ ----------------------- ------ -------- 
NAME                           VARCHAR2                IN 
DB_ID                          NUMBER                  IN 
DB_NAME                        VARCHAR2                IN 
DB_VERSION                     VARCHAR2                IN 
DIR                            VARCHAR2                IN 
DPATH                          VARCHAR2                IN 
DPATH_SHARED                   VARCHAR2                IN 
STATUS                         VARCHAR2                IN 
ECODE                          NUMBER                  IN 
EMSG                           VARCHAR2                IN 
STIME                          DATE                    IN 
ETIME                          DATE                    IN 
SSCN                           NUMBER                  IN 
DEFAULT_ACTION                 VARCHAR2                IN 
AWR_DB_ID                      NUMBER                  IN 
AWR_BSNAP                      NUMBER                  IN 
WID                            VARCHAR2                IN

SQL> select NAME,DATATYPE_STRING,VALUE_STRING from v$sql_bind_capture where sql_id='3h55aw49j63g8'; 

no rows selected

本来以为10046能够把绑定变量这些数值抓出来,但是这里确没有,v$sql_bind_capture也抓不到绑定变量的值。我怀疑和内部的包有关系。所以我们无从判断是哪一个参数的调用类型出了问题。遇到这样的问题也没什么好的办法,以前在自己的Linux环境上搞过database replay,是能够成功的,所以想法就是把Linux上环境的这个包迁移到Hp环境上在试试。于是就用grep去搜索具体是那个文件。搜到了文件名为"prvtwrr.plb"的文件,里面是一堆加密的东西。把这个文件传到HP上运行,运行完后发现,妹的还是报以前的错误。郁闷啊,为什么在我Linux上的环境就可以了,在HP的环境就不行了。走投无路的时候,同事在$ORACLE_HOME目录下面执行了下列命令。

w4sd23pa#[/home/oracle]find $ORACLE_HOME -name "prvtwrr.plb" 
/oracle/app/oracle/product/11.2.0/db_1/rdbms/admin/prvtwrr.plb 
/oracle/app/oracle/product/11.2.0/db_1/.patch_storage/17411249_Apr_21_2014_22_29_21/files/rdbms/admin/prvtwrr.plb 
w4sd23pa#[/home/oracle] 
w4sd23pa#[/home/oracle]ls -l /oracle/app/oracle/product/11.2.0/db_1/rdbms/admin/prvtwrr.plb 
-rw-r--r--   1 oracle     oinstall    151611 Apr 22 13:29 /oracle/app/oracle/product/11.2.0/db_1/rdbms/admin/prvtwrr.plb 
w4sd23pa#[/home/oracle]ls -l /oracle/app/oracle/product/11.2.0/db_1/.patch_storage/17411249_Apr_21_2014_22_29_21/files/rdbms/admin/prvtwrr.plb 
-rw-r--r--   1 oracle     oinstall    149850 Sep 29  2013 /oracle/app/oracle/product/11.2.0/db_1/.patch_storage/17411249_Apr_21_2014_22_29_21/files/rdbms/admin/prvtwrr.plb

执行find命令居然发现能够找到两个文件,而另外一个文件明眼一看就是补丁文件17411249,赶紧用opatch看了一下补丁的情况。发现这个补丁是最近新装的。

w4sd23pa#[/home/oracle]opatch lsinventory 
Oracle Interim Patch Installer version 11.2.0.3.6 
Copyright (c) 2013, Oracle Corporation.  All rights reserved. 

Oracle Home       : /oracle/app/oracle/product/11.2.0/db_1 
Central Inventory : /oracle/app/oraInventory 
   from           : /oracle/app/oracle/product/11.2.0/db_1/oraInst.loc 
OPatch version    : 11.2.0.3.6 
OUI version       : 11.2.0.4.0 
Log file location : /oracle/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/opatch2014-05-27_23-23-16PM_1.log 

Lsinventory Output file location : /oracle/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/lsinv/lsinventory2014-05-27_23-23-16PM.txt 

-------------------------------------------------------------------------------- 
Installed Top-level Products (1): 

Oracle Database 11g                                                  11.2.0.4.0 
There are 1 product(s) installed in this Oracle Home. 

Interim patches (18) : 

Patch  17411249     : applied on Tue May 20 18:41:12 GMT+08:00 2014 
Unique Patch ID:  17551033 
   Created on 21 Apr 2014, 22:29:21 hrs PST8PDT 
   Bugs fixed: 
     17411249

看到了这个信息之后,我们就怀疑是不是安装这个补丁出了问题。如果是的话,我们回退了再试试,于是我们把这个补丁做了一个回退,在readme里面看还要运行一个postinstall.sql脚本,回退完成后在执行预处理的操作发现居然成功了。然后我们再一次安装了这个补丁,运行了postinstall.sql脚本,发现预处理还是成功的。后面问了安装补丁的同事,最终发现是他忘记了执行postinstall.sql脚本。总的来说,这个故事还是很丧心病狂的,如果不去执行一把find命令,根本就不会发现是因为安装了新补丁造成的错误。

分享到: 更多