Oracle 9i RAC: CMCLI ERROR: OpenCommPort: connect failed with error 111

One node of a 9i RAC cluster would not come up. Attempting to start the database produced the following errors.

[oracle@gzgaswebdb2 ~]$ sqlplus "/ as sysdba"

SQL*Plus: Release 9.2.0.8.0 - Production on Tue Feb 28 23:04:29 2012

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

Connected to an idle instance.

SQL> startup
CMCLI ERROR: OpenCommPort: connect failed with error 111.
CMCLI ERROR: OpenCommPort: connect failed with error 111.
CMCLI ERROR: OpenCommPort: connect failed with error 111.
ORACLE instance started.

Total System Global Area 1669929032 bytes
Fixed Size             452680 bytes
Variable Size          704643072 bytes
Database Buffers      956301312 bytes
Redo Buffers            8531968 bytes
ORA-32700: error occurred in DIAG Group Service

SQL> exit

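The main cause of this error is that oracm (Oracle Cluster Manager) is not running: the operating system had been rebooted, and oracm was not started automatically at boot. On Linux, error 111 is ECONNREFUSED, which is consistent with nothing listening for the cluster manager connection. The following steps opened the database successfully.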

1. Verify that the hangcheck-timer module is loaded. As root, run:

[root@gzgaswebdb2 tmp]# lsmod | grep hang
hangcheck_timer         7897  0

2. Start oracm as root.

[root@gzgaswebdb2 tmp]# cd /u01/app/oracle/product/920/oracm/bin
[root@gzgaswebdb2 bin]# ./ocmstart.sh

oracm </dev/null 2>&1 >/u01/app/oracle/product/920/oracm/log/cm.out &

3. Start the database.

[oracle@gzgaswebdb2 ~]$ sqlplus "/ as sysdba"

SQL*Plus: Release 9.2.0.8.0 - Production on Tue Feb 28 23:22:15 2012

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 1669929032 bytes
Fixed Size             452680 bytes
Variable Size          704643072 bytes
Database Buffers      956301312 bytes
Redo Buffers            8531968 bytes
Database mounted.
Database opened.
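
Before attempting the startup in step 3, the two preconditions from steps 1 and 2 can be checked with a couple of small helpers. This is only a sketch: the function names `hangcheck_loaded` and `count_oracm` are hypothetical and not part of any Oracle tooling, and both read text on stdin so they can also be tried against saved command output.

```shell
#!/bin/sh
# Hypothetical helpers (not Oracle-supplied). Both read text on stdin.

# hangcheck_loaded: succeeds if `lsmod`-style input contains a line for the
# hangcheck timer module (written with either an underscore or a hyphen).
hangcheck_loaded() {
    grep -Eq '^hangcheck[_-]timer[[:space:]]'
}

# count_oracm: prints how many input lines are exactly "oracm".
# Expected input: one command name per line, as from `ps -e -o comm=`.
count_oracm() {
    # grep -c exits non-zero when the count is 0; mask that so callers
    # can rely on the printed number alone.
    grep -cx 'oracm' || true
}
```

On a live node these might be used as `lsmod | hangcheck_loaded || echo "hangcheck-timer not loaded"` and `ps -e -o comm= | count_oracm`. A count of 0 after running ocmstart.sh would suggest that oracm did not stay up, in which case the cm.out log shown in step 2 is the first place to look.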

[Problem] Normally the steps above work without any trouble, but in my case running ocmstart.sh to start oracm failed with an error.

[root@gzgaswebdb2 bin]# ./ocmstart.sh
ocmstart.sh: Error: Restart is too frequent
ocmstart.sh: Info:  Check the system configuration and fix the problem.
ocmstart.sh: Info:  After you fixed the problem, remove the timestamp file
ocmstart.sh: Info:  "/u01/app/oracle/product/920/oracm/log/ocmstart.ts"

After hitting this, I looked at the contents of the ocmstart.sh script. The following parts are worth analyzing.

# Compare the timestamps
if test $limit -gt $current
then
    echo "ocmstart.sh: Error: Restart is too frequent"
    echo "ocmstart.sh: Info:  Check the system configuration and" \
        "fix the problem."

The script is comparing timestamps: if $limit is greater than the current timestamp, it prints the error we just ran into. I then looked at how $limit is defined.

if test -r $TIMESTAMP_FILE
then
    timestamp=`date -r $TIMESTAMP_FILE '+%s'`
    limit=`expr $timestamp + $norestart_args`
else
    limit=0
fi

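Here the script first tries to obtain a timestamp from a file (the file's modification time, via date -r) and then computes limit by adding $norestart_args to it. The timestamp comes from the file $ORACLE_HOME/oracm/log/ocmstart.ts: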

# Timestampe file name
TIMESTAMP_FILE=$ORACLE_HOME/oracm/log/ocmstart.ts
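
Putting the two excerpts together, the throttle can be reproduced in isolation with a sketch like the one below. It is not the ocmstart.sh code itself: `restart_too_frequent` is a hypothetical name, the `WINDOW` argument stands in for `$norestart_args` (whose actual value is not shown in the excerpts), and `date -r FILE` is assumed to be the GNU form that reports a file's modification time.

```shell
#!/bin/sh
# restart_too_frequent FILE WINDOW [NOW]
# Hypothetical re-creation of the comparison quoted from ocmstart.sh:
#   limit = mtime(FILE) + WINDOW, or 0 if FILE is not readable.
# Exits 0 when limit > NOW (the case in which ocmstart.sh refuses to start).
# WINDOW is a placeholder for $norestart_args; NOW defaults to the current time.
restart_too_frequent() {
    file=$1
    window=$2
    now=${3:-$(date '+%s')}
    if test -r "$file"; then
        # GNU date: -r FILE prints the file's last-modification time
        timestamp=$(date -r "$file" '+%s')
        limit=$((timestamp + window))
    else
        limit=0
    fi
    test "$limit" -gt "$now"
}
```

For example, `restart_too_frequent /u01/app/oracle/product/920/oracm/log/ocmstart.ts 600 && echo "would be refused"` would show whether a 600-second window (an arbitrary illustrative value) is still in effect for the timestamp file named in the error message.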

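My judgment was that inconsistent clocks caused the problem. I rebooted the server, compared the time on the two nodes after the reboot, and was finally able to open the other node successfully.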
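
One way to test the clock hypothesis directly, rather than inferring it after a reboot, is to see how far the timestamp file's modification time lies from the current system time. The helper below is a sketch under assumptions: `mtime_skew` is a hypothetical name, and GNU `date -r FILE` is assumed. Note also that the error text from ocmstart.sh itself says to remove the timestamp file once the underlying problem is fixed.

```shell
#!/bin/sh
# mtime_skew FILE [NOW]
# Hypothetical helper: prints mtime(FILE) - NOW in seconds.
# A positive result means the file is stamped in the future relative to NOW,
# which would keep limit above the current time in the check quoted earlier.
mtime_skew() {
    now=${2:-$(date '+%s')}
    # GNU date: -r FILE prints the file's last-modification time
    mtime=$(date -r "$1" '+%s') || return 1
    echo $((mtime - now))
}
```

Running `mtime_skew /u01/app/oracle/product/920/oracm/log/ocmstart.ts` on the affected node and comparing `date` output between the two nodes would show whether the file is stamped ahead of the local clock.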

Reference: Linux AS 2.1 RAC Startup Fails With ORA-32700: Error Occurred In DIAG Group Service [ID 224422.1]
