一次ORA-00600[ktspfupdst-1]和ORA-00474故障解决

  问题是这样的,客户的数据库启动不久后,就会自动Down掉,alter日志如下:

Errors in file /oracle/admin/hdpos/udump/hdpos_ora_20580.trc:
ORA-00600: internal error code, arguments: [ktspfupdst-1], [], [], [], [], [], [], []
Thu Apr 26 12:52:16 2012
Errors in file /oracle/admin/hdpos/bdump/hdpos_smon_19673.trc:
ORA-00600: Message 600 not found; No message file for product=RDBMS, facility=ORA; arguments: [ktspfupdst-1]
Thu Apr 26 12:52:23 2012
Errors in file /oracle/admin/hdpos/bdump/hdpos_pmon_19659.trc:
ORA-00474: Message 474 not found; No message file for product=RDBMS, facility=ORA
Thu Apr 26 12:52:23 2012
PMON: terminating instance due to error 474
Instance terminated by PMON, pid = 19659

  这里我们可以看到首先是报ORA-00600[ktspfupdst-1]错误,然后导致smon进程失败,最终导致数据库宕机.主要看smon进程的这个trace文件.

*** SESSION ID:(1099.1) 2012-04-26 12:52:38.132
Parallel Transaction recovery caught exception 12801
Parallel Transaction recovery caught error 600
*** 2012-04-26 12:52:42.222
SMON: Parallel transaction recovery slave got internal error
SMON: Downgrading transaction recovery to serial
Dead transaction 0x0017.011.0000082e recovered by SMON
Dumping current redo log in thread 1

  在smon的trace中我们可以看到系统SMON进程在做Parallel Transaction recover的时候出现错误,报Ora-12801和Ora-00600错误.而trace中运行的sql语句如下所示:

ORA-00600: Message 600 not found; No message file for product=RDBMS, facility=ORA; arguments: [4097]
Current SQL statement for this session:
insert into smon_scn_time (thread, time_mp, time_dp, scn, scn_wrp, scn_bas,  num_mappings, tim_scn_map) values (0, :1, :2, :3, :4, :5, :6, :7)

  这里我们可以看到是在对SMON_SCN_TIME进行insert操作的时候报错.这个和MOS上的文章SMON Signals ORA-00600 [ktspfupdst-1] and Instance Terminates [ID 1353076.1]类似.所以可以断定是SMON_SCN_TIME这个表corrupted了,只要重新create这个表就行了.SMON_SCN_TIME这个表记录了SCN对应的时间戳.Oracle对SMON_SCN_TIME的描述如下:
SMON_SCN_TIME is updated by SMON every 5 minutes with a timestamp and the current SCN.
It holds 5 days worth of records (1440); data older than 5 days is aged out. This table makes it possible to roughly find an SCN for a point in time in the last 5 days
  关于怎么重建该表,我们需要查看sql.bsq文件,在($ORACLE_HOME/rdbms/admin/sql.bsq)中:

create cluster smon_scn_to_time (
  thread number                         /* thread, compatibility */
)
/
create index smon_scn_to_time_idx on cluster smon_scn_to_time
/
create table smon_scn_time (
  thread number,                         /* thread, compatibility */
  time_mp number,                        /* time this recent scn represents */
  time_dp date,                          /* time as date, compatibility */
  scn_wrp number,                        /* scn.wrp, compatibility */
  scn_bas number,                        /* scn.bas, compatibility */
  num_mappings number,
  tim_scn_map raw(1200),
  scn number default 0,                  /* scn */
  orig_thread number default 0           /* for downgrade */
) cluster smon_scn_to_time (thread)
/
create unique index smon_scn_time_tim_idx on smon_scn_time(time_mp)
/
create unique index smon_scn_time_scn_idx on smon_scn_time(scn)
/

  创建完成之后,我们还可以validate这些对象.

SQL> analyze table SMON_SCN_TIME validate structure;
SQL> analyze table SMON_SCN_TIME validate structure cascade;
分享到: 更多

Post a Comment

Your email is never published nor shared. Required fields are marked *