问题是这样的,客户的数据库启动不久后,就会自动Down掉,alter日志如下:
Errors in file /oracle/admin/hdpos/udump/hdpos_ora_20580.trc: ORA-00600: internal error code, arguments: [ktspfupdst-1], [], [], [], [], [], [], [] Thu Apr 26 12:52:16 2012 Errors in file /oracle/admin/hdpos/bdump/hdpos_smon_19673.trc: ORA-00600: Message 600 not found; No message file for product=RDBMS, facility=ORA; arguments: [ktspfupdst-1] Thu Apr 26 12:52:23 2012 Errors in file /oracle/admin/hdpos/bdump/hdpos_pmon_19659.trc: ORA-00474: Message 474 not found; No message file for product=RDBMS, facility=ORA Thu Apr 26 12:52:23 2012 PMON: terminating instance due to error 474 Instance terminated by PMON, pid = 19659
这里我们可以看到首先是报ORA-00600[ktspfupdst-1]错误,然后导致smon进程失败,最终导致数据库宕机.主要看smon进程的这个trace文件.
*** SESSION ID:(1099.1) 2012-04-26 12:52:38.132 Parallel Transaction recovery caught exception 12801 Parallel Transaction recovery caught error 600 *** 2012-04-26 12:52:42.222 SMON: Parallel transaction recovery slave got internal error SMON: Downgrading transaction recovery to serial Dead transaction 0x0017.011.0000082e recovered by SMON Dumping current redo log in thread 1
在smon的trace中我们可以看到系统SMON进程在做Parallel Transaction recover的时候出现错误,报Ora-12801和Ora-00600错误.而trace中运行的sql语句如下所示:
ORA-00600: Message 600 not found; No message file for product=RDBMS, facility=ORA; arguments: [4097] Current SQL statement for this session: insert into smon_scn_time (thread, time_mp, time_dp, scn, scn_wrp, scn_bas, num_mappings, tim_scn_map) values (0, :1, :2, :3, :4, :5, :6, :7)
这里我们可以看到是在对SMON_SCN_TIME进行insert操作的时候报错.这个和MOS上的文章SMON Signals ORA-00600 [ktspfupdst-1] and Instance Terminates [ID 1353076.1]类似.所以可以断定是SMON_SCN_TIME这个表corrupted了,只要重新create这个表就行了.SMON_SCN_TIME这个表记录了SCN对应的时间戳.Oracle对SMON_SCN_TIME的描述如下:
SMON_SCN_TIME is updated by SMON every 5 minutes with a timestamp and the current SCN.
It holds 5 days worth of records (1440); data older than 5 days is aged out. This table makes it possible to roughly find an SCN for a point in time in the last 5 days
关于怎么重建该表,我们需要查看sql.bsq文件,在($ORACLE_HOME/rdbms/admin/sql.bsq)中:
create cluster smon_scn_to_time ( thread number /* thread, compatibility */ ) / create index smon_scn_to_time_idx on cluster smon_scn_to_time / create table smon_scn_time ( thread number, /* thread, compatibility */ time_mp number, /* time this recent scn represents */ time_dp date, /* time as date, compatibility */ scn_wrp number, /* scn.wrp, compatibility */ scn_bas number, /* scn.bas, compatibility */ num_mappings number, tim_scn_map raw(1200), scn number default 0, /* scn */ orig_thread number default 0 /* for downgrade */ ) cluster smon_scn_to_time (thread) / create unique index smon_scn_time_tim_idx on smon_scn_time(time_mp) / create unique index smon_scn_time_scn_idx on smon_scn_time(scn) /
创建完成之后,我们还可以validate这些对象.
SQL> analyze table SMON_SCN_TIME validate structure; SQL> analyze table SMON_SCN_TIME validate structure cascade;
Post a Comment