A Case of Recovering from a Two-Node RAC Crash

IT那活兒 / 1,811 reads

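A crash alert was raised for a two-node RAC database at a customer site. Operations staff restarted the database as an emergency measure and the application recovered. After startup, however, the alert log reported ORA-16191 and ORA-01031, caused by inconsistent password files between the Data Guard primary and standby databases. The errors cleared once the password files were rebuilt.

Analysis of the alert log showed that at 16:32 node 1 found a corrupt block while reading a control file, at 16:33 the instance could no longer read the control file and crashed, and instance 2 was then shut down at 16:35. A check of the control files found no corrupt blocks, so the preliminary conclusion was that a transient control file read failure had triggered a bug.

An SR was opened. SSC staff and the SR back-end specialists jointly confirmed the issue as bug 11698676, a duplicate of bug 9549042, which is fixed in patch 9549042.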

2. Fault analysis and handling

2.1 Fault handling

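At 16:34 on April 5, both nodes of the ssyy database crashed in succession. After connecting urgently, we confirmed that both instances were fully shut down while the listeners were still running. Both instances were brought back with an emergency startup, and the application reconnected to the production database.

After the instances were restarted, the node 1 alert log showed: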

Check that the primary and standby are using a password file

and remote_login_passwordfile is set to SHARED or EXCLUSIVE, 

and that the SYS password is same in the password files.

returning error ORA-16191

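The message indicates that the SYS password files on the primary and standby are inconsistent. We therefore decided to rebuild the password file on the primary and copy the newly generated files to the standby nodes (backing up the original password files first, and changing the SYS password on the primary).

Run the password file creation command on each of the two primary-rac nodes: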

orapwd file=/oracle/db/oracle/product/11.1.0/db/dbs/ssyydb1 entries=5 force=y  password=*********

orapwd file=/oracle/db/oracle/product/11.1.0/db/dbs/ssyydb2 entries=5 force=y  password=*********

Copy ssyydb1 and ssyydb2 to standby-rac node 1 and node 2 respectively.

The primary-rac1 alert log still kept reporting:

Errors in file /oracle/db/diag/rdbms/ssyy/ssyy1/trace/ssyy1_arc2_4134.trc:

ORA-01031: insufficient privileges

PING[ARC2]: Heartbeat failed to connect to standby drdb. Error is 1031.

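At this point primary node 1 could not ship archive logs to standby node 1. According to MOS, ORA-01031 is likewise caused by inconsistent password files between the primary and standby. We suspected that the primary's archiver processes were still using a cached copy of the password file. Archiver processes are non-critical and are restarted automatically after kill -9, so killing them does not affect the running database.

We killed all archiver processes on primary node 1 and node 2 in turn, but node 1 kept reporting ORA-01031.

A sqlplus connection confirmed that the SYS password had been changed on both the primary and the standby.

Check whether the newly generated password files are in use: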

-- primary node

SQL> select * from  v$pwfile_users;

USERNAME                       SYSDB SYSOP SYSAS

------------------------------ ----- ----- -----

SYS                            TRUE  TRUE  FALSE

-- standby node

SQL> select * from  v$pwfile_users;

no rows selected

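Clearly, the primary's password file is in use, while the standby's is not.

A closer look at the standby password files showed that their names did not follow the orapw<$ORACLE_SID> naming rule: the files had kept the names used on the primary, but the standby instance names differ from the primary instance names.

Rename the standby password files: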

mv $ORACLE_HOME/dbs/ssyydb1 $ORACLE_HOME/dbs/orapwdrdb1 

mv $ORACLE_HOME/dbs/ssyydb2 $ORACLE_HOME/dbs/orapwdrdb2
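
The naming rule behind this rename can be checked mechanically. The following is a minimal sketch, assuming a Unix layout in which an instance looks for its password file at $ORACLE_HOME/dbs/orapw<ORACLE_SID>. The function names expected_pwfile and check_pwfile are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the password file path that an instance
# with the given SID is expected to use under the given ORACLE_HOME.
expected_pwfile() {
    # $1 = ORACLE_HOME, $2 = ORACLE_SID (case sensitive)
    printf '%s/dbs/orapw%s\n' "$1" "$2"
}

# Hypothetical helper: report whether that file actually exists.
check_pwfile() {
    f=$(expected_pwfile "$1" "$2")
    if [ -f "$f" ]; then
        echo "OK: $f"
    else
        echo "MISSING: $f"
        return 1
    fi
}
```

On a standby node, calling check_pwfile "$ORACLE_HOME" "$ORACLE_SID" before and after the rename would show whether a file with the expected name is present. It does not verify the file's contents.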

After several minutes of observation, the ORA-01031 error persisted.

Searching MOS, we followed the solution in ORA-1031 for Remote Archive Destination on Primary (Doc ID 733793.1):

1. Make sure parameter REMOTE_LOGIN_PASSWORDFILE is set to EXCLUSIVE or SHARED in both databases.  


2. Copy the password file again from primary : 


a. Defer the log_archive_dest_2 on primary: 

SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2 = DEFER; 


b. Copy/ftp the password file from primary to standby and rename it accordingly on the standby database. Creating the password file on standby with orapwd-utility is not supported for 11g anymore.

Make sure that name of password file on both primary and standby is : orapw<SID>. Name of the password file is case sensitive. If SID of database on standby is prod then name of the password file should be orapwprod, orapwPROD will not work. 


c. Enable the log_archive_dest_2 on primary: 

SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2 = ENABLE; 


d. Switch 2-3 log files on primary : 

SQL> ALTER SYSTEM SWITCH LOGFILE; 


e. Check the status of log_archive_dest_2 on primary. 

SQL> SELECT STATUS,ERROR FROM V$ARCHIVE_DEST WHERE DEST_ID =2; 

STATUS    ERROR 

--------- ----------------------------------------------------------------- 

VALID 
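
The primary-side statements in steps a, c, d and e can be collected into a small generator so they are issued in the same order each time. This is only a sketch: dest_sql is a hypothetical function name, it prints SQL text and runs nothing against a database, and the choice of three log switches is an assumption within the "2-3" range quoted above.

```shell
#!/bin/sh
# Hypothetical helper: emit the primary-side SQL for steps a, c, d and e
# for a given LOG_ARCHIVE_DEST_n number. It only prints text.
dest_sql() {
    # $1 = phase (defer | resume), $2 = destination number, e.g. 2
    case "$1" in
        defer)
            # Step a: stop shipping redo to the destination.
            echo "ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_$2 = DEFER;"
            ;;
        resume)
            # Step c: re-enable the destination.
            echo "ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_$2 = ENABLE;"
            # Step d: the note asks for 2-3 switches; three are emitted.
            for i in 1 2 3; do
                echo "ALTER SYSTEM SWITCH LOGFILE;"
            done
            # Step e: check the destination status.
            echo "SELECT STATUS, ERROR FROM V\$ARCHIVE_DEST WHERE DEST_ID = $2;"
            ;;
        *)
            echo "usage: dest_sql defer|resume N" >&2
            return 1
            ;;
    esac
}
```

One possible use is to pipe dest_sql defer 2 into sqlplus -S "/ as sysdba" on the primary, copy and rename the password file on the standby (step b), and then pipe dest_sql resume 2 the same way.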

We kept watching the primary alert logs. After ORA-01031 had continued for another 3 to 5 minutes, both primary nodes shipped archive logs to the standby nodes normally, the standby instances applied them normally, and neither ORA-01031 nor ORA-16191 reappeared in the alert logs of primary node 1 and node 2.

At this point the fault was fully resolved.

2.2 Crash analysis

First, the syslog on both nodes was checked and showed nothing abnormal, which rules out host-level causes.

Instance 1 alert log:

Fri Apr 05 15:58:52 2013

Archived Log entry 34220 added for thread 1 sequence 12072 ID 0x9441c6d1 dest 1:

Fri Apr 05 16:32:39 2013

Read from controlfile member /dev/oravg/rlv_cntl1 has found a corrupted block (blk# 4, cf seq# 0)

Hex dump of (file 0, block 4) in trace file /oracle/db/diag/rdbms/ssyy/ssyy1/trace/ssyy1_lmon_22418.trc

Corrupt block relative dba: 0x00000004 (file 0, block 4)

Bad check value found during control file block read

Data in bad block:

type: 21 format: 2 rdba: 0x00000004

last change scn: 0x0000.00000000 seq: 0x1 flg: 0x04

spare1: 0x0 spare2: 0x0 spare3: 0x0

consistency value in tail: 0x00001501

check value in block header: 0x8f5d

computed block checksum: 0x2

Re-read from controlfile member /dev/oravg/rlv_cntl1 returned valid block 4

Hex dump of (file 0, block 4) in trace file /oracle/db/diag/rdbms/ssyy/ssyy1/trace/ssyy1_lmon_22418.trc

Errors in file /oracle/db/diag/rdbms/ssyy/ssyy1/trace/ssyy1_lmon_22418.trc:

ORA-00202: control file: /dev/oravg/rlv_cntl1

Errors in file /oracle/db/diag/rdbms/ssyy/ssyy1/trace/ssyy1_lmon_22418.trc  (incident=888259):

ORA-00227: corrupt block detected in control file: (block 4, # blocks 1)

ORA-00202: control file: /dev/oravg/rlv_cntl1

Incident details in: /oracle/db/diag/rdbms/ssyy/ssyy1/incident/incdir_888259/ssyy1_lmon_22418_i888259.trc

Fri Apr 05 16:33:24 2013

Errors in file /oracle/db/diag/rdbms/ssyy/ssyy1/trace/ssyy1_lmon_22418.trc:

ORA-00227: corrupt block detected in control file: (block 4, # blocks 1)

ORA-00202: control file: /dev/oravg/rlv_cntl1

LMON (ospid: 22418): terminating the instance due to error 227

At 16:32:39, instance 1 hit an error while reading control file /dev/oravg/rlv_cntl1 and found a corrupt block.

At 16:33:24, instance 1 crashed because it could not read the control file correctly.
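
A timeline like the one above can be pulled from an alert log with a short awk filter. The sketch below assumes the log uses standalone timestamp lines in the form shown in the excerpt (for example "Fri Apr 05 16:32:39 2013") and that error lines begin with an ORA- code. The function name ora_timeline is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "HH:MM:SS ORA-nnnnn" for every ORA- line in
# an alert log, using the most recent timestamp line seen before it.
ora_timeline() {
    # $1 = path to the alert log
    awk '
        # Timestamp line, e.g. "Fri Apr 05 16:32:39 2013": keep the time.
        /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun) [A-Z][a-z][a-z] [0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9] [0-9][0-9][0-9][0-9][ \t]*$/ {
            ts = $4
            next
        }
        # Error line, e.g. "ORA-00227: corrupt block detected ...".
        /^ORA-[0-9]+/ {
            code = $1
            sub(/:$/, "", code)   # drop the trailing colon
            print ts, code
        }
    ' "$1"
}
```

Run against the instance 1 alert log, this lists each ORA- error next to the time of the entry it belongs to, which makes the gap between the first corrupt-block report and the instance termination easy to see.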

The three control files were checked and no corrupt blocks were found.

ssyy1: dbv file=/dev/datavg02/rlv_cntl1 blocksize=16384

ssyy1: dbv file=/dev/datavg02/rlv_cntl2 blocksize=16384

ssyy1: dbv file=/dev/datavg02/rlv_cntl3 blocksize=16384

     

The node 2 crsd.log shows that at 16:35:23, because the database had gone offline abnormally, CRS stopped instance 2.

2013-04-05 16:32:42.179: [  CRSRES][6345673] Resource recovery not purged:ora.ssyy.ssyy2.inst

2013-04-05 16:32:42.205: [  CRSRES][6345673] ora.ssyy.ssyy2.inst target set to OFFLINE before stop action

2013-04-05 16:32:42.206: [  CRSRES][6345673] StopResource: setting CLI values

2013-04-05 16:32:42.252: [  CRSRES][6345673] Attempting to stop `ora.ssyy.ssyy2.inst` on member `ssyy2`

2013-04-05 16:33:40.826: [    CRSD][54] SM: rE2Ec: 4

2013-04-05 16:33:40.896: [  CRSRES][6345681] ora.ssyy.db target set to OFFLINE before stop action

2013-04-05 16:33:40.896: [  CRSRES][6345681] StopResource: setting CLI values

2013-04-05 16:33:42.288: [    CRSD][6345681] SM:dE2Ec: all E2E cmds done. 0

2013-04-05 16:35:23.123: [  CRSRES][6345695] Resource recovery not purged:ora.ssyy.db

2013-04-05 16:35:23.124: [  CRSRES][6345695] `ora.ssyy.db` is already OFFLINE.

2013-04-05 16:35:23.173: [  CRSRES][6345673] Stop of `ora.ssyy.ssyy2.inst` on member `ssyy2` succeeded.

     

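We initially suspected a bug and opened an SR. SSC staff and the SR back-end specialists jointly confirmed that the crash matched bug 11698676.

That bug is a duplicate of bug 9549042, and patch 9549042 is available for the current platform, HP-UX Itanium 64-bit.

2.3 Solution

Oracle's official recommendation is to apply patch 9549042 as soon as possible to prevent this crash from recurring.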

