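A RAC Node Failure: Handling and Recovery (Part 1)

IT那活兒 / 3194 reads

The Oracle RAC database of one system has two nodes. The local disk of node 2 failed and the /u01 directory could no longer be opened, so all grid and oracle software on node 2 was lost. This article records the troubleshooting and recovery process for node 2. The same procedure also applies to deleting and adding nodes in general.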





Notes




The production RAC has two nodes. The overall procedure follows the official Oracle documentation:
https://docs.oracle.com/cd/E11882_01/rac.112/e41960/affffdelunix.htm#RACAD7358
The environment is as follows:

Node names:               wbtdb1/wbtdb2
Database instance names:  wbtdb1/wbtdb2
Operating system:         Linux 6.X
Database version:         Oracle 11.2.0.4
Abnormal condition:

grid: GRID_HOME is exposed through the ORACLE_HOME variable; its path is /u01/11.2.0/grid

oracle: ORACLE_HOME is /u01/app/oracle/product/11.2.0/db_1

Base and home of the grid user:
[root@wbtdb1 ~]# su - grid
[grid@wbtdb1 ~]$ echo $ORACLE_HOME
/u01/11.2.0/grid
[grid@wbtdb1 ~]$ echo $ORACLE_BASE
/u01/app/oracle

Base and home of the oracle user:
[root@wbtdb2 ~]# su - oracle
[oracle@wbtdb2 ~]$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0/db_1
[oracle@wbtdb2 ~]$ echo $ORACLE_BASE
/u01/app/oracle
[oracle@wbtdb2 ~]






Checking the status of node 2





On node 2 the software is gone. No Oracle command can be run any more because the Oracle software directories are damaged.

[grid@wbtdb2 ~]$ crsctl stat res -t
-bash: crsctl: command not found
[grid@wbtdb2 ~]$
Seen from node 1, however, the status still looks normal:
[grid@wbtdb1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCHDG.dg
               ONLINE ONLINE wbtdb1
               ONLINE ONLINE wbtdb2
ora.DATADG.dg
               ONLINE ONLINE wbtdb1
               ONLINE ONLINE wbtdb2
ora.LISTENER.lsnr
               ONLINE ONLINE wbtdb1
               ONLINE INTERMEDIATE wbtdb2
ora.OCRVOTING.dg
               ONLINE ONLINE wbtdb1
               ONLINE ONLINE wbtdb2
ora.asm
               ONLINE ONLINE wbtdb1 Started
               ONLINE ONLINE wbtdb2 Started
ora.gsd
               OFFLINE OFFLINE wbtdb1
               OFFLINE OFFLINE wbtdb2
ora.net1.network
               ONLINE ONLINE wbtdb1
               ONLINE ONLINE wbtdb2
ora.ons
               ONLINE ONLINE wbtdb1
               ONLINE INTERMEDIATE wbtdb2
ora.registry.acfs
               ONLINE ONLINE wbtdb1
               ONLINE ONLINE wbtdb2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1 ONLINE ONLINE wbtdb1
ora.cvu
      1 ONLINE ONLINE wbtdb2
ora.oc4j
      1 ONLINE INTERMEDIATE wbtdb2
ora.scan1.vip
      1 ONLINE ONLINE wbtdb1
ora.wbtdb.db
      1 ONLINE ONLINE wbtdb1 Open
      2 ONLINE ONLINE wbtdb2 Open
ora.wbtdb1.vip
      1 ONLINE ONLINE wbtdb1
ora.wbtdb2.vip
      1 ONLINE ONLINE wbtdb2
[grid@wbtdb1 ~]$
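
The listing above is long, and the interesting part is what the cluster still reports for the failed node. As a minimal sketch (not part of the original procedure), the helper below reads `crsctl stat res -t` output on stdin and prints one "resource state" pair per line for a single server. The function name `resource_states_on` is hypothetical, and the parsing assumes the 11.2 tabular layout shown above.

```shell
# Print "resource state" pairs for one server, reading
# `crsctl stat res -t` output from stdin.
# Assumption: a line starting with "ora." names the resource, and each
# following line lists TARGET STATE SERVER (with an optional leading
# instance number and trailing details), as in the listing above.
resource_states_on() {
  local node="$1"
  awk -v node="$node" '
    $1 ~ /^ora\./ { res = $1; next }      # remember the current resource name
    {
      for (i = 2; i <= NF; i++)           # STATE is the column just before SERVER
        if ($i == node) { print res, $(i - 1); break }
    }
  '
}
```

One possible use is `crsctl stat res -t | resource_states_on wbtdb2`, which for the listing above would include lines such as `ora.LISTENER.lsnr INTERMEDIATE`.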



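Outline of the procedure

  1. Delete the Oracle instance of node 2 and update the node list of the Oracle home

  2. Update the GRID_HOME cluster node list
  3. Stop and remove the listener, and delete node 2 from the cluster
  4. Remove the VIP information and delete node 2
  5. Add node 2 back from node 1

1. Deleting the Oracle instance of node 2 and updating the node list of the Oracle home


1.1  Deleting with the dbca GUI

Node 2's server is broken, so run dbca as the oracle user on node 1: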

[root@wbtdb1 ~]# xhost +
access control disabled, clients can connect from any host
[root@wbtdb1 ~]# export DISPLAY=192.168.1.234:0.0
[root@wbtdb1 ~]# su - oracle
[oracle@wbtdb1 ~]$ xhost +
access control disabled, clients can connect from any host
xhost: must be on local machine to enable or disable access control.
[oracle@wbtdb1 ~]$
[oracle@wbtdb1 ~]$ dbca

Then follow the dbca wizard to delete the instance on node 2.


Method 2:
 Delete the Oracle instance of node 2 silently
dbca -silent -deleteInstance [-nodeList node_name] -gdbName gdb_name -instanceName instance_name -sysDBAUserName sysdba -sysDBAPassword password

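For -gdbName gdb_name, gdb_name is the global_name. It can be looked up with:
select * from global_name;
node_name      the node whose instance is being deleted
gdb_name       the global database name
instance_name  the name of the instance to delete
sysdba         the name of an Oracle user with SYSDBA privilege
password       the password of that SYSDBA user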

[oracle@wbtdb1 ~]$ dbca -silent -deleteInstance -nodeList wbtdb2 -gdbName wbtdb -instanceName wbtdb2 -sysDBAUserName sys -sysDBAPassword oracle
Deleting instance
1% complete
2% complete
6% complete
13% complete
20% complete
26% complete
33% complete
40% complete
46% complete
53% complete
60% complete
66% complete
Completing instance management.
100% complete
Look at the log file "/u01/app/oracle/cfgtoollogs/dbca/wbtdb.log" for further details.
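
To make the mapping between the syntax line and the actual run explicit, here is a small sketch that assembles the silent deletion command from its parts and prints it without running anything. The function name `build_dbca_delete_cmd` and the variable `SYS_PASSWORD` are hypothetical; only the dbca options that appear above are used.

```shell
# Assemble (but do not run) the silent instance-deletion command.
# Arguments: node name, global database name, instance name, SYSDBA user.
# SYS_PASSWORD is a hypothetical variable: it is printed as a literal
# reference so the password itself does not appear in the printed text.
build_dbca_delete_cmd() {
  local node="$1" gdb="$2" inst="$3" user="$4"
  printf 'dbca -silent -deleteInstance -nodeList %s -gdbName %s -instanceName %s -sysDBAUserName %s -sysDBAPassword "$SYS_PASSWORD"\n' \
    "$node" "$gdb" "$inst" "$user"
}
```

Calling `build_dbca_delete_cmd wbtdb2 wbtdb wbtdb2 sys` prints the command used in this case, which can be reviewed before it is executed as the oracle user on node 1.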



 1.2   Updating the node list of the Oracle home

On node 1, switch to the oracle user:

[oracle@wbtdb1 db_1]$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0/db_1
[oracle@wbtdb1 db_1]$ cd $ORACLE_HOME/oui/bin
[oracle@wbtdb1 bin]$ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1 "CLUSTER_NODES={wbtdb1}"    
-- list the nodes to keep (the healthy nodes) in CLUSTER_NODES
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB. Actual 4093 MB Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
UpdateNodeList was successful.


 1.3   Verifying after deletion

Check the active instances:
set line 200
select thread#,status,instance from v$thread;
   THREAD# STATUS INSTANCE
---------- ----------- ---------------
    1  OPEN wbtdb1


If redo logs for node 2 still remain, use the following command:

ALTER DATABASE DISABLE THREAD 2;
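
The check above can be automated when it has to be repeated. The sketch below, offered under the assumption that the query output has the three-column shape shown above, reads the `v$thread` query result on stdin and reports through its exit status whether a given thread number is still listed. The function name `thread_listed` is hypothetical.

```shell
# Exit 0 if the given thread number appears in the output of
#   select thread#,status,instance from v$thread;
# read from stdin, and exit 1 otherwise.
# Assumption: data rows have at least three columns and start with the
# thread number; header and separator rows never match.
thread_listed() {
  local thread="$1"
  awk -v t="$thread" '$1 == t && NF >= 3 { found = 1 } END { exit !found }'
}
```

If `thread_listed 2` succeeds on the saved query output, thread 2 is still present, which is the case the DISABLE THREAD command above is meant for.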


Verify the database information in the OCR. The syntax is:

srvctl config database -d db_unique_name


For example:
[oracle@wbtdb1 ~]$ srvctl config database -d wbtdb
Database unique name: wbtdb
Database name: wbtdb
Oracle home: /u01/app/oracle/product/11.2.0/db_1
Oracle user: oracle
Spfile: +DATADG/wbtdb/spfilewbtdb.ora
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: wbtdb
Database instances: wbtdb1
Disk Groups: DATADG
Mount point paths:
Services:
Type: RAC
Database is administrator managed
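
The line to look at in this output is "Database instances", which should no longer contain wbtdb2. A minimal sketch of that check follows; it reads the `srvctl config database` output on stdin, and the function name `instance_registered` is hypothetical. It assumes the instances are printed as a comma-separated list on that line.

```shell
# Exit 0 if the named instance is listed on the "Database instances:"
# line of `srvctl config database -d <db>` output (stdin), else exit 1.
# Assumption: multiple instances are separated by commas on that line.
instance_registered() {
  local inst="$1"
  awk -F': *' -v inst="$inst" '
    $1 == "Database instances" {
      n = split($2, list, ",")
      for (i = 1; i <= n; i++) {
        gsub(/[ \t\r]/, "", list[i])      # drop stray whitespace
        if (list[i] == inst) found = 1
      }
    }
    END { exit !found }
  '
}
```

For example, `srvctl config database -d wbtdb | instance_registered wbtdb2` should fail (non-zero exit status) once the instance has been deleted.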


2. Updating the GRID_HOME cluster node list

2.1  Stopping and removing the listener

On node 1, as the grid user:

[grid@wbtdb1 ~]$ srvctl status listener -l listener -n wbtdb2
Listener LISTENER is enabled on node(s): wbtdb2
Listener LISTENER is running on node(s): wbtdb2
Run the following:
[grid@wbtdb1 ~]$ srvctl disable listener -l listener -n wbtdb2
[grid@wbtdb1 ~]$ srvctl stop listener -l listener -n wbtdb2
PRCR-1014 : Failed to stop resource ora.LISTENER.lsnr
PRCR-1065 : Failed to stop resource ora.LISTENER.lsnr
CRS-2675: Stop of ora.LISTENER.lsnr on wbtdb2 failed
CRS-2678: ora.LISTENER.lsnr on wbtdb2 has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.

[grid@wbtdb1 ~]$ srvctl status listener -l listener -n wbtdb2
Listener LISTENER is disabled on node(s): wbtdb2
Listener LISTENER is not running on node(s): wbtdb2
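
2.2   Updating the GRID_HOME cluster node list
On every remaining node, run the following command from $ORACLE_HOME/oui/bin as the software owner to update the inventory on those nodes, giving a comma-separated list of the remaining node names (run it whether the node was removed normally or after a failure):
(The purpose of this step is to update the cluster information on the remaining nodes. Although the software directories on node 2 have been deleted, a query from node 1 still shows node 2's ASM instance and other resources as normal. This command clears that stale display.)
Command from the official documentation:
$./runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={remaining_node_list}"
Run it on all remaining nodes. Only one node is left here, so it is run on node 1 (as the grid user):
cd $ORACLE_HOME/oui/bin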
[grid@wbtdb1 bin]$ ./runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={wbtdb1}"
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB. Actual 4095 MB Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
UpdateNodeList was successful.


3. Deleting the node from the cluster

From the official documentation: https://docs.oracle.com/cd/E11882_01/rac.112/e41959/affffdelclusterware.htm#CWADD90992

3.1  Confirming that the node is Unpinned

Run as root or grid:

[grid@wbtdb1 bin]$ olsnodes -s -t
wbtdb1 Active Unpinned
wbtdb2 Active Unpinned
[grid@wbtdb1 bin]$
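If the node to be deleted is pinned, run the unpin command below manually as root.

Special reminder: much of the material found online is wrong. If the node is already Unpinned, the unpin command does not need to be run at all.

It was not needed for this incident. The syntax is:
crsctl unpin css -n node_name

For example:

crsctl unpin css -n wbtdb2
/u01/11.2.0/grid/bin/crsctl unpin css -n wbtdb2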

crsctl status res -t
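
The decision in this step (unpin only when the node is pinned) can be sketched as a small filter over `olsnodes -s -t` output. This is an illustration rather than part of the original procedure. The function names `node_pin_state` and `unpin_cmd_if_needed` are hypothetical, and the code only prints the command instead of running it.

```shell
# Print the pin state (last column) of one node from `olsnodes -s -t`
# output read on stdin, e.g. "wbtdb2 Active Unpinned" -> "Unpinned".
node_pin_state() {
  local node="$1"
  awk -v node="$node" '$1 == node { print $NF }'
}

# Print the unpin command only when the node is reported as Pinned.
# Nothing is executed; the caller decides whether to run it as root.
unpin_cmd_if_needed() {
  local node="$1" state
  state=$(node_pin_state "$node")
  if [ "$state" = "Pinned" ]; then
    echo "crsctl unpin css -n $node"
  fi
}
```

With the output shown in 3.1, `olsnodes -s -t | unpin_cmd_if_needed wbtdb2` prints nothing, which matches the point that no unpin is needed for an Unpinned node.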

3.2   Removing the VIP information

First stop the VIP of node 2 (vip_name is the VIP name listed in /etc/hosts, for example rac2-vip):

srvctl stop vip -i vip_name -f
As root:
[root@wbtdb1 ~]# /u01/11.2.0/grid/bin/srvctl stop vip -i wbtdb2-vip -f
[root@wbtdb1 ~]#
Remove the VIP information: srvctl remove vip -i vip_name -f
[root@wbtdb1 ~]# /u01/11.2.0/grid/bin/srvctl remove vip -i wbtdb2-vip -f

Check the VIPs:

/u01/11.2.0/grid/bin/crsctl status res -t
Only the VIP of node 1 remains.
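
Rather than scanning the whole resource table by eye, the VIP check can be reduced to one test. The sketch below assumes the VIP resource is named ora.<node>.vip, as in the listings in this article, and reads `crsctl status res -t` output on stdin. The function name `vip_resource_present` is hypothetical.

```shell
# Exit 0 if a VIP resource line "ora.<node>.vip" exists in the
# `crsctl status res -t` output read from stdin, else exit 1.
# Assumption: VIP resources follow the ora.<node>.vip naming seen above.
vip_resource_present() {
  local node="$1"
  grep -qx "ora\.${node}\.vip[[:space:]]*"
}
```

After the removal, `crsctl status res -t | vip_resource_present wbtdb2` is expected to fail, while the same check for wbtdb1 should still succeed.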

3.3  Deleting node 2

[grid@wbtdb1 ~]$ olsnodes -s -t
wbtdb1 Active Unpinned
wbtdb2 Active Unpinned

On the healthy node 1, run the delete command for node 2 as root:

[root@wbtdb1 ~]# /u01/11.2.0/grid/bin/crsctl delete node -n wbtdb2

CRS-4661: Node wbtdb2 successfully deleted.

Verify:

[grid@wbtdb1 ~]$ olsnodes -s -t
wbtdb1 Active Unpinned


Note: if deleting node 2 fails with CRS-4658 and CRS-4000, kill the CRS-related processes on node 2 and run the delete again.

[root@wbtdb1 ~]# /u01/11.2.0/grid/bin/crsctl delete node -n wbtdb2

CRS-4658: The clusterware stack on node wbtdb2 is not completely down.

CRS-4000: Command Delete failed, or completed with errors.


The deletion failed, so check the state of CRS:

[root@wbtdb1 bin]# ./crsctl stat res -t
--------------------------------------------------------------------------------

NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------

Local Resources
--------------------------------------------------------------------------------

ora.ARCHDG.dg
               ONLINE ONLINE wbtdb1
               ONLINE ONLINE wbtdb2
ora.DATADG.dg
               ONLINE ONLINE wbtdb1
               ONLINE ONLINE wbtdb2
ora.LISTENER.lsnr
               ONLINE ONLINE wbtdb1
               OFFLINE UNKNOWN wbtdb2
ora.OCRVOTING.dg
               ONLINE ONLINE wbtdb1
               ONLINE ONLINE wbtdb2
ora.asm
               ONLINE ONLINE wbtdb1 Started
               ONLINE ONLINE wbtdb2 Started
ora.gsd
               OFFLINE OFFLINE wbtdb1
               OFFLINE OFFLINE wbtdb2
ora.net1.network
               ONLINE ONLINE wbtdb1
               ONLINE ONLINE wbtdb2
ora.ons
               ONLINE ONLINE wbtdb1
               OFFLINE UNKNOWN wbtdb2
ora.registry.acfs
               ONLINE ONLINE wbtdb1
               ONLINE ONLINE wbtdb2
--------------------------------------------------------------------------------

Cluster Resources
--------------------------------------------------------------------------------

ora.LISTENER_SCAN1.lsnr
      1 ONLINE ONLINE wbtdb1
ora.cvu
      1 ONLINE ONLINE wbtdb2
ora.oc4j
      1 ONLINE INTERMEDIATE wbtdb2
ora.scan1.vip
      1 ONLINE ONLINE wbtdb1
ora.wbtdb.db
      1 ONLINE ONLINE wbtdb1 Open
ora.wbtdb1.vip
      1 ONLINE ONLINE wbtdb1
[root@wbtdb1 bin]#


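The resources of node 2 have stopped, yet the CRS-related processes on node 2 are still running. The software directory was deleted while CRS was running normally, so the software is gone but the processes were never cleaned up. Once the CRS processes are killed, node 2 can be deleted.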


[root@wbtdb2 ~]# ps -ef|grep d.bin
root      2546     1  4 13:10 ? 00:03:22 /u01/11.2.0/grid/bin/ohasd.bin reboot
grid      3632     1  0 13:11 ? 00:00:06 /u01/11.2.0/grid/bin/oraagent.bin
grid      3643     1  0 13:11 ? 00:00:00 /u01/11.2.0/grid/bin/mdnsd.bin
grid      3673     1  0 13:11 ? 00:00:00 /u01/11.2.0/grid/bin/gpnpd.bin
grid      3683     1  0 13:11 ? 00:00:07 /u01/11.2.0/grid/bin/gipcd.bin
root      3685     1  0 13:11 ? 00:00:06 /u01/11.2.0/grid/bin/orarootagent.bin
root      3698     1  1 13:11 ? 00:00:46 /u01/11.2.0/grid/bin/osysmond.bin
root      3717     1  0 13:11 ? 00:00:02 /u01/11.2.0/grid/bin/cssdmonitor
root      3740     1  0 13:11 ? 00:00:02 /u01/11.2.0/grid/bin/cssdagent
grid      3751     1  0 13:11 ? 00:00:09 /u01/11.2.0/grid/bin/ocssd.bin
root      3974     1  0 13:11 ? 00:00:07 /u01/11.2.0/grid/bin/octssd.bin reboot
grid      3997     1  0 13:11 ? 00:00:07 /u01/11.2.0/grid/bin/evmd.bin
root      4408     1 30 13:12 ? 00:22:08 /u01/11.2.0/grid/bin/crsd.bin reboot
grid      4484  3997  0 13:12 ? 00:00:00 /u01/11.2.0/grid/bin/evmlogger.bin -o /u01/11.2.0/grid/evm/log/evmlogger.info -l /u01/11.2.0/grid/evm/log/evmlogger.log
grid      4519     1 27 13:12 ? 00:19:39 /u01/11.2.0/grid/bin/oraagent.bin
root      4525     1 22 13:12 ? 00:16:20 /u01/11.2.0/grid/bin/orarootagent.bin
grid      4712     1  0 13:12 ? 00:00:00 /u01/11.2.0/grid/bin/scriptagent.bin
grid      4814     1  0 13:12 ? 00:00:00 /u01/11.2.0/grid/bin/tnslsnr LISTENER -inherit
root     23792  5354  0 14:24 pts/0    00:00:00 grep d.bin
[root@wbtdb2 ~]# kill -9 2546 3632 3643 3673 3683 3685 3698 3717 3740 3751 3997 3974 4519 4525 4712 4814
[root@wbtdb2 ~]# ps -ef|grep d.bin
root     30270  5354  0 14:25 pts/0    00:00:00 grep d.bin
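
Typing the PID list by hand is error-prone. Note also that the pattern `d.bin` matches every line above because the dot is a wildcard, so it also matches the "d/bin" in "grid/bin". As a hedged alternative, the sketch below selects PIDs by the command path prefix instead. It reads `ps -ef` output on stdin, and the function name `grid_home_pids` is hypothetical.

```shell
# Print the PIDs of processes whose command starts with <grid_home>/bin/,
# reading `ps -ef` output from stdin.
# Assumption: the standard `ps -ef` layout, where column 2 is the PID and
# column 8 is the start of the command.
grid_home_pids() {
  local grid_home="$1"
  awk -v prefix="$grid_home/bin/" 'index($8, prefix) == 1 { print $2 }'
}
```

For example, `ps -ef | grid_home_pids /u01/11.2.0/grid` lists the candidates on node 2. Review the list before passing it to kill, since the grep process itself is excluded but everything under the Grid home is included.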

3.4  Updating the cluster node information

As the grid user, run the following command on every remaining, healthy node:
[grid@rac11g1 ~]$ cd $ORACLE_HOME/oui/bin
[grid@rac11g1 bin]$ echo $ORACLE_HOME
/u01/11.2.0/grid
$ ./runInstaller -updateNodeList ORACLE_HOME=Grid_home "CLUSTER_NODES={remaining_nodes_list}" CRS=TRUE -silent
For example:
$ ./runInstaller -updateNodeList ORACLE_HOME=Grid_home "CLUSTER_NODES={rac1,rac3……}" CRS=TRUE -silent
The actual run:
[grid@wbtdb1 bin]$ /u01/11.2.0/grid/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={wbtdb1}" CRS=TRUE -silent
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB. Actual 4095 MB Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
UpdateNodeList was successful.
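
When more than one node remains, the CLUSTER_NODES value is a comma-separated list inside braces. The following sketch shows one way to build that command line from a list of node names. It only prints the command, and the function name `build_update_nodelist_cmd` is hypothetical.

```shell
# Print the runInstaller -updateNodeList command for a Grid home and a
# list of remaining nodes. Arguments: grid home path, then node names.
# Nothing is executed; run the printed command from <grid_home>/oui/bin.
build_update_nodelist_cmd() {
  local home="$1"; shift
  local IFS=,                       # "$*" joins the node names with commas
  printf './runInstaller -updateNodeList ORACLE_HOME=%s "CLUSTER_NODES={%s}" CRS=TRUE -silent\n' \
    "$home" "$*"
}
```

Calling `build_update_nodelist_cmd /u01/11.2.0/grid wbtdb1` reproduces the command used here, and passing several node names yields a list such as `{rac1,rac3}`.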

-- Check:

[grid@wbtdb1 bin]$ olsnodes -s -t
wbtdb1 Active Unpinned
[grid@wbtdb1 bin]$

3.5  Verifying that node wbtdb2 was removed completely

[grid@wbtdb1 bin]$ cluvfy stage -post nodedel -n wbtdb2 -verbose
Performing post-checks for node removal
Checking CRS integrity...
Clusterware version consistency passed
The Oracle Clusterware is healthy on node "wbtdb1"
CRS integrity check passed
Result:
Node removal check passed

Post-check for node removal was successful. 

[grid@wbtdb1 ~]$ crsctl status resource -t
--------------------------------------------------------------------------------
NAME           TARGET STATE SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCHDG.dg
               ONLINE  ONLINE       wbtdb1
ora.DATADG.dg
               ONLINE  ONLINE       wbtdb1
ora.LISTENER.lsnr
               ONLINE  ONLINE       wbtdb1
ora.OCRVOTING.dg
               ONLINE  ONLINE       wbtdb1
ora.asm
               ONLINE  ONLINE       wbtdb1 Started
ora.gsd
               OFFLINE OFFLINE      wbtdb1
ora.net1.network
               ONLINE  ONLINE       wbtdb1
ora.ons
               ONLINE  ONLINE       wbtdb1
ora.registry.acfs
               ONLINE  ONLINE       wbtdb1
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       wbtdb1
ora.cvu
      1        ONLINE  ONLINE       wbtdb1
ora.oc4j
      1        ONLINE  ONLINE       wbtdb1
ora.scan1.vip
      1        ONLINE  ONLINE       wbtdb1
ora.wbtdb.db
      1        ONLINE  ONLINE       wbtdb1 Open
ora.wbtdb1.vip
      1        ONLINE  ONLINE       wbtdb1
[grid@wbtdb1 ~]$


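At this point all information about node 2 has been cleared from the cluster.

Afterwards you can verify for yourself that the remaining cluster resources and the instance status are normal.


To be continued...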


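Copyright of this article belongs to the author. Do not reproduce it without permission. If this article violates any rules, you can contact the administrator to have it removed.

When reproducing, please cite the address of this article: http://systransis.cn/yun/129933.html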
