当前位置: 首页 > news >正文

Oracle 11g rac 集群节点的修复过程

Oracle 11g rac 集群节点的修复过程

目录

  • Oracle 11g rac 集群节点的修复过程
    • 一、问题的产生
    • 二、修复过程
        • 1、执行 roothas.pl 命令
        • 2、执行 root.sh 命令
        • 3、查看集群信息
        • 4、查看节点2的IP地址
        • 5、查看节点2的监听信息

一、问题的产生

用户的双节点 Oracle 11g rac 集群,发现有一个节点宕机,发现集群没有启动。手工启动集群报如下错误:

[root@his02 bin]# ./crsctl start cluster
CRS-4639: Could not contact Oracle High Availability Services
CRS-4000: Command Start failed, or completed with errors.[root@his02 bin]# ./crsctl check css
CRS-4639: Could not contact Oracle High Availability Services
CRS-4000: Command Check failed, or completed with errors.

二、修复过程

1、执行 roothas.pl 命令
[root@his02 bin]# cd /u01/app/11.2.0/grid/crs/install
[root@his02 install]# ./roothas.pl -deconfig -force -verbose
Can't locate Env.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 . .) at crsconfig_lib.pm line 703.
BEGIN failed--compilation aborted at crsconfig_lib.pm line 703.
Compilation failed in require at ./roothas.pl line 166.
BEGIN failed--compilation aborted at ./roothas.pl line 166.

执行以上命令时出现错误,重新执行以下格式的命令:

[root@his02 install]# /u01/app/11.2.0/grid/perl/bin/perl /u01/app/11.2.0/grid/crs/install/rootcrs.pl -deconfig -force
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
PRCR-1119 : 无法查找 ora.cluster_vip_net1.type 类型的 CRS 资源
PRCR-1068 : 无法查询资源
Cannot communicate with crsd
PRCR-1070 : 无法检查 资源 ora.gsd 是否已注册
Cannot communicate with crsd
PRCR-1070 : 无法检查 资源 ora.ons 是否已注册
Cannot communicate with crsdCRS-4544: Unable to connect to OHAS
CRS-4000: Command Stop failed, or completed with errors.Successfully deconfigured Oracle clusterware stack on this node
2、执行 root.sh 命令
[root@his02 grid]# ./root.sh
Check /u01/app/11.2.0/grid/install/root_his02_2024-11-13_19-10-14.log for the output of root script

执行过程中查看日志,发现如下错误:

[root@his02 ~]# tail -f /u01/app/11.2.0/grid/install/root_his02_2024-11-13_19-10-14.log
Performing root user operation for Oracle 11g The following environment variables are set as:ORACLE_OWNER= gridORACLE_HOME=  /u01/app/11.2.0/gridCopying dbhome to /usr/local/bin ...Copying oraenv to /usr/local/bin ...Copying coraenv to /usr/local/bin ...Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to inittab
ohasd failed to start
Failed to start the Clusterware. Last 20 lines of the alert log follow: 
[client(50691)]CRS-2101:The OLR was formatted using version 3.

该错误解决方法如下:

(1)新开一个窗口,执行如下命令:

[root@his02 install]# cd /var/tmp/.oracle/
[root@his02 .oracle]# ls
npohasd
[root@his02 .oracle]# dd if=npohasd of=/dev/null bs=1024 count=1

过一段时间,重新查看日志,发现 root.sh 命令已执行完毕,节点添加成功。

[root@his02 ~]# tail -f /u01/app/11.2.0/grid/install/root_his02_2024-11-13_19-10-14.log
Performing root user operation for Oracle 11g The following environment variables are set as:ORACLE_OWNER= gridORACLE_HOME=  /u01/app/11.2.0/gridCopying dbhome to /usr/local/bin ...Copying oraenv to /usr/local/bin ...Copying coraenv to /usr/local/bin ...Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to inittab
ohasd failed to start
Failed to start the Clusterware. Last 20 lines of the alert log follow: 
[client(50691)]CRS-2101:The OLR was formatted using version 3.
2023-10-28 00:55:42.163: 
[ohasd(51763)]CRS-0715:Oracle High Availability Service has timed out waiting for init.ohasd to be started.
2024-11-13 18:04:35.572: 
[ohasd(119653)]CRS-0715:Oracle High Availability Service has timed out waiting for init.ohasd to be started.
2024-11-13 18:27:11.266: 
[ohasd(34911)]CRS-2112:The OLR service started on node his02.
2024-11-13 18:27:11.274: 
[ohasd(34911)]CRS-1301:Oracle High Availability Service started on node his02.
2024-11-13 18:55:39.514: 
[ohasd(44682)]CRS-2112:The OLR service started on node his02.
2024-11-13 18:55:39.523: 
[ohasd(44682)]CRS-1301:Oracle High Availability Service started on node his02.
2024-11-13 18:55:39.574: 
[ohasd(43387)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-24: Error in the messaging layer Messaging error [gipcretAddressInUse] [20]]. Details at (:OHAS00106:) in /u01/app/11.2.0/grid/log/his02/ohasd/ohasd.log.
[client(49054)]CRS-10001:13-Nov-24 19:07 ACFS-9459: ADVM/ACFS is not supported on this OS version: 'centos-release-7-3.1611.el7.centos.x86_64
'
[client(49056)]CRS-10001:13-Nov-24 19:07 ACFS-9201: Not Supported
2024-11-13 19:12:09.387: 
[client(53693)]CRS-2101:The OLR was formatted using version 3.CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node his01, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
PRKO-2190 : 节点 his02 存在 VIP, VIP 名称 his02-vip
软件包准备中...
cvuqdisk-1.0.9-1.x86_64
Configure Oracle Grid Infrastructure for a Cluster ... succeeded

(2)重启服务器,然后执行如下命令:

[root@his02 ~]# cd /var/tmp/.oracle/
[root@his02 .oracle]# ll npohasd
prw-r--r-- 1 root root 0 821 14:46 npohasd
[root@his02 .oracle]# rm -rf  npohasd
[root@his02 .oracle]# touch  npohasd
[root@his02 .oracle]# chmod 755  npohasd
[root@his02 .oracle]# ll npohasd
-rwxr-xr-x 1 root root 0 821 15:02 npohasd
3、查看集群信息
[root@his02 .oracle]# su - grid
上一次登录:三 1113 19:05:15 CST 2024pts/1 上
[grid@his02 ~]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.BAK.dg     ora....up.type ONLINE    ONLINE    his01       
ora.DATA.dg    ora....up.type ONLINE    ONLINE    his01       
ora....ER.lsnr ora....er.type ONLINE    ONLINE    his01       
ora....N1.lsnr ora....er.type ONLINE    ONLINE    his01       
ora.OCR.dg     ora....up.type ONLINE    ONLINE    his01       
ora.asm        ora.asm.type   ONLINE    ONLINE    his01       
ora.cvu        ora.cvu.type   ONLINE    ONLINE    his01       
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE               
ora....SM1.asm application    ONLINE    ONLINE    his01       
ora....01.lsnr application    ONLINE    ONLINE    his01       
ora.his01.gsd  application    OFFLINE   OFFLINE               
ora.his01.ons  application    ONLINE    ONLINE    his01       
ora.his01.vip  ora....t1.type ONLINE    ONLINE    his01       
ora....SM2.asm application    ONLINE    ONLINE    his02       
ora....02.lsnr application    ONLINE    ONLINE    his02       
ora.his02.gsd  application    OFFLINE   OFFLINE               
ora.his02.ons  application    ONLINE    ONLINE    his02       
ora.his02.vip  ora....t1.type ONLINE    ONLINE    his02       
ora.hisdb.db   ora....se.type ONLINE    ONLINE    his01       
ora....network ora....rk.type ONLINE    ONLINE    his01       
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    his01       
ora.ons        ora.ons.type   ONLINE    ONLINE    his01       
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    his01
4、查看节点2的IP地址
[grid@his02 ~]$ ifconfig
bond1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500inet 192.168.0.2  netmask 255.255.255.0  broadcast 192.168.0.255inet6 fe80::72fd:45ff:fe6b:cfb7  prefixlen 64  scopeid 0x20<link>ether 70:fd:45:6b:cf:b7  txqueuelen 1000  (Ethernet)RX packets 51878  bytes 24906169 (23.7 MiB)RX errors 0  dropped 482  overruns 0  frame 2TX packets 68845  bytes 58927700 (56.1 MiB)TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0bond1:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500inet 192.168.0.102  netmask 255.255.255.0  broadcast 192.168.0.255ether 70:fd:45:6b:cf:b7  txqueuelen 1000  (Ethernet)bond2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500inet 10.5.5.2  netmask 255.255.255.0  broadcast 10.5.5.255inet6 fe80::72fd:45ff:fe6b:cfb8  prefixlen 64  scopeid 0x20<link>ether 70:fd:45:6b:cf:b8  txqueuelen 1000  (Ethernet)RX packets 202892  bytes 172526350 (164.5 MiB)RX errors 0  dropped 85  overruns 0  frame 0TX packets 133743  bytes 65314520 (62.2 MiB)TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0bond2:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500inet 169.254.6.27  netmask 255.255.0.0  broadcast 169.254.255.255ether 70:fd:45:6b:cf:b8  txqueuelen 1000  (Ethernet)lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536inet 127.0.0.1  netmask 255.0.0.0inet6 ::1  prefixlen 128  scopeid 0x10<host>loop  txqueuelen 1  (Local Loopback)RX packets 27683  bytes 10369158 (9.8 MiB)RX errors 0  dropped 0  overruns 0  frame 0TX packets 27683  bytes 10369158 (9.8 MiB)TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
5、查看节点2的监听信息
[grid@his02 ~]$ lsnrctl statusLSNRCTL for Linux: Version 11.2.0.4.0 - Production on 13-NOV-2024 19:20:05Copyright (c) 1991, 2013, Oracle.  All rights reserved.Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date                13-NOV-2024 19:14:42
Uptime                    0 days 0 hr. 5 min. 22 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/11.2.0/grid/network/admin/listener.ora
Listener Log File         /u01/app/11.2.0/grid/network/log/listener.log
Listening Endpoints Summary...(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.2)(PORT=1521)))(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.102)(PORT=1521)))
Services Summary...
Service "+ASM" has 1 instance(s).Instance "+ASM2", status READY, has 1 handler(s) for this service...
Service "HISDB" has 1 instance(s).Instance "hisdb2", status READY, has 1 handler(s) for this service...
Service "HISDBXDB" has 1 instance(s).Instance "hisdb2", status READY, has 1 handler(s) for this service...
The command completed successfully

至此,节点2已完全恢复正常!

http://www.lryc.cn/news/483274.html

相关文章:

  • c++:string(一)
  • github和Visual Studio
  • django框架-settings.py文件的配置说明
  • 【C语言】缺陷管理流程
  • 基于深度学习的猫狗识别
  • java组件安全
  • 【MongoDB】MongoDB的核心-索引原理及索引优化、及查询聚合优化实战案例(超详细)
  • qt QProcess详解
  • 软件测试面试2024最新热点问题
  • 10款录屏工具推荐,聊聊我的使用心得!!!!
  • VMware+Ubuntu+finalshell连接
  • autodl+modelscope推理stable-diffusion-3.5-large
  • 深度学习之 LSTM
  • LeetCode 3242.设计相邻元素求和服务:哈希表
  • 【AliCloud】ack + ack-secret-manager + kms 敏感数据安全存储
  • 探索JavaScript的强大功能:从基础到高级应用
  • 新增支持Elasticsearch数据源,支持自定义在线地图风格,DataEase开源BI工具v2.10.2 LTS发布
  • Spark的容错机制
  • YOLOv8改进 | 利用YOLOv8进行视频划定区域目标统计计数
  • 基于yolov8、yolov5的番茄成熟度检测识别系统(含UI界面、训练好的模型、Python代码、数据集)
  • wafw00f源码详细解析
  • 什么是crm?3000字详细解析
  • WEB3.0介绍
  • 【深度学习】LSTM、BiLSTM详解
  • 分子对接--软件安装
  • 【Python无敌】在 QGIS 中使用 Python
  • 全面解读:低代码开发平台的必备要素——系统策划篇
  • Vue开发自动生成验证码功能 前端实现不使用第三方插件实现随机验证码功能,生成的验证码添加干扰因素
  • # filezilla连接 虚拟机ubuntu系统出错“尝试连接 ECONNREFUSED - 连接被服务器拒绝, 失败,无法连接服务器”解决方案
  • 2024/11/13 英语每日一段