
Oracle RAC ADG standby reports ORA-04021: timeout occurred while waiting to lock object

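Problem: our core disaster-recovery RAC ADG standby database has been restarting frequently over the past two days, each time with the errors below. A search of MOS shows that this is a known bug.

The alert log of the ADG standby shows the following errors: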
Errors in file /u01/app/oracle/diag/rdbms/hxxxsz/hxxxsz1/trace/hxxxsz1_lgwr_69711.trc:
ORA-04021: timeout occurred while waiting to lock object
Mon Dec 16 16:26:15 2024
ORA-01555 caused by SQL statement below (SQL ID: 87gaftwrm2h68, Query Duration=899 sec, SCN: 0x05cf.7a01a7dc):
select o.owner#,o.name,o.namespace,o.remoteowner,o.linkname,o.subname from obj$ o where o.obj#=:1
LGWR (ospid: 69711): terminating the instance due to error 4021
Mon Dec 16 16:26:15 2024
System state dump requested by (instance=1, osid=69711 (LGWR)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/hxxxsz/hxxxsz1/trace/hxxxsz1_diag_69557_20241216162615.trc
Mon Dec 16 16:26:15 2024
ORA-1092 : opitsk aborting process
Mon Dec 16 16:26:16 2024
License high water mark = 1321
Instance terminated by LGWR, pid = 69711
USER (ospid: 42412): terminating the instance
Instance terminated by USER, pid = 42412
Mon Dec 16 16:26:23 2024
Starting ORACLE instance (normal)
************************ Large Pages Information *******************
Per process system memlock (soft) limit = UNLIMITED


Solution:

1. Check the hidden parameter:
SELECT ksppinm, ksppstvl, ksppdesc
FROM   x$ksppi x, x$ksppcv y
WHERE  x.indx = y.indx
AND    ksppinm = '_adg_parselock_timeout';

KSPPINM
--------------------------------------------------------------------------------
KSPPSTVL
--------------------------------------------------------------------------------
KSPPDESC
--------------------------------------------------------------------------------
_adg_parselock_timeout
0
timeout for parselock get on ADG in centiseconds

2. Run the following statement:
alter system set "_adg_parselock_timeout"=500 scope=both sid='*';
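
After the change, it may be worth confirming that the new value was persisted for all instances. The following is a minimal sketch, assuming the standby uses an spfile and that the session has access to V$SPPARAMETER. Whether the hidden parameter is listed there exactly as shown is an assumption to verify in your environment. The running value can also be checked by re-running the X$ query from step 1 on each instance.

```sql
-- Sketch only: list the persisted setting, if any, for the hidden parameter.
-- sid = '*' means the entry applies to every RAC instance.
SELECT sid, name, value
FROM   v$spparameter
WHERE  name = '_adg_parselock_timeout';
```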

The referenced MOS note reads as follows:

ORA-04021: timeout occurred while waiting to lock object : DR Instance terminated by LGWR (Doc ID 2183882.1)

Applies to:
Oracle Database - Enterprise Edition - Version 11.2.0.3 and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Cloud Exadata Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Information in this document applies to any platform.


Symptoms

The DR database crashed with the errors below.

Client address: (ADDRESS=(PROTOCOL=<protocol>)(HOST=<hostname>)(PORT=<port>))
WARNING: inbound connection timed out (ORA-3136)
Wed Jul 13 13:43:24 2016
Errors in file /<path>/diag/rdbms/<db_name>/<oracle_sid>/trace/<oracle_sid>_lgwr_<pid>.trc:
ORA-04021: timeout occurred while waiting to lock object
LGWR (ospid: 31312): terminating the instance due to error 4021
Wed Jul 13 13:43:24 2016
System state dump requested by (instance=1, osid=31312 (LGWR)), summary=[abnormal instance termination].
System State dumped to trace file /<path>/diag/rdbms/<db_name>/<oracle_sid>/trace/<oracle_sid>_diag_<pid>.trc
Wed Jul 13 13:43:25 2016
License high water mark = 318
Instance terminated by LGWR, pid = 31312
USER (ospid: 20898): terminating the instance
Instance terminated by USER, pid = 20898
Wed Jul 13 13:43:39 2016
Starting ORACLE instance (normal)

Cause

Bug 16717701 - ADG SHOULD GET THE INSTANCE PARSE LOCK WITH A TIMEOUT (superseded by the fix for Bug 17018214)

Bug 11712267 - ACTIVE DATA GUARD DATABASE HUNG ON 'LIBRARY CACHE: MUTEX X' WAIT EVENT

LGWR trace file (RXEPRR1_lgwr_31312.trc)

*** 2016-07-13 13:43:24.498
*** SESSION ID:(6709.1) 2016-07-13 13:43:24.498
*** CLIENT ID:() 2016-07-13 13:43:24.498
*** SERVICE NAME:(SYS$BACKGROUND) 2016-07-13 13:43:24.498
*** MODULE NAME:() 2016-07-13 13:43:24.498
*** ACTION NAME:() 2016-07-13 13:43:24.498

error 4021 detected in background process
ORA-04021: timeout occurred while waiting to lock object
kjzduptcctx: Notifying DIAG for crash event
----- Abridged Call Stack Trace -----
ksedsts()+1296<-kjzdicrshnfy()+364<-ksuitm()+1688<-ksbrdp()+4296<-opirip()+1680<-opidrv()+748<-sou2o()+88<-opimai_real()+276<-ssthrdmain()+316<-main()+316<-_start()+380
----- End of Abridged Call Stack Trace -----

Solution

The issue matches both Bug 11712267 and Bug 16717701.

Because two bugs match this case, try option (1) first.

Option (1):

As per Bug 11712267, change cursor_sharing to FORCE on the Active Data Guard (ADG) standby.

Monitor your environment for some time.

If it crashes again, proceed with option (2).

Option (2):

As per the bug description, LGWR can request the DBINSTANCE lock in X mode without any timeout, which can lead to a hang or deadlock.

Both fixes are already included in 11.2.0.4, but the fix is DISABLED by default.
To ENABLE the fix, set "_adg_parselock_timeout" to the number of centiseconds LGWR should wait before backing off and retrying the request.

The value is in centiseconds. There is no hard-and-fast rule for choosing it; at the default (0) the request never times out.
A value representing a few seconds seems reasonable: if LGWR has been stuck waiting for, say, 5 seconds, it is a reasonable guess that it is not going to get the lock.

The parameter only causes LGWR to abort the current attempt and retry. To play it safe, you can start with a higher value and decrease it later.
A higher value just means that more sessions are blocked for longer in a deadlock situation.
500 (5 seconds) seems reasonable, but I have no data to base it on.

There should be a statistic "ADG parselock X get attempts". If the parameter is set too small, that value would likely increase a lot because LGWR keeps timing out and retrying.
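
The statistic mentioned above could be watched with a query along these lines. This is a sketch, not part of the MOS note: it assumes the statistic is exposed through GV$SYSSTAT under a name beginning with "ADG parselock", as the note suggests, and that the standby is open read-only so the view can be queried.

```sql
-- Sketch only: per-instance ADG parse lock counters.
-- The LIKE pattern is an assumption based on the statistic name quoted in the note.
SELECT inst_id, name, value
FROM   gv$sysstat
WHERE  name LIKE 'ADG parselock%'
ORDER  BY inst_id, name;
```

Sampling this before and after changing "_adg_parselock_timeout" gives a rough sense of whether LGWR is retrying far more often than before, which would hint that the timeout is set too low.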

This is a dynamic parameter.

First, follow option (1):

Change cursor_sharing to FORCE on ADG.
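
The note does not give a statement for option (1). One way to apply it might look like the sketch below. The SCOPE=BOTH and SID='*' clauses are assumptions that mirror the statement used for option (2); adjust them if only one instance should change or if the instance does not use an spfile.

```sql
-- Sketch only: check the current value on every RAC instance first.
SELECT inst_id, name, value
FROM   gv$parameter
WHERE  name = 'cursor_sharing';

-- Apply option (1) on the ADG standby (scope and sid are assumptions).
ALTER SYSTEM SET cursor_sharing = FORCE SCOPE=BOTH SID='*';
```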


If the issue reappears, follow option (2) as below.

Set "_adg_parselock_timeout" to 500:

SQL> alter system set "_adg_parselock_timeout"=500 scope=both sid='*';

http://www.lryc.cn/news/504948.html
