
CMake and gdb debugging: fixing LLVM ERROR: out of memory

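Problem description

When deploying VideoPipe on a new device, the program built fine with CMake but frequently aborted partway through a run with LLVM ERROR: out of memory: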

[Thread 0x7ffcd097f700 (LWP 9673) exited]
LLVM ERROR: out of memory
Thread 38 "trt_yolov8_samp" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fff7afde700 (LWP 9352)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.

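A Google search shows that many other programs built on TensorRT or DeepStream hit the same error, and their developers likewise report that plenty of memory was free when out of memory was raised. In my case, debugging with gdb traced the error to an invalid pointer, and fixing that resolved it.

Cause of the error

Output of bt in gdb: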

#0  0x00007ffff6476e87 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff64787f1 in __GI_abort () at abort.c:79
#2  0x00007fffd755ecbb in  () at /usr/local/tensorRT/lib/libnvinfer.so.8
#3  0x00007ffff6ad42ac in operator new(unsigned long) () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x000055555557ea7a in __gnu_cxx::new_allocator<char>::allocate(unsigned long, void const*) (this=0x7fff7afd2c20, __n=3980232092549127) at /usr/include/c++/7/ext/new_allocator.h:111
#5  0x000055555557cf8b in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) (__a=..., __n=3980232092549127) at /usr/include/c++/7/bits/alloc_traits.h:436
#6  0x000055555557e9de in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) (this=0x7fff7afd2c20, __capacity=@0x7fff7afd2a70: 3980232092549126, __old_capacity=0) at /usr/include/c++/7/bits/basic_string.tcc:153
#7  0x00007ffff78e096c in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) (this=0x7fff7afd2c20, __beg=0x504047389 <error: Cannot access memory at address 0x504047389>, __end=0xe24050404738f <error: Cannot access memory at address 0xe24050404738f>) at /usr/include/c++/7/bits/basic_string.tcc:219
#8  0x00007ffff78ddd5e in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct_aux<char*>(char*, char*, std::__false_type) (this=0x7fff7afd2c20, __beg=0x504047389 <error: Cannot access memory at address 0x504047389>, __end=0xe24050404738f <error: Cannot access memory at address 0xe24050404738f>) at /usr/include/c++/7/bits/basic_string.h:236
#9  0x00007ffff78dbd41 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*) (this=0x7fff7afd2c20, __beg=0x504047389 <error: Cannot access memory at address 0x504047389>, __end=0xe24050404738f <error: Cannot access memory at address 0xe24050404738f>) at /usr/include/c++/7/bits/basic_string.h:255
#10 0x00007ffff78d9b22 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (this=0x7fff7afd2c20, __str=<error: Cannot access memory at address 0x504047389>) at /usr/include/c++/7/bits/basic_string.h:440
#11 0x00007ffff79fa6f6 in vp_nodes::vp_trt_yolov8_detector::run_infer_combinations(std::vector<std::shared_ptr<vp_objects::vp_frame_meta>, std::allocator<std::shared_ptr<vp_objects::vp_frame_meta> > > const&) (this=0x555555fa1530, frame_meta_with_batch=std::vector of length 1, capacity 1 = {...}) at /home/ubuntu/yolov8n-trt-region-test/nodes/infers/vp_trt_yolov8_detector.cpp:53
#12 0x00007ffff7a42504 in vp_nodes::vp_infer_node::handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta>) (this=0x555555fa1530, meta=std::shared_ptr<vp_objects::vp_frame_meta> (use count 8, weak count 0) = {...}) at /home/ubuntu/yolov8n-trt-region-test/nodes/vp_infer_node.cpp:66
#13 0x00007ffff7a471d4 in vp_nodes::vp_node::handle_run() (this=0x555555fa1530) at /home/ubuntu/yolov8n-trt-region-test/nodes/vp_node.cpp:45
#14 0x00007ffff7a4b215 in std::__invoke_impl<void, void (vp_nodes::vp_node::*)(), vp_nodes::vp_node*>(std::__invoke_memfun_deref, void (vp_nodes::vp_node::*&&)(), vp_nodes::vp_node*&&) (__f=@0x5555b1e383a0: &virtual vp_nodes::vp_node::handle_run(), __t=@0x5555b1e38398: 0x555555fa1530) at /usr/include/c++/7/bits/invoke.h:73
#15 0x00007ffff7a4a146 in std::__invoke<void (vp_nodes::vp_node::*)(), vp_nodes::vp_node*>(void (vp_nodes::vp_node::*&&)(), vp_nodes::vp_node*&&) (__fn=@0x5555b1e383a0: &virtual vp_nodes::vp_node::handle_run()) at /usr/include/c++/7/bits/invoke.h:95
#16 0x00007ffff7a4d30b in std::thread::_Invoker<std::tuple<void (vp_nodes::vp_node::*)(), vp_nodes::vp_node*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (this=0x5555b1e38398) at /usr/include/c++/7/thread:234
#17 0x00007ffff7a4d2c1 in std::thread::_Invoker<std::tuple<void (vp_nodes::vp_node::*)(), vp_nodes::vp_node*> >::operator()() (this=0x5555b1e38398) at /usr/include/c++/7/thread:243
#18 0x00007ffff7a4d2a0 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (vp_nodes::vp_node::*)(), vp_nodes::vp_node*> > >::_M_run() (this=0x5555b1e38390) at /usr/include/c++/7/thread:186
#19 0x00007ffff6afe6df in  () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#20 0x00007ffff08516db in start_thread (arg=0x7fff7afde700) at pthread_create.c:463
#21 0x00007ffff655961f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

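The backtrace shows that several threads operate the inference pipeline: the crash happens on a worker thread whose std::thread frames lead into vp_node::handle_run().
The labels container may therefore be modified or destroyed by one thread while another is still using it. The root cause is an invalid pointer: the std::string copy constructor in frame #10 is handed a source string whose memory cannot be accessed (0x504047389), so it reads a garbage length and requests an enormous allocation (__n=3980232092549127 bytes in frame #4);
in other words, a string is being constructed from a wild pointer or from memory that has already been freed.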
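The allocation step can be reproduced in isolation. The sketch below is a minimal illustration and not VideoPipe or TensorRT code: it asks operator new for the byte count shown as __n in frame #4 and installs a new-handler to show where a library can hook in before the failure is reported. That libnvinfer installs such a handler (and prints the LLVM message from it) is only an inference from frame #2 sitting directly above operator new. The names try_allocate, recording_handler and demo are hypothetical, and a 64-bit platform is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Set by the handler below so callers can see that it ran.
static bool g_handler_called = false;

// Hypothetical new-handler: records the call, then gives up by throwing.
// A library could instead print a message and call abort() at this point.
void recording_handler() {
    g_handler_called = true;
    throw std::bad_alloc();
}

// Returns true if operator new can provide `bytes` bytes, false otherwise.
bool try_allocate(std::size_t bytes) {
    try {
        void* p = ::operator new(bytes);
        ::operator delete(p);
        return true;
    } catch (const std::bad_alloc&) {
        return false;
    }
}

// Byte count copied from frame #4 of the backtrace (roughly 3.5 PiB).
// volatile forces the value to be read at run time.
volatile std::size_t g_garbage_size = 3980232092549127ULL;

void demo() {
    std::set_new_handler(recording_handler);
    assert(!try_allocate(g_garbage_size));  // the request cannot be satisfied
    assert(g_handler_called);               // operator new called the handler first
    std::set_new_handler(nullptr);          // restore the default behavior
}
```

The point of the sketch is that the failure does not depend on how much memory is free: a size read from garbage is far larger than the address space, so the "out of memory" message describes the symptom and not the cause.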

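Debugging with CMake and gdb

[CMake] CMake from Beginner to Practice series (11): CMake support for gdb debugging

Enabling gdb debugging in CMake

Add the following to CMakeLists.txt: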

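# Flags for every build type.
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fPIC -fdiagnostics-color=always -pthread")
# Flags used only when CMAKE_BUILD_TYPE is Debug: no optimization, warnings on, gdb debug info.
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -O0 -Wall -ggdb")
# Strip -w (suppress all warnings) and -g from the common flags.
string(REPLACE "-w" "" CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS}")
string(REPLACE "-g" "" CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS}")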

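1. Start gdb on the executable.
2. b main (or break main): set a breakpoint on the main function.
3. r (or run): start the program.
4. c: continue execution; gdb stops automatically at the point where the error is raised.
5. bt: print the call stack (info stack is an alias for bt; bt full also prints the local variables of each frame), then analyze the cause of the error.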

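The main problem in my case was in the yolo_detector detection code: when an object was detected but its label could not be read, the code ended up working with an invalid pointer: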

auto label = labels.size() == 0 ? "" : labels[objbox.class_id];
auto target = std::make_shared<vp_objects::vp_frame_target>(x, y, width, height, objbox.class_id, objbox.conf, frame_meta->frame_index, frame_meta->channel_index, label);

Possible problems with the code above:

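Dangling data: when labels.size() > 0, label is built from the string stored at labels[objbox.class_id]. Here auto deduces std::string, so label itself is a copy, but the copy is made by reading that element; if the labels container is being modified at that moment (for example, elements are erased or moved), the element being read is no longer valid.
Index out of range: objbox.class_id may fall outside the valid index range of labels. When class_id >= labels.size(), the access is out of bounds and the behavior is undefined.
Lifetime: the labels container may be destroyed or modified around the time this line runs, and in a multithreaded environment another thread may be the one modifying it.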
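Whether label is a reference into labels or an independent copy can be checked at compile time. The sketch below is a stand-alone illustration: the labels table and the function copy_label are made up for the example and are not the VideoPipe definitions. It shows that a conditional expression mixing a string literal with a std::string element yields a std::string by value, so the risk sits in the read of labels[class_id] and not in later uses of label.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Stand-in for the detector's label table (contents are made up).
static std::vector<std::string> labels = {"person", "car"};

// The literal "" is converted to std::string, so the whole conditional
// expression is a std::string prvalue and not a reference to an element.
static_assert(
    std::is_same<decltype(true ? "" : labels[0]), std::string>::value,
    "the conditional yields std::string by value");

// Mirrors the original line. The caller must pass a valid index, because
// operator[] performs no bounds check.
std::string copy_label(int class_id) {
    assert(class_id >= 0 &&
           static_cast<std::size_t>(class_id) < labels.size());
    auto label = labels.size() == 0 ? "" : labels[class_id];
    static_assert(std::is_same<decltype(label), std::string>::value,
                  "label owns its own characters");
    return label;
}
```

Because label owns its characters, clearing labels after the copy leaves it intact. What the copy cannot survive is an index past the end of the vector, which is consistent with the unreadable source string reported in frame #10 of the backtrace.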

Modified code:

// Safe approach: bounds-check the index and construct a new string copy
std::string safe_label = labels.empty() ? "" : (objbox.class_id < labels.size() ? labels[objbox.class_id] : "unknown");  // handle out-of-range ids
auto target = std::make_shared<vp_objects::vp_frame_target>(x, y, width, height, objbox.class_id, objbox.conf, frame_meta->frame_index, frame_meta->channel_index, safe_label  // use the safe copy
);
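The same check can be pulled into a small helper so it can be tested on its own. This is a sketch under assumptions, not the VideoPipe implementation: label_for and label_for_examples are hypothetical names, and class_id is assumed to be a signed integer (its real type is not shown above). Unlike the one-liner, it rejects negative ids explicitly and avoids comparing a signed value with an unsigned size.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns a copy of the label for class_id, "" when
// the table is empty, or "unknown" when the id is out of range.
std::string label_for(const std::vector<std::string>& labels, int class_id) {
    if (labels.empty()) {
        return "";
    }
    if (class_id < 0 ||
        static_cast<std::size_t>(class_id) >= labels.size()) {
        return "unknown";
    }
    return labels[static_cast<std::size_t>(class_id)];
}

// Usage examples covering each branch.
void label_for_examples() {
    const std::vector<std::string> table = {"person", "car"};
    assert(label_for(table, 1) == "car");       // valid id
    assert(label_for(table, 7) == "unknown");   // id past the end
    assert(label_for(table, -1) == "unknown");  // negative id
    assert(label_for({}, 0) == "");             // empty table
}
```

The helper only guards the index. If another thread can modify the label table while inference is running, the read still needs to be protected separately, for example by never changing the table after the pipeline has started.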
