当前位置: 首页 > news >正文

【已解决】异常断电文件损坏clickhouse启动不了:filesystem error Structure needs cleaning

问题

  • 办公室有一台二手服务器,作为平时开发测试使用。由于机器没放在机房,会偶发断电
  • 异常断电后,文件系统是有出问题的可能的,尤其是一些不断在读写合并的文件
  • 春节后,发现clickhouse启动不了,使用systemctl status clickhouse-server返回如下
● clickhouse-server.service - ClickHouse Server (analytic DBMS for big data)Loaded: loaded (/etc/systemd/system/clickhouse-server.service; enabled; vendor preset: disabled)Active: activating (auto-restart) (Result: exit-code) since 二 2023-02-21 15:46:41 CST; 10s agoProcess: 9688 ExecStart=/usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse-server.pid (code=exited, status=70)Main PID: 9688 (code=exited, status=70)2月 21 15:46:41 localhost.localdomain systemd[1]: clickhouse-server.service: main process exited, code=exited, status=70/n/a
2月 21 15:46:41 localhost.localdomain systemd[1]: Unit clickhouse-server.service entered failed state.
2月 21 15:46:41 localhost.localdomain systemd[1]: clickhouse-server.service failed.
  • 对于具体的报错信息,还是要查看clickhouse的报错日志,默认位置在/var/log/clickhouse-server/,错误日志如下:
2023.02.21 15:47:12.351823 [ 9848 ] {} <Error> system.query_thread_log (95febdac-d99a-4bdb-9431-653a75f3a34b): Detaching broken part /var/lib/clickhouse/store/95f/95febdac-d99a-4bdb-9431-653a75f3a34b/202302_390451_394084_2980 (size: 0.00 B). If it happened after update, it is likely because of backward incompability. You need to resolve this manually
2023.02.21 15:47:12.365662 [ 9787 ] {} <Error> Application: Caught exception while loading metadata: std::exception. Code: 1001, type: std::__1::__fs::filesystem::filesystem_error, e.what() = filesystem error: in directory_iterator::directory_iterator(...): Structure needs cleaning [/var/lib/clickhouse/store/95f/95febdac-d99a-4bdb-9431-653a75f3a34b/202302_390451_394023_2919/], Stack trace (when copying this message, always include the lines below):0. std::runtime_error::runtime_error(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0x1b4107cd in ?
1. std::__1::system_error::system_error(std::__1::error_code, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0x1b41a437 in ?
2. ? @ 0x145458c1 in /usr/bin/clickhouse
3. ? @ 0x1b3bf356 in /usr/bin/clickhouse
4. ? @ 0x1b3bd762 in /usr/bin/clickhouse
5. std::__1::__fs::filesystem::directory_iterator::directory_iterator(std::__1::__fs::filesystem::path const&, std::__1::error_code*, std::__1::__fs::filesystem::directory_options) @ 0x1b3bd581 in /usr/bin/clickhouse
6. DB::DiskLocalDirectoryIterator::DiskLocalDirectoryIterator(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0x14552124 in /usr/bin/clickhouse
7. DB::DiskLocal::iterateDirectory(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0x145483d7 in /usr/bin/clickhouse
8. DB::MergeTreeIndexGranularityInfo::getMarksExtensionFromFilesystem(std::__1::shared_ptr<DB::IDisk> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0x1552dd06 in /usr/bin/clickhouse
9. DB::MergeTreeData::createPart(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::MergeTreePartInfo const&, std::__1::shared_ptr<DB::IVolume> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::IMergeTreeDataPart const*) const @ 0x1547a871 in /usr/bin/clickhouse
10. ? @ 0x154c98ac in /usr/bin/clickhouse
11. ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>) @ 0xaf6546a in /usr/bin/clickhouse
12. ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()>(void&&, void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()&&...)::'lambda'()::operator()() @ 0xaf674a4 in /usr/bin/clickhouse
13. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0xaf62837 in /usr/bin/clickhouse
14. ? @ 0xaf662fd in /usr/bin/clickhouse
15. start_thread @ 0x7ea5 in /usr/lib64/libpthread-2.17.so
16. __clone @ 0xfe96d in /usr/lib64/libc-2.17.soCannot print extra info for Poco::Exception (version 22.2.2.1)
2023.02.21 15:47:12.366162 [ 9787 ] {} <Information> Application: Shutting down storages.
2023.02.21 15:47:12.366178 [ 9787 ] {} <Information> Context: Shutdown disk default
2023.02.21 15:47:12.391451 [ 9787 ] {} <Debug> Application: Shut down storages.
2023.02.21 15:47:13.199583 [ 9787 ] {} <Debug> Application: Destroyed global context.
2023.02.21 15:47:13.202239 [ 9787 ] {} <Error> Application: filesystem error: in directory_iterator::directory_iterator(...): Structure needs cleaning [/var/lib/clickhouse/store/95f/95febdac-d99a-4bdb-9431-653a75f3a34b/202302_390451_394023_2919/]
2023.02.21 15:47:13.202291 [ 9787 ] {} <Information> Application: shutting down
2023.02.21 15:47:13.202308 [ 9787 ] {} <Debug> Application: Uninitializing subsystem: Logging Subsystem
2023.02.21 15:47:13.203474 [ 9788 ] {} <Trace> BaseDaemon: Received signal -2
2023.02.21 15:47:13.203546 [ 9788 ] {} <Information> BaseDaemon: Stop SignalListener thread
2023.02.21 15:47:13.276203 [ 9786 ] {} <Information> Application: Child process exited normally with code 70.
  • 关键报错日志是这一句:<Error> Application: filesystem error: in directory_iterator::directory_iterator(...): Structure needs cleaning [/var/lib/clickhouse/store/95f/95febdac-d99a-4bdb-9431-653a75f3a34b/202302_390451_394023_2919/],这个文件夹损坏,结构需要清理

解决

  • 百度了一堆,确实能搜到解决方式,牵涉到磁盘挂载、修复之类的,试了下没用,识别不到磁盘,可能是我哪里搞错了,暂时舍弃
  • 还有一个办法,给文件重命名,试了下,使用mv old_file_name new_file_name重命名,直接死机了
  • 重启服务器后,我使用文件传输工具查看修改文件名,竟然成功了。想了下这个损坏的文件,应该不能随便移动,应该使用rename
  • 虽然文件命名成功了,但是clickhouse还是无法启动,还是类似报错,应该是clickhouse还是会使用这个文件
  • 想着怎么才能让clickhouse忽略这个文件夹,试着把上面层级的文件夹95f改了下,改成95f1,启动服务后发现可以了
    在这里插入图片描述
http://www.lryc.cn/news/15586.html

相关文章:

  • FlinkSQL行级权限解决方案及源码
  • 【基础篇】8 # 递归:如何避免出现堆栈溢出呢?
  • 基于微信公众号(服务号)实现扫码自动登录系统功能
  • AXI实战(二)-跟着产品手册设计AXI-Lite外设(AXI-Lite转串口实现)
  • 一周搞定模拟电路视频教程,拒绝讲PPT,仿真软件配合教学,真正一周搞定
  • 高德地图获得角度
  • 【C++】-- C++11基础常用知识点(下)
  • 提到数字化,你想到哪些关键词
  • 【蓝桥杯集训·每日一题】AcWing 1249. 亲戚
  • iphone所有机型的屏幕尺寸
  • Windows10使用-处理IE自动跳转至Edge
  • linux input子系统,gpio-keys,gpio中断使用
  • 分析称勒索攻击在非洲、中东与中国增长最快
  • ArcPy批量合并矢量shape文件
  • 改写有序表的题目核心点
  • 收藏这几个开源管理系统做项目,领导看了直呼牛X!
  • 【刷题篇】链表(下)
  • Shiro
  • 使用nginx进行负载均衡配置详细说明
  • N皇后问题
  • 强化学习DQN之俄罗斯方块
  • 1.3总线:并行总线、串行总线、单工、半双工、全双工、总线宽度、总线带宽、总线的分类、数据总线、地址总线、控制总线
  • Linux驱动开发—设备树开发详解
  • 深入浅出C++ ——继承
  • 设计模式C++实现20: 桥接模式(Bridge)
  • Android中的Rxjava
  • 【RocketMQ】源码详解:消息储存服务加载、文件恢复、异常恢复
  • 数字IC设计工程师是做什么的?
  • 【040】134. 加油站[简单模拟 + 逻辑转化]
  • Python用selenium实现自动登录和下单的脚本