
Elasticsearch File Storage

How are the files of an Elasticsearch index stored on disk?
In particular, at what granularity are FST files created?

First, use Kibana to pick a shard of an index. Here we take the index logstash-2023.05.30 as the example.

Check how its shards are distributed:

GET /_cat/shards/logstash-2023.05.30?v

index               shard prirep state      docs   store ip             node
logstash-2023.05.30 3     p      STARTED 1520736 408.1mb 10.138.40.73  10.138.40.73-node1
logstash-2023.05.30 5     p      STARTED 1520888 409.9mb 10.138.40.74  10.138.40.74-node1
logstash-2023.05.30 6     p      STARTED 1518331 408.2mb 10.138.40.221 10.138.40.221-node1
logstash-2023.05.30 4     p      STARTED 1518186 409.3mb 10.138.204.194 10.138.204.194-node1
logstash-2023.05.30 1     p      STARTED 1519231 408.8mb 10.138.40.220 10.138.40.220-node1
logstash-2023.05.30 2     p      STARTED 1519970 409.9mb 10.138.204.195 10.138.204.195-node1
logstash-2023.05.30 0     p      STARTED 1520024 410.6mb 10.138.204.193 10.138.204.193-node1

Here we analyze shard 0, which lives on 10.138.204.193.
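Picking the host of a given shard out of the _cat/shards table can also be scripted. The following is a minimal sketch, assuming the response text is already available as a string and that every row is an assigned shard (unassigned rows carry fewer columns and would need extra handling). The helper name parse_cat_shards is hypothetical.

```python
def parse_cat_shards(text):
    """Parse the plain-text table returned by GET /_cat/shards/<index>?v."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    # Assumes each row has exactly as many whitespace-separated cells as the header.
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


sample = """\
index               shard prirep state      docs   store ip             node
logstash-2023.05.30 1     p      STARTED 1519231 408.8mb 10.138.40.220 10.138.40.220-node1
logstash-2023.05.30 0     p      STARTED 1520024 410.6mb 10.138.204.193 10.138.204.193-node1
"""

rows = parse_cat_shards(sample)
shard0 = next(r for r in rows if r["shard"] == "0" and r["prirep"] == "p")
print(shard0["ip"], shard0["node"])
```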

To locate the storage directory, we first need the index's ID (its UUID).

GET /logstash-2023.05.30/_settings

{
  "logstash-2023.05.30" : {
    "settings" : {
      "index" : {
        "codec" : "best_compression",
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "refresh_interval" : "60s",
        "number_of_shards" : "7",
        "provided_name" : "logstash-2023.05.30",
        "creation_date" : "1685376005206",
        "number_of_replicas" : "0",
        "uuid" : "FYWtFGTIS2CLB8yJhFXG9g",  // this is the index ID
        "version" : {
          "created" : "7130499"
        }
      }
    }
  }
}

Log in to the machine and find the directory that stores the index files:

/data3/10.138.204.193-node1/nodes/0/indices/FYWtFGTIS2CLB8yJhFXG9g
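The path above follows a fixed pattern on this node: data path, then nodes/0/indices, then the index UUID, with one subdirectory per shard underneath. A minimal sketch of that pattern, assuming /data3/10.138.204.193-node1 is the node's configured data path and that the layout matches this 7.13.4 node (other versions may lay out the data path differently). The function name shard_index_dir is hypothetical.

```python
from pathlib import PurePosixPath


def shard_index_dir(data_path, index_uuid, shard, node_ordinal=0):
    """Build the directory that holds the Lucene files of one shard."""
    return (
        PurePosixPath(data_path)
        / "nodes" / str(node_ordinal)
        / "indices" / index_uuid
        / str(shard) / "index"
    )


path = shard_index_dir("/data3/10.138.204.193-node1", "FYWtFGTIS2CLB8yJhFXG9g", 0)
print(path)
```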

List the files under that directory:

root@prd-paas-es-01:/data3/10.138.204.193-node1/nodes/0/indices/FYWtFGTIS2CLB8yJhFXG9g# tree -C -s
.
├── [       4096]  0
│   ├── [      20480]  index
│   │   ├── [        158]  _17f.fdm
│   │   ├── [   25578562]  _17f.fdt
│   │   ├── [       1939]  _17f.fdx
│   │   ├── [       4636]  _17f.fnm
│   │   ├── [    7981735]  _17f.kdd
│   │   ├── [      20898]  _17f.kdi
│   │   ├── [        716]  _17f.kdm
│   │   ├── [    7945983]  _17f_Lucene80_0.dvd
│   │   ├── [       3916]  _17f_Lucene80_0.dvm
│   │   ├── [    6230127]  _17f_Lucene84_0.doc
│   │   ├── [    3875001]  _17f_Lucene84_0.pos
│   │   ├── [    7448815]  _17f_Lucene84_0.tim
│   │   ├── [     108786]  _17f_Lucene84_0.tip
│   │   ├── [       1637]  _17f_Lucene84_0.tmd
│   │   ├── [        593]  _17f.si
│   │   ├── [        158]  _3uv.fdm
│   │   ├── [   33652243]  _3uv.fdt
│   │   ├── [       2555]  _3uv.fdx
│   │   ├── [       4636]  _3uv.fnm
│   │   ├── [   10520395]  _3uv.kdd
│   │   ├── [      27689]  _3uv.kdi
│   │   ├── [        716]  _3uv.kdm
│   │   ├── [   10573208]  _3uv_Lucene80_0.dvd
│   │   ├── [       3916]  _3uv_Lucene80_0.dvm
│   │   ├── [    8298061]  _3uv_Lucene84_0.doc
│   │   ├── [    5154427]  _3uv_Lucene84_0.pos
│   │   ├── [    9716222]  _3uv_Lucene84_0.tim
│   │   ├── [     142063]  _3uv_Lucene84_0.tip
│   │   ├── [       1620]  _3uv_Lucene84_0.tmd
│   │   ├── [        593]  _3uv.si
│   │   ├── [        158]  _5bg.fdm
│   │   ├── [   16433011]  _5bg.fdt
│   │   ├── [       1259]  _5bg.fdx
│   │   ├── [       4636]  _5bg.fnm
│   │   ├── [    5158094]  _5bg.kdd
│   │   ├── [      13396]  _5bg.kdi
│   │   ├── [        716]  _5bg.kdm
│   │   ├── [    5140762]  _5bg_Lucene80_0.dvd
│   │   ├── [       3916]  _5bg_Lucene80_0.dvm
│   │   ├── [    4005897]  _5bg_Lucene84_0.doc
│   │   ├── [    2583880]  _5bg_Lucene84_0.pos
│   │   ├── [    4873082]  _5bg_Lucene84_0.tim
│   │   ├── [      70979]  _5bg_Lucene84_0.tip
│   │   ├── [       1593]  _5bg_Lucene84_0.tmd
│   │   ├── [        593]  _5bg.si
│   │   ├── [        158]  _60h.fdm
│   │   ├── [   24664753]  _60h.fdt
│   │   ├── [       1886]  _60h.fdx
│   │   ├── [       4636]  _60h.fnm
│   │   ├── [    7640438]  _60h.kdd
│   │   ├── [      19996]  _60h.kdi
│   │   ├── [        716]  _60h.kdm
│   │   ├── [    7754954]  _60h_Lucene80_0.dvd
│   │   ├── [       3916]  _60h_Lucene80_0.dvm
│   │   ├── [    6147241]  _60h_Lucene84_0.doc
│   │   ├── [    3998559]  _60h_Lucene84_0.pos
│   │   ├── [    7254035]  _60h_Lucene84_0.tim
│   │   ├── [     105673]  _60h_Lucene84_0.tip
│   │   ├── [       1719]  _60h_Lucene84_0.tmd
│   │   ├── [        593]  _60h.si
│   │   ├── [        200]  _7jq.fdm
│   │   ├── [   63208093]  _7jq.fdt
│   │   ├── [       4692]  _7jq.fdx
│   │   ├── [       4636]  _7jq.fnm
│   │   ├── [   19306117]  _7jq.kdd
│   │   ├── [      51562]  _7jq.kdi
│   │   ├── [        716]  _7jq.kdm
│   │   ├── [   20228561]  _7jq_Lucene80_0.dvd
│   │   ├── [       3916]  _7jq_Lucene80_0.dvm
│   │   ├── [   15606568]  _7jq_Lucene84_0.doc
│   │   ├── [    9581341]  _7jq_Lucene84_0.pos
│   │   ├── [   17383473]  _7jq_Lucene84_0.tim
│   │   ├── [     272615]  _7jq_Lucene84_0.tip
│   │   ├── [       1592]  _7jq_Lucene84_0.tmd
│   │   ├── [        593]  _7jq.si
│   │   ├── [        437]  _82w.cfe
│   │   ├── [    4489379]  _82w.cfs
│   │   ├── [        408]  _82w.si
│   │   ├── [        437]  _87w.cfe
│   │   ├── [    4932636]  _87w.cfs
│   │   ├── [        408]  _87w.si
│   │   ├── [        437]  _8ao.cfe
│   │   ├── [   13905317]  _8ao.cfs
│   │   ├── [        408]  _8ao.si
│   │   ├── [        437]  _8ls.cfe
│   │   ├── [   20181047]  _8ls.cfs
│   │   ├── [        408]  _8ls.si
│   │   ├── [        437]  _8nq.cfe
│   │   ├── [    1234712]  _8nq.cfs
│   │   ├── [        408]  _8nq.si
│   │   ├── [        437]  _8oa.cfe
│   │   ├── [     872798]  _8oa.cfs
│   │   ├── [        408]  _8oa.si
│   │   ├── [        437]  _8pp.cfe
│   │   ├── [    1593677]  _8pp.cfs
│   │   ├── [        408]  _8pp.si
│   │   ├── [        437]  _8r5.cfe
│   │   ├── [     914008]  _8r5.cfs
│   │   ├── [        408]  _8r5.si
│   │   ├── [        437]  _8rf.cfe
│   │   ├── [     940473]  _8rf.cfs
│   │   ├── [        408]  _8rf.si
│   │   ├── [        437]  _8rz.cfe
│   │   ├── [    1315312]  _8rz.cfs
│   │   ├── [        408]  _8rz.si
│   │   ├── [        437]  _8s9.cfe
│   │   ├── [    1121692]  _8s9.cfs
│   │   ├── [        408]  _8s9.si
│   │   ├── [        437]  _8sk.cfe
│   │   ├── [     243476]  _8sk.cfs
│   │   ├── [        408]  _8sk.si
│   │   ├── [       1678]  segments_6
│   │   └── [          0]  write.lock
│   ├── [       4096]  _state
│   │   ├── [        186]  retention-leases-2865.st
│   │   └── [        125]  state-0.st
│   └── [       4096]  translog
│       ├── [         55]  translog-29.tlog
│       └── [         88]  translog.ckp
└── [       4096]  _state
    └── [       1230]  state-2.st

5 directories, 118 files
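Every per-segment file in this listing starts with an underscore and the segment name, so the listing can be grouped by segment mechanically. A minimal sketch, assuming the `tree -s` output is available as text. The regular expression and the helper name bytes_per_segment are my own and only cover the file-name shapes that appear in this listing.

```python
import re
from collections import defaultdict

# Matches lines such as "│   │   ├── [    7448815]  _17f_Lucene84_0.tim":
# size in brackets, segment name, optional codec suffix, extension.
FILE_LINE = re.compile(
    r"\[\s*(\d+)\]\s+(_[0-9a-z]+)(?:_[A-Za-z0-9]+_\d+)?\.([a-z]+)$"
)


def bytes_per_segment(tree_output):
    """Sum file sizes per segment name from `tree -s` output."""
    totals = defaultdict(int)
    for line in tree_output.splitlines():
        match = FILE_LINE.search(line.rstrip())
        if match:
            size, segment, _ext = match.groups()
            totals[segment] += int(size)
    return dict(totals)


sample = """\
│   │   ├── [        437]  _8sk.cfe
│   │   ├── [     243476]  _8sk.cfs
│   │   ├── [        408]  _8sk.si
│   │   ├── [       1678]  segments_6
│   │   └── [          0]  write.lock
│   ├── [       4096]  _state
"""

totals = bytes_per_segment(sample)
print(totals)  # {'_8sk': 244321}
```

Lines that are not per-segment files (segments_6, write.lock, the _state directory) are skipped. The per-segment totals can then be compared with the size_in_bytes values that the _segments API reports.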

With the file listing in hand, let's look at the segment information:

GET /logstash-2023.05.30/_segments
// For readability, only the segments of shard 0 are shown

{
  "_shards": {"total": 7, "successful": 7, "failed": 0},
  "indices": {
    "logstash-2023.05.30": {
      "shards": {
        "0": [
          {
            "routing": {"state": "STARTED", "primary": true, "node": "4hEWcF8hRFWTEkQxlKQmqg"},
            "num_committed_segments": 17,
            "num_search_segments": 17,
            "segments": {
              "_17f": {"generation": 1563, "num_docs": 210331, "deleted_docs": 0, "size_in_bytes": 59203502, "memory_in_bytes": 5140, "committed": true, "search": true, "version": "8.8.2", "compound": false, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_3uv": {"generation": 4999, "num_docs": 278411, "deleted_docs": 0, "size_in_bytes": 78098502, "memory_in_bytes": 5140, "committed": true, "search": true, "version": "8.8.2", "compound": false, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_5bg": {"generation": 6892, "num_docs": 132645, "deleted_docs": 0, "size_in_bytes": 38291972, "memory_in_bytes": 5140, "committed": true, "search": true, "version": "8.8.2", "compound": false, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_60h": {"generation": 7793, "num_docs": 199809, "deleted_docs": 0, "size_in_bytes": 57599273, "memory_in_bytes": 5140, "committed": true, "search": true, "version": "8.8.2", "compound": false, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_7jq": {"generation": 9782, "num_docs": 520420, "deleted_docs": 0, "size_in_bytes": 145654675, "memory_in_bytes": 5204, "committed": true, "search": true, "version": "8.8.2", "compound": false, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_82w": {"generation": 10472, "num_docs": 15416, "deleted_docs": 0, "size_in_bytes": 4490224, "memory_in_bytes": 5140, "committed": true, "search": true, "version": "8.8.2", "compound": true, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_87w": {"generation": 10652, "num_docs": 16837, "deleted_docs": 0, "size_in_bytes": 4933481, "memory_in_bytes": 5140, "committed": true, "search": true, "version": "8.8.2", "compound": true, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_8ao": {"generation": 10752, "num_docs": 48855, "deleted_docs": 0, "size_in_bytes": 13906162, "memory_in_bytes": 5140, "committed": true, "search": true, "version": "8.8.2", "compound": true, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_8ls": {"generation": 11152, "num_docs": 70903, "deleted_docs": 0, "size_in_bytes": 20181892, "memory_in_bytes": 5140, "committed": true, "search": true, "version": "8.8.2", "compound": true, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_8nq": {"generation": 11222, "num_docs": 3954, "deleted_docs": 0, "size_in_bytes": 1235557, "memory_in_bytes": 6924, "committed": true, "search": true, "version": "8.8.2", "compound": true, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_8oa": {"generation": 11242, "num_docs": 2785, "deleted_docs": 0, "size_in_bytes": 873643, "memory_in_bytes": 6820, "committed": true, "search": true, "version": "8.8.2", "compound": true, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_8pp": {"generation": 11293, "num_docs": 5194, "deleted_docs": 0, "size_in_bytes": 1594522, "memory_in_bytes": 7060, "committed": true, "search": true, "version": "8.8.2", "compound": true, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_8r5": {"generation": 11345, "num_docs": 2936, "deleted_docs": 0, "size_in_bytes": 914853, "memory_in_bytes": 6748, "committed": true, "search": true, "version": "8.8.2", "compound": true, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_8rf": {"generation": 11355, "num_docs": 2920, "deleted_docs": 0, "size_in_bytes": 941318, "memory_in_bytes": 6836, "committed": true, "search": true, "version": "8.8.2", "compound": true, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_8rz": {"generation": 11375, "num_docs": 4304, "deleted_docs": 0, "size_in_bytes": 1316157, "memory_in_bytes": 6820, "committed": true, "search": true, "version": "8.8.2", "compound": true, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_8s9": {"generation": 11385, "num_docs": 3647, "deleted_docs": 0, "size_in_bytes": 1122537, "memory_in_bytes": 6892, "committed": true, "search": true, "version": "8.8.2", "compound": true, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}},
              "_8sk": {"generation": 11396, "num_docs": 657, "deleted_docs": 0, "size_in_bytes": 244321, "memory_in_bytes": 7620, "committed": true, "search": true, "version": "8.8.2", "compound": true, "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}}
            }
          }
        ]
      }
    }
  }
}

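Comparing the segment list with the files in the shard directory shows a one-to-one correspondence: each of the 17 segment names returned by _segments appears as a file-name prefix in the index directory. The five segments with "compound": false (_17f, _3uv, _5bg, _60h, _7jq) keep each data structure in its own file, while the twelve with "compound": true are each packed into a .cfs/.cfe pair plus a .si file.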
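The correspondence also shows up in the names themselves: in the response, each segment name is consistent with its "generation" value written in base 36 with a leading underscore. A minimal sketch that checks this for a few name/generation pairs copied from the _segments output. The helper names segment_name and generation_of are hypothetical.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def segment_name(generation):
    """Render a generation number in base 36 with a leading underscore."""
    out = ""
    while True:
        generation, rem = divmod(generation, 36)
        out = DIGITS[rem] + out
        if generation == 0:
            break
    return "_" + out


def generation_of(name):
    """Inverse of segment_name: parse the base-36 part of a segment name."""
    return int(name.lstrip("_"), 36)


# Name/generation pairs taken from the _segments response.
pairs = {"_17f": 1563, "_3uv": 4999, "_7jq": 9782, "_8sk": 11396}
for name, generation in pairs.items():
    print(name, generation_of(name), segment_name(generation))
```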

Check the versions of ES and the corresponding Lucene:

GET /

{
  "name" : "10.138.204.193-node1",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "XWDyVuo6TgK4yUp2XWD3lw",
  "version" : {
    "number" : "7.13.4",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "c5f60e894ca0c61cdbae4f5a686d9f08bcefc942",
    "build_date" : "2021-07-14T18:33:36.673943207Z",
    "build_snapshot" : false,
    "lucene_version" : "8.8.2",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

So what do the files with the various extensions in the shard directory actually mean? Let's take a look.

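[Screenshot: the table of file names and extensions from the Lucene 8.8.2 lucene87 codec documentation]

Screenshot source:
https://lucene.apache.org/core/8_8_2/core/org/apache/lucene/codecs/lucene87/package-summary.html#package.description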

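The table shows that the FST-related file extensions are .tip (the term index, which holds the FST) and .tim (the term dictionary that the index points into). Every segment carries its own .tip and .tim files (packed inside the .cfs file for compound segments), so FSTs are created at segment granularity; more precisely, Lucene builds one FST per indexed field within each segment.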
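To tie this back to the directory listing, the extensions seen there can be mapped to short descriptions and the segments that have their own term index file can be picked out. This is a minimal sketch: the descriptions are my own paraphrases of the Lucene documentation rather than its official wording, and the helper names are hypothetical.

```python
# Paraphrased descriptions of the extensions that appear in the listing.
EXTENSIONS = {
    "si": "segment info",
    "cfs": "compound file (packed segment files)",
    "cfe": "compound file entry table",
    "fnm": "field infos",
    "fdt": "stored field data",
    "fdx": "stored field index",
    "fdm": "stored field metadata",
    "tim": "term dictionary",
    "tip": "term index (FST)",
    "tmd": "terms metadata",
    "doc": "postings: doc IDs and frequencies",
    "pos": "postings: positions",
    "dvd": "doc values data",
    "dvm": "doc values metadata",
    "kdd": "point values data",
    "kdi": "point values index",
    "kdm": "point values metadata",
}


def split_name(file_name):
    """Return (segment, extension) for a file such as '_17f_Lucene84_0.tip'."""
    stem, _, ext = file_name.rpartition(".")
    segment = stem.split("_")[1]
    return segment, ext


def segments_with_own_term_index(file_names):
    """Segments that have a standalone .tip file (non-compound segments)."""
    return sorted({split_name(n)[0] for n in file_names if n.endswith(".tip")})


names = ["_17f_Lucene84_0.tim", "_17f_Lucene84_0.tip", "_3uv_Lucene84_0.tip", "_8sk.cfs"]
for n in names:
    seg, ext = split_name(n)
    print(seg, ext, EXTENSIONS.get(ext, "unknown"))
print(segments_with_own_term_index(names))  # ['17f', '3uv']
```

Compound segments such as _8sk do not show a standalone .tip file because their term index is stored inside the .cfs file.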
