当前位置: 首页 > news >正文

Elaticsearch学习

Elaticsearch

索引

1、索引创建

PUT /index_v1
{"settings": {"number_of_shards": 3,"number_of_replicas": 1},"mappings": {"properties": {"aaa": {"type": "keyword","store": true},   "hhh": {"type": "keyword","store": true}}}
}

2、索引别名

person_info_v1为索引名称,person_info为索引要创建的别名

put /person_info_v1/_alias/person_info

查询语法

1、minimum_should_match

bool查询也可以用 minimum_should_match, 如果配置成数字 3, 则表示 查询关键词被分词器分成 3 个及其以下的term 时, they are all required(条件都需要满足才能符合查询要求)

对于被analyzer分解出来的每一个term都会构造成一个should的bool query的查询,每个term变成一个term query子句。 例如"query": “how not to be”,被解析成: { “bool”: { “should”: [ { “term”: { “body”: “how”}}, { “term”: { “body”: “not”}}, { “term”: { “body”: “to”}}, { “term”: { “body”: “be”}} ],

2、查询分词效果

anlyzer后面是分词器,有ik_smart,ik_max_word等,text后面是想要查看分词效果的词

POST _analyze
{"analyzer":"ik_max_word","text":"李四"}

3、must和should混合使用

must是数据库中AND的意思,should是数据库中OR的意思,使用的时候不能简单的QueryBuilders.boolQuery.must().should(),要向下面这样使用

QueryBuilders.boolQuery().must(QueryBuilders.termQuery("is_deleted", DELETE_FLAG)).must(QueryBuilders.boolQuery().should(QueryBuilders.matchQuery("person_name", keywordVal).operator(Operator.AND).analyzer("ik_max_word") ));

Operato.AND表示查询分词要和es中的索引都匹配上才行,比如索引中内容是张三三,分词效果是三三,查询内容是张三,分词是,那这个时候就查询不到结果,查询内容改成张三三,分词效果是三三,就和索引中的分词都匹配上了,可以查询出内容。这样做的原因是防止你输入张三的时候把李三也查出来。如果不显示的声明Operator.AND,那会默认使用Operator.OR,这样的话输入张三,就会把李三也查出来,因为张三分词是,只要匹配了,就会查出来

4、查询索引中数据大小

GET /my-index-000001/_stats

5、字段匹配度排序

比如有个person_name字段,正常查询的时候按照_score排序,查询张建的时候,张建建的分值比张建的分值大,导致排序的时候张建建排在张建之前,但是按照常理来说,张建应该排在张建建之前,这就涉及到es的分词器以及分值计算问题了

解决方法是在person_name字段中设置一个子字段,不分词

"person_name": {"type": "text","analyzer": "ik_max_word","search_analyzer": "ik_smart","store": true,"index_options": "docs","fields": {"raw": { "type": "keyword", "store": true }}}

查询的时候,使用match_parse精确查询子字段并用boost设置较大的权重,使用match模糊查询person_name字段

查询语句

1、短语匹配
{"query": {"bool": {"should": [{"match_phrase": {"person_name.raw": {"query": "张建建","boost": 10}}},{"match": {"person_name": {"query": "张建建"}}}]}}
}

java代码

BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
queryBuilder.should(QueryBuilders.matchPhraseQuery("person_name.raw",keywordVal).boost(4));
queryBuilder.should(QueryBuilders.matchQuery("person_name", keywordVal).operator(Operator.AND).analyzer("ik_max_word"));
2、查询所有
/_search
{"query": {"match_all": {}}
}
3、查询数量
/_count
{"query": {"match_all": {}}
}
4、排序
{"query": {"match": {"ent_name": "杭州乾元"}},"sort": [{"est_date": {"order": "asc"}}]
}
5、nested查询
 {"query": {"bool": {"filter": [{"nested": {"query": {"bool": {"filter": [{"term": {"clues.clue_id": {"value": "xxx","boost": 1}}}],"boost": 1}},"path": "clues","score_mode": "none","boost": 1}}],"boost": 1}}
}
6、字段+nested
{"query": {"bool": {"filter": [{"terms": {"_id": ["xxx"],"boost": 1}},{"nested": {"query": {"bool": {"filter": [{"terms": {"clues.clue_code": ["xxx"],"boost": 1}}],"adjust_pure_negative": true,"boost": 1}},"path": "clues","ignore_unmapped": false,"score_mode": "none","boost": 1}}],"adjust_pure_negative": true,"boost": 1}}
}
7、nested字段为空条件查询
{"query": {"bool": {"must_not": [{"nested": {"path": "tags","query": {"exists": {"field": "tags"}}}}]}}
}
8、案件数据为空,但是线索不为空的数据
{"query": {"bool": {"filter": [{"bool": {"should": [{"bool": {"must_not": [{"exists": {"field": "case_type"}}],"adjust_pure_negative": true,"boost": 1}}],"adjust_pure_negative": true,"boost": 1}},{"range": {"clue_num": {"from": "0","to": null,"include_lower": false,"include_upper": true,"boost": 1}}}]}}
}

删除

删除索引中的全部数据

POST /my_index/_delete_by_query
{"query": {"match_all": {}}
}

命令行删除:

curl -u elastic:'xxxx' -XPOST 'ip:port/medical_institution/_delete_by_query?refresh&slices=5&pretty' -H 'Content-Type: application/json' -d'{  "query": {    "match_all": {}  }}'

插入

POST /person_info_test_v1/_doc/
{"person_name": "张建芬"
}

更新

1、数据更新

(1)nested更新
POST  http://ip:port/case_info/_update_by_query
{"script": {"source": "ctx._source.clues[0].clue_state = 2","lang": "painless"},"query": {"bool": {"filter": [{"nested": {"query": {"bool": {"filter": [{"term": {"clues.clue_id": {"value": "xxx","boost": 1}}}],"boost": 1}},"path": "clues","score_mode": "none","boost": 1}}],"boost": 1}}
}
(2)nested字段置空
{"script": {"source": "ctx._source.clues = []","lang": "painless"},"query": {"term": {"_id": "xxx"}}
}
(3)多条件更新
POST  http://ip:port/case_info/_update_by_query
{"script": {"source": "ctx._source.obj_code = 'xxx'","lang": "painless"},"query": {"bool": {"filter": [{"term": {"case_type": "check_action"}},{"term": {"obj_code": "xxx"}}]}}
}
(4)数组(nested)字段更新
#更新为空的字段
{"script": {"source": "def tags= ctx._source.tags;def newTag=params.tagInfo; if (tags == null) {  ctx._source.tags = params.tagInfo;}","lang": "painless","params": {"tagInfo": [{"tag_code": "case_xzcf_basic_0001","tag_value": "简易程序"},{"tag_code": "case_xzcf_basic_0002","tag_value": "立案阶段"},{"tag_code": "case_xzcf_basic_0003","tag_value": "无文书"}]}},"query": {"term": {"_id": "0e978d6afb74b52a322d7aa8fbfbddf8"}}
}
#将不为空的字段置为空
{"script": {"source": "def tags= ctx._source.tags;def newTag=params.tagInfo;  ctx._source.tags = params.tagInfo;","lang": "painless","params": {"tagInfo": []}},"query": {"bool": {"must": [{"nested": {"path": "tags","query": {"exists": {"field": "tags"}}}}]}}
}

2、更新配置参数

PUT http://ip:port/case_info/_settings
{"refresh_interval": "1s"
}

访问

1、在linux中加密访问

#elastic是用户名,xxx是密码
curl ip:port -u elastic:'xxx'

2、ES健康状态查看

curl http://localhost:9200/_cat/health?v -u elastic:'xxx'

ES问题处理

一、数据插入失败

1、提示只读

] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"index [person_info_v1] blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"})

解决方法:首先查看磁盘空间是否被占满了,如果磁盘空间够用,则执行以下语句,将索引只读状态置为false

/indexname/_settings    PUT
{"index": {"blocks": {"read_only_allow_delete": "false"}}
}{"index": {"refresh_interval": "1s"}
}

2、cpu占用过高

在网页上输入以下地址

http://ip:port/_nodes/hotthreads

问题处理

一、数据插入失败

1、提示只读

] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"index [person_info_v1] blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"})

解决方法:首先查看磁盘空间是否被占满了,如果磁盘空间够用,则执行以下语句,将索引只读状态置为false

/indexname/_settings    PUT
{"index": {"blocks": {"read_only_allow_delete": "false"}}
}{"index": {"refresh_interval": "1s"}
}

2、cpu占用过高

在网页上输入以下地址

http://ip:port/_nodes/hotthreads

查询出的内容搜索cpu usage by thread即可

http://www.lryc.cn/news/242557.html

相关文章:

  • 【腾讯云云上实验室】向量数据库+LangChain+LLM搭建智慧辅导系统实践
  • 从0开始学习JavaScript--深入了解JavaScript框架
  • 【教3妹学编程-算法题】二叉树中的伪回文路径
  • 快速上手Banana Pi BPI-M4 Zero 全志科技H618开源硬件开发开发板
  • Node.js入门指南(三)
  • Leetcode—2824.统计和小于目标的下标对数目【简单】
  • 【基础架构】part-2 可扩展性
  • [SWPUCTF 2021 新生赛]no_wakeup
  • 类和对象(3)日期类的实现
  • 分布式篇---第五篇
  • SpringMVC(二)
  • kafka操作的一些坑
  • 转录组学习第5弹-比对参考基因组
  • 部署系列六基于nndeploy的深度学习 图像降噪unet部署
  • 使用 ClickHouse 做日志分析
  • 华为ospf路由协议防环和次优路径中一些难点问题分析
  • python-opencv划痕检测-续
  • c++[string实现、反思]
  • c++版本opencv计算灰度图像的轮廓点
  • 【05】ES6:函数的扩展
  • Ubuntu20.04安装搜狗输入法
  • linux的基础命令
  • linux查询某个进程使用的内存量
  • list的总结
  • c语言数字转圈
  • Apache Superset数据分析平台如何实现公网实时远程访问数据【内网穿透】
  • HarmonyOS应用开发实战—登录页面【ArkTS】
  • @RequestMapping
  • 操作系统 应用题 例题+参考答案(考研真题)
  • 免费获取GPT-4的五种工具