当前位置：首页 > news >正文

elasticsearch term match 查询

news 2025/8/16 23:03:14

1. 准备数据

PUT h1/doc/1
{"name": "rose","gender": "female","age": 18,"tags": ["白", "漂亮", "高"]
}PUT h1/doc/2
{"name": "lila","gender": "female","age": 18,"tags": ["黑", "漂亮", "高"]
}PUT h1/doc/3
{"name": "john","gender": "male","age": 18,"tags": ["黑", "帅", "高"]
}

运行结果：

{"_index" : "h1","_type" : "doc","_id" : "1","_version" : 1,"result" : "created","_shards" : {"total" : 2,"successful" : 1,"failed" : 0},"_seq_no" : 0,"_primary_term" : 1
}

2. match 查询

2.1 match 按条件查询

# 查询性别是男性的结果
GET h1/doc/_search
{"query": {"match": {"gender": "male"}}
}

查询结果：

{"took" : 59,"timed_out" : false,"_shards" : {"total" : 5,"successful" : 5,"skipped" : 0,"failed" : 0},"hits" : {"total" : 1,"max_score" : 0.2876821,"hits" : [{"_index" : "h1",		# 索引"_type" : "doc",		# 文档类型"_id" : "3",			# 文档唯一 id"_score" : 0.2876821,	# 打分机制打出来的分数"_source" : {			# 查询结果"name" : "john","gender" : "male","age" : 18,"tags" : ["黑","帅","高"]}}]}
}

2.2 match_all 查询全部

# 查询 h1 中所有文档
GET h1/doc/_search
{"query": {"match_all": {}}
}

match_all的值为空，表示没有查询条件，那就是查询全部。就像select * from table_name 一样。

查询结果：

{"took" : 2,"timed_out" : false,"_shards" : {"total" : 5,"successful" : 5,"skipped" : 0,"failed" : 0},"hits" : {"total" : 3,"max_score" : 1.0,"hits" : [{"_index" : "h1","_type" : "doc","_id" : "2","_score" : 1.0,"_source" : {"name" : "lila","gender" : "female","age" : 18,"tags" : ["黑","漂亮","高"]}},{"_index" : "h1","_type" : "doc","_id" : "1","_score" : 1.0,"_source" : {"name" : "rose","gender" : "female","age" : 18,"tags" : ["白","漂亮","高"]}},{"_index" : "h1","_type" : "doc","_id" : "3","_score" : 1.0,"_source" : {"name" : "john","gender" : "male","age" : 18,"tags" : ["黑","帅","高"]}}]}
}

2.3 match_phrase 短语查询

match 查询时散列映射，包含了我们希望搜索的字段和字符串，即只要文档中有我们希望的那个关键字，但也会带来一些问题。

es 会将文档中的内容进行拆分，对于英文来说可能没有太大的影响，但是中文短语就不太适用，一旦拆分就会失去原有的含义，比如以下：

1、准备数据：

PUT t1/doc/1
{"title": "中国是世界上人口最多的国家"
}PUT t1/doc/2
{"title": "美国是世界上军事实力最强大的国家"
}PUT t1/doc/3
{"title": "北京是中国的首都"
}

2、先使用 match 查询含有中国的文档：

GET t1/doc/_search
{"query": {"match": {"title": "中国"}}
}

查询结果：

{"took" : 5,"timed_out" : false,"_shards" : {"total" : 5,"successful" : 5,"skipped" : 0,"failed" : 0},"hits" : {"total" : 3,"max_score" : 0.68324494,"hits" : [{"_index" : "t1","_type" : "doc","_id" : "1","_score" : 0.68324494,"_source" : {"title" : "中国是世界上人口最多的国家"}},{"_index" : "t1","_type" : "doc","_id" : "3","_score" : 0.5753642,"_source" : {"title" : "北京是中国的首都"}},{"_index" : "t1","_type" : "doc","_id" : "2","_score" : 0.39556286,"_source" : {"title" : "美国是世界上军事实力最强大的国家"}}]}
}

发现三篇文档都被返回，与我们的预期有偏差；这是因为 title 中的内容被拆分成一个个单独的字，而 id=2 的文档包含了国字也符合，所以也被返回了。es 自带的中文分词处理不太好用，后面可以使用 ik 中文分词器来处理。

3、match_phrase 查询短语

不过可以使用 match_phrase 来匹配短语，将上面的 match 换成 match_phrase 试试：

# 短语查询
GET t1/doc/_search
{"query": {"match_phrase": {"title": "中国"}}
}

查询结果：

{"took" : 2,"timed_out" : false,"_shards" : {"total" : 5,"successful" : 5,"skipped" : 0,"failed" : 0},"hits" : {"total" : 2,"max_score" : 0.5753642,"hits" : [{"_index" : "t1","_type" : "doc","_id" : "1","_score" : 0.5753642,"_source" : {"title" : "中国是世界上人口最多的国家"}},{"_index" : "t1","_type" : "doc","_id" : "3","_score" : 0.5753642,"_source" : {"title" : "北京是中国的首都"}}]}
}

4、slop 间隔查询

当我们要查询的短语，中间有别的词时，可以使用 slop 来跳过。比如上述要查询 中国世界，这个短语中间被是隔开了，这时可以使用 slop 来跳过，相当于正则中的中国.*?世界：

# 短语查询，查询中国世界，加 slop 
GET t1/doc/_search
{"query": {"match_phrase": {"title": {"query": "中国世界","slop": 1}}}
}

查询结果：

{"took" : 4,"timed_out" : false,"_shards" : {"total" : 5,"successful" : 5,"skipped" : 0,"failed" : 0},"hits" : {"total" : 1,"max_score" : 0.7445889,"hits" : [{"_index" : "t1","_type" : "doc","_id" : "1","_score" : 0.7445889,"_source" : {"title" : "中国是世界上人口最多的国家"}}]}
}

2.4 match_phrase_prefix 最左前缀查询

场景：当我们要查询的词只能想起前几个字符时

# 最左前缀查询，查询名字为 rose 的文档
GET h1/doc/_search
{"query": {"match_phrase_prefix": {"name": "ro"}}
}

查询结果：

{"took" : 1,"timed_out" : false,"_shards" : {"total" : 5,"successful" : 5,"skipped" : 0,"failed" : 0},"hits" : {"total" : 1,"max_score" : 0.2876821,"hits" : [{"_index" : "h1","_type" : "doc","_id" : "1","_score" : 0.2876821,"_source" : {"name" : "rose","gender" : "female","age" : 18,"tags" : ["白","漂亮","高"]}}]}
}

限制结果集

最左前缀查询很费性能，返回的是一个很大的集合，一般很少使用，使用的时候最好对结果集进行限制，max_expansions 参数可以设置最大的前缀扩展数量：

# 最左前缀查询
GET h1/doc/_search
{"query": {"match_phrase_prefix": {"gender": {"query": "fe","max_expansions": 1}}}
}

查询结果：

{"took" : 2,"timed_out" : false,"_shards" : {"total" : 5,"successful" : 5,"skipped" : 0,"failed" : 0},"hits" : {"total" : 2,"max_score" : 0.2876821,"hits" : [{"_index" : "h1","_type" : "doc","_id" : "2","_score" : 0.2876821,"_source" : {"name" : "lila","gender" : "female","age" : 18,"tags" : ["黑","漂亮","高"]}},{"_index" : "h1","_type" : "doc","_id" : "1","_score" : 0.2876821,"_source" : {"name" : "rose","gender" : "female","age" : 18,"tags" : ["白","漂亮","高"]}}]}
}

2.5 multi_match 多字段查询

1、准备数据：

# 多字段查询
PUT t3/doc/1
{"title": "maggie is beautiful girl","desc": "beautiful girl you are beautiful so"
}PUT t3/doc/2
{"title": "beautiful beach","desc": "I like basking on the beach,and you? beautiful girl"
}

2、查询包含 beautiful 字段的文档：

GET t3/doc/_search
{"query": {"multi_match": {"query": "beautiful",				# 要查询的词"fields": ["desc", "title"]		# 要查询的字段}}
}

还可以当做 match_phrase和match_phrase_prefix使用，只需要指定type类型即可：

GET t3/doc/_search
{"query": {"multi_match": {"query": "gi","fields": ["title"],"type": "phrase_prefix"}}
}GET t3/doc/_search
{"query": {"multi_match": {"query": "girl","fields": ["title"],"type": "phrase"}}
}

3. term 查询

3.1 初始 es 的分析器

term 查询用于精确查询，但是不适用于 text 类型的字段查询。

在此之前我们先了解 es 的分析机制，默认的标准分析器会对文档进行：

删除大多数的标点符号
将文档拆分为单个词条，称为 token
将 token 转换为小写

最后保存到倒排序索引上，而倒排序索引用来查询，如 Beautiful girl 经过分析后是这样的：

POST _analyze
{"analyzer": "standard","text": "Beautiful girl"
}# 结果，转换为小写了
{"tokens" : [{"token" : "beautiful","start_offset" : 0,"end_offset" : 9,"type" : "<ALPHANUM>","position" : 0},{"token" : "girl","start_offset" : 10,"end_offset" : 14,"type" : "<ALPHANUM>","position" : 1}]
}

3.2 term 查询

1、准备数据：

# 创建索引，自定义 mapping，后面会讲到
PUT t4
{"mappings": {"doc":{"properties":{"t1":{"type": "text"    # 定义字段类型为 text}}}}
}PUT t4/doc/1
{"t1": "Beautiful girl!"
}PUT t4/doc/2
{"t1": "sexy girl!"
}

2、match 查询：

GET t4/doc/_search
{"query": {"match": {"t1": "Beautiful girl!"}}
}

经过分析后，会得到 beautiful、girl 两个 token，然后再去 t4 索引上去查询，会返回两篇文档：

{"took" : 1,"timed_out" : false,"_shards" : {"total" : 5,"successful" : 5,"skipped" : 0,"failed" : 0},"hits" : {"total" : 2,"max_score" : 0.5753642,"hits" : [{"_index" : "t4","_type" : "doc","_id" : "1","_score" : 0.5753642,"_source" : {"title" : "Beautiful girl"}},{"_index" : "t4","_type" : "doc","_id" : "2","_score" : 0.2876821,"_source" : {"title" : "sex girl"}}]}
}

3、但是我们只想精确查询包含 Beautiful girl 的文档，这时就需要使用 term 来精确查询：

GET t4/doc/_search
{"query": {"term": {"title": "beautiful"}}
}

查询结果：

{"took" : 0,"timed_out" : false,"_shards" : {"total" : 5,"successful" : 5,"skipped" : 0,"failed" : 0},"hits" : {"total" : 1,"max_score" : 0.2876821,"hits" : [{"_index" : "t4","_type" : "doc","_id" : "1","_score" : 0.2876821,"_source" : {"title" : "Beautiful girl"}}]}
}

注意：term 查询不适用于类型是 text 的字段，可以使用 match 查询；另外 Beautiful 经过分析后变为 beautiful，查询时使用 Beautiful 是查询不到的~

3.3 查询多个

精确查询多个字段：

GET t4/doc/_search
{"query": {"terms": {"title": ["beautiful", "sex"]}}
}

查看全文

http://www.lryc.cn/news/2055.html

canal使用说明：MySQL、Redis实时数据同步

计算机视觉框架OpenMMLab开源学习（三）：图像分类实战

awk命令

LocalDateTime获取时间的年、月、日、时、分、秒、纳秒

MoveIT Rviz和Gazebo联合仿真

ESP32S2(12K)-DS18B20数码管显示温度

linux栈溢出定位

CSS基础：选择器和声明样式

VS中安装gismo库

元学习方法解决CDFSL以及两篇SOTA论文讲解

大数据之------------数据中台

Python 中字符串是什么？

OJ刷题Day1 · 一维数组的动态和 · 将数字变成 0 的操作次数 · 最富有的客户资产总量 · Fizz Buzz · 链表的中间结点 · 赎金信

【数据结构】栈——必做题

LearnOpenGL 笔记 - 入门 04 你好，三角形

keepalived+mysql高可用

JAVA工具篇--1 Idea中 Gradle的使用

弄懂自定义 Hooks 不难，改变开发认知有点不习惯

Java面向对象基础

基于python下selenium库实现交互式图片保存操作（批量保存浏览器中的图片）

一：Datart的下载、本地运行

Docker-compose

经典文献阅读之--PLC-LiSLAM(面，线圆柱SLAM)

计算组合数Cnk即从n个不同数中选出k个不同数共有多少种方法math.comb(n,k)

CSGO搬砖项目，23年最适合小白的项目！

谈谈会话管理

1. 准备数据

2. match 查询

2.1 match 按条件查询

2.2 match_all 查询全部

2.3 match_phrase 短语查询

2.4 match_phrase_prefix 最左前缀查询

2.5 multi_match 多字段查询

3. term 查询

3.1 初始 es 的分析器

3.2 term 查询

3.3 查询多个

相关文章：