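Elasticsearch (Part 4): An Introduction to the query_string Query
An Introduction to the query_string Query
- 1 Overview
- 2 Basic Concepts
- 3 Data Preparation
- 4 query_string Query Examples
- 4.1 Basic Queries
- 4.2 Complex Queries Explained
- 4.3 Advanced Filtering Explained
- 4.4 Aggregation Queries Explained
- 4.5 Highlighting Explained
- 4.6 Pagination Explained
- 5 Summary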
Hi everyone, I'm Ouyang Fangchao. You can follow my WeChat official account "欧阳方超"; new posts will be published there first.
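1 Overview
The query_string query in Elasticsearch is a powerful tool that lets users search documents with a rich query syntax. It supports multiple fields, boolean logic, wildcards, and more, which makes it a good fit for scenarios that need flexible search. This article walks through query_string usage with examples.
2 Basic Concepts
The query_string query parses the user's query string with a strict syntax, so an invalid string makes the request fail with an error instead of being silently ignored. It lets users express complex query logic in a single compact string: Elasticsearch splits the string on operators (such as AND, OR, and NOT), analyzes each part, and returns the matching documents.
3 Data Preparation
Create an index that stores blog posts (it appears as blog_index in the result listings later in this article) and insert some documents to query against.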
{"settings": {"number_of_shards": 1,"number_of_replicas": 1},"mappings": {"properties": {"title": {"type": "text"},"content": {"type": "text"},"tags": {"type": "keyword"},"author": {"type": "keyword"},"publish_date": {"type": "date","format": "yyyy-MM-dd"},"views": {"type": "long"},"status": {"type": "keyword"}}}
}
Insert the sample data (the lines below are the body of a _bulk request):
{"index":{"_id":"1"}}
{"title":"Getting Started with Elasticsearch","content":"Elasticsearch is a powerful search and analytics engine. It provides a distributed, multitenant-capable full-text search engine.","tags":["elasticsearch","guide","search"],"author":"John Doe","publish_date":"2023-01-15","views":1000,"status":"published"}
{"index":{"_id":"2"}}
{"title":"Advanced Elasticsearch Query Guide","content":"Learn about complex queries in Elasticsearch including query_string, bool queries and aggregations.","tags":["elasticsearch","advanced","query"],"author":"Jane Smith","publish_date":"2023-02-20","views":800,"status":"published"}
{"index":{"_id":"3"}}
{"title":"Elasticsearch vs Solr Comparison","content":"A detailed comparison between Elasticsearch and Solr. Both are powerful search engines built on Apache Lucene.","tags":["elasticsearch","solr","comparison"],"author":"John Doe","publish_date":"2023-03-10","views":1200,"status":"published"}
{"index":{"_id":"4"}}
{"title":"Mastering Kibana Dashboards","content":"Create powerful visualizations and dashboards using Kibana with Elasticsearch data.","tags":["kibana","elasticsearch","visualization"],"author":"Alice Johnson","publish_date":"2023-04-05","views":600,"status":"draft"}
{"index":{"_id":"5"}}
{"title":"Elasticsearch Security Best Practices","content":"Learn about securing your Elasticsearch cluster, including authentication, authorization, and encryption.","tags":["elasticsearch","security","best practices"],"author":"Bob Wilson","publish_date":"2023-05-01","views":1500,"status":"published"}
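The action/source line pairs above follow the newline-delimited JSON layout that the _bulk API expects. As a minimal sketch, assuming the documents are held in a Python dict keyed by id (the helper name build_bulk_body is made up for this illustration, and only two abbreviated documents are shown), the body could be assembled like this:

```python
import json

# Two of the sample documents, abbreviated; the other three follow the same shape.
docs = {
    "1": {"title": "Getting Started with Elasticsearch", "author": "John Doe",
          "views": 1000, "status": "published"},
    "2": {"title": "Advanced Elasticsearch Query Guide", "author": "Jane Smith",
          "views": 800, "status": "published"},
}


def build_bulk_body(docs_by_id):
    """Return an NDJSON string: one action line plus one source line per document."""
    lines = []
    for doc_id, source in docs_by_id.items():
        lines.append(json.dumps({"index": {"_id": doc_id}}))  # action line
        lines.append(json.dumps(source))                      # document source line
    # The _bulk API requires the body to end with a newline character.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body(docs)
```

The resulting string would be posted to blog_index/_bulk with the Content-Type header set to application/x-ndjson.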
4 query_string Query Examples
4.1 Basic Queries
Simple query
The query below returns the documents whose content field contains the term powerful.
{"query": {"query_string": {"default_field":"content","query":"powerful"}}
}
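Every query body in this article is sent to the index's _search endpoint. The sketch below shows one way to wrap the body above in an HTTP request using only the Python standard library. The cluster URL (a local, unsecured node) is an assumption, and build_search_request is a hypothetical helper; the request is built but deliberately not sent, so the snippet runs without a cluster.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local cluster without authentication
INDEX = "blog_index"              # index name as shown in the result listings


def build_search_request(body):
    """Build (but do not send) a POST request for the index's _search endpoint."""
    return urllib.request.Request(
        url=f"{ES_URL}/{INDEX}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


body = {"query": {"query_string": {"default_field": "content", "query": "powerful"}}}
request = build_search_request(body)

# To run it against a live cluster (not executed here):
# with urllib.request.urlopen(request) as response:
#     hits = json.load(response)["hits"]["hits"]
```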
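Multi-field query
The multi-field query below works as follows:
- It searches the title and content fields for documents that contain both elasticsearch and security. Note that the two terms only need to be matched somewhere across the two fields; neither field has to contain both terms by itself.
- The AND operator requires both conditions to be satisfied.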
{"query": {"query_string": {"fields":["title","content"],"query":"elasticsearch AND security"}}
}
Only the document with id=5 is returned: its title contains security, and elasticsearch appears in both its title and its content.
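The "each term may match in either field" rule can be illustrated locally. The following is only a rough emulation, not how Elasticsearch evaluates the query: the tokens function is a simplistic stand-in for the standard analyzer, the contents are abbreviated, and document "x" is a hypothetical addition in which the two terms live in different fields.

```python
import re


def tokens(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-word characters."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def matches_all_terms(doc, fields, terms):
    """Every term must occur in at least one of the fields (not necessarily the same one)."""
    return all(any(term in tokens(doc[field]) for field in fields) for term in terms)


docs = {
    "1": {"title": "Getting Started with Elasticsearch",
          "content": "Elasticsearch is a powerful search and analytics engine."},
    "5": {"title": "Elasticsearch Security Best Practices",
          "content": "Learn about securing your Elasticsearch cluster."},
    # Hypothetical document: "security" only in the title, "elasticsearch" only in the content.
    "x": {"title": "Security checklist",
          "content": "Notes for an elasticsearch cluster."},
}

matched = [doc_id for doc_id, doc in docs.items()
           if matches_all_terms(doc, ["title", "content"], ["elasticsearch", "security"])]
```

Document 1 is rejected because security appears in neither field, while the hypothetical document "x" is accepted even though no single field holds both terms.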
4.2 Complex Queries Explained
Combined conditions
{"query": {"query_string": {"fields":["title","content"],"query":"(elasticsearch OR solr) AND (guide OR comparison)"}}
}
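The DSL above works as follows:
- It searches the title and content fields.
- A document must satisfy both of the following:
  - it contains at least one of "elasticsearch" or "solr", AND
  - it contains at least one of "guide" or "comparison".
Two documents are returned:
- the document with id=2 (contains elasticsearch and guide)
- the document with id=3 (contains elasticsearch, solr, and comparison)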
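The grouping with parentheses maps naturally onto set operations: OR is a union and AND is an intersection. The sketch below reproduces the result on the sample data under simplifying assumptions (title and content are concatenated and abbreviated, and tokenization is a plain word split rather than the real analyzer).

```python
import re

# Title plus (abbreviated) content of each sample document.
texts = {
    "1": "Getting Started with Elasticsearch. Elasticsearch is a powerful search and analytics engine.",
    "2": "Advanced Elasticsearch Query Guide. Learn about complex queries in Elasticsearch.",
    "3": "Elasticsearch vs Solr Comparison. A detailed comparison between Elasticsearch and Solr.",
    "4": "Mastering Kibana Dashboards. Create dashboards using Kibana with Elasticsearch data.",
    "5": "Elasticsearch Security Best Practices. Learn about securing your Elasticsearch cluster.",
}


def posting(term):
    """Ids of the documents whose text contains the term as a whole word."""
    return {doc_id for doc_id, text in texts.items()
            if term in re.findall(r"\w+", text.lower())}


# (elasticsearch OR solr) AND (guide OR comparison)
result = (posting("elasticsearch") | posting("solr")) & (posting("guide") | posting("comparison"))
```

The first group matches all five documents, so the second group alone decides the outcome here.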
Range query
{"query": {"query_string": {"query":"elasticsearch AND publish_date:[2023-01-01 TO 2023-03-31] AND views:>1000"}}
}
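The DSL above returns the documents that satisfy all of the following conditions:
- contains "elasticsearch" (no fields or default_field is given, so the bare term is matched against the index's default fields)
- publish date between 2023-01-01 and 2023-03-31, both ends included
- more than 1000 views
Only the document with id=3 is returned. The document with id=1 also falls inside the date range, but it has exactly 1000 views, so views:>1000 excludes it.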
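The boundary behaviour of the two range clauses is easy to check by hand. The sketch below applies the same conditions to the sample data in plain Python; the elasticsearch term is left out because all five documents contain it.

```python
from datetime import date

docs = {
    "1": {"publish_date": "2023-01-15", "views": 1000},
    "2": {"publish_date": "2023-02-20", "views": 800},
    "3": {"publish_date": "2023-03-10", "views": 1200},
    "4": {"publish_date": "2023-04-05", "views": 600},
    "5": {"publish_date": "2023-05-01", "views": 1500},
}


def in_range(doc):
    """publish_date:[2023-01-01 TO 2023-03-31] AND views:>1000"""
    published = date.fromisoformat(doc["publish_date"])
    # Square brackets make both ends of the range inclusive; ">" is strictly greater.
    date_ok = date(2023, 1, 1) <= published <= date(2023, 3, 31)
    return date_ok and doc["views"] > 1000


matched = [doc_id for doc_id, doc in docs.items() if in_range(doc)]
```

Writing views:>=1000 instead would also let the document with id=1 through.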
4.3 Advanced Filtering Explained
{"query": {"query_string": {"query": "status:published AND author:\"John Doe\" AND (title:elasticsearch OR content:elasticsearch)"}}
}
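It returns the documents that satisfy all of the following conditions:
- status is "published"
- author is "John Doe" (the phrase is wrapped in double quotes, escaped as \" because the query string itself sits inside a JSON string)
- the title or the content contains "elasticsearch"
Documents 1 and 3 meet these conditions and are returned.
Here is the query result: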
{"took": 8,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 2,"relation": "eq"},"max_score": 1.525382,"hits": [{"_index": "blog_index","_id": "3","_score": 1.525382,"_source": {"title": "Elasticsearch vs Solr Comparison","content": "A detailed comparison between Elasticsearch and Solr. Both are powerful search engines built on Apache Lucene.","tags": ["elasticsearch","solr","comparison"],"author": "John Doe","publish_date": "2023-03-10","views": 1200,"status": "published"}},{"_index": "blog_index","_id": "1","_score": 1.5210661,"_source": {"title": "Getting Started with Elasticsearch","content": "Elasticsearch is a powerful search and analytics engine. It provides a distributed, multitenant-capable full-text search engine.","tags": ["elasticsearch","guide","search"],"author": "John Doe","publish_date": "2023-01-15","views": 1000,"status": "published"}}]}
}
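For comparison, the same conditions can be spelled out with an explicit bool query. The body below is a sketch of a roughly comparable request, not the query that Elasticsearch generates internally from the query string; in particular, clauses placed under filter do not contribute to the score, so the _score values would differ from the listing above.

```python
import json

bool_query = {
    "query": {
        "bool": {
            # Exact-value conditions on the keyword fields.
            "filter": [
                {"term": {"status": "published"}},
                {"term": {"author": "John Doe"}},
            ],
            # (title:elasticsearch OR content:elasticsearch)
            "must": [
                {
                    "bool": {
                        "should": [
                            {"match": {"title": "elasticsearch"}},
                            {"match": {"content": "elasticsearch"}},
                        ],
                        "minimum_should_match": 1,
                    }
                }
            ],
        }
    }
}

rendered = json.dumps(bool_query, indent=2)  # request body for blog_index/_search
```

The query_string version is more compact, while the bool version makes the structure explicit and is easier to assemble programmatically.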
4.4 Aggregation Queries Explained
{"query": {"query_string": {"query": "elasticsearch AND status:published"}},"size" : 0,"aggs": {"authors": {"terms": {"field": "author"}},"avg_views": {"avg": {"field": "views"}}}
}
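This request both searches and aggregates the data. It is a little more involved, so here is a detailed breakdown.
Query part
- query: the main body of the search; it specifies the search operation to run.
- query_string: uses the query string syntax, which lets a simple text expression describe a complex query.
- query: the value here is elasticsearch AND status:published, which searches for documents that contain the term elasticsearch and whose status field is published. AND ensures that both conditions are satisfied.
Aggregation part
The aggs section defines aggregations, which compute statistics over the documents matched by the query.
- Author aggregation
  - authors: a user-chosen aggregation name; this aggregation counts the documents per author.
  - terms: selects a terms aggregation. terms is a kind of bucket aggregation that works much like GROUP BY in SQL: documents are grouped by a field, and documents with the same field value land in the same bucket.
  - "field": "author" groups by the value of the author field; the result lists each author together with its document count.
- Average views aggregation
  - avg_views: another user-chosen aggregation name; this aggregation computes the average number of views.
  - avg: selects the average metric aggregation.
  - "field": "views" computes the average of the views field across all matching documents.
Note that the DSL above sets "size": 0, so only the aggregation results are returned and the ordinary query hits are omitted (that is, "hits": []). Here is the query result: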
{"took": 4,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 4,"relation": "eq"},"max_score": null,"hits": []},"aggregations": {"avg_views": {"value": 1125},"authors": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "John Doe","doc_count": 2},{"key": "Bob Wilson","doc_count": 1},{"key": "Jane Smith","doc_count": 1}]}}
}
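The numbers in this response can be reproduced by hand. The sketch below performs the same grouping and averaging in plain Python on the sample data; since every sample document contains elasticsearch, only the status condition is applied.

```python
from collections import Counter
from statistics import mean

docs = [
    {"id": "1", "author": "John Doe", "views": 1000, "status": "published"},
    {"id": "2", "author": "Jane Smith", "views": 800, "status": "published"},
    {"id": "3", "author": "John Doe", "views": 1200, "status": "published"},
    {"id": "4", "author": "Alice Johnson", "views": 600, "status": "draft"},
    {"id": "5", "author": "Bob Wilson", "views": 1500, "status": "published"},
]

# Query part: elasticsearch AND status:published
matched = [doc for doc in docs if doc["status"] == "published"]

# terms aggregation on author: one bucket per distinct value, with a document count.
authors = Counter(doc["author"] for doc in matched)

# avg aggregation on views: (1000 + 800 + 1200 + 1500) / 4
avg_views = mean(doc["views"] for doc in matched)
```

Alice Johnson has no bucket because her only document is a draft and is filtered out before the aggregations run.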
4.5 Highlighting Explained
{"query": {"query_string": {"query": "elasticsearch security"}},"highlight": {"fields": {"title": {},"content": {}}}
}
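The DSL above has a query part and a highlight part, explained in detail below.
- Query part
  - query: the main body of the search; it defines the search operation to run.
  - query_string: uses the query string syntax, which allows a simple text expression to describe a complex query.
  - query: the value here is elasticsearch security, which searches for the terms elasticsearch and security. By default, Elasticsearch treats them as separate terms and joins them with the OR operator, so a document matches as soon as it contains either term.
  - fields: this parameter restricts which fields are searched. The example above does not set it (nor default_field), so the query runs against the index's default fields (the index.query.default_field setting, which defaults to *, meaning all eligible fields). To search only title and content, add "fields": ["title","content"] inside query_string.
- Highlight part
  - highlight: configures how matches are marked up. In each returned document the matched terms are wrapped in special markers (<em> tags by default) so that they stand out in the search results.
  - fields: lists the fields to highlight. The example specifies title and content, so whenever the content of these fields matches the query, the matching terms are highlighted and users can quickly spot the relevant information.
Here is the query result: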
{"took": 5,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 5,"relation": "eq"},"max_score": 1.6386936,"hits": [{"_index": "blog_index","_id": "5","_score": 1.6386936,"_source": {"title": "Elasticsearch Security Best Practices","content": "Learn about securing your Elasticsearch cluster, including authentication, authorization, and encryption.","tags": ["elasticsearch","security","best practices"],"author": "Bob Wilson","publish_date": "2023-05-01","views": 1500,"status": "published"},"highlight": {"title": ["<em>Elasticsearch</em> <em>Security</em> Best Practices"],"content": ["Learn about securing your <em>Elasticsearch</em> cluster, including authentication, authorization, and encryption"]}},{"_index": "blog_index","_id": "1","_score": 0.28161854,"_source": {"title": "Getting Started with Elasticsearch","content": "Elasticsearch is a powerful search and analytics engine. It provides a distributed, multitenant-capable full-text search engine.","tags": ["elasticsearch","guide","search"],"author": "John Doe","publish_date": "2023-01-15","views": 1000,"status": "published"},"highlight": {"title": ["Getting Started with <em>Elasticsearch</em>"],"content": ["<em>Elasticsearch</em> is a powerful search and analytics engine."]}},{"_index": "blog_index","_id": "2","_score": 0.28161854,"_source": {"title": "Advanced Elasticsearch Query Guide","content": "Learn about complex queries in Elasticsearch including query_string, bool queries and aggregations.","tags": ["elasticsearch","advanced","query"],"author": "Jane Smith","publish_date": "2023-02-20","views": 800,"status": "published"},"highlight": {"title": ["Advanced <em>Elasticsearch</em> Query Guide"],"content": ["Learn about complex queries in <em>Elasticsearch</em> including query_string, bool queries and aggregations."]}},{"_index": "blog_index","_id": "3","_score": 0.28161854,"_source": {"title": "Elasticsearch vs Solr Comparison","content": "A detailed comparison between Elasticsearch and Solr. Both are powerful search engines built on Apache Lucene.","tags": ["elasticsearch","solr","comparison"],"author": "John Doe","publish_date": "2023-03-10","views": 1200,"status": "published"},"highlight": {"title": ["<em>Elasticsearch</em> vs Solr Comparison"],"content": ["A detailed comparison between <em>Elasticsearch</em> and Solr."]}},{"_index": "blog_index","_id": "4","_score": 0.09708915,"_source": {"title": "Mastering Kibana Dashboards","content": "Create powerful visualizations and dashboards using Kibana with Elasticsearch data.","tags": ["kibana","elasticsearch","visualization"],"author": "Alice Johnson","publish_date": "2023-04-05","views": 600,"status": "draft"},"highlight": {"content": ["Create powerful visualizations and dashboards using Kibana with <em>Elasticsearch</em> data."]}}]}
}
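The effect of the default markers can be imitated locally. The sketch below is a simplified emulation of what the highlighter produces for the title of document 5; it matches whole words case-insensitively and knows nothing about analyzers or fragment sizes. The pre_tag and post_tag parameters of this hypothetical helper mirror the pre_tags and post_tags options that the highlight section accepts for changing the markers.

```python
import re


def highlight(text, terms, pre_tag="<em>", post_tag="</em>"):
    """Wrap whole-word, case-insensitive occurrences of any term in the given tags."""
    alternatives = "|".join(re.escape(term) for term in terms)
    pattern = re.compile(rf"\b({alternatives})\b", re.IGNORECASE)
    return pattern.sub(lambda match: f"{pre_tag}{match.group(0)}{post_tag}", text)


terms = ["elasticsearch", "security"]
title = highlight("Elasticsearch Security Best Practices", terms)
content = highlight("Learn about securing your Elasticsearch cluster.", terms)
```

As in the real response, securing is left unmarked in the content because it is a different token from security.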
4.6 Pagination Explained
The following example paginates a query written in the query string syntax:
{"query": {"query_string": {"query": "elasticsearch security"}},"from":0,"size":2,"sort":[{"views":"desc"}]
}
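The request consists of three parts: the query, the pagination controls, and the sort.
- Query part: the query string syntax.
- Pagination controls:
  - "from": 0: the offset of the first document to return; 0 means start from the first result. This is what implements paging.
  - "size": 2: the number of documents to return. In this example at most 2 matching documents are returned. Combined with from, it allows flexible paging.
- Sort part:
  - sort: defines how the search results are ordered.
  - { "views": "desc" }: sorts by the views field in descending order. Because the results are ordered by a field rather than by relevance, _score is null in the response.
Here is the response: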
{"took": 6,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 5,"relation": "eq"},"max_score": null,"hits": [{"_index": "blog_index","_id": "5","_score": null,"_source": {"title": "Elasticsearch Security Best Practices","content": "Learn about securing your Elasticsearch cluster, including authentication, authorization, and encryption.","tags": ["elasticsearch","security","best practices"],"author": "Bob Wilson","publish_date": "2023-05-01","views": 1500,"status": "published"},"sort": [1500]},{"_index": "blog_index","_id": "3","_score": null,"_source": {"title": "Elasticsearch vs Solr Comparison","content": "A detailed comparison between Elasticsearch and Solr. Both are powerful search engines built on Apache Lucene.","tags": ["elasticsearch","solr","comparison"],"author": "John Doe","publish_date": "2023-03-10","views": 1200,"status": "published"},"sort": [1200]}]}
}
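Applications usually think in page numbers rather than offsets. The sketch below shows the usual conversion from a 1-based page number to from and size, and checks it against the sample data sorted by views in descending order. The helper names page_window and fetch_page are made up for this illustration, and the slicing merely emulates what the search would return.

```python
def page_window(page, page_size):
    """Translate a 1-based page number into the from/size values of a search request."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"from": (page - 1) * page_size, "size": page_size}


# views per document id, taken from the sample data (all five match the query).
views = {"1": 1000, "2": 800, "3": 1200, "4": 600, "5": 1500}
ranked = sorted(views, key=views.get, reverse=True)  # "sort": [{"views": "desc"}]


def fetch_page(page, page_size):
    """Emulate the hits of one page by slicing the sorted id list."""
    window = page_window(page, page_size)
    return ranked[window["from"]: window["from"] + window["size"]]


first_page = fetch_page(1, 2)
last_page = fetch_page(3, 2)
```

The first page holds the documents with ids 5 and 3, matching the response above, and the last page holds only the single remaining document. By default, from plus size may not exceed 10000 (the index.max_result_window setting), so very deep paging needs a different mechanism such as search_after.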
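5 Summary
This article introduced the query string (query_string) syntax and demonstrated it alongside several more advanced search features. If the term "query string" sounds a little odd, don't worry about it: it is simply a literal rendering of query_string.
I'm Ouyang Fangchao. Do something well and the interest follows naturally. If you enjoy my articles, please like, share, comment, and follow. See you next time.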