Elasticsearch Aggregation Queries

  1. Data preparation
Create the index (the ik_max_word analyzer requires the IK analysis plugin to be installed)
PUT product
{
  "mappings": {
    "properties": {
      "createtime": {"type": "date"},
      "desc": {"type": "text", "analyzer": "ik_max_word", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
      "lv": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
      "name": {"type": "text", "analyzer": "ik_max_word", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
      "pice": {"type": "long"},
      "tags": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
      "type": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}}
    }
  }
}
Insert data
PUT /product/_doc/1
{
  "name": "小米手机",
  "desc": "手机中的战斗机",
  "pice": 3999,
  "lv": "旗舰机",
  "type": "手机",
  "createtime": "2020-10-01",
  "tags": ["性价比", "发烧", "不卡顿"]
}

PUT /product/_doc/2
{
  "name": "小米NFC手机",
  "desc": "支持全功能NFC,手机中的滑翔机",
  "pice": 4999,
  "lv": "旗舰机",
  "type": "手机",
  "createtime": "2020-05-21",
  "tags": ["性价比", "发烧", "公交卡"]
}

Bucket (grouping) queries

# Number of products per tag (sorted by count, descending), and number of products per type
GET /product/_search
{
  "size": 0,
  "aggs": {
    "tags_group": {"terms": {"field": "tags.keyword", "order": {"_count": "desc"}}},
    "type_group": {"terms": {"field": "type.keyword"}}
  }
}
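To make the bucket counts concrete, here is a minimal Python sketch of what the two terms aggregations compute, run in memory over the two sample documents from the data-preparation step. It only illustrates the grouping logic and is not how Elasticsearch implements it; the helper name `terms_agg` is hypothetical.

```python
from collections import Counter

# The two sample documents inserted earlier (only the fields used here).
docs = [
    {"type": "手机", "tags": ["性价比", "发烧", "不卡顿"]},
    {"type": "手机", "tags": ["性价比", "发烧", "公交卡"]},
]


def terms_agg(docs, field):
    """Count documents per distinct value of `field`, highest count first."""
    counts = Counter()
    for doc in docs:
        values = doc[field]
        if not isinstance(values, list):
            values = [values]
        # A document lands in each bucket at most once, even if a value repeats.
        counts.update(set(values))
    # Sort by count descending; ties are ordered by key only to keep output stable.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


tags_group = terms_agg(docs, "tags")
type_group = terms_agg(docs, "type")
print(tags_group)
print(type_group)
```

Both sample documents share the tags 性价比 and 发烧, so those two buckets hold two documents each, while the single type 手机 produces one bucket.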

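Metric queries

Query the maximum and average of pice, plus the full set of stats metrics (count, min, max, avg, sum)
GET /product/_search
{
  "size": 0,
  "aggs": {
    "pice_avg": {"avg": {"field": "pice"}},
    "max_pice": {"max": {"field": "pice"}},
    "stats_pice": {"stats": {"field": "pice"}}
  }
}

Count the distinct values of name
GET /product/_search
{
  "size": 0,
  "aggs": {
    "name_count": {"cardinality": {"field": "name.keyword"}}
  }
}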
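As a rough cross-check of the numbers to expect, the following Python sketch computes the same metrics over the two sample documents. The helper names `stats_agg` and `cardinality_agg` are hypothetical. Note that this count is exact, whereas the Elasticsearch cardinality aggregation is approximate on fields with many distinct values.

```python
# The two sample documents inserted earlier (only the fields used here).
docs = [
    {"name": "小米手机", "pice": 3999},
    {"name": "小米NFC手机", "pice": 4999},
]


def stats_agg(docs, field):
    """Return count, min, max, avg and sum of a numeric field."""
    values = [doc[field] for doc in docs if field in doc]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),
        "sum": sum(values),
    }


def cardinality_agg(docs, field):
    """Return the exact number of distinct values of a field."""
    return len({doc[field] for doc in docs if field in doc})


stats_pice = stats_agg(docs, "pice")
name_count = cardinality_agg(docs, "name")
print(stats_pice)
print(name_count)
```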

Pipeline aggregations

# The product type with the lowest average price
GET /product/_search
{
  "size": 0,
  "aggs": {
    "type_group": {
      "terms": {"field": "type.keyword"},
      "aggs": {
        "avg_pice": {"avg": {"field": "pice"}}
      }
    },
    "min_bucket": {
      "min_bucket": {"buckets_path": "type_group>avg_pice"}
    }
  }
}
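The pipeline step reads the output of another aggregation rather than the documents themselves. A minimal Python sketch of that two-stage flow follows: first average per type, then pick the smallest average. The sample data only contains the type 手机, so the two 耳机 documents here are hypothetical, added purely so that there is more than one bucket to compare.

```python
from collections import defaultdict

docs = [
    {"type": "手机", "pice": 3999},  # from the sample data
    {"type": "手机", "pice": 4999},  # from the sample data
    {"type": "耳机", "pice": 999},   # hypothetical document
    {"type": "耳机", "pice": 399},   # hypothetical document
]

# Stage 1: terms bucket per type with an avg sub-aggregation.
prices_by_type = defaultdict(list)
for doc in docs:
    prices_by_type[doc["type"]].append(doc["pice"])
avg_pice = {t: sum(p) / len(p) for t, p in prices_by_type.items()}

# Stage 2: the pipeline step works on the bucket averages, not on documents.
lowest = min(avg_pice.values())
min_bucket = {
    "value": lowest,
    "keys": [t for t, v in avg_pice.items() if v == lowest],
}
print(avg_pice)
print(min_bucket)
```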

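Aggregating over query results

Average price of TVs (type 电视)
GET /product/_search
{
  "query": {
    "bool": {
      "must": [{"term": {"type.keyword": {"value": "电视"}}}]
    }
  },
  "aggs": {
    "tags_agg": {"avg": {"field": "pice"}}
  }
}

The same query using filter instead of must, which skips relevance scoring
GET /product/_search
{
  "query": {
    "bool": {
      "filter": [{"term": {"type.keyword": {"value": "电视"}}}]
    }
  },
  "aggs": {
    "tags_agg": {"avg": {"field": "pice"}}
  }
}

Filtering after aggregation: post_filter narrows only the returned hits; the aggregation is still computed over every document the query matched
GET /product/_search
{
  "aggs": {
    "tags_agg": {"terms": {"field": "tags.keyword"}}
  },
  "post_filter": {"term": {"tags.keyword": "性价比"}}
}

# Minimum and average price of products priced at 3000 or more, plus the average over all documents
# all_avg_pic: the global aggregation ignores the outer query condition
# multi_avg_pic: the filter aggregation is intersected with the outer query condition
GET /product/_search
{
  "query": {
    "bool": {
      "must": [{"range": {"pice": {"gte": 3000}}}]
    }
  },
  "size": 0,
  "aggs": {
    "min_pice": {"min": {"field": "pice"}},
    "avg_pice": {"avg": {"field": "pice"}},
    "all_avg_pic": {
      "global": {},
      "aggs": {"avg_pic": {"avg": {"field": "pice"}}}
    },
    "multi_avg_pic": {
      "filter": {"range": {"pice": {"gte": 4000}}},
      "aggs": {"avg_pic": {"avg": {"field": "pice"}}}
    }
  }
}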
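The last request mixes three scopes, which is easy to misread. The Python sketch below walks through them with a hypothetical list of pice values (they are not taken from the sample data): the plain metrics see only the query hits, the global aggregation sees every document, and the filter aggregation sees the query hits that also pass its own condition.

```python
def avg(values):
    """Arithmetic mean of a non-empty iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


all_prices = [999, 3999, 4999]  # hypothetical pice values

# Outer query: pice >= 3000.
query_hits = [p for p in all_prices if p >= 3000]

result = {
    # Plain metrics run over the query hits.
    "min_pice": min(query_hits),
    "avg_pice": avg(query_hits),
    # global: ignores the outer query and uses every document.
    "global_avg": avg(all_prices),
    # filter: intersected with the outer query (pice >= 3000 and pice >= 4000).
    "filtered_avg": avg(p for p in query_hits if p >= 4000),
}
print(result)
```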

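Sorting aggregations

Filter to phones (手机) and earphones (耳机), group by type, compute the stats values for each group (avg, max, min), and finally sort the groups by their minimum (descending in this request)
GET /product/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": {"terms": {"type.keyword": ["手机", "耳机"]}}
    }
  },
  "aggs": {
    "avg_tag_pice": {
      "terms": {"field": "type.keyword", "order": {"pic_stats.min": "desc"}},
      "aggs": {
        "pic_stats": {"stats": {"field": "pice"}}
      }
    }
  }
}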

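Common aggregation functions

histogram function
Count the products in each price range (interval of 200)
# interval: bucket width
# keyed: false returns the buckets as an array; true returns them as an object keyed by bucket key
# min_doc_count: only buckets with at least this many documents are returned
# missing: value used for documents that lack the field
GET /product/_search
{
  "size": 0,
  "aggs": {
    "pice_histogram": {
      "histogram": {
        "field": "pice",
        "interval": 200,
        "keyed": false,
        "min_doc_count": 1,
        "missing": 0
      }
    }
  }
}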
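Each value is assigned to the bucket whose key is the value rounded down to a multiple of the interval. The Python sketch below shows that rounding together with the effect of missing and min_doc_count. It assumes an offset of 0 and does not fill in empty buckets between keys; the helper name `histogram_agg` and the document without a pice are hypothetical.

```python
import math
from collections import Counter


def histogram_agg(values, interval, missing=None, min_doc_count=1):
    """Bucket numeric values by interval; None stands for a missing field."""
    counts = Counter()
    for value in values:
        if value is None:
            if missing is None:
                continue  # without a `missing` default the document is skipped
            value = missing
        # Round down to the nearest multiple of the interval (offset assumed 0).
        key = math.floor(value / interval) * interval
        counts[key] += 1
    return {k: c for k, c in sorted(counts.items()) if c >= min_doc_count}


# Two pice values from the sample data plus one hypothetical document without pice.
pice_values = [3999, 4999, None]
buckets = histogram_agg(pice_values, interval=200, missing=0, min_doc_count=1)
print(buckets)
```

With an interval of 200, 3999 falls into the bucket starting at 3800 and 4999 into the one starting at 4800, while the document without a price is counted under key 0 because of the missing default.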
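date_histogram function
Count the products created in each month
# calendar_interval: bucket width in calendar units; the alternative, fixed_interval, takes fixed-length units whose largest unit is the day
# format: date format of the bucket keys
# extended_bounds: time range the buckets must cover
# order: sort order of the buckets
GET /product/_search
{
  "size": 0,
  "aggs": {
    "create_time_histogram": {
      "date_histogram": {
        "field": "createtime",
        "calendar_interval": "month",
        "format": "yyyy-MM",
        "extended_bounds": {"min": "2020-01", "max": "2020-12"},
        "order": {"_count": "desc"}
      }
    }
  }
}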
Sum pice per month, then accumulate the monthly sums
# sum_age: sum of pice within each month
# pice_cumulative_sum: running total of the monthly sums
GET /product/_search
{
  "size": 0,
  "aggs": {
    "create_time_histogram": {
      "date_histogram": {
        "field": "createtime",
        "calendar_interval": "month",
        "min_doc_count": 0,
        "format": "yyyy-MM",
        "extended_bounds": {"min": "2020-01", "max": "2020-12"}
      },
      "aggs": {
        "sum_age": {"sum": {"field": "pice"}},
        "pice_cumulative_sum": {"cumulative_sum": {"buckets_path": "sum_age"}}
      }
    }
  }
}
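The running total can be sketched in a few lines of Python over the two sample documents. This is only an in-memory illustration of the bucket-then-accumulate idea, with the twelve months of 2020 generated by hand to stand in for the extended bounds.

```python
from itertools import accumulate

# The two sample documents inserted earlier (only the fields used here).
docs = [
    {"createtime": "2020-10-01", "pice": 3999},
    {"createtime": "2020-05-21", "pice": 4999},
]

# One bucket per month of 2020, including months without documents.
months = [f"2020-{m:02d}" for m in range(1, 13)]

# Sum of pice per month; empty months keep a sum of 0.
monthly_sum = {month: 0 for month in months}
for doc in docs:
    monthly_sum[doc["createtime"][:7]] += doc["pice"]

# Running total across the months in chronological order.
running_total = dict(zip(months, accumulate(monthly_sum[m] for m in months)))
print(running_total)
```

The total stays at 0 until May, rises to 4999 when the first product appears, and reaches 8998 from October onward.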
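percentiles function
Percentile statistics: the pice value at each requested percentile. Results are approximate on large data sets.
GET /product/_search
{
  "size": 0,
  "aggs": {
    "pice_percentiles": {
      "percentiles": {"field": "pice", "percents": [1, 5, 25, 50, 75, 95, 99]}
    }
  }
}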
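percentile_ranks function
Rank statistics: the percentage of pice values at or below each given value. Results are approximate on large data sets.
GET /product/_search
{
  "size": 0,
  "aggs": {
    "pice_percentiles": {
      "percentile_ranks": {"field": "pice", "values": [2000, 4000, 6000]}
    }
  }
}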
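A percentile rank answers the question "what share of the values is at or below this threshold?". The Python sketch below computes that share exactly for a hypothetical list of prices (not taken from the sample data). Elasticsearch estimates the ranks instead of counting, so its numbers can differ slightly from this exact version; the helper name `percentile_ranks` is hypothetical.

```python
def percentile_ranks(values, thresholds):
    """Exact percentage of values that are <= each threshold."""
    n = len(values)
    return {t: 100.0 * sum(v <= t for v in values) / n for t in thresholds}


prices = [999, 1999, 3999, 4999]  # hypothetical pice values
ranks = percentile_ranks(prices, [2000, 4000, 6000])
print(ranks)
```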