
AWS OpenSearch: Common Usage for Search and Sorting

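Background

AWS OpenSearch is AWS's search and analytics service. It is built on a code base that was forked from the open-source Elasticsearch 7.x branch and is now maintained independently, with its own optimizations added. This article covers the common basic usage from the Java client.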

Adding the dependency

<dependency>
    <groupId>org.opensearch.client</groupId>
    <artifactId>opensearch-java</artifactId>
    <version>2.17.0</version>
</dependency>

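Returning only selected fields

Return only the fields the front end asks for ("productId", "title", "rating", "images", "productTags") instead of the whole document.

SearchResponse<ProductVO> search = openSearchClient.search(s -> s
        .index("product_index")
        // _source filtering: only these fields come back for each hit
        .source(c -> c.filter(e -> e.includes("productId", "title", "rating", "images", "productTags")))
        // values is a List<FieldValue> holding the product IDs to match
        .query(q -> q.terms(t -> t.field("productId").terms(ts -> ts.value(values)))),
        ProductVO.class);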

Paginated queries

Run the query with the paging parameters sent by the front end: the current page (pageNo, 1-based) and the number of items per page (pageSize).

PageResult<ProductVO> pageResult = new PageResult<>();
// Fall back to page 1 and 10 items per page when the request omits them
int pageIndex = requestVO.getPageNo() != null ? requestVO.getPageNo() : 1;
int pageSize = requestVO.getPageSize() != null ? requestVO.getPageSize() : 10;
// Offset of the first hit on the requested page
int from = (pageIndex - 1) * pageSize;

SearchResponse<ProductVO> search = openSearchClient.search(s -> s
        .index("product_index")
        .source(c -> c.filter(e -> e.includes("productId", "title", "rating", "images", "productTags")))
        .from(from)
        .size(pageSize)
        .query(q -> q.terms(t -> t.field("productId").terms(ts -> ts.value(values)))),
        ProductVO.class);

// Collect the deserialized _source of every hit
List<ProductVO> productList = Lists.newArrayList();
if (CollectionUtils.isNotEmpty(search.hits().hits())) {
    search.hits().hits().forEach(h -> productList.add(h.source()));
}

int total = Math.toIntExact(search.hits().total().value());
pageResult.setCurrentPage(pageIndex);
pageResult.setPageSize(pageSize);
pageResult.setTotal(total);
// Round up so that a partially filled last page is counted
pageResult.setTotalPage((total + pageSize - 1) / pageSize);
pageResult.setItems(productList);
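The paging arithmetic used above can be checked in isolation. The following is a minimal sketch using only the JDK; the class and method names (PageMath, fromOffset, totalPages) are illustrative and not part of the OpenSearch client.

```java
/** Page arithmetic for from/size paging; all names here are illustrative. */
final class PageMath {

    private PageMath() {
    }

    /** Zero-based offset of the first hit on a 1-based page. */
    static int fromOffset(int pageNo, int pageSize) {
        if (pageNo < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNo and pageSize must be >= 1");
        }
        return Math.multiplyExact(pageNo - 1, pageSize);
    }

    /** Ceiling division: number of pages needed to hold the given number of hits. */
    static int totalPages(long total, int pageSize) {
        if (total < 0 || pageSize < 1) {
            throw new IllegalArgumentException("total must be >= 0 and pageSize >= 1");
        }
        return Math.toIntExact((total + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(3, 10));  // offset of the first hit on page 3
        System.out.println(totalPages(25, 10)); // pages needed for 25 hits
    }
}
```

Keep in mind that from + size paging is bounded by the index setting index.max_result_window (10,000 by default), so very deep pages need a different approach such as search_after.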

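Compound queries

The example combines several conditions in one bool query: the language must be in the given list, the rating must fall in a window around the requested value, and products in the listed categories are excluded.

List<Query> mustQueryList = new ArrayList<>();
List<Query> mustNotQueryList = new ArrayList<>();

// must: language is one of the requested values
List<FieldValue> values = new ArrayList<>();
List<String> languages = List.of("zh");
languages.forEach(c -> values.add(FieldValue.of(c)));

// must: rating in [rating - 0.25, rating + 0.75)
// "4.5" is not an integer, so it is parsed as a double (Integer.parseInt would throw NumberFormatException)
double rating = Double.parseDouble("4.5");
Query ratingQuery = RangeQuery.of(r -> r
        .field("rating")
        .gte(JsonData.of(rating - 0.25))
        .lt(JsonData.of(rating + 0.75))).toQuery();

// must_not: the nested categories contain any of these IDs
// FieldValue.of wraps a single value, so each ID is added separately
List<FieldValue> categoriesValues = new ArrayList<>();
categoriesValues.add(FieldValue.of("CA0001"));
categoriesValues.add(FieldValue.of("CA0002"));
// Note: must_not clauses do not contribute to the score, so the boost only
// takes effect if this query is used in a must or should clause instead
Query nestedQuery = NestedQuery.of(n -> n
        .path("categories")
        .query(q -> q.terms(r -> r
                .field("categories.id")
                .terms(t -> t.value(categoriesValues))
                .boost(1000f)))).toQuery();

mustQueryList.add(TermsQuery.of(t -> t
        .field("language")
        .terms(new TermsQueryField.Builder().value(values).build())).toQuery());
mustQueryList.add(ratingQuery);
mustNotQueryList.add(nestedQuery);

Query complexQuery = BoolQuery.of(b -> b.must(mustQueryList).mustNot(mustNotQueryList)).toQuery();
SearchResponse<ProductVO> search = openSearchClient.search(s -> s
        .index("product_index")
        .source(c -> c.filter(e -> e.includes("productId", "title", "rating", "images", "productTags")))
        .from(from)
        .size(pageSize)
        .query(complexQuery),
        ProductVO.class);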
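To make the matching rules of this bool query concrete, here is a plain-Java sketch of the same conditions as an in-memory predicate. It does not call OpenSearch; the Product record and the matches method are hypothetical stand-ins that can be handy for writing down expected results in unit tests.

```java
import java.util.List;
import java.util.Set;

final class BoolQuerySketch {

    /** Hypothetical in-memory stand-in for an indexed product document. */
    record Product(String productId, String language, double rating, List<String> categoryIds) {
    }

    /** True when the product satisfies both must clauses and no must_not clause. */
    static boolean matches(Product p, Set<String> languages, double rating,
                           Set<String> excludedCategoryIds) {
        // must: terms query on language
        boolean languageOk = languages.contains(p.language());
        // must: range query, rating in [rating - 0.25, rating + 0.75)
        boolean ratingOk = p.rating() >= rating - 0.25 && p.rating() < rating + 0.75;
        // must_not: nested terms query on categories.id
        boolean excluded = p.categoryIds().stream().anyMatch(excludedCategoryIds::contains);
        return languageOk && ratingOk && !excluded;
    }

    public static void main(String[] args) {
        Product p = new Product("1", "zh", 4.5, List.of("CA0009"));
        System.out.println(matches(p, Set.of("zh"), 4.5, Set.of("CA0001", "CA0002")));
    }
}
```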

Aggregations

The following aggregation counts the number of products for each rating.

Aggregation aggregation = Aggregation.of(a -> a.terms(ts -> ts.field("rating").size(1000)));
// filterQuery is the Query that restricts which products are counted
SearchResponse<ProductBO> search = openSearchClient.search(s -> s
        .index("product_index")
        .source(c -> c.filter(e -> e.includes("productId", "title", "rating")))
        .aggregations("ratingAgg", aggregation)
        .query(filterQuery),
        ProductBO.class);

if (null != search.aggregations()) {
    Collection<Aggregate> aggregateCollection = search.aggregations().values();
    List<FacetHit> facetHitsList = Lists.newArrayList();
    aggregateCollection.forEach(aggregate -> {
        // Dispatch on the concrete aggregate type returned by the server
        String kind = aggregate._kind().name();
        log.info("The aggregation type is {}", kind);
        switch (kind) {
            case "Nested" -> {
                // Nested aggregation: process each sub-aggregation
                Collection<Aggregate> nestedAggregateCollection = aggregate.nested().aggregations().values();
                nestedAggregateCollection.forEach(nestedAggregate -> addFacetHits(nestedAggregate, facetHitsList));
            }
            case "Sterms" -> addFacetHits(aggregate, facetHitsList);
            case "Dterms" -> {
                // Terms aggregation on a numeric field: bucket keys are doubles
                Buckets<DoubleTermsBucket> buckets = aggregate.dterms().buckets();
                buckets.array().forEach(bucket -> {
                    FacetHit facetHit = new FacetHit();
                    facetHit.setCount(Math.toIntExact(bucket.docCount()));
                    facetHit.setValue(String.valueOf(bucket.key()));
                    facetHitsList.add(facetHit);
                });
            }
            default -> log.warn("Unrecognized type:{} cannot be processed", kind);
        }
    });
}

// Converts the buckets of a string terms aggregation into FacetHit entries
private void addFacetHits(Aggregate aggregate, List<FacetHit> facetHitsList) {
    Buckets<StringTermsBucket> buckets = aggregate.sterms().buckets();
    List<StringTermsBucket> stringTermsBuckets = buckets.array();
    stringTermsBuckets.forEach(s -> {
        FacetHit facetHit = new FacetHit();
        facetHit.setCount(Math.toIntExact(s.docCount()));
        // If a reverse_nested sub-aggregation named "parent_docs" is present,
        // report the number of parent documents instead of nested documents
        Aggregate parentAggregate = s.aggregations().get("parent_docs");
        if (null != parentAggregate && AggregateConstants.KIND_REVERSE_NESTED.equals(parentAggregate._kind().name())) {
            ReverseNestedAggregate reverseNested = parentAggregate.reverseNested();
            if (reverseNested != null) {
                facetHit.setCount(Math.toIntExact(reverseNested.docCount()));
            }
        }
        facetHit.setValue(s.key());
        facetHitsList.add(facetHit);
    });
}
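As a mental model for what the terms aggregation on rating returns, the sketch below computes the same kind of key and count buckets in memory with the JDK stream API. The Facet record and countByRating method are hypothetical; ordering by count descending mirrors the default bucket order of a terms aggregation, and breaking ties by key is an assumption made here only to keep the output deterministic.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class RatingFacetSketch {

    /** Hypothetical facet entry: bucket key as text plus document count. */
    record Facet(String value, long count) {
    }

    /** Groups ratings into buckets, most frequent first. */
    static List<Facet> countByRating(List<Double> ratings) {
        // One bucket per distinct rating, holding the number of occurrences
        Map<Double, Long> counts = ratings.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<Double, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<Double, Long>comparingByKey()))
                .map(e -> new Facet(String.valueOf(e.getKey()), e.getValue()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        countByRating(List.of(4.5, 5.0, 4.5, 3.0, 5.0, 4.5)).forEach(System.out::println);
    }
}
```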

Basic sorting

Sort by the given fields in order: publishDate descending first, then rating descending as the tie-breaker.

SearchResponse<ProductVO> search = openSearchClient.search(s -> s
        .index("product_index")
        .source(c -> c.filter(e -> e.includes("productId", "title", "rating", "images", "productTags")))
        .from(from)
        .size(pageSize)
        .query(q -> q.terms(t -> t.field("productId").terms(ts -> ts.value(values))))
        // Sort clauses are applied in the order they are added
        .sort(t -> t.field(f -> f.field("publishDate").order(SortOrder.Desc)))
        .sort(t -> t.field(f -> f.field("rating").order(SortOrder.Desc))),
        ProductVO.class);
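The effect of chaining two sort clauses is the same as a two-level comparator: the second field only decides the order among documents that tie on the first. A minimal in-memory sketch, with a hypothetical Product record and no OpenSearch calls:

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class BasicSortSketch {

    /** Hypothetical in-memory stand-in for an indexed product document. */
    record Product(String productId, LocalDate publishDate, double rating) {
    }

    /** publishDate descending, then rating descending among equal dates. */
    static final Comparator<Product> NEWEST_THEN_BEST_RATED =
            Comparator.comparing(Product::publishDate, Comparator.reverseOrder())
                    .thenComparing(Product::rating, Comparator.reverseOrder());

    /** Returns the product IDs in sorted order without modifying the input list. */
    static List<String> sortedIds(List<Product> products) {
        List<Product> copy = new ArrayList<>(products);
        copy.sort(NEWEST_THEN_BEST_RATED);
        return copy.stream().map(Product::productId).toList();
    }

    public static void main(String[] args) {
        System.out.println(sortedIds(List.of(
                new Product("A", LocalDate.of(2024, 1, 1), 4.0),
                new Product("B", LocalDate.of(2024, 3, 1), 3.0),
                new Product("C", LocalDate.of(2024, 3, 1), 5.0))));
    }
}
```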

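Advanced sorting

Pin a specific set of products to the top of the results. A script sort assigns each pinned product its position in the pinned list and every other product the list size, so sorting ascending puts the pinned products first, in list order. The remaining products are then ordered by rating and publishDate.

List<String> topProductIdList = List.of("1", "2"); // IDs of the products to pin to the top
SearchResponse<ProductVO> search = openSearchClient.search(s -> s
        .index("product_index")
        .source(c -> c.filter(e -> e.includes("productId", "title", "rating", "images", "productTags")))
        .from(from)
        .size(pageSize)
        .query(q -> q.terms(t -> t.field("productId").terms(ts -> ts.value(values))))
        .sort(getHightSortOptions(topProductIdList)),
        ProductVO.class);

private List<SortOptions> getHightSortOptions(List<String> topProductIdList) {
    List<SortOptions> sortOptions = Lists.newArrayList();
    // 1. Script sort: index in the pinned list, or the list size for non-pinned products.
    //    doc['productId'].value assumes productId is a keyword field with doc values.
    sortOptions.add(SortOptions.of(f -> f.script(st -> st
            .type(ScriptSortType.Number)
            .script(Script.of(sf -> sf.inline(ie -> ie
                    .source("params.topProductIds.indexOf(doc['productId'].value) >= 0 ? params.topProductIds.indexOf(doc['productId'].value) : params.topProductIds.size()")
                    .lang("painless")
                    .params(Map.of("topProductIds", JsonData.of(topProductIdList))))))
            .order(SortOrder.Asc))));
    // 2. Then by rating, highest first
    sortOptions.add(SortOptions.of(t -> t.field(f -> f.field("rating").order(SortOrder.Desc))));
    // 3. Then by publish date, newest first
    sortOptions.add(SortOptions.of(t -> t.field(f -> f.field("publishDate").order(SortOrder.Desc))));
    return sortOptions;
}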
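The ranking rule inside the Painless script is easy to verify outside the cluster. The sketch below reproduces it in plain Java; PinnedSortSketch, pinRank and pinFirst are hypothetical names, and the later sort keys (rating, publishDate) are left out so that only the pinning step is shown.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class PinnedSortSketch {

    /** Same rule as the script: list position if pinned, otherwise the list size. */
    static int pinRank(List<String> topProductIds, String productId) {
        int idx = topProductIds.indexOf(productId);
        return idx >= 0 ? idx : topProductIds.size();
    }

    /** Orders IDs by pin rank ascending; the stable sort keeps the incoming order of non-pinned IDs. */
    static List<String> pinFirst(List<String> topProductIds, List<String> productIds) {
        List<String> sorted = new ArrayList<>(productIds);
        sorted.sort(Comparator.comparingInt(id -> pinRank(topProductIds, id)));
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(pinFirst(List.of("2", "1"), List.of("5", "1", "7", "2")));
    }
}
```

Because indexOf scans the list for every document, this approach suits a short pinned list; for a long one, passing a map from ID to position as the script parameter may be worth considering.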

Viewing the data in OpenSearch

The indexed data can be browsed in OpenSearch Dashboards.
[Figure: the data as shown in OpenSearch Dashboards]
