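Elasticsearch -- deep pagination with search_after
Deep pagination with search_after
By default, Elasticsearch only lets a search reach the first 10,000 hits: from + size must not exceed 10000. A request that goes past that limit fails with an error like this: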
"reason": {"type": "query_phase_execution_exception","reason": "Result window is too large, from + size must be less than or equal to: [10000] but was [10001]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."}
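You can raise the index-level setting index.max_result_window to reach further, but that is not a good fix: every shard still has to collect and sort from + size hits, so the cost grows with page depth. It is better to use search_after.
search_after behaves roughly like the following SQL statement:
-- fetch rows 10001 to 10005
SELECT * FROM t_user ORDER BY name ASC, birthDay DESC LIMIT 10000,5;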
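The SQL above skips rows by offset, whereas search_after resumes from the sort values of the last row already seen. A minimal in-memory sketch of that difference follows; it uses only the JDK, and the User class, the method names, and the sample rows are hypothetical stand-ins, not Elasticsearch APIs. It also assumes birthDay is a yyyy-MM-dd string, so that string order matches date order.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class SearchAfterAnalogy {
    // Hypothetical stand-in for one row of t_user / one document of user_info.
    static final class User {
        final String name;
        final String birthDay; // assumed yyyy-MM-dd
        User(String name, String birthDay) { this.name = name; this.birthDay = birthDay; }
    }

    // Same ordering as in this post: name ascending, then birthDay descending.
    static final Comparator<User> ORDER = (a, b) -> {
        int byName = a.name.compareTo(b.name);
        return byName != 0 ? byName : b.birthDay.compareTo(a.birthDay);
    };

    // Offset paging (LIMIT from,size): sort, skip `from` rows, keep `size` rows.
    static List<User> offsetPage(List<User> all, int from, int size) {
        return all.stream().sorted(ORDER).skip(from).limit(size)
                .collect(Collectors.toList());
    }

    // Cursor paging: keep the first `size` rows that sort strictly after `cursor`.
    static List<User> afterPage(List<User> all, User cursor, int size) {
        return all.stream().sorted(ORDER)
                .filter(u -> ORDER.compare(u, cursor) > 0)
                .limit(size)
                .collect(Collectors.toList());
    }
}
```

Both methods return the same page only when no two rows share identical sort values. With ties, the strictly-after filter skips rows equal to the cursor, which is why a unique tiebreaker field is usually added as the last sort key.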
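DSL
- First query: fetch documents 1 to 10000. Because the request specifies a sort, every hit in the response carries a sort array; note the sort values of the last hit (the 10000th document).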
GET /user_info/_search
{"size": 10000,"sort": [{"name": {"order": "asc"}},{"birthDay": {"order": "desc"}}]
}
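- Second query: use the sort values of that last hit to fetch documents 10001 to 20000.
The only change is the extra search_after parameter, whose values are the sort values of the last hit from the first query (the 10000th document), given in the same order as the sort clauses.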
GET /user_info/_search
{"size": 10000,"sort": [{"name": {"order": "asc"}},{"birthDay": {"order": "desc"}}],"search_after": ["wang", "1993-12-01"]
}
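The step from the first request to the second can be sketched as a small helper that renders the request body and appends search_after when sort values from a previous page are available. This is a JDK-only illustration: the class and method names are hypothetical, the sort clause is hard-coded to the one used in this post, and a real client would normally use a JSON library rather than string concatenation.

```java
import java.util.List;
import java.util.stream.Collectors;

final class SearchAfterBody {
    // Render one sort value as JSON: null, numbers and booleans bare, anything else quoted.
    static String jsonValue(Object v) {
        if (v == null) return "null";
        if (v instanceof Number || v instanceof Boolean) return String.valueOf(v);
        String s = String.valueOf(v).replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + s + "\"";
    }

    // lastSort: the "sort" array of the last hit on the previous page, or null for the first page.
    static String nextBody(int size, List<Object> lastSort) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"size\": ").append(size)
          .append(",\"sort\": [{\"name\": {\"order\": \"asc\"}},{\"birthDay\": {\"order\": \"desc\"}}]");
        if (lastSort != null) {
            String values = lastSort.stream()
                    .map(SearchAfterBody::jsonValue)
                    .collect(Collectors.joining(", "));
            sb.append(",\"search_after\": [").append(values).append("]");
        }
        return sb.append("}").toString();
    }
}
```

Calling nextBody(10000, null) produces the body of the first query, and passing the last hit's sort values, for example Arrays.asList("wang", "1993-12-01"), produces the body of the second.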
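Java code
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.sort.SortOrder;

public SearchSourceBuilder searchAfterTest() {
    SearchSourceBuilder source = new SearchSourceBuilder().size(10000);
    BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
    // Optional filter, disabled here:
    // boolQuery.filter(QueryBuilders.termQuery("province", "深圳市"));
    // The sort clauses must be identical on every page.
    source.sort("name", SortOrder.ASC).sort("birthDay", SortOrder.DESC);
    // Sort values of the last hit on the previous page (the 10000th document).
    String[] searchAfter = new String[]{"wang", "1993-12-01"};
    source.searchAfter(searchAfter);
    return source.query(boolQuery);
}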
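The method above builds a single page request with hard-coded sort values. To walk an entire result set, the same step is repeated with the cursor taken from each page's last hit. The loop below is a sketch of that control flow only: Hit and PageFetcher are hypothetical types introduced here so the example runs without an Elasticsearch dependency. In a real client the fetcher would presumably wrap the search call and read each hit's sort values (for example via SearchHit.getSortValues()).

```java
import java.util.ArrayList;
import java.util.List;

final class SearchAfterLoop {
    // Hypothetical minimal view of one hit: its id and the sort values returned with it.
    static final class Hit {
        final String id;
        final Object[] sort;
        Hit(String id, Object[] sort) { this.id = id; this.sort = sort; }
    }

    // Hypothetical abstraction over "run one search":
    // searchAfter is null for the first page, otherwise the previous page's last sort values.
    interface PageFetcher {
        List<Hit> fetch(Object[] searchAfter, int size);
    }

    static List<Hit> fetchAll(PageFetcher fetcher, int pageSize) {
        if (pageSize <= 0) throw new IllegalArgumentException("pageSize must be positive");
        List<Hit> all = new ArrayList<>();
        Object[] cursor = null;                      // first page: no search_after
        while (true) {
            List<Hit> page = fetcher.fetch(cursor, pageSize);
            all.addAll(page);
            if (page.size() < pageSize) break;       // short or empty page: nothing left
            cursor = page.get(page.size() - 1).sort; // resume after the last hit of this page
        }
        return all;
    }
}
```

The loop stops on the first page that comes back shorter than pageSize, so a result set whose size is an exact multiple of pageSize costs one extra, empty request. Collecting every hit into one list is only for illustration; with large result sets each page would normally be processed and discarded.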