In Elasticsearch, the parent of a bucket_script aggregation must be a multi-bucket aggregation

Scenario

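A business requirement called for the average number of calls per person, and the query used nested and reverse_nested aggregations. When a bucket_script aggregation is placed as a sub-aggregation of reverse_nested, Elasticsearch reports the following error: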

org.elasticsearch.ElasticsearchException: Elasticsearch exception [type=class_cast_exception, reason=org.elasticsearch.search.aggregations.bucket.nested.InternalReverseNested cannot be cast to org.elasticsearch.search.aggregations.InternalMultiBucketAggregation]

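The sub-aggregations inside the nested aggregation in this scenario look like this (the parent aggregations are not shown here):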

{
  "reverse_nested": {
    "reverse_nested": {},
    "aggregations": {
      "agg_count": {
        "value_count": {
          "field": "taskId.keyword"
        }
      },
      "agg_cardinality_tel": {
        "cardinality": {
          "field": "tel.keyword"
        }
      },
      "agg_bucketScript": {
        "bucket_script": {
          "buckets_path": {
            "userCount": "agg_cardinality_tel",
            "docCount": "agg_count"
          },
          "script": {
            "source": "params.docCount / params.userCount",
            "lang": "painless"
          },
          "gap_policy": "skip"
        }
      }
    }
  }
}

Root cause analysis

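A bucket_script aggregation only works on top of a multi-bucket aggregation: its direct parent must be a multi-bucket aggregation. reverse_nested is a single-bucket aggregation, which is why the cast to InternalMultiBucketAggregation fails.

The same applies to bucket_selector and similar aggregations. The official documentation describes bucket_script as follows: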

A parent pipeline aggregation which executes a script which can perform per bucket computations on specified metrics in the parent multi-bucket aggregation. The specified metric must be numeric and the script must return a numeric value.
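
The rule above can be checked before a query is sent. The following is a minimal sketch, not part of Elasticsearch or its clients: the helper names (`agg_type`, `sub_aggs`, `find_misplaced_pipelines`) are hypothetical, and the two type sets are illustrative and deliberately incomplete. It walks an aggregation tree written as a Python dict and reports every bucket_script or bucket_selector whose direct parent is one of the listed single-bucket aggregations.

```python
# Illustrative, non-exhaustive sets; extend them for the aggregation types you use.
SINGLE_BUCKET_TYPES = {"nested", "reverse_nested", "filter", "global", "missing"}
PARENT_PIPELINE_TYPES = {"bucket_script", "bucket_selector"}
SUB_AGG_KEYS = ("aggregations", "aggs")


def agg_type(body):
    """Return the aggregation type key of one aggregation body, e.g. 'terms'."""
    for key in body:
        if key not in SUB_AGG_KEYS and key != "meta":
            return key
    return None


def sub_aggs(body):
    """Return the sub-aggregations of one aggregation body (empty dict if none)."""
    for key in SUB_AGG_KEYS:
        if key in body:
            return body[key]
    return {}


def find_misplaced_pipelines(aggs, parent_type=None, path=""):
    """List paths of parent pipeline aggs whose direct parent is single-bucket."""
    problems = []
    for name, body in aggs.items():
        current_type = agg_type(body)
        current_path = f"{path}>{name}" if path else name
        if current_type in PARENT_PIPELINE_TYPES and parent_type in SINGLE_BUCKET_TYPES:
            problems.append(current_path)
        problems.extend(find_misplaced_pipelines(sub_aggs(body), current_type, current_path))
    return problems


# The failing layout from the scenario: bucket_script nested under reverse_nested.
script = {"bucket_script": {
    "buckets_path": {"userCount": "agg_cardinality_tel", "docCount": "agg_count"},
    "script": "params.docCount / params.userCount",
}}
broken = {"reverse_nested": {"reverse_nested": {}, "aggregations": {
    "agg_count": {"value_count": {"field": "taskId.keyword"}},
    "agg_cardinality_tel": {"cardinality": {"field": "tel.keyword"}},
    "agg_bucketScript": script,
}}}
print(find_misplaced_pipelines(broken))  # ['reverse_nested>agg_bucketScript']
```

The check only flags parents it knows to be single-bucket, so an empty result does not guarantee that Elasticsearch will accept the query.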

Solution

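Place the bucket_script aggregation at the same level as the reverse_nested aggregation and point buckets_path at the metrics inside it with the reverse_nested>metric path syntax. This assumes that the enclosing parent aggregation (not shown) is itself a multi-bucket aggregation.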

{
  "reverse_nested": {
    "reverse_nested": {},
    "aggregations": {
      "agg_count": {
        "value_count": {
          "field": "taskId.keyword"
        }
      },
      "agg_cardinality_tel": {
        "cardinality": {
          "field": "tel.keyword"
        }
      }
    }
  },
  "agg_bucketScript": {
    "bucket_script": {
      "buckets_path": {
        "userCount": "reverse_nested>agg_cardinality_tel",
        "docCount": "reverse_nested>agg_count"
      },
      "script": {
        "source": "params.docCount / params.userCount",
        "lang": "painless"
      },
      "gap_policy": "skip"
    }
  }
}
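
Since the post leaves out the parent aggregations, here is a sketch of how the whole request body might fit together. The nested path `calls`, the grouping field `calls.agentId.keyword`, and the aggregation names `agg_nested` and `agg_group` are assumptions made for illustration only; the point is that the fragment above ends up directly under a multi-bucket terms aggregation, so bucket_script runs once per terms bucket.

```python
import json


def build_request(nested_path, group_field):
    """Build a search body whose terms buckets each get a per-user call average."""
    # Sub-aggregations from the solution: reverse_nested and bucket_script are siblings.
    per_bucket_aggs = {
        "reverse_nested": {
            "reverse_nested": {},
            "aggregations": {
                "agg_count": {"value_count": {"field": "taskId.keyword"}},
                "agg_cardinality_tel": {"cardinality": {"field": "tel.keyword"}},
            },
        },
        "agg_bucketScript": {
            "bucket_script": {
                "buckets_path": {
                    "userCount": "reverse_nested>agg_cardinality_tel",
                    "docCount": "reverse_nested>agg_count",
                },
                "script": {"source": "params.docCount / params.userCount", "lang": "painless"},
                "gap_policy": "skip",
            }
        },
    }
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "aggregations": {
            "agg_nested": {  # hypothetical name and nested path
                "nested": {"path": nested_path},
                "aggregations": {
                    "agg_group": {  # hypothetical multi-bucket parent
                        "terms": {"field": group_field},
                        "aggregations": per_bucket_aggs,
                    }
                },
            }
        },
    }


body = build_request("calls", "calls.agentId.keyword")
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the body of a search request. Substitute the nested path and grouping field from your own mapping.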