[opt](nereids) support partitionTopn for multi window exprs#38393
englefly merged 6 commits into apache:master
run buildall |
TPC-H: Total hot run time: 39375 ms |
TPC-DS: Total hot run time: 172696 ms |
ClickBench: Total hot run time: 30.74 s |
|
TPC-H: Total hot run time: 39737 ms |
TPC-DS: Total hot run time: 172790 ms |
ClickBench: Total hot run time: 31.38 s |
|
TPC-H: Total hot run time: 39385 ms |
TPC-DS: Total hot run time: 173456 ms |
ClickBench: Total hot run time: 30.41 s |
|
Please add a SQL example to the "Proposed changes" section. |
Done |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
| * pushPartitionLimitThroughWindow is used to push the partitionLimit through the window | ||
| * and generate the partitionTopN. If the window can not meet the requirement, | ||
| * it will return null. So when we use this function, we need check the null in the outside. | ||
| * check and get valid window function and partition limit value |
Add a comment that explains the input parameters and the return value.
| long chosenRowNumberPartitionLimit = Long.MAX_VALUE; | ||
| boolean hasRowNumber = false; | ||
| for (NamedExpression windowExpr : windowExpressions) { | ||
| WindowExpression windowFunc = (WindowExpression) windowExpr.child(0); |
| Set<Expression> conjuncts = filter.getConjuncts(); | ||
| Set<Expression> relatedConjuncts = extractRelatedConjuncts(conjuncts, windowExpr.getExprId()); | ||
| for (Expression conjunct : relatedConjuncts) { | ||
| Preconditions.checkArgument(conjunct instanceof BinaryOperator); |
Add an error message? Or log a warning and return null instead of throwing an exception?
Removed the precondition check, since the earlier logic already performs this validation.
| Preconditions.checkArgument(leftChild instanceof SlotReference | ||
| && rightChild instanceof IntegerLikeLiteral); |
Add an error message? Or log a warning and return null instead of throwing an exception?
Removed the precondition check, since the earlier logic already performs this validation.
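The exchange above weighs failing fast with `Preconditions.checkArgument` against quietly giving up on the optimization. A minimal sketch of the second option follows, under stated assumptions: `LimitFromConjunct` and `limitOf` are hypothetical names rather than Doris classes, the conjunct is reduced to an operator string plus an integer literal, and the mapping from operator to limit (`<=` and `=` keep the literal, `<` keeps the literal minus one) is assumed for illustration only.

```java
import java.util.OptionalLong;

// Hypothetical helper, not part of Doris: derives a per-partition limit from
// a simplified conjunct of the form "<window slot> <op> <integer literal>".
final class LimitFromConjunct {
    private LimitFromConjunct() {}

    static OptionalLong limitOf(String op, long literal) {
        long limit;
        switch (op) {
            case "<=":
            case "=":
                limit = literal;
                break;
            case "<":
                limit = literal - 1;
                break;
            default:
                // Unsupported shape: report "no limit" instead of throwing,
                // so the caller can simply skip the rewrite.
                return OptionalLong.empty();
        }
        // Assumed policy for this sketch: a non-positive limit is unusable.
        return limit > 0 ? OptionalLong.of(limit) : OptionalLong.empty();
    }
}
```

With this shape the caller checks for an empty result in one place, which plays the same role as the null check mentioned in the method comment.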
|
TPC-H: Total hot run time: 41944 ms |
TPC-DS: Total hot run time: 168646 ms |
ClickBench: Total hot run time: 30.61 s |
|
Support partitionTopn for multi window exprs. If row_number exists, choose row_number with the minimal limit value; if not, choose others with the minimal limit value.

Example:
```
mysql> explain shape plan select * from (select row_number() over(partition by c1, c2 order by c3) as rn, rank() over(partition by c1 order by c3) as rk from push_down_multi_predicate_through_window_t) t where rn <= 1 and rk <= 1;
+----------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                  |
+----------------------------------------------------------------------------------+
| PhysicalResultSink                                                               |
| --PhysicalProject                                                                |
| ----filter((rk <= 1) and (rn <= 1))                                              |
| ------PhysicalWindow                                                             |
| --------PhysicalQuickSort[LOCAL_SORT]                                            |
| ----------PhysicalDistribute[DistributionSpecHash]                               |
| ------------PhysicalWindow                                                       |
| --------------PhysicalQuickSort[LOCAL_SORT]                                      |
| ----------------PhysicalDistribute[DistributionSpecHash]                         |
| ------------------PhysicalPartitionTopN                                          |
| --------------------PhysicalOlapScan[push_down_multi_predicate_through_window_t] |
+----------------------------------------------------------------------------------+
```

---------

Co-authored-by: zhongjian.xzj <zhongjian.xzj@zhongjianxzjdeMacBook-Pro.local>
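The selection rule stated in this commit message (prefer row_number, then take the smallest limit) can be sketched in plain Java. This is only an illustration under assumptions: `PartitionLimitChooser`, `Candidate`, and `choose` are hypothetical names, window functions are represented by their names as strings, and the list is assumed to contain only window expressions that already have a usable limit from the filter.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch, not the Doris implementation.
final class PartitionLimitChooser {
    private PartitionLimitChooser() {}

    /** Stand-in for a window function together with the limit derived from its filter. */
    static final class Candidate {
        final String function; // e.g. "row_number", "rank"
        final long limit;      // e.g. 1 for "rn <= 1"

        Candidate(String function, long limit) {
            this.function = function;
            this.limit = limit;
        }
    }

    /** Picks the row_number candidate with the smallest limit, else the smallest limit overall. */
    static Optional<Candidate> choose(List<Candidate> candidates) {
        Comparator<Candidate> byLimit = Comparator.comparingLong(c -> c.limit);
        Optional<Candidate> bestRowNumber = candidates.stream()
                .filter(c -> c.function.equals("row_number"))
                .min(byLimit);
        if (bestRowNumber.isPresent()) {
            return bestRowNumber;
        }
        // No row_number candidate: fall back to the minimal limit among the others.
        return candidates.stream().min(byLimit);
    }
}
```

In this sketch a `rank` candidate with limit 1 still loses to a `row_number` candidate with limit 3, because row_number is considered first.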
Introduced by #38393. Fixes the case where the window contains both row_number and other window function types, but only the other types have a filter that can be pushed down.
…e#39233) Introduced by apache#38393. Fixes the case where the window contains both row_number and other window function types, but only the other types have a filter that can be pushed down.
## Proposed changes

pick from #38393

Co-authored-by: xiongzhongjian <xiongzhongjian@selectdb.com>
…forbidden type (#44617)

Related PR: #38393

Problem Summary: The previous PR, which added pTopN pushdown for multiple window expressions, handled partially forbidden types incorrectly, so in some cases the pTopN was pushed down wrongly.

plan before fixing:

explain shape plan select * from (select row_number() over(partition by c1, c2 order by c3) as rn, sum(c2) over(order by c2 range between unbounded preceding and unbounded following) as sw from push_down_multi_predicate_through_window_t) t where rn <= 1 and sw <= 1;
+------------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                    |
+------------------------------------------------------------------------------------+
| PhysicalResultSink                                                                 |
| --PhysicalProject                                                                  |
| ----filter((rn <= 1) and (sw <= 1))                                                |
| ------PhysicalWindow                                                               |
| --------PhysicalQuickSort[MERGE_SORT]                                              |
| ----------PhysicalDistribute[DistributionSpecGather]                               |
| ------------PhysicalQuickSort[LOCAL_SORT]                                          |
| --------------PhysicalWindow                                                       |
| ----------------PhysicalQuickSort[LOCAL_SORT]                                      |
| ------------------PhysicalDistribute[DistributionSpecHash]                         |
| --------------------PhysicalPartitionTopN                                          |
| ----------------------PhysicalOlapScan[push_down_multi_predicate_through_window_t] |
+------------------------------------------------------------------------------------+

plan after fixing:

explain shape plan select * from (select row_number() over(partition by c1, c2 order by c3) as rn, sum(c2) over(order by c2 range between unbounded preceding and unbounded following) as sw from push_down_multi_predicate_through_window_t) t where rn <= 1 and sw <= 1;
+----------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                  |
+----------------------------------------------------------------------------------+
| PhysicalResultSink                                                               |
| --PhysicalProject                                                                |
| ----filter((rn <= 1) and (sw <= 1))                                              |
| ------PhysicalWindow                                                             |
| --------PhysicalQuickSort[MERGE_SORT]                                            |
| ----------PhysicalDistribute[DistributionSpecGather]                             |
| ------------PhysicalQuickSort[LOCAL_SORT]                                        |
| --------------PhysicalWindow                                                     |
| ----------------PhysicalQuickSort[LOCAL_SORT]                                    |
| ------------------PhysicalDistribute[DistributionSpecHash]                       |
| --------------------PhysicalOlapScan[push_down_multi_predicate_through_window_t] |
+----------------------------------------------------------------------------------+
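Judging only from the two plans in this commit message, the fix stops generating PhysicalPartitionTopN once any window expression is of a forbidden type; here that is a `sum` over the whole input, whose result would change if rows were trimmed per partition first. A minimal sketch of such an all-or-nothing guard is given below. Everything in it is an assumption for illustration: `PartitionTopnGuard` and `canPushDown` are hypothetical names, functions are represented as strings, and the supported set (`row_number`, `rank`, `dense_rank`) is assumed rather than taken from this page.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the Doris implementation.
final class PartitionTopnGuard {
    // Assumed set of window functions that are eligible for partition top-N.
    private static final Set<String> SUPPORTED = Set.of("row_number", "rank", "dense_rank");

    private PartitionTopnGuard() {}

    /** Allows the pushdown only if every window function in the node is supported. */
    static boolean canPushDown(List<String> windowFunctions) {
        return !windowFunctions.isEmpty()
                && windowFunctions.stream().allMatch(SUPPORTED::contains);
    }
}
```

Under this guard the query above, which mixes `row_number` with `sum`, would keep its plain scan, matching the plan after fixing.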
Support partitionTopn for multi window exprs.
If row_number exists, choose row_number with the minimal limit value; if not, choose others with the minimal limit value.
Example: