[fix](mtmv) Fix data wrong if base table add new partition when query rewrite by partition rolled up mv #36414
run buildall |
TPC-H: Total hot run time: 39926 ms |
TPC-DS: Total hot run time: 171766 ms |
ClickBench: Total hot run time: 30 s |
|
|
||
| def mv_name = "mv_10086" | ||
| sql """DROP MATERIALIZED VIEW IF EXISTS ${mv_name}""" | ||
| sql """DROP TABLE IF EXISTS ${mv_name}""" |
I think this is needed, since MV and table names share the same namespace.
| l_suppkey; | ||
| """ | ||
|
|
||
| def roll_up_all_partition_sql = """ |
What is the difference between roll_up_mv_def_sql and roll_up_all_partition_sql?
One is used by the MV definition, the other by the query; the SQL is the same.
| """ | ||
|
|
||
| sql """DROP MATERIALIZED VIEW IF EXISTS ${mv_name}""" | ||
| sql """DROP TABLE IF EXISTS ${mv_name}""" |
|
|
||
| explain { | ||
| sql("${roll_up_all_partition_sql}") | ||
| contains("${mv_name}(${mv_name})") |
Because the keyword in the explain plan is mv_name(mv_name) when the materialized view is used.
|
PR approved by anyone and no changes requested. |
|
PR approved by at least one committer and no changes requested. |
… rewrite by partition rolled up mv (#36414)
This was introduced by #35562.
When the MV is a partition rolled-up MV (rolled up by date_trunc) and the base table adds a new partition, a query that is successfully rewritten by the partitioned MV loses the new partition's data. This PR fixes the problem.
For example, the MV definition is:
CREATE MATERIALIZED VIEW roll_up_mv
BUILD IMMEDIATE REFRESH AUTO ON MANUAL
partition by (date_trunc(`col1`, 'month'))
DISTRIBUTED BY RANDOM BUCKETS 2
PROPERTIES ('replication_num' = '1')
AS
select date_trunc(`l_shipdate`, 'day') as col1, l_shipdate, o_orderdate, l_partkey, l_suppkey, sum(o_totalprice) as sum_total
from lineitem
left join orders on lineitem.l_orderkey = orders.o_orderkey and l_shipdate = o_orderdate
group by col1, l_shipdate, o_orderdate, l_partkey, l_suppkey;
If the following insert command is run:
insert into lineitem values (1, 2, 3, 4, 5.5, 6.5, 7.5, 8.5, 'o', 'k', '2023-11-21', '2023-11-21', '2023-11-21', 'a', 'b', 'yyyyyyyyy');
then the following query will not return the 2023-11-21 partition data:
select date_trunc(`l_shipdate`, 'day') as col1, l_shipdate, o_orderdate, l_partkey, l_suppkey, sum(o_totalprice) as sum_total
from lineitem
left join orders on lineitem.l_orderkey = orders.o_orderkey and l_shipdate = o_orderdate
group by col1, l_shipdate, o_orderdate, l_partkey, l_suppkey;
Proposed changes
This was introduced by #35562.
When the MV is a partition rolled-up MV (rolled up by date_trunc) and the base table adds a new partition,
a query that is successfully rewritten by the partitioned MV loses the new partition's data. This PR fixes the problem.
For example:
the MV definition is:
if the insert command is run,
then the following query will not return the 2023-11-21 partition data
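As a rough illustration of why a month-level roll-up can hide a newly added day partition, the following Python sketch models the situation. It is not Doris code and does not describe how the fix is implemented: `month_trunc`, `group_by_mv_partition`, `stale_mv_partitions`, and every partition date other than 2023-11-21 are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import date


def month_trunc(d: date) -> date:
    """Model date_trunc(d, 'month'): the first day of d's month."""
    return d.replace(day=1)


def group_by_mv_partition(base_days):
    """Map each month-level MV partition to the day-level base partitions it rolls up."""
    groups = defaultdict(set)
    for day in base_days:
        groups[month_trunc(day)].add(day)
    return dict(groups)


def stale_mv_partitions(snapshot_days, current_days):
    """Return MV partitions whose set of base partitions changed since the refresh."""
    at_refresh = group_by_mv_partition(snapshot_days)
    now = group_by_mv_partition(current_days)
    return {month for month, days in now.items() if at_refresh.get(month) != days}


# Hypothetical day partitions present when the MV was last refreshed.
snapshot = {date(2023, 10, 17), date(2023, 11, 19), date(2023, 11, 20)}
# The insert from the example adds a new base partition afterwards.
current = snapshot | {date(2023, 11, 21)}

stale = stale_mv_partitions(snapshot, current)
```

In this model the November MV partition already exists, so a check that only asks whether the month partition is present would treat it as usable and the 2023-11-21 rows would be missed. Comparing the set of base partitions behind each MV partition flags November as stale while leaving October untouched.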