[fix](cloud) compaction and schema change potential data race when retrying prepare rowset#51048

Merged
gavinchou merged 8 commits into apache:master from luwei16:luwei/fix-compact-prepare
Jun 12, 2025

@luwei16
Contributor

@luwei16 luwei16 commented May 19, 2025

related PR #51129

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@luwei16
Contributor Author

luwei16 commented May 19, 2025

run buildall

@gavinchou gavinchou changed the title from "fix cloud compaction and schema change potential data loss when retrying prepare rowset" to "[fix](cloud) compaction and schema change potential data race when retrying prepare rowset" on May 19, 2025
Comment thread gensrc/proto/cloud.proto
optional doris.RowsetMetaCloudPB rowset_meta = 2;
optional bool temporary = 3;
optional int64 txn_id = 4;
optional string tablet_job_id = 5;
Contributor

Commit rowset also needs this parameter.

Contributor Author

done

@doris-robot

Cloud UT Coverage Report

Increment line coverage 2.50% (1/40) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.31% (1113/1336)
Line Coverage 66.13% (18667/28227)
Region Coverage 65.73% (9267/14098)
Branch Coverage 55.51% (4986/8982)

@doris-robot

TPC-H: Total hot run time: 34071 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 06f3b1be23de8e453e17191a22c54d67fcaf6c51, data reload: false

------ Round 1 ----------------------------------
q1	26219	5099	5010	5010
q2	2087	287	184	184
q3	10382	1265	715	715
q4	10238	1009	550	550
q5	7561	2452	2362	2362
q6	188	166	135	135
q7	972	782	614	614
q8	9318	1309	1166	1166
q9	6803	5114	5100	5100
q10	6820	2318	1879	1879
q11	493	290	279	279
q12	366	353	216	216
q13	17802	3681	3084	3084
q14	232	227	215	215
q15	534	490	490	490
q16	415	447	369	369
q17	635	891	385	385
q18	7656	7222	7137	7137
q19	1097	956	579	579
q20	344	351	232	232
q21	4028	3165	2382	2382
q22	1091	1026	988	988
Total cold run time: 115281 ms
Total hot run time: 34071 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5122	5140	5153	5140
q2	240	324	228	228
q3	2157	2726	2282	2282
q4	1322	1793	1356	1356
q5	4529	4458	4418	4418
q6	220	167	124	124
q7	1949	1896	1754	1754
q8	2652	2439	2442	2439
q9	7253	7146	7089	7089
q10	3036	3166	2774	2774
q11	588	542	486	486
q12	689	772	624	624
q13	3485	3884	3265	3265
q14	283	304	268	268
q15	518	478	469	469
q16	444	493	450	450
q17	1177	1570	1435	1435
q18	7732	7559	7433	7433
q19	827	844	918	844
q20	1999	2018	1832	1832
q21	4952	4462	4356	4356
q22	1119	1075	1022	1022
Total cold run time: 52293 ms
Total hot run time: 50088 ms

@doris-robot

TPC-DS: Total hot run time: 185549 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 06f3b1be23de8e453e17191a22c54d67fcaf6c51, data reload: false

query1	1011	481	493	481
query2	6554	1827	1828	1827
query3	6762	218	226	218
query4	26025	23224	23387	23224
query5	4300	602	469	469
query6	305	202	220	202
query7	4619	495	288	288
query8	291	263	241	241
query9	8643	2615	2604	2604
query10	469	316	264	264
query11	15823	15076	14726	14726
query12	162	108	101	101
query13	1647	534	400	400
query14	9258	6212	6276	6212
query15	216	192	170	170
query16	7211	635	487	487
query17	1214	746	598	598
query18	1985	413	312	312
query19	195	195	164	164
query20	122	126	120	120
query21	217	128	110	110
query22	4136	4249	3927	3927
query23	34069	33102	33122	33102
query24	8441	2388	2407	2388
query25	572	484	424	424
query26	1248	272	163	163
query27	2748	517	355	355
query28	4357	2103	2094	2094
query29	776	555	436	436
query30	281	211	189	189
query31	958	844	753	753
query32	71	64	63	63
query33	577	356	333	333
query34	815	837	521	521
query35	782	887	745	745
query36	948	1005	909	909
query37	118	106	80	80
query38	4068	4059	3975	3975
query39	1505	1440	1397	1397
query40	210	117	106	106
query41	59	58	54	54
query42	122	111	111	111
query43	515	505	469	469
query44	1333	832	833	832
query45	177	174	170	170
query46	852	1031	634	634
query47	1735	1786	1672	1672
query48	389	424	312	312
query49	781	510	421	421
query50	645	719	420	420
query51	4182	4196	4119	4119
query52	106	108	97	97
query53	227	267	191	191
query54	597	575	502	502
query55	85	82	80	80
query56	319	302	324	302
query57	1117	1129	1065	1065
query58	267	259	270	259
query59	2565	2696	2554	2554
query60	327	313	298	298
query61	134	130	126	126
query62	796	707	641	641
query63	239	222	194	194
query64	4372	1004	666	666
query65	4282	4270	4200	4200
query66	1158	419	308	308
query67	15936	15786	15459	15459
query68	8376	884	522	522
query69	463	305	267	267
query70	1169	1078	1103	1078
query71	469	322	307	307
query72	5531	4689	4634	4634
query73	713	591	352	352
query74	9176	8984	8559	8559
query75	3861	3176	2661	2661
query76	3642	1178	767	767
query77	783	379	293	293
query78	9984	10170	9439	9439
query79	2637	806	588	588
query80	638	540	501	501
query81	462	248	222	222
query82	456	128	96	96
query83	284	265	245	245
query84	299	115	92	92
query85	791	351	316	316
query86	356	310	290	290
query87	4376	4399	4311	4311
query88	3214	2356	2368	2356
query89	407	309	283	283
query90	1939	225	205	205
query91	139	142	121	121
query92	75	62	56	56
query93	1190	932	591	591
query94	668	420	310	310
query95	372	295	286	286
query96	505	589	292	292
query97	2723	2763	2657	2657
query98	239	218	199	199
query99	1423	1374	1319	1319
Total cold run time: 274386 ms
Total hot run time: 185549 ms

@doris-robot

ClickBench: Total hot run time: 28.65 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 06f3b1be23de8e453e17191a22c54d67fcaf6c51, data reload: false

query1	0.04	0.04	0.03
query2	0.12	0.10	0.12
query3	0.25	0.20	0.19
query4	1.58	0.20	0.20
query5	0.42	0.42	0.45
query6	1.14	0.64	0.66
query7	0.03	0.02	0.01
query8	0.05	0.03	0.04
query9	0.57	0.52	0.50
query10	0.56	0.58	0.57
query11	0.16	0.11	0.11
query12	0.15	0.12	0.12
query13	0.61	0.60	0.60
query14	0.79	0.80	0.81
query15	0.87	0.86	0.85
query16	0.38	0.38	0.37
query17	1.03	1.03	1.00
query18	0.22	0.22	0.21
query19	1.89	1.80	1.84
query20	0.01	0.02	0.01
query21	15.41	0.87	0.54
query22	0.76	1.28	0.66
query23	14.83	1.40	0.63
query24	6.81	1.91	0.37
query25	0.42	0.30	0.13
query26	0.61	0.16	0.13
query27	0.05	0.05	0.05
query28	9.75	0.86	0.44
query29	12.59	3.94	3.29
query30	0.27	0.09	0.06
query31	2.82	0.60	0.39
query32	3.23	0.55	0.49
query33	3.03	3.11	3.06
query34	15.84	5.08	4.54
query35	4.53	4.55	4.50
query36	0.66	0.49	0.49
query37	0.09	0.06	0.07
query38	0.06	0.04	0.04
query39	0.03	0.03	0.02
query40	0.17	0.15	0.13
query41	0.08	0.02	0.02
query42	0.04	0.02	0.02
query43	0.03	0.04	0.03
Total cold run time: 102.98 s
Total hot run time: 28.65 s

@doris-robot

BE UT Coverage Report

Increment line coverage 0.00% (0/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 55.94% (14922/26674)
Line Coverage 44.75% (132363/295754)
Region Coverage 43.85% (66591/151872)
Branch Coverage 38.44% (34122/88774)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 0.00% (0/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.35% (20837/26258)
Line Coverage 72.61% (214732/295751)
Region Coverage 70.79% (126301/178424)
Branch Coverage 64.54% (65450/101408)

@luwei16
Contributor Author

luwei16 commented May 26, 2025

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 96.23% (51/53) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.26% (1114/1338)
Line Coverage 66.25% (18733/28277)
Region Coverage 65.86% (9296/14114)
Branch Coverage 55.75% (5014/8994)

@doris-robot

TPC-H: Total hot run time: 34007 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit bad63c07bb0feeda4b123ae59d1d6f8de4415ca4, data reload: false

------ Round 1 ----------------------------------
q1	26342	6033	4989	4989
q2	2103	284	190	190
q3	10370	1245	716	716
q4	10231	1003	524	524
q5	7541	2393	2406	2393
q6	182	160	130	130
q7	915	725	635	635
q8	9344	1304	1152	1152
q9	6905	5133	5150	5133
q10	6818	2348	1899	1899
q11	504	299	280	280
q12	350	353	209	209
q13	17751	3685	3110	3110
q14	252	242	218	218
q15	534	488	483	483
q16	441	434	379	379
q17	622	888	369	369
q18	7594	7287	7041	7041
q19	1211	964	548	548
q20	343	341	242	242
q21	4251	3264	2436	2436
q22	1047	1036	931	931
Total cold run time: 115651 ms
Total hot run time: 34007 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5120	5074	5083	5074
q2	244	335	238	238
q3	2213	2694	2316	2316
q4	1348	1786	1529	1529
q5	4517	4472	4365	4365
q6	215	165	125	125
q7	1983	1928	1790	1790
q8	2603	2509	2507	2507
q9	7143	7157	7171	7157
q10	3031	3188	2763	2763
q11	577	513	501	501
q12	700	764	579	579
q13	3506	3896	3284	3284
q14	294	316	271	271
q15	513	472	480	472
q16	446	500	463	463
q17	1193	1513	1411	1411
q18	7692	7740	7429	7429
q19	827	835	922	835
q20	1991	2038	1920	1920
q21	5107	4304	4371	4304
q22	1080	1032	998	998
Total cold run time: 52343 ms
Total hot run time: 50331 ms

@doris-robot

TPC-DS: Total hot run time: 185858 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit bad63c07bb0feeda4b123ae59d1d6f8de4415ca4, data reload: false

query1	1018	497	519	497
query2	6568	1858	1861	1858
query3	6747	218	220	218
query4	26182	23822	23094	23094
query5	4367	635	465	465
query6	295	208	200	200
query7	4633	503	293	293
query8	298	240	238	238
query9	8594	2618	2625	2618
query10	485	350	280	280
query11	15651	15100	14855	14855
query12	166	108	114	108
query13	1696	554	418	418
query14	8680	6104	6361	6104
query15	201	198	174	174
query16	7126	646	468	468
query17	1184	726	595	595
query18	1960	390	292	292
query19	195	189	151	151
query20	116	114	121	114
query21	238	125	108	108
query22	4054	4061	4002	4002
query23	34171	33208	33092	33092
query24	8374	2397	2410	2397
query25	536	457	388	388
query26	1240	273	155	155
query27	2756	502	338	338
query28	4341	2103	2084	2084
query29	765	587	455	455
query30	295	219	195	195
query31	911	880	782	782
query32	79	71	64	64
query33	572	375	339	339
query34	811	855	523	523
query35	792	817	735	735
query36	946	1003	919	919
query37	114	104	83	83
query38	4113	4039	4045	4039
query39	1499	1470	1424	1424
query40	227	131	112	112
query41	65	58	63	58
query42	126	116	115	115
query43	519	494	489	489
query44	1359	817	822	817
query45	183	175	174	174
query46	868	1031	646	646
query47	1757	1791	1761	1761
query48	401	463	342	342
query49	810	568	423	423
query50	678	685	414	414
query51	4161	4124	4053	4053
query52	118	113	102	102
query53	226	255	195	195
query54	570	584	490	490
query55	86	88	83	83
query56	304	317	304	304
query57	1157	1161	1064	1064
query58	274	259	258	258
query59	2638	2621	2655	2621
query60	337	313	309	309
query61	141	119	126	119
query62	796	724	662	662
query63	228	196	189	189
query64	4325	996	676	676
query65	4334	4277	4248	4248
query66	1145	409	303	303
query67	16037	15558	15441	15441
query68	8726	894	534	534
query69	478	312	268	268
query70	1211	1082	1109	1082
query71	444	332	299	299
query72	5426	4762	4893	4762
query73	722	628	359	359
query74	8855	9051	8757	8757
query75	4076	3219	2661	2661
query76	3617	1187	764	764
query77	815	376	291	291
query78	10072	10142	9295	9295
query79	2140	783	575	575
query80	624	526	440	440
query81	484	253	220	220
query82	477	127	105	105
query83	290	256	232	232
query84	297	112	90	90
query85	794	350	315	315
query86	395	305	302	302
query87	4423	4480	4343	4343
query88	3414	2278	2290	2278
query89	392	336	287	287
query90	1827	215	208	208
query91	148	150	125	125
query92	73	67	57	57
query93	1663	973	587	587
query94	667	411	303	303
query95	387	296	290	290
query96	504	576	282	282
query97	2719	2760	2621	2621
query98	244	207	219	207
query99	1424	1432	1289	1289
Total cold run time: 274467 ms
Total hot run time: 185858 ms

@doris-robot

ClickBench: Total hot run time: 28.94 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit bad63c07bb0feeda4b123ae59d1d6f8de4415ca4, data reload: false

query1	0.03	0.04	0.03
query2	0.12	0.11	0.11
query3	0.26	0.19	0.20
query4	1.60	0.19	0.10
query5	0.44	0.42	0.41
query6	1.15	0.68	0.66
query7	0.02	0.01	0.02
query8	0.04	0.04	0.04
query9	0.58	0.51	0.51
query10	0.56	0.59	0.56
query11	0.15	0.11	0.11
query12	0.15	0.11	0.12
query13	0.62	0.59	0.59
query14	0.79	0.81	0.81
query15	0.89	0.86	0.85
query16	0.38	0.38	0.40
query17	1.04	1.02	1.03
query18	0.23	0.20	0.21
query19	1.88	1.82	1.83
query20	0.02	0.01	0.02
query21	15.42	0.91	0.54
query22	0.76	1.19	0.66
query23	14.95	1.37	0.63
query24	6.97	2.10	0.81
query25	0.52	0.21	0.09
query26	0.53	0.16	0.14
query27	0.05	0.05	0.05
query28	10.26	0.87	0.47
query29	12.54	3.97	3.26
query30	0.25	0.10	0.07
query31	2.82	0.62	0.40
query32	3.23	0.54	0.48
query33	3.03	3.05	3.13
query34	15.81	5.06	4.48
query35	4.50	4.52	4.49
query36	0.67	0.49	0.49
query37	0.08	0.06	0.07
query38	0.05	0.03	0.04
query39	0.03	0.02	0.03
query40	0.18	0.13	0.12
query41	0.08	0.03	0.03
query42	0.03	0.02	0.02
query43	0.04	0.03	0.04
Total cold run time: 103.75 s
Total hot run time: 28.94 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/10) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 56.05% (14935/26644)
Line Coverage 44.85% (132935/296428)
Region Coverage 43.92% (66813/152140)
Branch Coverage 38.54% (34281/88956)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 0.00% (0/10) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.49% (20848/26227)
Line Coverage 72.66% (215367/296415)
Region Coverage 70.87% (126601/178647)
Branch Coverage 64.62% (65640/101582)

@luwei16
Contributor Author

luwei16 commented Jun 4, 2025

run buildall

@luwei16
Contributor Author

luwei16 commented Jun 4, 2025

run buildall

@luwei16
Contributor Author

luwei16 commented Jun 4, 2025

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 96.36% (53/55) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.33% (1120/1344)
Line Coverage 66.87% (19249/28787)
Region Coverage 66.50% (9537/14341)
Branch Coverage 56.52% (5182/9168)

@doris-robot

TPC-H: Total hot run time: 33667 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 9344a50f17b2616229f288d96e00fb60b4c6f50d, data reload: false

------ Round 1 ----------------------------------
q1	26123	5150	5002	5002
q2	1935	271	172	172
q3	10311	1266	712	712
q4	10232	1034	512	512
q5	7544	2454	2380	2380
q6	177	165	138	138
q7	888	732	623	623
q8	9306	1286	1082	1082
q9	6814	5080	5147	5080
q10	6902	2318	1869	1869
q11	500	290	273	273
q12	337	347	215	215
q13	17773	3726	3096	3096
q14	228	231	211	211
q15	572	492	489	489
q16	426	437	379	379
q17	588	877	367	367
q18	7577	7311	6991	6991
q19	2083	962	569	569
q20	324	341	228	228
q21	3730	3216	2286	2286
q22	1086	1011	993	993
Total cold run time: 115456 ms
Total hot run time: 33667 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5228	5117	5145	5117
q2	247	326	222	222
q3	2194	2711	2289	2289
q4	1333	1804	1324	1324
q5	4567	4444	4369	4369
q6	249	173	125	125
q7	2007	1935	1752	1752
q8	2596	2551	2501	2501
q9	7206	7190	7238	7190
q10	3029	3229	2753	2753
q11	567	493	503	493
q12	711	773	635	635
q13	3479	3931	3224	3224
q14	290	288	278	278
q15	524	481	481	481
q16	448	483	434	434
q17	1146	1589	1402	1402
q18	7758	7513	7419	7419
q19	772	779	787	779
q20	2010	2068	1933	1933
q21	4972	4391	4214	4214
q22	1091	1022	991	991
Total cold run time: 52424 ms
Total hot run time: 49925 ms

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jun 12, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

Contributor

@sollhui sollhui left a comment

LGTM

@gavinchou gavinchou merged commit b31648a into apache:master Jun 12, 2025
27 of 30 checks passed
luwei16 added a commit to luwei16/Doris that referenced this pull request Jun 18, 2025
dataroaring pushed a commit that referenced this pull request Jun 18, 2025
dataroaring pushed a commit that referenced this pull request Jun 18, 2025
luwei16 added a commit to luwei16/Doris that referenced this pull request Jun 20, 2025
morrySnow pushed a commit that referenced this pull request Jun 23, 2025
…race when retrying prepare rowset #51048 (#52075)

related PR #51129

pick master #51048
Hastyshell pushed a commit to Hastyshell/doris that referenced this pull request Jul 21, 2025
Hastyshell pushed a commit to Hastyshell/doris that referenced this pull request Jul 30, 2025
@gavinchou gavinchou mentioned this pull request Aug 15, 2025
Labels

approved (Indicates a PR has been approved by one committer.), cloud, dev/3.0.7-merged, dev/3.1.0-merged, p0_b, reviewed

7 participants