[fix](filecache) fix warm up cancel failure when BE is down#58035
dataroaring merged 3 commits into apache:master
Fixed an issue where the cancel flow would exit if a BE was offline, preventing subsequent BEs from receiving the clear_job RPC. The flow now skips failed BEs and continues sending RPCs to the others. Signed-off-by: zhengyu <zhangzhengyu@selectdb.com>
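The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `CancelSketch`, `BeClient`, and `clearJobOnAll` are hypothetical names, and the real implementation uses the FE's Thrift clients and client pool rather than these stand-ins.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Doris code base.
class CancelSketch {
    /** Stand-in for a per-BE RPC client; clearJob may throw if the BE is down. */
    interface BeClient {
        void clearJob(long jobId) throws Exception;
    }

    /**
     * Sends clear_job to every BE. A failure on one BE is recorded and skipped
     * so that the remaining BEs still receive the RPC.
     * Returns the ids of the BEs that could not be reached.
     */
    static List<Long> clearJobOnAll(Map<Long, BeClient> clients, long jobId) {
        List<Long> failed = new ArrayList<>();
        for (Map.Entry<Long, BeClient> e : clients.entrySet()) {
            try {
                e.getValue().clearJob(jobId);
            } catch (Exception ex) {
                // Do not rethrow: exiting the loop here is the bug described above.
                failed.add(e.getKey());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<Long> reached = new ArrayList<>();
        Map<Long, BeClient> clients = new LinkedHashMap<>();
        clients.put(1L, jobId -> reached.add(1L));
        clients.put(2L, jobId -> { throw new java.io.IOException("BE 2 is down"); });
        clients.put(3L, jobId -> reached.add(3L));

        List<Long> failed = clearJobOnAll(clients, 42L);
        System.out.println("reached=" + reached + " failed=" + failed);
    }
}
```

In this sketch BE 3 still receives the call even though BE 2 throws. Before the fix, the equivalent loop would have stopped at BE 2.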
run buildall
TPC-H: Total hot run time: 34254 ms
TPC-DS: Total hot run time: 187417 ms
ClickBench: Total hot run time: 27.66 s
FE Regression Coverage Report: Increment line coverage
    try {
        TNetworkAddress addr = beToAddr == null ? null : beToAddr.get(beId);
        if (addr != null) {
            ClientPool.backendPool.invalidateObject(addr, client);
We cannot invalidate the pooled client unless we check the exception to make sure it actually needs invalidation.
The pool is used locally, so invalidating one entry won't affect other RPCs or jobs. The invalidation is there to skip the failed BE and continue with the remaining ones.
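For illustration, the reviewer's suggestion (inspect the exception before invalidating) might look like the sketch below. It does not reflect what the PR merged. `SimplePool`, `RecordingPool`, and `release` are hypothetical stand-ins rather than the Doris `ClientPool` API, and treating only `IOException` as a reason to invalidate is an assumption made for the example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: SimplePool and release(...) are illustrative stand-ins,
// not the Doris ClientPool API.
class InvalidateSketch {
    /** Minimal pool contract with separate return and invalidate operations. */
    interface SimplePool<K, C> {
        void returnObject(K key, C client);
        void invalidateObject(K key, C client);
    }

    /**
     * Releases a client after an RPC attempt. Assumption: only I/O-level
     * failures mean the connection is unusable; any other outcome (including
     * success, signalled by a null error) returns the client for reuse.
     */
    static <K, C> void release(SimplePool<K, C> pool, K key, C client, Exception error) {
        if (error instanceof IOException) {
            pool.invalidateObject(key, client);
        } else {
            pool.returnObject(key, client);
        }
    }

    /** Records calls so the behavior can be observed without a real pool. */
    static class RecordingPool implements SimplePool<String, String> {
        final List<String> log = new ArrayList<>();
        public void returnObject(String key, String client) { log.add("return:" + key); }
        public void invalidateObject(String key, String client) { log.add("invalidate:" + key); }
    }

    public static void main(String[] args) {
        RecordingPool pool = new RecordingPool();
        release(pool, "be-1", "client-1", null);
        release(pool, "be-2", "client-2", new IOException("connection refused"));
        release(pool, "be-3", "client-3", new IllegalStateException("bad response"));
        System.out.println(pool.log);
    }
}
```

The author's counterpoint in the thread is that, for a pool used only locally, invalidating unconditionally is harmless. The conditional form matters more when a pool is shared across jobs.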
Signed-off-by: zhengyu <zhangzhengyu@selectdb.com>
run buildall
run buildall
TPC-H: Total hot run time: 34679 ms
TPC-DS: Total hot run time: 181379 ms
ClickBench: Total hot run time: 27.36 s
FE Regression Coverage Report: Increment line coverage
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
Fixed an issue where the cancel flow would exit if a BE was offline, preventing subsequent BEs from receiving the clear_job RPC. The flow now skips failed BEs and continues sending RPCs to the others.
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)