
Antalya 25.8: Backport upstream fixes for parallel replicas + RIGHT/JOIN chains LOGICAL_ERROR #1724

Open
CarlosFelipeOR wants to merge 4 commits into antalya-25.8 from backport/antalya-25.8/right-joins-chain-fix


@CarlosFelipeOR
Collaborator

The fix was authored with assistance from AI model Claude Opus 4.7.

Cherry-picks three upstream PRs to fix the LOGICAL_ERROR "Expected JOIN table expression to be table, table function, query or union node", which crashes the server when parallel_replicas is enabled and a query has a chain of JOINs (LEFT/INNER...RIGHT, RIGHT...RIGHT, n-way with GLOBAL/FULL).

The crash is reproducible in Stress test (amd_tsan) on antalya-25.8 (e.g. run 25215170829 on sha 66579939, log_comment 03208_multiple_joins_with_storage_join.sql). The same crash also occurs in upstream/25.8 (~7.8% fail rate over 30 days). The fix landed on master in Feb 2026 but was never backported to any stable branch (25.3, 25.6, 25.7, 25.8).

Cherry-picked commits, in order:

ClickHouse#97316 had small textual conflicts in src/Planner/PlannerJoinTree.cpp (different variable names from the intermediate state landed by ClickHouse#87178) and in tests/parallel_replicas_blacklist.txt (Antalya-25.8 carries extra blacklist entries that upstream master no longer has). Both were resolved manually, keeping the logic from ClickHouse#97316 verbatim.

Related upstream issues: ClickHouse#74341, ClickHouse#81144, ClickHouse#84771, ClickHouse#63984.

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Backport upstream fixes for LOGICAL_ERROR in queries with parallel replicas and multiple JOINs (LEFT/INNER...RIGHT, RIGHT...RIGHT, n-way with GLOBAL/FULL). Such queries now correctly fall back to non-parallel execution instead of crashing.

Documentation entry for user-facing changes

...

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

Made with Cursor

devcrafter added 3 commits May 3, 2026 00:29
…-expression

PR: fix n-way join with GLOBAL join
(cherry picked from commit 77418de)
…non-merge-tree

PR: fix LEFT/INNER ... RIGHT ... JOINS chain
(cherry picked from commit d1ad996)
PR: fix RIGHT joins chain
(cherry picked from commit 27128fd)
@github-actions

github-actions Bot commented May 3, 2026

Workflow [PR], commit [8d40b8f]

The cherry-pick of ClickHouse#97316 onto antalya-25.8 had a manual conflict
resolution that lost two parts of the upstream version of
should_disable_parallel_replicas():

  * the n-way CROSS JOIN branch (sets is_cross_join = true and
    consumes it next to is_full_join / is_global_join)
  * the explicit fallback for RIGHT JOIN with a distributed/remote
    right-side table (consumes is_right_join_with_remote_table)

Without these reads the two variables were "set but not used", which
fast-test catches as -Werror,-Wunused-but-set-variable /
-Wunused-variable. Restore them verbatim from upstream/master.

Co-authored-by: Cursor <cursoragent@cursor.com>
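
For readers who do not have the upstream function at hand, the following is a minimal, self-contained sketch of the pattern the commit message describes: flags are set while scanning the JOIN chain and must then be read in the final decision, otherwise the compiler reports them as set but not used. This is a toy model, not the code from src/Planner/PlannerJoinTree.cpp. The `JoinKind` and `JoinStep` types, the `shouldDisableParallelReplicas` name, and the assumption that "n-way" means more than one JOIN in the chain are all hypothetical. Only the four flag names come from the commit message.

```cpp
#include <vector>

// Hypothetical, simplified types; the real planner works on query tree nodes.
enum class JoinKind { Inner, Left, Right, Full, Cross };

struct JoinStep
{
    JoinKind kind;
    bool is_global;          // GLOBAL JOIN
    bool right_side_remote;  // right-side table is distributed/remote
};

// Toy stand-in for should_disable_parallel_replicas(): returns true when the
// query should fall back to non-parallel execution.
bool shouldDisableParallelReplicas(const std::vector<JoinStep> & chain)
{
    bool is_full_join = false;
    bool is_global_join = false;
    bool is_cross_join = false;
    bool is_right_join_with_remote_table = false;

    // First pass: only set the flags.
    for (const auto & step : chain)
    {
        if (step.kind == JoinKind::Full)
            is_full_join = true;
        if (step.kind == JoinKind::Cross)
            is_cross_join = true;
        if (step.is_global)
            is_global_join = true;
        if (step.kind == JoinKind::Right && step.right_side_remote)
            is_right_join_with_remote_table = true;
    }

    // Assumption: "n-way" means more than one JOIN in the chain.
    const bool is_n_way = chain.size() > 1;

    // Every flag set above is read here. Dropping is_cross_join or
    // is_right_join_with_remote_table from these conditions would leave
    // the variable written but never read.
    if (is_n_way && (is_full_join || is_global_join || is_cross_join))
        return true;

    return is_right_join_with_remote_table;
}
```

In this sketch, removing the read of `is_cross_join` from the first condition, or replacing the final `return` with `return false;`, leaves that variable assigned but never read, which is the kind of code -Wunused-but-set-variable is meant to flag.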
