Named Scalars#1720

Open
filimonov wants to merge 1 commit into antalya-26.3 from named_scalars-antalya-26.3

@filimonov
Member

Changelog category (leave one):

  • Experimental Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Adds named scalars - server-side cached values that you define once and reuse across queries instead of recomputing or storing them in a one-row table. Use CREATE [LOCAL|SHARED] NAMED SCALAR <name> [REFRESH EVERY N {SECOND|MINUTE|HOUR|DAY}] AS SELECT ... to define a scalar, getNamedScalar('<name>') (or getNamedScalarOrDefault('<name>', default)) to read it, and SYSTEM REFRESH NAMED SCALAR <name> to force an out-of-schedule refresh. LOCAL scalars are cached per server; SHARED scalars are coordinated cluster-wide via Keeper so every replica sees the same value and exactly one replica refreshes per tick. Refresh bodies run under SQL SECURITY DEFINER, are visible in system.processes and system.query_log (is_internal = 1), and can be interrupted with KILL QUERY. Inspect state via system.named_scalars. Gated behind the experimental setting allow_experimental_named_scalars.
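
A minimal usage sketch of the statements named in the changelog entry. The table default.events and the scalar name latest_event_date are hypothetical, and enabling the gate with a session-level SET is an assumption based on it being described as a user setting.

```sql
-- Assumption: the experimental gate can be enabled per session.
SET allow_experimental_named_scalars = 1;

-- Hypothetical scalar: cache the newest event date on this server,
-- recomputed every 5 minutes.
CREATE LOCAL NAMED SCALAR latest_event_date
REFRESH EVERY 5 MINUTE
AS SELECT max(event_date) FROM default.events;

-- Read the cached value instead of recomputing the subquery.
SELECT count()
FROM default.events
WHERE event_date = getNamedScalar('latest_event_date');

-- Supply a fallback value as the second argument.
SELECT getNamedScalarOrDefault('latest_event_date', toDate('1970-01-01'));

-- Force a refresh outside the schedule.
SYSTEM REFRESH NAMED SCALAR latest_event_date;
```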

Documentation entry for user-facing changes

Add server-side named, refreshable scalar values backed by either a local on-disk cache or a shared Keeper-backed cache, accessed via getNamedScalar / getNamedScalarOrDefault and surfaced in system.named_scalars.

Surface:

  • DDL: CREATE [OR REPLACE] [LOCAL|SHARED] NAMED SCALAR [IF NOT EXISTS] <name> [ON CLUSTER ...] [DEFINER = ...] [SQL SECURITY DEFINER] [REFRESH EVERY <N> <unit>] AS <SELECT ...>; DROP NAMED SCALAR [IF EXISTS] <name> [ON CLUSTER ...].
  • Functions: getNamedScalar(name), getNamedScalarOrDefault(name, default).
  • SYSTEM commands: REFRESH NAMED SCALAR <name>; {STOP|START} NAMED SCALAR REFRESHES [<name>].
  • system.named_scalars table (two-tier: value tier via getNamedScalar grant, operator tier via SHOW_NAMED_SCALARS).
  • Profile-events / metrics: NamedScalarRefresh{Attempts,Successes,Failures,SkippedByPeer,DurationMicroseconds}; BackgroundNamedScalarRefreshPool{Task,Size}.
  • Server settings: background_named_scalar_refresh_pool_size, named_scalar_definitions_path, named_scalar_definitions_zookeeper_path, named_scalar_local_cache_path, default_named_scalar_cache, max_named_scalars, named_scalar_max_value_size.
  • User setting: allow_experimental_named_scalars (experimental gate).
  • Access: CREATE_NAMED_SCALAR, DROP_NAMED_SCALAR, SHOW_NAMED_SCALARS, SYSTEM_REFRESH_NAMED_SCALAR, SYSTEM_NAMED_SCALAR_REFRESHES, getNamedScalar (function-execute, with getNamedScalarOrDefault alias).
  • Error codes 766–771 (NAMED_SCALAR_NOT_FOUND, NAMED_SCALAR_ALREADY_EXISTS, SHARED_NAMED_SCALARS_NOT_CONFIGURED, NAMED_SCALAR_NOT_REFRESHABLE, NAMED_SCALAR_VALUE_TOO_LARGE, NAMED_SCALAR_HAS_NO_VALUE).
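
The surface above could be exercised roughly as follows. This is a sketch, not taken from the PR's tests: the cluster name my_cluster, the table default.fx_rates, and the scalar name fx_rate_usd_eur are hypothetical. A SHARED scalar presumably needs the Keeper path setting (named_scalar_definitions_zookeeper_path) configured first, judging by the SHARED_NAMED_SCALARS_NOT_CONFIGURED error code.

```sql
-- Hypothetical shared scalar, defined (or redefined) across a cluster.
CREATE OR REPLACE SHARED NAMED SCALAR fx_rate_usd_eur
ON CLUSTER my_cluster
REFRESH EVERY 1 HOUR
AS SELECT argMax(rate, updated_at)
   FROM default.fx_rates
   WHERE pair = 'USD/EUR';

-- Pause and resume scheduled refreshes for one scalar.
SYSTEM STOP NAMED SCALAR REFRESHES fx_rate_usd_eur;
SYSTEM START NAMED SCALAR REFRESHES fx_rate_usd_eur;

-- The name is optional in the synopsis; that omitting it targets
-- every scalar is an assumption.
SYSTEM STOP NAMED SCALAR REFRESHES;
SYSTEM START NAMED SCALAR REFRESHES;

-- Refresh counters listed under profile events
-- (system.events is the standard profile-event table).
SELECT event, value
FROM system.events
WHERE event LIKE 'NamedScalarRefresh%';

-- Remove the definition on every node of the cluster.
DROP NAMED SCALAR IF EXISTS fx_rate_usd_eur ON CLUSTER my_cluster;
```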

Architecture:

  • Definitions are immutable (UUID-identified parsed records). Local scalars persist their definition on disk; shared scalars publish to Keeper. The manager dispatches reads through tryGetScalar; refreshes run on a dedicated BackgroundSchedulePool thread per server.
  • Refresh bodies execute via executeQuery({.internal=true}) so they appear in system.processes (killable via KILL QUERY) and system.query_log; DROP / OR REPLACE / shutdown cancel the in-flight body via QueryStatus::cancelQuery.
  • DEFINER privileges are used during refresh; setUser(definer_id) applies the definer's profile (max_execution_time, max_memory_usage, etc.) so resource limits inherit the standard policy without a bespoke cap.
  • Shared scalars are coordinated by SharedNamedScalarsWatcher: a Keeper child-watch on the definitions root drives reconcile; ephemeral leases serialise refresh evaluation across replicas.
  • system.named_scalars column names align with system.view_refreshes (last_refresh_time, last_success_time, next_refresh_time, exception).
  • Two-tier disclosure: value-tier columns are non-Nullable / always populated for getNamedScalar grantees; operator-tier columns are NULL unless the caller holds SHOW_NAMED_SCALARS.
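
The observability points above suggest queries along these lines. This is a sketch under assumptions. The system.named_scalars column name is assumed; the four refresh columns come from the list above. The source does not say which tier each column belongs to, so some may come back NULL without SHOW_NAMED_SCALARS. The is_internal filter on system.query_log follows the changelog entry.

```sql
-- Refresh state per scalar; `name` is an assumed column.
SELECT
    name,
    last_refresh_time,
    last_success_time,
    next_refresh_time,
    exception
FROM system.named_scalars;

-- Refresh bodies that have run, as recorded in the query log.
SELECT event_time, query_id, type, query_duration_ms, exception, query
FROM system.query_log
WHERE is_internal = 1
ORDER BY event_time DESC
LIMIT 20;

-- Find a refresh body that is running right now...
SELECT query_id, elapsed, query
FROM system.processes;

-- ...and interrupt it, substituting the query_id found above.
KILL QUERY WHERE query_id = '<query_id>';
```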

Tests:

  • 17 stateless tests (03800–03816) covering CRUD, refresh, persistent cadence, OR REPLACE under refresh, definer database resolution, reload from disk, no-Keeper fallback, query_log/processes visibility, KILL QUERY interruption, two-tier access matrix, and orphan cleanup.
  • A 712-line integration suite (test_shared_named_scalars_cluster) covering cross-node discovery, shared refresh failover, ZK session loss, restart-during-refresh, OR REPLACE racing the watcher, and drop-while-discovery-in-flight.

Documentation:

  • docs/en/sql-reference/statements/create/named-scalar.md (full DDL, cache kinds, OR REPLACE, examples, when-to-use patterns, access).
  • docs/en/sql-reference/functions/named-scalar-functions.md (getNamedScalar, getNamedScalarOrDefault).
  • docs/en/operations/system-tables/named_scalars.md (column reference, operational signals query, refresh visibility & cancellation).
  • docs/en/sql-reference/statements/system.md (SYSTEM REFRESH/STOP/START NAMED SCALAR REFRESHES).

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

github-actions Bot commented May 1, 2026

Workflow [PR], commit [d03b265]

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
filimonov force-pushed the named_scalars-antalya-26.3 branch from 10f4b84 to d03b265 on May 1, 2026 22:53