[Feature](iceberg) Add manifest-level cache for Iceberg tables to reduce I/O and parsing overhead #59056
Conversation
Thank you for your contribution to Apache Doris. Please clearly describe your PR:

run buildall

TPC-H: Total hot run time: 36570 ms
TPC-DS: Total hot run time: 180705 ms
ClickBench: Total hot run time: 27.21 s

run buildall

TPC-H: Total hot run time: 36120 ms
TPC-DS: Total hot run time: 178998 ms
ClickBench: Total hot run time: 27.32 s

FE UT Coverage Report: Increment line coverage
FE Regression Coverage Report: Increment line coverage

run buildall

TPC-H: Total hot run time: 35151 ms

run buildall

TPC-H: Total hot run time: 35385 ms
TPC-DS: Total hot run time: 178544 ms
ClickBench: Total hot run time: 27.24 s

FE Regression Coverage Report: Increment line coverage
run buildall

TPC-H: Total hot run time: 34962 ms
TPC-DS: Total hot run time: 178232 ms
ClickBench: Total hot run time: 27.41 s

FE Regression Coverage Report: Increment line coverage

run buildall

TPC-H: Total hot run time: 35242 ms
TPC-DS: Total hot run time: 178687 ms

run external

… remove obsolete explain test

run buildall

TPC-H: Total hot run time: 36088 ms
TPC-DS: Total hot run time: 179920 ms
ClickBench: Total hot run time: 27.17 s

FE UT Coverage Report: Increment line coverage
FE Regression Coverage Report: Increment line coverage

PR approved by at least one committer and no changes requested.

PR approved by anyone and no changes requested.
…uce I/O and parsing overhead (#59056)

### What problem does this PR solve?

## Motivation

During Iceberg query planning, FE needs to read and parse the metadata chain: ManifestList → Manifest → DataFile/DeleteFile. When frequently querying hot partitions or executing small batch queries, the same Manifest files are repeatedly read and parsed, causing significant I/O and CPU overhead.

## Solution

This PR introduces a manifest-level cache (`IcebergManifestCache`) in FE to cache the parsed DataFile/DeleteFile lists per manifest file. The cache is implemented using Caffeine with weight-based LRU eviction and TTL support.

### Key Components

- **IcebergManifestCache**: Core cache implementation using Caffeine
  - Weight-based LRU eviction controlled by `iceberg.manifest.cache.capacity-mb`
  - TTL expiration via `iceberg.manifest.cache.ttl-second`
  - Single-flight loading to prevent duplicate parsing of the same manifest
- **ManifestCacheKey**: Cache key consisting of:
  - Manifest file path
- **ManifestCacheValue**: Cached payload containing:
  - List of `DataFile` or `DeleteFile`
  - Estimated memory weight for eviction
- **IcebergManifestCacheLoader**: Helper class to load and populate the cache using `ManifestFiles.read()`

### Cache Invalidation Strategy

- Key changes automatically invalidate stale entries (length/lastModified/sequenceNumber changes)
- TTL prevents stale data when the underlying storage doesn't support precise mtime/etag
- Different snapshots use different manifest paths/keys, ensuring snapshot-level isolation

### Iceberg Catalog Properties

| Config | Default | Description |
|--------|---------|-------------|
| `iceberg.manifest.cache.enable` | `true` | Enable/disable the manifest cache |
| `iceberg.manifest.cache.capacity-mb` | `1024` | Maximum cache capacity in MB |
| `iceberg.manifest.cache.ttl-second` | `48 * 60 * 60` | Cache entry expiration after access |

### Integration Point

The cache is integrated in `IcebergScanNode.planFileScanTaskWithManifestCache()`, which:

1. Loads delete manifests via cache and builds `DeleteFileIndex`
2. Loads data manifests via cache and creates `FileScanTask` for each data file
3. Falls back to the original scan if cache loading fails
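The weight-based eviction, TTL expiration, and single-flight loading described above are delegated to Caffeine in the actual implementation. As a rough illustration of those mechanics only, here is a minimal stdlib-only sketch; all class, record, and field names are hypothetical, and TTL here is measured from load time rather than Caffeine's expire-after-access:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.function.Function;

// Minimal sketch of the manifest-cache mechanics: weight-based LRU
// eviction, TTL expiration, and single-flight loading. Names are
// illustrative; the real cache delegates these concerns to Caffeine.
public class ManifestCacheSketch {
    // Key identifies a manifest file; a changed length/lastModified yields
    // a new key, which implicitly invalidates the stale entry.
    public record Key(String manifestPath, long length, long lastModified) {}
    // Value holds the parsed file list plus its estimated weight in bytes.
    public record Value(List<String> filePaths, long weightBytes) {}

    private record Entry(Value value, long loadedAtMillis) {}

    private final long capacityBytes;
    private final long ttlMillis;
    private long usedBytes = 0;
    // Access-ordered LinkedHashMap iterates least-recently-used first.
    private final LinkedHashMap<Key, Entry> map = new LinkedHashMap<>(16, 0.75f, true);

    public ManifestCacheSketch(long capacityBytes, long ttlMillis) {
        this.capacityBytes = capacityBytes;
        this.ttlMillis = ttlMillis;
    }

    // Synchronizing the whole load approximates single-flight: concurrent
    // callers for the same key wait instead of parsing the manifest twice.
    public synchronized Value get(Key key, Function<Key, Value> loader) {
        long now = System.currentTimeMillis();
        Entry e = map.get(key);
        if (e != null && now - e.loadedAtMillis() < ttlMillis) {
            return e.value(); // fresh hit, loader not invoked
        }
        if (e != null) { // expired: drop before reloading
            map.remove(key);
            usedBytes -= e.value().weightBytes();
        }
        Value v = loader.apply(key);
        map.put(key, new Entry(v, now));
        usedBytes += v.weightBytes();
        evictIfOverCapacity();
        return v;
    }

    private void evictIfOverCapacity() {
        var it = map.entrySet().iterator();
        while (usedBytes > capacityBytes && it.hasNext()) {
            usedBytes -= it.next().getValue().value().weightBytes();
            it.remove();
        }
    }

    public synchronized int size() {
        return map.size();
    }
}
```

A capacity of `iceberg.manifest.cache.capacity-mb` MB would correspond to `capacityBytes = mb * 1024L * 1024L` here, and `iceberg.manifest.cache.ttl-second` to `ttlMillis = ttlSecond * 1000L`.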
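The three planning steps above can be sketched as a small fallback wrapper; `plan`, `Task`, and the suppliers below are hypothetical stand-ins for `IcebergScanNode.planFileScanTaskWithManifestCache()` and its helpers, not the actual Doris code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Sketch of the integration flow: use the manifest cache when possible,
// fall back to the original (uncached) scan path on any loading failure.
public class PlanFallbackSketch {
    // Stand-in for a FileScanTask: one data file plus its applicable deletes.
    public record Task(String dataFile, List<String> deletes) {}

    public static List<Task> plan(Supplier<List<String>> cachedDeleteFiles,
                                  Supplier<List<String>> cachedDataFiles,
                                  Supplier<List<Task>> originalPlanner) {
        try {
            // 1. Load delete manifests via cache and build the delete index.
            List<String> deletes = cachedDeleteFiles.get();
            // 2. Load data manifests via cache; one task per data file.
            List<Task> tasks = new ArrayList<>();
            for (String dataFile : cachedDataFiles.get()) {
                tasks.add(new Task(dataFile, deletes));
            }
            return tasks;
        } catch (RuntimeException e) {
            // 3. Cache loading failed: fall back to the original scan.
            return originalPlanner.get();
        }
    }
}
```

The design choice worth noting is that a cache failure degrades to the pre-existing scan path rather than failing the query, so enabling the cache cannot make planning less reliable, only faster.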
…ables to reduce I/O and parsing overhead apache#59056 (apache#59408)

Cherry-picked from apache#59056

Co-authored-by: Socrates <suyiteng@selectdb.com>
Release note

None
Check List (For Author)

- Test
- Behavior changed:
- Does this need documentation?

Check List (For Reviewer who merge this PR)