perf(iterator): seek-ahead MVCC version skipping #2303
Open
shaunpatterson wants to merge 1 commit into
Forward, non-AllVersions iteration previously walked the whole version chain of each user key one mi.Next() at a time: (a) skipping versions above readTs, and (b) skipping older duplicate versions after yielding a key. Both are O(versions) linear walks through the merge tree and are dgraph's #1 read cost on long posting-list version chains.

Replace both with a single Seek:
- version > readTs: Seek(KeyWithTs(userKey, readTs)) jumps straight to the first version <= readTs of the same user key.
- dedup of older versions: Seek(KeyWithTs(userKey+0x00, MaxUint64)) jumps to the newest version of the next user key. Appending 0x00 (not incrementing the last byte) is required so prefix-extension keys like k10 are not overshot when skipping past k1.

Both optimizations are guarded to forward, non-AllVersions iteration; reverse and AllVersions keep stepping. A forceStepSkip test hook lets a differential test prove byte-identical iteration vs the step-skip path over randomized multi-version data (deleted/expired/SinceTs/prefix/reverse/allVersions, across memtable + LSM levels).

Point Get is already version-safe (db.get seeks to KeyWithTs(key, readTs) and takes the max version across all tables), so no new method is added; TestVersionSafePointGet documents and verifies this.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NtGkC4K2J2XYwcAKwjhHbM
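The point-Get behaviour described in the commit message (seek to the read timestamp in each table, then keep the highest version found) can be pictured with a toy model. This is only a sketch under stated assumptions: `version`, `table`, `seekVersion`, and `get` are hypothetical names invented for illustration, each table is modelled as a slice sorted by key ascending and timestamp descending, and none of it is Badger's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// version is a hypothetical stand-in for one stored (key, ts, value) entry.
type version struct {
	key string
	ts  uint64
	val string
}

// table models one memtable or SSTable: sorted by key ascending, ts descending.
type table []version

// seekVersion returns the newest version of key with ts <= readTs in this table.
func (t table) seekVersion(key string, readTs uint64) (version, bool) {
	i := sort.Search(len(t), func(i int) bool {
		return t[i].key > key || (t[i].key == key && t[i].ts <= readTs)
	})
	if i < len(t) && t[i].key == key {
		return t[i], true
	}
	return version{}, false
}

// get consults every table and keeps the highest visible version.
func get(tables []table, key string, readTs uint64) (version, bool) {
	var best version
	found := false
	for _, t := range tables {
		if v, ok := t.seekVersion(key, readTs); ok && (!found || v.ts > best.ts) {
			best, found = v, true
		}
	}
	return best, found
}

func main() {
	memtable := table{{"k1", 9, "v9"}, {"k1", 4, "v4"}}
	level0 := table{{"k1", 7, "v7"}, {"k1", 2, "v2"}, {"k10", 5, "x5"}}
	v, ok := get([]table{memtable, level0}, "k1", 8)
	fmt.Println(v.val, ok) // the memtable offers ts 4, level0 offers ts 7
}
```

In this model a read at timestamp 8 ignores version 9 in the memtable and picks version 7 from the lower level, even though the memtable also holds an older visible version.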
Motivation
In parseItem, the iterator skips versions newer than readTs and older duplicates of an already-returned key one mi.Next() at a time through the whole merge tree (iterator.go). For keys with long version chains, e.g. a frequently rewritten posting list or counter, this is O(versions scanned) per read, with no seek-ahead.

What changed
Two linear version-stepping loops in parseItem become seeks, for forward + non-AllVersions iteration only:
- For versions > readTs, Seek(KeyWithTs(userKey, readTs)) jumps straight to the first version <= readTs of the same user key instead of stepping over every too-new version.
- For older duplicates of an already-returned key, Seek jumps to the next user key (KeyWithTs(userKey+0x00, MaxUint64)) instead of stepping through every older version.

userKey+0x00 is the smallest user key strictly greater than userKey, so the seek target sorts after every version of userKey and never overshoots prefix-extension keys (for "k1" it still visits "k10", since "k1\x00" < "k10"), because y.CompareKeys compares the user-key portion before the timestamp suffix.

Compatibility & correctness
Pure optimization: iteration results are unchanged. Reverse and AllVersions paths are untouched. A differential test compares seek-skip against forced step-skip byte-for-byte over a matrix of {reverse, AllVersions, prefixes, SinceTs, readTs} with deleted/expired versions and data spread across memtable + LSM levels.

Testing
go build, go vet, and the full go test . -count=1 are green; the differential test passes under -race.

🤖 Generated with Claude Code
https://claude.ai/code/session_01NtGkC4K2J2XYwcAKwjhHbM