feat: Add Garbage Collection (GC) and MaxArenasToKeep feature #98
Open
eeliu wants to merge 1 commit into
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR introduces configurable garbage collection (GC) for arena files in bigqueue, adds a public GC() entrypoint, and expands documentation and tests to validate cleanup, concurrency, and crash recovery behaviors.
Changes:
- Added `SetMaxArenasToKeep` configuration and metadata head updating to support arena file garbage collection.
- Implemented arena deletion logic inside `arenaManager` and exposed `MmapQueue.GC()` as a public API.
- Added extensive GC + concurrency + crash-recovery tests and updated docs/README examples to the `NewMmapQueue` API.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| metadata.go | Re-enables putHead so GC can persist updated global head in metadata. |
| config.go | Adds maxArenasToKeep config, option setter, and validation error. |
| bigqueue.go | Exposes MmapQueue.GC() to trigger arena cleanup with queue locking. |
| arenamanager.go | Refactors arena tracking (slice→map) and implements GC deletion logic + head updates. |
| gc_test.go | Adds multi-scenario tests validating arena deletion and consumer-head semantics. |
| gc_concurrency_test.go | Adds concurrent producer/consumer test with periodic GC. |
| crash_recovery_test.go | Adds multi-process crash recovery tests for enqueue, dequeue, and torn GC state. |
| bigqueue_test.go | Adds unit test for negative SetMaxArenasToKeep validation. |
| doc.go | Updates examples to NewMmapQueue and documents GC usage. |
| README.md | Documents SetMaxArenasToKeep and manual GC() usage; updates examples to NewMmapQueue. |
Description
This pull request adds automatic and manual Garbage Collection (GC) to `bigqueue`. By default, bigqueue retains all arena files indefinitely, which can exhaust storage for long-running queues. This feature lets users clean up consumed arena files periodically or automatically.

Key Features and Implementation Details
- `SetMaxArenasToKeep(n)`: Users can configure the queue to keep at most `n` consumed arenas. Expired arenas before this threshold are deleted from disk.
- `GC()`: An explicit `GC()` method lets users trigger disk cleanup manually (e.g., during off-peak hours).
- Refactored the `arenas` collection in `arenaManager` from a `[]*mmap.File` (slice) to a `map[int]*mmap.File`. This is necessary to support the non-contiguous arena IDs that arise when old files are deleted from disk.

GC Workflow

    flowchart TD
        A[Trigger GC] --> B[Gather Consumer Heads]
        B --> C[Calculate minHeadAid = min of all consumers]
        C --> D{Is minHeadAid valid?}
        D -- Yes --> E[Update Global Head to minHeadAid]
        D -- No --> Z[Exit GC]
        E --> F[Calculate limitAid = minHeadAid - maxArenasToKeep]
        F --> G{limitAid > 0?}
        G -- Yes --> H[Iterate aid from 0 to limitAid-1]
        H --> I[Unmap arena from memory]
        I --> J[Delete .dat file from disk]
        J --> K[Remove from in-memory arena map]
        K --> L[Repeat for next expired arena]
        L --> Z
        G -- No --> Z

Test Cases Introduced
- `gc_test.go`: Basic functionality tests verifying that configuring `SetMaxArenasToKeep` cleans up the expected arena files upon consumption.
- `gc_concurrency_test.go`: Concurrent stress tests that run `Enqueue`, `Dequeue`, and `GC` at the same time to check that no race conditions arise during the memory unmap or file deletion stages.
- `crash_recovery_test.go`: Tests the resilience of the queue after unexpected shutdowns. It simulates a crash while a GC is only partially completed and verifies that the queue restores itself correctly on the next start.
- `bigqueue_test.go` (updates): Validation checks ensuring that negative values for `maxArenasToKeep` return the appropriate initialization error.

All tests pass with the expected coverage. Please let me know if there are any aspects of the implementation you would like me to adjust.
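
For reviewers, the arena-selection part of the GC workflow can be sketched in isolation. This is a simplified model under stated assumptions, not the PR's implementation: `expiredArenas` is a hypothetical helper, the map values are placeholder file names, and unmapping and file removal are reduced to a map delete. It also iterates over the tracked map keys instead of counting from 0 to `limitAid-1`, which selects the same tracked arenas while skipping IDs removed by an earlier GC.

```go
package main

import (
	"fmt"
	"sort"
)

// expiredArenas takes each consumer's head arena ID, computes
// limitAid = minHeadAid - maxArenasToKeep, and returns the tracked
// arena IDs strictly below limitAid in ascending order.
func expiredArenas(heads []int, arenas map[int]string, maxArenasToKeep int) []int {
	if len(heads) == 0 {
		return nil // no valid minimum head, so GC exits
	}
	minHeadAid := heads[0]
	for _, h := range heads[1:] {
		if h < minHeadAid {
			minHeadAid = h
		}
	}
	limitAid := minHeadAid - maxArenasToKeep
	if limitAid <= 0 {
		return nil // nothing is old enough to delete
	}
	var expired []int
	for aid := range arenas {
		if aid < limitAid {
			expired = append(expired, aid)
		}
	}
	sort.Ints(expired) // map iteration order is random; sort for determinism
	return expired
}

func main() {
	// Placeholder file names; the real naming scheme is not shown here.
	arenas := map[int]string{0: "a0", 1: "a1", 2: "a2", 3: "a3", 4: "a4", 5: "a5"}
	heads := []int{4, 5} // two consumers; the slowest is on arena 4

	for _, aid := range expiredArenas(heads, arenas, 2) {
		// The real code would unmap the arena and delete its file first.
		delete(arenas, aid)
	}
	fmt.Println("arenas remaining:", len(arenas))
}
```

After arenas 0 and 1 are removed, the surviving keys start at 2, which is the non-contiguous case the slice-to-map refactor is meant to handle.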