Release stale deployment locks and add TTL #107
Reclaim stale deployment locks and prevent permanent lockout: aa-core now releases a stale lock when the account isn't deployed and returns NotDeployed so callers can retry. The external bundler uses Redis SET with NX + EX via SetOptions/SetExpiry and adds a 300s fallback LOCK_TTL_SECONDS to ensure locks auto-expire if a holder dies. This avoids stuck deployments and makes lock acquisition atomic with an expiry.
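The acquisition semantics described above can be sketched with a small in-memory model. This is not the bundler's code: `FakeRedis`, `set_nx_ex`, and the key format are hypothetical stand-ins used to avoid a Redis dependency, and time is passed in as plain seconds so the behavior is deterministic. Only `LOCK_TTL_SECONDS` and its 300s value come from the description.

```rust
use std::collections::HashMap;

/// Fallback TTL named in the PR description (300 seconds).
const LOCK_TTL_SECONDS: u64 = 300;

/// Hypothetical in-memory stand-in for Redis: key -> (value, expiry time in seconds).
#[derive(Default)]
struct FakeRedis {
    entries: HashMap<String, (String, u64)>,
}

impl FakeRedis {
    /// Models `SET key value NX EX ttl`: the write happens only when the key is
    /// absent or already expired, and the expiry is attached in the same step.
    /// Returns true when the lock was acquired.
    fn set_nx_ex(&mut self, key: &str, value: &str, ttl: u64, now: u64) -> bool {
        let held = match self.entries.get(key) {
            Some((_, expires_at)) => *expires_at > now,
            None => false,
        };
        if held {
            return false;
        }
        self.entries
            .insert(key.to_string(), (value.to_string(), now + ttl));
        true
    }
}
```

In this model a second worker calling `set_nx_ex` while the lock is live gets `false`, and the same call succeeds once `LOCK_TTL_SECONDS` have elapsed, which is the property that prevents a dead holder from blocking deployment forever.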
No actionable comments were generated in the recent review.
Walkthrough
Adds TTL-backed Redis deployment locks and an owner-checked unlock; stale locks trigger a chain re-check and are released when the account is confirmed not deployed.
Changes
Stale Deployment Lock Handling
Sequence Diagram
sequenceDiagram
participant Client
participant DeploymentManager
participant RedisDeploymentLock
participant Redis
participant Chain
Client->>RedisDeploymentLock: acquire_lock
RedisDeploymentLock->>Redis: SET key NX EX(LOCK_TTL_SECONDS)
Redis-->>RedisDeploymentLock: Lock acquired atomically
Client->>DeploymentManager: check_deployment_status
DeploymentManager->>RedisDeploymentLock: Inspect lock age
Note over DeploymentManager: Detect stale lock
DeploymentManager->>Chain: check_chain
Chain-->>DeploymentManager: Account not deployed
DeploymentManager->>RedisDeploymentLock: release_lock_if_owner(chain_id, addr, lock_id)
RedisDeploymentLock->>Redis: EVAL compare-and-delete Lua script
Redis-->>RedisDeploymentLock: Deleted? (true/false)
DeploymentManager-->>Client: Return NotDeployed
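The stale-lock path in the diagram can be sketched roughly as follows. Everything beyond the names in the diagram is an assumption: `LockInfo`, `InMemoryLock`, the `BeingDeployed` and `Deployed` variants, the `STALE_AFTER_SECONDS` threshold, and the behavior when a stale lock is found but the account is already on chain are all hypothetical, and the chain check is passed in as a closure instead of a real RPC call.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum DeploymentStatus {
    Deployed,      // hypothetical variant
    BeingDeployed, // hypothetical variant for a live, non-stale lock
    NotDeployed,
}

/// Hypothetical snapshot of a held lock.
#[derive(Clone)]
struct LockInfo {
    lock_id: String,
    age_seconds: u64,
}

trait DeploymentLock {
    fn check_lock(&self, chain_id: u64, addr: &str) -> Option<LockInfo>;
    /// Deletes the lock only if it still carries `lock_id`; returns whether it was deleted.
    fn release_lock_if_owner(&mut self, chain_id: u64, addr: &str, lock_id: &str) -> bool;
}

/// Hypothetical staleness threshold; the PR does not state the real value.
const STALE_AFTER_SECONDS: u64 = 600;

fn check_deployment_status<L: DeploymentLock>(
    lock: &mut L,
    is_deployed_on_chain: impl Fn(u64, &str) -> bool,
    chain_id: u64,
    addr: &str,
) -> DeploymentStatus {
    let info = match lock.check_lock(chain_id, addr) {
        Some(info) => info,
        None if is_deployed_on_chain(chain_id, addr) => return DeploymentStatus::Deployed,
        None => return DeploymentStatus::NotDeployed,
    };
    if info.age_seconds < STALE_AFTER_SECONDS {
        return DeploymentStatus::BeingDeployed;
    }
    // Stale lock: re-check the chain before reclaiming it.
    if is_deployed_on_chain(chain_id, addr) {
        return DeploymentStatus::Deployed;
    }
    // A false result means another worker already replaced or removed the lock.
    let _deleted = lock.release_lock_if_owner(chain_id, addr, &info.lock_id);
    DeploymentStatus::NotDeployed
}

/// Hypothetical in-memory lock store used to exercise the flow.
#[derive(Default)]
struct InMemoryLock {
    locks: HashMap<(u64, String), LockInfo>,
}

impl DeploymentLock for InMemoryLock {
    fn check_lock(&self, chain_id: u64, addr: &str) -> Option<LockInfo> {
        self.locks.get(&(chain_id, addr.to_string())).cloned()
    }

    fn release_lock_if_owner(&mut self, chain_id: u64, addr: &str, lock_id: &str) -> bool {
        let key = (chain_id, addr.to_string());
        let owned = self.locks.get(&key).map_or(false, |l| l.lock_id == lock_id);
        if owned {
            self.locks.remove(&key);
        }
        owned
    }
}
```

Returning `NotDeployed` after the release lets the caller retry acquisition immediately instead of waiting for the TTL to lapse.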
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@aa-core/src/userop/deployment.rs`:
- Around line 108-111: The stale-lock cleanup currently calls
self.lock.release_lock(chain_id, account_address) unconditionally; change the
call site in deployment.rs to pass the observed lock_id returned by
check_lock(), and update the LockStore::release_lock implementation to perform
an atomic compare-and-delete (compare stored lock value to the provided lock_id
and only DEL if equal) e.g. via a Redis EVAL/Lua script or Redis CAS primitive;
keep the return behavior (Ok(DeploymentStatus::NotDeployed)) but ensure only the
exact observed lock_id is removed to avoid deleting a freshly re-acquired lock.
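The race this comment describes can be illustrated with a single-slot model. `LockSlot` is a hypothetical stand-in for the Redis key, not the project's `LockStore`; in this sketch exclusivity comes from `&mut self`, whereas the real store would need the check and delete to run as one atomic Redis operation.

```rust
/// Hypothetical single-slot lock store holding the current lock_id, if any.
#[derive(Default)]
struct LockSlot {
    current: Option<String>,
}

impl LockSlot {
    /// Unconditional release: removes whatever lock is present,
    /// including one that a different worker acquired a moment ago.
    fn release_lock(&mut self) -> bool {
        self.current.take().is_some()
    }

    /// Compare-and-delete: removes the lock only if it still has the id
    /// that the caller observed earlier via its lock check.
    fn release_lock_if_owner(&mut self, observed_lock_id: &str) -> bool {
        if self.current.as_deref() == Some(observed_lock_id) {
            self.current = None;
            true
        } else {
            false
        }
    }
}
```

If worker A observed lock id `old` and worker B has since re-acquired the slot with id `new`, `release_lock_if_owner("old")` returns `false` and leaves B's lock in place, while `release_lock()` would delete it.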
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 66a6fecf-d45d-446e-8721-44307c523d21
📒 Files selected for processing (2)
aa-core/src/userop/deployment.rs
executors/src/external_bundler/deployment.rs
Add a DeploymentLock::release_lock_if_owner method and use it when reclaiming stale deployment locks, so that a lock another worker may have just acquired is not deleted. Implement RedisDeploymentLock::release_lock_if_owner with an atomic compare-and-delete Lua script (GET + cjson.decode + DEL if lock_id matches) and map Redis errors to EngineError. Update the aa-core deployment flow to call the new method instead of unconditionally deleting the lock.
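A compare-and-delete script of the shape described (GET, `cjson.decode`, DEL on match) might look like the constant below. This is a sketch, not the script from the commit: the constant name `RELEASE_IF_OWNER_SCRIPT` is hypothetical, and it assumes the stored value is a JSON object with a `lock_id` field, the key is passed as `KEYS[1]`, and the expected id as `ARGV[1]`.

```rust
/// Hypothetical Lua script for an owner-checked unlock.
/// Assumes KEYS[1] is the lock key, ARGV[1] is the expected lock_id,
/// and the stored value is JSON with a `lock_id` field.
const RELEASE_IF_OWNER_SCRIPT: &str = r#"
local raw = redis.call('GET', KEYS[1])
if not raw then
  return 0
end
local ok, lock = pcall(cjson.decode, raw)
if ok and type(lock) == 'table' and lock.lock_id == ARGV[1] then
  return redis.call('DEL', KEYS[1])
end
return 0
"#;
```

Because Redis executes a script as a single unit, no other client can replace the key between the GET and the DEL; the script returns 1 when the lock was deleted and 0 otherwise, which maps onto the boolean result shown in the sequence diagram.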
Summary by CodeRabbit