fix: resolve test deadlock and integration timeout #7330

Merged: lpcox merged 2 commits into main from fix/test-deadlock-and-timeout on Jun 10, 2026

@lpcox lpcox commented Jun 10, 2026

Problem

Two tests were failing in make agent-finished:

  1. TestGetOrLaunch_RaceConditionDoubleCheck (600s timeout/deadlock): The test had 100 goroutines connect to localhost:9999, where nothing was listening. NewHTTPConnection tries 3 transports sequentially (streamable HTTP → SSE → plain JSON-RPC), each with a 30s connect timeout. Because these attempts run inside the write lock in syncutil.GetOrCreate, all other goroutines blocked behind the lock holder indefinitely.

  2. TestFullDIFCConfigFromJSON (15s startup timeout): The test configured a stdio server (the test/server1:latest Docker container) and an HTTP server at localhost:9999. Without Docker running, the stdio backend hangs. The HTTP backend also times out on TCP connect to the non-existent listener.

Fix

  • Race condition test: Use an httptest.NewServer backend that responds to JSON-RPC initialize requests immediately. The test now completes in <1s.
  • Integration test: Start a local HTTP server that returns 500 and point both backends at it. This removes the Docker dependency and ensures fast failure (~50ms per backend).
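
As a rough illustration of the two test servers described above, the sketch below builds one httptest server that answers a JSON-RPC request immediately and one that always returns 500. The helper names (`newInitializeServer`, `newFailingServer`, `postInitialize`) and the empty `result` payload are assumptions made for this example; the actual handlers in the PR are not shown here and may differ.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// rpcRequest holds only the JSON-RPC fields this sketch needs.
type rpcRequest struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id"`
	Method  string          `json:"method"`
}

// newInitializeServer returns a local server that replies right away to any
// well-formed JSON-RPC request, echoing the request id. The empty result is
// a placeholder, not the payload used by the real test.
func newInitializeServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var req rpcRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(map[string]any{
			"jsonrpc": "2.0",
			"id":      req.ID,
			"result":  map[string]any{},
		})
	}))
}

// newFailingServer returns a local server that fails every request with 500,
// so a client learns about the failure immediately instead of waiting on a
// connect timeout.
func newFailingServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, "backend unavailable", http.StatusInternalServerError)
	}))
}

// postInitialize sends a minimal initialize request and returns the HTTP
// status code of the response.
func postInitialize(url string) (int, error) {
	body := []byte(`{"jsonrpc":"2.0","id":1,"method":"initialize"}`)
	resp, err := http.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	ok := newInitializeServer()
	defer ok.Close()
	bad := newFailingServer()
	defer bad.Close()

	for _, srv := range []*httptest.Server{ok, bad} {
		status, err := postInitialize(srv.URL)
		fmt.Println(srv.URL, status, err)
	}
}
```

Because httptest.NewServer listens on a loopback port chosen by the system, the tests no longer depend on a hardcoded port such as 9999 being free or unused.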

- TestGetOrLaunch_RaceConditionDoubleCheck: Replace localhost:9999 (no
  listener) with httptest server that responds quickly. The old test
  deadlocked because NewHTTPConnection tries 3 transports (30s timeout
  each) while holding the write lock, blocking all 100 goroutines.

- TestFullDIFCConfigFromJSON: Replace stdio container (requires Docker)
  and hardcoded localhost:9999 with a local HTTP test server returning
  500. This avoids Docker dependency and TCP connect timeouts that
  caused the 15s startup wait to expire.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 10, 2026 19:40

Copilot AI left a comment

Pull request overview

This PR updates flaky/slow tests to avoid long connection timeouts and external Docker dependencies, improving make agent-finished reliability and runtime.

Changes:

  • Replace a deadlock-prone “connect to localhost:9999” concurrency test setup with a local httptest backend.
  • Update the DIFC integration test to use a local HTTP server for fast backend failures instead of Docker/unused ports.

| File | Description |
| --- | --- |
| test/integration/difc_config_test.go | Starts a local failing HTTP server and points both backends at it to eliminate Docker/TCP timeout hangs. |
| internal/launcher/getorlaunch_stdio_test.go | Uses an httptest server URL for the race-condition/double-check locking stress test to avoid long connect stalls. |

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

Comment thread internal/launcher/getorlaunch_stdio_test.go
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@lpcox lpcox merged commit 18751f3 into main Jun 10, 2026
26 checks passed
@lpcox lpcox deleted the fix/test-deadlock-and-timeout branch June 10, 2026 22:27