fix(search): explicit .Index() on IndexAsync so SUT writes don't bleed into the fixture index#24
Conversation
…h wait uses HTTP Testcontainers.Elasticsearch 4.11's built-in wait strategy probes ES via `new HttpWaitStrategy().UsingTls(configuration.TlsEnabled)`. TlsEnabled is the AND of two env vars: `xpack.security.enabled` AND `xpack.security.http.ssl.enabled`, both explicitly "false". The fixture only set the first, so TlsEnabled returned true, the wait probed HTTPS, ES (security disabled) only answered plain HTTP, and the probe never satisfied — every PaperlessServices integration test failed with `System.TimeoutException` after Testcontainers' 1-hour default. Adding the second env var makes TlsEnabled return false, the wait probes HTTP, and the fixture completes in seconds. Verified locally on ES 9.4.1 / MinIO 2025-09-07 / RabbitMQ 4.3.0 with Testcontainers 4.11 — 13/13 integration tests pass in 57s (vs. hanging forever before). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…llection serializes xUnit v3's parallelAlgorithm: aggressive submits every test to the parallel thread pool simultaneously, ignoring collection boundaries — even tests sharing `[Collection(SharedContainerCollection.Name)]` end up running in parallel against the same Elasticsearch instance. Result: OcrIntegrationTests.ProcessMultipleDocuments_Concurrently spam-indexes documents while SearchIndexIntegrationTests.MultipleDocuments_SearchCorrectly is polling for its own write to become visible. On a slow CI disk, the write/refresh cycle stalls and the poll's 10s cap is reached — the test fails deterministically with TaskCanceledException at exactly 10s 047ms (visible in run 25955174937). Conservative scheduling keeps tests in the SharedContainer collection sequential, eliminating the cross-test ES write contention. Local 13/13 integration tests now run in 37s (down from 57s under aggressive — less contention even when nothing fails). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… slow single call The polling helper used a single 10s linked-token across every SearchAsync call. On GitHub-hosted runners the first search after index creation can spend several seconds priming Lucene query caches even after the preceding Refresh.True has returned — eating the budget and surfacing as the deterministic 10s 047ms TaskCanceledException at line 90 of SearchIndexIntegrationTests.MultipleDocuments_SearchCorrectly. Two changes: - bump the default overall timeout from 10s → 30s. CI runners are slower than local dev machines and 10s was set when the suite was running on beefier hardware. 30s is still tight for a happy-path search. - catch OperationCanceledException inside the loop and exit cleanly so the caller's final attempt (with the caller's own token) decides pass/fail. Previously a cancelled poll surfaced as TaskCanceledException at the helper boundary, hiding whether the doc was actually missing or just not yet visible. Local 5/5 SearchIndexIntegrationTests pass in 41s on ES 9.4.1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sync polls SearchIndexService writes documents with Refresh.True (`?refresh=true`), which the Elastic 9.x docs describe as forcing a refresh of the affected shards before the index call returns. In practice, on GitHub-hosted ubuntu-latest runners the per-document refresh has been observed to not fully propagate before the first SearchAsync hits the index — the doc is visible to a real-time GET (used by WaitForDocumentAsync, which passes on CI) but invisible to _search (used by WaitForSearchResultsAsync, which fails). Calling client.Indices.RefreshAsync at the start of the polling helper forces an explicit index-level refresh. It's idempotent: on local dev machines where the per-document refresh already settled, this is a fast no-op; on a slow CI runner it converts a deterministic empty-result flake (MultipleDocuments_SearchCorrectly was failing at exactly 10s 047ms with TaskCanceledException, then post-timeout-bump at 30s with "collection is empty") into a passing search. The refresh failure is caught and ignored — the polling loop below is the actual correctness boundary; the refresh is best-effort warmup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SearchIndexService.IndexDocumentAsync built the IndexRequestDescriptor
without an explicit `.Index(...)`, so writes used whatever DefaultIndex
the injected ElasticsearchClient was configured with — not the index
the service's own ElasticsearchOptions point to.
For production this is a no-op (one client, one default, same value as
options.Value.DefaultIndex). It is *not* a no-op when a test builds a
SearchIndexService SUT with its own options but reuses the shared
fixture's ElasticsearchClient. The SUT's InitializeAsync correctly
creates its per-test index with the keyword/text/date mapping; then
IndexAsync silently writes to the *fixture's* default index instead.
Concretely: alphabetical run order is
GenAiIntegrationTests → OcrIntegrationTests →
SearchIndexConcurrencyTests → SearchIndexIntegrationTests …
SearchIndexConcurrencyTests' SUT was the first writer to the fixture
index. ES auto-created it with dynamic mapping (`id` as text+keyword
subfield) before SearchIndexIntegrationTests.MultipleDocuments_Search…
got a chance to call its own InitializeAsync — and that one then
short-circuited on ExistsAsync==true, never applying the proper
`p.Keyword("id")` mapping. The downstream
`Filter(f => f.Term(t => t.Field("id").Value(helloId.ToString())))`
ran against a tokenised text field and matched nothing, surfacing as
"Expected collection to contain a single item, but the collection is
empty" at line 90.
Explicit `.Index(options.Value.DefaultIndex)` routes writes back to the
service-owned index. Knock-on: the secondary assertion in
SearchIndexConcurrencyTests.InitializeAsync_WhenIndexPreCreated_Skips…
relied on the old behaviour — the doc landing in the fixture index so
fixture.WaitForDocumentAsync could find it. With the fix, the doc now
lives in the SUT's pre-created index, so that assertion now queries
the SUT's index directly via ElasticClient.GetAsync (still verifying
the InitializeAsync-skips-create branch, just against the correct
index).
Local 13/13 PaperlessServices integration tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 WalkthroughSummary by CodeRabbit
WalkthroughService indexing now explicitly targets the configured default Elasticsearch index. Tests use per-test index verification. Test containers pinned to ES 9.1.3 with HTTP TLS disabled. Fixture disposal made best-effort; search polling adds upfront refresh, 30s overall timeout, and improved cancellation handling. xUnit runner set to conservative algorithm. ChangesElasticsearch reliability and index targeting
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
✨ Simplify code
Comment |
There was a problem hiding this comment.
Pull request overview
This PR fixes Elasticsearch indexing so SearchIndexService writes to the index configured in its own options instead of relying on the injected client’s default index, preventing test index contamination and mapping drift.
Changes:
- Adds an explicit
.Index(options.Value.DefaultIndex)to document indexing. - Updates the concurrency test to query the SUT-specific index directly.
- Includes integration-test stability updates for Elasticsearch container TLS waiting, xUnit scheduling, and search polling.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
PaperlessServices/Features/OcrProcessing/Infrastructure/Search/SearchIndexService.cs |
Ensures indexed documents are written to the service-configured Elasticsearch index. |
PaperlessServices.Tests/Integration/SearchIndexConcurrencyTests.cs |
Adjusts the assertion to read from the per-test index after explicit indexing is fixed. |
PaperlessServices.Tests/Integration/WorkerTestBase.cs |
Adds Elasticsearch TLS env configuration and makes search polling more resilient in CI. |
PaperlessServices.Tests/xunit.runner.json |
Switches xUnit parallel scheduling from aggressive to conservative. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 8 |
| Duplication | -2 |
AI Reviewer: first review requested successfully. AI can make mistakes. Always validate suggestions.
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
Pull Request Overview
The PR successfully addresses index isolation issues by ensuring SearchIndexService uses the configured index rather than the client default. This fix, combined with the SSL configuration for Testcontainers and increased timeouts, significantly improves the reliability of the integration test suite. Codacy analysis indicates the changes are up to standards.
However, a logic gap remains in the search helper within WorkerTestBase.cs. While the PR enables custom index names to prevent data bleeding, the helper still targets the DefaultIndex during explicit refreshes. This should be addressed to ensure that tests using non-default indices remain consistent in restricted I/O environments like CI.
About this PR
- The systemic improvements to CI stability—specifically the manual container health check fixes and the 'final attempt' polling pattern—are high-quality additions that will reduce transient test failures.
Test suggestions
- Verify IndexDocumentAsync targets the index name provided in ElasticsearchOptions rather than the client default
- Verify SearchIndexConcurrencyTests queries the correct non-default index during verification
- Verify Testcontainers setup explicitly disables http.ssl to facilitate container health checks
- Verify the search wait-helper in WorkerTestBase implements a 30s timeout and idempotent index refresh
TIP Improve review quality by adding custom instructions
TIP How was this review? Give us feedback
| try | ||
| { | ||
| await client.Indices.RefreshAsync( | ||
| r => r.Indices(client.ElasticsearchClientSettings.DefaultIndex), |
There was a problem hiding this comment.
🟡 MEDIUM RISK
Suggestion: The explicit refresh targets the client's DefaultIndex. Since a primary goal of this PR is to ensure the System Under Test (SUT) can use specific index names to avoid bleeding into the shared fixture index, the DefaultIndex might not be the index where the document was just written. Using Indices.All ensures that the defensive refresh applies to all indices, covering cases where the test uses a custom index name.
| r => r.Indices(client.ElasticsearchClientSettings.DefaultIndex), | |
| await client.Indices.RefreshAsync( | |
| r => r.Indices(Indices.All), | |
| overallLinked.Token); |
| // Final attempt | ||
| // Final attempt with the caller's token only so the assertion sees real | ||
| // "found nothing" data rather than a TaskCanceledException at the wait boundary. | ||
| return await client.SearchAsync<T>(configureSearch, cancellationToken); |
There was a problem hiding this comment.
⚪ LOW RISK
Nitpick: This 'final attempt' pattern is excellent for test reliability. By performing one last search using the caller's original cancellationToken after the internal polling loop finishes, you ensure that timeout failures provide specific details about what was missing rather than throwing an opaque OperationCanceledException.
| // Required so Testcontainers' ElasticsearchConfiguration.TlsEnabled evaluates to false | ||
| // (it AND-s xpack.security.enabled with xpack.security.http.ssl.enabled). Without this, | ||
| // the built-in wait strategy probes HTTPS while ES listens on plain HTTP, and hangs. | ||
| .WithEnvironment("xpack.security.http.ssl.enabled", "false") |
There was a problem hiding this comment.
⚪ LOW RISK
Nitpick: Adding xpack.security.http.ssl.enabled set to false is an important fix. Without this, Testcontainers often defaults to an HTTPS health check even when xpack.security.enabled is false, which can cause the container startup to hang indefinitely in CI environments.
Bumping to 9.4.1 regressed PaperlessServices integration tests from
12/13 passing to 0/26 passing with a 1h hang (some combination of the
new image's startup signature or default settings vs. our wait
strategy / xpack.security flags).
Locally on macOS+OrbStack, ES 9.1.3 plus the in-flight test fixes
(xpack.security.http.ssl.enabled=false, 30s search-poll budget,
explicit pre-wait refresh, conservative SharedContainer collection
serialization, explicit options.Value.DefaultIndex in IndexAsync)
runs the full integration suite green:
PaperlessREST.Tests: 59/59 passed (1m 18s)
PaperlessServices.Tests: 13/13 passed (45s) — includes the
previously-flaky MultipleDocuments_SearchCorrectly
Keep the other version bumps (Testcontainers 4.11, Elastic client 9.4,
EF Core 10.0.8 etc.) — they were not implicated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
PaperlessServices.Tests/Integration/WorkerTestBase.cs (1)
215-217: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winDefensive refresh may target the wrong index.
The refresh targets
client.ElasticsearchClientSettings.DefaultIndex, butWaitForSearchResultsAsyncaccepts an arbitraryconfigureSearchaction that can target any index. Given this PR's goal of allowing the SUT to use specific index names separate from the shared fixture index, if the caller searches a non-default index, the defensive refresh will not affect the searched index.♻️ Proposed fix
await client.Indices.RefreshAsync( - r => r.Indices(client.ElasticsearchClientSettings.DefaultIndex), + r => r.Indices(Indices.All), overallLinked.Token);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@PaperlessServices.Tests/Integration/WorkerTestBase.cs` around lines 215 - 217, The defensive refresh in WaitForSearchResultsAsync currently hardcodes client.ElasticsearchClientSettings.DefaultIndex when calling client.Indices.RefreshAsync but the configureSearch action can target any index; update WaitForSearchResultsAsync so the refresh targets the same index(es) as the search: either extract the Indices from the SearchDescriptor built by the configureSearch action and pass those into client.Indices.RefreshAsync, or (if extraction is impractical) use r => r.Indices(Indices.All) so the refresh covers the searched index; ensure you modify the RefreshAsync call (the lambda currently using DefaultIndex) so it uses the extracted indices or Indices.All instead.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@PaperlessServices.Tests/Integration/WorkerTestBase.cs`:
- Around line 215-217: The defensive refresh in WaitForSearchResultsAsync
currently hardcodes client.ElasticsearchClientSettings.DefaultIndex when calling
client.Indices.RefreshAsync but the configureSearch action can target any index;
update WaitForSearchResultsAsync so the refresh targets the same index(es) as
the search: either extract the Indices from the SearchDescriptor built by the
configureSearch action and pass those into client.Indices.RefreshAsync, or (if
extraction is impractical) use r => r.Indices(Indices.All) so the refresh covers
the searched index; ensure you modify the RefreshAsync call (the lambda
currently using DefaultIndex) so it uses the extracted indices or Indices.All
instead.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 4b459228-515a-4da9-8fa4-f424bfcee38a
📒 Files selected for processing (3)
.env.testPaperlessREST.Tests/Integration/SharedRestContainerFixture.csPaperlessServices.Tests/Integration/WorkerTestBase.cs
📜 Review details
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: Build & Test (backend)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.cs
📄 CodeRabbit inference engine (.editorconfig)
**/*.cs: Do not qualify event, field, method, and property access with 'this.' in C# code
Use predefined types (like int, string) instead of BCL types (like Int32, String) in C#
Always use parentheses for clarity in arithmetic and relational binary operators in C#
Require explicit accessibility modifiers for non-interface members in C#
Prefer coalesce expressions, collection initializers, null propagation, and object initializers in C#
Prefer auto-properties and compound assignments in C#
Prefer conditional expressions over assignment and return statements in C#
Mark fields as readonly when possible in C#
Treat all unused parameters as code quality issues in C#
Use explicit types instead of 'var' in C# unless type is apparent
Prefer expression-bodied members for accessors, indexers, lambdas, methods, and properties in C#
Use pattern matching instead of 'as' with null checks and 'is' with cast checks in C#
Prefer switch expressions over traditional switch statements in C#
Use conditional delegate calls in C# to avoid null reference exceptions
Prefer static local functions over instance local functions in C#
Use modifier order: public, private, protected, internal, file, static, extern, new, virtual, abstract, sealed, override, readonly, unsafe, required, volatile, async in C#
Always use braces for code blocks in C#
Prefer file-scoped namespaces in C#
Prefer implicit object creation when type is apparent in C#
Prefer index and range operators in C#
Place 'using' directives outside of namespace declarations in C#
Always place opening brace on new line before catch, else, finally, and members in C#
Indent block contents and case contents in C#
Add space after comma, before and after binary operators, and after colon in inheritance clause in C#
Do not add space after dot, cast, or before open square brackets in C#
Name interfaces with 'I' prefix using PascalCase in C#
Name types (classes, structs, enums) using PascalCase in C#
Name non-field members (properties, events...
Files:
PaperlessREST.Tests/Integration/SharedRestContainerFixture.csPaperlessServices.Tests/Integration/WorkerTestBase.cs
PaperlessServices.Tests/Integration/**/*.cs
📄 CodeRabbit inference engine (CLAUDE.md)
Mock
ITextSummarizerin integration tests withFakeTextSummarizer; do not use the real Gemini API even with placeholder keys
Files:
PaperlessServices.Tests/Integration/WorkerTestBase.cs
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
…st-effort _host is assigned partway through InitializeAsync (after the container StartAsync calls succeed and Microsoft.Extensions.Hosting builds the host). If init throws before that — e.g. a container wait-strategy times out — _host is still null!. DisposeAsync then NREs on `_host.StopAsync()` and xUnit's collection-fixture cleanup reporter publishes the cleanup NRE alongside (or instead of) the actual init exception, hiding the real cause from the developer. Guard _host with `is not null` and wrap each container DisposeAsync in a swallow so a teardown failure can never mask the genuine InitializeAsync exception. This is purely a diagnostic-quality fix — the cleanup is best-effort, so success-path teardown is unchanged (local 13/13 still pass in 47s). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
|
||
| // Default image versions - override via environment variables for CI flexibility | ||
| private const string DefaultElasticsearchImage = "docker.elastic.co/elasticsearch/elasticsearch:9.4.1"; | ||
| private const string DefaultElasticsearchImage = "docker.elastic.co/elasticsearch/elasticsearch:9.1.3"; |
| "minio/minio:RELEASE.2025-09-07T16-13-09Z"; | ||
| string elasticImage = Environment.GetEnvironmentVariable("ELASTIC_IMAGE") ?? | ||
| "docker.elastic.co/elasticsearch/elasticsearch:9.4.1"; | ||
| "docker.elastic.co/elasticsearch/elasticsearch:9.1.3"; |
| RABBITMQ_IMAGE=rabbitmq:4.3.0-management | ||
| MINIO_IMAGE=minio/minio:RELEASE.2025-09-07T16-13-09Z | ||
| ELASTIC_IMAGE=docker.elastic.co/elasticsearch/elasticsearch:9.4.1 | ||
| ELASTIC_IMAGE=docker.elastic.co/elasticsearch/elasticsearch:9.1.3 |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@PaperlessServices.Tests/Integration/WorkerTestBase.cs`:
- Around line 122-127: The cleanup currently calls _host.Dispose() without a
best-effort guard, which can throw and mask earlier InitializeAsync failures;
wrap the Dispose call in a try/catch (similar to the existing try for await
_host.StopAsync()) so any exception thrown by _host.Dispose() is swallowed and
does not overwrite the original exception — locate the block referencing _host,
StopAsync and Dispose and surround the Dispose invocation with a try {
_host.Dispose(); } catch { /* swallow to avoid masking init error */ }.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 3946eac6-bff2-487b-9923-10319606f0cd
📒 Files selected for processing (1)
PaperlessServices.Tests/Integration/WorkerTestBase.cs
📜 Review details
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: copilot-pull-request-reviewer
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: Build & Test (backend)
🧰 Additional context used
📓 Path-based instructions (2)
PaperlessServices.Tests/Integration/**/*.cs
📄 CodeRabbit inference engine (CLAUDE.md)
Mock
ITextSummarizerin integration tests withFakeTextSummarizer; do not use the real Gemini API even with placeholder keys
Files:
PaperlessServices.Tests/Integration/WorkerTestBase.cs
**/*.cs
📄 CodeRabbit inference engine (.editorconfig)
**/*.cs: Do not qualify event, field, method, and property access with 'this.' in C# code
Use predefined types (like int, string) instead of BCL types (like Int32, String) in C#
Always use parentheses for clarity in arithmetic and relational binary operators in C#
Require explicit accessibility modifiers for non-interface members in C#
Prefer coalesce expressions, collection initializers, null propagation, and object initializers in C#
Prefer auto-properties and compound assignments in C#
Prefer conditional expressions over assignment and return statements in C#
Mark fields as readonly when possible in C#
Treat all unused parameters as code quality issues in C#
Use explicit types instead of 'var' in C# unless type is apparent
Prefer expression-bodied members for accessors, indexers, lambdas, methods, and properties in C#
Use pattern matching instead of 'as' with null checks and 'is' with cast checks in C#
Prefer switch expressions over traditional switch statements in C#
Use conditional delegate calls in C# to avoid null reference exceptions
Prefer static local functions over instance local functions in C#
Use modifier order: public, private, protected, internal, file, static, extern, new, virtual, abstract, sealed, override, readonly, unsafe, required, volatile, async in C#
Always use braces for code blocks in C#
Prefer file-scoped namespaces in C#
Prefer implicit object creation when type is apparent in C#
Prefer index and range operators in C#
Place 'using' directives outside of namespace declarations in C#
Always place opening brace on new line before catch, else, finally, and members in C#
Indent block contents and case contents in C#
Add space after comma, before and after binary operators, and after colon in inheritance clause in C#
Do not add space after dot, cast, or before open square brackets in C#
Name interfaces with 'I' prefix using PascalCase in C#
Name types (classes, structs, enums) using PascalCase in C#
Name non-field members (properties, events...
Files:
PaperlessServices.Tests/Integration/WorkerTestBase.cs
🔇 Additional comments (4)
PaperlessServices.Tests/Integration/WorkerTestBase.cs (4)
220-229: Refresh targets only the fixture's DefaultIndex.The
configureSearchdelegate may target a different index (e.g., a per-test SUT index). Refreshing onlyDefaultIndexwon't make documents in other indices searchable. UseIndices.Allfor the defensive refresh.
25-25: LGTM!
35-38: LGTM!
202-211: LGTM!Also applies to: 231-259
| if (_host is not null) | ||
| { | ||
| try { await _host.StopAsync(); } | ||
| catch { /* best-effort: don't mask the InitializeAsync exception */ } | ||
| _host.Dispose(); | ||
| } |
There was a problem hiding this comment.
_host.Dispose() not wrapped in best-effort handler.
The comment on lines 117-121 states the intent to "swallow per-container Dispose failures" to avoid masking init errors. StopAsync is wrapped but Dispose() is not. If Dispose() throws (e.g., a hosted service's disposal fails), it masks the original init failure exactly as described.
Proposed fix
if (_host is not null)
{
try { await _host.StopAsync(); }
catch { /* best-effort: don't mask the InitializeAsync exception */ }
- _host.Dispose();
+ try { _host.Dispose(); }
+ catch { /* best-effort */ }
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@PaperlessServices.Tests/Integration/WorkerTestBase.cs` around lines 122 -
127, The cleanup currently calls _host.Dispose() without a best-effort guard,
which can throw and mask earlier InitializeAsync failures; wrap the Dispose call
in a try/catch (similar to the existing try for await _host.StopAsync()) so any
exception thrown by _host.Dispose() is swallowed and does not overwrite the
original exception — locate the block referencing _host, StopAsync and Dispose
and surround the Dispose invocation with a try { _host.Dispose(); } catch { /*
swallow to avoid masking init error */ }.
Summary
Follow-up to #23. Even after the ES TLS / parallel-algorithm / refresh fixes landed,
MultipleDocuments_SearchCorrectlykept failing on CI withExpected collection to contain a single item, but the collection is empty.— same place every time.Root cause:
SearchIndexService.IndexDocumentAsyncbuilt itsIndexRequestDescriptorwithout an explicit.Index(...), so the actual write target was whateverDefaultIndexthe injectedElasticsearchClientwas configured with — not the index the service's ownElasticsearchOptionspoint to.In production this is a no-op (one client, one default, same value as
options.Value.DefaultIndex). It is not a no-op when a test (hereSearchIndexConcurrencyTests) builds a SUT with its own options but reuses the shared fixture'sElasticsearchClient. The SUT'sInitializeAsynccorrectly creates its per-test index with the keyword/text/date mapping; thenIndexAsyncsilently writes to the fixture's default index instead.Alphabetical run order is
GenAiIntegrationTests → OcrIntegrationTests → SearchIndexConcurrencyTests → SearchIndexIntegrationTests ….SearchIndexConcurrencyTests's SUT was the first writer to the fixture index. ES auto-created that index with dynamic mapping (idastext+.keywordsubfield) beforeSearchIndexIntegrationTests.MultipleDocuments_SearchCorrectlygot a chance to call its ownInitializeAsync— which then short-circuited onExistsAsync==trueand never applied the properp.Keyword("id")mapping. The downstreamFilter(f => f.Term(t => t.Field(\"id\").Value(helloId.ToString())))ran against a tokenised text field, never matched, and the test failed at theContainSingle()assertion.Fix
PaperlessServices/Features/OcrProcessing/Infrastructure/Search/SearchIndexService.cs: add.Index(options.Value.DefaultIndex)so writes target the service-owned index, not the client default.PaperlessServices.Tests/Integration/SearchIndexConcurrencyTests.cs: the secondary assertion inInitializeAsync_WhenIndexPreCreated_SkipsCreateAndLogsNothingwas relying on the bug (the doc landing in the fixture index sofixture.WaitForDocumentAsynccould find it). With the production fix the doc now lives in the SUT's pre-created index, so the assertion now queries that index directly viaElasticClient.GetAsync(... g => g.Index(indexName) ...). The primary assertion (count ofCreated Elasticsearch indexlog lines) is unchanged.Verification
PaperlessServices.Tests.Integrationpass.Test plan
🤖 Generated with Claude Code