feat: wire up Prometheus metrics server #25

Closed
jacderida wants to merge 11 commits into WithAutonomi:main from jacderida:feat-wire_up_metrics_server

Conversation

@jacderida jacderida commented Mar 16, 2026

Summary

  • Wire up saorsa-core's MetricEvent broadcast channel to a Prometheus-compatible HTTP metrics server
  • Add MetricsAggregator that consumes high-frequency metric events from saorsa-core into atomic counters and sliding windows
  • Add PrometheusFormatter that formats aggregated metrics as Prometheus text exposition
  • Add MetricsSnapshot for periodic state polling (transport stats, routing strategy stats, EigenTrust scores)
  • Expose metrics via HTTP server on configurable port (default 9100, 0 to disable)
  • Add CLI flags: --metrics-port, --metrics-host

New modules

  • src/metrics/aggregator.rs — Event-driven metrics accumulator with sliding windows and atomic counters
  • src/metrics/prometheus.rs — Prometheus text format renderer (119 metrics across 14 categories)
  • src/metrics/snapshot.rs — Periodic state snapshot from saorsa-core accessors

Commits

  • 790a3c4 feat: wire up HTTP metrics server from saorsa-core HealthServer
  • 25e65f0 feat: add Prometheus metrics aggregation and export pipeline
  • 5f122ca feat: add phase 2 metrics — transport, DHT latency, replication
  • b885e26 fix: subscribe to metric events before starting P2P node
  • e8fa264 chore: track Cargo.lock for reproducible builds
  • b28cfe9 fix: adapt to saorsa-core MetricEvent API changes

Testnet Validation

These metrics were observed on a live testnet with a corresponding saorsa-core PR that consumes the event channel and exposes metrics via a Prometheus-compatible HTTP server.

Infrastructure

Genesis VM 138.68.180.191 (s-2vcpu-4gb, Ubuntu 24.04)
Regular VMs 10 × s-2vcpu-4gb, Ubuntu 24.04
Nodes per VM 10
Total Nodes 101 (1 genesis + 100 regular)
Genesis Port 12000
Metrics Ports 9100–9109 per VM

Monitoring Stack

Monitoring VM 178.128.168.94
InfluxDB v2.7.10 on :8086
Grafana v11.4.0 on :3000
Telegraf v1.33.0, scraping all 101 nodes at 10s intervals

Observed Metrics (119 total across 14 categories)

On an idle network (no data storage or lookups), connection and transport metrics were active while operation-specific metrics remained at zero as expected:

| Category | Count | Highlights |
| --- | --- | --- |
| Health & Runtime | 13 | All 4 components healthy, uptime tracking, system resource gauges |
| Connection | 1 | p2p_connected_peers = 100 |
| Transport | 13 | 100 active connections, 630K sent / 1.6M received, 100% success rate |
| Handshake | 3 | Latency percentiles (p50/p95/p99) |
| Lookups | 8 | Total, latency percentiles, hop counts, timeout rate |
| DHT Operations | 11 | Put/Get totals, success rate, latency percentiles |
| Throughput | 1 | p2p_operations_per_second |
| Auth | 1 | p2p_auth_failures_total |
| Storage | 15 | 5 metrics × 3 operations (read/write/delete) |
| Routing Table | 3 | Size, buckets filled, bucket fullness |
| Replication | 10 | Factor=8, health=1.0, cycle tracking, grace periods |
| Security | 16 | Eclipse/sybil/collusion scores, BFT mode, churn rate, diversity rejections |
| Trust | 12 | EigenTrust avg=0.5, min=0, max=1.0, interaction counters, witness receipts |
| Placement | 12 | Geographic diversity=1.0, load balance=1.0, capacity tracking |

Test plan

🤖 Generated with Claude Code

jacderida and others added 6 commits March 13, 2026 00:18
790a3c4 feat: wire up HTTP metrics server from saorsa-core HealthServer

Connect the existing metrics_port config to saorsa-core's HealthServer,
exposing /health, /ready, /metrics, and /debug/vars endpoints.

- Move metrics_port from PaymentConfig to NodeConfig (semantically correct)
- Add metrics_host field to NodeConfig and --metrics-host CLI arg
- Instantiate HealthManager with 5 component checkers (DHT, network,
  transport, peers, storage) using P2PNode data sources
- Spawn HealthServer as background task with graceful shutdown
- Disable metrics server when port is 0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
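
The "port 0 disables the server" rule from the commit message above could be expressed roughly as follows. This is a minimal sketch: the field types and the `metrics_addr` helper are assumptions for illustration, and the real `NodeConfig` has more fields and may store the host differently.

```rust
use std::net::{IpAddr, SocketAddr};

/// Hypothetical subset of the node configuration.
pub struct NodeConfig {
    pub metrics_host: IpAddr,
    pub metrics_port: u16,
}

impl NodeConfig {
    /// Address to bind, or `None` when the metrics server is disabled (port 0).
    pub fn metrics_addr(&self) -> Option<SocketAddr> {
        if self.metrics_port == 0 {
            None
        } else {
            Some(SocketAddr::new(self.metrics_host, self.metrics_port))
        }
    }
}
```

Returning an `Option` lets the caller skip spawning the server task entirely when metrics are disabled.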

25e65f0 feat: add Prometheus metrics aggregation and export pipeline

Build a complete metrics pipeline with two data paths:
- Event-driven: MetricsAggregator processes MetricEvents from saorsa-core
  into atomic counters and sliding windows (lookups, DHT ops, auth,
  streams, storage, peer connections)
- Pull-based: SnapshotCollector reads state snapshots from saorsa-core
  accessors on each /metrics scrape (DHT health, security, trust,
  placement, transport, EigenTrust scores)

Replace saorsa-core's HealthServer with our own Axum server that combines
health component metrics with ~80 domain metric families in Prometheus
text exposition format on /metrics.

Update saorsa-core dependency to git branch feat-metrics_event_channel
which provides the MetricEvent channel and accessor methods.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
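
For readers unfamiliar with the Prometheus text exposition format mentioned above, here is a small sketch of how a metric family can be rendered as HELP, TYPE, and sample lines. The metric names appear in the PR's testnet table; the helper names and the help strings are made up for this example and are not the PR's formatter.

```rust
use std::fmt::Write;

/// Append one unlabeled metric family (HELP, TYPE, then the sample) to `out`.
fn write_metric(out: &mut String, name: &str, help: &str, kind: &str, value: f64) {
    // Writing into a String cannot fail, so the results are ignored.
    let _ = writeln!(out, "# HELP {name} {help}");
    let _ = writeln!(out, "# TYPE {name} {kind}");
    let _ = writeln!(out, "{name} {value}");
}

/// Render a tiny, hypothetical subset of the metrics.
pub fn render(connected_peers: u64, auth_failures: u64) -> String {
    let mut out = String::new();
    write_metric(
        &mut out,
        "p2p_connected_peers",
        "Currently connected peers",
        "gauge",
        connected_peers as f64,
    );
    write_metric(
        &mut out,
        "p2p_auth_failures_total",
        "Total authentication failures",
        "counter",
        auth_failures as f64,
    );
    out
}
```

Each family is written as one unit, so a HELP or TYPE line never appears without its sample.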

5f122ca feat: add phase 2 metrics — transport, DHT latency, replication

Extend the metrics pipeline with ~25 new metric families:

- Handshake latency percentiles (PQ key exchange timing)
- Separate DHT put/get latency percentiles (p50/p95/p99)
- Operations per second (derived from total ops / uptime)
- Extended transport stats: connection success/failure counts,
  byte counters, NAT traversal success rate, connection pool size
- Connection failure breakdown by reason (labeled counter)
- Replication timing: repair cycle duration, keys repaired,
  bytes transferred, grace period expiry tracking

Update saorsa-core dependency to feat-metrics_phase2 branch which
provides new MetricEvent variants (ConnectionEstablished,
ConnectionFailed, HandshakeCompleted, ReplicationStarted,
ReplicationCompleted, GracePeriodExpired) and extended TransportStats.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
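
Two of the items above lend themselves to a short sketch: the derived operations-per-second value (total ops divided by uptime) and a counter labeled by failure reason. The function names and the metric name `p2p_connection_failures_total` are assumptions for illustration; the PR does not show how these are actually named or computed.

```rust
use std::collections::BTreeMap;
use std::fmt::Write;
use std::time::Duration;

/// Average operations per second since start; 0.0 while uptime is still zero.
pub fn operations_per_second(total_ops: u64, uptime: Duration) -> f64 {
    let secs = uptime.as_secs_f64();
    if secs > 0.0 {
        total_ops as f64 / secs
    } else {
        0.0
    }
}

/// Escape a label value for the Prometheus text format (backslash, quote, newline).
fn escape_label(value: &str) -> String {
    value
        .replace('\\', "\\\\")
        .replace('"', "\\\"")
        .replace('\n', "\\n")
}

/// Render a counter with one sample per failure reason (hypothetical metric name).
pub fn render_failures(by_reason: &BTreeMap<String, u64>) -> String {
    let mut out = String::from("# TYPE p2p_connection_failures_total counter\n");
    for (reason, count) in by_reason {
        let _ = writeln!(
            out,
            "p2p_connection_failures_total{{reason=\"{}\"}} {count}",
            escape_label(reason)
        );
    }
    out
}
```

Because the rate is an average over the whole uptime, it reacts slowly to bursts; a windowed rate would be an alternative design.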

b885e26 fix: subscribe to metric events before starting P2P node

Move start_metric_event_loop() before p2p_node.start() so that
connection and handshake MetricEvents emitted during startup are
captured by the aggregator instead of being lost.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
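
The ordering issue fixed by the commit above can be shown with a toy broadcaster, in which events reach only the receivers that subscribed before the event was emitted. `Broadcaster` and `events_seen` are hypothetical stand-ins built on `std::sync::mpsc`; they are not saorsa-core's channel, which the PR does not show.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Toy broadcast channel: an event is delivered only to existing subscribers.
#[derive(Default)]
pub struct Broadcaster {
    subscribers: Vec<Sender<&'static str>>,
}

impl Broadcaster {
    pub fn subscribe(&mut self) -> Receiver<&'static str> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    pub fn emit(&self, event: &'static str) {
        for tx in &self.subscribers {
            let _ = tx.send(event); // a dropped receiver is not an error here
        }
    }
}

/// Returns how many startup events a subscriber observes.
pub fn events_seen(subscribe_first: bool) -> usize {
    let mut bus = Broadcaster::default();
    let early = if subscribe_first { Some(bus.subscribe()) } else { None };
    // Events emitted while the node is "starting".
    bus.emit("ConnectionEstablished");
    bus.emit("HandshakeCompleted");
    // A late subscriber joins only after startup has finished.
    let rx = early.unwrap_or_else(|| bus.subscribe());
    rx.try_iter().count()
}
```

Subscribing first observes both startup events, while subscribing afterwards observes none of them.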

e8fa264 chore: track Cargo.lock for reproducible builds

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

b28cfe9 fix: adapt to saorsa-core MetricEvent API changes

- StrategyStats.name replaced with strategy: StrategyChoice enum;
  format via Debug when rendering Prometheus labels
- HandshakeCompleted.duration and ConnectionEstablished.duration are
  now Option<Duration>; guard with if-let before recording latency
- Add match arm for new ConnectionLost metric event variant
- Update test fixtures and assertions accordingly

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
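
The two adaptations listed above (formatting an enum via `Debug` for a label, and guarding an `Option<Duration>` with `if let`) might look roughly like this. The enum variants and function names are invented for the example; the real `StrategyChoice` variants are not shown in the PR.

```rust
use std::time::Duration;

/// Stand-in for the routing strategy enum; these variant names are hypothetical.
#[derive(Debug, Clone, Copy)]
pub enum StrategyChoice {
    Kademlia,
    Hyperbolic,
}

/// Use the `Debug` representation as the Prometheus label value.
pub fn strategy_label(strategy: StrategyChoice) -> String {
    format!("{strategy:?}")
}

/// Record a latency sample only when the event actually carries a duration.
pub fn record_handshake(samples: &mut Vec<Duration>, duration: Option<Duration>) {
    if let Some(d) = duration {
        samples.push(d);
    }
}
```

Skipping `None` keeps missing durations from being recorded as zero-latency samples, which would skew the percentiles downward.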
Copilot AI review requested due to automatic review settings March 16, 2026 16:37

Copilot AI left a comment

Pull request overview

This PR adds a Prometheus-compatible metrics/health HTTP server to saorsa-node, plumbing saorsa-core’s metric event stream into an in-process aggregator and formatting both event-driven and snapshot-derived state into Prometheus text exposition.

Changes:

  • Add an Axum HTTP server exposing /health, /ready, /metrics, and /debug/vars, and subscribe to saorsa-core MetricEvents during node startup.
  • Introduce a metrics pipeline: MetricsAggregator (event-driven), SnapshotCollector (pull-based), and PrometheusFormatter (text renderer).
  • Move metrics config to top-level NodeConfig (metrics_port, metrics_host) and update CLI + bootstrap peer conversion for updated saorsa-core config types.

Reviewed changes

Copilot reviewed 9 out of 13 changed files in this pull request and generated 6 comments.

File Description
src/node.rs Wires up health manager, metric event loop, and Axum metrics/health server handlers.
src/metrics/aggregator.rs Implements atomic counters + bounded sliding windows for high-frequency metric events.
src/metrics/snapshot.rs Adds pull-based snapshot collection for transport/DHT/security/trust/placement state.
src/metrics/prometheus.rs Renders aggregated + snapshot metrics into Prometheus text exposition (+ tests).
src/metrics/mod.rs Exposes the new metrics modules.
src/lib.rs Exports the new metrics module.
src/config.rs Adds metrics_port/metrics_host to NodeConfig and migrates deprecated payment.metrics_port.
src/bin/saorsa-node/cli.rs Adds CLI flags for metrics host/port and maps them into NodeConfig.
src/bin/saorsa-cli/main.rs Adapts bootstrap peer config to MultiAddr.
src/devnet.rs Adapts bootstrap peer config to MultiAddr.
Cargo.toml Switches saorsa-core to a git branch and adds axum.
.gitignore Stops ignoring Cargo.lock so it can be committed.

@greptile-apps

greptile-apps Bot commented Mar 16, 2026

Greptile Summary

Adds a Prometheus-compatible metrics HTTP server to saorsa-node, wiring up saorsa-core's MetricEvent broadcast channel to an event-driven MetricsAggregator (counters + sliding windows) and a pull-based SnapshotCollector, formatted via PrometheusFormatter into 119 metrics across 14 categories. The server is served via Axum on a configurable port (default 9100) with /health, /ready, /metrics, and /debug/vars endpoints.

  • New modules: src/metrics/aggregator.rs (event-driven accumulator), src/metrics/prometheus.rs (text exposition formatter), src/metrics/snapshot.rs (pull-based state collector)
  • Config migration: metrics_port moved from PaymentConfig to top-level NodeConfig with backward-compatible deprecation and auto-migration
  • API adaptation: bootstrap_peers converted from SocketAddr to MultiAddr across CLI, devnet, and node to match saorsa-core API changes
  • Dependency concern: saorsa-core currently points to a personal fork branch (jacderida/saorsa-core#feat-metrics_phase2) — must be updated to the organization repo once the companion PR (saorsa-core#46) merges
  • Snapshot collectors are standalone: DhtMetricsCollector, SecurityMetricsCollector, TrustMetricsCollector, and PlacementMetricsCollector are freshly instantiated and not shared with saorsa-core's internals, so snapshot-based metrics will report defaults until these collectors are wired to the actual DHT layer

Confidence Score: 3/5

  • Safe to merge for event-driven metrics, but snapshot metrics will report zero values and the saorsa-core dependency must be updated before release.
  • Score reflects two concerns: (1) standalone snapshot collectors that won't receive data from saorsa-core's internal DHT/security layers, meaning a significant subset of the 119 metrics will always be zero, and (2) the saorsa-core dependency pointing to a personal fork branch. The event-driven metrics pipeline (aggregator, Prometheus formatter, HTTP server) is well-implemented with good test coverage.
  • Pay close attention to Cargo.toml (personal fork dependency) and src/node.rs (standalone collectors, premature log message).

Important Files Changed

Filename Overview
Cargo.toml Adds axum = "0.8" for HTTP server and switches saorsa-core to a personal fork branch — should be updated before merge.
src/config.rs Moves metrics_port/metrics_host to NodeConfig with backward-compatible deprecation migration. Well-tested with 4 new tests.
src/metrics/aggregator.rs Event-driven metrics accumulator with bounded sliding windows and atomic counters. Well-structured with comprehensive test coverage (14 tests). Uses Ordering::Relaxed appropriately for counters.
src/metrics/prometheus.rs 1634-line Prometheus text exposition formatter covering 119 metrics across 14 categories. Follows spec correctly with HELP/TYPE/value ordering. Comprehensive tests verify no orphaned headers.
src/metrics/snapshot.rs Pull-based snapshot collector for saorsa-core state. Clean design but depends on standalone metric collector instances that won't receive data from saorsa-core's internals.
src/node.rs Core integration: wires up Axum HTTP server, metric event loop, health manager, and snapshot collector. Has premature log message and uses standalone collector instances that won't be populated.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph "saorsa-core"
        ME[MetricEvent Channel]
        PE[P2PEvent Channel]
        PN[P2PNode]
        TE[EigenTrustEngine]
    end

    subgraph "saorsa-node metrics pipeline"
        MA[MetricsAggregator]
        SC[SnapshotCollector]
        PF[PrometheusFormatter]
    end

    subgraph "HTTP Server (Axum)"
        HE["/health"]
        RE["/ready"]
        MET["/metrics"]
        DV["/debug/vars"]
    end

    HM[HealthManager]

    ME -->|"subscribe_metric_events()"| MA
    PE -->|"PeerConnected/Disconnected"| MA
    PN -->|"transport_stats()"| SC
    TE -->|"cached_global_trust()"| SC

    MA -->|"counters + sliding windows"| PF
    SC -->|"point-in-time snapshot"| PF
    HM -->|"component health"| HE
    HM -->|"readiness status"| RE
    PF -->|"text exposition"| MET
    HM -->|"PrometheusExporter"| MET
    HM -->|"debug info"| DV
This is a comment left during a code review.
Path: src/node.rs
Line: 837

Comment:
**Misleading log before bind completes**

This `info!` runs synchronously immediately after `tokio::spawn`, before the spawned task has actually called `TcpListener::bind`. If the bind fails (port already in use, permission denied), the log will still say "listening on ..." even though the server never started. Consider moving this log inside the spawned task, after the bind succeeds, to avoid misleading operators.

```suggestion
        // Log is emitted inside the spawned task after successful bind.
```

How can I resolve this? If you propose a fix, please make it concise.
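
One way to address the comment above is to bind first and log only once the bind has succeeded. The sketch below uses the blocking `std::net::TcpListener` and `println!` as stand-ins for the async listener and the `info!` macro used in the PR; the function name is hypothetical.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener};

/// Bind first, and report the address only after the bind has succeeded.
pub fn bind_metrics_listener(addr: SocketAddr) -> io::Result<(TcpListener, SocketAddr)> {
    // Returns early with an error if the port is in use or not permitted.
    let listener = TcpListener::bind(addr)?;
    let local = listener.local_addr()?;
    // Stand-in for `info!`: reached only when the socket is really listening.
    println!("metrics server listening on {local}");
    Ok((listener, local))
}
```

In the async version, the same ordering would apply inside the spawned task, so a failed bind produces an error log instead of a misleading "listening" message.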

---

This is a comment left during a code review.
Path: src/node.rs
Line: 478-500

Comment:
**Standalone collectors will report zero values**

`build_snapshot_collector` creates new, empty instances of `DhtMetricsCollector`, `TrustMetricsCollector`, `PlacementMetricsCollector`, and `SecurityMetricsCollector`. Since these are not the same `Arc`s used internally by saorsa-core's DHT/security layers, they will never be populated and will always report default (zero) values for routing table, replication, security, trust, and placement snapshot metrics.

Only `transport_stats()` (line 418 in `snapshot.rs`) and `trust_engine()` (line 490 here) go through the actual `P2PNode`, so those work correctly.

Is this intentional as a placeholder until saorsa-core exposes its internal collectors, or is there a way to get the actual collector instances from `P2PNode`?

How can I resolve this? If you propose a fix, please make it concise.
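
The difference the comment above describes can be reproduced in a few lines: a handle cloned from the same `Arc` sees writes, whereas a freshly constructed instance never does. The collector body here is a hypothetical stand-in with a single field; only the type name comes from the review.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Stand-in collector with a single gauge (the real type has more state).
#[derive(Default)]
pub struct DhtMetricsCollector {
    routing_table_size: AtomicU64,
}

impl DhtMetricsCollector {
    pub fn set_routing_table_size(&self, n: u64) {
        self.routing_table_size.store(n, Ordering::Relaxed);
    }

    pub fn routing_table_size(&self) -> u64 {
        self.routing_table_size.load(Ordering::Relaxed)
    }
}

/// Returns (value seen via a shared handle, value seen via a standalone instance).
pub fn shared_vs_standalone() -> (u64, u64) {
    let core_collector = Arc::new(DhtMetricsCollector::default()); // written by the DHT layer
    let shared = Arc::clone(&core_collector); // same allocation, handed to the snapshot path
    let standalone = Arc::new(DhtMetricsCollector::default()); // fresh instance, never written
    core_collector.set_routing_table_size(42);
    (shared.routing_table_size(), standalone.routing_table_size())
}
```

The shared handle reports 42 and the standalone instance reports 0, which matches the "always zero" behavior the reviewer predicted.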

---

This is a comment left during a code review.
Path: Cargo.toml
Line: 35

Comment:
**Dependency on personal fork branch**

`saorsa-core` is pinned to a personal fork (`jacderida/saorsa-core`) on a feature branch. The PR description references a companion PR (saorsa-labs/saorsa-core#46) — once that lands, this should be updated to point to the organization repo with a versioned release or at minimum `saorsa-labs/saorsa-core` on `main`. Merging with a personal fork reference would break builds if the branch is deleted or force-pushed.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: b28cfe9

jacderida and others added 2 commits March 16, 2026 17:42
- Fix all clippy pedantic/nursery lints without blanket #[allow]:
  - Replace u128-as-u64 casts with u64::try_from().unwrap_or()
  - Extract helper functions (ratio, percentile_index, duration_to_micros)
  - Drop RwLock guards early to fix significant_drop_tightening
  - Split long functions (security, trust, placement, transport metrics)
  - Use #[expect] for unavoidable u64->f64 precision-loss casts
  - Fix format! in format_args, pass StreamClass by value, use Option<&T>
  - Combine identical ConnectionEstablished/ConnectionLost match arms
- Fix formatting issues (cargo fmt)
- Fix e2e test compilation: convert Vec<SocketAddr> to Vec<MultiAddr>
- Address review comments:
  - Move metrics server log inside spawned task after bind succeeds
  - Fix JSON error responses with serde_json::json! for proper escaping
  - Only start metric event loop when metrics are enabled

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
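
The helper extraction mentioned above might look like the following. The helper names `ratio` and `duration_to_micros` come from the commit message, but their bodies are guesses: in particular, saturating to `u64::MAX` and returning 0.0 for a zero denominator are assumptions, since the commit does not state which fallback values are used.

```rust
use std::time::Duration;

/// Convert a duration to whole microseconds without a lossy `as` cast.
/// Assumption: values that do not fit in u64 saturate to `u64::MAX`.
pub fn duration_to_micros(d: Duration) -> u64 {
    u64::try_from(d.as_micros()).unwrap_or(u64::MAX)
}

/// Fraction `numerator / denominator`, defined as 0.0 when the denominator is zero.
pub fn ratio(numerator: u64, denominator: u64) -> f64 {
    if denominator == 0 {
        0.0
    } else {
        numerator as f64 / denominator as f64
    }
}
```

`Duration::as_micros` returns a `u128`, which is why the checked conversion is needed at all.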
Use the same Arc collector instances that saorsa-core's DHT layer
writes to, instead of creating standalone empty instances. This means
snapshot-based metrics (DHT health, security, trust, placement) now
reflect live data rather than always reporting zeros.

Falls back to standalone instances only when security_dashboard is
None (minimal test configurations).

Depends on saorsa-core aaad76c (SecurityDashboard accessor methods).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 16, 2026 18:21

Copilot AI left a comment

Pull request overview

This PR wires saorsa-core’s metric event channel into an HTTP endpoint that exposes Prometheus-compatible metrics, combining event-driven aggregation with per-scrape snapshots and health endpoints.

Changes:

  • Add metrics pipeline modules (MetricsAggregator, SnapshotCollector, PrometheusFormatter) and expose them via saorsa_node::metrics.
  • Add an Axum-based health/metrics server to RunningNode, including /metrics, /health, /ready, and /debug/vars.
  • Update configs/CLI/tests to support metrics_host/metrics_port and adapt bootstrap peer types to saorsa-core’s MultiAddr.

Reviewed changes

Copilot reviewed 11 out of 15 changed files in this pull request and generated 2 comments.

File Description
tests/e2e/testnet.rs Convert bootstrap peers to MultiAddr for e2e config compatibility.
tests/e2e/live_testnet.rs Convert bootstrap peers to MultiAddr in live testnet client.
src/node.rs Integrate health manager, metrics aggregation, snapshot collection, and Axum server into node lifecycle.
src/metrics/snapshot.rs Add pull-based snapshot collector for saorsa-core state.
src/metrics/prometheus.rs Add Prometheus text exposition renderer and unit tests.
src/metrics/mod.rs New metrics module wiring + public re-exports.
src/metrics/aggregator.rs Add event-driven metrics accumulator with counters and sliding windows.
src/lib.rs Export the new metrics module.
src/devnet.rs Convert bootstrap peers to MultiAddr in devnet config.
src/config.rs Add top-level metrics_port/metrics_host + deprecated migration from payment.metrics_port.
src/bin/saorsa-node/cli.rs Add CLI args for metrics host/port and map into NodeConfig.
src/bin/saorsa-cli/main.rs Convert bootstrap peers to MultiAddr in client node construction.
Cargo.toml Switch saorsa-core to git dependency and add axum.
.gitignore Stop ignoring Cargo.lock so it can be committed.
Comments suppressed due to low confidence (1)

src/node.rs:865

  • connected_peers is only updated inside start_protocol_routing(), but that function returns early when ant_protocol is None. This means the p2p_connected_peers gauge (and peer connect/disconnect tracking) will stay at 0 for configs where storage/chunk protocol routing is disabled (e.g., client nodes). Consider subscribing to P2PEvents for peer connect/disconnect in a separate task that always runs when metrics are enabled, or restructure start_protocol_routing() to always drain events and only gate the chunk-message handling on ant_protocol being present.
    fn start_protocol_routing(&mut self) {
        let protocol = match self.ant_protocol {
            Some(ref p) => Arc::clone(p),
            None => return,
        };

jacderida and others added 2 commits March 16, 2026 19:46
Update doc to accurately describe emission behavior: counters and gauges
are always emitted (even when zero) for stable metric names, while
optional families like stream bandwidth are conditional.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
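
The emission rule described in the commit above (counters always present, optional families conditional) can be sketched as follows. `p2p_auth_failures_total` appears in the PR's testnet table; the stream bandwidth metric name and the function signature are assumptions for this example.

```rust
use std::fmt::Write;

/// Render one always-present counter and one optional gauge.
pub fn render(auth_failures: u64, stream_bandwidth_bps: Option<f64>) -> String {
    let mut out = String::new();
    // Always emitted, even at zero, so the metric name stays stable.
    let _ = writeln!(out, "# TYPE p2p_auth_failures_total counter");
    let _ = writeln!(out, "p2p_auth_failures_total {auth_failures}");
    // Optional family: header and sample are written together or not at all,
    // which avoids a TYPE line with no sample.
    if let Some(bps) = stream_bandwidth_bps {
        let _ = writeln!(out, "# TYPE p2p_stream_bandwidth_bytes_per_second gauge");
        let _ = writeln!(out, "p2p_stream_bandwidth_bytes_per_second {bps}");
    }
    out
}
```

Emitting zero-valued counters means dashboards and alert rules can reference the series before the first event occurs.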
- Set allow_loopback = true in e2e test node config. saorsa-core 0.15.0
  defaults to rejecting loopback addresses, which caused all e2e tests
  to fail with "No remote peers found near target address" since every
  test node runs on 127.0.0.1.
- Update prometheus module doc to accurately describe emission behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 16, 2026 21:46

Copilot AI left a comment

Pull request overview

Adds a Prometheus-compatible metrics/health HTTP server to saorsa-node, wiring saorsa-core’s MetricEvent stream plus periodic state snapshots into a single /metrics endpoint for scraping/monitoring.

Changes:

  • Introduce a metrics pipeline: MetricsAggregator (event-driven), SnapshotCollector (pull-based), and PrometheusFormatter (text exposition).
  • Add an Axum-based HTTP server exposing /health, /ready, /metrics, and /debug/vars, controlled via metrics_host/metrics_port (0 disables).
  • Update config/CLI and e2e/devnet/bootstrap wiring to adapt to saorsa-core API changes (SocketAddr → MultiAddr, loopback allowance).

Reviewed changes

Copilot reviewed 11 out of 15 changed files in this pull request and generated 2 comments.

File Description
tests/e2e/testnet.rs Allow loopback + convert bootstrap addrs to MultiAddr for core config.
tests/e2e/live_testnet.rs Convert bootstrap addrs to MultiAddr.
src/node.rs Wire health manager, metrics tasks, and Axum metrics/health server into node runtime.
src/metrics/aggregator.rs New event-driven aggregator with counters + bounded sliding windows.
src/metrics/snapshot.rs New pull-based snapshot collector from saorsa-core accessors.
src/metrics/prometheus.rs New Prometheus text formatter + unit tests for emitted output.
src/metrics/mod.rs Export metrics module types.
src/lib.rs Expose metrics module publicly.
src/devnet.rs Convert devnet bootstrap addrs to MultiAddr.
src/config.rs Add metrics_host/metrics_port to NodeConfig and migrate deprecated payment.metrics_port.
src/bin/saorsa-node/cli.rs Add --metrics-host flag and map CLI metrics settings onto NodeConfig.
src/bin/saorsa-cli/main.rs Convert bootstrap addrs to MultiAddr for client node creation.
Cargo.toml Add axum and switch saorsa-core to a git branch dependency.
.gitignore Stop ignoring Cargo.lock (lockfile now tracked).
Comments suppressed due to low confidence (1)

src/node.rs:865

  • p2p_connected_peers is driven by MetricsAggregator::record_peer_connected/disconnected(), but those updates only happen inside start_protocol_routing(). Since start_protocol_routing() returns early when ant_protocol is None, nodes running without storage/protocol routing will always report p2p_connected_peers = 0 even when connected. Consider tracking peer connect/disconnect in a dedicated P2P event task (independent of ant_protocol), or updating connected_peers from MetricEvent::ConnectionEstablished/ConnectionLost and ensuring the two sources don’t double-count.
    /// Also tracks peer connect/disconnect events in the metrics aggregator.
    fn start_protocol_routing(&mut self) {
        let protocol = match self.ant_protocol {
            Some(ref p) => Arc::clone(p),
            None => return,
        };
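
One of the options suggested in the comment above is to always drain events and gate only the message handling on whether a protocol handler exists. A simplified, synchronous sketch of that shape follows; the event enum, `PeerStats`, and `drain_events` are hypothetical and much smaller than the real types.

```rust
/// Simplified stand-in for the P2P event stream.
pub enum P2PEvent {
    PeerConnected,
    PeerDisconnected,
    Message(Vec<u8>),
}

#[derive(Debug, Default, PartialEq)]
pub struct PeerStats {
    pub connected_peers: u64,
    pub messages_handled: u64,
}

/// Always drain events so peer tracking works; handle messages only with a protocol.
pub fn drain_events(events: Vec<P2PEvent>, has_protocol: bool) -> PeerStats {
    let mut stats = PeerStats::default();
    for event in events {
        match event {
            P2PEvent::PeerConnected => stats.connected_peers += 1,
            P2PEvent::PeerDisconnected => {
                stats.connected_peers = stats.connected_peers.saturating_sub(1);
            }
            P2PEvent::Message(_) if has_protocol => stats.messages_handled += 1,
            // No protocol handler: drop the message but keep draining.
            P2PEvent::Message(_) => {}
        }
    }
    stats
}
```

With this structure the connected-peers gauge is updated for client nodes too, instead of staying at 0 when the function would otherwise return early.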

Comment thread src/metrics/prometheus.rs
Comment on lines +25 to +27
/// Formats aggregated + snapshot metrics into Prometheus text exposition format.
pub struct PrometheusFormatter;

The test_payment_with_node_failures test shuts down 3 of 10 nodes then
tries to store a chunk. On Windows with saorsa-core 0.15.0, DHT routing
table convergence after node failures is slower due to loopback/diversity
changes. Increase the wait-after-failure and post-warmup sleeps from 15s
to 30s to give the routing tables time to adapt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jacderida
Collaborator Author

Metrics will be explored in a more experimental way.

@jacderida jacderida closed this Mar 21, 2026
mickvandijke added a commit that referenced this pull request Apr 1, 2026
Complete the Section 18 test matrix with the remaining scenarios:

- #3: Fresh replication stores chunk + updates PaidForList on remote nodes
- #9: Fetch retry rotates to alternate source
- #10: Fetch retry exhaustion with single source
- #11: Repeated ApplicationFailure events decrease peer trust score
- #12: Bootstrap node discovers keys stored on multiple peers
- #14: Hint construction covers all locally stored keys
- #15: Data and PaidForList survive node shutdown (partition)
- #17: Neighbor sync request returns valid response (admission test)
- #21: Paid-list majority confirmed from multiple peers via verification
- #24: PaidNotify propagates paid-list entries after fresh replication
- #25: Paid-list convergence verified via majority peer queries
- #44: PaidForList persists across restart (cold-start recovery)
- #45: PaidForList lost in fresh directory (unrecoverable scenario)

All 56 Section 18 scenarios now have test coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>