DevOps: Add contract and backend monitoring with alerting #306

## Summary
No monitoring or alerting is configured for the Trivela system on mainnet. Contract pause events, backend downtime, Soroban RPC degradation, and abnormal error rates go undetected until users report them. A mainnet platform needs proactive monitoring.

## Problem
- Backend `/health` and `/metrics` endpoints exist but nothing consumes them with alerting rules
- No alerting on contract `paused` events, backend 5xx spikes, RPC health check failures, or campaign DB write errors
- No uptime monitoring (e.g. UptimeRobot, BetterStack) configured
- No runbook for common failure scenarios

## Acceptance Criteria
### Prometheus/Grafana (self-hosted option)
- [ ] Add `prometheus.yml` scrape config targeting the backend `/metrics` endpoint
- [ ] Add `alerting_rules.yml` with alerts for:
  - Backend error rate > 5% over 5 min
  - RPC health status `degraded` for > 2 min
  - Process uptime reset (restart detected)
- [ ] Add Grafana dashboard JSON (`monitoring/dashboards/trivela.json`) with panels for request rate, error rate, uptime, and route breakdown
- [ ] Add a `monitoring/` directory with a compose override (`compose.monitoring.yml`) for running Prometheus + Grafana locally
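
As a starting point, the scrape config and the three alerts listed above could look roughly like the sketch below. Everything specific here is an assumption to be checked against the real backend: the `backend:3001` target, the `http_requests_total` counter with a `status` label, the `trivela_rpc_health_status` gauge, and the presence of the standard `process_start_time_seconds` metric on `/metrics`.

```yaml
# prometheus.yml (sketch; target host and port are assumptions)
global:
  scrape_interval: 15s

rule_files:
  # Resolved relative to the directory containing this file.
  - alerting_rules.yml

scrape_configs:
  - job_name: trivela-backend
    metrics_path: /metrics
    static_configs:
      - targets: ["backend:3001"]  # hypothetical compose service name and port
```

```yaml
# alerting_rules.yml (sketch; metric names are hypothetical)
groups:
  - name: trivela-backend
    rules:
      # Share of 5xx responses above 5%, measured over a 5-minute window.
      - alert: BackendHighErrorRate
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) > 0.05
        labels:
          severity: critical
        annotations:
          summary: Backend 5xx rate above 5% over 5 minutes

      # Assumes a gauge that is 1 while the RPC health check reports "degraded".
      - alert: RpcDegraded
        expr: trivela_rpc_health_status{status="degraded"} == 1
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: Soroban RPC health degraded for more than 2 minutes

      # A change in process start time means the process restarted.
      - alert: BackendRestarted
        expr: changes(process_start_time_seconds{job="trivela-backend"}[10m]) > 0
        labels:
          severity: warning
        annotations:
          summary: Backend process restart detected
```

If the backend exposes an uptime gauge rather than a start timestamp, the restart rule could use `resets()` on that gauge instead.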

### Soroban Event Monitoring
- [ ] Add an alert that fires when a contract `paused` event is indexed (by the indexer from issue #283)
- [ ] Add an alert that fires when the RPC returns only errors for > 60s
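
One way to express these two alerts is as Prometheus rules, assuming the indexer and backend export counters for them. The metric names `trivela_contract_paused_events_total` and `trivela_rpc_requests_total` (with an `outcome` label) are hypothetical and do not come from the existing code.

```yaml
# Sketch of Soroban-related alert rules; all metric names are hypothetical.
groups:
  - name: trivela-soroban
    rules:
      # Fires when the indexer has counted a new `paused` event in the last 5 minutes.
      # The counter should be initialised to 0 at startup so the first event
      # registers as an increase.
      - alert: ContractPaused
        expr: increase(trivela_contract_paused_events_total[5m]) > 0
        labels:
          severity: critical
        annotations:
          summary: Contract paused event indexed

      # Errors are occurring and no RPC call has succeeded, sustained for 60s.
      - alert: SorobanRpcConsecutiveErrors
        expr: |
          sum(rate(trivela_rpc_requests_total{outcome="error"}[1m])) > 0
          unless
          sum(rate(trivela_rpc_requests_total{outcome="success"}[1m])) > 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: Soroban RPC returning only errors for more than 60s
```

Using `unless` rather than `and ... == 0` keeps the second rule firing even when no success series exists yet.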

### Runbook
- [ ] Add `docs/RUNBOOK.md` with procedures for: backend restart, RPC failover, contract pause response, DB backup restore

## References
- `backend/src/index.js` — `/metrics` endpoint (Prometheus format)
- `compose.yaml`
- `docs/ARCHITECTURE_OVERVIEW.md`
