Add Prometheus + Grafana monitoring stack #1437
Set up docker-compose services for Prometheus and Grafana to visualize gRPC metrics from the /metrics endpoint. Includes a pre-provisioned dashboard with request rate, error rate, latency percentiles, heatmap, and error ratio panels.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough
Adds a Docker Compose configuration that integrates Prometheus and Grafana, a Prometheus scrape configuration, Grafana provisioning (data source and dashboards), and the "StationAPI gRPC Metrics" dashboard JSON, so that gRPC metrics can be visualized locally.
Sequence diagram
sequenceDiagram
participant Dev as Developer (docker compose)
participant Compose as Docker Compose
participant API as StationAPI (gRPC)
participant Prom as Prometheus
participant Graf as Grafana
Dev->>Compose: docker compose up -d api prometheus grafana
Compose->>API: Start (METRICS_HOST=0.0.0.0)
API-->>Prom: Expose /metrics on port 50052
Prom->>API: Periodic scrape (api:50052/metrics)
Compose->>Graf: Load provisioning (datasource, dashboards)
Graf->>Prom: Query (Prometheus HTTP API)
Dev->>Graf: View dashboard in browser
Graf->>Dev: Visualization (StationAPI gRPC Metrics)
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Actionable comments posted: 2
🧹 Nitpick comments (1)
compose.yml (1)
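36-46: Using the :latest tag invites breaking changes on future major-version updates, so pinning to a specific version is recommended.
prom/prometheus:latest on line 36 and grafana/grafana:latest on line 45 currently point to v3 and v12.4 respectively, but they carry the following risks:
- Prometheus: when the major version moves to v4, breaking changes may be introduced. In addition, latest pointed to v2.53.5 at one point in the past, which can undermine the reproducibility of the environment.
- Grafana: a version update runs automatic migrations at startup (such as DB schema migrations), which can cause unexpected behavior changes.
Pinning the versions, for example via environment variables such as ${PROMETHEUS_VERSION} and ${GRAFANA_VERSION}, improves the reproducibility and controllability of the development environment.
🔧 Proposed diff
- image: prom/prometheus:latest
+ image: prom/prometheus:v3.10.0
- image: grafana/grafana:latest
+ image: grafana/grafana:12.4
Or, to control the versions through environment variables:
- image: prom/prometheus:latest
+ image: prom/prometheus:${PROMETHEUS_VERSION:-v3.10.0}
- image: grafana/grafana:latest
+ image: grafana/grafana:${GRAFANA_VERSION:-12.4}
🤖 Prompt for AI Agents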
Verify each finding against the current code and only fix it if needed. In `@compose.yml` around lines 36 - 46, Replace the floating :latest image tags with pinned versions or environment variables to ensure reproducible deployments: change prom/prometheus:latest and grafana/grafana:latest to either explicit versions (e.g. prom/prometheus:v3.x.x, grafana/grafana:12.4.x) or use env vars like ${PROMETHEUS_VERSION} and ${GRAFANA_VERSION} and reference those in the image fields so the Compose file uses fixed, controllable versions instead of :latest.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docker/grafana/dashboards/stationapi-grpc.json`:
- Line 207: The current PromQL expression using grpc_requests_total divides
directly and produces NaN/Inf when the denominator is zero; compute the
numerator (sum of rate for status="error") and the denominator (sum of rate for
all grpc_requests_total) as separate aggregates with identical label dimensions,
then replace the direct division with a safe expression that yields the ratio
only when the denominator is non‑zero and falls back to zero otherwise (i.e.,
evaluate N and D separately, compute N/D only when D != 0, and return 0 for the
undefined intervals); update the expression that references grpc_requests_total
accordingly so labels align between the two aggregates and the fallback avoids
producing Inf/NaN.
In `@README.md`:
- Around line 57-59: The README's startup command only starts prometheus and
grafana (the line with "docker compose up -d prometheus grafana") which leaves
the api service down and causes scrape targets to be marked down; update the
instruction to either include the api service in the command (e.g., run the same
docker compose command with api added) or explicitly state the prerequisite that
the api must be running before starting prometheus/grafana so users won't
encounter down scrape targets.
---
Nitpick comments:
In `@compose.yml`:
- Around line 36-46: Replace the floating :latest image tags with pinned versions
or environment variables to ensure reproducible deployments: change
prom/prometheus:latest and grafana/grafana:latest to either explicit versions
(e.g. prom/prometheus:v3.x.x, grafana/grafana:12.4.x) or use env vars like
${PROMETHEUS_VERSION} and ${GRAFANA_VERSION} and reference those in the image
fields so the Compose file uses fixed, controllable versions instead of :latest.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 65241a09-7414-48cf-80cf-86e6eb9285a8
📒 Files selected for processing (6)
README.md
compose.yml
docker/grafana/dashboards/stationapi-grpc.json
docker/grafana/provisioning/dashboards/dashboards.yml
docker/grafana/provisioning/datasources/prometheus.yml
docker/prometheus/prometheus.yml
…sions
- Use safe division in Grafana error rate panel to avoid NaN/Inf
- Include api service in README startup command so scrape targets are up
- Pin Prometheus and Grafana image versions for reproducible deployments
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
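For reference, the safe division requested in the review of the error ratio panel could look roughly like the PromQL below. This is a sketch, not the expression actually committed to the dashboard: the metric name grpc_requests_total and the status="error" label come from the review comment, while the 5m rate window and the bare sum() aggregation on both sides are assumptions.

```promql
# Error ratio with a zero-traffic fallback (illustrative sketch only).
# Assumptions: a 5m rate window, and a bare sum() on both sides so that the
# numerator and denominator share the same (empty) label set.
(
    sum(rate(grpc_requests_total{status="error"}[5m]))
  /
    (sum(rate(grpc_requests_total[5m])) > 0)   # drop the sample when there is no traffic
)
or vector(0)                                     # return 0 where the ratio is undefined
```

The `> 0` filter removes the denominator sample when no requests were observed, so the division returns no result instead of NaN or Inf, and `or vector(0)` then supplies 0. If the panel groups by a label (for example `sum by (...)`), the label-less `vector(0)` no longer lines up with each series, so the fallback would need to be built from a series carrying the same labels.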
Summary
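- Made it possible to visualize the gRPC metrics (localhost:50052/metrics)
Details
- compose.yml: added the prometheus and grafana services; added METRICS_HOST and port 50052 to api
- docker/prometheus/prometheus.yml: scrapes api:50052 at a 15-second interval
- docker/grafana/: automatic provisioning configuration for the data source and dashboard
- Grafana is at localhost:3001 (admin/admin); Prometheus is at localhost:9090
Test plan
- Confirm that all services start with docker compose up -d
- Confirm at localhost:9090/targets that Prometheus is scraping the api target successfully
- Log in to Grafana at localhost:3001 and confirm that the "StationAPI gRPC Metrics" dashboard is displayed
🤖 Generated with Claude Code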
Summary by CodeRabbit
Release Notes
New Features
Documentation