From ed3e78d027595a9bf528f2c8156331aa9a0ad453 Mon Sep 17 00:00:00 2001
From: Simon Gurcke
Date: Mon, 13 Apr 2026 14:19:05 +1000
Subject: [PATCH 1/3] Add metrics command

---
 AGENTS.md                                     |   2 +
 README.md                                     |   1 +
 skills/apitally-cli/SKILL.md                  |   1 +
 skills/apitally-cli/references/commands.md    |  62 +++-
 .../apitally-cli/references/duckdb_tables.md  |  32 +-
 src/main.rs                                   | 118 ++++++-
 src/metrics.rs                                | 290 ++++++++++++++++++
 src/reset_db.rs                               |   4 +-
 8 files changed, 505 insertions(+), 5 deletions(-)
 create mode 100644 src/metrics.rs

diff --git a/AGENTS.md b/AGENTS.md
index 38bfcf1..8ad6db4 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -14,6 +14,7 @@ src/
   apps.rs             Apps command (fetch, DB write)
   consumers.rs        Consumers command (paginated fetch, DB write)
   endpoints.rs        Endpoints command (fetch, DB write)
+  metrics.rs          Metrics command (Arrow IPC or NDJSON streaming)
   request_logs.rs     Request logs command (Arrow IPC or NDJSON streaming)
   request_details.rs  Request details command (single request fetch, DB write)
   sql.rs              SQL command (query DuckDB, output NDJSON)
@@ -46,6 +47,7 @@ skills/
 | `apps` | `GET /v1/apps` |
 | `consumers` | `GET /v1/apps/{app_id}/consumers` |
 | `endpoints` | `GET /v1/apps/{app_id}/endpoints` |
+| `metrics` | `POST /v1/apps/{app_id}/metrics` |
 | `request-logs` | `POST /v1/apps/{app_id}/request-logs` |
 | `request-details` | `GET /v1/apps/{app_id}/request-logs/{request_uuid}` |
 | `sql` | Local DuckDB |
diff --git a/README.md b/README.md
index 6d1fda5..9d2f792 100644
--- a/README.md
+++ b/README.md
@@ -79,6 +79,7 @@ You can also set the API key via the `APITALLY_API_KEY` environment variable or
 | `apps` | List all apps in your team |
 | `consumers` | List consumers for an app |
 | `endpoints` | List endpoints for an app |
+| `metrics` | Fetch aggregated metrics for an app |
 | `request-logs` | Fetch request log data for an app |
 | `request-details` | Fetch full details for a specific request |
 | `sql` | Run SQL queries against a local DuckDB database |
diff --git a/skills/apitally-cli/SKILL.md b/skills/apitally-cli/SKILL.md
index 67cad20..8dbe67e 100644
--- a/skills/apitally-cli/SKILL.md
+++ b/skills/apitally-cli/SKILL.md
@@ -38,6 +38,7 @@ All commands are run via `npx @apitally/cli `. For full details, see [r
 - `apps [--db []]` -- list apps (get app IDs)
 - `consumers [--requests-since ] [--db []]` -- list consumers for an app (get consumer IDs)
 - `endpoints [--method ] [--path ] [--db []]` -- list endpoints for an app
+- `metrics --since [--until ] --metrics [--interval ] [--group-by ] [--filters ] [--timezone ] [--db []]` -- fetch aggregated metrics
 - `request-logs --since [--until ] [--fields ] [--filters ] [--limit ] [--db []]` -- fetch request logs (max 1,000,000 rows at once)
 - `request-details [--db []]` -- fetch full details for a single request (including headers, payloads, exception info, application logs, and spans)
 - `sql "" [--db ]` -- run SQL against local DuckDB
diff --git a/skills/apitally-cli/references/commands.md b/skills/apitally-cli/references/commands.md
index 589f6db..025427f 100644
--- a/skills/apitally-cli/references/commands.md
+++ b/skills/apitally-cli/references/commands.md
@@ -81,6 +81,66 @@ Example NDJSON output (without `--db`):
 {"id":2,"method":"GET","path":"/v1/users/{user_id}"}
 ```
 
+## `metrics`
+
+```
+npx @apitally/cli metrics --since --metrics \
+  [--until ] [--interval ] [--group-by ] \
+  [--filters ] [--timezone ] [--db []]
+```
+
+Fetch aggregated metrics for an app. Outputs NDJSON to stdout by default.
+
+- `--since`: Start of time range, inclusive (ISO 8601, required)
+- `--until`: End of time range, exclusive (ISO 8601, defaults to now)
+- `--metrics`: JSON array of metric names to include (required)
+- `--interval`: Time interval for grouping (`month`, `day`, `hour`, `minute`). When omitted, returns a single row per group for the entire time range
+- `--group-by`: JSON array of field names to group by, in addition to time period
+- `--filters`: JSON array of filter objects (see below)
+- `--timezone`: Timezone for intervals and to interpret since/until if not tz-aware (defaults to UTC)
+- `--db`: Write to `metrics` table in DuckDB instead of outputting NDJSON to stdout
+
+### Available metrics
+
+| Metric                | Type    | Description                            |
+| --------------------- | ------- | -------------------------------------- |
+| `requests`            | integer | Total request count                    |
+| `requests_per_minute` | float   | Requests per minute                    |
+| `bytes_received`      | integer | Total bytes received                   |
+| `bytes_sent`          | integer | Total bytes sent                       |
+| `client_errors`       | integer | 4xx errors (excluding expected errors) |
+| `server_errors`       | integer | 5xx errors (excluding expected errors) |
+| `error_rate`          | float   | Ratio of errors to total requests      |
+| `response_time_p50`   | integer | 50th percentile response time (ms)     |
+| `response_time_p75`   | integer | 75th percentile response time (ms)     |
+| `response_time_p95`   | integer | 95th percentile response time (ms)     |
+
+### Group-by fields
+
+`env`, `consumer_id`, `method`, `path`, `status_code`
+
+### Filters
+
+Pass `--filters` as a JSON array of filter objects. Supported fields and operators:
+
+- **string fields** (`env`, `method`, `path`): `eq`, `neq`, `in`, `not_in`, `like`, `not_like`, `contains`, `not_contains`
+- **numeric fields** (`consumer_id`, `status_code`): `eq`, `neq`, `gt`, `gte`, `lt`, `lte`, `in`, `not_in`, `is_null`, `is_not_null`
+
+Filter examples:
+
+```json
+[{"field":"method","op":"eq","value":"GET"}]
+[{"field":"status_code","op":"gte","value":400}]
+[{"field":"path","op":"like","value":"/v1/users/%"}]
+```
+
+Example NDJSON output (without `--db`):
+
+```json
+{"period_start":"2026-01-01T00:00:00Z","period_end":"2026-01-01T01:00:00Z","env":"prod","requests":1234,"error_rate":0.02}
+{"period_start":"2026-01-01T01:00:00Z","period_end":"2026-01-01T02:00:00Z","env":"prod","requests":987,"error_rate":0.01}
+```
+
 ## `request-logs`
 
 ```
@@ -206,7 +266,7 @@ Run a SQL query against a local DuckDB database. The query can be passed as an a
 
 - `--db`: Path to DuckDB database
 
-Available tables: `apps`, `app_envs`, `consumers`, `endpoints`, `request_logs`, `application_logs`, `spans`. See [duckdb_tables.md](duckdb_tables.md) for schemas.
+Available tables: `apps`, `app_envs`, `consumers`, `endpoints`, `metrics`, `request_logs`, `application_logs`, `spans`. See [duckdb_tables.md](duckdb_tables.md) for schemas.
 
 **Important:** The database may contain data from previous sessions. Always filter queries by `app_id`, `timestamp`, and other relevant fields to avoid including unrelated data.
 
diff --git a/skills/apitally-cli/references/duckdb_tables.md b/skills/apitally-cli/references/duckdb_tables.md
index 2d02d93..7e3ca62 100644
--- a/skills/apitally-cli/references/duckdb_tables.md
+++ b/skills/apitally-cli/references/duckdb_tables.md
@@ -1,6 +1,6 @@
 # DuckDB Table Schemas
 
-Tables are created automatically when using the `--db` flag with `apps`, `consumers`, `endpoints`, `request-logs`, or `request-details` commands. DuckDB uses a [PostgreSQL-compatible SQL dialect](https://duckdb.org/docs/stable/sql/dialect/overview).
+Tables are created automatically when using the `--db` flag with `apps`, `consumers`, `endpoints`, `metrics`, `request-logs`, or `request-details` commands. DuckDB uses a [PostgreSQL-compatible SQL dialect](https://duckdb.org/docs/stable/sql/dialect/overview).
 
 ## apps
 
@@ -56,6 +56,33 @@ CREATE TABLE endpoints (
 );
 ```
 
+## metrics
+
+```sql
+CREATE TABLE metrics (
+    app_id INTEGER NOT NULL,
+    period_start TIMESTAMPTZ NOT NULL,
+    period_end TIMESTAMPTZ NOT NULL,
+    env VARCHAR,
+    consumer_id BIGINT,
+    method VARCHAR,
+    path VARCHAR,
+    status_code INTEGER,
+    requests BIGINT,
+    requests_per_minute DOUBLE,
+    bytes_received BIGINT,
+    bytes_sent BIGINT,
+    client_errors BIGINT,
+    server_errors BIGINT,
+    error_rate DOUBLE,
+    response_time_p50 INTEGER, -- milliseconds
+    response_time_p75 INTEGER, -- milliseconds
+    response_time_p95 INTEGER -- milliseconds
+);
+```
+
+Columns are only populated if included in `--metrics` or `--group-by` during fetch. No unique constraint; deduplication is handled by deleting existing rows for the time range being inserted.
+
 ## request_logs
 
 ```sql
@@ -129,9 +156,12 @@ Populated by the `request-details` command when using `--db`.
 ## Relationships
 
 - `request_logs.consumer_id` references `consumers.consumer_id` (join on both `app_id` and `consumer_id`)
+- `metrics.consumer_id` references `consumers.consumer_id` (join on both `app_id` and `consumer_id`, only when metrics are grouped by consumer_id)
 - `endpoints.app_id` references `apps.app_id`
+- `metrics.app_id` references `apps.app_id`
 - `request_logs.app_id` references `apps.app_id`
 - `app_envs.app_id` references `apps.app_id`
 - `request_logs.env` matches `app_envs.name` (string, not a foreign key to `app_env_id`)
+- `metrics.env` matches `app_envs.name` (string, only when metrics are grouped by env)
 - `application_logs.request_uuid` references `request_logs.request_uuid` (join on both `app_id` and `request_uuid`)
 - `spans.request_uuid` references `request_logs.request_uuid` (join on both `app_id` and `request_uuid`)
diff --git a/src/main.rs b/src/main.rs
index 65f4366..e4f9cd1 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -2,6 +2,7 @@ mod apps;
 mod auth;
 mod consumers;
 mod endpoints;
+mod metrics;
 mod request_details;
 mod request_logs;
 mod reset_db;
@@ -117,6 +118,73 @@ enum Command {
         db: Option>,
     },
 
+    /// Retrieve aggregated metrics for an app
+    ///
+    /// Outputs newline-delimited JSON (one object per line).
+    /// With --db, inserts rows into the `metrics` table instead.
+    Metrics {
+        #[command(flatten)]
+        api: ApiArgs,
+
+        /// App ID
+        app_id: i64,
+
+        /// Since date/time (ISO 8601)
+        #[arg(long)]
+        since: String,
+
+        /// Until date/time (ISO 8601, defaults to now)
+        #[arg(long)]
+        until: Option,
+
+        /// JSON array of metric names to include
+        ///
+        /// Available metrics: requests, requests_per_minute, bytes_received,
+        /// bytes_sent, client_errors, server_errors, error_rate,
+        /// response_time_p50, response_time_p75, response_time_p95.
+        #[arg(long)]
+        metrics: String,
+
+        /// Time interval for grouping
+        ///
+        /// Available intervals: month, day, hour, minute.
+        /// When omitted, returns a single row per group for the entire time range.
+        #[arg(long)]
+        interval: Option,
+
+        /// JSON array of field names to group by (in addition to time interval)
+        ///
+        /// Available fields: env, consumer_id, method, path, status_code.
+        #[arg(long)]
+        group_by: Option,
+
+        /// JSON array of filter objects with "field", "op", and "value" keys
+        ///
+        /// Supported fields: env, consumer_id, method, path, status_code.
+        ///
+        /// Supported operators:
+        /// string fields (env, method, path): eq, neq, in, not_in, like, not_like, contains, not_contains
+        /// numeric fields (consumer_id, status_code): eq, neq, gt, gte, lt, lte, in, not_in, is_null, is_not_null
+        ///
+        /// Examples:
+        /// [{"field":"method","op":"eq","value":"GET"}]
+        /// [{"field":"status_code","op":"gte","value":400}]
+        #[arg(long)]
+        filters: Option,
+
+        /// Timezone for intervals and to interpret since/until if not tz-aware
+        ///
+        /// Defaults to UTC. Example: America/New_York.
+        #[arg(long)]
+        timezone: Option,
+
+        /// Store results in DuckDB instead of outputting NDJSON
+        ///
+        /// Defaults to ~/.apitally/data.duckdb if no path is given.
+        #[arg(long, num_args = 0..=1)]
+        db: Option>,
+    },
+
     /// Retrieve request log data for an app
     ///
     /// Outputs newline-delimited JSON (one object per line).
@@ -208,8 +276,8 @@ enum Command {
 
     /// Run a SQL query against local DuckDB
     ///
-    /// Available tables: apps, app_envs, consumers, endpoints, request_logs,
-    /// application_logs, spans.
+    /// Available tables: apps, app_envs, consumers, endpoints, metrics,
+    /// request_logs, application_logs, spans.
     Sql {
         /// SQL query to execute (reads from stdin if omitted)
         query: Option,
@@ -319,6 +387,34 @@ fn run(cli: Cli) -> Result<()> {
                 std::io::stdout().lock(),
             )
         }
+        Command::Metrics {
+            api,
+            app_id,
+            since,
+            until,
+            metrics,
+            interval,
+            group_by,
+            filters,
+            timezone,
+            db,
+        } => {
+            let db = utils::resolve_db(db)?;
+            metrics::run(
+                app_id,
+                &since,
+                until.as_deref(),
+                &metrics,
+                interval.as_deref(),
+                group_by.as_deref(),
+                filters.as_deref(),
+                timezone.as_deref(),
+                db.as_deref(),
+                api.api_key.as_deref(),
+                api.api_base_url.as_deref(),
+                std::io::stdout().lock(),
+            )
+        }
         Command::RequestLogs {
             api,
             app_id,
@@ -399,6 +495,10 @@ mod tests {
         assert!(Cli::try_parse_from(["apitally"]).is_err()); // missing command
         assert!(Cli::try_parse_from(["apitally", "consumers"]).is_err()); // missing app_id
         assert!(Cli::try_parse_from(["apitally", "endpoints"]).is_err()); // missing app_id
+        assert!(Cli::try_parse_from(["apitally", "metrics", "42"]).is_err()); // missing --since and --metrics
+        assert!(
+            Cli::try_parse_from(["apitally", "metrics", "42", "--since", "2025-01-01"]).is_err()
+        ); // missing --metrics
         assert!(Cli::try_parse_from(["apitally", "request-logs", "42"]).is_err()); // missing --since
         assert!(Cli::try_parse_from(["apitally", "request-details", "42"]).is_err()); // missing request_uuid
         assert!(Cli::try_parse_from(["apitally", "sql", "SELECT 1", "--db"]).is_err()); // missing db path
@@ -428,6 +528,20 @@ mod tests {
                 .command,
             Command::Endpoints { app_id: 42, .. }
         ));
+        assert!(matches!(
+            Cli::try_parse_from([
+                "apitally",
+                "metrics",
+                "42",
+                "--since",
+                "2025-01-01",
+                "--metrics",
+                r#"["requests","error_rate"]"#
+            ])
+            .unwrap()
+            .command,
+            Command::Metrics { app_id: 42, .. }
+        ));
         assert!(matches!(
             Cli::try_parse_from(["apitally", "request-logs", "42", "--since", "2025-01-01"])
                 .unwrap()
diff --git a/src/metrics.rs b/src/metrics.rs
new file mode 100644
index 0000000..0752ae3
--- /dev/null
+++ b/src/metrics.rs
@@ -0,0 +1,290 @@
+use std::io;
+use std::path::Path;
+
+use anyhow::Result;
+use duckdb::arrow::ipc::reader::StreamReader;
+use duckdb::vtab::arrow::{ArrowVTab, arrow_recordbatch_to_query_params};
+
+use crate::auth::{resolve_api_base_url, resolve_api_key};
+use crate::utils::{api_post, input_err, open_db};
+
+pub(crate) fn ensure_metrics_table(conn: &duckdb::Connection) -> Result<()> {
+    conn.execute_batch(
+        "CREATE TABLE IF NOT EXISTS metrics (
+            app_id INTEGER NOT NULL,
+            period_start TIMESTAMPTZ NOT NULL,
+            period_end TIMESTAMPTZ NOT NULL,
+            env VARCHAR,
+            consumer_id BIGINT,
+            method VARCHAR,
+            path VARCHAR,
+            status_code INTEGER,
+            requests BIGINT,
+            requests_per_minute DOUBLE,
+            bytes_received BIGINT,
+            bytes_sent BIGINT,
+            client_errors BIGINT,
+            server_errors BIGINT,
+            error_rate DOUBLE,
+            response_time_p50 INTEGER,
+            response_time_p75 INTEGER,
+            response_time_p95 INTEGER
+        )",
+    )?;
+    Ok(())
+}
+
+#[allow(clippy::too_many_arguments)]
+pub fn run(
+    app_id: i64,
+    since: &str,
+    until: Option<&str>,
+    metrics: &str,
+    interval: Option<&str>,
+    group_by: Option<&str>,
+    filters: Option<&str>,
+    timezone: Option<&str>,
+    db: Option<&Path>,
+    api_key: Option<&str>,
+    api_base_url: Option<&str>,
+    mut writer: impl io::Write,
+) -> Result<()> {
+    let api_key = resolve_api_key(api_key)?;
+    let api_base_url = resolve_api_base_url(api_base_url);
+    let db = db.map(|p| open_db(p).map(|c| (p, c))).transpose()?;
+
+    let metrics_value: serde_json::Value = serde_json::from_str(metrics)
+        .map_err(|e| input_err(format!("invalid JSON for --metrics: {e}")))?;
+    let format = if db.is_some() { "arrow" } else { "ndjson" };
+    let mut body = serde_json::json!({
+        "format": format,
+        "since": since,
+        "metrics": metrics_value,
+    });
+    if let Some(until) = until {
+        body["until"] = serde_json::json!(until);
+    }
+    if let Some(interval) = interval {
+        body["interval"] = serde_json::json!(interval);
+    }
+    if let Some(group_by) = group_by {
+        let group_by_value: serde_json::Value = serde_json::from_str(group_by)
+            .map_err(|e| input_err(format!("invalid JSON for --group-by: {e}")))?;
+        body["group_by"] = group_by_value;
+    }
+    if let Some(filters) = filters {
+        let filters_value: serde_json::Value = serde_json::from_str(filters)
+            .map_err(|e| input_err(format!("invalid JSON for --filters: {e}")))?;
+        body["filters"] = filters_value;
+    }
+    if let Some(timezone) = timezone {
+        body["timezone"] = serde_json::json!(timezone);
+    }
+    let url = format!("{api_base_url}/v1/apps/{app_id}/metrics");
+    let response = api_post(&url, &api_key, &body)?;
+
+    if let Some((db_path, conn)) = &db {
+        conn.register_table_function::("arrow")?;
+        ensure_metrics_table(conn)?;
+
+        conn.execute_batch(
+            "CREATE TEMPORARY TABLE metrics_staging AS SELECT * FROM metrics LIMIT 0",
+        )?;
+
+        let reader = StreamReader::try_new(response.into_body().into_reader(), None)?;
+        let col_list = reader
+            .schema()
+            .fields()
+            .iter()
+            .map(|f| f.name().as_str())
+            .collect::>()
+            .join(", ");
+        let insert_sql = format!(
+            "INSERT INTO metrics_staging (app_id, {col_list}) \
+             SELECT {app_id}, {col_list} FROM arrow(?, ?)"
+        );
+
+        const CHUNK_SIZE: usize = 2048; // DuckDB's vector size
+        let mut total = 0usize;
+
+        eprint!(
+            "0 metrics rows written to table 'metrics' in {}...",
+            db_path.display()
+        );
+
+        for batch in reader {
+            let batch = batch?;
+            total += batch.num_rows();
+            for offset in (0..batch.num_rows()).step_by(CHUNK_SIZE) {
+                let chunk = batch.slice(offset, (batch.num_rows() - offset).min(CHUNK_SIZE));
+                let params = arrow_recordbatch_to_query_params(chunk);
+                conn.execute(&insert_sql, params)?;
+            }
+            eprint!(
+                "\r{total} metrics rows written to table 'metrics' in {}...",
+                db_path.display()
+            );
+        }
+
+        // Delete existing rows that overlap with the time range being inserted,
+        // then move staged data into the main table.
+        conn.execute_batch(&format!(
+            "DELETE FROM metrics WHERE app_id = {app_id} \
+             AND period_start >= (SELECT MIN(period_start) FROM metrics_staging) \
+             AND period_end <= (SELECT MAX(period_end) FROM metrics_staging); \
+             INSERT INTO metrics BY NAME SELECT * FROM metrics_staging; \
+             DROP TABLE metrics_staging;"
+        ))?;
+
+        eprintln!("\nDone.");
+    } else {
+        io::copy(&mut response.into_body().into_reader(), &mut writer)?;
+    }
+
+    Ok(())
+}
+
+#[cfg(test)]
+mod tests {
+    use std::sync::Arc;
+
+    use duckdb::arrow::array::{Float64Array, Int64Array, StringArray, TimestampMillisecondArray};
+    use duckdb::arrow::datatypes::{DataType, Field, Schema, TimeUnit};
+    use duckdb::arrow::ipc::writer::StreamWriter;
+    use duckdb::arrow::record_batch::RecordBatch;
+
+    use super::*;
+    use crate::utils::open_db;
+    use crate::utils::test_utils::{parse_ndjson, temp_db};
+
+    fn sample_metrics_ndjson() -> &'static str {
+        "{\"period_start\":\"2025-01-01T00:00:00Z\",\"period_end\":\"2025-01-01T01:00:00Z\",\"method\":\"GET\",\"requests\":100,\"error_rate\":0.05}\n\
+         {\"period_start\":\"2025-01-01T01:00:00Z\",\"period_end\":\"2025-01-01T02:00:00Z\",\"method\":\"POST\",\"requests\":50,\"error_rate\":0.1}\n"
+    }
+
+    fn sample_metrics_arrow_ipc() -> Vec {
+        let schema = Arc::new(Schema::new(vec![
+            Field::new(
+                "period_start",
+                DataType::Timestamp(TimeUnit::Millisecond, Some("UTC".into())),
+                false,
+            ),
+            Field::new(
+                "period_end",
+                DataType::Timestamp(TimeUnit::Millisecond, Some("UTC".into())),
+                false,
+            ),
+            Field::new("method", DataType::Utf8, true),
+            Field::new("requests", DataType::Int64, false),
+            Field::new("error_rate", DataType::Float64, false),
+        ]));
+        let batch = RecordBatch::try_new(
+            schema.clone(),
+            vec![
+                Arc::new(
+                    TimestampMillisecondArray::from(vec![1_735_689_600_000i64])
+                        .with_timezone("UTC"),
+                ),
+                Arc::new(
+                    TimestampMillisecondArray::from(vec![1_735_693_200_000i64])
+                        .with_timezone("UTC"),
+                ),
+                Arc::new(StringArray::from(vec![Some("GET")])),
+                Arc::new(Int64Array::from(vec![100])),
+                Arc::new(Float64Array::from(vec![0.05])),
+            ],
+        )
+        .unwrap();
+
+        let mut buf = Vec::new();
+        let mut writer = StreamWriter::try_new(&mut buf, &schema).unwrap();
+        writer.write(&batch).unwrap();
+        writer.finish().unwrap();
+        buf
+    }
+
+    fn mock_metrics_endpoint(
+        server: &mut mockito::Server,
+        app_id: i64,
+        body: impl AsRef<[u8]>,
+    ) -> mockito::Mock {
+        server
+            .mock("POST", format!("/v1/apps/{app_id}/metrics").as_str())
+            .with_status(200)
+            .with_body(body)
+            .create()
+    }
+
+    #[test]
+    fn test_run_ndjson() {
+        let mut server = mockito::Server::new();
+        let mock = mock_metrics_endpoint(&mut server, 1, sample_metrics_ndjson());
+
+        let mut buf = Vec::new();
+        run(
+            1,
+            "2025-01-01",
+            Some("2025-01-02"),
+            r#"["requests","error_rate"]"#,
+            Some("hour"),
+            Some(r#"["method"]"#),
+            None,
+            None,
+            None,
+            Some("test-key"),
+            Some(&server.url()),
+            &mut buf,
+        )
+        .unwrap();
+        mock.assert();
+
+        let rows = parse_ndjson(&buf);
+        assert_eq!(rows.len(), 2);
+        assert_eq!(rows[0]["method"], "GET");
+        assert_eq!(rows[0]["requests"], 100);
+        assert_eq!(rows[1]["method"], "POST");
+        assert_eq!(rows[1]["requests"], 50);
+    }
+
+    #[test]
+    fn test_run_with_db() {
+        let mut server = mockito::Server::new();
+        let mock = mock_metrics_endpoint(&mut server, 1, sample_metrics_arrow_ipc());
+        let (_dir, db_path) = temp_db();
+
+        run(
+            1,
+            "2025-01-01",
+            None,
+            r#"["requests","error_rate"]"#,
+            Some("hour"),
+            Some(r#"["method"]"#),
+            None,
+            None,
+            Some(&db_path),
+            Some("test-key"),
+            Some(&server.url()),
+            Vec::new(),
+        )
+        .unwrap();
+        mock.assert();
+
+        let conn = open_db(&db_path).unwrap();
+
+        let count: i64 = conn
+            .query_row("SELECT count(*) FROM metrics", [], |row| row.get(0))
+            .unwrap();
+        assert_eq!(count, 1);
+
+        let (method, requests, error_rate): (String, i64, f64) = conn
+            .query_row(
+                "SELECT method, requests, error_rate FROM metrics WHERE app_id = 1",
+                [],
+                |row| Ok((row.get(0)?, row.get(1)?, row.get(2)?)),
+            )
+            .unwrap();
+        assert_eq!(method, "GET");
+        assert_eq!(requests, 100);
+        assert!((error_rate - 0.05).abs() < f64::EPSILON);
+    }
+}
diff --git a/src/reset_db.rs b/src/reset_db.rs
index bd4c6bf..4184034 100644
--- a/src/reset_db.rs
+++ b/src/reset_db.rs
@@ -3,7 +3,7 @@ use std::path::Path;
 use anyhow::Result;
 
 use crate::utils::open_db;
-use crate::{apps, consumers, endpoints, request_details, request_logs};
+use crate::{apps, consumers, endpoints, metrics, request_details, request_logs};
 
 pub fn run(db: &Path) -> Result<()> {
     let conn = open_db(db)?;
@@ -20,6 +20,7 @@ pub fn run(db: &Path) -> Result<()> {
     apps::ensure_apps_tables(&conn)?;
     consumers::ensure_consumers_table(&conn)?;
     endpoints::ensure_endpoints_table(&conn)?;
+    metrics::ensure_metrics_table(&conn)?;
     request_logs::ensure_request_logs_table(&conn)?;
     request_details::ensure_application_logs_table(&conn)?;
     request_details::ensure_spans_table(&conn)?;
@@ -61,6 +62,7 @@ mod tests {
             "apps",
             "consumers",
             "endpoints",
+            "metrics",
             "request_logs",
             "spans"
         ]

From e108abfb21d5370375ed29550c6912f1cd58ff0d Mon Sep 17 00:00:00 2001
From: Simon Gurcke
Date: Mon, 13 Apr 2026 17:46:50 +1000
Subject: [PATCH 2/3] Update skill

---
 skills/apitally-cli/SKILL.md               | 131 +++++++++++----------
 skills/apitally-cli/references/commands.md |   2 +
 2 files changed, 73 insertions(+), 60 deletions(-)

diff --git a/skills/apitally-cli/SKILL.md b/skills/apitally-cli/SKILL.md
index 8dbe67e..2ff0a39 100644
--- a/skills/apitally-cli/SKILL.md
+++ b/skills/apitally-cli/SKILL.md
@@ -1,15 +1,19 @@
 ---
 name: apitally-cli
 description: >
-  Retrieve and investigate API request log data from Apitally. Fetches request logs,
-  consumers, and app metadata via the Apitally CLI, stores data in a local
-  DuckDB database, and runs SQL queries to investigate issues or answer questions.
-  Use when the user mentions Apitally, the Apitally CLI, API request logs, or API consumers.
+  Retrieve and investigate API metrics and request log data from Apitally. Fetches
+  aggregated metrics, request logs, consumers, and app metadata via the Apitally CLI,
+  stores data in a local DuckDB database, and runs SQL queries to investigate issues
+  or answer questions. Use when the user mentions Apitally, the Apitally CLI, API
+  metrics, API request logs, or API consumers.
 ---
 
 # Apitally CLI
 
-The Apitally CLI retrieves API request log data from [Apitally](https://apitally.io) and optionally stores it in a local DuckDB database for investigation with SQL. Each record is an individual API request with method, URL, status code, response time, consumer, headers, payloads, exceptions, and more. Request log retention is **15 days**.
+The Apitally CLI retrieves API metrics and request log data from [Apitally](https://apitally.io) and optionally stores it in a local DuckDB database for investigation with SQL. Two main data sources:
+
+- **Metrics** — pre-aggregated data (request counts, error rates, response time percentiles, throughput). Retention: **30 days** at 1-minute intervals, **13 months** at 30-minute intervals.
+- **Request logs** — individual API requests with method, URL, status code, response time, consumer, headers, payloads, exceptions, traces, and more. Retention: **15 days**.
 
 Run commands with `npx` (no install needed):
 
@@ -52,42 +56,47 @@ All commands are run via `npx @apitally/cli `. For full details, see [r
 
 3. **Determine the time range** — check if the user specified a time range (e.g. "last 24 hours", "since Monday", a specific date). If not, default to the last 7 days. Use this time range consistently for `--requests-since` / `--since` / `--until` flags and SQL `WHERE` conditions throughout the investigation.
 
-4. **Fetch endpoints if needed** — skip this step unless you need to discover available endpoints to filter request logs. Fetch endpoints using the `endpoints` command:
+4. **Fetch supporting data if needed** — skip unless you need endpoint discovery or consumer identification.
+   - **Endpoints**: use `endpoints` to discover available method/path combinations for filtering. Use `--method` and/or `--path` to filter (e.g. `--path '*users*'`).
 
-   ```
-   npx @apitally/cli endpoints [--method ] [--path ]
-   ```
+     ```
+     npx @apitally/cli endpoints [--method ] [--path ]
+     ```
 
-   Use `--method` and/or `--path` to filter (e.g. `--path '*users*'`). Read the NDJSON output to identify relevant endpoints, then use their method/path to filter request logs in step 6.
+   - **Consumers**: use `consumers` to map identifiers (emails, usernames, groups) to `consumer_id` values and vice versa, if the question involves consumers.
 
-5. **Fetch consumers if needed** — skip this step if the investigation doesn't involve consumers. Otherwise, fetch consumers into DuckDB using the `consumers` command:
+     ```
+     npx @apitally/cli consumers [--requests-since ""] --db
+     ```
 
-   ```
-   npx @apitally/cli consumers [--requests-since ""] --db
-   ```
+     ```
+     npx @apitally/cli sql "SELECT consumer_id, identifier, name, \"group\" FROM consumers WHERE app_id = AND identifier ILIKE '%@example.com'"
+     ```
 
-   If the user is asking about specific consumers (e.g. by email, name, or group), query to find their `consumer_id` and use it as a filter when fetching request logs in step 6:
+5. **Fetch data** — choose based on the question. Always read the [command reference](references/commands.md) for available options.
+   - **Metrics** — for questions that can be answered with aggregated metrics: traffic volume, error rates, response time trends, throughput, endpoint comparisons. Use `--group-by` and `--interval` to break down by environment, endpoint, consumer, status code, or time period.
 
-   ```
-   npx @apitally/cli sql "SELECT consumer_id, identifier, name, \"group\" FROM consumers WHERE app_id = AND identifier ILIKE '%@example.com'"
-   ```
+     ```
+     npx @apitally/cli metrics --since "" \
+       --metrics '["requests","error_rate","response_time_p50","response_time_p95"]' \
+       --group-by '["method","path"]' --interval day --db
+     ```
 
-6. **Fetch request logs** into DuckDB using the `request-logs` command with time range, fields, and filters tailored to the investigation. Always read the [command reference](references/commands.md) for available fields and filters.
-
-   ```
-   npx @apitally/cli request-logs --since "" \
-     --fields '' \
-     --filters '' \
-     --db
-   ```
+   - **Request logs** — for questions that require individual request data: specific errors, exceptions, headers, payloads, traces, etc. Narrow down fields and use filters to avoid fetching unnecessarily large volumes of data. Refetching replaces existing records in DuckDB (no duplicates).
 
-   If filtering by endpoint, add method/path filters: `[{"field":"method","op":"eq","value":"GET"},{"field":"path","op":"eq","value":"/v1/users/{user_id}"}]`
+     ```
+     npx @apitally/cli request-logs --since "" \
+       --fields '' \
+       --filters '' \
+       --db
+     ```
 
-   If filtering by consumers, add a consumer filter: `[{"field":"consumer_id","op":"in","value":[1,2,3]}]`
+     Filter by endpoint: `--filters '[{"field":"method","op":"eq","value":"GET"},{"field":"path","op":"eq","value":"/v1/users/{user_id}"}]'`
+     Filter by consumer: `--filters '[{"field":"consumer_id","op":"in","value":[1,2,3]}]'`
 
-   Narrow down fields and use filters as much as possible to avoid fetching unnecessarily large volumes of data. Refetching data later (e.g. with more fields) replaces existing records in DuckDB and does not create duplicates.
+   - **Both** — for broad investigations, start with metrics for an overview, then fetch request logs to drill into specifics.
 
-7. **Query DuckDB** using the `sql` command — **CRITICAL: The DuckDB database is persistent and retains data from previous fetches, including other sessions. You MUST filter your SQL queries to match the scope of your current investigation.** Always include `WHERE` conditions on `app_id`, `timestamp`, and any other relevant fields. Without these filters, results will include unrelated data and will be **wrong**.
+6. **Query DuckDB** using the `sql` command — **CRITICAL: The DuckDB database is persistent and retains data from previous fetches, including other sessions. You MUST filter your SQL queries to match the scope of your current investigation.** Always include `WHERE` conditions on `app_id`, `period_start`/`timestamp`, and any other relevant fields. Without these filters, results will include unrelated data and will be **wrong**.
 
    ```
    npx @apitally/cli sql "SELECT method, path, status_code, COUNT(*) as n FROM request_logs WHERE app_id = AND timestamp >= '' AND status_code >= 400 GROUP BY ALL ORDER BY n DESC"
@@ -95,54 +104,56 @@ All commands are run via `npx @apitally/cli `. For full details, see [r
 
    Read the [DuckDB schema reference](references/duckdb_tables.md) for available tables, columns and relationships.
 
-8. **Iterate if needed** — refine filters, fetch additional fields (headers, bodies, exceptions), or widen the time range as needed.
+7. **Iterate if needed** — refine filters, fetch additional fields (headers, bodies, exceptions), or widen the time range as needed.
 
 ## Investigation Patterns
 
-### Inspect a specific request
+### Error investigation
 
-Use `request-details` to fetch full details (headers, body, exception, application logs, spans) for a single request:
+Fetch request counts grouped by endpoint and status code to find the most frequent errors:
 
 ```
-npx @apitally/cli request-details
+npx @apitally/cli metrics --since "" \
+  --metrics '["requests"]' \
+  --group-by '["method","path","status_code"]' \
+  --filters '[{"field":"status_code","op":"gte","value":400}]' --db
 ```
 
-### Trace a consumer's activity
-
 ```sql
-SELECT r.timestamp, r.method, r.path, r.status_code, r.response_time_ms,
-       c.identifier, c.name as consumer_name
-FROM request_logs r
-JOIN consumers c ON r.app_id = c.app_id AND r.consumer_id = c.consumer_id
-WHERE r.app_id =
-  AND r.timestamp >= ''
-  AND r.timestamp < ''
-  AND c.identifier = 'user@example.com'
-ORDER BY r.timestamp DESC
+SELECT method, path, status_code, sum(requests) as requests_sum
+FROM metrics
+WHERE app_id =
+  AND period_start >= ''
+GROUP BY method, path, status_code
+ORDER BY requests_sum DESC
 ```
 
-### Exception investigation
-
-Fetch with exception fields first:
+Then fetch request logs for a specific error to investigate further:
 
 ```
 npx @apitally/cli request-logs --since "" \
-  --fields '["timestamp","request_uuid","method","path","status_code","exception_type","exception_message","exception_stacktrace"]' \
-  --filters '[{"field":"status_code","op":"eq","value":500}]' \
-  --db
+  --fields '["timestamp","request_uuid","url","status_code","response_body_json","exception_type","exception_message"]' \
+  --filters '[{"field":"method","op":"eq","value":""},{"field":"path","op":"eq","value":""},{"field":"status_code","op":"eq","value":}]' \
+  --limit 5
 ```
 
-Then group by exception type:
+Use `request-details` to fetch full details (headers, body, exception, application logs, spans) for a specific request:
+
+```
+npx @apitally/cli request-details
+```
+
+### Trace a consumer's activity
 
 ```sql
-SELECT exception_type, exception_message, COUNT(*) as count,
-       MIN(timestamp) as first_seen, MAX(timestamp) as last_seen
-FROM request_logs
-WHERE app_id =
-  AND timestamp >= ''
-  AND exception_type IS NOT NULL
-GROUP BY exception_type, exception_message
-ORDER BY count DESC
+SELECT r.timestamp, r.method, r.url, r.status_code, r.response_time_ms
+FROM request_logs r
+JOIN consumers c ON r.app_id = c.app_id AND r.consumer_id = c.consumer_id
+WHERE r.app_id =
+  AND r.timestamp >= ''
+  AND r.timestamp < ''
+  AND c.identifier = 'user@example.com'
+ORDER BY r.timestamp ASC
 ```
 
 ### Query headers
diff --git a/skills/apitally-cli/references/commands.md b/skills/apitally-cli/references/commands.md
index 025427f..c34b81c 100644
--- a/skills/apitally-cli/references/commands.md
+++ b/skills/apitally-cli/references/commands.md
@@ -100,6 +100,8 @@ Fetch aggregated metrics for an app. Outputs NDJSON to stdout by default.
 - `--timezone`: Timezone for intervals and to interpret since/until if not tz-aware (defaults to UTC)
 - `--db`: Write to `metrics` table in DuckDB instead of outputting NDJSON to stdout
 
+**Deduplication in DuckDB:** Deletes all existing rows for the same `app_id` within the fetched time range before inserting new data.
+
 ### Available metrics
 
 | Metric                | Type    | Description                            |

From 324dc2656ad59a28e5a38d23f0785290e12244c2 Mon Sep 17 00:00:00 2001
From: Simon Gurcke
Date: Mon, 13 Apr 2026 18:58:40 +1000
Subject: [PATCH 3/3] Fix

---
 skills/apitally-cli/references/commands.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/apitally-cli/references/commands.md b/skills/apitally-cli/references/commands.md
index c34b81c..a643336 100644
--- a/skills/apitally-cli/references/commands.md
+++ b/skills/apitally-cli/references/commands.md
@@ -270,7 +270,7 @@ Run a SQL query against a local DuckDB database. The query can be passed as an a
 
 Available tables: `apps`, `app_envs`, `consumers`, `endpoints`, `metrics`, `request_logs`, `application_logs`, `spans`. See [duckdb_tables.md](duckdb_tables.md) for schemas.
 
-**Important:** The database may contain data from previous sessions. Always filter queries by `app_id`, `timestamp`, and other relevant fields to avoid including unrelated data.
+**Important:** The database may contain data from previous sessions. Always filter queries by `app_id`, time (`timestamp` for `request_logs`, `period_start`/`period_end` for `metrics`), and other relevant fields to avoid including unrelated data.
 
 DuckDB uses a [PostgreSQL-compatible SQL dialect](https://duckdb.org/docs/stable/sql/dialect/overview).
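
Review note on the scoping rule in the last hunk (always filter `metrics` queries by `app_id` and `period_start`/`period_end`): the sketch below is one way a caller might build such a query string. It is not part of this patch series. `scoped_metrics_query` is a hypothetical helper, the `method`/`path` grouping assumes the rows were fetched with `--group-by '["method","path"]'`, and the quote check is only a minimal guard for values interpolated into SQL text.

```rust
/// Hypothetical helper (not part of the patch): builds a query against the
/// `metrics` table that is always scoped by `app_id` and a time range.
fn scoped_metrics_query(app_id: i64, since: &str, until: &str) -> Result<String, String> {
    // The timestamps are interpolated into SQL text, so reject empty values
    // and anything containing a single quote. Assumes ISO 8601 inputs.
    for value in [since, until] {
        if value.is_empty() || value.contains('\'') {
            return Err(format!("invalid timestamp: {value:?}"));
        }
    }
    // `period_end <= until` mirrors the exclusive `--until` bound of the CLI.
    Ok(format!(
        "SELECT method, path, sum(requests) AS requests_sum \
         FROM metrics \
         WHERE app_id = {app_id} \
         AND period_start >= '{since}' \
         AND period_end <= '{until}' \
         GROUP BY method, path \
         ORDER BY requests_sum DESC"
    ))
}
```

The resulting string could then be passed to the `sql` command. Keeping the `app_id` and time predicates inside the helper means a query cannot accidentally pick up rows left in the database by earlier sessions.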