[OTLP Validation] OTLP Data Quality Validation - Run 26270334891 (Partial Dataset) #33943

## A. Executive Summary

**Overall Status:** ⚠️ **WARN** - Partial dataset (in-progress run)

**Main Finding:** This validation analyzed **its own workflow run during execution**. The JSONL mirror contains only the setup span emitted at job start. Conclusion and agent spans will be emitted after this validation completes.

**Data Quality (for available spans):** ✅ **PASS** - The setup span is fully compliant with all normative requirements.

**Root Cause:** Self-referential validation timing - analyzing run 26270334891 while executing as run 26270334891.

---

## B. Trace Completeness

| Metric | Value | Status |
|--------|-------|--------|
| **Expected span count** (full run) | 3 | - |
| **Observed span count** (JSONL mirror) | 1 | ⚠️ Partial |
| **Backend query visibility** | N/A | Not queried |
| **Unique trace IDs** | 1 | ✅ Single trace |
| **Duplicate spans** | 0 | ✅ No duplicates |
| **Trace ID consistency** | Match | ✅ Environment variable and JSONL mirror agree |

**Trace ID:** `a063a8f94ff65d2e5b1cc09206b17571`

**Missing spans:**
- ❌ `gh-aw.agent.conclusion` - Expected at job completion
- ❌ `gh-aw.agent.agent` - Expected during agent AI execution

**Confidence:** High - This is expected behavior for in-progress runs. The setup span was emitted 1.5 seconds into job execution.
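
The completeness check above could be automated by comparing the expected span names against the names found in the mirror. The following is a minimal sketch, assuming each line of the mirror is one OTLP/JSON export payload (as the jq queries in section H imply); the function name `missing_spans` is hypothetical and not part of gh-aw.

```bash
# Hypothetical helper: print every expected span name that is absent from a JSONL mirror.
# Assumes each line is an OTLP/JSON payload (resourceSpans -> scopeSpans -> spans).
missing_spans() {
  local mirror="$1"
  shift
  local observed name
  # Collect the distinct span names present in the mirror.
  observed="$(jq -r '.resourceSpans[]?.scopeSpans[]?.spans[]?.name' "$mirror" | sort -u)"
  for name in "$@"; do
    # -x matches the whole line, -F treats the name as a literal string.
    printf '%s\n' "$observed" | grep -qxF -- "$name" || printf '%s\n' "$name"
  done
}
```

Called as `missing_spans /tmp/gh-aw/otel.jsonl gh-aw.agent.setup gh-aw.agent.conclusion gh-aw.agent.agent`, a mirror holding only the setup span would yield the other two names, one per line.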

---

## C. Span Hierarchy Validation

**Status:** ⚠️ Cannot fully validate (incomplete dataset)

| Check | Status | Details |
|-------|--------|---------|
| Setup span exists | ✅ Pass | `gh-aw.agent.setup` present |
| Setup span has parent | ✅ Pass | `parentSpanId: 5b1a561f88180142` |
| Conclusion span parents under setup | ⚠️ N/A | Not yet emitted |
| Agent span parents under conclusion | ⚠️ N/A | Not yet emitted |
| Span naming pattern | ✅ Pass | `gh-aw.agent.setup` matches `gh-aw.<job>.<op>` |

**Expected hierarchy** (per §9.3):
```
Root Setup Parent (5b1a561f88180142)
└── gh-aw.agent.setup (a43292f5a34bd201) ✅
    └── gh-aw.agent.conclusion (not yet emitted) ⚠️
        └── gh-aw.agent.agent (not yet emitted) ⚠️
```
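
Once all three spans are available, each parent-child edge in the hierarchy above could be checked mechanically. This is a sketch under the assumption that spans carry `name`, `spanId`, and `parentSpanId` fields in the OTLP/JSON mirror; `check_parent` is a hypothetical name.

```bash
# Hypothetical helper: succeed if the span named $2 has the span named $3 as its parent.
check_parent() {
  local mirror="$1" child="$2" parent="$3"
  # -s slurps all JSONL lines into one array so spans written on
  # different lines can be compared; -e maps false/null to exit status 1.
  jq -e -s --arg child "$child" --arg parent "$parent" '
    [ .[].resourceSpans[]?.scopeSpans[]?.spans[]? ] as $spans
    | ($spans | map(select(.name == $parent)) | first) as $p
    | ($spans | map(select(.name == $child))  | first) as $c
    | $p != null and $c != null and $c.parentSpanId == $p.spanId
  ' "$mirror" >/dev/null
}
```

For example, `check_parent /tmp/gh-aw/otel.jsonl gh-aw.agent.conclusion gh-aw.agent.setup` would fail on the partial dataset in this report, because the conclusion span has not been written yet.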

---

## D. Attribute Contract Validation

### Setup Span Attributes (§10.1)

**Required attributes:** ✅ **ALL PRESENT**

| Attribute | Value | Status |
|-----------|-------|--------|
| `gh-aw.job.name` | `agent` | ✅ |
| `gh-aw.workflow.name` | `OTLP Data Quality Validator` | ✅ |
| `gh-aw.run.id` | `26270334891` | ✅ |
| `gh-aw.run.attempt` | `1` | ✅ |
| `gh-aw.run.actor` | `mnkiefer` | ✅ |
| `gh-aw.repository` | `github/gh-aw` | ✅ |
| `gh-aw.staged` | `false` | ✅ |

**Conditional attributes present:**
- `gen_ai.system`: `github_models` ✅
- `gh-aw.engine.id`: `copilot` ✅
- `gh-aw.event_name`: `schedule` ✅
- `gh-aw.episode.id`, `gh-aw.episode.kind`, `gh-aw.hop.id`, `gh-aw.workflow_call.id` ✅
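
The required-attribute check for a span could be expressed as a small function that reports which keys are absent. A minimal sketch, assuming attributes are stored as OTLP/JSON `{key, value}` pairs; `missing_attributes` is a hypothetical helper, and the key list is passed in by the caller rather than read from the spec.

```bash
# Hypothetical helper: print each required attribute key that the named span lacks.
missing_attributes() {
  local mirror="$1" span="$2"
  shift 2
  local present key
  # Collect the attribute keys set on every span with the given name.
  present="$(jq -r --arg span "$span" '
    .resourceSpans[]?.scopeSpans[]?.spans[]?
    | select(.name == $span)
    | .attributes[]?.key' "$mirror" | sort -u)"
  for key in "$@"; do
    printf '%s\n' "$present" | grep -qxF -- "$key" || printf '%s\n' "$key"
  done
}
```

An empty result means every listed key is present, for example when called with `gh-aw.agent.setup` and the seven required keys from the table above.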

### Conclusion Span Attributes (§10.2)

⚠️ **Cannot validate** - Span not yet emitted

Expected required attributes:
- `gh-aw.run.status` (success/failure/timeout/cancelled)
- `gh-aw.error_count`
- `gh-aw.warning_count`
- `gh-aw.action_minutes`
- `gh-aw.output.item_count`
- `gh-aw.otlp.export_errors`

### Agent Span GenAI Attributes (§10.3)

⚠️ **Cannot validate** - Span not yet emitted

Expected required attributes:
- `gen_ai.system`
- `gen_ai.request.model`
- `gen_ai.operation.name` (must be `"chat"`)
- `gen_ai.usage.input_tokens`
- `gen_ai.usage.output_tokens`

### Resource Attributes (§11.1)

**Required resource attributes:** ✅ **ALL PRESENT**

| Attribute | Value | Status |
|-----------|-------|--------|
| `service.name` | `gh-aw.otlp-data-quality-validator` | ✅ |
| `service.version` | `1.0.48` | ✅ |
| `github.repository` | `github/gh-aw` | ✅ |
| `github.run_id` | `26270334891` | ✅ |
| `github.run_attempt` | `1` | ✅ |
| `github.actions.run_url` | `https://github.com/github/gh-aw/actions/runs/26270334891` | ✅ |

**Instrumentation scope:**
- `scope.name`: `gh-aw` ✅ (correct)
- `scope.version`: `1.0.48` ✅ (matches service.version)

---

## E. Export and Fan-Out Health

**Export error count:** 0 ✅

**JSONL mirror status:** ✅ Healthy
- Location: `/tmp/gh-aw/otel.jsonl`
- Spans written: 1
- No write failures

**Multi-endpoint fan-out:** Not yet observable
- Configured endpoints: Sentry, Grafana Tempo (per workflow env vars)
- Per-endpoint status: Will be determined after conclusion span exports
- Export errors file: Not present (no failures)

**Timestamp validation:**
- Start time: `2026-05-22T05:33:13Z` (valid)
- End time: `2026-05-22T05:33:14Z` (valid)
- Duration: 1.468 seconds (reasonable for setup)
- Ordering: start < end ✅
- Recency: Within 24 hours ✅
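
The ordering, recency, and duration checks listed above reduce to integer arithmetic on the span's nanosecond timestamps. The sketch below assumes 64-bit shell arithmetic and a 24-hour recency window by default; `check_span_times` is a hypothetical name.

```bash
# Hypothetical helper: validate one span's timestamps, given in Unix nanoseconds.
# Prints the duration in milliseconds; returns non-zero if a check fails.
check_span_times() {
  local start_ns="$1" end_ns="$2" now_s="$3" max_age_s="${4:-86400}"
  # Ordering: the span must end after it starts.
  [ "$start_ns" -lt "$end_ns" ] || { echo "end is not after start" >&2; return 1; }
  # Recency: the span must have ended within max_age_s seconds of now_s.
  local end_s=$((end_ns / 1000000000))
  [ $((now_s - end_s)) -le "$max_age_s" ] || { echo "span older than ${max_age_s}s" >&2; return 1; }
  # Duration in whole milliseconds.
  echo $(( (end_ns - start_ns) / 1000000 ))
}
```

With this run's values, `check_span_times 1779427993055000000 1779427994523000000 "$(date -u +%s)"` prints `1468` as long as it is invoked within 24 hours of the span's end time.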

---

## F. Root-Cause Hypothesis

**Primary cause:** Self-referential validation timing

**Evidence:**
1. Validation timestamp: `2026-05-22T05:34:59Z`
2. Setup span end time: `2026-05-22T05:33:14Z`
3. Time delta: 1 minute 45 seconds after setup completed
4. Run ID being analyzed: `26270334891`
5. Run ID executing this agent: `26270334891` (same)

**Explanation:**
This validation agent is part of the workflow run it is analyzing. The JSONL mirror is updated incrementally as spans are emitted:
1. Setup span written immediately when agent job starts
2. Conclusion span written when job completes (after this validation)
3. Agent span written during AI execution (after this validation)

**Alternative explanations:**
- ❌ Data loss: No export errors, JSONL mirror healthy
- ❌ Configuration issue: All required attributes present on available span
- ❌ Export failure: Zero export errors recorded
- ❌ Backend ingestion delay: Not applicable (analyzing local mirror)

**Validation for future runs:**
To validate complete span datasets, analyze **completed** workflow runs by:
- Using `gh aw audit <run-id>` after run completion
- Downloading artifacts from past runs
- Querying backend systems for historical data

---

## G. Recommended Fixes (Prioritized)

### 1. **Improve validation timing** (High Priority)

**Current issue:** Self-referential validation captures partial datasets

**Recommendation:** Modify the workflow to validate **previous** runs instead of the current run:
- Query GitHub API for recently completed runs
- Download JSONL artifacts from completed runs
- Analyze complete span datasets with all three span types

**Benefit:** Full span hierarchy and attribute validation
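
One way to select a completed run and fetch its mirror is the GitHub CLI. The sketch below uses `gh run list` and `gh run download`; the function names and the default artifact name `otel-jsonl` are assumptions, since this report does not state how (or whether) the mirror is uploaded as an artifact.

```bash
# Hypothetical helper: print the ID of the most recent completed run of a workflow.
# Prints nothing if the workflow has no completed runs.
latest_completed_run_id() {
  local workflow="$1"
  gh run list --workflow "$workflow" --status completed --limit 1 \
    --json databaseId --jq '.[0].databaseId // empty'
}

# Hypothetical helper: download a run's OTLP mirror artifact into a directory.
# The artifact name is an assumption, not something this report confirms.
download_mirror() {
  local run_id="$1" dest="$2" artifact="${3:-otel-jsonl}"
  gh run download "$run_id" --name "$artifact" --dir "$dest"
}
```

A validator could then call `latest_completed_run_id <workflow-file>` and pass the result to `download_mirror`, so the analysis always runs against a dataset whose conclusion and agent spans have already been written.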

### 2. **Add backend reconciliation** (Medium Priority)

**Current gap:** No verification that exported spans reach Sentry/Grafana

**Recommendation:**
- Query Sentry/Grafana backends for trace ID `a063a8f94ff65d2e5b1cc09206b17571`
- Compare JSONL mirror span count vs backend-visible span count
- Verify all span attributes survived backend ingestion

**Benefit:** Confirms end-to-end export pipeline health
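
The count comparison in this recommendation could look like the following sketch. It counts mirror spans for one trace ID and compares that number with a backend count supplied by the caller; how the backend count is obtained depends on each backend's query API and is deliberately left out. Both function names are hypothetical.

```bash
# Hypothetical helper: count spans in the JSONL mirror that belong to one trace.
mirror_span_count() {
  local mirror="$1" trace_id="$2"
  jq -s --arg t "$trace_id" '
    [ .[].resourceSpans[]?.scopeSpans[]?.spans[]? | select(.traceId == $t) ] | length
  ' "$mirror"
}

# Hypothetical helper: compare the mirror count with a backend-reported count.
reconcile_span_counts() {
  local mirror_count="$1" backend_count="$2"
  if [ "$mirror_count" -eq "$backend_count" ]; then
    echo "match"
  else
    echo "mismatch: mirror=$mirror_count backend=$backend_count"
    return 1
  fi
}
```

A mismatch only signals that something differs; an ingestion delay on the backend side can produce the same result as a dropped span, so a retry with a short wait may be worth considering before reporting a failure.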

### 3. **Add multi-endpoint validation** (Medium Priority)

**Current gap:** Cannot verify independent fan-out to multiple backends

**Recommendation:**
- For each configured OTLP endpoint, query backend independently
- Confirm same trace ID appears in both Sentry AND Grafana
- Verify endpoint failure isolation (one endpoint failure doesn't block others)

**Benefit:** Validates multi-endpoint fan-out independence per spec §6

### 4. **Document expected behavior for in-progress runs** (Low Priority)

**Current gap:** Unclear whether partial datasets should PASS or WARN

**Recommendation:**
- Update the validation logic to detect in-progress runs
- Exit early with a clear message when the validator is analyzing its own run
- Alternatively, delay validation until the job completes by running it in a dependent job

**Benefit:** Clearer signal on actual data quality issues
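
Detecting the self-referential case can be as simple as comparing the target run ID with `GITHUB_RUN_ID`, which GitHub Actions sets for every run. The sketch below also checks a run's status through `gh run view`; both function names are hypothetical.

```bash
# Hypothetical helper: succeed if the run being analyzed is the run executing the validator.
is_self_referential() {
  local target_run_id="$1"
  [ -n "${GITHUB_RUN_ID:-}" ] && [ "$target_run_id" = "$GITHUB_RUN_ID" ]
}

# Hypothetical helper: succeed only if GitHub reports the run as completed.
run_is_completed() {
  local status
  status="$(gh run view "$1" --json status --jq '.status')" || return 1
  [ "$status" = "completed" ]
}
```

A validator might call `is_self_referential "$TARGET_RUN_ID"` first and, when it succeeds, print a message that the dataset is expected to be partial and exit before running the completeness checks (`TARGET_RUN_ID` is a placeholder variable for this illustration).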

---

## H. Validation Queries and Commands

### Dataset Overview
```bash
# Total spans in JSONL mirror (summed across all lines and scopes)
jq -s '[.[].resourceSpans[].scopeSpans[].spans[]] | length' /tmp/gh-aw/otel.jsonl
# Output: 1

# Unique trace IDs
jq -r '.resourceSpans[].scopeSpans[].spans[].traceId' /tmp/gh-aw/otel.jsonl | sort -u
# Output: a063a8f94ff65d2e5b1cc09206b17571

# Span identity check (duplicates)
jq -r '.resourceSpans[].scopeSpans[].spans[] | "\(.traceId):\(.spanId)"' /tmp/gh-aw/otel.jsonl | sort | uniq -c
# Output: 1 a063a8f94ff65d2e5b1cc09206b17571:a43292f5a34bd201
```

### Span Hierarchy
```bash
# Extract span name, kind, parent relationship
jq -r '.resourceSpans[].scopeSpans[].spans[] | [.traceId, .spanId, .parentSpanId, .name, .kind] | @tsv' /tmp/gh-aw/otel.jsonl
# Output: a063a8f94ff65d2e5b1cc09206b17571  a43292f5a34bd201  5b1a561f88180142  gh-aw.agent.setup  1
```

### Attribute Validation
```bash
# Setup span attributes
jq '.resourceSpans[].scopeSpans[].spans[] | select(.name | endswith(".setup")) | .attributes[] | {key: .key, value: (.value | to_entries[0].value)}' /tmp/gh-aw/otel.jsonl

# Resource attributes
jq '.resourceSpans[].resource.attributes[] | {key: .key, value: (.value | to_entries[0].value)}' /tmp/gh-aw/otel.jsonl

# Instrumentation scope
jq '.resourceSpans[].scopeSpans[].scope' /tmp/gh-aw/otel.jsonl
```

### Trace ID Consistency
```bash
# Environment variable
echo "$GITHUB_AW_OTEL_TRACE_ID"
# Output: a063a8f94ff65d2e5b1cc09206b17571

# JSONL mirror
jq -r '.resourceSpans[].scopeSpans[].spans[].traceId' /tmp/gh-aw/otel.jsonl | sort -u
# Output: a063a8f94ff65d2e5b1cc09206b17571
```

### Export Health
```bash
# Export error count
cat /tmp/gh-aw/otlp-export-errors.count 2>/dev/null || echo "0"
# Output: 0

# Export error details
cat /tmp/gh-aw/otlp-export-errors.jsonl 2>/dev/null || echo "No export errors"
# Output: No export errors
```

### Timestamp Validation
```bash
# Extract and validate timestamps
jq -r '.resourceSpans[].scopeSpans[].spans[] | "Start: \(.startTimeUnixNano)\nEnd: \(.endTimeUnixNano)"' /tmp/gh-aw/otel.jsonl

# Convert to human-readable UTC (the -d @SECONDS form requires GNU date)
START_NANO=1779427993055000000
END_NANO=1779427994523000000
START_SEC=$((START_NANO / 1000000000))
END_SEC=$((END_NANO / 1000000000))
date -u -d @$START_SEC +%Y-%m-%dT%H:%M:%SZ
date -u -d @$END_SEC +%Y-%m-%dT%H:%M:%SZ
# Output: 2026-05-22T05:33:13Z, 2026-05-22T05:33:14Z
```

---

**Normative References:**
- Spec §9.3: Trace Model and Span Hierarchy
- Spec §10.1: Setup Span Attributes
- Spec §10.2: Conclusion Span Attributes
- Spec §10.3: Agent Span GenAI Attributes
- Spec §11.1: Required Resource Attributes
- Spec §12: Trace ID Propagation and Lookup

**Validation Date:** 2026-05-22T05:34:59Z  
**Run ID:** 26270334891  
**Workflow:** OTLP Data Quality Validator  
**gh-aw Version:** 1.0.48


> Generated by [🧭 OTLP Data Quality Validator](https://github.com/github/gh-aw/actions/runs/26270334891) · ● 1.2M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Fotlp-data-quality-validator%22&type=issues)
> - [x] expires on May 29, 2026, 5:40 AM UTC
