343 changes: 343 additions & 0 deletions docs/content/docs/features/data-sovereignty.mdx
@@ -0,0 +1,343 @@
---
title: Data Sovereignty
description: Control where AI inference runs with data residency, compliance metadata, and API key enforcement
---

import { Callout } from "fumadocs-ui/components/callout";

Hadrian tracks where each provider processes data and can enforce geographic and compliance constraints per API key — ensuring regulated workloads stay within approved boundaries.

## Overview

Sovereignty metadata flows through three layers:

1. **Provider-level defaults** — set once, inherited by all models from that provider
2. **Model-level overrides** — override specific fields for individual models
3. **API key requirements** — enforce constraints at request time, rejecting models that don't comply

This lets you configure a provider's general posture (e.g., "Anthropic is US-based, SOC 2 certified") while overriding individual models that differ (e.g., a model served from EU infrastructure).

## Provider Configuration

Add a `sovereignty` section to any provider in `hadrian.toml`:

```toml
[providers.anthropic]
type = "anthropic"
api_key = "${ANTHROPIC_API_KEY}"

[providers.anthropic.sovereignty]
hq_country = "US"
inference_countries = ["US"]
certifications = ["soc2", "hipaa-baa", "gdpr"]
trains_on_data = false
data_retention = "30d"
license = "proprietary"
```

### Fields

| Field | Type | Description |
| --------------------- | ---------- | -------------------------------------------------------------------------- |
| `hq_country` | `string` | ISO 3166-1 alpha-2 country code of provider headquarters |
| `inference_countries` | `string[]` | Countries where model inference runs |
| `certifications` | `string[]` | Compliance certifications (see below) |
| `on_prem` | `bool` | Whether this runs on your own infrastructure |
| `trains_on_data` | `bool` | Whether the provider trains on customer data |
| `data_retention` | `string` | Data retention policy (`"none"`, `"30d"`, `"90d"`, `"1y"`, `"indefinite"`) |
| `license` | `string` | Model license (`"proprietary"`, `"apache-2.0"`, `"mit"`, etc.) |
| `notes` | `string` | Free-form notes for additional context |

### Well-Known Certifications

Use lowercase identifiers for certifications:

| Certification | Description |
| -------------- | ------------------------------------- |
| `gdpr` | EU General Data Protection Regulation |
| `hipaa` | US Health Insurance Portability and Accountability Act |
| `hipaa-baa` | HIPAA Business Associate Agreement |
| `soc2` | SOC 2 Type I |
| `soc2-type2` | SOC 2 Type II |
| `iso27001` | ISO 27001 Information Security |
| `fedramp` | US FedRAMP (Moderate) |
| `fedramp-high` | US FedRAMP High |
| `pci-dss` | Payment Card Industry DSS |
| `c5` | German BSI C5 |
| `ismap` | Japan ISMAP |
| `ccpa` | California Consumer Privacy Act |
| `dpa` | Data Processing Agreement |

## Model-Level Overrides

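Override specific sovereignty fields for individual models. Model values take precedence over provider defaults: a field set on the model replaces the provider's value, an unset field falls back to the provider's value, and a non-empty model list replaces the provider's list rather than being merged with it: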

```toml
[providers.anthropic.sovereignty]
hq_country = "US"
inference_countries = ["US"]
certifications = ["soc2", "hipaa-baa"]

# This model is served from EU infrastructure
[providers.anthropic.models."claude-sonnet-4-20250514".sovereignty]
inference_countries = ["DE", "FR"]
certifications = ["soc2", "hipaa-baa", "gdpr", "c5"]
```

In this example, `claude-sonnet-4-20250514` inherits `hq_country = "US"` from the provider but uses its own `inference_countries` and `certifications`.
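
The precedence rule above can be sketched in a few lines of Python. This is an illustration, not Hadrian's implementation: `resolve_sovereignty` is a hypothetical helper, and it assumes that an unset field or an empty list on the model leaves the provider default in place.

```python
from typing import Any, Optional


def resolve_sovereignty(
    provider: dict[str, Any], model: Optional[dict[str, Any]] = None
) -> dict[str, Any]:
    """Overlay model-level sovereignty fields on provider defaults (hypothetical helper)."""
    resolved = dict(provider)
    for field, value in (model or {}).items():
        # Assumption: an unset field or an empty list keeps the provider default.
        if value is None or value == []:
            continue
        # Anything else replaces the provider value; lists are not merged.
        resolved[field] = value
    return resolved


provider = {
    "hq_country": "US",
    "inference_countries": ["US"],
    "certifications": ["soc2", "hipaa-baa"],
}
model = {
    "inference_countries": ["DE", "FR"],
    "certifications": ["soc2", "hipaa-baa", "gdpr", "c5"],
}
resolved = resolve_sovereignty(provider, model)
print(resolved)
```

Here `resolved` keeps `hq_country` from the provider and takes both lists from the model, matching the TOML example.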

## Enforcement

Sovereignty requirements are enforced at request time. They can be set in two places:

1. **API key requirements** — stored on the key, enforced on every request made with that key
2. **Per-request requirements** — passed as a Hadrian extension field on individual requests

When both are present, they are merged into the most restrictive combination: allowed lists are intersected, required and blocked lists are unioned, and boolean flags are OR'd.

### API Key Requirements

Restrict which models an API key can access:

```json
{
  "name": "eu-only-key",
  "sovereignty_requirements": {
    "allowed_inference_countries": ["DE", "FR", "NL", "IE"],
    "required_certifications": ["gdpr"],
    "blocked_hq_countries": ["CN", "RU"]
  }
}
```

### Per-Request Requirements

Pass `sovereignty_requirements` as a Hadrian extension field on `/v1/chat/completions`, `/v1/responses`, or `/v1/completions`:

```json
{
  "model": "anthropic/claude-sonnet-4-20250514",
  "messages": [{ "role": "user", "content": "Hello" }],
  "sovereignty_requirements": {
    "allowed_inference_countries": ["DE"],
    "required_certifications": ["gdpr", "c5"]
  }
}
```

If the resolved model doesn't satisfy the requirements, the request is rejected with a `403 sovereignty_violation` error.

### Merge Behavior

When an API key has `sovereignty_requirements` and a request also includes them, the two are merged:

| Field type | Merge strategy | Example |
| ---------------- | -------------- | ----------------------------------------------------------------- |
| Allowed lists | Intersection | Key allows `[DE, FR]`, request allows `[FR, NL]` → `[FR]` |
| Required lists | Union | Key requires `[soc2]`, request requires `[gdpr]` → `[soc2, gdpr]` |
| Blocked lists | Union | Key blocks `[CN]`, request blocks `[RU]` → `[CN, RU]` |
| Boolean requires | OR | Either source requiring `true` → `true` |
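
The merge strategies in the table can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: `merge_requirements` and the grouping of fields into the three tuples are hypothetical, and an allowed list set on only one side is assumed to carry over unchanged (an unset field imposes no restriction).

```python
from typing import Any

# Assumed grouping of the documented requirement fields by merge strategy.
ALLOWED_LISTS = ("allowed_inference_countries", "allowed_licenses")
UNION_LISTS = ("required_certifications", "blocked_hq_countries")
BOOL_FLAGS = ("require_on_prem", "require_open_weights")


def merge_requirements(
    key_req: dict[str, Any], request_req: dict[str, Any]
) -> dict[str, Any]:
    """Combine key-level and per-request requirements into the stricter set."""
    merged: dict[str, Any] = {}
    for field in ALLOWED_LISTS:
        a, b = key_req.get(field), request_req.get(field)
        if a is None and b is None:
            continue  # unset on both sides: no restriction
        if a is None or b is None:
            merged[field] = list(a if b is None else b)  # only one side restricts
        else:
            merged[field] = [x for x in a if x in b]  # intersection
    for field in UNION_LISTS:
        a, b = key_req.get(field) or [], request_req.get(field) or []
        combined = list(dict.fromkeys([*a, *b]))  # ordered union without duplicates
        if combined:
            merged[field] = combined
    for field in BOOL_FLAGS:
        if key_req.get(field) or request_req.get(field):
            merged[field] = True  # logical OR
    return merged


key_req = {
    "allowed_inference_countries": ["DE", "FR"],
    "required_certifications": ["soc2"],
    "blocked_hq_countries": ["CN"],
}
request_req = {
    "allowed_inference_countries": ["FR", "NL"],
    "required_certifications": ["gdpr"],
    "blocked_hq_countries": ["RU"],
    "require_on_prem": True,
}
merged = merge_requirements(key_req, request_req)
print(merged)
```

Note that intersecting two disjoint allowed lists yields an empty list, which in this sketch means no model can satisfy the merged requirements.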

### Requirement Fields

| Field | Type | Description |
| ----------------------------- | ---------- | --------------------------------------------------- |
| `allowed_inference_countries` | `string[]` | Only allow models with inference in these countries |
| `require_on_prem` | `bool` | Only allow on-premises providers |
| `required_certifications` | `string[]` | Provider must have **all** of these certifications |
| `require_open_weights` | `bool` | Only allow open-weight models |
| `blocked_hq_countries` | `string[]` | Block providers headquartered in these countries |
| `allowed_licenses` | `string[]` | Only allow models with these licenses |

<Callout type="info">
All requirement fields are optional. Only set the constraints you need — unset fields impose no
restriction.
</Callout>
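
To make the field semantics concrete, here is a hedged Python sketch of checking resolved model metadata against a set of requirements. `violations` is a hypothetical helper. It assumes that every inference country must be in the allowed list and that a model with no listed countries fails that check; the source does not spell out either detail. `require_open_weights` is omitted because the page does not say how open weights are determined.

```python
from typing import Any


def violations(meta: dict[str, Any], req: dict[str, Any]) -> list[str]:
    """Return the metadata fields that fail the requirements (empty list = allowed)."""
    problems: list[str] = []

    allowed = req.get("allowed_inference_countries")
    if allowed is not None:
        countries = meta.get("inference_countries") or []
        # Assumption: unknown locations fail, and every location must be allowed.
        if not countries or any(c not in allowed for c in countries):
            problems.append("inference_countries")

    # The provider must hold all required certifications.
    certs = meta.get("certifications") or []
    if any(c not in certs for c in req.get("required_certifications") or []):
        problems.append("certifications")

    if meta.get("hq_country") in (req.get("blocked_hq_countries") or []):
        problems.append("hq_country")

    if req.get("require_on_prem") and not meta.get("on_prem"):
        problems.append("on_prem")

    licenses = req.get("allowed_licenses")
    if licenses is not None and meta.get("license") not in licenses:
        problems.append("license")

    return problems


eu_only = {
    "allowed_inference_countries": ["DE", "FR", "NL", "IE"],
    "required_certifications": ["gdpr"],
    "blocked_hq_countries": ["CN", "RU"],
}
us_model = {
    "hq_country": "US",
    "inference_countries": ["US"],
    "certifications": ["soc2", "hipaa-baa", "gdpr"],
}
de_model = {
    "hq_country": "DE",
    "inference_countries": ["DE"],
    "certifications": ["gdpr", "c5", "iso27001"],
    "on_prem": True,
}
print(violations(us_model, eu_only))
print(violations(de_model, eu_only))
```

Returning the list of failing fields, rather than a bare boolean, makes it easier to explain a rejection to the caller.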

## Dynamic Providers

User-added providers (via the admin API or self-service) also support sovereignty metadata:

```http
POST /admin/v1/providers
Content-Type: application/json

{
  "name": "eu-llm",
  "type": "open_ai",
  "base_url": "https://eu-llm.example.com/v1",
  "api_key": "sk-...",
  "sovereignty": {
    "hq_country": "DE",
    "inference_countries": ["DE"],
    "certifications": ["gdpr", "c5", "iso27001"],
    "on_prem": true,
    "trains_on_data": false,
    "data_retention": "none"
  }
}
```

This metadata is stored in the database alongside the provider configuration and appears in the model picker and `/v1/models` responses.

## Model Picker

The Studio UI model picker displays sovereignty metadata and supports filtering:

- **Country filter** — dropdown filters models by inference country, populated from all available models' metadata
- **On-Prem filter** — chip filter shows only on-premises models
- **Details panel** — click the info icon on any model to see its full sovereignty metadata including HQ country, inference locations, certifications, data retention, and license

## `/v1/models` Response

Sovereignty metadata appears on each model in the standard models list:

```json
{
  "id": "anthropic/claude-sonnet-4-20250514",
  "object": "model",
  "sovereignty": {
    "hq_country": "US",
    "inference_countries": ["DE", "FR"],
    "certifications": ["soc2", "hipaa-baa", "gdpr", "c5"],
    "trains_on_data": false,
    "data_retention": "30d",
    "license": "proprietary"
  }
}
```

The `sovereignty` field is only present when at least one sovereignty field is set on the provider or model.
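
Because `sovereignty` can be absent, a client that filters the model list should handle the missing key. The sketch below is one way to do that in Python. `models_in_countries` is a hypothetical helper, the second model ID is made up for illustration, and treating models without metadata as non-matching is a design assumption rather than documented behaviour.

```python
from typing import Any


def models_in_countries(models: list[dict[str, Any]], countries: set[str]) -> list[str]:
    """Return IDs of models whose listed inference countries all fall within `countries`."""
    matching = []
    for model in models:
        # `sovereignty` is absent when no metadata is configured; treat that as unknown.
        listed = (model.get("sovereignty") or {}).get("inference_countries") or []
        if listed and set(listed) <= countries:
            matching.append(model["id"])
    return matching


models = [
    {
        "id": "anthropic/claude-sonnet-4-20250514",
        "object": "model",
        "sovereignty": {"hq_country": "US", "inference_countries": ["DE", "FR"]},
    },
    # Hypothetical model with no sovereignty metadata configured.
    {"id": "example/no-metadata", "object": "model"},
]
eu_models = models_in_countries(models, {"DE", "FR", "NL", "IE"})
print(eu_models)
```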

## Custom Metadata Fields

Beyond the built-in fields, define your own sovereignty metadata fields. Custom fields are defined globally in the config and then set per-provider or per-model.

### Define Fields

Add `[[sovereignty.custom_fields]]` entries under the top-level `sovereignty` table in `hadrian.toml`, one per field:

```toml
[[sovereignty.custom_fields]]
key = "data_residency"
title = "Data Residency"
description = "Where customer data is physically stored"

[[sovereignty.custom_fields]]
key = "audit_frequency"
title = "Audit Frequency"
description = "How often security audits are conducted"

[[sovereignty.custom_fields]]
key = "encryption_standard"
title = "Encryption Standard"
description = "Encryption standard used for data at rest"
```

### Set Values

Set custom field values on providers and models using `sovereignty.custom`:

```toml
[providers.eu-llm.sovereignty]
hq_country = "DE"
inference_countries = ["DE"]

[providers.eu-llm.sovereignty.custom]
data_residency = "EU (Frankfurt)"
audit_frequency = "Quarterly"
encryption_standard = "AES-256"

# Model-level override
[providers.eu-llm.models."llama-3.1-70b".sovereignty.custom]
data_residency = "EU (Paris)"
```

Model custom values override provider custom values for the same key. Provider values are inherited for keys the model doesn't set.
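
The per-key inheritance described here behaves like a dictionary overlay. A minimal Python sketch, assuming custom values are plain strings; `resolve_custom` is a hypothetical name:

```python
def resolve_custom(
    provider_custom: dict[str, str], model_custom: dict[str, str]
) -> dict[str, str]:
    """Model values win per key; keys the model omits are inherited from the provider."""
    return {**provider_custom, **model_custom}


provider_custom = {
    "data_residency": "EU (Frankfurt)",
    "audit_frequency": "Quarterly",
    "encryption_standard": "AES-256",
}
model_custom = {"data_residency": "EU (Paris)"}
custom = resolve_custom(provider_custom, model_custom)
print(custom)
```

For `llama-3.1-70b` in the example above, this yields `data_residency = "EU (Paris)"` with the other two values inherited from the provider.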

### Display

Custom fields appear in the model picker details panel alongside the built-in sovereignty fields. The `title` from the field definition is used as the label — if no definition matches a key, the raw key is shown.
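
The label fallback can be sketched as a simple lookup. This is illustrative Python, not the Studio UI code; `field_label` is a hypothetical helper, and the definitions mirror the `[[sovereignty.custom_fields]]` entries shown earlier.

```python
def field_label(key: str, definitions: list[dict[str, str]]) -> str:
    """Use the defined title for a custom field, falling back to the raw key."""
    titles = {d["key"]: d["title"] for d in definitions}
    return titles.get(key, key)


definitions = [
    {"key": "data_residency", "title": "Data Residency"},
    {"key": "audit_frequency", "title": "Audit Frequency"},
]
print(field_label("data_residency", definitions))
print(field_label("encryption_standard", definitions))
```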

Custom field values are also included in the `/v1/models` response:

```json
{
  "sovereignty": {
    "hq_country": "DE",
    "inference_countries": ["DE"],
    "custom": {
      "data_residency": "EU (Frankfurt)",
      "audit_frequency": "Quarterly",
      "encryption_standard": "AES-256"
    }
  }
}
```

## Complete Example

A multi-provider setup with sovereignty enforcement for a European regulated deployment:

```toml
[providers.anthropic]
type = "anthropic"
api_key = "${ANTHROPIC_API_KEY}"

[providers.anthropic.sovereignty]
hq_country = "US"
inference_countries = ["US"]
certifications = ["soc2", "hipaa-baa"]
trains_on_data = false
data_retention = "30d"
license = "proprietary"

[providers.eu-llm]
type = "open_ai"
base_url = "https://eu-inference.example.com/v1"
api_key = "${EU_LLM_API_KEY}"

[providers.eu-llm.sovereignty]
hq_country = "DE"
inference_countries = ["DE"]
certifications = ["gdpr", "c5", "iso27001", "soc2"]
on_prem = true
trains_on_data = false
data_retention = "none"

[providers.self-hosted]
type = "open_ai"
base_url = "http://vllm.internal:8000/v1"

[providers.self-hosted.sovereignty]
inference_countries = ["DE"]
on_prem = true
trains_on_data = false
data_retention = "none"
license = "apache-2.0"
```

Create an API key that only allows EU-based, GDPR-compliant models:

```bash
curl -X POST http://localhost:8080/admin/v1/api-keys \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -d '{
    "name": "eu-regulated-workload",
    "sovereignty_requirements": {
      "allowed_inference_countries": ["DE", "FR", "NL", "IE"],
      "required_certifications": ["gdpr"],
      "blocked_hq_countries": ["CN", "RU"]
    }
  }'
```

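With this key, requests to `anthropic/claude-sonnet-4-20250514` (US inference, no `gdpr` certification) would be rejected, while `eu-llm/*` models would work. Going by the requirement semantics described above, `self-hosted/*` models satisfy the country constraint but, as configured, declare no certifications, so they would pass `required_certifications` only once `certifications = ["gdpr"]` is added to `[providers.self-hosted.sovereignty]`.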

## Next Steps

- [Provider Configuration](/docs/configuration/providers) — Full provider setup reference
- [Data Privacy & GDPR](/docs/features/data-privacy) — Data export, deletion, and retention
- [Authorization](/docs/features/authorization) — CEL-based access control policies
- [Multi-Tenancy](/docs/features/multi-tenancy) — Organization and project hierarchy
4 changes: 3 additions & 1 deletion docs/content/docs/features/meta.json
@@ -7,11 +7,13 @@
"chat-ui",
"multi-tenancy",
"budgets",
"---Security---",
"---Security & Compliance---",
"sso-admin-guide",
"saml",
"scim",
"authorization",
"data-privacy",
"data-sovereignty",
"---Advanced---",
"knowledge-bases",
"chat-modes",
4 changes: 4 additions & 0 deletions migrations_sqlx/postgres/20250101000000_initial.sql
@@ -562,6 +562,8 @@ CREATE TABLE IF NOT EXISTS api_keys (
-- Per-key rate limit overrides (null = use global defaults)
rate_limit_rpm INTEGER,
rate_limit_tpm INTEGER,
-- Sovereignty requirements (data residency constraints for this key)
sovereignty_requirements JSONB,
-- Status timestamps
revoked_at TIMESTAMPTZ,
expires_at TIMESTAMPTZ,
@@ -604,6 +606,8 @@ CREATE TABLE IF NOT EXISTS dynamic_providers (
config JSONB,
-- Supported models (JSON array)
models JSONB NOT NULL DEFAULT '[]',
-- Sovereignty metadata (data residency, compliance requirements)
sovereignty JSONB,
is_enabled BOOLEAN NOT NULL DEFAULT TRUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
4 changes: 4 additions & 0 deletions migrations_sqlx/sqlite/20250101000000_initial.sql
@@ -521,6 +521,8 @@ CREATE TABLE IF NOT EXISTS api_keys (
-- Per-key rate limit overrides (null = use global defaults)
rate_limit_rpm INTEGER,
rate_limit_tpm INTEGER,
-- Sovereignty requirements (data residency constraints for this key)
sovereignty_requirements TEXT,
-- Status timestamps
revoked_at TEXT,
expires_at TEXT,
@@ -558,6 +560,8 @@ CREATE TABLE IF NOT EXISTS dynamic_providers (
config TEXT,
-- Supported models (JSON array)
models TEXT NOT NULL DEFAULT '[]',
-- Sovereignty metadata (data residency, compliance requirements)
sovereignty TEXT,
is_enabled INTEGER NOT NULL DEFAULT 1,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
updated_at TEXT NOT NULL DEFAULT (datetime('now')),