Spike (v1.5.0): native GraphQL middleware (transport, subscriptions, APQ) #247

@FumingPower3925

Context

Celeris has no GraphQL story today. Unlike #246 (gRPC), GraphQL needs zero core changes — it's a JSON-over-HTTP protocol and every transport piece already exists:

| Transport | Celeris has it |
| --- | --- |
| POST application/json query | c.BindJSON + c.JSON |
| GET ?query=…&variables=… | c.Query family |
| Batched queries ([{…},{…}]) | ✅ same JSON path |
| graphql-transport-ws subscriptions | middleware/websocket (docs already call this out as a canonical subprotocol example) |
| graphql-sse subscriptions | middleware/sse |
| APQ (Automatic Persisted Queries) cache | middleware/store.KV |
| @defer / @stream incremental | StreamWriter (once H2 response trailers land via #246) |
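
To make the POST / GET / batch rows concrete, here is a rough sketch of the decoding step using only the standard library. It deliberately does not use the celeris c.BindJSON / c.Query calls, since their exact signatures are not shown in this issue; the names Request, DecodePOST and DecodeGET are hypothetical and only illustrate the shape.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/url"
)

// Request holds the standard GraphQL-over-HTTP request fields.
// (Hypothetical type name, not an existing celeris type.)
type Request struct {
	Query         string         `json:"query"`
	OperationName string         `json:"operationName,omitempty"`
	Variables     map[string]any `json:"variables,omitempty"`
	Extensions    map[string]any `json:"extensions,omitempty"`
}

// DecodePOST parses an application/json body. A top-level array is treated
// as a batch; anything else is decoded as a single request object.
func DecodePOST(body []byte) ([]Request, bool, error) {
	trimmed := bytes.TrimSpace(body)
	if len(trimmed) == 0 {
		return nil, false, errors.New("empty request body")
	}
	if trimmed[0] == '[' {
		var batch []Request
		if err := json.Unmarshal(trimmed, &batch); err != nil {
			return nil, true, err
		}
		return batch, true, nil
	}
	var single Request
	if err := json.Unmarshal(trimmed, &single); err != nil {
		return nil, false, err
	}
	return []Request{single}, false, nil
}

// DecodeGET parses query-string parameters. variables and extensions arrive
// as JSON-encoded strings, so they need a second decoding step.
func DecodeGET(q url.Values) (Request, error) {
	req := Request{Query: q.Get("query"), OperationName: q.Get("operationName")}
	if v := q.Get("variables"); v != "" {
		if err := json.Unmarshal([]byte(v), &req.Variables); err != nil {
			return Request{}, fmt.Errorf("variables: %w", err)
		}
	}
	if e := q.Get("extensions"); e != "" {
		if err := json.Unmarshal([]byte(e), &req.Extensions); err != nil {
			return Request{}, fmt.Errorf("extensions: %w", err)
		}
	}
	return req, nil
}

func main() {
	reqs, batched, err := DecodePOST([]byte(`[{"query":"{a}"},{"query":"{b}"}]`))
	fmt.Println(len(reqs), batched, err)

	get, err := DecodeGET(url.Values{"query": {"{hello}"}, "variables": {`{"n":1}`}})
	fmt.Println(get.Query, get.Variables, err)
}
```

Returning the batched flag alongside the slice lets the handler answer a one-element batch with an array and a single request with an object.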

Goal: ship middleware/graphql that wires an existing Go executor (user-chosen) into celeris ergonomics, plus the two subscription transports. A from-scratch GraphQL parser + validator + executor is explicitly out of scope for this spike — tracked separately for future evaluation if adoption justifies it.
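
As a starting point for discussion, the executor contract could look roughly like the sketch below. Everything here is an assumption for illustration (the Executor, ExecutorFunc, Request, Response and Error names, and the helloExecutor stand-in); it is not the gqlgen API or an existing celeris type. The Response fields follow the data / errors envelope listed under Scope.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
)

// Request is the transport-neutral input handed to the executor.
type Request struct {
	Query         string         `json:"query"`
	OperationName string         `json:"operationName,omitempty"`
	Variables     map[string]any `json:"variables,omitempty"`
}

// Location points at a position in the query document.
type Location struct {
	Line   int `json:"line"`
	Column int `json:"column"`
}

// Error is one entry of the GraphQL errors[] array.
type Error struct {
	Message    string         `json:"message"`
	Locations  []Location     `json:"locations,omitempty"`
	Path       []any          `json:"path,omitempty"`
	Extensions map[string]any `json:"extensions,omitempty"`
}

// Response is the GraphQL response envelope. With omitempty a nil Data is
// dropped entirely; a real implementation would also need a way to emit
// an explicit "data": null for errors raised during execution.
type Response struct {
	Data   any     `json:"data,omitempty"`
	Errors []Error `json:"errors,omitempty"`
}

// Executor is the hypothetical plug-in point: one adapter per engine.
type Executor interface {
	Execute(ctx context.Context, req Request) Response
}

// ExecutorFunc lets a plain function satisfy Executor.
type ExecutorFunc func(ctx context.Context, req Request) Response

// Execute calls f(ctx, req).
func (f ExecutorFunc) Execute(ctx context.Context, req Request) Response {
	return f(ctx, req)
}

// helloExecutor is a stand-in for a real engine adapter; it only knows {hello}.
var helloExecutor Executor = ExecutorFunc(func(ctx context.Context, req Request) Response {
	if req.Query != "{hello}" {
		return Response{Errors: []Error{{Message: "unsupported query"}}}
	}
	return Response{Data: map[string]string{"hello": "world"}}
})

// runQuery executes a query and returns the JSON-encoded envelope.
func runQuery(ex Executor, query string) string {
	out, err := json.Marshal(ex.Execute(context.Background(), Request{Query: query}))
	if err != nil {
		return ""
	}
	return string(out)
}

func main() {
	fmt.Println(runQuery(helloExecutor, "{hello}")) // {"data":{"hello":"world"}}
	fmt.Println(runQuery(helloExecutor, "{nope}"))  // {"errors":[{"message":"unsupported query"}]}
}
```

Keeping the interface to a single method means schema loading, validation and resolver wiring all stay on the executor's side of the boundary.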

Scope

In scope

  • Package layout: middleware/graphql/ as a separate go.mod submodule (matches middleware/protobuf, keeps the executor dependency off the core go.sum).
  • Transport adapter for HTTP queries + mutations:
    • POST application/json (standard form)
    • POST application/graphql (raw query body)
    • POST multipart/form-data (file uploads, following the GraphQL multipart request spec's operations / map / file fields)
    • GET with query, variables, operationName, extensions
    • Batched queries: [{…},{…}] array at the top level
    • Response envelope: {"data": …, "errors": [{"message": …, "locations": [...], "path": [...], "extensions": {...}}]}
  • Executor-agnostic contract. Define a small interface so users can plug:
    • github.com/99designs/gqlgen
    • github.com/graphql-go/graphql
    • github.com/graph-gophers/graphql-go
    • github.com/wundergraph/graphql-go-tools
      The interface is basically Execute(ctx, request) Response — everything else stays inside the user's executor choice.
  • Subscription transport: graphql-transport-ws
    • Message envelopes: connection_init, connection_ack, ping / pong, subscribe, next, error, complete.
    • Implements the standard timing rules (connection_init timeout, ping/pong keep-alive).
    • Runs on top of middleware/websocket, reuses its backpressure + pause/resume hooks.
    • Graceful legacy graphql-ws (Apollo) fallback as an opt-in flag.
  • Subscription transport: graphql-sse
    • Both single-connection mode (per spec §2) and distinct-connection mode (§3).
    • Runs on top of middleware/sse. Emits next / complete events; disconnect cleanup.
  • APQ via middleware/store.KV: the client sends a SHA-256 hash of the query and the server looks up the full query string in the store. Miss → 200 with a PersistedQueryNotFound error; the client's follow-up POST with query + hash populates the store.
  • GraphiQL / Sandbox / Playground static UI endpoint (optional, opt-in).
  • Error mapping: executor error → GraphQL errors[] entry; celeris-layer errors (body too large, malformed JSON) → matching error shape so clients see a clean envelope.
  • Request-context propagation: celeris c.Context() → executor context.Context (so downstream tracing / auth / request-id middlewares just work).
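
For the graphql-transport-ws item above, the message handling can be sketched as a small state machine that is independent of the socket layer. This is a minimal illustration under assumptions: Session, NewSession and Handle are hypothetical names, middleware/websocket is not involved, and the connection_init timeout (close code 4408) is left out because it needs a timer. The message types and the 4400 / 4401 / 4409 / 4429 close codes come from the graphql-transport-ws protocol document.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is the graphql-transport-ws envelope.
type Message struct {
	ID      string          `json:"id,omitempty"`
	Type    string          `json:"type"`
	Payload json.RawMessage `json:"payload,omitempty"`
}

// Close codes defined by the graphql-transport-ws protocol.
const (
	CloseBadRequest       = 4400 // malformed or unknown message
	CloseUnauthorized     = 4401 // subscribe before the connection was acknowledged
	CloseSubscriberExists = 4409 // duplicate subscription id
	CloseTooManyInits     = 4429 // more than one connection_init
)

// Session tracks per-connection protocol state.
type Session struct {
	initialised bool
	subs        map[string]bool
}

// NewSession returns a Session with no active subscriptions.
func NewSession() *Session { return &Session{subs: map[string]bool{}} }

// Handle processes one client frame. It returns an optional reply and a
// close code; a non-zero code means the socket should be closed.
func (s *Session) Handle(raw []byte) (*Message, int) {
	var m Message
	if err := json.Unmarshal(raw, &m); err != nil {
		return nil, CloseBadRequest
	}
	switch m.Type {
	case "connection_init":
		if s.initialised {
			return nil, CloseTooManyInits
		}
		s.initialised = true
		return &Message{Type: "connection_ack"}, 0
	case "ping":
		return &Message{Type: "pong"}, 0
	case "pong":
		return nil, 0
	case "subscribe":
		if !s.initialised {
			return nil, CloseUnauthorized
		}
		if m.ID == "" {
			return nil, CloseBadRequest
		}
		if s.subs[m.ID] {
			return nil, CloseSubscriberExists
		}
		s.subs[m.ID] = true // a real handler would start the executor stream here
		return nil, 0
	case "complete":
		delete(s.subs, m.ID)
		return nil, 0
	default:
		return nil, CloseBadRequest
	}
}

func main() {
	s := NewSession()
	reply, code := s.Handle([]byte(`{"type":"connection_init"}`))
	fmt.Println(reply.Type, code)
	_, code = s.Handle([]byte(`{"id":"1","type":"subscribe","payload":{"query":"subscription{tick}"}}`))
	fmt.Println(code)
}
```

Keeping this logic free of I/O should make it easy to unit-test separately from the per-engine Detach behaviour.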

Out of scope (explicit non-goals)

  • Dependency-free executor. A from-scratch parser + validator + executor is ~10-15 k LOC, 1-2 engineer-months, and has to re-litigate every edge case of GraphQL's null-propagation + 20+ validation rules. Defer to a follow-up spike once adoption signals justify it.
  • Schema management / codegen. That's the executor's problem, not the middleware's.
  • @defer / @stream incremental delivery over H2 multipart. Blocked on the trailer work in #246 (Spike (v1.5.0): native gRPC support); revisit once that lands.

Conformance / test strategy

  • Transport layer: run the graphql/graphql-http audit suite (~100 checks over GET/POST/headers/status codes/error envelope). Wire into CI.
  • Subscription WS: exercise enisdenjo/graphql-ws's server-side conformance harness.
  • Subscription SSE: exercise enisdenjo/graphql-sse's server-side conformance harness.
  • APQ: unit tests against an in-memory store.KV, plus one integration pass against Redis (middleware/store's existing Redis adapter).
  • Cross-engine: run the above against each engine (iouring, epoll, std) — the middleware should be engine-agnostic but the subscription transports touch Detach, so verify per-engine.
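
The APQ unit tests could exercise a flow along these lines. This is a sketch under stated assumptions: the KV interface below is a hypothetical two-method stand-in, not the real middleware/store.KV signature, and memKV stands in for NewMemoryKV. The hash is the hex-encoded SHA-256 of the query string, as in the Apollo APQ protocol.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// KV is a hypothetical minimal store interface used only for this sketch.
type KV interface {
	Get(key string) (string, bool)
	Set(key, value string)
}

// memKV is an in-memory KV backed by a map.
type memKV map[string]string

func (m memKV) Get(key string) (string, bool) { v, ok := m[key]; return v, ok }
func (m memKV) Set(key, value string)         { m[key] = value }

var (
	// ErrPersistedQueryNotFound is reported to the client inside a 200 response.
	ErrPersistedQueryNotFound = errors.New("PersistedQueryNotFound")
	// ErrHashMismatch guards against a client registering a query under the wrong hash.
	ErrHashMismatch = errors.New("provided sha256Hash does not match query")
)

// HashQuery returns the hex-encoded SHA-256 digest of the query text.
func HashQuery(query string) string {
	sum := sha256.Sum256([]byte(query))
	return hex.EncodeToString(sum[:])
}

// ResolveAPQ returns the query text to execute.
//   - no hash:        plain request, query is passed through
//   - hash only:      look up the stored query, miss yields PersistedQueryNotFound
//   - hash and query: verify the hash, store the query, then execute it
func ResolveAPQ(kv KV, query, hash string) (string, error) {
	if hash == "" {
		return query, nil
	}
	if query == "" {
		stored, ok := kv.Get(hash)
		if !ok {
			return "", ErrPersistedQueryNotFound
		}
		return stored, nil
	}
	if HashQuery(query) != hash {
		return "", ErrHashMismatch
	}
	kv.Set(hash, query)
	return query, nil
}

func main() {
	kv := memKV{}
	hash := HashQuery("{hello}")

	_, err := ResolveAPQ(kv, "", hash)
	fmt.Println("first attempt:", err)

	_, err = ResolveAPQ(kv, "{hello}", hash)
	fmt.Println("registration:", err)

	q, err := ResolveAPQ(kv, "", hash)
	fmt.Println("second attempt:", q, err)
}
```

The same three-step sequence (miss, register, hit) should carry over unchanged to the Redis integration pass, with only the KV implementation swapped.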

Exit criteria

  1. middleware/graphql/ submodule boots with a gqlgen-style executor adapter (pick one for the prototype — gqlgen is the most common so start there) and serves a trivial {hello} query end-to-end.
  2. graphql-http audit passes against the prototype handler (allow ≤ 2 documented exemptions if any).
  3. graphql-ws conformance harness reports green; one working subscription demo against a published event channel.
  4. graphql-sse conformance harness reports green; same subscription demo also over SSE.
  5. APQ hit/miss flow works against both NewMemoryKV and middleware/store/redisstore.
  6. Benchmark: simple query RPS on msr1 vs a baseline net/http + gqlgen server with the same schema. Goal is parity or better — the JSON fast path in celeris should help.
  7. README + godoc showing the three transport wirings.
  8. Follow-up issue filed: "Investigate dependency-free GraphQL executor (size/XL, blocked on adoption data)."

Critical files (likely touched)

  • New submodule: middleware/graphql/{go.mod, http.go, ws.go, sse.go, apq.go, errors.go, doc.go, example_test.go}.
  • No core touches expected. If any surface, flag immediately — that's a scope-creep signal.
  • middleware/websocket + middleware/sse may grow godoc examples but no API change.

Related

  • #246, Spike (v1.5.0): native gRPC support (unary, server/client/bidirectional streaming). Same "does celeris speak real modern-API protocols" theme; the two spikes share the API-ergonomics question but have opposite effort shapes: gRPC is engine-hard, GraphQL is middleware-only.
  • middleware/store unification (v1.5.0) — APQ is a natural, non-trivial consumer of the unified KV.
  • middleware/websocket already ships graphql-transport-ws as a subprotocol example — this spike turns that doc note into an actual, conformance-tested integration.

Labels: area/api (Public-facing API surface), enhancement (New feature or request), size/M (~2-3 days of work)