Skip to content

feat(004): decouple API from Git storage via gRPC git service#122

Merged
juliuskrah merged 22 commits into
mainfrom
004-grpc-git-service
May 8, 2026
Merged

feat(004): decouple API from Git storage via gRPC git service#122
juliuskrah merged 22 commits into
mainfrom
004-grpc-git-service

Conversation

@juliuskrah
Copy link
Copy Markdown
Contributor

Summary

  • Introduces a versioned gRPC contract (gitstore.git.v1) between gitstore-api and gitstore-git-service, eliminating all direct filesystem and git-library access from the API
  • Replaces the shared-volume mount (git-data:/data/repos:ro) with a gRPC client (GITSTORE_GIT_GRPC) — API now holds zero repository state
  • Covers concurrent write isolation, catalogue-reload coalescing, WebSocket-triggered reload, per-RPC Prometheus metrics, and testcontainers-based integration tests in CI

Closes #65

Changes

gitstore-git-service (Rust)

  • Implemented all 8 gRPC methods: GetFile, GetFileStream, ListFiles, CommitFile, DeleteFile, CreateTag, ListTags, GetLatestTag
  • Added per-RPC Prometheus metrics (request count, latency histogram, error rate) via Tower middleware
  • Upgraded tonic dependency family to 0.14.x (tonic, tonic-prost, tonic-prost-build); added protobuf-dev to Docker builder stage
  • Fixed clippy::result_large_err and clippy::ptr_arg warnings

gitstore-api (Go)

  • Added gitclient.Client (gRPC client wrapper with retry and connection management)
  • Added catalog.GRPCLoader implementing CatalogLoader interface — replaces go-git backed Loader
  • Deleted all internal/gitclient git-library files (commit.go, push.go, pool.go, tag.go, etc.) and the dead internal/graph/mutations.go
  • Removed shared-volume mount from API service in compose.yml
  • Integration tests (testcontainers) covering: read, write, concurrent mutations, error paths, catalogue reload coalescing

Infrastructure & Docs

  • compose.yml: removed git-data:/data/repos:ro volume from API service; added GITSTORE_GIT_GRPC env var
  • .github/workflows/ci.yml: added grpc-integration-test job (builds git-service image, runs go test -tags integration)
  • docs/architecture.md: added Service Boundary section with env-var table
  • docs/developer-guide.md: replaced GITSTORE_GIT_REPO with GITSTORE_GIT_GRPC
  • specs/004-grpc-git-service/spec.md: marked status as Done

Test plan

  • cargo test --verbose in gitstore-git-service — unit tests pass
  • go test -v -race ./... in gitstore-api — all unit tests pass (service mutations, cache coalescing)
  • go test -tags integration -v -timeout 5m ./tests/integration/... — all integration tests pass against real git-service container
  • docker compose -f compose.yml up -d --build — smoke test with no shared volume mount on API service
  • Prometheus scrape of both services exposes per-RPC metrics for all 8 contract methods

🤖 Generated with Claude Code

juliuskrah and others added 11 commits May 6, 2026 15:42
Adds Spec Kit skill definitions for both .claude and .agents, along
with .specify extension configuration, Git workflow scripts, and
integration manifests. Updates AGENTS.md with the Spec Kit context
pointer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… plan

Adds full Spec Kit artifacts for GH#65 — decouple gitstore-api from git
storage via gRPC service contract:

- spec.md: feature specification superseding 002 User Story 2 / T152
- plan.md: implementation plan with technical context and phase structure
- research.md: toolchain decisions (buf, tonic, go-grpc, testcontainers)
- data-model.md: gRPC message shapes and state transitions
- contracts/gitstore.git.v1.proto: versioned service contract (8 RPCs)
- quickstart.md: developer setup and manual testing guide
- tasks.md: 59 ordered tasks across 7 phases, test-first per constitution

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase 1 — Proto toolchain:
- Add canonical gitstore.git.v1 proto to shared/proto/
- Add buf.yaml for lint/breaking-change enforcement
- Add buf.gen.go.yaml and buf.gen.rust.yaml templates
- Commit generated Go stubs (gitstore-api/gen/) via protoc
- Add proto-breaking CI job to .github/workflows/ci.yml

Phase 2 — Foundational gRPC skeleton:
- Add tonic/prost/prometheus/tonic-build to gitstore-git-service/Cargo.toml
- Add build.rs compiling shared proto via tonic_build
- Add grpc module (server.rs, metrics.rs) — all RPCs return UNIMPLEMENTED
- Bind gRPC server on GITSTORE_GRPC_PORT (default 50051) in main.rs
- Add grpc/protobuf/go-grpc-prometheus deps to gitstore-api/go.mod
- Add gitclient.Client wrapping GitServiceClient with prometheus interceptors

Both services compile cleanly; gRPC port bound at startup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e (Phase 3)

Read RPCs implemented in gitstore-git-service (Rust):
- GetFile: resolve ref → tree, return blob bytes
- GetFileStream: 256 KiB chunked streaming RPC
- ListFiles: walk tree under path_prefix, return FileEntry list
- GetLatestTag: semver sort, found=true/false response
- ListTags: prefix filter over all refs/tags/

gRPC read client for gitstore-api (Go):
- gitclient/read.go: ReadFile, ListFiles, GetLatestTag, ListTags
- gitclient.NewClientFromConn for testability (bufconn)

Catalogue loader rewritten to use gRPC:
- catalog/grpc_loader.go: GRPCLoader backed by GitReader interface
- catalog/loader_iface.go: CatalogLoader interface (LoadFromTag, LoadFromLatestTag)
- catalog/loader.go deleted (go-git backed loader removed from catalog package)
- cache/manager.go: accepts CatalogLoader interface, reloads via LoadFromLatestTag

Main entrypoint:
- cmd/server/main.go: GITSTORE_GIT_GRPC replaces GITSTORE_GIT_REPO
- No shared volume mount or local repo path required at startup

compose.yml:
- API service: removed volume mount, added GITSTORE_GIT_GRPC=git-service:50051
- git-service: added port 50051 and GITSTORE_GRPC_PORT=50051

Tests:
- gitclient/read_test.go: bufconn unit tests (7 cases, all pass)
- catalog/loader_test.go: mock GitReader tests (5 cases, all pass)
- tests/integration/grpc_catalogue_load_test.go: testcontainers integration test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…4 US2)

- Add CommitFile, DeleteFile, CreateTag RPC implementations to git-service
  (Rust/tonic): temp-clone → modify → commit → push back to bare repo
- Add gitclient write.go: CommitFile, DeleteFile, CreateTag client wrappers
- Add gitclient write_test.go: 6 bufconn-based unit tests
- Update graph/service.go: introduce GitWriter interface; replace go-git
  CommitBuilder/PushClient with gRPC calls throughout all product mutations
- Update graph/resolver.go and cmd/server/main.go: pass gRPC client as GitWriter
- Add graph/service_mutations_test.go: 5 unit tests with mockGitWriter
- Add tests/integration/grpc_mutations_test.go: concurrent CommitFile/DeleteFile,
  CreateTag, and no-filesystem-artefacts assertions (testcontainers-go, tag: integration)
- Delete go-git-backed gitclient/{commit,push,pool,tag,http_client,writer}.go
  and their test files — replaced entirely by gRPC client
- Delete dead-code graph/mutations.go and all three legacy mutation test files
- Run go mod tidy: go-git and all its transitive deps removed from go.sum

API now holds zero git state on disk; all write operations are atomic gRPC calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…Phases 5-7)

Phase 5 (US3 - WS-triggered reload):
- Add cache/manager_test.go: unit tests for coalescing and gRPC-backed reload
- Add tests/integration/grpc_reload_test.go: tag push → WS invalidate → gRPC
  reload via testcontainers-go; verifies no shared volume needed
- Confirm cache.Manager already provides coalescing via double-checked locking;
  websocket/client.go requires no changes

Phase 6 (US4 - contract enforcement):
- Add tests/integration/grpc_errors_test.go: file not found, ref not found,
  delete nonexistent, duplicate tag, empty-repo GetLatestTag error paths
- Add tests/integration/grpc_concurrency_test.go: 10 concurrent CommitFile calls
  all produce distinct SHAs; mixed CommitFile+DeleteFile concurrency
- CI ci.yml: add grpc-integration-test job (testcontainers-go, builds image
  from source, runs go test -tags integration -timeout 5m); T048 buf breaking
  check already present
- testutil/graphql.go: add StartGRPCStub helper for in-process bufconn stubs

Phase 7 (Polish):
- Rust clippy: add #![allow(clippy::result_large_err)] for tonic Status
  return types; fix &PathBuf → &Path in verify_repo
- Go: staticcheck + go vet clean; AGPL headers on all new files verified
- go mod tidy: go-git absent from go.sum
- docs/architecture.md: add Service Boundary section with env var table
- docs/developer-guide.md: replace GITSTORE_GIT_REPO with GITSTORE_GIT_GRPC

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Bump all tonic-related crates to match the 0.14 family:
- tonic 0.14.6 (was 0.12)
- tonic-prost 0.14 (new runtime codec crate, replaces codec::ProstCodec in tonic 0.12)
- tonic-prost-build 0.14 (build-dep, replaces tonic-build 0.12; API: tonic_prost_build::configure())
- tonic-health 0.14 (was 0.12)
- prost 0.14.3 (was 0.13)
- prost-types 0.14.3 (was 0.13)
- prometheus 0.14.0 (already bumped by user)

Update build.rs: tonic_build::configure() → tonic_prost_build::configure()

Run cargo fmt --all to fix inline-block style in server.rs.
All 39 unit tests pass; cargo clippy and cargo fmt --check are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…build 0.14

tonic-prost-build 0.14 uses prost-build which requires an external protoc
binary (unlike tonic-build 0.12 which bundled protoc-bin-vendored).

- apk add protobuf-dev to provide protoc in the builder stage
- Copy build.rs before the dep-caching step so proto stubs are generated
  with the current tonic version during the dependency build
- Copy shared/proto to /shared/proto (matching ../shared/proto relative path
  in build.rs) before the dep-caching step
- Touch build.rs in the final build step and remove stale build-script
  output dirs to force proto regeneration when the cargo cache has old stubs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 76aa9b1098

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread gitstore-git-service/src/grpc/server.rs Outdated
Comment thread gitstore-git-service/src/grpc/server.rs Outdated
juliuskrah and others added 4 commits May 7, 2026 23:03
- gofmt -s on grpc_catalogue_load_test.go and testutil/graphql.go
- buf-action@v1 requires an explicit version, not "latest"; pin to 1.69.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… breaking check

- tonic-prost-build 0.14 requires external protoc; add apt-get install protobuf-compiler to rust-test job
- buf breaking --against '.git#branch=main' needs full history; set fetch-depth: 0 on proto-breaking checkout

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n tests

- buf-action@v1 auto-runs 'buf build' from repo root and double-discovers
  the module; add setup_only: true and invoke buf commands manually
- Docker build fails with 'cannot find -lz' on musl static link; add
  zlib-dev and zlib-static to Alpine builder stage
- Fix grpc-integration-test Docker build: Dockerfile is at
  docker/git-service.Dockerfile, not gitstore-git-service/
- Gate integration-test and grpc-integration-test behind 'integration'
  label, main push, or merge_group so they don't run on every PR commit
- build-status treats skipped integration jobs as passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extract integration-test and grpc-integration-test into a dedicated
ci-integration.yml that only triggers when gitstore-api, gitstore-git-service,
shared/proto, the git-service Dockerfile, or compose.yml change — mirroring
the pattern used by ci-admin.yml.

- Fix Docker build command to use docker/git-service.Dockerfile
- Remove integration jobs and their skipped-job workaround from ci.yml
- build-status no longer depends on the integration jobs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread .github/workflows/ci-integration.yml Fixed
Comment thread .github/workflows/ci-integration.yml Fixed
juliuskrah and others added 3 commits May 7, 2026 23:44
- Introduce GetFileStreamRequest (own message for GetFileStream RPC)
  and GetFileStreamResponse (replaces FileChunk) to satisfy buf STANDARD
  lint rules: each RPC must use unique request/response type names.
- Replace DEFAULT with STANDARD in shared/proto/buf.yaml to resolve
  the deprecation warning.
- Update gitstore-git-service/src/grpc/server.rs to reference the new
  message types (GetFileStreamRequest, GetFileStreamResponse).
- Regenerate gitstore-api/gen protobuf Go files via protoc to reflect
  the new message names.
- Add docker/api.Dockerfile to ci-integration.yml paths so the
  integration-test job triggers when the API Dockerfile changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Run cargo fmt on server.rs to wrap the long GetFileStreamStream type
  alias so it passes the `cargo fmt --all -- --check` gate.
- Make the `buf breaking` step conditional: skip when the proto file
  does not yet exist on main (first-introduction branch). This prevents
  the "couldn't find remote ref main" failure when the proto is new.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extract proto lint and breaking change check into ci-proto.yml which only
triggers when shared/proto/** changes — same pattern as ci-admin.yml and
ci-integration.yml. Remove the job and its dependency from ci.yml.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread .github/workflows/ci-proto.yml Fixed
Three bugs in the git-service caused all write integration tests to fail
against a fresh container (empty bare repository):

1. commit_file called get_head_commit unconditionally — fails when no
   commits exist yet. Use .ok() and pass empty parents for first commit,
   then explicitly create refs/heads/main so the push refspec resolves.

2. init_or_open_repository initialised bare repos with HEAD pointing to
   refs/heads/master (git2 default). Set initial_head("main") so HEAD
   resolves correctly after the first push to refs/heads/main.

3. wait.ForLog("gRPC server listening") never matched — the server logs
   "gRPC server starting on" before serve() is called. Switch to
   wait.ForListeningPort("50051/tcp") which probes the TCP port directly.
   All 21 integration tests now pass locally in ~3 minutes; keep 5m timeout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@juliuskrah
Copy link
Copy Markdown
Contributor Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5d5d0f6c79

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread gitstore-git-service/src/grpc/server.rs
Comment thread gitstore-git-service/src/grpc/server.rs
juliuskrah and others added 3 commits May 8, 2026 10:00
- Validate repo paths in CommitFile and DeleteFile to reject absolute
  paths and components containing '..', preventing path traversal
- Honor the recursive flag in ListFiles: when false, only direct
  children of the requested prefix are returned
- Add permissions: {} to ci-integration.yml and ci-proto.yml to
  satisfy GitHub Advanced Security least-privilege findings

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CodeQL requires each job to declare an explicit permissions block in
addition to the workflow-level default. Add `permissions: contents: read`
to each job in ci-integration.yml and ci-proto.yml.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@juliuskrah
Copy link
Copy Markdown
Contributor Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ea4b25b6fd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


// Load catalog from HEAD (includes uncommitted work)
newCatalog, err := m.loader.LoadFromHEAD(ctx)
newCatalog, err := m.loader.LoadFromLatestTag(ctx)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Reload cache from HEAD-equivalent after write commits

Switching Reload to LoadFromLatestTag causes post-mutation reloads to discard just-written commits unless a new release tag is created for every write. CreateProduct/UpdateProduct/DeleteProduct call cacheManager.Reload immediately after CommitFile/DeleteFile, so the cache can revert to the previous tagged snapshot and newly written entities become invisible (and follow-up operations can fail with not-found) until a later tag is published.

Useful? React with 👍 / 👎.

func (l *GRPCLoader) LoadFromTag(ctx context.Context, tag string) (*Catalog, error) {
l.logger.Info("loading catalogue from tag via gRPC", zap.String("tag", tag))

cat := NewCatalog("", tag)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve commit SHA when constructing loaded catalog

The gRPC loader initializes catalogs with an empty commit (NewCatalog("", tag)), even though commit identity is available from the gRPC read path. This regresses metadata fidelity: catalogVersion.commit and related logs no longer identify the loaded revision, which breaks commit-based traceability and any client logic expecting a concrete commit hash.

Useful? React with 👍 / 👎.

}
files.push(FileEntry {
path: full_path,
size_bytes: 0,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Return real blob sizes in ListFiles responses

Each FileEntry is emitted with size_bytes: 0, which violates the contract field semantics and makes size-aware client behavior incorrect (for example, choosing unary vs streaming reads based on file size). The server already has access to the tree entry/blob metadata, so this should return actual byte counts instead of a constant zero.

Useful? React with 👍 / 👎.

Comment on lines +522 to +529
let tag_oid = repo
.tag(&req.tag_name, &target_obj, &sig, &req.message, false)
.map_err(|e| Status::internal(format!("tag: {}", e)))?;

Ok(Response::new(CreateTagResponse {
tag_name: req.tag_name,
tag_sha: tag_oid.to_string(),
}))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Emit release notifications when creating tags via gRPC

The CreateTag RPC mutates repository tags but does not publish any websocket event. API replicas currently rely on websocket git events to invalidate and reload catalog state, so tags created through this RPC will not trigger cross-replica refresh and readers can continue serving stale data until TTL expiry or an unrelated event occurs.

Useful? React with 👍 / 👎.

@juliuskrah juliuskrah added this pull request to the merge queue May 8, 2026
Merged via the queue into main with commit 10cb14c May 8, 2026
21 of 22 checks passed
@juliuskrah juliuskrah deleted the 004-grpc-git-service branch May 8, 2026 08:40
@github-project-automation github-project-automation Bot moved this from Todo to Done in GitStore Roadmap May 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[Initiative] Decouple API from Git Storage via gRPC Git Service

2 participants