Skip to content

Go SDK: stop logging fetched edge workloads verbatim#68355

Open
drewrukin wants to merge 2 commits into
apache:mainfrom
drewrukin:ss-290e-redact-edge-worker-logs
Open

Go SDK: stop logging fetched edge workloads verbatim#68355
drewrukin wants to merge 2 commits into
apache:mainfrom
drewrukin:ss-290e-redact-edge-worker-logs

Conversation

@drewrukin

Copy link
Copy Markdown

This changes the Go edge worker to stop logging fetched jobs as raw structs in fetchJob.

Instead of logging the full API response and decoded workload, the worker now logs only non-sensitive execution identifiers and metadata. This keeps the fetch log useful for debugging while avoiding accidental disclosure of token- bearing workload fields.

A regression test was added to verify that fetchJob still preserves the workload token for execution, but does not emit it to logs.

Was generative AI tooling used to co-author this PR?
  • Yes (Codex)

Generated-by: Codex following the guidelines

@boring-cyborg

boring-cyborg Bot commented Jun 10, 2026

Copy link
Copy Markdown

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our prek-hooks will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example Dag that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

@potiuk

potiuk commented Jun 12, 2026

Copy link
Copy Markdown
Member

Nice cleanup — the structured buildFetchedJobLogAttrs is the right approach, and the regression test asserting the token is still returned but never logged is exactly what this needs. 👍

One thing I think this misses: there's a third site that still logs the workload verbatim, at Debug level —

// go-sdk/edge/worker.go
case workload := <-workloadsChan:
    w.logger.Debug("Got allocation", "workload", workload)

workload.ExecuteTaskWorkload carries the Token, and ExecuteTaskWorkload has no LogValue(), so with debug logging enabled the token still ends up in the logs there. Given the PR title is "stop logging fetched edge workloads verbatim", it'd be good to cover this one too.

Two options:

  1. Log only safe identifiers here as well (same shape as the new helper), or
  2. Add a redacting slog.LogValuer (LogValue()) on ExecuteTaskWorkload that masks Token — so any current or future log site that logs the workload is safe by construction, not just the ones we remember to fix.

I'd lean toward (2) as the more durable fix, optionally combined with (1) for this site. Minor nit: buildFetchedJobLogAttrs reconstructs an EdgeJobFetched from resp just to read a few fields — you could pass resp directly.

Thanks for the careful work on this!

@ashb ashb left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use https://pkg.go.dev/log/slog?ref=blog.arcjet.com#LogValuer -- it's built in to slog.

@drewrukin

Copy link
Copy Markdown
Author

@ashb @potiuk thanks for the careful review!

Addressed in c3d96ef. I updated the worker logging to use slog.LogValuer-backed safe wrappers, covered the remaining debug-level Got allocation path, and extended the regression test so it now checks that the token is not logged in either the fetch or allocation paths.

@potiuk potiuk requested a review from ashb June 17, 2026 11:56
@drewrukin drewrukin force-pushed the ss-290e-redact-edge-worker-logs branch from c3d96ef to ee4d12a Compare June 17, 2026 13:58
@potiuk potiuk modified the milestone: Airflow 3.3.0 Jun 21, 2026
@eladkal eladkal added this to the Go SDK 1.0 milestone Jun 21, 2026
@potiuk potiuk added the ready for maintainer review Set after triaging when all criteria pass. label Jun 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:go-sdk ready for maintainer review Set after triaging when all criteria pass.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants