fix(sdk): regenerate service.instance.id post-fork in MeterProvider and TracerProvider#5000
fix(sdk): regenerate service.instance.id post-fork in MeterProvider and TracerProvider#5000sterchelen wants to merge 2 commits intoopen-telemetry:mainfrom
Conversation
| This hook runs post-fork in each worker and replaces service.instance.id | ||
| with a fresh UUID, ensuring each worker is a distinct instance. | ||
| """ | ||
| self._sdk_config.resource = self._sdk_config.resource.merge( |
There was a problem hiding this comment.
We are not setting this in the resource attributes, also not sure the issue is metrics specific.
There was a problem hiding this comment.
We are not setting this in the resource attributes
Not sure to get your point but if the user didn't set it, workers would be completely indistinguishable in the backend after fork, which is worse than having no ID at all. By always generating one post-fork we ensure every worker has a distinct identity regardless of whether the user configured it upfront.
On the second point: agreed, the problem affects both metrics and traces 👍🏼
There was a problem hiding this comment.
My point is that if we have a solution that work fine after forks() since the semantic conventions is now stable we can add this to the default resource detector. This way we have the very same attributes before and after fork.
63676db to
70b6fff
Compare
The Python SDK did not auto-generate service.instance.id, unlike the Java SDK and the stable semantic convention recommendation. Add it to _DEFAULT_RESOURCE so every process gets a unique instance identity at startup without any user configuration.
…nd TracerProvider When a prefork server (e.g. gunicorn) forks workers, all workers inherit the same Resource from the master process, including the same service.instance.id. Register an os.register_at_fork(after_in_child=...) hook on both MeterProvider and TracerProvider that replaces service.instance.id with a fresh UUID in each forked worker, ensuring distinct resource identities without any user configuration. Resource.merge() preserves all other resource attributes. WeakMethod is used for the hook reference, consistent with the existing pattern in PeriodicExportingMetricReader and BatchSpanProcessor. Fixes: open-telemetry#4390 Related: open-telemetry#3885
70b6fff to
3256bf0
Compare
Description
When a prefork server (e.g. gunicorn) forks workers, all workers inherit the same
Resourcefrom the master process — including the sameservice.instance.id. The SDK already restarts background threads post-fork (PeriodicExportingMetricReader,BatchSpanProcessor) but never updates the resource identity. This causes metric and trace collisions in OTLP backends where multiple workers exporting with the same resource identity result in incorrect aggregation (last-write-wins) instead of correct summation.This PR registers an
os.register_at_fork(after_in_child=...)hook on bothMeterProviderandTracerProviderthat replacesservice.instance.idwith a fresh UUID in each forked worker. All other resource attributes are preserved viaResource.merge().WeakMethodis used for the hook reference, consistent with the existing pattern inPeriodicExportingMetricReaderandBatchSpanProcessor.Fixes #4390
Related: #3885
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Two new test files covering both providers:
tests/metrics/test_meter_provider_fork.pytests/trace/test_tracer_provider_fork.pyEach covers:
_at_fork_reinitassigns a newservice.instance.idOther resource attributes are preserved
A real fork() produces a distinct ID in the child vs the parent
4 concurrent forks each produce a unique ID
Run with:
pytest opentelemetry-sdk/tests/metrics/test_meter_provider_fork.pypytest opentelemetry-sdk/tests/trace/test_tracer_provider_fork.pyDoes This PR Require a Contrib Repo Change?
Checklist: