gh-142048: Fix quadratically increasing GC delays#142051
gh-142048: Fix quadratically increasing GC delays#142051colesbury merged 8 commits intopython:mainfrom
Conversation
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
|
Thinking about this solution a little more, perhaps an alternative strategy that's more aligned with the original intent would be to only update the global counter to be non-negative, allowing the local ones to be negative. I'm not sure how to do that with the atomics though. I think I have a better intuition about the core bug though: the compounding happens because garbage collection first resets the global count to zero, then frees objects. Each freed object calls |
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
Adjust allocation count handling to prevent negative values and update GC state accordingly.
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
Co-authored-by: Sam Gross <colesbury@gmail.com>
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
|
Please add a NEWS blurb. You can either use the blurb-it webapp mentioned above or the blurb command. (I find the blurb command via |
Python/pystate.c
Outdated
| #ifdef Py_GIL_DISABLED | ||
| // Flush the thread's local GC allocation count to the global count | ||
| // before the thread state is deleted, otherwise the count is lost. | ||
| _Py_atomic_add_int(&tstate->interp->gc.young.count, | ||
| (int)tstate_impl->gc.alloc_count); | ||
| tstate_impl->gc.alloc_count = 0; | ||
| #endif | ||
|
|
There was a problem hiding this comment.
Let's remove this for now. It's not essential for the fix and we can add it back correctly in a subsequent PR.
This is causing a thread sanitizer failure. You can see it in the job if you expand the "Display logs" section:
WARNING: ThreadSanitizer: data race (pid=19505)
Atomic write of size 4 at 0x55bb91207c44 by thread T52:
#0 _Py_atomic_add_int /home/runner/work/cpython/cpython/./Include/cpython/pyatomic_gcc.h:15:10 (python+0x63520e) (BuildId: c37d32af53b8782ca7323d5cb2febaf42023f6ab)
#1 tstate_delete_common /home/runner/work/cpython/cpython/Python/pystate.c:1816:5 (python+0x63520e)
#2 _PyThreadState_DeleteCurrent /home/runner/work/cpython/cpython/Python/pystate.c:1893:5 (python+0x635582) (BuildId: c37d32af53b8782ca7323d5cb2febaf42023f6ab)
...
Previous write of size 4 at 0x55bb91207c44 by main thread:
#0 gc_collect_internal /home/runner/work/cpython/cpython/Python/gc_free_threading.c:2240:33 (python+0x5940bb) (BuildId: c37d32af53b8782ca7323d5cb2febaf42023f6ab)
#1 gc_collect_main /home/runner/work/cpython/cpython/Python/gc_free_threading.c:2414:5 (python+0x59199c) (BuildId: c37d32af53b8782ca7323d5cb2febaf42023f6ab)
#2 _PyGC_Collect /home/runner/work/cpython/cpython/Python/gc_free_threading.c:2723:12 (python+0x591cb7) (BuildId: c37d32af53b8782ca7323d5cb2febaf42023f6ab)
...
The problem is that gc_collect_internal reads this during a stop the world pause, but this thread is no longer considered active (i.e., we have a "detached" thread state that's getting deleted).
We could move this to PyThreadState_Clear(), but let's just not include it for now. I'd like to keep the critical fix small for backporting and "losing" a few local counts probably doesn't matter.
Removed flushing of thread's local GC allocation count before thread state deletion.
|
Thanks @kevmo314 for the PR, and @colesbury for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13, 3.14. |
) The GC for the free threaded build would get slower with each collection due to effectively double counting objects freed by the GC. (cherry picked from commit eb89286) Co-authored-by: Kevin Wang <kevmo314@gmail.com>
|
Sorry, @kevmo314 and @colesbury, I could not cleanly backport this to |
|
GH-142166 is a backport of this pull request to the 3.14 branch. |
…ngh-142051) The GC for the free threaded build would get slower with each collection due to effectively double counting objects freed by the GC. (cherry picked from commit eb89286) Co-authored-by: Kevin Wang <kevmo314@gmail.com>
|
GH-142167 is a backport of this pull request to the 3.13 branch. |
This comment was marked as resolved.
This comment was marked as resolved.
|
I think the buildbot failure is unrelated. The change only affects the free threaded build and the buildbot isn't for free threaded Python. |
Fixes quadratically increasing GC delays in the free-threaded build. This is done by updating the local to global allocation count propagation to require the global count to be non-negative. This matches the existing behavior in non-free-threaded
gc.c. Additionally, adds a flush that I found was causing allocations to be lost.