Skip to content

added thread lock to avoid conflicts between GC and parameter#175

Merged
rozyczko merged 6 commits intodevelopfrom
dict_size_changed_fix
Dec 17, 2025
Merged

added thread lock to avoid conflicts between GC and parameter#175
rozyczko merged 6 commits intodevelopfrom
dict_size_changed_fix

Conversation

@rozyczko
Copy link
Copy Markdown
Member

Fixed the RuntimeError: dictionary changed size during iteration issue in the vertices method by implementing thread-safety: iterations handling modifications during garbage collection.
This seems to have been a concurrency problem with weak references, where the dictionary is being modified during iteration.

Take the attached script (.txt -> .py cause GitHub)
run_test_suite.txt

Run it in say, EasyReflectometryLib or corelib

This now runs fine.

Root Cause

The Map class uses weakref.WeakValueDictionary() to store object references. When:

  1. generate_unique_name() is called (e.g., when creating a new Parameter)
  2. It calls self.map.vertices() which returns list(self._store.keys())
  3. During the iteration over .keys(), the garbage collector may run
  4. GC cleans up objects with no remaining strong references
  5. This removes entries from the WeakValueDictionary
  6. Python raises RuntimeError: dictionary changed size during iteration

This is particularly likely to happen when:

  • Many temporary Parameter objects are created (e.g., in Lattice.volume property)
  • The GC runs during the iteration
  • Multiple objects are being created/destroyed in quick succession

@rozyczko rozyczko marked this pull request as ready for review December 13, 2025 10:41
Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pull request does not contain a valid label. Please add one of the following labels: ['[scope] bug', '[scope] enhancement', '[scope] documentation', '[scope] significant', '[scope] maintenance']

@rozyczko rozyczko added [scope] bug Bug report or fix (major.minor.PATCH) [priority] highest Urgent. Needs attention ASAP labels Dec 13, 2025
@codecov
Copy link
Copy Markdown

codecov bot commented Dec 13, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.15%. Comparing base (29e696e) to head (5b6a217).
⚠️ Report is 1 commits behind head on develop.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #175      +/-   ##
===========================================
+ Coverage    80.80%   81.15%   +0.35%     
===========================================
  Files           52       52              
  Lines         4261     4267       +6     
  Branches       739      740       +1     
===========================================
+ Hits          3443     3463      +20     
+ Misses         633      624       -9     
+ Partials       185      180       -5     
Flag Coverage Δ
unittests 81.15% <100.00%> (+0.35%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
src/easyscience/global_object/map.py 96.92% <100.00%> (+7.50%) ⬆️

@rozyczko
Copy link
Copy Markdown
Member Author

Nooope. Going back to the thread lock.

gc.disable() affects the GLOBAL garbage collector state across the whole process (all threads).
We use corelib in libraries (same thread) and in GUIs (in a separate - non-GUI thread).
Disabling GC in a GUI thread can have unintented consequences, the least being slowing down due to memory buildup.

@damskii9992
Copy link
Copy Markdown
Contributor

gc.disable() affects the GLOBAL garbage collector state across the whole process (all threads). We use corelib in libraries (same thread) and in GUIs (in a separate - non-GUI thread). Disabling GC in a GUI thread can have unintented consequences, the least being slowing down due to memory buildup.

Did you encounter issues in the application when you tried it out? It is only a (VERY) temporary disabling of GC, so memory buildup shouldn't occur.

@rozyczko
Copy link
Copy Markdown
Member Author

gc.disable() affects the GLOBAL garbage collector state across the whole process (all threads). We use corelib in libraries (same thread) and in GUIs (in a separate - non-GUI thread). Disabling GC in a GUI thread can have unintented consequences, the least being slowing down due to memory buildup.

Did you encounter issues in the application when you tried it out? It is only a (VERY) temporary disabling of GC, so memory buildup shouldn't occur.

Doesn't matter. This is still a valid issue - we shouldn't do things in the library, which can globally affect behaviour of the code linked to it?

with self._lock:
if item_id in self._store.keys():
return self._store[item_id]
raise ValueError('Item not in map.')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to avoid this entire error if we just use normal introspection of the dictionary rather than first get the keys:

if item_id in self_store:
   return self._store[item_id]

Then we're not iterating over values and thus don't need this (it should also be substantially faster).

Copy link
Copy Markdown
Member Author

@rozyczko rozyczko Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'''
• Key existence vs. value availability:
item_id in self._store checks only if the key is present. If the value has been collected, the key will also be removed, so you won’t get a dangling key.
• Iteration safety:
Iterating over self._store.keys() or items() is safe, but note that the contents may shrink during iteration if garbage collection occurs.
• Direct access:
self._store[item_id] will raise KeyError if the object was collected.
'''

So even with direct query we will get the same issue.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually you might be right.
Querying the dict directly with the KeyError except check should be enough. Let's check this with the test script

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, was that a ChatGPT prompt output? It obviously hallucinates. I literally said that its safe to iterate over self._store.keys(), which is what is causing our error.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is what you get when you take AI verbatim. Some good points but all covered in nonsense.

Uses a threading lock to prevent RuntimeError from WeakValueDictionary.
"""
with self._lock:
return vertex.unique_name in self._store.keys()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, we can avoid the lock and this issue if we don't call the keys() method. Just use regular inspection.

self.__type_dict[name].type = obj_type
with self._lock:
name = obj.unique_name
if name in self._store.keys():
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here

del self.__type_dict[vertex1][self.__type_dict[vertex1].index(vertex2)]

def prune(self, key: str):
if key in self.__type_dict.keys():
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And again here.

for vertex in self.vertices():
self.prune(vertex)
gc.collect()
self.__type_dict = {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, can't we get rid of this method and just use the built-in clear method of a regular dictionary?

@damskii9992 damskii9992 linked an issue Dec 17, 2025 that may be closed by this pull request
@damskii9992
Copy link
Copy Markdown
Contributor

I really like the retry look implementation for the vertices method. If this works I'd say its' good to merge :)
Although it looks like codecov wants more tests.

@rozyczko rozyczko merged commit 6e823b2 into develop Dec 17, 2025
35 checks passed
@rozyczko rozyczko deleted the dict_size_changed_fix branch December 17, 2025 13:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

[priority] highest Urgent. Needs attention ASAP [scope] bug Bug report or fix (major.minor.PATCH)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RuntimeError: dictionary changed size during iteration

3 participants