-
-
Notifications
You must be signed in to change notification settings - Fork 32.3k
gh-135552: Make the GC clear weakrefs later. #136189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Clear the weakrefs to unreachable objects after finalizers are called.
I can confirm this PR fixes the gh-132413 issue as well. |
I think this fixes (or mostly fixes) gh-91636 as well. |
🤖 New build scheduled with the buildbot fleet by @nascheme for commit 12f0b5c 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F136189%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again. |
This introduces refleaks, it seems. One of the leaking tests: |
This is the smallest leaking example I could make so far. Something with
|
Other leaking examples (on Windows): 1. test_logging:import logging
import logging.config
import logging.handlers
from multiprocessing import Queue, Manager
class ConfigDictTest(unittest.TestCase):
def test_multiprocessing_queues_XXX(self):
config = {
'version': 1,
'handlers' : {
'spam' : {
'class': 'logging.handlers.QueueHandler',
'queue': Manager().Queue() , # Leak
# 'queue': Manager().JoinableQueue() # Leak
# 'queue': Queue(), # No leak
},
},
'root': {'handlers': ['spam']}
}
logger = logging.getLogger()
logging.config.dictConfig(config)
while logger.handlers:
h = logger.handlers[0]
logger.removeHandler(h)
h.close() 2. test_interpreters.test_api:import contextlib
import threading
import types
from concurrent import interpreters
def func():
raise Exception('spam!')
@contextlib.contextmanager
def captured_thread_exception():
ctx = types.SimpleNamespace(caught=None)
def excepthook(args):
ctx.caught = args
orig_excepthook = threading.excepthook
threading.excepthook = excepthook
try:
yield ctx
finally:
threading.excepthook = orig_excepthook
class TestInterpreterCall(unittest.TestCase):
def test_call_in_thread_XXX(self):
interp = interpreters.create()
call = (interp._call, interp.call)[1] # 0: No leak, 1: Leak
with captured_thread_exception() as _:
t = threading.Thread(target=call, args=(interp, func, (), {}))
t.start()
t.join() |
The majority (maybe all) of these leaks are caused by the |
This avoids breaking tests while fixing the issue with tp_subclasses. In the long term, it would be better to defer the clear of all weakrefs, not just the ones referring to classes. However, that is a more distruptive change and would seem to have a higher chance of breaking user code. So, it would not be something to do in a bugfix release.
Nope, that doesn't fix all the leaks. And having to explicitly clean the weakrefs from the WeakValueDictionary really shouldn't be needed, I think. The For the purposes of having a fix that we can backport (should probably be backported to all maintained Python versions), a less disruptive fix would be better. To that end, I've changed this PR to only defer clearing weakrefs to class objects. That fixes the |
🤖 New build scheduled with the buildbot fleet by @nascheme for commit 2f3daba 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F136189%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again. |
We need to clear those before executing the callback. Since this ensures they can't run a second time, we don't need _PyGC_SET_FINALIZED(). Revise comments.
Ah, the I revised this PR to be something that is potentially suitable for backporting. To minimize the behaviour change, I'm only deferring the clear of weakrefs that refer to types (in order to allow tp_subclasses to work) or with callbacks. I still have an extra clearing pass that gets done after the finalizers are called. That avoids bugs like #91636. If this gets merged, I think we should consider deferring clearing all weakrefs (without callbacks) until after finalizers are executed. I think that's more correct since it gives the finalizers a better opportunity to do their job. |
* *not* be cleared so that caches based on the type version are correctly | ||
* invalidated (see GH-135552 as a bug caused by this). | ||
*/ | ||
clear_weakrefs(&final_unreachable); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ough! Nice trick with final_unreachable
!
This is a bit trickly because the wrlist can be changing as we iterate over it.
FWIW, current approach for clearing weakrefs immediately before |
Thanks a lot for the PR @nascheme! I have made a pass over the code and couldn't find anything incorrect (apart from the leaks and whatnot) but I would like to do another pass or at least discuss how are are now avoiding resurrection from the callbacks since I assume now is possible that we have callbacks triggering AFTER we handle resurrected objects so I would like to make sure that the bug we fixed in #16687 doesn't come back in some weird form |
It was always the case that finalizers (both del methods and weakref callbacks) can trigger during the What needs to be true is that the unreachable set is not accessible by any finalizer that can run during This kind of change can definitely have some extra scrutiny. The interaction of weakrefs and GC has caused so many bugs already. I don't feel all that confident that this doesn't introduce or re-introduce a bug. Adding some additional unit tests, as Sergey suggests would be a good idea. |
Do you need some hands on this? |
Sure! Adding tests based on the reproducer scripts in gh-135552 would be a good start. |
Should I open a new PR? I'm not sure if I can/might push to this branch. |
Yes, I think a PR just for additional unit tests would be the easiest way to manage it. |
I think this PR, as is, will no longer fix GH-132413. Only weakrefs with callbacks or weakrefs to classes are deferred from clearing. GH-132413 is caused by a weakref to a module object being cleared early. Depending on how serious we think that bug is, we could change this PR back to deferring the clear of all weakrefs without callbacks. Doing that doesn't break many unit tests (just a couple in |
Weakref can be deferred from clearing if a dummy callback is given, right? If so, the following result on the PR is unexpected to me: import itertools as mod
callback = lambda wr: None
def gen():
import weakref
wr = weakref.ref(mod, callback)
print(1, wr())
try:
yield
finally:
print(2, wr())
it = gen()
next(it)
print('//')
I guess gh-132413 could have another backporable workaround like this. |
Fix two bugs related to the interaction of weakrefs and the garbage
collector. The weakrefs in the
tp_subclasses
dictionary are needed inorder to correctly invalidate type caches (for example, by calling
PyType_Modified()
). Clearing weakrefs before calling finalizers causesthe caches to not be correctly invalidated. That can cause crashes since the
caches can refer to invalid objects. This is fixed by deferring the
clearing of weakrefs to classes and without callbacks until after finalizers
are executed.
The second bug is caused by weakrefs created while running finalizers. Those
weakrefs can be outside the set of unreachable garbage and therefore survive
the
delete_garbage()
phase (wheretp_clear()
is called on objects).Those weakrefs can expose to Python-level code objects that have had
tp_clear()
called on them. See GH-91636 as an example of this kind ofbug. This is fixed be clearing all weakrefs to unreachable objects after
finalizers have been executed.
datetime.timedelta
(possibly from datetime's Cdelta_new
) #132413