-
-
Notifications
You must be signed in to change notification settings - Fork 34.3k
Description
Bug report
Bug description:
The following shows a bug in OrderedDict when keys are mutated and a resize is triggered. This same bug does not exist with dict. It may not be entirely valid to change a key of an ordered collection, but the inconsistency between OrderedDict and dict seems to be a real problem.
full disclosure: I was investigating a strange conan2 failure using copilot (Claude). It eventually tracked down an issue with OrderedDict and created this standalone example. I will also submit a bug to the Conan project. I've tested locally that changing Conan source from using OrderedDict to dict fixes the issue I'm facing.
The remaining description and reproducible script come from copilot.
Since OrderedDict subclasses dict, code that works with dict (even when doing something ill-advised) shouldn't silently produce corrupt state when the container is swapped.
Regular dict degrades gracefully. If you mutate a key's eq and insert a "duplicate," dict simply ends up with two entries for the same logical key. Messy, but internally consistent — len(d) == len(list(d)) always holds, iteration visits every entry, and the program keeps working.
OrderedDict degrades catastrophically. The same sequence silently corrupts iteration: len(od) != len(list(od)), entries vanish from for loops, values(), items(), etc. — with no error raised. This is a much worse failure mode.
"""
Minimal reproducer for CPython OrderedDict corruption.
When an OrderedDict key's __eq__ is mutated in-place, duplicate entries
can be created. If the key is later mutated back so __eq__ returns True
between the duplicates, _odict_resize maps both linked-list nodes to the
same fast_nodes slot — the second overwrites the first. The iterator
then follows the wrong node's `next` pointer, skipping entries.
Result: len(od) != len(list(od))
Tested on CPython 3.13.11 (Windows x86_64). The linked list is intact
(all nodes connected); only the fast_nodes mapping is wrong. Regular
dict is not affected because it has no parallel linked-list index.
USAGE:
python od_bug_repro.py
"""
import sys
from collections import OrderedDict
class Key:
"""Key with mutable __eq__. hash is immutable (depends only on name)."""
__slots__ = ("name", "tag")
def __init__(self, name, tag=0):
self.name = name
self.tag = tag
def __hash__(self):
return hash(self.name)
def __eq__(self, other):
if not isinstance(other, Key):
return NotImplemented
return self.name == other.name and self.tag == other.tag
def __repr__(self):
return f"Key({self.name!r}, {self.tag})"
def reproduce():
od = OrderedDict()
# 1. Insert entries, including one that will be mutated
victim = Key("victim", tag=0)
od[victim] = "original"
for i in range(40):
od[Key(f"pkg_{i}", tag=0)] = i
# 2. Mutate victim so __eq__ no longer matches tag=0
victim.tag = 999
# 3. Insert duplicate — __eq__ is False → creates a second entry
dup = Key("victim", tag=0)
od[dup] = "duplicate"
assert len(od) == 42 # 40 + victim + dup
# 4. Revert victim — now victim.__eq__(dup) is True again
victim.tag = 0
# 5. Force dict resize (triggers _odict_resize on next access)
for i in range(80):
od[Key(f"extra_{i}", tag=0)] = i
# 6. Iterate — this triggers _odict_resize internally
n = len(od)
it = len(list(od))
print(f"Python {sys.version}")
print(f"len(od) = {n}")
print(f"len(list(od)) = {it}")
if n != it:
print(f"\n*** BUG REPRODUCED: {n - it} entries lost during iteration ***")
print("The linked list has all nodes, but _odict_resize mapped two")
print("nodes to the same fast_nodes slot. The iterator follows the")
print("wrong node, skipping entries.")
else:
print("\nNo corruption (may depend on hash seed / probe order).")
print("Try: PYTHONHASHSEED=0 python od_bug_repro.py")
# Control: regular dict has no issue
d = {}
k = Key("victim", tag=0)
d[k] = "original"
for i in range(40):
d[Key(f"pkg_{i}", tag=0)] = i
k.tag = 999
d[Key("victim", tag=0)] = "duplicate"
k.tag = 0
for i in range(80):
d[Key(f"extra_{i}", tag=0)] = i
print(f"\nControl (dict): len={len(d)}, iter={len(list(d))}")
if __name__ == "__main__":
reproduce()CPython versions tested on:
3.13
Operating systems tested on:
Windows