Commit c978633e authored by Raymond Hettinger

Further improvements to frozenset hashing (based on Yitz Gale's battery of
tests which nicely highlight weaknesses).

* Initial value is now a large prime.
* Pre-multiply by the set length to add one more basis of differentiation.
* Work a bit harder inside the loop to scatter bits from sources that
  may have closely spaced hash values.

All of this is necessary to make up for keeping the hash function commutative.
Fortunately, the hash value is cached so the call to frozenset_hash() will
only occur once per set.
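
The three changes listed above can be sketched outside of CPython as follows. This is only an illustration, not the code from the commit: the helper name `sketch_frozenset_hash` and its array-of-hashes interface are hypothetical, and it uses unsigned 32-bit arithmetic (an assumption made to avoid signed overflow), so its results need not match what the interpreter computes with `long`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: combines precomputed element hashes using the same
   constants and steps as the patched frozenset_hash(), but in unsigned
   32-bit arithmetic so that wraparound is well defined. */
uint32_t sketch_frozenset_hash(const uint32_t *elem_hashes, size_t n)
{
    uint32_t hash = 1927868237u;      /* initial value from the patch */

    /* Pre-multiply by (length + 1) so the set size influences the result. */
    hash *= (uint32_t)n + 1u;

    for (size_t i = 0; i < n; i++) {
        uint32_t h = elem_hashes[i];
        /* Scatter the bits of each element hash, then XOR it in.
           XOR is commutative, so iteration order does not matter. */
        hash ^= (h ^ (h << 16) ^ 89869747u) * 3644798167u;
    }

    hash = hash * 69069u + 907133923u;

    /* The all-ones pattern corresponds to -1, which the patch reserves
       as the "hash not yet computed" marker. */
    if (hash == UINT32_MAX)
        hash = 590923713u;
    return hash;
}
```

Because each element only enters through an XOR, passing the same element hashes in a different order gives the same result, which is the commutativity the commit message refers to.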
parent 27e403eb
@@ -663,20 +663,22 @@ frozenset_hash(PyObject *self)
     PySetObject *so = (PySetObject *)self;
     PyObject *key, *value;
     int pos = 0;
-    long hash = 1905176217L;
+    long hash = 1927868237L;
     if (so->hash != -1)
         return so->hash;
+    hash *= (PyDict_Size(so->data) + 1);
     while (PyDict_Next(so->data, &pos, &key, &value)) {
-        /* Multiplying by a large prime increases the bit dispersion for
-           closely spaced hash values. The is important because some
-           use cases have many combinations of a small number of
-           elements with nearby hashes so that many distinct combinations
-           collapse to only a handful of distinct hash values. */
-        hash ^= PyObject_Hash(key) * 3644798167u;
-    }
-    hash *= 69069L;
+        /* Work to increase the bit dispersion for closely spaced hash
+           values. The is important because some use cases have many
+           combinations of a small number of elements with nearby
+           hashes so that many distinct combinations collapse to only
+           a handful of distinct hash values. */
+        long h = PyObject_Hash(key);
+        hash ^= (h ^ (h << 16) ^ 89869747L) * 3644798167u;
+    }
+    hash = hash * 69069L + 907133923L;
     if (hash == -1)
         hash = 590923713L;
     so->hash = hash;