The hash table is implemented as an open-addressed table with
linear probing collision resolution.
- For those not up to CS parlance, it means that all the hash entries
- (pairs of pointers key and value) are stored in a contiguous array.
- The position of each mapping is determined by the hash value of its
- key and the size of the table: location := hash(key) % size. If
- two different keys end up on the same position (collide), the one
- that came second is placed at the next empty position following the
+ In regular language, it means that all the hash entries (pairs of
+ pointers key and value) are stored in a contiguous array. The
+ position of each mapping is determined by the hash value of its key
+ and the size of the table: location := hash(key) % size. If two
+ different keys end up on the same position (collide), the one that
+ came second is placed at the next empty position following the
occupied place. This collision resolution technique is called
"linear probing".
/* We use all-bit-set marker to mean that a mapping is empty. It is
(hopefully) illegal as a pointer, and it allows the users to use
NULL (as well as any non-negative integer) as key. */
+
#define NON_EMPTY(mp) (mp->key != (void *)~(unsigned long)0)
+#define MARK_AS_EMPTY(mp) (mp->key = (void *)~(unsigned long)0)
/* "Next" mapping is the mapping after MP, but wrapping back to
MAPPINGS when MP would reach MAPPINGS+SIZE. */
for (; NON_EMPTY (mp); mp = NEXT_MAPPING (mp, mappings, size))
/* Return the position of KEY in hash table SIZE large, hash function
- being HASHFUN. #### Some implementations multiply HASHFUN's output
- with the table's "golden ratio" to get better spreading of keys.
- I'm not sure if that is necessary with our hash functions. */
+ being HASHFUN. */
#define HASH_POSITION(key, hashfun, size) ((hashfun) (key) % size)
/* Find a prime near, but greather than or equal to SIZE. Of course,
struct mapping *mappings = ht->mappings;
hashfun_t hasher = ht->hash_function;
- mp->key = NULL;
+ MARK_AS_EMPTY (mp);
--ht->count;
/* Rehash all the entries following MP. The alternative
goto next_rehash;
*mp_new = *mp;
- mp->key = NULL;
+ MARK_AS_EMPTY (mp);
next_rehash:
;
return hash_table_new (items, string_hash_nocase, string_cmp_nocase);
}
-/* Hashing of pointers. Used for hash tables that are keyed by
- pointer identity. (Common Lisp calls them EQ hash tables, and Java
- calls them IdentityHashMaps.) */
+/* Hashing of numeric values, such as pointers and integers. Used for
+ hash tables that are keyed by pointer identity. (Common Lisp calls
+ them EQ hash tables, and Java calls them IdentityHashMaps.)
+
+ This implementation is the Robert Jenkins' 32 bit Mix Function,
+ with a simple adaptation for 64-bit values. It offers excellent
+ spreading of values and doesn't need to know the hash table size to
+ work (unlike the very popular Knuth's multiplication hash). */
static unsigned long
ptrhash (const void *ptr)