summaryrefslogtreecommitdiffstats
path: root/tests/012/parse.tl
diff options
context:
space:
mode:
authorKaz Kylheku <kaz@kylheku.com>2023-06-20 20:14:19 -0700
committerKaz Kylheku <kaz@kylheku.com>2023-06-20 20:14:19 -0700
commit82141a5d9e0114fbc138b3b6c4797e3c220de243 (patch)
tree8678e9b5e516b3ba1d1ed31e1d5fe16fe162ebb4 /tests/012/parse.tl
parent96ea20112aee6ba8d4b5aa5d6a8a0d6b5d3dd2e0 (diff)
downloadtxr-82141a5d9e0114fbc138b3b6c4797e3c220de243.tar.gz
txr-82141a5d9e0114fbc138b3b6c4797e3c220de243.tar.bz2
txr-82141a5d9e0114fbc138b3b6c4797e3c220de243.zip
hash: support existing mutation+iteration semantics.
The original chained hashing scheme makes certain guarantees in situation when a hash table that is being iterated is also undergoing insertions or deletions. The original scheme meets these requirements simply, by putting a freeze against hash table growth while there are outstanding iterations. Chained hashing gracefully handles load factors above 1. Load factors above 1 are not possible under open addressing (and we don't even want to approach 1) but we would like to preserve the requirements. The requirements are: 1. If an iterator has already visited an item, it will not see that item again, regardless of insertions which cause the table to be reorganized. 2. It is not specified/required that an iterator will visit newly inserted items. It may visit some of those items, but not others. 3. If an iterator has not yet visited an item, and that item is deleted, it will not see that item, regardless of any insertions that reorganize the table. In this commit, we implement a "table stack" scheme. 1. When a table is resized due to insertions, and it is being iterated (h->usecount > 0), in that situation it will push the existing small table onto a stack, the h->tblstack (table stack). 2. Iterators take a hash table's current table and its size, and work with that snapshot of the table. If the original hash table grows, existing iterators work with the original table as it existed just before the reorganization. So after that they do not see any new insertions. 3. Whenever the delete operation (hash_remove) finds the item and removes it from the current table, it also walks the table stack, searches for the item in every table in the stack and nulls it out. This search is oblivious to nil; it's a blind search that goes around the table starting at the first probe position, looking for the identical cons cell to take out. This algorithm ensures that iterators will not see a deleted item, unless they already visited it before the deletion, of course. * hash.h (struct hash_iter): New members table, and mask. * hash.c (struct hash): New member, tblstack. (hash_grow): We drop the vec argument and recreate it locally (not essential to this commit). If we find that the usecount is positive, we push the existing table onto the table stack. Otherwise, we take the opportunity to obliterate the table stack. (hash_insert): Drop the restriction that hash_grow is only called when the use count is zero. Adjust calls to hash_grow to drop the vec argument. (hash_remove): When an item is found and deleted, and the table is in use by iterators, walk the table stack and delete it from each previous table. Otherwise, if the table is not in use by iterators, obliterate the table stack. (hash_mark): Exit early also if there is a table stack, and mark that stack. (do_make_hash, make_similar_hash, copy_hash): Initialize table stack in new hash. (hash_iter_mark): Mark the iterator's table. This is likely not necessary since we also mark the hash table, which should have a pointer to that same table. That wouldn't be defensive programming, though. (hash_iter_init, us_hash_iter_init): Initialize table and mask. (hash_iter_next_impl, hash_iter_peek): These functions have to walk the table snapshot taken by the iterator, using the captured mask, and not the current table. (has_reset): If the target table's use count drops to zero, obliterate its table stack. We add a missing setcheck here; this operation potentially stores a different hash into an existing iterator. It's not being done safely with regard to generational GC. * tests/010/hash.tl: New tests.
Diffstat (limited to 'tests/012/parse.tl')
0 files changed, 0 insertions, 0 deletions