| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The vargs typedef is underused. Let's use it consistently
everywhere.
* args.c,
* args.h,
* args.c,
* args.h,
* arith.c,
* eval.c
* ffi.c,
* gc.c,
* hash.c,
* lib.c,
* lib.h,
* parser.c,
* stream.c,
* struct.c,
* struct.h,
* syslog.c,
* syslog.h,
* unwind.c,
* vm.c,
* vm.h: All "struct args * declarations replaced with existing
"varg" typedef that comes from lib.h.
|
|
|
|
|
|
|
| |
* hash.c (hash_iter_peek): The loop here must be a top-of-test
while loop, not a bottom-test do loop. In the chained hashing
implementation, this was a do loop, but it also had a test
with a break for the index.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since their inception, the top_fb and top_fb hashes associated
symbols with bindings cells, causing an extra cons cell to be
allocated for each entry. I don't remember why this is. It
might have been that way so that gethash(top_fb, sym) would
return a cell when the variable exists, or else nil. This was
before we had functions like gethash_e and inhash that return
the hash cell itself. A hash cell is also a cons and can serve
as a binding just fine by itself.Let's make it so.
For now, the macro and symbol macro environments stay the
way they are; I will likely convert them also.
* eval.c (env_fbind, env_vbind, lookup_global_var, lookup_sym_lisp1,
lookup_fun, func_get_name, rt_defv, rt_defun, set_symbol_value,
reg_fun, reg_varl): Update all these functions so they treat the
hash cell from top_vb or top_fb as the binding cell, rather than
putting or expecting the cdr of that cell (i.e the hash value)
to be a binding cell.
* hash.[ch] (gethash_d): New function. Jus gethash_e without the
pesky self argument, that would only be needed for error
reporting if we pass an object that isn't a hash.
* stdlib/place.tl (sys:get-fun-getter-setter, sys:get-vb):
These two functions must be adjusted since they refer to the
top-fb and top-vb variables. sys:get-vb isn't used anywhere;
it continues to exist in order to provide run-time support
to files that were compiled with an buggy version of the
symbol-value place.
|
|
|
|
|
|
|
|
| |
* hash.c (group_reduce): Use seq_iter_t instead of
obsolete vector and list iteration.
* txr.1: Use 1..11 range in one group-reduce
example instead of (range 1 10).
|
|
|
|
|
|
|
|
|
|
|
| |
* hash.c (group_by): Replace obsolete list/vector switching
with seq_iter_t iteration. The group-by function is limited
otherwise.
* txr.1: Update group-by documentation not to assert that
the sequence argument is required to be a list or vector.
Examples for group-by and group-map now use a 0..11 range
instead of (range 0 10).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
hash-map converts a function mapping over a sequence
into a hash table.
* hash.[ch] (hash_map): New function.
* tests/010/hash.tl: Test case.
* genman.txr: The hash-map identifier introduces
a hash collision. We have to deal with that somehow now.
(colli): We put the conflicting entries into a new hash called
colli which maps them to an increment value.
(hash-title): Increment the hash code h by the amount
indicated in colli, if the title is found there.
* txr.1: Documented.
* stdlib/doc-syms.tl: Updated.
|
|
|
|
|
|
|
|
|
|
| |
hash.c (hash_remove): In the algorithm that moves cells
forward in the cluster in order to avoid the use of
tombstones, we use slightly better variables. The cell
we are looking to replace is called wipe rather than
start. The cryptic varaible name k is renamed to iprobe,
because it the initial probe location for the cell
at position i.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original chained hashing scheme makes certain
guarantees in situation when a hash table that is being
iterated is also undergoing insertions or deletions.
The original scheme meets these requirements simply,
by putting a freeze against hash table growth while
there are outstanding iterations. Chained hashing
gracefully handles load factors above 1.
Load factors above 1 are not possible under open
addressing (and we don't even want to approach 1)
but we would like to preserve the requirements.
The requirements are:
1. If an iterator has already visited an item, it
will not see that item again, regardless of insertions
which cause the table to be reorganized.
2. It is not specified/required that an iterator will
visit newly inserted items. It may visit some of those
items, but not others.
3. If an iterator has not yet visited an item, and
that item is deleted, it will not see that item,
regardless of any insertions that reorganize the table.
In this commit, we implement a "table stack" scheme.
1. When a table is resized due to insertions, and
it is being iterated (h->usecount > 0), in that situation
it will push the existing small table onto a stack,
the h->tblstack (table stack).
2. Iterators take a hash table's current table and its
size, and work with that snapshot of the table.
If the original hash table grows, existing iterators
work with the original table as it existed just before
the reorganization. So after that they do not see any
new insertions.
3. Whenever the delete operation (hash_remove) finds
the item and removes it from the current table,
it also walks the table stack, searches for the item
in every table in the stack and nulls it out.
This search is oblivious to nil; it's a blind search
that goes around the table starting at the first
probe position, looking for the identical cons cell
to take out. This algorithm ensures that iterators
will not see a deleted item, unless they already visited
it before the deletion, of course.
* hash.h (struct hash_iter): New members table, and mask.
* hash.c (struct hash): New member, tblstack.
(hash_grow): We drop the vec argument and recreate it
locally (not essential to this commit).
If we find that the usecount is positive, we push the
existing table onto the table stack. Otherwise,
we take the opportunity to obliterate the table stack.
(hash_insert): Drop the restriction that hash_grow is
only called when the use count is zero. Adjust calls
to hash_grow to drop the vec argument.
(hash_remove): When an item is found and deleted, and
the table is in use by iterators, walk the table stack
and delete it from each previous table. Otherwise,
if the table is not in use by iterators, obliterate
the table stack.
(hash_mark): Exit early also if there is a table stack,
and mark that stack.
(do_make_hash, make_similar_hash, copy_hash): Initialize
table stack in new hash.
(hash_iter_mark): Mark the iterator's table. This is
likely not necessary since we also mark the hash table,
which should have a pointer to that same table.
That wouldn't be defensive programming, though.
(hash_iter_init, us_hash_iter_init): Initialize table and mask.
(hash_iter_next_impl, hash_iter_peek): These functions
have to walk the table snapshot taken by the iterator,
using the captured mask, and not the current table.
(has_reset): If the target table's use count drops to zero,
obliterate its table stack. We add a missing setcheck here;
this operation potentially stores a different hash into
an existing iterator. It's not being done safely with
regard to generational GC.
* tests/010/hash.tl: New tests.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* hash.h (struct hash_iter): chain and cons members removed;
index member added.
* hash.c (struct hash_ops): acons_new_c_fun member removed.
(hash_ops_init): Reduced to three arguments.
(hash_grow): Function removed.
(hash_find_slot, hash_lookup, hash_grow, hash_insert,
hash_remove): New static functions.
(hash_mark): Iterate the simple open table, which directly
contains pointers to the hash entry cons cells, rather
than lists of these.
(hash_acons_new_c, hash_aconsql_new_c, hash_aconsq_new_c):
Static functions removed.
(hash_eq_ops, hash_eql_ops, hash_equal_ops): Initializer
macro calls reduced to three arguments.
(copy_hash_chain): Function removed.
(copy_hash): copy_hash_chain logic altered to duplicate
the hash cells like copy_hash_chain did.
(gethash_c, gethash_e, remhash): Retargeted to the new
static functions that do open addressing with linear
probing.
(hash_iter_mark): No cons member to mark any more.
(hash_iter_init, us_hash_iter_init): Initialization updated
to new representation.
(hash_iter_next_impl, hash_iter_peek): Rewritten to traverse simple table.
Loses second argument.
(hash_iter_next, hash_next): Don't pass second argument
to hash_iter_next_impl.
(do_weak_tables): Adjusted to walk open table.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Hash tables are supposed to start with a mask of 255,
which is the 256 modulus less one. However, due to a
coding mistake, they get a modulus of 252. Now this still
works. It just has a few zero bits in it: 1111 1100
rather than 11111111. When this grows by doubling in size,
we get 1 1111 1001, 11 1111 0011 and so on.
Now this still all works, because hash values AND-ed
with such a mask are no greater than that mask. The
table isn't accessed out of bounds. It's just inefficiently
used.
* hash.c (do_make_hash, make_similar_hash, clearhash):
When converting the power-of-two modulus, which is a Lisp
word, to the integer mask, subtract 1 from the integer
value, not the original Lisp word.
|
|
|
|
|
|
| |
* hash.c (hash_mark): Cache the table, vector and mask
in local variables, so they don't have to be reloaded
into registers when external functions are called.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* gc.c (mark_obj_norec): New function. Just marks
an object reachable without recursing over its
sub-objects.
(gc_mark_norec): New function.
* gc.h (gc_mark_norec): Declared.
* hash.c (do_weak_tables): Cache the table, mask and
vector pointer in a local variable, since these pointers
are not expected to change across function calls, and
can go into registers.
When visiting an entry that should be reachable, we
mark that entry immediately, and also use the new
gc_mark_norec function to mark the chain cons cell
reachable. I.e. we mark the chain backbone cons,
and that cons' car field, that being the entry.
Thus by the time we march through the chain, we
have marked all of it. Thus, the table entries don't
have to be iterated and marked any more. We use
gc_mark_norec to mark the table, and explicitly mark
its two special slots. The upshot of all this is that
we don't have to make an extra pass over the table,
and the chains, to mark things; we combine the marking
with the expunging of weak values.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't have a function in the hash table module which can
create a populated hash table in one step without requiring
the caller to create auxiliary lists. This new function fills
that gap, albeit with some limitations.
* hash.c (hash_props): New function.
(hash_init): Register hash-props intrinsic.
* tests/010/hash.tl: New tests.
* txr.1: Documented.
* stdlib/doc-syms.tl: Updated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In most places in the hash module, we reduce a hash
code into the power-of-two sized table using
h & (hash->modulus - 1). In some places we wastefully
modulo operation h % hash->modulus. Why don't we
replace the modulus with a mask so we can just do
h & hash->mask.
* hash.c (struct hash_ops): Replace modulus member with
mask, which has a value one less.
(hash_mark, hash_grow, do_make_hash, make_similar_hash,
copy_hash, gethash_c, gethash_e, remhash, clearhash,
hash_iter_next_impl, hash_iter_peek, do_weak_tables):
Work with mask rather than modulus, preserving existing
behavior.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h,
args.c, args.h, arith.c, arith.h, autoload.c, autoload.h,
buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h,
chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure,
debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c,
filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c,
gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S,
lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c,
linenoise/linenoise.h, match.c, match.h, parser.c, parser.h,
parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h,
regex.c, regex.h, signal.c, signal.h, socket.c, socket.h,
stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl,
stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl,
stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl,
stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl,
stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl,
stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl,
stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl,
stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl,
stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl,
stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl,
stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl,
stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl,
stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl,
stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl,
stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h,
struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h,
syslog.c, syslog.h, termios.c, termios.h, time.c, time.h,
tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr,
y.tab.c.shipped: Copyright year bumped to 2023.
|
|
|
|
|
|
| |
* hash.c (hash_double): If the input is equal to 0.0, return
0, so that both negative and positive zero have the same hash
value.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Research indicates that this is something useful in
languages that abuse strings for implementing symbols.
We have interned symbols.
* lib.h (struct string): Remove hash member.
* lib.c (string_own, string, string_utf8, mkustring,
string_extend, replace_str, chr_str_set): Remove
all initializations and updates of the removed
hash member.
* hash.c (equal_hash): Do not cache string hash value.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This relates to the November 2021 commit
9108e9f8f4434fb29200b08a4b576df47c407c01:
hash: 64 bit string and buffer hashing and seeds.
In the hash constructor, a 32 bit cast was left
behind, needlessly reducing the seed value to 32
bits on 64 bit platforms.
* hash.c (do_make_hash): Don't convert the ucnum
value to u32_t; just store it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On platforms which have the malloc_usable_size function,
we don't have to store the allocated size of an object;
malloc provides us the allocated size (which may be larger
than we requested). Here we take advantage of this for
strings. And since we don't have to store the string
allocated size any more, we use that field for something
else: storing the hash code (for seed zero). This can speed
up some hashing operations.
* configure (have_malloc_usable_size): New variable.
Configure test for have_malloc_usable size. We have to
try several header files, too. We set the configure
variable HAVE_MALLOC_USABLE_SIZE, and possibly
HAVE_MALLOC_H or HAVE_MALLOC_NP_H.
* lib.h (struct string): If HAVE_MALLOC_USABLE_SIZE
is true, we define a member called hash insetad of
alloc. Also, we change alloc to cnum.
* lib.c: Include <malloc_np.h> if HAVE_MALLOC_NP_H
is defined.
(string_own, string, string_utf8, mkstring, mkustring,
init_str, string_extend, string_finish, string_set_code,
string_get_code, length_str, replace_str, chr_str_set):
Fix code for both cases. On platforms with malloc_usable_size,
we have the allocated size from malloc, so we don't have to
retrieve it from the object or store it. Any operations which
mutate the string must reset the hash field to zero; zero
means "hash has not been calculated".
* hash.c (equal_hash): Just retrive a string's hash value, if
it is nonzero, otherwise calculate, cache it and return it.
* gc.c (mark_obj): The alloc member of struct string is a
machine integer now; no need to mark it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On platforms with 64 bit pointers, and therefore 64-bit-wide
TXR values, we can use a representation technique which allows
double floating-point values to be unboxed.
Fixnum integers are reduced from 62 bits to 50, and there is
a little more complexity in the run-time type checking and
dispatch which costs extra cycles.
The support is currently off by default; it must be explicitly
enabled with ./configure --nan-boxing.
* lib.h (NUM_MAX, NUM_MIN, NUM_BIT): Define separately for
NaN boxing.
(TAG_FLNUM, TAG_WIDTH, NAN_TAG_BIT, NAN_TAG_MASK, TAG_BIGMASK,
TAG_BIGSHIFT, NAN_FLNUM_DELTA): New preprocessor symbols.
(enum type, type_t): The FLNUM enumeration constant moves
to just after LIT, so that its value is the same as TAG_FLNUM.
(struct flonum): Does not exist under NaN boxing.
(union obj): No fl member under NaN boxing.
(tag, is_ptr): Separately defined for NaN boxing.
(is_flo): New function under NaN boxing.
(tag_ex): New function. It's like tag, but identifies
floating-point values as TAG_FLNUM. The tag function continues
to map them to TAG_PTR, which is wrong under NaN boxing,
but needed in order not to separately write tons of cases in
the arith.c module.
(type): Use tag_ex, so TAG_FLNUM is handled, if it exists.
(auto_str, static_str, litptr, num_fast, chr, c_n, c_u):
Different definition for NaN boxing.
(c_ch, c_f): New function.
(throw_mismatch): Attribute with NORETURN.
(nao): Separate definition for NaN boxing.
* lib.c (seq_kind_tab): Reorder initializer to follow enum
reordering.
(seq_iter_rewind): use c_n and c_ch functions, since type
checking has been done in those cases. The self parameter
is no longer needed.
(iter_more): use c_ch on CHR object.
(equal): Use c_f accessor to get double value rather than
assuming there is a struct flonum representation.
(stringp): Use tag_ex, otherwise a floating-point number
is identified as TAG_PTR.
(diff, isec, isecp): Don't pass removed self parameter
to seq_iter_rewind.
* arith.c (c_unum, c_dbl_num, c_dbl_unum, plus, minus,
signum, gt, lt, ge, le, numeq, logand, logior,
logxor, logxor_old, bit, bitset, tofloat, toint,
width, c_num, c_fixnum): Extract floating-point value
using c_f accessor. Handle CHR type separately from NUM
because the storage representation is no longer identical;
CHR values have a two bit tag over bits where NUM has
ordinary value bits. NUM is tagged at the NaN level with
the upper 14 bits being 0xFFFC. The remaining 50 bits
are the value.
(flo): Construct unboxed float under NaN boxing by taking
image of double as a 64 bit value, and adding the
delta offset, then casting to the val pointer type.
(c_flo): Separate implementation for NaN boxing.
(integerp, numberp): Use tag_ex.
* buf.c (str_buf, buf_int): Separate CHR and NUM cases,
like in numerous arith.c functions.
* chksum.c (sha256_hash, md5_hash): Use c_ch accessor for
CHR value.
* hash.c (equal_hash, eql_hash): Handle CHR separately. Use
c_f accessor for floating-point value.
(eq_hash): Use tag_ex and handle TAG_FLNUM value under NaN
boxing. Handle CHR separately from NUM.
* ffi.c (ffi_float_put, ffi_double_put, carray_uint, carray_int):
Handle CHR and NUM separately.
* stream.c (formatv): Use c_f accessor.
* configure: disable automatic selection of NaN boxing
on 64 bit platforms, for now.
Add test whether -Wno-strict-aliasing is supported by
the compiler, performed only if NaN boxing is enabled.
We need to disable this warning because it goes off on
the code that reinterprets an integer as a double and
vice versa.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* hash.c (group_map): New function.
(hash_init): group-map intrinsic registered.
* hash.h (group_map): Declared.
* tests/010/hash.tl: New test case.
* txr.1: Documented together with group-by.
Extra paren removed from group-by example.
|
|
|
|
|
|
|
| |
* hash.c (group_reduce): Replace loop with call to hash_update
which is exactly the same logic, and even more efficient
because it avoids calling us_rplacd.
(hash_update): Fix incorrect self name.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* arith.c (highest_significant_bit): Bugfix: do not pass a
negative value to highest_bit, where we will get then get the
wrong idea about the number of significant bits in the value,
since the __builtin_clz primitives will include the sign bit.
We want to complement the all the bits, so that the sign bit
will go to zero. We can do this arithmetically by taking the
additive inverse (which is the two's complement (which is the
complement plus one)) and subtracting one.
(ash): Avoid left shifting a negative number in HAVE_UBSAN
mode using the same trick as in num_fast.
* ffi.c (ffi_swap_u16): Here the shift and or calculation is
producing a value beyond 16 bits which we are relying on
the implicit conversion back to uin16_t to trim away.
We add the cast to uint16_t to make it explicit.
* hash.c (equal_hash): Also handle the CHR and NUM cases here
via c_u like in eql_hash and eq_hash.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* lib.h (c_u): New inline function: unsafe conversion to
ucnum, analogous to c_n for cnum.
* hash.c (equal_hash, hash_iter_init): Use UINT_PTR_MAX
instead of convert(ucnum, -1).
(eql_hash): mp_hash returns unsigned long, so
shouldn't require a cast to go to the uint_ptr_t.
The types are of the same size, or at worst it is a widening.
Also replace convert(ucnum, -1) by UINT_PTR_MAX here.
Combining the TAG_CHR and TAG_NUM cases, and using c_u,
which is more efficient since c_chr and c_num are non-inlined
functions which redundantly check type. We no longer
need a self variable in this function.
(eq_hash): Same TAG_CHR and TAG_NUM changes as eql_hash.
* regex.c (char_set_add): Reformat change to avoid line break
across assignment.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The implicit conversions were discovered with Clang's UBSan (with
the -fsanitizer=implicit-conversion option).
* gc.c (sweep_one): Convert only the inverted REACHABLE, since
block->t.type is already of the right type.
* hash.c (eql_hash, eq_hash, hash_iter_init, us_hash_iter_init):
Explicitly convert to ucnum.
* linenoise/linenoise.c (enable_raw_mode): Explicitly convert the
inverted flag sets to tcflag_t.
* mpi/mpi.c (mp_set_uintptr): Explicitly convert to uint_ptr_t.
* regex.c (char_set_add): Explicitly convert to bitcell_t.
* struct.c (struct_inst_hash): Correct type of hash from cnum to
ucnum.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
*LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h,
args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c,
cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h,
combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h,
ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h,
glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S,
lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c,
linenoise/linenoise.h, lisplib.c, lisplib.h, match.c, match.h,
parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h,
rand.c, rand.h, regex.c, regex.h, signal.c, signal.h,
socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl,
stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl,
stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl,
stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl,
stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl,
stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl,
stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl,
stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl,
stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl,
stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl,
stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl,
stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl,
stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl,
stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl,
stdlib/with-resources.tl, stdlib/with-stream.tl,
stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h,
strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h,
termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1,
txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h,
vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year
bumped to 2022.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The use of -ansi doesn't by itself diagnose instances of some
constructs we don't want in the project, like mixed
declarations and statements.
* configure (diag_flags): Add -Werror=declaration-after-statement.
This is C only, so filter it out for C++.
Also add -Werror=vla.
* HACKING: Update inaccurate statements about what dialect we
are using. TXR isn't pure C90: some GCC extensions are used.
We even use long long if the configure script detects it as
working, and some C99 library features.
* buf.c (replace_buf, buf_list): Fix by reordering.
* eval.c (op_dohash, op_load_time_lit): Fix by reordering.
* ffi.c (ffi_simple_release): Fix by reordering.
(align_sw_get): Fix empty macro to expand to dummy declaration
so a semicolon after it isn't interpreted as a statement.
On platforms with alignment, remove a semicolon from the macro
so that it requires one.
(ffi_i8_put, ffi_u8_put): Fix by reordering.
* gc.c (gc_init): Fix with extra braces.
* hash.c (hash_init): Fix by reordering.
* lib.c (list_collect_revappend, sub_iter, replace_str,
replace_vec, mapcar_listout, mappend, mapdo, window_map_list,
subst): Fix by reordering.
(gensym, find, rfind, pos, rpos, in, search_common): Fix by
renaming optional argument and using declaration instead of
assignment.
* linenoise/linenoise.c (edit_in_editor): Fix by reordering.
* parser.c (is_balanced_line): Fix by reordering.
* regex.c (nfa_count_one, print_rec): Fix by reordering.
* signal.c (sig_mask): Fix by reordering.
* stream.c (get_string): Fix by renaming optional argument and
using declaration instead of assignment.
* struct.c (lookup_static_slot_desc): Fix by turning mutated
variable into block local.
(umethod_args_fun): Fix by reordering.
(get_special_slot): Fix by new scope via braces.
* sysif.c (usleep_wrap): Fix by new scope via braces.
(setrlimit_wrap): Fix by new scope via braces.
* time.c (time_string_meth, time_parse_meth): Fix by reordering.
* tree.c (tr_do_delete_spec): Fix by new scope via braces.
* unwind.h (uw_block_beg): New macro which doesn't define
RESULTVAR but expects it to refers to an existing one.
(uw_block_begin): Replace do while (0) with enum trick
so that we have a declaration that requires a semicolon,
rather than a statement, allowing declarations to follow.
(uw_match_env_begin): Now opens a scope and features the
same enum trick as in uw_block_begin.
This fixes a declaration-follows-statement issue in
the v_output function in match.c.
(uw_match_env_end): Closes scope opened by uw_match_env_begin.
* unwind.c (revive_cont): Fix by introducing variable, and
using new uw_block_beg macro.
* vm.c (vm_execute_closure): Fix using combination of local
variable and reordering.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* hash.c (randbox, hash_c_str, hash_buf): Separate
implementation for 64 bit pointers, using 64 bit random
values, and producing a 64 bit hash, taking in a 64 bit seed.
(gen_hash_seed): Use time_sec_nsec to get nanoseconds.
On 64 bit, put together the seed differently to generate
a wider value.
* tests/009/json.txr: Change from hash tables to lists,
so the order of the output doesn't change between 64 and 32
bits, due to the different string hashing.
* tests/009/json.expected: Updated.
* txr.1: Documented that seeds are up to 64 bits, but
with possibly only the lower 32 bits being used.
|
|
|
|
|
| |
* hash.c (hash_print_op): Only set the need_space flag if
some leading item is printed.
|
|
|
|
|
|
|
|
|
| |
* hash.c (copy_hash): The order of allocating the hash object
and vector is incorrect. The hash must be allocated last, like
it is in do_make_hash and make_similar_hash. If the vector is
allocated after the hash, it can trigger gc, and then the
garbage collector will traverse the uninitialized parts of the
hash object.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* hash.c (struct hash): modulus and count change from cnum to
ucnum.
(hash_mark, hash_grow, copy_hash, do_weak_tables): Use ucnum
local vars.
(do_make_hash, make_similar_hash): Use c_unum to obtain modulus.
(gethash_c, gethash_e): Use & masking operation to reduce hash
value to table size.
(remhash): Move sanity check before decrement since unsigned
value can't go below zero.
(clearhash): Use ucnum and c_unum.
(hash_iter_peek): Use ucnum for chain count local.
* hash.h (struct hash_iter): chain changes from cnum to ucnum.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c,
buf.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h,
combi.c, combi.h, debug.c, debug.h, eval.c, eval.h, ffi.c,
ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c,
glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c,
lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c,
parser.h, parser.l, parser.y, rand.c, rand.h, regex.c,
regex.h, signal.c, signal.h, socket.c, socket.h,
stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl,
stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl,
stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl,
stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl,
stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl,
stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl,
stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl,
stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl,
stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl,
stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl,
stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl,
stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl,
stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl,
stdlib/with-resources.tl, stdlib/with-stream.tl,
stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h,
strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h,
termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.c,
txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h:
License reformatted.
* lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The make_hash function now takes the hash_weak_opt_t
enumeration instead of a pair of flags.
* hash.c (do_make_hash): Take enum argument instead of pair of
flags. Just store the option; nothing to calculate.
(weak_opt_from_flags): New static function.
(tweak_hash): Function removed.
(make_seeded_hash): Adjust to new do_make_hash interface with
help from weak_opt_from_flags.
(make_hash, make_eq_hash): Take enum argument instead of pair
of flags.
(hashv): Calculate hash_weak_opt_t enum from the extracted
flags, pass down to make_eq_hash or make_hash.
* hash.h (tweak_hash): Declration removed.
(make_hash, make_eq_hash): Declarations updated.
* eval.c (me_case, expand_switch): Update make_hash
calls to new style.
(eval_init): Update make_hash calls and get rid of tweak_hash
calls. This renders the tweak_hash function unused.
* ffi.c (make_ffi_type_enum, ffi_init): Update make_hash calls
to new style.
* filter.c (make_trie, trie_add, filter_init): Likewise.
* lib.c (make_package_common, obj_init, obj_print): Likewise.
* lisplib.c (lisplib_init): Likewise.
* match.c (dir_tables_init): Likewise.
* parser.c (parser_circ_def, repl, parse_init): Likewise.
* parser.l (parser_l_init): Likewise.
* struct.c (struct_init, get_slot_syms): Likewise.
* sysif.c (get_env_hash): Likewise.
* lex.yy.c.shipped, y.tab.c.shipped: Updated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The flags field of hashes isn't really functioning as flags;
it's an enumeration whose numeric properties are exploited in
one place in the code.
* hash.h (enum hash_flags): Rename to enum hash_weak_opt.
(hash_flags_t): Renum to hash_weak_opt_t.
(tweak_hash): Declaration updated.
* hash.c (struct hash): Rename flags member to wkopt.
(hash_print_op, hash_mark, do_make_hash, tweak_hash,
make_similar_hash, copy_hash, do_weak_tables): Follow renames.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A weak table with and-semantics expires entries only when both
their key and value is unreachable. When this condition is not
met, therefore, the hash table generates a reference to both
the key and value. This gives rise to a subtlety that must
be be correctly handles in the marking phase.
* hash.c (hash_mark): When marking an and-semantics table,
whenever we find a reachable key or value, we know that the
entry is staying. Therefore we mark it: if the key is
unreachable, we mark the value and vice versa. This is
important because these unreachable objects may be the only
references for reaching reach some other objects via one or
more weak hash tables. Those secondary objects may
spontaneously disappear due to those other hash tables
removing their entries. E.g suppose H0 has and-semantics, and
some K-V entry in H1 has a reachable K, but unreachable V.
Therefore the entry is not eligible for removal, and thus
maintains references to K and V. Suppose V happens to be a key
in a weak-key hash table H1. If, while marking H0, we do not
mark V, then there is a risk that H1 will be processed first
during the later weak procesing stage, and H1 will wrongly
expire its V entry due to the key V being unreachable. Then
when H0 is processed, it will mark V, making it reachable, but
too late: the V entry in H1 is already spuriously gone.
The main principle at play is that an entry in an
and-semantics table strongly holds on to a key if the value is
reachable and vice versa. Only if both are simultaneously
unreachable does it relinquish its references.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Hash tables with weak keys and values now support a choice of both
possible semantics: under and-semantics, an entry lapses when both the
key and value are unreachable. Under or-semantics, an entry lapses if
either the key or value is unreachable.
The and-semantics is new. Until TXR 266, only or-semantics was
supported. This will be the default: when a hash table is specified
as :weak-keys and :weak-vals, it will have or-semantics.
The keywords :weak-or and :weak-and specify weak keys and values,
with the specific semantics. They are utually exclusive, but tolerate
the presence of :weak-keys and :weak-vals.
The make-hash function is being extended such that if its leftmost
argument, <weak-keys>, is specified as one of the keywords :weak-and
or :weak-or, then the hash table will have weak keys and values with the
specified semantics, and the <weak-vals> argument is ignored
(values are weak even if that argument is false).
* eval.c (eval_init): Initially register the top_vb,
top_mb, top_smb, special and builtin hashes as ordinary hashes: no weak
keys or values. Then use tweak_hash to switch to weak keys+vals
with and-semantics. We do it this way because the keywords are not yet
initialized; we cannot use them.
* hash.h (enum hash_flags, hash_flags_t): Moved to header. Member
hash_weak_both renamed to hash_weak_or. New member hash_weak_and.
(weak_and_k, weak_or_k): New keyword variables.
(hash_print_op): Handle hash_weak_and by printing :weak-and.
(hash_mark): Handle hash_weak_and by marking nothing, like hash_weak_or.
(do_make_hash): Check first argument against the two new keywords and
set flags accordingly. This function is called from eval_init before
the keywords have been initialized, in which case weak_keys ==
weak_and_k is true when both are nil; we watch for that.
(tweak_hash): Now returns void and takes a hash_flags_t argument which
is simply planted.
(do_wak_tables): Implement hash_weak_and case. Remove the compat 266
stuff from hash_weak_or. Compatibility is no longer required since we
are not changing the default semantics of hash tables. Phew; that's a
load of worry off the plate.
(hashv): Parse the two new keywords, validate and provide semantics.
(hash_init): Initialize weak_and_k and weak_or_k kewyords.
* hash.h (enum hash_flags, hash_flags_t): Moved here now.
(weak_and_k, weak_or_k): Declared.
* lib.c (compat_fixup): Remove call to parse_compat_fixup.
* parser.c (parse_init): Create stream_parser_hash with and-semantics.
(parse_compat_fixup): Function removed.
* parser.h (parse_compat_fixup): Declaration removed.
* txr.1: Hash documentation updated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This addresses the problem that
a4c376979d15323ad729e92e41ba43768e8dc163
tried to fix.
* eval.c (eval_init): Make all the top-level binding tables,
top_fb, top_vb, top_mb, top_smb, special and builtin,
weak-both tables: keys and values are weak.
This way, the entries disappear if both key and value
are unreachable, even if they refer to each other.
(eval_compat_fixup): In 266 or earlier compat mode, weak-both
tables don't have the right semantics, so we tweak
the tables to weak-key tables.
* parser.c (parse_init): Same treatment for
stream_parser_hash. We want an entry to disappear from the
hash if neither the parser nor the stream are reachable.
(parse_compat_fixup): New function.
* parser.h (parse_compat_function): Declared.
* hash.c, hash.h (tweak_hash): New function.
* lib.c (compat_fixup): Call parse_compat_fixup.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From now on, hash tables with both weak keys and values have
dijunctive retention semantics. If either the key or value of
an entry is reachable, then the entry stays. This is subject
to compatibility.
* hash.c (do_weak_tables): Expire an entry if neither the key
nor the value is reachable. In 266 or lower compatibility
mode, expire an entry if either the key or value is
unreachable, like before.
* txr.1: Document the change, with compat notes. Add a
cautionary note about the referencing issue which defeats weak
key or weak value tables.
|
|
|
|
|
|
| |
* hash.c (do_weak_tables): Do not test hash entries for
reachability, only keys and values. This is not worth doing
and possibly adds cycles.
|
|
|
|
|
|
|
| |
* hash.c (do_weak_tables): Iterate to the end of each chain, not
quitting early when a reachable tail is found. This has the effect
that we will always count the nodes properly. Some common code
is factored out of the switch also.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit reverts the April 11, 2020 commit
a4c376979d15323ad729e92e41ba43768e8dc163, subject line
"hash: bugfix: spurious retention in weak processing".
That commit is a regression.
This revert requires a follow-up; the commit was trying to
fix an issue which now reappears. It will have to be
fixed differently.
The regression is that in a situation in which data is
referenced through two or more dependent weak tables, entries
that are reachable can spontaneously disappear from downstream
tables.
Suppose H0 and H1 are weak-key tables. Suppose the program
holds a datum K0, which is the only reference through which
D1 is reached, in the following chain:
K0 -> [H0] -> K1 -> [H1] -> D1
K0 is a key in hash table H0, which has weak keys.
The the associated value K1 is a key in H1, which then
references D1. H0 holds the only reference to K1,
and H1 holds the only reference to D1.
During the first GC marking phase, because we do not mark any
part of a table which has weak keys, when we process H0
we do not mark the value K1. Thus K1 looks unreachable.
In the second weak hash processing pass, because K1 was
treated as unreachable, the <K1, D1> entry in H1 is
incorrectly expired.
This issue affects TXR's origin_hash and form_to_ln_hash,
which are in this relationship. The problem was uncovered in
recent tags.tl work, manifesting as a spurious disappearance
of line number info when processing .txr files. The line
number info disappeared for entries which were traced through
the origin_hash via the macro-ancestor function (see the
unexpand function in tags.tl).
* hash.c (hash_mark): Only skip marking tables that have
both weak keys and values. For weak key tables, mark all the
values and vice versa: for weak value tables, mark the keys.
* txr.1: Text resembling original text restored.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a big commit motivated by the need to clean up the
situation with built-in type symbols, COBJ objects and
structs.
The struct type system allows struct types to be defined
for symbols like regex or str, which are used by built-in
or cobj types. This is a bad thing.
What is worse, structure instances are COBJ types which
identify their type using the COBJ class symbol mechanism.
There are places in the C implementation which assume
that when a COBJ has a certain class symbol, it is of
a certain expected type, which is totally different from
and incompatible form a struct instance. User code can
define a structure object which will fool that code.
There are multiple things going on in this patch.
The major theme is that the COBJ representation is changing.
Instead of a class symbol, COBJ instances now carry a
"struct cobj_class *" pointer. This pointer is obtained
by registration via the cobj_register function. All modules
must register their class symbols to obtain these class
handles, which are then used in cobj() calls for
instantiation.
The CPTR type was identical to COBJ until now, except for the
type tag. This is changing; CPTR objects will keep the old
representation with the class symbol.
commit 20fdfc6008297001491308849c17498c006fe7b4
Author: Kaz Kylheku <kaz@kylheku.com>
Date: Thu Jul 8 19:17:39 2021 -0700
* ffi.h (carray_cls): Declared.
* hash.h (hash_cls): Declared.
(hash_early_init): Declared.
* lib.h (struct cobj_class): New struct.
(struct cobj): cls member changing to struct cobj_class *.
(struct cptr): New struct, same as previous struct cobj.
(union obj): New member cp of type struct cptr, for CPTR.
(builtin_type): Declared.
(class_check): Declaration moved closer to COBJ-related
functions and updated.
(cobj_register, cobj_register_super, cobj_class_exists): New
functions declared.
(cobjclassp, cobj_handle, cobj_ops): Declarations updated.
* parser.h (parser_cls): Declared.
* rand.h (random_state_cls): Declared.
* regex.h (regex_cls): Declared.
* stream.h (stream_cls, stdio_stream_cls): Declared.
* struct.h (struct_cls): Declared.
* tree.h (tree_cls, tree_iter_cls): Declared.
* vm.h (vm_desc_cls): Declared.
* buf.c (buf_strm, make_buf_stream): Pass stream_cls
functions instead of stream_s class symbol.
* chksum.c (sha256_ctx_cls, md5_ctx_cls): New static class
handles.
(sha256_begin, sha256_hash, sha256_end, md5_begin, md5_hash,
md5_end): Pass class handles to instead of class symbols.
(chksum_init): Initialize class handle variables.
* ffi.c (ffi_type_cls, ffi_call_desc_cls, ffi_closure_cls,
union_cls): New static class handles.
(carray_cls): New global variable.
(ffi_type_struct_checked, ffi_type_print_op,
ffi_closure_struct_checked, ffi_closure_print_op,
make_ffi_type_builtin, make_ffi_type_pointer,
make_ffi_type_struct, make_ffi_type_union,
make_ffi_type_array, make_ffi_type_enum,
ffi_call_desc_checked, ffi_call_desc_print_op,
ffi_make_call_desc, ffi_make_closure, carray_struct_checked,
carray_print_op, make_carray, cptr_getobj, cptr_out,
uni_struct_checked, make_union_common): Pass class handles
instead of class symbols.
(ffi_init): Initialize class handle variables.
* filter.c (regex_from_trie): Use hash_cls class handle
instead of hash_s.
* gc.c (mark_obj): Split COBJ and CPTR cases since the
representation is different.
* hash.c (hash_cls, hash_iter_cls): New class handles.
(make_similar_hash, copy_hash, gethash_c, gethash_e, remhash,
clearhash, hash_count, get_hash_userdata, set_hash_userdata,
hashp, hash_iter_init, hash_begin, hash_next, hash_peek,
hash_reset, hash_reset, hash_uni, hash_diff, hash_symdiff,
hash_isec): Pass class handles instead of class symbols.
(hash_early_init): New function.
(hash_init): Set the class symbols in the class handles that
were created in hash_early_init at a time when these symbols
did not exist.
* lib.c (nelem): New macro.
(cobj_class): New static array.
(cobj_ptr): New static pointer.
(cobj_hash): New static hash.
(seq_iter_cls): New static class handle.
(builtin_type_p): New function.
(typeof): Struct instances now all carry the same symbol,
struct, as their COBJ class symbol. To get their type, we must
call struct_type_name.
(subtypep): Rearrangement of two cases: let's make the
reflexive case first. Adjust code for different location
of COBJ class symbol.
(seq_iter_init_with_info, seq_begin, seq_next, seq_reset,
iter_begin, iter_more, iter_item, iter_step, iter_reset,
make_like, list_collect, do_generic_funcall): Use class
handles instead of class symbols.
(class_check, cobj, cobjclassp, cobj_handle, cobj_ops): Take
class handle argument instead of class symbol.
(cobj_register, cobj_register_super, cobj_class_exists): New
functions.
(cobj_populate_hash): New static function.
(cobj_print_op): Adjust for different location of class
(cptr_print_op, cptr_typed, cptr_type, cptr_handle,
cptr_get): cptr functions now refer to obj->cp rather than
obj->co.
(copy, length, sub, ref, refset, replace, dwim_set, dwim_del,
obj_print): Use class handles for various COBJ types rather
than class symbols.
(obj_init): gc-protect cobj_hash. Initialize seq_iter_cls
class symbol and cobj_hash. Populate cobj_hash as the last
initialization step.
(init): Call hash_early_init immediately after gc_init.
diff --git a/lib.c b/lib.c
* match.c (do_match_line): Refer to regex_cls class handle
instead of regex_s..
* parser.c (parser_cls): New global class handle.
(parse, parser_get_impl, lisp_parse_impl, txr_parse,
parser_errors): Use class handles instead of class symbols.
(parse_init): Initialize parser_cls.
* rand.c (random_state_cls): New global class handle.
(make_state, random_state_p, make_random_state,
random_state_get_vec, random_fixnum, random_float, random):
Use class handles instead of class symbols.
(rand_init): Initialize random_state_cls.
* regex.c (regex_cls): New global class handle.
(chset_cls): New static class handle.
(reg_compile_csets, reg_derivative, regex_compile, regexp,
regex_source, regex_print, regex_run, regex_machine_init): Use
class handles instead of class symbols.
(regex_init): Initialize regex_cls and chset_cls.
* socket.c (make_dgram_sock_stream): Use stream_cls class
symbol instead of stream_s.
* stream.c (stream_cls, stdio_stream_cls): New class handles.
(make_null_stream, stdio_get_fd, make_stdio_stream_common,
stream_fd, sock_family, sock_type, sock_peer, sock_set_peer,
make_dir_stream, make_string_input_stream,
make_string_byte_input_stream, make_strlist_input_stream,
make_string_output_stream, make_strlist_output_stream,
get_list_from_stream, make_catenated_stream,
make_delegate_stream, make_delegate_stream, stream_set_prop,
stream_get_prop, close_stream, get_error, get_error_str,
clear_error, get_line, get_char, get_byte, get_bytes,
unget_char, unget_byte, put_buf, fill_buf, fill_buf_adjust,
get_line_as_buf, format, put_string, put_char, put_byte,
flush_stream, seek_stream, truncate_stream, get_indent_mode,
test_set_indent_mode, test_neq_set_indent_mode,
set_indent_mode, get_indent, set_indent, inc_indent,
width_check, force_break, set_max_length, set_max_depth): Use
class handle instead of symbol.
(stream_init): Initialize stream_cls and stdio_stream_cls.
* struct.c (struct_type_cls, struct_cls): New class handles.
(struct_init): Initialize struct_type_cls and struct_cls.
(struct_handle): Static function moved to avoid forward
declaration.
(stype_handle): Refer to struct_type_cls class handle instead
of struct_type_s symbol. Handle instance objects in addition
to types.
(make_struct_type): Throw error if a built-in type is being
defined as a struct type. Refer to class handle instead of
class symbol.
(find_struct_type, allocate_struct, make_struct_impl,
make_lazy_struct, copy_struct): Refer to class handle instead of
class symbol.
* strudel.c (make_struct_delegate_stream): Refer to stream_cls
class handle instead of stream_s symbol.
* sysif.c (dir_cls): New class handle.
(poll_wrap): Use typep instead of subtypep, eliminating access
to class symbol.
(opendir_wrap, closedir_wrap, readdir_wrap): Use class handles
instead of class symbols.
(sysif_init): Initialize dir_cls.
* syslog.c (make_syslog_stream): Refer to stream_cls class
handle instead of stream_s symbol.
* tree.c (tree_cls, tree_iter_cls): New class handles.
(tree_insert_node, tree_lookup_node, tree_delete_node,
tree_root, tree_equal_op, tree, copy_search_tree,
make_similar_tree, treep, tree_begin, copy_tree_iter,
replace_tree_iter, tree_reset, tree_next, tree_peek,
tree_clear): Use class handle instead of class symbol.
(tree_init): Initialize tree_cls and tree_iter_cls.
* unwind.c (sys_cont_cls): New static class handle.
(revive_cont, capture_cont): Use class handle instead of class
symbol.
(uw_late_init): Initialize sys_cont_cls.
* vm.c (vm_desc_cls): New global class handle.
(vm_closure_cls): New static class handle.
(vm_desc_struct, vm_make_desc, vm_closure_struct,
vm_make_closure, vm_copy_closure): Use class handle instead of
class symbol.
(vm_init): Initialize vm_desc_cls and vm_closure_cls.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a flaw that has been in the code since the initial
implementation in 2009. Weak hash tables are only partially
marked during the initial garbage collection marking phase.
They are put into a global list, which is then walked again
to do the weak processing: to expire items which are not
reachable, and then finish walking the table objects. Problem
is, the code assumes that this late processing will not
discover more hash tables and put them into that global list.
This creates a problem when weak hash table contain weak
hash tables, such as in the important and very common case
when a global variable (binding stored in a weak hash table)
contains a weak hash table! These hash tables discovered
during weak hash table processing are partially marked, and
left that way. The result is that their table vectors get
prematurely scavenged by the garbage collector, and then
fall victim to use-after-free crashing.
Note: do_iters doesn't have this bug. Though the
reachable_iters list resembles reachable_weak_hashes, the key
difference is that do_iters does not do any marking, and
so will not discover any more reachable objects.
All it does is update some counts in the hashes to which
the still-reachable iterators point.
* hash.c (do_weak_tables): Clear the reachable_weak_hashes
list on entry into the function, taking a local copy of its
head. After walking the list, check the global variable again;
it if has become non-null, it means more weak tables were
discovered and added to the list. In that case, make a
recursive call (susceptible to tail call treatment) to process
the list again.
|
|
|
|
|
|
| |
hash.c (equal_hash): Do not multiply by the seed if it is
zero; substitute 1. Otherwise under the default seed of zero,
the hash becomes zero.
|
|
|
|
|
|
|
|
|
| |
* hash.c (hash_revget): Default to equal, except in
compatibility mode.
(hash_keys_of): Also default to equal. This function is too
new to bother with compatibility switching.
* txr.1: Documented, with compat notes.
|
|
|
|
|
|
|
|
|
| |
* hash.c (hash_keys_of): New function.
(hash_init): Register hash-keys-of intrinsic
* hash.h (hash_keys_of): Declared.
* txr.1: Documented.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* METALICENSE: 2020 copyrights bumped to 2021. Added note
about SHA-256 routines from Colin Percival.
* LICENSE, LICENSE-CYG, Makefile, alloca.h, args.c, args.h,
arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c,
chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h,
configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h,
filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h,
hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped,
lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h,
lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h,
parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c,
regex.h, share/txr/stdlib/asm.tl, share/txr/stdlib/awk.tl,
share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl,
share/txr/stdlib/compiler.tl, share/txr/stdlib/conv.tl,
share/txr/stdlib/copy-file.tl, share/txr/stdlib/debugger.tl,
share/txr/stdlib/defset.tl, share/txr/stdlib/doloop.tl,
share/txr/stdlib/each-prod.tl, share/txr/stdlib/error.tl,
share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl,
share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl,
share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl,
share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl,
share/txr/stdlib/package.tl, share/txr/stdlib/param.tl,
share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl,
share/txr/stdlib/pmac.tl, share/txr/stdlib/quips.tl,
share/txr/stdlib/save-exe.tl, share/txr/stdlib/socket.tl,
share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl,
share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl,
share/txr/stdlib/trace.tl, share/txr/stdlib/txr-case.tl,
share/txr/stdlib/type.tl, share/txr/stdlib/vm-param.tl,
share/txr/stdlib/with-resources.tl,
share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl,
signal.c, signal.h, socket.c, socket.h, stream.c, stream.h,
struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h,
syslog.c, syslog.h, termios.c, termios.h, time.c, time.h,
tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr,
y.tab.c.shipped: Copyright year bumped to 2021.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Makefile (OBJS): Add new time.o.
* eval.c (eval_init): Registration of time functions is
removed from here; it is done in time_init now, in time.c.
* hash.c: Must #include "time.h" now.
* lib.c (time_s, time_local_s, time_utc_s, time_string_s,
time_parse_s, year_s, month_s, day_s, hour_s, min_s, sec_s,
dst_s, gmtoff_s, zone_s): Variable definitions removed.
These are now in time.c. Also declared in time.h.
(time_sec, time_sec_usec, gmtime_r, localtime_r, string_time,
time_string_local, time_string_utc, broken_time_list,
tm_to_time_struct, broken_time_struct, time_fields_local,
time_fields_utc, time_struct_local, time_struct_utc,
time_fields_to_tm, time_struct_to_tm, make_time_impl,
make_time, epoch_tm, strptime_wrap, time_parse, setenv,
unsetenv, timegm_hack, make_time_utc, time_meth,
time_string_meth, time_parse_meth, time_parse_local,
time_parse_utc): Functions removed. These are now in time.c.
(time_init): Removed, and now in time.c as an external
function.
* lib.h (time_sec, time_sec_usec, time_string_local,
time_string_utc, time_fields_local, time_fields_utc,
time_struct_local, time_struct_utc, make_time, make_time_utc,
time_parse, time_parse_local, time_parse_utc): Declarations
removed. Now in time.h.
* rand.c: Must #include "time.h" now.
* time.c: New file.
* time.h: New file.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The c_num and c_unum functions now take a self argument for
identifying the calling function. This requires changes in a
large number of places.
In a few places, additional functions acquire a self
argument. The ffi module has the most extensive example of
this.
Some functions mention their name in a larger string, or have
scattered literals giving their name; with the introduction of
the self local variable, these are replaced by references to
self.
In the following changelog, the notation TS stands for "take
self argument", meaning that the functions acquires a new "val
self" argument. The notation DS means "define self": the functions
in question defines a self variable, which they pass down.
The notation PS means that the functions pass down an existing
self variable to functions that now require it.
* args.h (args_count): TS.
* arith.c (c_unum, c_num): TS.
(toint, exptv): DS.
* buf.c (buf_check_len, buf_check_alloc_size, buf_check_index,
buf_do_set_len, replace_buf, buf_put_buf, buf_put_i8,
buf_put_u8, buf_put_char, buf_put_uchar, buf_get_bytes,
buf_get_i8, buf_get_u8, buf_get_cptr,
buf_strm_get_byte_callback, buf_strm_unget_byte, buf_swap32,
str_buf, buf_int, buf_uint, int_buf, uint_buf): PS.
(make_duplicate_buf, buf_shrink, sub_buf, buf_print,
buf_pprint): DS.
* chskum.c (sha256_stream_impl, sha256_buf, crc32_buf,
md5_stream_impl, md5_buf): TS.
(chksum_ensure_buf, sha256_stream, sha256, sha256_hash,
md5_stream, md5, md5_hash): PS.
(crc32_stream): DS.
* combi.c (perm_while_fun, perm_gen_fun_common,
perm_str_gen_fun, rperm_gen_fun, comb_vec_gen_fun,
comb_str_gen_fun, rcomb_vec_gen_fun, rcomb_str_gen_fun): DS.
* diff.c (dbg_clear, dbg_set, dbg_restore): DS.
* eval.c (do_eval, gather_free_refs, maprodv, maprendv,
maprodo, do_args_apf, do_args_ipf): DS.
(op_dwim, me_op, map_common): PS.
(prod_common): TS.
* ffi.c (struct txr_ffi_type): release member TS.
(make_ffi_type_pointer): PS and release argument TS.
(ffi_varray_dynsize, ffi_array_in, ffi_array_put_common,
ffi_array_get_common, ffi_varray_in, ffi_varray_null_term): PS.
(ffi_simple_release, ffi_ptr_in_release, ffi_struct_release,
ffi_wchar_array_get, ffi_array_release_common,
ffi_array_release, ffi_varray_release): TS.
(ffi_float_put, double_put, ffi_be_i16_put, ffi_be_u16_put,
ffi_le_i16_put, ffi_le_u16_put, ffi_be_i32_put, ffi_be_u32_put,
ffi_le_i32_put, ffi_sbit_put, ffi_ubit_put, ffi_buf_d_put,
make_ffi_type_array, make_ffi_type_enum, ffi_type_compile,
make_ffi_type_desc, ffi_make_call_desc, ffi_call_wrap,
ffi_closure_dispatch_save, ffi_put_into, ffi_in, ffi_get,
ffi_put, carray_set_length, carray_blank, carray_buf,
carray_buf_sync, carray_cptr, carray_refset, carray_sub,
carray_replace, carray_uint, carray_int): PS.
(carray_vec, carray_list): DS.
* filter.c (url_encode, url_decode, base64_stream_enc_impl): DS.
* ftw.c (ftw_callback, ftw_wrap): DS.
* gc.c (mark_obj, gc_set_delta): DS.
* glob.c (glob_wrap): DS.
* hash.c (equal_hash, eql_hash, eq_hash, do_make_hash,
hash_equal, set_hash_traversal_limit, gen_hash_seed): DS.
* itypes.c (c_i8, c_u8, c_i16, c_u16, c_i32, c_u32, c_i64,
c_u64, c_short, c_ushort, c_int, c_uint, c_long, c_ulong): PS.
* lib.c (seq_iter_rewind): TS and becomes internal.
(seq_iter_init_with_info, seq_setpos, replace_str, less,
replace_vec, diff, isec, obj_print_impl): PS.
(nthcdr, equal, mkstring, mkustring, upcase_str, downcase_str,
search_str, sub_str, cat_str, scat2, scat3, fmt_join,
split_str_keep, split_str_set, trim_str, int_str, chr_int,
chr_str, chr_str_set, vector, vecref, vecref_l, list_vec,
copy_vec, sub_vec, cat_vec, lazy_str_put, lazy_str_gt,
length_str_ge, length_str_lt, length_str_le, cptr_size_hint,
cptr_int, out_lazy_str, out_quasi_str, time_string_local_time,
time_string_utc, time_fields_local_time, time_fields_utc,
time_struct_local, time_struct_utc, make_time, time_meth,
time_parse_meth): DS.
(init_str, cat_str_init, cat_str_measure, cat_str_append,
vscat, time_fields_to_tm, time_struct_to_tm, make_time_impl): TS.
* lib.h (seq_iter_rewind): Declaration removed.
(c_num, c_unum, init_str): Declarations updated.
* match.c (LOG_MISMATCH, LOG_MATCH): PS.
(h_skip, h_coll, do_output_line, do_output, v_skip, v_fuzz,
v_collect): DS.
* parser.c (parser, circ_backpatch, report_security_problem,
hist_save, repl, lino_fileno, lino_getch, lineno_getl,
lineno_gets, lineno_open): DS.
(parser_set_lineno, lisp_parse_impl): PS.
* parser.l (YY_INPUT): PS.
* rand.c (make_random_state): PS.
* regex.c (print_rec): DS.
(search_regex): PS.
* signal.c (kill_wrap, raise_wrap, get_sig_handler,
getitimer_wrap, setitimer_wrap): DS.
* socket.c (addrinfo_in, sockaddr_pack, fd_timeout,
to_connect, open_sockfd, sock_mark_connected,
sock_timeout): TS.
(getaddrinfo_wrap, dgram_set_sock_peer, sock_bind,
sock_connect, sock_listen, sock_accept, sock_shutdown,
sock_send_timeout, sock_recv_timeout, socketpair_wrap): DS.
* stream.c (generic_fill_buf, errno_to_string, stdio_truncate,
string_out_put_string, open_fileno, open_command, base_name,
dir-name): DS.
(unget_byte, put_buf, fill_buf, fill_buf_adjust,
get_line_as_buf, formatv, put_byte, test_set_indent_mode,
test_neq_set_indent_mode, set_indent_mode, set_indent,
inc_indent, set_max_length, set_max_depth, open_subprocess, run ): PS.
(fds_subst, fds_swizzle): TS.
* struct.c (make_struct_type, super, umethod_args_fun): PS.
(method_args_fun): DS.
* strudel.c (strudel_put_buf, strudel_fill_buf): DS.
* sysif.c (errno_wrap, exit_wrap, usleep_wrap, mkdir_wrap,
ensure_dir, makedev_wrap, minor_wrap, major_wrap, mknod_wrap,
mkfifo_wrap, wait_wrap, wifexited, wexitstatus, wifsignaled,
wtermsig, wcoredump, wifstopped, wstopsig, wifcontinued,
dup_wrap, close_wrap, exit_star_wrap, umask_wrap, setuid_wrap,
seteuid_wrap, setgid_wrap, setegid_wrap,
simulate_setuid_setgid, getpwuid_wrap, fnmatch_wrap,
dlopen_wrap): DS.
(chmod_wrap, do_chown, flock_pack, do_utimes, poll_wrap,
setgroups_wrap, setresuid_wrap, setresgid_wrap, getgrgid_wrap): PS.
(c_time): TS.
* sysif.h (c_time): Declaration updated.
* syslog.c (openlog_wrap, syslog_wrap): DS.
* termios.c (termios_pack): TS.
(tcgetattr_wrap, tcsetattr_wrap, tcsendbreak_wrap,
tcdrain_wrap, tcflush_wrap, tcflow_rap, encode_speeds,
decode_speeds): DS.
* txr.c (compato, array_dim, gc_delta): DS.
* unwind.c (uw_find_frames_by_mask): DS.
* vm.c (vm_make_desc): PS.
(vm_make_closure, vm_swtch): DS.
|
|
|
|
|
|
|
|
|
|
| |
Time for some spring cleaning.
* args.c, arith.c, buf.c, cadr.c, chksum.c, debug.c, ftw.c,
gc.c, gencadr.txr, glob.c, hash.c, lisplib.c, match.c,
parser.c, parser.l, parser.y, rand.c, signal.c, stream.c,
strudel.c, syslog.c, tree.c, unwind.c, utf8.c, vm.c: Numerous
unnecessary #include directives removed.
|