path: root/hash.c
Commit message | Author | Age | Files | Lines
* New function: group-map. | Kaz Kylheku | 2022-03-02 | 1 | -0/+7
| | | | | | | | | | | | * hash.c (group_map): New function. (hash_init): group-map intrinsic registered. * hash.h (group_map): Declared. * tests/010/hash.tl: New test case. * txr.1: Documented together with group-by. Extra paren removed from group-by example.
* hash: group-reduce calls hash-update. | Kaz Kylheku | 2022-03-02 | 1 | -9/+3
| | | | | | | * hash.c (group_reduce): Replace loop with call to hash_update which is exactly the same logic, and even more efficient because it avoids calling us_rplacd. (hash_update): Fix incorrect self name.
* Fix more -fsanitize=implicit-conversion findings. | Kaz Kylheku | 2022-02-14 | 1 | -2/+1
| * arith.c (highest_significant_bit): Bugfix: do not pass a negative value to highest_bit, where we will then get the wrong idea about the number of significant bits in the value, since the __builtin_clz primitives will include the sign bit. We want to complement all the bits, so that the sign bit will go to zero. We can do this arithmetically by taking the additive inverse (which is the two's complement (which is the complement plus one)) and subtracting one. (ash): Avoid left shifting a negative number in HAVE_UBSAN mode using the same trick as in num_fast. * ffi.c (ffi_swap_u16): Here the shift and or calculation is producing a value beyond 16 bits which we are relying on the implicit conversion back to uint16_t to trim away. We add the cast to uint16_t to make it explicit. * hash.c (equal_hash): Also handle the CHR and NUM cases here via c_u like in eql_hash and eq_hash.
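The arithmetic behind the highest_significant_bit fix can be sketched as follows. This is a minimal illustration, not the code in arith.c: the function names are hypothetical, a plain loop stands in for the `__builtin_clz` primitives, and the value is computed as `-(n + 1)`, which equals the "additive inverse minus one" of the commit message but cannot overflow for the most negative input.

```c
#include <stdint.h>

/* Number of significant bits in a non-negative value (0 for zero).
   A plain loop stands in for the __builtin_clz primitives. */
static int significant_bits(uintmax_t v)
{
  int n = 0;

  while (v != 0) {
    n++;
    v >>= 1;
  }

  return n;
}

/* Hypothetical sketch: for negative n, -n - 1 is the bitwise complement
   of n in two's complement, so the sign bit is cleared before counting.
   Written as -(n + 1) so that the most negative value does not overflow. */
static int highest_significant_bit_sketch(intmax_t n)
{
  if (n >= 0)
    return significant_bits((uintmax_t) n);
  return significant_bits((uintmax_t) -(n + 1));
}
```

With this formulation, -8 is treated like 7 and yields three significant bits, while -1 is treated like 0 and yields none.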
* Few adjustments to no-implicit-conversion patch. | Kaz Kylheku | 2022-02-14 | 1 | -13/+7
| | | | | | | | | | | | | | | | | | | | * lib.h (c_u): New inline function: unsafe conversion to ucnum, analogous to c_n for cnum. * hash.c (equal_hash, hash_iter_init): Use UINT_PTR_MAX instead of convert(ucnum, -1). (eql_hash): mp_hash returns unsigned long, so shouldn't require a cast to go to the uint_ptr_t. The types are of the same size, or at worst it is a widening. Also replace convert(ucnum, -1) by UINT_PTR_MAX here. Combining the TAG_CHR and TAG_NUM cases, and using c_u, which is more efficient since c_chr and c_num are non-inlined functions which redundantly check type. We no longer need a self variable in this function. (eq_hash): Same TAG_CHR and TAG_NUM changes as eql_hash. * regex.c (char_set_add): Reformat change to avoid line break across assignment.
* Fix various instances of implicit conversions. | Paul A. Patience | 2022-02-14 | 1 | -7/+7
| The implicit conversions were discovered with Clang's UBSan (with the -fsanitize=implicit-conversion option). * gc.c (sweep_one): Convert only the inverted REACHABLE, since block->t.type is already of the right type. * hash.c (eql_hash, eq_hash, hash_iter_init, us_hash_iter_init): Explicitly convert to ucnum. * linenoise/linenoise.c (enable_raw_mode): Explicitly convert the inverted flag sets to tcflag_t. * mpi/mpi.c (mp_set_uintptr): Explicitly convert to uint_ptr_t. * regex.c (char_set_add): Explicitly convert to bitcell_t. * struct.c (struct_inst_hash): Correct type of hash from cnum to ucnum.
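The kind of change these entries describe can be illustrated with a small sketch. The type `flags_t` and both function names are hypothetical stand-ins (for something like tcflag_t and ffi_swap_u16); the point is only that an inverted or shifted narrow value is computed in `int` after integer promotion, and the cast states the narrowing explicitly instead of leaving it to an implicit conversion.

```c
#include <stdint.h>

typedef uint16_t flags_t;  /* hypothetical narrow flag type */

/* Clear the bits of mask in value. ~mask is evaluated in int after
   integer promotion; the casts make the narrowing back to flags_t
   explicit. */
static flags_t clear_flags_sketch(flags_t value, flags_t mask)
{
  return (flags_t) (value & (flags_t) ~mask);
}

/* Byte swap of a 16 bit value. The shift-and-or result can exceed
   16 bits, so it is trimmed with an explicit cast. */
static uint16_t swap_u16_sketch(uint16_t n)
{
  return (uint16_t) ((n << 8) | (n >> 8));
}
```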
* Copyright year bump 2022. | Kaz Kylheku | 2022-01-11 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | *LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2022.
* Eliminate declaration-after-statement everywhere. | Kaz Kylheku | 2021-12-29 | 1 | -1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The use of -ansi doesn't by itself diagnose instances of some constructs we don't want in the project, like mixed declarations and statements. * configure (diag_flags): Add -Werror=declaration-after-statement. This is C only, so filter it out for C++. Also add -Werror=vla. * HACKING: Update inaccurate statements about what dialect we are using. TXR isn't pure C90: some GCC extensions are used. We even use long long if the configure script detects it as working, and some C99 library features. * buf.c (replace_buf, buf_list): Fix by reordering. * eval.c (op_dohash, op_load_time_lit): Fix by reordering. * ffi.c (ffi_simple_release): Fix by reordering. (align_sw_get): Fix empty macro to expand to dummy declaration so a semicolon after it isn't interpreted as a statement. On platforms with alignment, remove a semicolon from the macro so that it requires one. (ffi_i8_put, ffi_u8_put): Fix by reordering. * gc.c (gc_init): Fix with extra braces. * hash.c (hash_init): Fix by reordering. * lib.c (list_collect_revappend, sub_iter, replace_str, replace_vec, mapcar_listout, mappend, mapdo, window_map_list, subst): Fix by reordering. (gensym, find, rfind, pos, rpos, in, search_common): Fix by renaming optional argument and using declaration instead of assignment. * linenoise/linenoise.c (edit_in_editor): Fix by reordering. * parser.c (is_balanced_line): Fix by reordering. * regex.c (nfa_count_one, print_rec): Fix by reordering. * signal.c (sig_mask): Fix by reordering. * stream.c (get_string): Fix by renaming optional argument and using declaration instead of assignment. * struct.c (lookup_static_slot_desc): Fix by turning mutated variable into block local. (umethod_args_fun): Fix by reordering. (get_special_slot): Fix by new scope via braces. * sysif.c (usleep_wrap): Fix by new scope via braces. 
(setrlimit_wrap): Fix by new scope via braces. * time.c (time_string_meth, time_parse_meth): Fix by reordering. * tree.c (tr_do_delete_spec): Fix by new scope via braces. * unwind.h (uw_block_beg): New macro which doesn't define RESULTVAR but expects it to refer to an existing one. (uw_block_begin): Replace do while (0) with enum trick so that we have a declaration that requires a semicolon, rather than a statement, allowing declarations to follow. (uw_match_env_begin): Now opens a scope and features the same enum trick as in uw_block_begin. This fixes a declaration-follows-statement issue in the v_output function in match.c. (uw_match_env_end): Closes scope opened by uw_match_env_begin. * unwind.c (revive_cont): Fix by introducing variable, and using new uw_block_beg macro. * vm.c (vm_execute_closure): Fix using combination of local variable and reordering.
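The "enum trick" mentioned for uw_block_begin can be sketched like this. The macro below is a hypothetical simplification, not the actual definition in unwind.h: it ends in an anonymous enum declaration rather than a `do { } while (0)` statement, so the semicolon written by the caller completes a declaration and further declarations may follow without tripping -Werror=declaration-after-statement.

```c
/* Hypothetical block macro, not TXR's uw_block_begin. It expands to
   declarations only; the trailing enum needs the caller's semicolon. */
#define BLOCK_BEGIN_SKETCH(resultvar)        \
  int resultvar = 0;                         \
  enum { resultvar ## _block_dummy }

static int block_demo(void)
{
  BLOCK_BEGIN_SKETCH(result);  /* still in the declaration part */
  int extra = 41;              /* so this declaration is accepted */

  result = extra + 1;
  return result;
}
```

Had the macro ended in a statement, the declaration of `extra` would come after a statement and be rejected under the new diagnostic flag.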
* hash: 64 bit string and buffer hashing and seeds. | Kaz Kylheku | 2021-11-17 | 1 | -3/+112
| | | | | | | | | | | | | | | | | | * hash.c (randbox, hash_c_str, hash_buf): Separate implementation for 64 bit pointers, using 64 bit random values, and producing a 64 bit hash, taking in a 64 bit seed. (gen_hash_seed): Use time_sec_nsec to get nanoseconds. On 64 bit, put together the seed differently to generate a wider value. * tests/009/json.txr: Change from hash tables to lists, so the order of the output doesn't change between 64 and 32 bits, due to the different string hashing. * tests/009/json.expected: Updated. * txr.1: Documented that seeds are up to 64 bits, but with possibly only the lower 32 bits being used.
* hash: spurious space in printed representation. | Kaz Kylheku | 2021-11-08 | 1 | -6/+10
| | | | | * hash.c (hash_print_op): Only set the need_space flag if some leading item is printed.
* hash: gc problem in copy-hash. | Kaz Kylheku | 2021-09-13 | 1 | -1/+1
| | | | | | | | | * hash.c (copy_hash): The order of allocating the hash object and vector is incorrect. The hash must be allocated last, like it is in do_make_hash and make_similar_hash. If the vector is allocated after the hash, it can trigger gc, and then the garbage collector will traverse the uninitialized parts of the hash object.
* hash: use unsigned, and operation. | Kaz Kylheku | 2021-08-19 | 1 | -15/+15
| | | | | | | | | | | | | | | | * hash.c (struct hash): modulus and count change from cnum to ucnum. (hash_mark, hash_grow, copy_hash, do_weak_tables): Use ucnum local vars. (do_make_hash, make_similar_hash): Use c_unum to obtain modulus. (gethash_c, gethash_e): Use & masking operation to reduce hash value to table size. (remhash): Move sanity check before decrement since unsigned value can't go below zero. (clearhash): Use ucnum and c_unum. (hash_iter_peek): Use ucnum for chain count local. * hash.h (struct hash_iter): chain changes from cnum to ucnum.
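The two ideas in this change, reducing a hash with `&` and checking an unsigned count before decrementing it, can be sketched as below. The function names are hypothetical, and the sketch assumes the table modulus is a power of two with `modulus - 1` as the mask, which the commit message implies but does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduce a hash value to a bucket index by masking. Only valid when
   modulus is a power of two (assumption of this sketch). */
static size_t bucket_index_sketch(uintptr_t hash, uintptr_t modulus)
{
  assert(modulus != 0 && (modulus & (modulus - 1)) == 0);
  return (size_t) (hash & (modulus - 1));
}

/* An unsigned count cannot go below zero, so the sanity check has to
   come before the decrement. Returns 1 on success, 0 if the count was
   already zero. */
static int decrement_count_sketch(uintptr_t *count)
{
  if (*count == 0)
    return 0;
  --*count;
  return 1;
}
```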
* license: reformat to fit 80 columns. | Kaz Kylheku | 2021-08-16 | 1 | -12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c, buf.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h: License reformatted. * lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.
* hash: change make_hash interface. | Kaz Kylheku | 2021-07-22 | 1 | -25/+36
| The make_hash function now takes the hash_weak_opt_t enumeration instead of a pair of flags. * hash.c (do_make_hash): Take enum argument instead of pair of flags. Just store the option; nothing to calculate. (weak_opt_from_flags): New static function. (tweak_hash): Function removed. (make_seeded_hash): Adjust to new do_make_hash interface with help from weak_opt_from_flags. (make_hash, make_eq_hash): Take enum argument instead of pair of flags. (hashv): Calculate hash_weak_opt_t enum from the extracted flags, pass down to make_eq_hash or make_hash. * hash.h (tweak_hash): Declaration removed. (make_hash, make_eq_hash): Declarations updated. * eval.c (me_case, expand_switch): Update make_hash calls to new style. (eval_init): Update make_hash calls and get rid of tweak_hash calls. This renders the tweak_hash function unused. * ffi.c (make_ffi_type_enum, ffi_init): Update make_hash calls to new style. * filter.c (make_trie, trie_add, filter_init): Likewise. * lib.c (make_package_common, obj_init, obj_print): Likewise. * lisplib.c (lisplib_init): Likewise. * match.c (dir_tables_init): Likewise. * parser.c (parser_circ_def, repl, parse_init): Likewise. * parser.l (parser_l_init): Likewise. * struct.c (struct_init, get_slot_syms): Likewise. * sysif.c (get_env_hash): Likewise. * lex.yy.c.shipped, y.tab.c.shipped: Updated.
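A helper that maps the old pair of flags onto the enumeration might look like the sketch below. Only the names hash_weak_opt_t, hash_weak_or and hash_weak_and appear in this log; the other member names, the numeric values and the function body are assumptions made for illustration, chosen so that the keys and values options combine bitwise into the or-semantics option.

```c
/* Assumed layout: the first three member names and all numeric values
   are guesses for illustration, not taken from hash.h. */
typedef enum hash_weak_opt_sketch {
  hash_weak_none = 0,
  hash_weak_keys = 1,
  hash_weak_vals = 2,
  hash_weak_or = 3,   /* weak keys and values, or-semantics */
  hash_weak_and = 4   /* weak keys and values, and-semantics */
} hash_weak_opt_sketch_t;

/* Hypothetical counterpart of weak_opt_from_flags: two booleans in,
   one enumeration value out. */
static hash_weak_opt_sketch_t weak_opt_from_flags_sketch(int weak_keys,
                                                         int weak_vals)
{
  return (hash_weak_opt_sketch_t) ((weak_keys ? hash_weak_keys : 0) |
                                   (weak_vals ? hash_weak_vals : 0));
}
```

Under this layout, passing both flags yields the or-semantics option, which matches the documented default for tables that specify both :weak-keys and :weak-vals.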
* hash: rename "flags" to "weak options". | Kaz Kylheku | 2021-07-22 | 1 | -13/+13
| The flags field of hashes isn't really functioning as flags; it's an enumeration whose numeric properties are exploited in one place in the code. * hash.h (enum hash_flags): Rename to enum hash_weak_opt. (hash_flags_t): Rename to hash_weak_opt_t. (tweak_hash): Declaration updated. * hash.c (struct hash): Rename flags member to wkopt. (hash_print_op, hash_mark, do_make_hash, tweak_hash, make_similar_hash, copy_hash, do_weak_tables): Follow renames.
* hash: and-semantics: add missing nuance in marking. | Kaz Kylheku | 2021-07-21 | 1 | -1/+12
| A weak table with and-semantics expires entries only when both their key and value are unreachable. When this condition is not met, therefore, the hash table generates a reference to both the key and value. This gives rise to a subtlety that must be correctly handled in the marking phase. * hash.c (hash_mark): When marking an and-semantics table, whenever we find a reachable key or value, we know that the entry is staying. Therefore we mark it: if the key is unreachable, we mark the value and vice versa. This is important because these unreachable objects may be the only references for reaching some other objects via one or more weak hash tables. Those secondary objects may spontaneously disappear due to those other hash tables removing their entries. E.g. suppose H0 has and-semantics, and some K-V entry in H0 has a reachable K, but unreachable V. Therefore the entry is not eligible for removal, and thus maintains references to K and V. Suppose V happens to be a key in a weak-key hash table H1. If, while marking H0, we do not mark V, then there is a risk that H1 will be processed first during the later weak processing stage, and H1 will wrongly expire its V entry due to the key V being unreachable. Then when H0 is processed, it will mark V, making it reachable, but too late: the V entry in H1 is already spuriously gone. The main principle at play is that an entry in an and-semantics table strongly holds on to a key if the value is reachable and vice versa. Only if both are simultaneously unreachable does it relinquish its references.
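The marking rule described in this message can be sketched for a single entry. The types and the function are hypothetical simplifications: a boolean stands in for the garbage collector's reachability state, and setting it stands in for a real recursive mark.

```c
#include <stdbool.h>

struct obj_sketch { bool reachable; };  /* stand-in for a GC object */
struct entry_sketch { struct obj_sketch *key, *val; };

/* Marking step for one entry of an and-semantics weak table: if either
   side is already reachable, the entry stays, so the other side is
   marked as well. Returns true if the entry is retained. */
static bool mark_and_entry_sketch(struct entry_sketch *e)
{
  if (e->key->reachable || e->val->reachable) {
    e->key->reachable = true;
    e->val->reachable = true;
    return true;
  }

  return false;
}
```

Because the unreachable side is marked during the marking phase, a second weak table that uses it as a key sees it as reachable no matter which table is processed first.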
* hash: support both semantics of weak keys + values. | Kaz Kylheku | 2021-07-21 | 1 | -60/+81
| Hash tables with weak keys and values now support a choice of both possible semantics: under and-semantics, an entry lapses when both the key and value are unreachable. Under or-semantics, an entry lapses if either the key or value is unreachable. The and-semantics is new. Until TXR 266, only or-semantics was supported. This will be the default: when a hash table is specified as :weak-keys and :weak-vals, it will have or-semantics. The keywords :weak-or and :weak-and specify weak keys and values, with the specific semantics. They are mutually exclusive, but tolerate the presence of :weak-keys and :weak-vals. The make-hash function is being extended such that if its leftmost argument, <weak-keys>, is specified as one of the keywords :weak-and or :weak-or, then the hash table will have weak keys and values with the specified semantics, and the <weak-vals> argument is ignored (values are weak even if that argument is false). * eval.c (eval_init): Initially register the top_vb, top_mb, top_smb, special and builtin hashes as ordinary hashes: no weak keys or values. Then use tweak_hash to switch to weak keys+vals with and-semantics. We do it this way because the keywords are not yet initialized; we cannot use them. * hash.c (enum hash_flags, hash_flags_t): Moved to header. Member hash_weak_both renamed to hash_weak_or. New member hash_weak_and. (weak_and_k, weak_or_k): New keyword variables. (hash_print_op): Handle hash_weak_and by printing :weak-and. (hash_mark): Handle hash_weak_and by marking nothing, like hash_weak_or. (do_make_hash): Check first argument against the two new keywords and set flags accordingly. This function is called from eval_init before the keywords have been initialized, in which case weak_keys == weak_and_k is true when both are nil; we watch for that.
(tweak_hash): Now returns void and takes a hash_flags_t argument which is simply planted. (do_weak_tables): Implement hash_weak_and case. Remove the compat 266 stuff from hash_weak_or. Compatibility is no longer required since we are not changing the default semantics of hash tables. Phew; that's a load of worry off the plate. (hashv): Parse the two new keywords, validate and provide semantics. (hash_init): Initialize weak_and_k and weak_or_k keywords. * hash.h (enum hash_flags, hash_flags_t): Moved here now. (weak_and_k, weak_or_k): Declared. * lib.c (compat_fixup): Remove call to parse_compat_fixup. * parser.c (parse_init): Create stream_parser_hash with and-semantics. (parse_compat_fixup): Function removed. * parser.h (parse_compat_fixup): Declaration removed. * txr.1: Hash documentation updated.
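The two semantics reduce to a small predicate over the reachability of an entry's key and value. The enumeration and function names below are hypothetical; the logic follows the wording of the commit message.

```c
#include <stdbool.h>

enum weak_semantics_sketch { weak_or_sketch, weak_and_sketch };

/* Does an entry lapse (get removed) during weak processing?
   and-semantics: only when both key and value are unreachable.
   or-semantics: when either one is unreachable. */
static bool entry_lapses_sketch(enum weak_semantics_sketch sem,
                                bool key_reachable, bool val_reachable)
{
  switch (sem) {
  case weak_and_sketch:
    return !key_reachable && !val_reachable;
  case weak_or_sketch:
  default:
    return !key_reachable || !val_reachable;
  }
}
```

An entry with a reachable key and an unreachable value therefore lapses under or-semantics but is kept under and-semantics.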
* parse/eval: use weak-both hash tables. | Kaz Kylheku | 2021-07-20 | 1 | -0/+9
| This addresses the problem that a4c376979d15323ad729e92e41ba43768e8dc163 tried to fix. * eval.c (eval_init): Make all the top-level binding tables, top_fb, top_vb, top_mb, top_smb, special and builtin, weak-both tables: keys and values are weak. This way, the entries disappear if both key and value are unreachable, even if they refer to each other. (eval_compat_fixup): In 266 or earlier compat mode, weak-both tables don't have the right semantics, so we tweak the tables to weak-key tables. * parser.c (parse_init): Same treatment for stream_parser_hash. We want an entry to disappear from the hash if neither the parser nor the stream are reachable. (parse_compat_fixup): New function. * parser.h (parse_compat_fixup): Declared. * hash.c, hash.h (tweak_hash): New function. * lib.c (compat_fixup): Call parse_compat_fixup.
* hash: change semantics of weak-both hash tables. | Kaz Kylheku | 2021-07-20 | 1 | -16/+40
| From now on, hash tables with both weak keys and values have disjunctive retention semantics. If either the key or value of an entry is reachable, then the entry stays. This is subject to compatibility. * hash.c (do_weak_tables): Expire an entry if neither the key nor the value is reachable. In 266 or lower compatibility mode, expire an entry if either the key or value is unreachable, like before. * txr.1: Document the change, with compat notes. Add a cautionary note about the referencing issue which defeats weak key or weak value tables.
* hash: remove unnecessary tests in weak processing. | Kaz Kylheku | 2021-07-20 | 1 | -4/+3
| | | | | | * hash.c (do_weak_tables): Do not test hash entries for reachability, only keys and values. This is not worth doing and possibly adds cycles.
* hash: fix possibly incorrect counts in weak processing. | Kaz Kylheku | 2021-07-20 | 1 | -16/+11
| | | | | | | * hash.c (do_weak_tables): Iterate to the end of each chain, not quitting early when a reachable tail is found. This has the effect that we will always count the nodes properly. Some common code is factored out of the switch also.
* hash: revert bad fix in weak processing. | Kaz Kylheku | 2021-07-20 | 1 | -13/+29
| This commit reverts the April 11, 2020 commit a4c376979d15323ad729e92e41ba43768e8dc163, subject line "hash: bugfix: spurious retention in weak processing". That commit is a regression. This revert requires a follow-up; the commit was trying to fix an issue which now reappears. It will have to be fixed differently. The regression is that in a situation in which data is referenced through two or more dependent weak tables, entries that are reachable can spontaneously disappear from downstream tables. Suppose H0 and H1 are weak-key tables. Suppose the program holds a datum K0, which is the only reference through which D1 is reached, in the following chain: K0 -> [H0] -> K1 -> [H1] -> D1 K0 is a key in hash table H0, which has weak keys. The associated value K1 is a key in H1, which then references D1. H0 holds the only reference to K1, and H1 holds the only reference to D1. During the first GC marking phase, because we do not mark any part of a table which has weak keys, when we process H0 we do not mark the value K1. Thus K1 looks unreachable. In the second weak hash processing pass, because K1 was treated as unreachable, the <K1, D1> entry in H1 is incorrectly expired. This issue affects TXR's origin_hash and form_to_ln_hash, which are in this relationship. The problem was uncovered in recent tags.tl work, manifesting as a spurious disappearance of line number info when processing .txr files. The line number info disappeared for entries which were traced through the origin_hash via the macro-ancestor function (see the unexpand function in tags.tl). * hash.c (hash_mark): Only skip marking tables that have both weak keys and values. For weak key tables, mark all the values and vice versa: for weak value tables, mark the keys. * txr.1: Text resembling original text restored.
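The K0 -> [H0] -> K1 -> [H1] -> D1 scenario can be replayed with a toy model. Everything below is a hypothetical simplification: each table holds one entry, a boolean stands in for the mark bit, and the `mark_values` switch selects between the behaviour restored by this revert (weak-key tables mark their values in the marking phase) and the reverted behaviour (they mark nothing).

```c
#include <stdbool.h>

struct obj_sketch { bool reachable; };
struct entry_sketch { struct obj_sketch *key, *val; bool live; };

/* Marking phase for a one-entry weak-key table. With mark_values set,
   the value is marked and the key is left alone; otherwise nothing is
   marked, which models the reverted commit. */
static void mark_phase_sketch(struct entry_sketch *e, bool mark_values)
{
  if (mark_values)
    e->val->reachable = true;
}

/* Weak processing phase: expire the entry if its key was never
   reached; otherwise finish marking its value. */
static void weak_phase_sketch(struct entry_sketch *e)
{
  if (!e->key->reachable)
    e->live = false;
  else
    e->val->reachable = true;
}

/* Returns true if the <K1, D1> entry in H1 survives a collection. */
static bool d1_survives_sketch(bool mark_values)
{
  struct obj_sketch k0 = { true }, k1 = { false }, d1 = { false };
  struct entry_sketch h0 = { &k0, &k1, true };
  struct entry_sketch h1 = { &k1, &d1, true };

  mark_phase_sketch(&h0, mark_values);
  mark_phase_sketch(&h1, mark_values);

  /* Weak processing happens to visit H1 before H0. */
  weak_phase_sketch(&h1);
  weak_phase_sketch(&h0);

  return h1.live;
}
```

With `mark_values` false, K1 is only marked when H0 is processed, after H1 has already dropped its entry; with it true, K1 is marked up front and the entry survives.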
* type: disallow structs using built-in type names. | Kaz Kylheku | 2021-07-08 | 1 | -26/+36
| This is a big commit motivated by the need to clean up the situation with built-in type symbols, COBJ objects and structs. The struct type system allows struct types to be defined for symbols like regex or str, which are used by built-in or cobj types. This is a bad thing. What is worse, structure instances are COBJ types which identify their type using the COBJ class symbol mechanism. There are places in the C implementation which assume that when a COBJ has a certain class symbol, it is of a certain expected type, which is totally different from and incompatible with a struct instance. User code can define a structure object which will fool that code. There are multiple things going on in this patch. The major theme is that the COBJ representation is changing. Instead of a class symbol, COBJ instances now carry a "struct cobj_class *" pointer. This pointer is obtained by registration via the cobj_register function. All modules must register their class symbols to obtain these class handles, which are then used in cobj() calls for instantiation. The CPTR type was identical to COBJ until now, except for the type tag. This is changing; CPTR objects will keep the old representation with the class symbol. commit 20fdfc6008297001491308849c17498c006fe7b4 Author: Kaz Kylheku <kaz@kylheku.com> Date: Thu Jul 8 19:17:39 2021 -0700 * ffi.h (carray_cls): Declared. * hash.h (hash_cls): Declared. (hash_early_init): Declared. * lib.h (struct cobj_class): New struct. (struct cobj): cls member changing to struct cobj_class *.
(struct cptr): New struct, same as previous struct cobj. (union obj): New member cp of type struct cptr, for CPTR. (builtin_type): Declared. (class_check): Declaration moved closer to COBJ-related functions and updated. (cobj_register, cobj_register_super, cobj_class_exists): New functions declared. (cobjclassp, cobj_handle, cobj_ops): Declarations updated. * parser.h (parser_cls): Declared. * rand.h (random_state_cls): Declared. * regex.h (regex_cls): Declared. * stream.h (stream_cls, stdio_stream_cls): Declared. * struct.h (struct_cls): Declared. * tree.h (tree_cls, tree_iter_cls): Declared. * vm.h (vm_desc_cls): Declared. * buf.c (buf_strm, make_buf_stream): Pass stream_cls functions instead of stream_s class symbol. * chksum.c (sha256_ctx_cls, md5_ctx_cls): New static class handles. (sha256_begin, sha256_hash, sha256_end, md5_begin, md5_hash, md5_end): Pass class handles to instead of class symbols. (chksum_init): Initialize class handle variables. * ffi.c (ffi_type_cls, ffi_call_desc_cls, ffi_closure_cls, union_cls): New static class handles. (carray_cls): New global variable. (ffi_type_struct_checked, ffi_type_print_op, ffi_closure_struct_checked, ffi_closure_print_op, make_ffi_type_builtin, make_ffi_type_pointer, make_ffi_type_struct, make_ffi_type_union, make_ffi_type_array, make_ffi_type_enum, ffi_call_desc_checked, ffi_call_desc_print_op, ffi_make_call_desc, ffi_make_closure, carray_struct_checked, carray_print_op, make_carray, cptr_getobj, cptr_out, uni_struct_checked, make_union_common): Pass class handles instead of class symbols. (ffi_init): Initialize class handle variables. * filter.c (regex_from_trie): Use hash_cls class handle instead of hash_s. * gc.c (mark_obj): Split COBJ and CPTR cases since the representation is different. * hash.c (hash_cls, hash_iter_cls): New class handles. 
(make_similar_hash, copy_hash, gethash_c, gethash_e, remhash, clearhash, hash_count, get_hash_userdata, set_hash_userdata, hashp, hash_iter_init, hash_begin, hash_next, hash_peek, hash_reset, hash_reset, hash_uni, hash_diff, hash_symdiff, hash_isec): Pass class handles instead of class symbols. (hash_early_init): New function. (hash_init): Set the class symbols in the class handles that were created in hash_early_init at a time when these symbols did not exist. * lib.c (nelem): New macro. (cobj_class): New static array. (cobj_ptr): New static pointer. (cobj_hash): New static hash. (seq_iter_cls): New static class handle. (builtin_type_p): New function. (typeof): Struct instances now all carry the same symbol, struct, as their COBJ class symbol. To get their type, we must call struct_type_name. (subtypep): Rearrangement of two cases: let's make the reflexive case first. Adjust code for different location of COBJ class symbol. (seq_iter_init_with_info, seq_begin, seq_next, seq_reset, iter_begin, iter_more, iter_item, iter_step, iter_reset, make_like, list_collect, do_generic_funcall): Use class handles instead of class symbols. (class_check, cobj, cobjclassp, cobj_handle, cobj_ops): Take class handle argument instead of class symbol. (cobj_register, cobj_register_super, cobj_class_exists): New functions. (cobj_populate_hash): New static function. (cobj_print_op): Adjust for different location of class (cptr_print_op, cptr_typed, cptr_type, cptr_handle, cptr_get): cptr functions now refer to obj->cp rather than obj->co. (copy, length, sub, ref, refset, replace, dwim_set, dwim_del, obj_print): Use class handles for various COBJ types rather than class symbols. (obj_init): gc-protect cobj_hash. Initialize seq_iter_cls class symbol and cobj_hash. Populate cobj_hash as the last initialization step. (init): Call hash_early_init immediately after gc_init. diff --git a/lib.c b/lib.c * match.c (do_match_line): Refer to regex_cls class handle instead of regex_s.. 
* parser.c (parser_cls): New global class handle. (parse, parser_get_impl, lisp_parse_impl, txr_parse, parser_errors): Use class handles instead of class symbols. (parse_init): Initialize parser_cls. * rand.c (random_state_cls): New global class handle. (make_state, random_state_p, make_random_state, random_state_get_vec, random_fixnum, random_float, random): Use class handles instead of class symbols. (rand_init): Initialize random_state_cls. * regex.c (regex_cls): New global class handle. (chset_cls): New static class handle. (reg_compile_csets, reg_derivative, regex_compile, regexp, regex_source, regex_print, regex_run, regex_machine_init): Use class handles instead of class symbols. (regex_init): Initialize regex_cls and chset_cls. * socket.c (make_dgram_sock_stream): Use stream_cls class symbol instead of stream_s. * stream.c (stream_cls, stdio_stream_cls): New class handles. (make_null_stream, stdio_get_fd, make_stdio_stream_common, stream_fd, sock_family, sock_type, sock_peer, sock_set_peer, make_dir_stream, make_string_input_stream, make_string_byte_input_stream, make_strlist_input_stream, make_string_output_stream, make_strlist_output_stream, get_list_from_stream, make_catenated_stream, make_delegate_stream, make_delegate_stream, stream_set_prop, stream_get_prop, close_stream, get_error, get_error_str, clear_error, get_line, get_char, get_byte, get_bytes, unget_char, unget_byte, put_buf, fill_buf, fill_buf_adjust, get_line_as_buf, format, put_string, put_char, put_byte, flush_stream, seek_stream, truncate_stream, get_indent_mode, test_set_indent_mode, test_neq_set_indent_mode, set_indent_mode, get_indent, set_indent, inc_indent, width_check, force_break, set_max_length, set_max_depth): Use class handle instead of symbol. (stream_init): Initialize stream_cls and stdio_stream_cls. * struct.c (struct_type_cls, struct_cls): New class handles. (struct_init): Initialize struct_type_cls and struct_cls. 
(struct_handle): Static function moved to avoid forward declaration. (stype_handle): Refer to struct_type_cls class handle instead of struct_type_s symbol. Handle instance objects in addition to types. (make_struct_type): Throw error if a built-in type is being defined as a struct type. Refer to class handle instead of class symbol. (find_struct_type, allocate_struct, make_struct_impl, make_lazy_struct, copy_struct): Refer to class handle instead of class symbol. * strudel.c (make_struct_delegate_stream): Refer to stream_cls class handle instead of stream_s symbol. * sysif.c (dir_cls): New class handle. (poll_wrap): Use typep instead of subtypep, eliminating access to class symbol. (opendir_wrap, closedir_wrap, readdir_wrap): Use class handles instead of class symbols. (sysif_init): Initialize dir_cls. * syslog.c (make_syslog_stream): Refer to stream_cls class handle instead of stream_s symbol. * tree.c (tree_cls, tree_iter_cls): New class handles. (tree_insert_node, tree_lookup_node, tree_delete_node, tree_root, tree_equal_op, tree, copy_search_tree, make_similar_tree, treep, tree_begin, copy_tree_iter, replace_tree_iter, tree_reset, tree_next, tree_peek, tree_clear): Use class handle instead of class symbol. (tree_init): Initialize tree_cls and tree_iter_cls. * unwind.c (sys_cont_cls): New static class handle. (revive_cont, capture_cont): Use class handle instead of class symbol. (uw_late_init): Initialize sys_cont_cls. * vm.c (vm_desc_cls): New global class handle. (vm_closure_cls): New static class handle. (vm_desc_struct, vm_make_desc, vm_closure_struct, vm_make_closure, vm_copy_closure): Use class handle instead of class symbol. (vm_init): Initialize vm_desc_cls and vm_closure_cls.
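The move from class symbols to class handles can be sketched as a small registry. The commit message names a nelem macro, a static cobj_class array, a cobj_ptr pointer and the cobj_register function; the struct layout, the array size, the use of a C string in place of a symbol and the function bodies below are assumptions for illustration only.

```c
#include <stddef.h>

#define nelem(array) (sizeof (array) / sizeof (array)[0])

/* Assumed layout: a name standing in for the class symbol, plus an
   optional superclass link. */
struct cobj_class_sketch {
  const char *cls_sym;
  struct cobj_class_sketch *super;
};

static struct cobj_class_sketch cobj_class[64];          /* size is a guess */
static struct cobj_class_sketch *cobj_ptr = cobj_class;  /* next free slot */

/* Hand out the next free slot as the class handle; NULL if full. */
static struct cobj_class_sketch *cobj_register_sketch(const char *cls_sym)
{
  if (cobj_ptr >= cobj_class + nelem(cobj_class))
    return NULL;
  cobj_ptr->cls_sym = cls_sym;
  cobj_ptr->super = NULL;
  return cobj_ptr++;
}

struct cobj_sketch {
  struct cobj_class_sketch *cls;
  void *handle;
};

/* Type test by handle identity, following superclass links. */
static int cobjclassp_sketch(const struct cobj_sketch *obj,
                             const struct cobj_class_sketch *cls)
{
  const struct cobj_class_sketch *c;

  for (c = obj->cls; c != NULL; c = c->super)
    if (c == cls)
      return 1;
  return 0;
}
```

Since the test compares handle pointers rather than names, an object whose class merely carries the same name as a built-in type does not pass the check for that type.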
* gc: fix astonishing bug in weak hash processing. | Kaz Kylheku | 2021-04-06 | 1 | -5/+8
| This is a flaw that has been in the code since the initial implementation in 2009. Weak hash tables are only partially marked during the initial garbage collection marking phase. They are put into a global list, which is then walked again to do the weak processing: to expire items which are not reachable, and then finish walking the table objects. Problem is, the code assumes that this late processing will not discover more hash tables and put them into that global list. This creates a problem when weak hash tables contain weak hash tables, such as in the important and very common case when a global variable (binding stored in a weak hash table) contains a weak hash table! These hash tables discovered during weak hash table processing are partially marked, and left that way. The result is that their table vectors get prematurely scavenged by the garbage collector, and then fall victim to use-after-free crashing. Note: do_iters doesn't have this bug. Though the reachable_iters list resembles reachable_weak_hashes, the key difference is that do_iters does not do any marking, and so will not discover any more reachable objects. All it does is update some counts in the hashes to which the still-reachable iterators point. * hash.c (do_weak_tables): Clear the reachable_weak_hashes list on entry into the function, taking a local copy of its head. After walking the list, check the global variable again; if it has become non-null, it means more weak tables were discovered and added to the list. In that case, make a recursive call (susceptible to tail call treatment) to process the list again.
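The control flow of the fix can be sketched with a toy list of tables. The global name reachable_weak_hashes comes from the commit message; the struct, its fields and both functions are hypothetical, and a loop replaces the recursive tail call that the message describes.

```c
#include <stddef.h>

/* Toy weak table: may contain one inner weak table. */
struct table_sketch {
  struct table_sketch *next;   /* link in the global list */
  struct table_sketch *inner;  /* weak table found while finishing marking */
  int queued;
  int processed;
};

static struct table_sketch *reachable_weak_hashes;

/* Put a table on the global list, at most once. */
static void discover_sketch(struct table_sketch *t)
{
  if (t->queued)
    return;
  t->queued = 1;
  t->next = reachable_weak_hashes;
  reachable_weak_hashes = t;
}

/* Take a local copy of the list head, clear the global, walk the
   copy, and repeat for as long as the walk discovered more tables. */
static void do_weak_tables_sketch(void)
{
  while (reachable_weak_hashes != NULL) {
    struct table_sketch *list = reachable_weak_hashes;
    reachable_weak_hashes = NULL;

    while (list != NULL) {
      struct table_sketch *next = list->next;
      list->processed = 1;
      if (list->inner != NULL)
        discover_sketch(list->inner);  /* may repopulate the global */
      list = next;
    }
  }
}
```

Without the outer re-check, the inner table would be left on the list half-processed, which is the situation the commit message describes.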
* hashing: bug: hash-equal zero: floats and bignums.Kaz Kylheku2021-03-051-2/+2
| | | | | | hash.c (equal_hash): Do not multiply by the seed if it is zero; substitute 1. Otherwise under the default seed of zero, the hash becomes zero.
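The effect of a zero seed on a multiplicative mix is easy to show. The sketch below assumes the mix is a plain multiplication, as the message suggests; mix_with_seed is a hypothetical helper, not a function in hash.c.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply a raw hash by the seed, substituting 1 for a zero seed so
   that the default seed does not collapse every hash to zero. */
static uintptr_t mix_with_seed(uintptr_t hash, uintptr_t seed)
{
  uintptr_t factor = (seed != 0) ? seed : 1;
  return hash * factor;
}

/* Self-check: a zero seed leaves the hash unchanged. */
static void check_seed_mix(void)
{
  assert(mix_with_seed(12345, 0) == 12345);
  assert(mix_with_seed(7, 3) == 21);
}
```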
* hash: hash-revget now defaults to equal.Kaz Kylheku2021-01-221-2/+6
| | | | | | | | | * hash.c (hash_revget): Default to equal, except in compatibility mode. (hash_keys_of): Also default to equal. This function is too new to bother with compatibility switching. * txr.1: Documented, with compat notes.
* New function: hash-keys-of.Kaz Kylheku2021-01-201-0/+21
| | | | | | | | | * hash.c (hash_keys_of): New function. (hash_init): Register hash-keys-of intrinsic. * hash.h (hash_keys_of): Declared. * txr.1: Documented.
* Copyright year bump 2021.Kaz Kylheku2021-01-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * METALICENSE: 2020 copyrights bumped to 2021. Added note about SHA-256 routines from Colin Percival. * LICENSE, LICENSE-CYG, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/asm.tl, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/compiler.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/copy-file.tl, share/txr/stdlib/debugger.tl, share/txr/stdlib/defset.tl, share/txr/stdlib/doloop.tl, share/txr/stdlib/each-prod.tl, share/txr/stdlib/error.tl, share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl, share/txr/stdlib/package.tl, share/txr/stdlib/param.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/pmac.tl, share/txr/stdlib/quips.tl, share/txr/stdlib/save-exe.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/trace.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/vm-param.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, 
time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2021.
* time: move time functions out of lib.c into time.c.Kaz Kylheku2020-10-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Makefile (OBJS): Add new time.o. * eval.c (eval_init): Registration of time functions is removed from here; it is done in time_init now, in time.c. * hash.c: Must #include "time.h" now. * lib.c (time_s, time_local_s, time_utc_s, time_string_s, time_parse_s, year_s, month_s, day_s, hour_s, min_s, sec_s, dst_s, gmtoff_s, zone_s): Variable definitions removed. These are now in time.c. Also declared in time.h. (time_sec, time_sec_usec, gmtime_r, localtime_r, string_time, time_string_local, time_string_utc, broken_time_list, tm_to_time_struct, broken_time_struct, time_fields_local, time_fields_utc, time_struct_local, time_struct_utc, time_fields_to_tm, time_struct_to_tm, make_time_impl, make_time, epoch_tm, strptime_wrap, time_parse, setenv, unsetenv, timegm_hack, make_time_utc, time_meth, time_string_meth, time_parse_meth, time_parse_local, time_parse_utc): Functions removed. These are now in time.c. (time_init): Removed, and now in time.c as an external function. * lib.h (time_sec, time_sec_usec, time_string_local, time_string_utc, time_fields_local, time_fields_utc, time_struct_local, time_struct_utc, make_time, make_time_utc, time_parse, time_parse_local, time_parse_utc): Declarations removed. Now in time.h. * rand.c: Must #include "time.h" now. * time.c: New file. * time.h: New file.
* c_num: now takes self argument.Kaz Kylheku2020-06-291-17/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The c_num and c_unum functions now take a self argument for identifying the calling function. This requires changes in a large number of places. In a few places, additional functions acquire a self argument. The ffi module has the most extensive example of this. Some functions mention their name in a larger string, or have scattered literals giving their name; with the introduction of the self local variable, these are replaced by references to self. In the following changelog, the notation TS stands for "take self argument", meaning that the functions acquires a new "val self" argument. The notation DS means "define self": the functions in question defines a self variable, which they pass down. The notation PS means that the functions pass down an existing self variable to functions that now require it. * args.h (args_count): TS. * arith.c (c_unum, c_num): TS. (toint, exptv): DS. * buf.c (buf_check_len, buf_check_alloc_size, buf_check_index, buf_do_set_len, replace_buf, buf_put_buf, buf_put_i8, buf_put_u8, buf_put_char, buf_put_uchar, buf_get_bytes, buf_get_i8, buf_get_u8, buf_get_cptr, buf_strm_get_byte_callback, buf_strm_unget_byte, buf_swap32, str_buf, buf_int, buf_uint, int_buf, uint_buf): PS. (make_duplicate_buf, buf_shrink, sub_buf, buf_print, buf_pprint): DS. * chskum.c (sha256_stream_impl, sha256_buf, crc32_buf, md5_stream_impl, md5_buf): TS. (chksum_ensure_buf, sha256_stream, sha256, sha256_hash, md5_stream, md5, md5_hash): PS. (crc32_stream): DS. * combi.c (perm_while_fun, perm_gen_fun_common, perm_str_gen_fun, rperm_gen_fun, comb_vec_gen_fun, comb_str_gen_fun, rcomb_vec_gen_fun, rcomb_str_gen_fun): DS. 
* diff.c (dbg_clear, dbg_set, dbg_restore): DS. * eval.c (do_eval, gather_free_refs, maprodv, maprendv, maprodo, do_args_apf, do_args_ipf): DS. (op_dwim, me_op, map_common): PS. (prod_common): TS. * ffi.c (struct txr_ffi_type): release member TS. (make_ffi_type_pointer): PS and release argument TS. (ffi_varray_dynsize, ffi_array_in, ffi_array_put_common, ffi_array_get_common, ffi_varray_in, ffi_varray_null_term): PS. (ffi_simple_release, ffi_ptr_in_release, ffi_struct_release, ffi_wchar_array_get, ffi_array_release_common, ffi_array_release, ffi_varray_release): TS. (ffi_float_put, double_put, ffi_be_i16_put, ffi_be_u16_put, ffi_le_i16_put, ffi_le_u16_put, ffi_be_i32_put, ffi_be_u32_put, ffi_le_i32_put, ffi_sbit_put, ffi_ubit_put, ffi_buf_d_put, make_ffi_type_array, make_ffi_type_enum, ffi_type_compile, make_ffi_type_desc, ffi_make_call_desc, ffi_call_wrap, ffi_closure_dispatch_save, ffi_put_into, ffi_in, ffi_get, ffi_put, carray_set_length, carray_blank, carray_buf, carray_buf_sync, carray_cptr, carray_refset, carray_sub, carray_replace, carray_uint, carray_int): PS. (carray_vec, carray_list): DS. * filter.c (url_encode, url_decode, base64_stream_enc_impl): DS. * ftw.c (ftw_callback, ftw_wrap): DS. * gc.c (mark_obj, gc_set_delta): DS. * glob.c (glob_wrap): DS. * hash.c (equal_hash, eql_hash, eq_hash, do_make_hash, hash_equal, set_hash_traversal_limit, gen_hash_seed): DS. * itypes.c (c_i8, c_u8, c_i16, c_u16, c_i32, c_u32, c_i64, c_u64, c_short, c_ushort, c_int, c_uint, c_long, c_ulong): PS. * lib.c (seq_iter_rewind): TS and becomes internal. (seq_iter_init_with_info, seq_setpos, replace_str, less, replace_vec, diff, isec, obj_print_impl): PS. 
(nthcdr, equal, mkstring, mkustring, upcase_str, downcase_str, search_str, sub_str, cat_str, scat2, scat3, fmt_join, split_str_keep, split_str_set, trim_str, int_str, chr_int, chr_str, chr_str_set, vector, vecref, vecref_l, list_vec, copy_vec, sub_vec, cat_vec, lazy_str_put, lazy_str_gt, length_str_ge, length_str_lt, length_str_le, cptr_size_hint, cptr_int, out_lazy_str, out_quasi_str, time_string_local_time, time_string_utc, time_fields_local_time, time_fields_utc, time_struct_local, time_struct_utc, make_time, time_meth, time_parse_meth): DS. (init_str, cat_str_init, cat_str_measure, cat_str_append, vscat, time_fields_to_tm, time_struct_to_tm, make_time_impl): TS. * lib.h (seq_iter_rewind): Declaration removed. (c_num, c_unum, init_str): Declarations updated. * match.c (LOG_MISMATCH, LOG_MATCH): PS. (h_skip, h_coll, do_output_line, do_output, v_skip, v_fuzz, v_collect): DS. * parser.c (parser, circ_backpatch, report_security_problem, hist_save, repl, lino_fileno, lino_getch, lineno_getl, lineno_gets, lineno_open): DS. (parser_set_lineno, lisp_parse_impl): PS. * parser.l (YY_INPUT): PS. * rand.c (make_random_state): PS. * regex.c (print_rec): DS. (search_regex): PS. * signal.c (kill_wrap, raise_wrap, get_sig_handler, getitimer_wrap, setitimer_wrap): DS. * socket.c (addrinfo_in, sockaddr_pack, fd_timeout, to_connect, open_sockfd, sock_mark_connected, sock_timeout): TS. (getaddrinfo_wrap, dgram_set_sock_peer, sock_bind, sock_connect, sock_listen, sock_accept, sock_shutdown, sock_send_timeout, sock_recv_timeout, socketpair_wrap): DS. * stream.c (generic_fill_buf, errno_to_string, stdio_truncate, string_out_put_string, open_fileno, open_command, base_name, dir-name): DS. (unget_byte, put_buf, fill_buf, fill_buf_adjust, get_line_as_buf, formatv, put_byte, test_set_indent_mode, test_neq_set_indent_mode, set_indent_mode, set_indent, inc_indent, set_max_length, set_max_depth, open_subprocess, run ): PS. (fds_subst, fds_swizzle): TS. 
* struct.c (make_struct_type, super, umethod_args_fun): PS. (method_args_fun): DS. * strudel.c (strudel_put_buf, strudel_fill_buf): DS. * sysif.c (errno_wrap, exit_wrap, usleep_wrap, mkdir_wrap, ensure_dir, makedev_wrap, minor_wrap, major_wrap, mknod_wrap, mkfifo_wrap, wait_wrap, wifexited, wexitstatus, wifsignaled, wtermsig, wcoredump, wifstopped, wstopsig, wifcontinued, dup_wrap, close_wrap, exit_star_wrap, umask_wrap, setuid_wrap, seteuid_wrap, setgid_wrap, setegid_wrap, simulate_setuid_setgid, getpwuid_wrap, fnmatch_wrap, dlopen_wrap): DS. (chmod_wrap, do_chown, flock_pack, do_utimes, poll_wrap, setgroups_wrap, setresuid_wrap, setresgid_wrap, getgrgid_wrap): PS. (c_time): TS. * sysif.h (c_time): Declaration updated. * syslog.c (openlog_wrap, syslog_wrap): DS. * termios.c (termios_pack): TS. (tcgetattr_wrap, tcsetattr_wrap, tcsendbreak_wrap, tcdrain_wrap, tcflush_wrap, tcflow_rap, encode_speeds, decode_speeds): DS. * txr.c (compato, array_dim, gc_delta): DS. * unwind.c (uw_find_frames_by_mask): DS. * vm.c (vm_make_desc): PS. (vm_make_closure, vm_swtch): DS.
* Remove unnecessary #include directives.Kaz Kylheku2020-04-221-1/+0
| | | | | | | | | | Time for some spring cleaning. * args.c, arith.c, buf.c, cadr.c, chksum.c, debug.c, ftw.c, gc.c, gencadr.txr, glob.c, hash.c, lisplib.c, match.c, parser.c, parser.l, parser.y, rand.c, signal.c, stream.c, strudel.c, syslog.c, tree.c, unwind.c, utf8.c, vm.c: Numerous unnecessary #include directives removed.
* hash: bugfix: spurious retention in weak processing.Kaz Kylheku2020-04-111-32/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The algorithm for weak processing is not correct. In hash_mark, we must simply not mark any of the entries, keys or values, of a weak table regardless of what type of weak table it is. If we do that, we cause spurious retention in situations where the keys and values have some kind of link together other than through the table. For instance, suppose keys are weak, but values happen to have references to keys. If we mark the values, we mark the keys and nothing will expire from the table. Such a situation happens in stream_parser_hash, which associates streams with parsers, and has weak keys. Parsers have references to streams. So entries in the hash never expire. Any stream that gets a parser is retained forever. The weak hashes used for binding in eval.c (top_vb, ...) are also affected, because the key is some symbol <sym> and the value is (<sym> . <val>). The key is weak, but the value references the sym. So these hashes also will not expire the keys: unreachable variable bindings will stick around. * hash.c (hash_mark): If a hash table has weak keys, values, or both, then only mark its vector if the count is zero. If it has one or more entries, we just add it to the reachable_weak_hashes list to be processed in do_weak_tables.
* warning cleanup: suspicious switch fallthrough cases.Kaz Kylheku2020-04-051-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the seventh round of an effort to enable GCC's -Wextra option. Warnings about switch fallthrough situations are addressed. GCC now has a diagnostic for this that is enabled by -Wextra in such a way that if a fallthrough comment is present, the diagnostic is suppressed. In much of the code, we have such a comment. It's missing in a few places, or misplaced. There are also some real bugs. * hash.c (hash_buf): Add fallthrough comments to intentional fallthrough cases. (hash_hash_op): bugfix: add break statement. The 32 and 64 bit cases are independent (at compile time). * lib.c (cdr, nullify, list_collect, empty): Add fallthrough comment. (int_str): Add missing break. This has not caused a bug though because setting the octzero flag in the zerox case is harmless to the logic which follows. * linenoise.c (edit): Move misplaced fallthrough. * sysif.c (fcntl_wrap): Bugfix: add missing break, without which errno is tampered to hold EINVAL, in spite of a successful F_SETLK, F_SETLKW or F_GETLK operation. * unwind.h (jmp_restore): Declare noreturn, so that GCC does not issue a false positive warning about a fallthrough in uw_unwind_to_exit_point. * utf8.c (utf8_from_buf, utf8_decode): Move a fallthrough comment outside of preprocessing, so it is properly processed by GCC's diagnostic.
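As an illustration of the intentional-fallthrough style this commit annotates, here is a minimal sketch of a switch that folds the trailing bytes of a buffer into a word. The function fold_tail_bytes is hypothetical and only resembles the kind of code found in a byte-hashing routine; it is not taken from hash_buf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold up to three trailing bytes into a 32-bit word, little-endian.
   Each case deliberately continues into the next one; the comments
   are what GCC's -Wimplicit-fallthrough diagnostic looks for. */
static uint32_t fold_tail_bytes(const unsigned char *p, size_t n)
{
  uint32_t word = 0;

  switch (n) {
  case 3:
    word |= (uint32_t) p[2] << 16;
    /* fallthrough */
  case 2:
    word |= (uint32_t) p[1] << 8;
    /* fallthrough */
  case 1:
    word |= p[0];
    break;   /* omitting this break would be the unintended kind of fallthrough */
  default:
    break;
  }

  return word;
}

/* Self-check with a three-byte buffer. */
static void check_fold_tail_bytes(void)
{
  const unsigned char bytes[3] = { 0x01, 0x02, 0x03 };
  assert(fold_tail_bytes(bytes, 3) == 0x030201u);
  assert(fold_tail_bytes(bytes, 2) == 0x0201u);
}
```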
* New type args with DARG type code.Kaz Kylheku2020-03-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | An object of args type captures into the heap the "struct args" argument list that normally appears only on the stack. Such an object also has space for a car and cdr field, which can come in handy. * args.c (dyn_args): New function: hoist a struct args * into an args heap object. * args.h (dyn_args): Declared. * gc.c (finalize, mark_obj): Handle DARGS type code. * hash.c (equal_hash): Handle DARG via eq equivalence. * lib.c (args_s): New symbol variable. (code2type): Map DARG to args symbol. (equal): Handle DARG type, using eq equivalence for now. (obj_init): Initialize args_s with interned symbol. * lib.h (enum type, type_t): New type code, DARG. (struct dyn_args): New struct. (union obj): New member, a of type struct dyn_args. * txr.1: Documented args type under typeof.
* hash-uni: two new arguments for projecting values.Kaz Kylheku2020-03-191-8/+16
| | | | | | | | | | | * hash.c (hash_uni): New functional argument map1fun and map2fun. If present, values from hash1 and hash2, respectively, are projected through these functions. (hash_init): hash-uni registration updated. * hash.h (hash_uni): Declaration updated. * txr.1: Documented new arguments.
* hash: bugfix: maintain counts in weak processing.Kaz Kylheku2020-03-091-4/+10
| | | | | | * hash.c (do_weak_tables): Update the count field of each weak hash to account for the entries that get removed by expiry. Also, loop variable moves into a tighter scope.
* hash: bug: not hashing key of tree node.Kaz Kylheku2020-01-121-1/+1
| | | | | | | * hash.c (equal_hash): Spurious semicolon in TNOD case causing part of expression that includes the key to be cut off. This was not diagnosed by the C compiler of GCC 4.x or 7.4.0. The GCC 7.4.0 C++ front end caught this bug.
* Copyright year bump 2020.Kaz Kylheku2019-12-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/asm.tl, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/compiler.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/debugger.tl, share/txr/stdlib/defset.tl, share/txr/stdlib/doloop.tl, share/txr/stdlib/error.tl, share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl, share/txr/stdlib/package.tl, share/txr/stdlib/param.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/pmac.tl, share/txr/stdlib/save-exe.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/trace.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/vm-param.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr: Extended copyright notices to 2020.
* hash: bugfix: bad memset size in hash-reset.Kaz Kylheku2019-11-181-1/+1
| | | | | * hash.c (hash_reset): Clear the whole structure, not just a pointer-sized region at its base.
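The class of bug fixed here is the familiar sizeof-of-a-pointer mistake. The sketch below uses a hypothetical struct iter_state and assumes the original call had the form memset(ptr, 0, sizeof ptr); the actual structure and expression in hash_reset are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state structure, larger than a pointer. */
struct iter_state {
  unsigned long chain;
  unsigned long index;
  unsigned long remaining;
  unsigned long flags;
};

/* Clears the whole structure. Writing "sizeof st" instead of
   "sizeof *st" would clear only a pointer-sized prefix. */
static void iter_state_reset(struct iter_state *st)
{
  assert(st != NULL);
  memset(st, 0, sizeof *st);
}

/* Returns 1 if every member is zero. */
static int iter_state_is_clear(const struct iter_state *st)
{
  return st->chain == 0 && st->index == 0 &&
         st->remaining == 0 && st->flags == 0;
}
```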
* hash: new hash-reset function.Kaz Kylheku2019-11-021-0/+20
| | | | | | | | | * hash.c (hash_reset): New function. (hash_init): hash-reset intrinsic registered. * hash.h (hash_reset): Declared. * txr.1: Documented.
* lib: use stack-allocated hash iterators everywhere.Kaz Kylheku2019-11-011-9/+14
| | | | | | | | | | | | | | | | | * eval.c (op_dohash): Use hash_iter instead of consing up heap-allocated hash iterator. * filter.c (trie_compress, regex_from_trie): Likewise. * hash.c (hash_equal_op, hash_hash_op, hash_print_op): Likewise. * lib.c (package_local_symbols, package_foreign_symbols, find_max, find_if, rfind_if, populate_obj_hash): Likewise. * parser.c (circ_backpatch, get_visible_syms): Likewise. * struct.c (method_name, get_slot_syms): Likewise.
* hash: expose new iterator interface.Kaz Kylheku2019-11-011-11/+4
| | | | | | | | | | * hash.c (struct hash): Declaration removed from here. (hash_iter_init, us_hash_iter_init, hash_iter_next, hash_iter_peek): Functions switched to external linkage. * hash.h (struct hash): Declared here now. (hash_iter_init, us_hash_iter_init, hash_iter_next, hash_iter_peek): Declared.
* hash: improve new hash_iter interface.Kaz Kylheku2019-11-011-16/+22
| | | | | | | | | | | | | | | | | | | | | Most calls to hash_iter_next are passing a null parameter for the object; only hash_next uses that parameter. Let's make hash_iter_next a wrapper which doesn't have that parameter. This interface will soon be exposed to other source files, so it's important to streamline it. * hash.c (hash_iter_next_impl): New function, exact copy of hash_iter_next. (hash_iter_next): Reduced to wrapper for hash_iter_next_impl, with one less argument. (hash_next): Call hash_iter_next_impl instead of hash_iter_next. (maphash, group_by, group_reduce, hash_uni, hash_diff, hash_symdiff, hash_isec, hash_subset, hash_update, hash_revget, hash_invert): Remove null argument from hash_iter_next calls.
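The wrapper arrangement can be sketched with a toy iterator. All names below (struct int_iter, int_iter_next_impl, int_iter_next, sum_all) are hypothetical; the real hash_iter_next_impl and hash_iter_next operate on hash tables and have different signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Toy iterator over an int array, intended to live on the stack. */
struct int_iter {
  const int *items;
  size_t count;
  size_t pos;
};

/* Full version: takes an extra argument that only one caller needs. */
static const int *int_iter_next_impl(struct int_iter *it, void *owner)
{
  assert(it != NULL);
  (void) owner;   /* unused in this sketch */
  if (it->pos >= it->count)
    return NULL;
  return &it->items[it->pos++];
}

/* Streamlined wrapper for the common case: no extra argument. */
static const int *int_iter_next(struct int_iter *it)
{
  return int_iter_next_impl(it, NULL);
}

/* Typical caller: the iterator is a local variable, not a heap object. */
static int sum_all(const int *items, size_t count)
{
  struct int_iter it = { items, count, 0 };
  const int *p;
  int sum = 0;

  while ((p = int_iter_next(&it)) != NULL)
    sum += *p;
  return sum;
}
```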
* lib: don't assume time_t is signed.Kaz Kylheku2019-10-311-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | We introduce the function c_time to convert a Lisp integer to time_t, and num_time to do the reverse conversion. The FFI type time-t already does this right. (See registration of time-t in ffi_init_extra_types). * hash.c (gen_hash_seed): The first value out of time_sec_usec corresponds to a time_t value. We now convert this to a C number using c_time rather than c_num. Also, while we are touching this code, the microseconds value can convert directly to ucnum with c_unum. * lib.c (time_sec_usec): Use num_time for seconds. (time_string_local, time_string_utc, time_fields_local, time_fields_utc, time_struct_local, time_struct_utc): Use c_time. (make_time_impl, time_parse_utc): Use num_time instead of num. * signal.c (getitimer_wrap, setitimer_wrap): Convert tv_sec members of struct timeval using c_time and num_time. * sysif.c (c_time, num_time): New functions. (stat_to_struct): Convert st_atime, st_mtime and st_ctime to Lisp using num_time instead of num. * sysif.h (c_time, num_time): Declared.
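A conversion that does not assume a signed time_t can be sketched as follows. This is not the c_time function from sysif.c; to_time_t and time_t_is_signed are hypothetical helpers, and the sketch assumes time_t is an integer type, as POSIX requires.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Returns 1 if time_t can represent negative values. */
static int time_t_is_signed(void)
{
  return (time_t) -1 < (time_t) 0;
}

/* Converts value to time_t, storing it in *out and returning 1 on
   success. Returns 0 if the value does not survive the round trip or
   if its sign is lost, which is what happens to negative inputs when
   time_t is unsigned. */
static int to_time_t(long long value, time_t *out)
{
  time_t t = (time_t) value;

  assert(out != NULL);
  if ((long long) t != value || (value < 0) != (t < (time_t) 0))
    return 0;
  *out = t;
  return 1;
}
```

On a platform with a signed time_t, to_time_t(-1, &t) succeeds; with an unsigned time_t it is rejected rather than silently turned into a huge positive time.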
* hash: stack-allocated iterators.Kaz Kylheku2019-10-291-62/+105
| | | | | | | | | | | | * hash.c (hash_iter_init, us_hash_iter_init, hash_iter_next, hash_iter_peek): New static functions, made from hash_begin, hash_next and hash_peek internals. (hash_begin, hash_next, hash_peek): Turned into wrappers for hash_iter_init, hash_iter_next, hash_iter_peek. (maphash, group_by, group_reduce, hash_uni, hash_diff, hash_symdiff, hash_isec, hash_subset, hash_update, hash_revget, hash_invert): Use stack-allocated struct hash_iter instead of heap allocated object from hash_begin.
* naming: get the -func out, at least some of it.Kaz Kylheku2019-10-291-6/+6
| | | | | | | | | | | | | | | | | | | | The code base contains a lot of irksome _func which should be _fun, and also the public functions func-get-form and func-get-name are irksomely named. As a first step, we can fix parameters which carry this suffix. * glob.c (glob_wrap): errfunc argument renamed to errfun. * glob.h (glob_wrap): Likewise. * hash.c (hash_uni, hash_isec): join_func argument renamed to joinfun. * hash.h (hash_uni, hash_isec): Likewise. * txr.1: fixed gen-func typo. Arguments renamed in descriptions of hash-uni, hash-isec, iff, iffi, glob, and ftw.
* New function: hash-invert.Kaz Kylheku2019-10-281-0/+31
| | | | | | | | | * hash.c (hash_invert): New function. (hash_init): hash-invert intrinsic registered. * hash.h (hash_invert): Declared. * txr.1: Documented.
* hashing: partially revert 63feff9c.Kaz Kylheku2019-10-251-4/+4
| | | | | | | | | Mixing the hash seed with the hashes for characters, fixnums and pointers by multiplication doesn't make sense. It doesn't perturb the hash sufficiently. * hash.c (equal_hash): Do not multiply the hash by the seed for CHR, NUM, SYM, PKG and ENV.
* hash: observe count in eql-based hash.Kaz Kylheku2019-10-211-0/+3
| | | | * hash.c (eql_hash): Decrement count and bail if zero.
* hash: rename hash_rec_limit.Kaz Kylheku2019-10-181-10/+10
| | | | | | | | | | | | | | hash_rec_limit isn't a limit on recursion depth but on the elements traversed. * hash.c (hash_rec_limit): Variable renamed to hash_traversal_limit. (gethash_c, gethash_e, remhash, hash_equal): Use new name. (set_hash_rec_limit): Function renamed to set_hash_traversal_limit. (hash_init): set-hash-rec-limit intrinsic renamed to set-hash-traversal-limit. This function is undocumented, so no backward compatibility is provided.
* hash: get rid of hash_str_limit.Kaz Kylheku2019-10-181-18/+13
| | | | | | | | | | | | | | hash.c (hash_str_limit): Variable removed. (hash_c_str): Take count parameter. Observe the limit and update it. The count is scaled by 4 for strings: four characters for one count. (hash_buf): Likewise. (equal_hash): Pass count to hash_c_str and hash_buf. Use the count to determine how far to force a lazy string for hashing. (set_hash_str_limit): Static function removed. (hash_init): Removed sys:set-hash-str-limit intrinsic. This was used in genman.txr once, but that use was removed a year and a half ago.
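The "four characters for one count" scaling can be sketched like this. The function hash_str_limited, its multiplier of 31, and the rounding of the consumed count are assumptions made for illustration; the actual hash_c_str uses its own mixing and bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

/* Hash at most (*count * 4) characters of s, then charge the
   traversal budget one unit per four characters consumed,
   rounding up. */
static unsigned long hash_str_limited(const char *s, int *count)
{
  unsigned long h = 0;
  int budget, used = 0;

  assert(s != NULL && count != NULL && *count >= 0);
  budget = *count * 4;

  while (*s != 0 && used < budget) {
    h = h * 31 + (unsigned char) *s++;
    used++;
  }

  *count -= (used + 3) / 4;
  return h;
}
```

With a budget of 2, only the first eight characters of a longer string contribute to the hash, and the budget drops to zero so that later elements of the same object are not traversed.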