summaryrefslogtreecommitdiffstats
path: root/hash.c
Commit message (Collapse)AuthorAgeFilesLines
* Copyright year bump 2021.Kaz Kylheku2021-01-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * METALICENSE: 2020 copyrights bumped to 2021. Added note about SHA-256 routines from Colin Percival. * LICENSE, LICENSE-CYG, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/asm.tl, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/compiler.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/copy-file.tl, share/txr/stdlib/debugger.tl, share/txr/stdlib/defset.tl, share/txr/stdlib/doloop.tl, share/txr/stdlib/each-prod.tl, share/txr/stdlib/error.tl, share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl, share/txr/stdlib/package.tl, share/txr/stdlib/param.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/pmac.tl, share/txr/stdlib/quips.tl, share/txr/stdlib/save-exe.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/trace.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/vm-param.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2021.
* time: move time functions out of lib.c into time.c.Kaz Kylheku2020-10-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Makefile (OBJS): Add new time.o. * eval.c (eval_init): Registration of time functions is removed from here; it is done in time_init now, in time.c. * hash.c: Must #include "time.h" now. * lib.c (time_s, time_local_s, time_utc_s, time_string_s, time_parse_s, year_s, month_s, day_s, hour_s, min_s, sec_s, dst_s, gmtoff_s, zone_s): Variable definitions removed. These are now in time.c. Also declared in time.h. (time_sec, time_sec_usec, gmtime_r, localtime_r, string_time, time_string_local, time_string_utc, broken_time_list, tm_to_time_struct, broken_time_struct, time_fields_local, time_fields_utc, time_struct_local, time_struct_utc, time_fields_to_tm, time_struct_to_tm, make_time_impl, make_time, epoch_tm, strptime_wrap, time_parse, setenv, unsetenv, timegm_hack, make_time_utc, time_meth, time_string_meth, time_parse_meth, time_parse_local, time_parse_utc): Functions removed. These are now in time.c. (time_init): Removed, and now in time.c as an external function. * lib.h (time_sec, time_sec_usec, time_string_local, time_string_utc, time_fields_local, time_fields_utc, time_struct_local, time_struct_utc, make_time, make_time_utc, time_parse, time_parse_local, time_parse_utc): Declarations removed. Now in time.h. * rand.c: Must #include "time.h" now. * time.c: New file. * time.h: New file.
* c_num: now takes self argument.Kaz Kylheku2020-06-291-17/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The c_num and c_unum functions now take a self argument for identifying the calling function. This requires changes in a large number of places. In a few places, additional functions acquire a self argument. The ffi module has the most extensive example of this. Some functions mention their name in a larger string, or have scattered literals giving their name; with the introduction of the self local variable, these are replaced by references to self. In the following changelog, the notation TS stands for "take self argument", meaning that the functions acquires a new "val self" argument. The notation DS means "define self": the functions in question defines a self variable, which they pass down. The notation PS means that the functions pass down an existing self variable to functions that now require it. * args.h (args_count): TS. * arith.c (c_unum, c_num): TS. (toint, exptv): DS. * buf.c (buf_check_len, buf_check_alloc_size, buf_check_index, buf_do_set_len, replace_buf, buf_put_buf, buf_put_i8, buf_put_u8, buf_put_char, buf_put_uchar, buf_get_bytes, buf_get_i8, buf_get_u8, buf_get_cptr, buf_strm_get_byte_callback, buf_strm_unget_byte, buf_swap32, str_buf, buf_int, buf_uint, int_buf, uint_buf): PS. (make_duplicate_buf, buf_shrink, sub_buf, buf_print, buf_pprint): DS. * chskum.c (sha256_stream_impl, sha256_buf, crc32_buf, md5_stream_impl, md5_buf): TS. (chksum_ensure_buf, sha256_stream, sha256, sha256_hash, md5_stream, md5, md5_hash): PS. (crc32_stream): DS. * combi.c (perm_while_fun, perm_gen_fun_common, perm_str_gen_fun, rperm_gen_fun, comb_vec_gen_fun, comb_str_gen_fun, rcomb_vec_gen_fun, rcomb_str_gen_fun): DS. * diff.c (dbg_clear, dbg_set, dbg_restore): DS. * eval.c (do_eval, gather_free_refs, maprodv, maprendv, maprodo, do_args_apf, do_args_ipf): DS. (op_dwim, me_op, map_common): PS. (prod_common): TS. * ffi.c (struct txr_ffi_type): release member TS. (make_ffi_type_pointer): PS and release argument TS. (ffi_varray_dynsize, ffi_array_in, ffi_array_put_common, ffi_array_get_common, ffi_varray_in, ffi_varray_null_term): PS. (ffi_simple_release, ffi_ptr_in_release, ffi_struct_release, ffi_wchar_array_get, ffi_array_release_common, ffi_array_release, ffi_varray_release): TS. (ffi_float_put, double_put, ffi_be_i16_put, ffi_be_u16_put, ffi_le_i16_put, ffi_le_u16_put, ffi_be_i32_put, ffi_be_u32_put, ffi_le_i32_put, ffi_sbit_put, ffi_ubit_put, ffi_buf_d_put, make_ffi_type_array, make_ffi_type_enum, ffi_type_compile, make_ffi_type_desc, ffi_make_call_desc, ffi_call_wrap, ffi_closure_dispatch_save, ffi_put_into, ffi_in, ffi_get, ffi_put, carray_set_length, carray_blank, carray_buf, carray_buf_sync, carray_cptr, carray_refset, carray_sub, carray_replace, carray_uint, carray_int): PS. (carray_vec, carray_list): DS. * filter.c (url_encode, url_decode, base64_stream_enc_impl): DS. * ftw.c (ftw_callback, ftw_wrap): DS. * gc.c (mark_obj, gc_set_delta): DS. * glob.c (glob_wrap): DS. * hash.c (equal_hash, eql_hash, eq_hash, do_make_hash, hash_equal, set_hash_traversal_limit, gen_hash_seed): DS. * itypes.c (c_i8, c_u8, c_i16, c_u16, c_i32, c_u32, c_i64, c_u64, c_short, c_ushort, c_int, c_uint, c_long, c_ulong): PS. * lib.c (seq_iter_rewind): TS and becomes internal. (seq_iter_init_with_info, seq_setpos, replace_str, less, replace_vec, diff, isec, obj_print_impl): PS. (nthcdr, equal, mkstring, mkustring, upcase_str, downcase_str, search_str, sub_str, cat_str, scat2, scat3, fmt_join, split_str_keep, split_str_set, trim_str, int_str, chr_int, chr_str, chr_str_set, vector, vecref, vecref_l, list_vec, copy_vec, sub_vec, cat_vec, lazy_str_put, lazy_str_gt, length_str_ge, length_str_lt, length_str_le, cptr_size_hint, cptr_int, out_lazy_str, out_quasi_str, time_string_local_time, time_string_utc, time_fields_local_time, time_fields_utc, time_struct_local, time_struct_utc, make_time, time_meth, time_parse_meth): DS. (init_str, cat_str_init, cat_str_measure, cat_str_append, vscat, time_fields_to_tm, time_struct_to_tm, make_time_impl): TS. * lib.h (seq_iter_rewind): Declaration removed. (c_num, c_unum, init_str): Declarations updated. * match.c (LOG_MISMATCH, LOG_MATCH): PS. (h_skip, h_coll, do_output_line, do_output, v_skip, v_fuzz, v_collect): DS. * parser.c (parser, circ_backpatch, report_security_problem, hist_save, repl, lino_fileno, lino_getch, lineno_getl, lineno_gets, lineno_open): DS. (parser_set_lineno, lisp_parse_impl): PS. * parser.l (YY_INPUT): PS. * rand.c (make_random_state): PS. * regex.c (print_rec): DS. (search_regex): PS. * signal.c (kill_wrap, raise_wrap, get_sig_handler, getitimer_wrap, setitimer_wrap): DS. * socket.c (addrinfo_in, sockaddr_pack, fd_timeout, to_connect, open_sockfd, sock_mark_connected, sock_timeout): TS. (getaddrinfo_wrap, dgram_set_sock_peer, sock_bind, sock_connect, sock_listen, sock_accept, sock_shutdown, sock_send_timeout, sock_recv_timeout, socketpair_wrap): DS. * stream.c (generic_fill_buf, errno_to_string, stdio_truncate, string_out_put_string, open_fileno, open_command, base_name, dir-name): DS. (unget_byte, put_buf, fill_buf, fill_buf_adjust, get_line_as_buf, formatv, put_byte, test_set_indent_mode, test_neq_set_indent_mode, set_indent_mode, set_indent, inc_indent, set_max_length, set_max_depth, open_subprocess, run ): PS. (fds_subst, fds_swizzle): TS. * struct.c (make_struct_type, super, umethod_args_fun): PS. (method_args_fun): DS. * strudel.c (strudel_put_buf, strudel_fill_buf): DS. * sysif.c (errno_wrap, exit_wrap, usleep_wrap, mkdir_wrap, ensure_dir, makedev_wrap, minor_wrap, major_wrap, mknod_wrap, mkfifo_wrap, wait_wrap, wifexited, wexitstatus, wifsignaled, wtermsig, wcoredump, wifstopped, wstopsig, wifcontinued, dup_wrap, close_wrap, exit_star_wrap, umask_wrap, setuid_wrap, seteuid_wrap, setgid_wrap, setegid_wrap, simulate_setuid_setgid, getpwuid_wrap, fnmatch_wrap, dlopen_wrap): DS. (chmod_wrap, do_chown, flock_pack, do_utimes, poll_wrap, setgroups_wrap, setresuid_wrap, setresgid_wrap, getgrgid_wrap): PS. (c_time): TS. * sysif.h (c_time): Declaration updated. * syslog.c (openlog_wrap, syslog_wrap): DS. * termios.c (termios_pack): TS. (tcgetattr_wrap, tcsetattr_wrap, tcsendbreak_wrap, tcdrain_wrap, tcflush_wrap, tcflow_rap, encode_speeds, decode_speeds): DS. * txr.c (compato, array_dim, gc_delta): DS. * unwind.c (uw_find_frames_by_mask): DS. * vm.c (vm_make_desc): PS. (vm_make_closure, vm_swtch): DS.
* Remove unnecessary #include directives.Kaz Kylheku2020-04-221-1/+0
| | | | | | | | | | Time for some spring cleaning. * args.c, arith.c, buf.c, cadr.c, chksum.c, debug.c, ftw.c, gc.c, gencadr.txr, glob.c, hash.c, lisplib.c, match.c, parser.c, parser.l, parser.y, rand.c, signal.c, stream.c, strudel.c, syslog.c, tree.c, unwind.c, utf8.c, vm.c: Numerous unnecessary #include directives removed.
* hash: bugfix: spurious retention in weak processing.Kaz Kylheku2020-04-111-32/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The algorithm for weak processing is not correct. In hash_mark, we must must simply not mark any of the entries, keys or values, of a weak table regardless of what type of weak table it is. If we do that, we cause spurious retention in situations that the keys and values have some kind of link together other than through the table. For instance, suppose keys are weak, but values happen to have references to keys. If we mark the values, we mark the keys and nothing will expire from the table. Such a situation happens in stream_parser_hash, which associates streams with parsers, and has weak keys. Parsers have references to streams. So entries in the hash never expire. Any stream that gets a parser is retained forever. The weak hashes used for binding in eval.c (top_vb, ...) are also affected, because the key is some symbol <sym> and the value is (<sym> . <val>). The key is weak, but the value references the sym. So these hashes also will not expire the keys: unreachable variable bindings will stick around. * hash.c (hash_mark): If a hash table has weak keys, values, or both, then only mark its vector if the count is zero. If it has one or more entries, we just add it to the reachable_weak_hashes list to be processed in do_weak_tables.
* warning cleanup: suspicious switch fallthrough cases.Kaz Kylheku2020-04-051-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the seventh round of an effort to enable GCC's -Wextra option. Warnings about switch fallthrough situations are addressed. GCC now has a diagnostic for this that is enabled by -Wextra in such a way that if a fallthrough comment is present, the diagnostic is suppressed. In much of the code, we have such a comment. It's missing in a few places, or misplaced. There are also some real bugs. * hash.c (hash_buf): Add fallthrough comments to intentional fallthrough cases. (hash_hash_op): bugfix: add break statement. The 32 and 64 bit cases are independent (at compile time). * lib.c (cdr, nullify, list_collect, empty): Add fallthrough comment. (int_str): Add missing break. This has not caused a bug though because setting the octzero flag in the zerox case is harmless to the logic which follows. * linenoise.c (edit): Move misplaced fallthrough. * sysif.c (fcntl_wrap): Bugfix: add missing break, without which errno is tampered to hold EINVAL, in spite of a successful F_SETLK, F_SETLKW or F_GETLK operation. * unwind.h (jmp_restore): Declare noreturn, so that GCC does not issue a false positive warning about a fallthrough in uw_unwind_to_exit_point. * utf8.c (utf8_from_buf, utf8_decode): Move a fallthrough comment outside of preprocessing, so it is properly processed by GCC's diagnostic.
* New type args with DARG type code.Kaz Kylheku2020-03-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | An object of args type captures into the heap the "struct args" argument list that normally appears only on the stack. Such an object also has space for a car and cdr field, which can come in handy. * args.c (dyn_args): New function: hoist a struct args * into an args heap object. * args.h (dyn_args): Declared. * gc.c (finalize, mark_obj): Handle DARGS type code. * hash.c (equal_hash): Handle DARG via eq equivalence. * lib.c (args_s): New symbol variable. (code2type): Map DARG to args symbol. (equal): Handle DARG type, using eq equivalence for now. (obj_init): Initialize args_s with interned symbol. * lib.h (enum type, type_t): New type code, DARG. (struct dyn_args): New struct. (union obj): New member, a of type struct dyn_args. * txr.1: Documented args type under typeof.
* hash-uni: two new arguments for projecting values.Kaz Kylheku2020-03-191-8/+16
| | | | | | | | | | | * hash.c (hash_uni): New functional argument map1fun and map2fun. If present, values from hash1 and hash2, respectively, are projected through these functions. (hash_init): hash-uni registration updated. * hash.h (hash_uni): Declaration updated. * txr.1: Documented new arguments.
* hash: bugfix: maintain counts in weak processing.Kaz Kylheku2020-03-091-4/+10
| | | | | | * hash.c (do_weak_tables): Update the count field of each weak hash to account for the entries that get removed by expiry. Also, loop variable moves into a tighter scope.
* hash: bug: not hashing key of tree node.Kaz Kylheku2020-01-121-1/+1
| | | | | | | * hash.c (equal_hash): Spurious semicolon in TNOD case causing part of expression that includes the key to be cut off. This was not diagnosed by the C compiler of GCC 4.x or 7.4.0. The GCC 7.4.0 C++ front end caught this bug.
* Copyright year bump 2020.Kaz Kylheku2019-12-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/asm.tl, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/compiler.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/debugger.tl, share/txr/stdlib/defset.tl, share/txr/stdlib/doloop.tl, share/txr/stdlib/error.tl, share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl, share/txr/stdlib/package.tl, share/txr/stdlib/param.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/pmac.tl, share/txr/stdlib/save-exe.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/trace.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/vm-param.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr: Extended copyright notices to 2020.
* hash: bugfix: bad memset size in hash-reset.Kaz Kylheku2019-11-181-1/+1
| | | | | * hash.c (hash_reset): Clear the whole structure, not just a pointer-sized region at its base.
* hash: new hash-reset function.Kaz Kylheku2019-11-021-0/+20
| | | | | | | | | * hash.c (hash_reset): New function. (hash_init): hash-reset intrinsic registered. * hash.h (hash_reset): Declared. * txr.1: Documented.
* lib: use stack-allocated hash iterators everywhere.Kaz Kylheku2019-11-011-9/+14
| | | | | | | | | | | | | | | | | * eval.c (op_dohash): Use hash_iter instead of consing up heap-allocated hash iterator. * filter.c (trie_compress, regex_from_trie): Likewise. * hash.c (hash_equal_op, hash_hash_op, hash_print_op): Likewise. * lib.c (package_local_symbols, package_foreign_symbols, find_max, find_if, rfind_if, populate_obj_hash): Likewise. * parser.c (circ_backpatch, get_visible_syms): Likewise. * struct.c (method_name, get_slot_syms): Likewise.
* hash: expose new iterator interface.Kaz Kylheku2019-11-011-11/+4
| | | | | | | | | | * hash.c (struct hash): Declaration removed from here. (hash_iter_init, us_hash_iter_init, hash_iter_next, hash_iter_peek): Functions switched to external linkage. * hash.h (struct hash): Declared here now. (hash_iter_init, us_hash_iter_init, hash_iter_next, hash_iter_peek): Declared.
* hash: improve new hash_iter interface.Kaz Kylheku2019-11-011-16/+22
| | | | | | | | | | | | | | | | | | | | | Most calls to hash_iter_next are passing a null parameter for the object; only hash_next uses that parameter. Let's make hash_iter_next a wrapper which doesn't have that parameter. This interface will soon be exposed to other source files, so it's important to streamline it. * hash.c (hash_iter_next_impl): New function, exact copy of hash_iter_next. (hash_iter_next): Reduced to wrapper for hash_iter_next_impl, with one less argument. (hash_next): Call hash_iter_next_impl instead of hash_iter_next. (maphash, group_by, group_reduce, hash_uni, hash_diff, hash_symdiff, hash_isec, hash_subset, hash_update, hash_revget, hash_invert): Remove null argument from hash_iter_next calls.
* lib: don't assume time_t is signed.Kaz Kylheku2019-10-311-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | We introduce the function c_time to convert a Lisp integer to time_t, and num_time to do the reverse conversion. The FFI type time-t already does this right. (See registration of time-t in ffi_init_extra_types). * hash.c (gen_hash_seed): The first value out of time_sec_usec corresponds to a time_t value. We now convert this to C number using c_time rather than c_num. Also, while we are touching this code, the microseconds value can convert directly to ucnum with c_unum. * lib.c (time_sec_usec): Use num_time for seconds. (time_string_local, time_string_utc, time_fields_local, time_fields_utc, time_struct_local, time_struct_utc): Use c_time. (make_time_impl, time_parse_utc): Use num_time instead of num. * signal.h (getitimer_wrap, setitimer_wrap): Convert tv_sec members of struct timeval using c_time and num_time. * sysif.c (c_time, num_time): New functions. (stat_to_struct): Convert st_atime, st_mtime and st_ctime to Lisp using num_time instead of num. * sysif.c (c_time, num_time): Declared.
* hash: stack-allocated iterators.Kaz Kylheku2019-10-291-62/+105
| | | | | | | | | | | | * hash.c (hash_iter_init, us_hash_iter_init, hash_iter_next, hash_iter_peek): New static functions, made from hash_begin, hash_next and hash_peek internals. (hash_begin, hash_next, hash_peek): Turned into wrappers for hash_iter_init, hash_iter_next, hash_iter_peek. (maphash, group_by, group_reduce, hash_uni, hash_diff, hash_symdiff, hash_isec, hash_subset, hash_update, hash_revget, hash_invert): Use stack-allocated struct hash_iter instead of heap allocated object from hash_begin.
* naming: get the -func out, at least some of it.Kaz Kylheku2019-10-291-6/+6
| | | | | | | | | | | | | | | | | | | | The code base contains a lot of irksome _func which should be _fun, and also the public functions func-get-form and func-get-name are irksomely named. As a first step, we can fix parameters which carry this suffix. * glob.c (global_wrap): errfunc argument renamed to errfun. * glob.h (global_wrap): Likewise. * hash.h (hash_uni, hash_isec): join_func argument renamed to joinfun. * hash.h (hash_uni, hash_isec): Likewise. * txr.1: fixed gen-func typo. Arguments renamed in descriptions of hash-uni, hash-isec, iff, iffi, glob, and ftw.
* New function: hash-invert.Kaz Kylheku2019-10-281-0/+31
| | | | | | | | | * hash.c (hash_invert): New function. (hash_init): hash-invert intrinsic registered. * hash.c (hash_invert): Declared. * txr.1: Documented.
* hashing: partially revert 63feff9c.Kaz Kylheku2019-10-251-4/+4
| | | | | | | | | Mixing the hash seed with the hashes for characters, fixnums and pointers by multiplication doesn't make sense. It doesn't perturb the hash sufficiently. * hash.c (equal_hash): Do not multiply the hash by the seed for CHR, NUM, SYM, PKG and ENV.
* hash: observe count in eql-based hash.Kaz Kylheku2019-10-211-0/+3
| | | | * hash.c (eql_hash): Decrement count and bail if zero.
* hash: rename hash_rec_limit.Kaz Kylheku2019-10-181-10/+10
| | | | | | | | | | | | | | hash_rec_limit isn't a limit on recursion depth but on the elements traversed. * hash.c (hash_rec_limit): Variable renamed to hash_traversal_limit. (gethash_c, gethash_e, remhash, hash_equal): Use new name. (set_hash_rec_limit): Function renamed to set_hash_traversal_limit. (hash_init): set-hash-rec-limit intrinsic renamed to set-hash-traversal-limit. This function is undocumented, so no backward compatibility is provided.
* hash: get rid of hash_str_limit.Kaz Kylheku2019-10-181-18/+13
| | | | | | | | | | | | | | hash.c (hash_str_limit): Variable removed. (hash_c_str): Take count parameter. Observe the limit and update it. The count is scaled by 4 for strings: four characters for one count. (hash_buf): Likewise. (equal_hash): Pass count to hash_c_str and hash_buf. Use the count to determine how far to force a lazy string for hashing. (set_hash_str_limit): Static function removed. (hash_init): Removed sys:set-hash-str-limit intrinsic. This was used in genman.txr once, but that use was removed a year and a half ago.
* hash: observe count limit for vectors and hash tables.Kaz Kylheku2019-10-181-1/+3
| | | | | | * hash.c (equal_hash): Break out of hashing a vector when the count is exceeded. (hash_hash_op): Likewise for traversing a hash table.
* hash: optimize vector access.Kaz Kylheku2019-10-121-22/+18
| | | | | | | | | | | | We use direct access instead of going through vecref/vecref_l. As a result of this change, I measured an 8% speedup in "make retest"! I'm also seeing a more than 10% speedup in "make" after "make clean-tlo" (i.e. recompiling the Lisp library). * hash.c (hash_mark, hash_grow, copy_hash, gethash_c, gethash_e, remhash, hash_next, hash_peek, do_weak_tables): Access the table directly.
* hash: use ucnum for hash values everywhere.Kaz Kylheku2019-10-121-11/+11
| | | | | | | | | | | | | | | | | | | One consequence of this is that we no longer pass negative values to vecref; that functon was protecting us from the consequences of this signed/unsigned mixup, at the cost of cyles. * hash.c (struct hash_ops): hash parameter in assoc_fun and acons_new_c_fun pointers changes from cnum to ucnum. (hash_assoc, hash_assql, hash_assq): hash parameter changes to ucnum. (hash_acons_new_c, hash_aconsql_new_c, hash_aconsq_new_c): Likewise. (gethash_c, gethash_e, remhash): Variable which captures the output of the hashing function is now ucnum instead of cnum. * lib.h (struct cons_hash_entry): Member hash changes from cnum to ucnum.
* hash: strengthen type mutual exclusion check.Kaz Kylheku2019-10-111-8/+13
| | | | | | | | | | * hash.c (equal_based_p): Take additional argument eq indicating that :eq-based has been specified. Check all three exclusive combinations. (hashv): Call equal_based_p unconditionally, even when :eq-based is specified and we don't use this function's return value so we benefit from that function's exclusion check. Pass the eq boolean, as required.
* hash: implement :eq-based.Kaz Kylheku2019-10-111-7/+117
| | | | | | | | | | | | | | | | | | | | | We need eq based hash tables to fix a problem in *print-circle*. * hash.c (enum hash_type, hash_type_t): New enum type. (eq_based_k): New keyword variable. (eq_hash, eq_hash_op): New static functions. (hash_print_op): Ensure we print eq-based hashes with the correct keyword. (hash_assq, hash_aconsq_new_c): New static functions. (hash_eq_ops): New static structure. (do_make_hash): New function, made from previous contents of make_seeded_hash. (make_seeded_hash): Wrapper around do_make_hash now. (make_eq_hash): New function. (hashv): Parse out :eq-based argument. Use make_eq_hash if it is present. (hash_init): Initialize eq_based_k. * hash.h (eq_based_k, make_eq_hash): Declared.
* safety: fix type tests that code can subvert.Kaz Kylheku2019-09-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes numerous instances of a safety hole which involves the type of a COBJ object being tested to be of a given class using logic that can be subverted by the definition of a like-named struct. Specifically logic like (typeof(obj) == hash_s) is broken, because if a struct type called hash is defined, then the test will yield true for instances of that struct type. Those instances can then be passed into code that only works on COBJ hashes, and relies on this test to reject invalid objects. * ffi.c (make_carray): Replace fragile test with strong one, using new cobjclassp function. * hash.c (hashp): Likewise. * lib.c (class_check): The expression used here for the type test moves into the new function cobjclassp and so is replaced by a call to that function. (cobjclassp): New function. * lib.h (cobjclassp): Declared. * rand.c (random_state_p): Replace fragile test using cobjclassp. * regex.c (char_set_compile): Replace fragile typeof tests for character type with is_chr. (reg_derivative, regexp): Replace fragile test with cobjclassp. * struct.c (struct_type_p): Replace fragile test with cobjclassp.
* Use put_char for single character output.Kaz Kylheku2019-09-261-5/+5
| | | | | | | * hash.c (hash_print_op): Replace length 1 put_string calls with put_char. * lib.c (obj_print_impl): Likewise.
* New data type: tnode.Kaz Kylheku2019-09-221-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Binary search tree nodes are being added as a basic heap data type. The C type tag is TNOD, and the Lisp type is tnode. Binary search tree nodes have three elements: a key, a left child and a right child. The printed notation is #N(key left right). Quasiquoting is supported: ^#N(,foo ,bar) but not splicing. Because tnodes have three elements, they they fit into TXR's four-word heap cell, not requiring any additional memory allocation. These nodes are going to be the basis for a binary search tree container, which will use the scapegoat tree algorithm for maintaining balance. * tree.c, tree.h: New files. * Makefile (OBJS): Adding tree.o. * eval.c (expand_qquote_rec): Recurse through tnode cells, so unquotes work inside #N syntax. * gc.c (finalize): Add TNOD to no-op case in switch; tnodes don't require finalization. (mark_obj): Traverse tnode cell. * hash.c (equal_hash): Add TNOD case. * lib.c (tnode_s): New symbol variable. (seq_kind_tab): New entry for TNOD, mapping to SEQ_NOTSEQ. (code2type, equal): Handle TNOD. (obj_init): Initialize tnode_s variable. (obj_print_impl, populate_obj_hash): Handle TNOD. (init): Call tree_init function in tree.c. * lib.h (enum type, type_t): New enumeration TNOD. (struct tnod): New struct type. (union obj, obj_t): New union member tn of type struct tnod. (tnode_s): Declard. * parserc.c (circ_backpatch): Handle TNOD, so circular notation works through tnode cells. * parser.l (grammar): Recognize #N prefix, mapping to HASH_N token. * parser.y (HASH_N): New grammar terminal symbol. (tnode): New nonterminal symbol. (i_expr, n_expr): Add tnode cases to productions. (yybadtoken): Map HASH_N to "#N" string.
* hashing: take advantage of seed when hashing aggregates.Kaz Kylheku2019-09-201-11/+12
| | | | | | | | | | * hash.c (equal_hash): When hashing conses and ranges, perturb the seed going into the hash for the second element, rather than hashing it the same way, and then multiplying it. When hashing the elements of a vector, perturb the seed of each one by multiplying by the index. For the CHR, NUM, BGNUM, FLNUM and several types hashed like pointers, we now mix the seed into the hash.
* New function: hash-zip.Kaz Kylheku2019-07-171-0/+17
| | | | | | | | | * hash.c (hash_zip): New function. (hash_init): hash-zip intrinsic registered. * hash.h (hash_zip): Declared. * txr.1: Documented.
* New function: hash-peek.Kaz Kylheku2019-06-111-0/+26
| | | | | | | | | * hash.c (hash_peek): New function. (hash_init): hash-peek intrinsic registered. * hash.h (hash_peek): Declared. * txr.1: Documented.
* hash: remove unnecessary test.Kaz Kylheku2019-06-111-1/+1
| | | | | | * hash.c (hash_next): We know after the loop that hi->cons is not nil, because the loop contains no break, and is guarded by hi->cons being nil.
* C99: get rid of useless inline instantiations.Kaz Kylheku2019-05-021-5/+0
| | | | | | | | | | | * hash.c, lib.c, parser.y, unwind.c: Remove useless declarations that were believed to be C99 inline instantiations. This was mistakenly added at the time the Solaris issue was discovered that _XOPEN_SOURCE values of 600 or greater require compiling in C99 mode. We use "static inline" under C99 for inline functions; instantiation is not applicable at all to inline functions that don't have external linkage.
* Support max length and depth for object printing.Kaz Kylheku2019-04-181-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * hash.c (hash_print_op): Implement max length. * lib.c (lazy_str_put, out_lazy_str): Take struct strm_base * parameter and implement max length. (out_quasi_str_sym): Don't use obj_print_impl for symbol's name string; just put_string. The use of obj_print_impl causes symbols appearing as variables in quasiliterals to be truncated when max length is imposed. (out_quasi_str): Implement max length. (obj_print_impl): Implement max length and depth. Note that there is now always a non-null ctx pointer. (obj_print): Always set up context pointer for obj_print_impl. Context now has pointer to low-level stream structure, where we can access max_length and max_depth. It also carries the recursion depth. * lib.h (lazy_str_put): Declaration updated. * stream.c (strm_base_init): Add initializers for max_length and max_depth. (put_string): Pass stream structure to lazy_str_put. (set_max_length, set_max_depth): New functions. (stream_init): set-max-length and set-max-depth intrinsics registered. * stream.h (struct strm_ctx): New members depth and strm. (struct strm_base): New members max_length and max_depth. (set_max_length, set_max_depth): Declared. * txr.1: Documented.
* lib: use accessor for lcons function.Kaz Kylheku2019-03-121-4/+4
| | | | | | | | * hash.c (hash_keys_lazy, hash_values_lazy, hash_pairs_lazy, hash_alist_lazy): Use us_lcons_fun instead of direct lcons->lc.fun access. * lib.c (simple_lazy_stream_func, lazy_stream_func): Likewise.
* lib: rename make_half_lazy_cons.Kaz Kylheku2019-03-121-12/+12
| | | | | | | | | | | * lib.h (make_half_lazy_cons): Renamed to make make_lazy_cons_car. * lib.c (rem_lazy_rec, make_half_lazy_cons): Follow rename. * hash.c (hash_keys_lazy, hash_keys, hash_values_lazy, hash_values, hash_pairs_lazy, hash_pairs, hash_alist_lazy, hash_alist): Follow rename.
* hash: gc issue in clearhash.Kaz Kylheku2019-02-241-0/+1
| | | | | | * hash.c (clearhash): Assignment of new table into hash requires setcheck, because it's putting a new object into an old one.
* hash: remove redundant assignment from hash_grow.Kaz Kylheku2019-02-241-1/+1
| | | | | | * hash.c (hash_grow): The new_table value is stored in h->table twice. First directly and then via the set macro. Let's just use setcheck, which avoids the intermediate loc object.
* hashing: provide unsafe hash count.Kaz Kylheku2019-02-221-0/+6
| | | | | | | * hash.c (us_hash_count): New function. Blindly assumes that the argument is a hash. * hash.h (us_hash_count): Declared.
* Optimize hash operation with unsafe car/cdr.Kaz Kylheku2019-02-141-94/+97
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The associative lists that make up the chains of a hash table are guaranteed to be made of conses. We can use unsafe versions of car, cdr, rplaca and rplacd to speed up hash operations. * eval.c (op_dohash): Use unsafe operations on hash cell. * filter.c (trie_compress, regex_from_trie): Likewise. * hash.c (hash_equal_op, hash_print_op, hash_mark, hash_grow, hash_assoc, hash_assql, copy_hash_chain, gethash, inhash, gethash_n, sethash, remhash, hash_next, maphash, do_weak_tables, group_by, group_reduce, hash_keys_lazy, hash_keys, hash_values_lazy, hash_values, hash_pairs_lazy, hash_pairs, hash_alist_lazy, hash_uni, hash_diff, hash_symdiff, hash_isec, hash_subset, hash_update, hash_update_1, hash_revget): Likewise. * lib.c (us_rplaca, us_rplacd): New functions. (package_local_symbols, package_foreign_symbols, where, populate_obj_hash, obj_hash_merge): Use unsafe operations on hash cell * lib.h (us_rplaca, us_rplacd): Declared. * parser.c (circ_backpatch, get_visible_syms): Use unsafe operations on hash cell. * struct.c (method_name, get_slot_syms): Likewise.
* gethash_c: review uses and improve or replace.Kaz Kylheku2019-02-141-8/+7
| | | | | | | | | | | | | | | | | | | | * eval.c (env_fbind, env_vbind, reg_symacro): Use gethash_l instead of gethash_c to eliminate repeated cdr operations on the same cell. * hash.c (sethash): Since new_p is never used, eliminated it and use nulloc. (group_reduce): Use gethash_l instead of gethash_c. * lib.c (obj_init): Replace rplacd(gethash_c(...)) pattern whose return value is not used with with sethash. We lose some diagnosability here since sethash doesn't take a "self" argument. (obj_print_impl, obj_hash_merge): Use gethash_l instead of gethash_c. * parser.y (ensure_parser, parser_circ_def, get_visible_syms, rlset): Use gethash_l instead of gethash_c.
* gethash_f: removing function.Kaz Kylheku2019-02-141-16/+7
| | | | | | | | | | | | | | | | Uses of gethash_f can be replaced with gethash_e, which returns the hash cell directly rather than through a loc argument. Code that needs the value can call cdr itself. * hash.c (inhash, hash_isec, hash_update_1): Replace gethash_f with gethash_e. (gethash_f): Function removed. * hash.h (gethash_f): Declaration removed. * lib.c (use_sym, unuse_sym, find_symbol, unintern, intern_fallback, in, sel, populate_obj_hash): Replace gethash_f with gethash_e.
* hash-from-alist: new function.Kaz Kylheku2019-02-131-0/+15
| | | | | | | | | * hash.c (hash_from_alist_v): New function. (hash_init): Register hash-from-alist intrinsic. * hash.h (hash_from_alist_v): Declared. * txr.1: Documented.
* hash-symdiff: new function.Kaz Kylheku2019-02-131-0/+35
| | | | | | | | | * hash.c (hash_symdiff): New function. (hash_init): hash-symdiff intrinsic registered. * hash.h (hash_symdiff): Declared. * txr.1: Documented.
* hash-uni: bugfix.Kaz Kylheku2019-02-131-2/+6
| | | | | | | | * hash.c (hash_uni): The join function must only be called for the values of keys that exist in both hashes. The broken logic here unconditionally calls the join function for all keys in the left hash (using nil as the right join value when the key doesn't exist in the right hash).
* copy-hash: showstopper: seed must be copied.Kaz Kylheku2019-01-281-0/+1
| | | | | | * hash.c (copy_hash): Fix failure to initialize seed member in the duplicated hash structure. This regression was introduced along with seeded hashing.