summaryrefslogtreecommitdiffstats
path: root/lib.c
Commit message (Collapse)AuthorAgeFilesLines
* printer: revise package prefix decision.Kaz Kylheku2019-02-231-24/+50
| | | | | | | | | | | | | | | | | | | | | | | * lib.c (symbol_needs_prefix): revisiting the wrongheaded requirements codified in 7bc150f, because the ergonomics is bad. In a package that has a local symbol that has the same name as one in a fallback list, that symbol is always printed with a prefix, which is annoying. The new rules are simple: if the symbol being printed is the one which is visible, then it gets no package prefix. Also, this function now handles the full responsibility for the prefix calculation, including for keyword symbols and uninterned symbools. It returns nil to indicate no prefix is needed, or else a character string. Moreover, logic is added to detect symbols which have a home package, but are uninterned from it, which should be printed with the "#" prefix. Lastly, this function is optimized to avoid unnecessary gethash operations. If a symbol S's home package is P, and P contains no hidden symbols (overwhelmingly common situation), then S is interned in P; no need to do the hash lookup to check this. (obj_print_impl): Symbol printing simplified: if symbol_needs_refix returns non-nil, that string value is the prefix.
* Optimize hash operation with unsafe car/cdr.Kaz Kylheku2019-02-141-8/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The associative lists that make up the chains of a hash table are guaranteed to be made of conses. We can use unsafe versions of car, cdr, rplaca and rplacd to speed up hash operations. * eval.c (op_dohash): Use unsafe operations on hash cell. * filter.c (trie_compress, regex_from_trie): Likewise. * hash.c (hash_equal_op, hash_print_op, hash_mark, hash_grow, hash_assoc, hash_assql, copy_hash_chain, gethash, inhash, gethash_n, sethash, remhash, hash_next, maphash, do_weak_tables, group_by, group_reduce, hash_keys_lazy, hash_keys, hash_values_lazy, hash_values, hash_pairs_lazy, hash_pairs, hash_alist_lazy, hash_uni, hash_diff, hash_symdiff, hash_isec, hash_subset, hash_update, hash_update_1, hash_revget): Likewise. * lib.c (us_rplaca, us_rplacd): New functions. (package_local_symbols, package_foreign_symbols, where, populate_obj_hash, obj_hash_merge): Use unsafe operations on hash cell * lib.h (us_rplaca, us_rplacd): Declared. * parser.c (circ_backpatch, get_visible_syms): Use unsafe operations on hash cell. * struct.c (method_name, get_slot_syms): Likewise.
* gethash_c: review uses and improve or replace.Kaz Kylheku2019-02-141-8/+7
| | | | | | | | | | | | | | | | | | | | * eval.c (env_fbind, env_vbind, reg_symacro): Use gethash_l instead of gethash_c to eliminate repeated cdr operations on the same cell. * hash.c (sethash): Since new_p is never used, eliminated it and use nulloc. (group_reduce): Use gethash_l instead of gethash_c. * lib.c (obj_init): Replace rplacd(gethash_c(...)) pattern whose return value is not used with with sethash. We lose some diagnosability here since sethash doesn't take a "self" argument. (obj_print_impl, obj_hash_merge): Use gethash_l instead of gethash_c. * parser.y (ensure_parser, parser_circ_def, get_visible_syms, rlset): Use gethash_l instead of gethash_c.
* gethash_f: removing function.Kaz Kylheku2019-02-141-28/+26
| | | | | | | | | | | | | | | | Uses of gethash_f can be replaced with gethash_e, which returns the hash cell directly rather than through a loc argument. Code that needs the value can call cdr itself. * hash.c (inhash, hash_isec, hash_update_1): Replace gethash_f with gethash_e. (gethash_f): Function removed. * hash.h (gethash_f): Declaration removed. * lib.c (use_sym, unuse_sym, find_symbol, unintern, intern_fallback, in, sel, populate_obj_hash): Replace gethash_f with gethash_e.
* symdiff: new function.Kaz Kylheku2019-02-141-0/+43
| | | | | | | | | | | | * eval.c (eval_init): Register symdiff intrinsic. * lib.c (symdiff): New function. * lib.h (us_car_p, us_cdr_p): New inline functions. (symdiff): Declared. * txr.1: Documented, also fixing issues not related to symdiff doc.
* optimizing diff, isec and uni for non-lists.Kaz Kylheku2019-02-131-35/+70
| | | | | | | | | | | | | | | Also, these functions now support hashes. * eval.c (eval_init): Register only the deprecated set-diff to the set_diff function. The diff intrinsic is now going to the new function named diff. * lib.c (diff): New function. (isec, uni): Rewritten to use seq_iter_t. * lib.h (diff): Declared. * txr.1: Documentation updated.
* num: reduce duplicate code.Kaz Kylheku2019-02-131-3/+1
| | | | | * lib.c (num): Use num_fast instead of an expression that is identical to the body of that inline function.
* Framework for iterating over sequences.Kaz Kylheku2019-02-131-0/+79
| | | | | | | | | | | | | | | | This has been needed for a while. While we have seq_info for classifying sequences to nicely dispatch code into various cases, those cases duplicate code. The code base could benefit from generic traversal. * lib.c (seq_iter_get_nil, seq_iter_get_list, seq_iter_get_vec, set_iter_get_hash): New static functions. (seq_iter_rewind, seq_iter_init): New functions. * lib.h (struct seq_iter, seq_iter_t): New struct type and its typedef name. (seq_iter_init, seq_iter_rewind): Declared. (seq_get): New inline function.
* sum and prod take keyfun argument.Kaz Kylheku2019-02-021-4/+39
| | | | | | | | | | | | | * eval.c (eval_init): Adjust registrations of sum and prod to be binary functions with an optional argument. * lib.c (nary_op_keyfun, sumv, prodv): New static functions. (sum, prod): Implement optional keyfun argument via sumv and prodv helpers. * lib.h (sum, prod): Declarations updated. * txr.1: Documentation updated.
* Provide faster bignum-in-fixed-integer range tests in MPI.Kaz Kylheku2019-01-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | * mpi/mpi.c (mp_in_range, mp_in_intptr_range, mp_in_uintptr_range): New functions. * mpi/mpi.h (mp_in_range, mp_in_intptr_range, mp_in_uintptr_range): Declared. * arith.c (NUM_MAX_MP, INT_PTR_MAX_MP, UINT_PTR_MAX_MP, INT_PTR_MAX_SUCC_MP): Static variables removed. Note that INT_PTR_MAX_MP was not used at all! (normalize): Use mp_in_range instead magnitude comparison to NUM_MAX_MP. (in_int_ptr_range, in_uint_ptr_range): Static functions removed. (c_unum): Use mp_in_uintptr_range instead of in_uint_ptr_range. (arith_init): Remove initializations of removed variables. (arith_free_all): Remove cleanup of removed variables, leaving function empty. * lib.c (c_num): Use mp_in_intptr_range instead of in_int_ptr_range.
* lib: revise wording of integer range errors.Kaz Kylheku2019-01-241-7/+8
| | | | | | | | * arith.c (c_unum): Fix misleading error message, and instead specify the range that was violated. * lib.c (c_num): Similar change: don't refer to a 'cnum range' which means nothing to the user.
* mpi: use wchar_t string for text-to-bignum.Kaz Kylheku2019-01-181-4/+1
| | | | | | | | | | | | | * mpi/mpi.c (mp_read_radix): Take const wchar_t * string rather than unsigned char *. (s_mp_tovalue): Take character argument as wchar_t rather than int. * mpi/mpi.h (mp_read_radix): Declaration updated. * lib.c (int_str): Avoid a malloc/free and UTF-8 conversion by passing the original wide string to mp_read_radix. This removes a TODO dating back to December 2011.
* Copyright year bump 2019.Kaz Kylheku2019-01-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * LICENSE, LICENSE-CYG, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/asm.tl, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/compiler.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/doloop.tl, share/txr/stdlib/error.tl, share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl, share/txr/stdlib/package.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/pmac.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/trace.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/vm-param.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr: Extended Copyright line to 2018.
* Eliminate ALLOCA_H.Kaz Kylheku2018-12-311-1/+1
| | | | | | | | | | | | | * configure: Instead of generating a definition of ALLOCA_H, generate the variable HAVE_ALLOCA_<name> with a value of 1, where <name> is one of stdlib, alloca or malloc. * alloca.h: New header. * args.c, eval.c, ffi.c ffi.c, ftw.c, hash.c, lib.c, match.c, parser.c, parser.y, regex.c, socket.c, stream.c, struct.c, sysif.c, syslog.c, termios.c, unwind.c, vm.c: Include "alloca.h" instead of ALLOCA_H.
* Drastically reduce inclusion of <dirent.h>.Kaz Kylheku2018-12-111-1/+0
| | | | | | | | | | | | | | | | | | | The <dirent.h> header is included all over the place because it is needed by a single declaration in stream.h. That declaration is for a function that is only called within stream.c, so we make it internal. Now only stream.c has to include <dirent.h>. * buf.c, debug.c, eval.c, ffi.c, filter.c, gc.c, gencadr.txr, hash.c, lib.c, lisplib.c, match.c, parser.c, regex.c, socket.c, struct.c, strudel.c, sysif.c, syslog.c, termios.c, txr.c, unwind.c, vm.c: Remove #include <dirent.h>. * cadr.c: Regenerated. * stream.c (make_dir_stream): Make external function static. * stream.h (make_dir_stream): Declaration updated.
* New range testing functions.Kaz Kylheku2018-11-271-0/+20
| | | | | | | | | | | * eval.c (eval_init): Register in-range and in-range* intrinsics. * lib.c (in_range, in_range_star): New functions. * lib.h (in_range, in_range_star): Declared. * txr.1: Documented.
* print: keep colon in keyword symsKaz Kylheku2018-11-251-1/+5
| | | | | | | | * lib.c (obj_print_impl): Always include leading colon when printing keyword symbols, regardless of pretty flag. Subject to backward compatibility. * txr.1: Compat note added.
* copy: call copy-fun for functions.Kaz Kylheku2018-11-171-0/+2
| | | | | | * lib.c (copy): Handle FUN type through copy_fun. * txr.1: Documented.
* vm: provide special case call entry points.Kaz Kylheku2018-11-161-14/+7
| | | | | | | | | | | | | | | * lib.c (funcall, funcall1, funcall2, funcall3, funcall4): Use vm_funcall, vm_funcall1, vm_funcall2, vm_funcall3, and vm_funcall4, respectively instead of the general vm_execute_closure. Also, missing argument count check added in funcall. * vm.c (vm_funcall_common): New macro. (vm_funcall, vm_funcall1, vm_funcall2, vm_funcall3, vm_funcall4): New functions. * vm.h (vm_funcall, vm_funcall1, vm_funcall2, vm_funcall3, vm_funcall4): Declared.
* copy-fun: duplicate a function, with own environment.Kaz Kylheku2018-11-131-0/+15
| | | | | | | | | | | | | | | | | * eval.c (deep_copy_env): New function. (eval_init): Register copy-fun intrinsic. * eval.h (deep_copy_env): Declared. * lib.c (copy_fun): New function. * lib.h (copy_fun): Declared. * vm.c (vm_copy_closure): New function. * vm.h (vm_copy_closure): Declared. * txr.1: Documented copy-fun.
* Better identify functions that misuse COBJ-s and hashes.Kaz Kylheku2018-11-071-38/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In this patch, the cobj_handle, cobj_ops and variants of gethash get an additional argument to identify the caller. Many functions are updated to pass this down. * buf.c (buf_strm): Pass self name to cobj_handle. * eval.c (env_fbind, env_vbind, rt_defvarl, me_case): Pass self name to gethash_c or gethash_e. (load): Pass self name to read_eval_stream and read_compiled_file. (reg_symacro): Pass situation-identifying string to gethash_c. * ffi.c (ffi_type_struct_checked, ffi_closure_struct_checked, ffi_call_desc_checked, uni_struct_checked): Take self name parameter, and pass down to cobj_handle. (ffi_get_type, ffi_get_lisp_type): Take self name and pass down to ffi_type_struct_checked. (union_get_ptr): Take self name and pass to uni_struct_checked. (ffi_union_in, ffi_union_put): Pass self name to union_get_ptr. (ffi_type_compile): Pass self name to ffi_get_lisp_type. (ffi_make_call_desc): Pass self name to ffi_type_struct_checked, ffi_get_type and ffi_call_desc_checked. (ffi_make_closure): Pass self name to ffi_call_desc_checked. (ffi_closure_get_fptr): Take self name, pass to ffi_closure_struct_checked. (ffi_typedef, ffi_size, ffi_alignof, ffi_offsetof, ffi_arraysize, ffi_elemsize, ffi_elemtype, ffi_put_into, ffi_put, ffi_in, ffi_get, ffi_out, make_carray): Pass self name to ffi_closure_struct_checked. (carray_struct_checked): Take self name, pass to cobj_handle. (carray_set_length, carray_dup, carray_own, carray_free, carray_type, length_carray, copy_carray, carray_ptr, buf_carray, vec_carray, list_carray, carray_ref, carray_refset, carray_sub, carray_replace, carray_get_common, carray_put_common, unum_carray, num_carray, put_carray, fill_carray): Pass self name to carray_struct_checked. (carray_blank, carray_buf, carray_cptr): Pass self name ffi_type_struct_checked. (carray_pun): Pass self name to carray_struct_checked and ffi_type_struct_checked. (make_union): Pass self name to ffi_type_struct_checked. (union_members, union_get, union_put, union_in, union_out): Pass self name to uni_struct_checked. (make_zstruct, zero_fill, put_obj, get_obj, fill_obj): Pass self-name to ffi_type_struct_checked. * ffi.h (ffi_closure_get_fptr, union_get_ptr): Declarations updated. * filter.c (trie_add): Pass self-name to gethash_l. * hash.c (make_similar_hash, copy_hash, hash_count, get_hash_userdata, set_hash_userdata, hash_begin, hash_next, hash_uni, hash_diff, hash_isec): Pass self name to cobj_handle. (gethash_c, gethash_e): Take self name parameter and pass down to cobj_handle. (gethash_f): Take self parameter and pass down to gethash_e. (gethash, inhash, gethash_n, sethash, pushhash, remhash, clearhash, hash_update_1): Pass self name to gethash_e or gethash_c. * hash.h (gethash_c, gethash_e, gethash_f): Declarations updated. (gethash_l): Take self name, and pass down to gethash_c. * lib.c (class_check): Take self name parameter and use in type mismatch diagnostic. (use_sym, unuse_sym, symbol_needs_prefix, find_symbol, intern, unintern, intern_fallback, unique, in, sel, obj_print_impl, populate_obj_hash, obj_hash_merge): Pass self name to gethash_f or gethash_l. (symbol_visible, obj_init): Pass situation-identifying string to gethash_e. (cobj_handle, cobj_ops): Take self name parameter and pass down to class_check. * lib.h (class_check, cobj_handle, cobj_ops): Declarations updated. * match.c (v_load): Pass self name to read_compiled_file and read_eval_stream. * parser.c (get_parser_impl): Take self name and pass to cobj_handle. (ensure_parser): Pass situation-identifying string to gethash_c. (parser_circ_def): Pass self-name to gethash_c. (lisp_parser_impl): Pass self name to get_parser_impl and class_check. (lisp_parse, nread, iread): Pass self-name to lisp_parser_impl. (read_file_common): Take self name parameter and pass down to get_parser_impl. (read_eval_stream, read_compiled_file): Take self name and pass down to read_file_common. (load_rcfile): Pass situation-identifying string to read_eval_streem. (get_visible_syms): Pass situation-identifying string to gethash_c. (parser_errors, parser_eof): Pass self name to cobj_handle. * parser.h (read_eval_stream, read_compiled_file): Declarations updated. * parser.y (rlset): Pass self name to gethash_c. * rand.c (make_random_state, random_state_get_vec,l random_fixnum, random_float): Pass self name to cobj_handle. * regex.c (regex_source, regex_print, regex_run): Pass self-name to cobj_handle. (regex_machine_init): Take self name param and pass to cobj_handle. (search_regex, match_regex, match_regex_right, regex_prefix_match, read_until_match): Pass self-name to regex_machine_init. * stream.c (stdio_get_fd): Pass self name to cobj_handle. (generic_get_line): Get COBJ operations via unsafe, diret object access rather than cobj_ops. (set_mode_props): Get object handle via unsafe, direct object access. (stream_fd, sock_family, sock_type, sock_peer, set_sock_peer, get_string_from_stream, get_list_from_stream, stream_set_prop, stream_get_prop, close_stream, get_error, get_error_str, clear_error, get_line, get_char, get_byte, unget_char, unget_byte, put_buf, fill_buf, put_string, put_char, put_byte, flush_stream, seek_stream, truncate_stream, get_indent_mode, test_set_indent_mode, set_indent_mode, get_indent, set_indent, inc_indent, width_check, force_break, get_set_ctx, get_ctx): Pass self name to cobj_ops. (make_delegate_stream): Take self name parameter, pass down to cobj_ops. (record_adapter): Pass self name down to make_delegate_stream. (format): Pass self name to class_check. * struct.c (stype_handle): Pass self name to cobj_handle. (make_struct_type): Pass self name to class_check. * txr.c (read_eval_stream_noerr): Take self name parameter, pass to read_eval_stream. (txr_main): Pass istuation-identifying string to read_compiled_file and read_eval_stream_noerr. * unwind.c (revive_cont): Pass self-name to cobj_handle. * vm.c (vm_desc_struct): Take self name parameter, pass to cobj_handle. (vm_desc_nlevels, vm_desc_nregs, vm_desc_bytecode, vm_desc_datavec, vm_desc_symvec, vm_execute_toplevel, vm_execute_closure, vm_closure_entry): Pass self name to vm_desc_struct. (vm_closure_struct): Take self name parameter, pass to cobj_handle.
* lib: remove unused type checking functions.Kaz Kylheku2018-11-071-17/+0
| | | | | | * lib.c (type_check2, type_check3): Functions removed. * lib.h (type_check2, type_check3): Declarations removed.
* Fix wrong uses of ~s for function name string.Kaz Kylheku2018-11-071-5/+5
| | | | | | | | | | | | * ffi.c (make_ffi_type_enum): Use ~a for function name rather than ~s because it's a string which is quoted under ~s. * lib.c (chk_xalloc, string_extend, find_symbol, intern_fallback): Likewise. * stream.c (open_process, run): Likewise. * sysif.c (exec_wrap, setgroups_wrap, dlclose_wrap): Likewise.
* type_check: take function name arg.Kaz Kylheku2018-11-071-33/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * arith.c (flo_int): Pass down name to type_check. * eval.c (copy_env, env_fbind, env_vbind, env_vb_to_fb, func_get_name, lexical_var_p, lexical_fun_p, lexical_lisp1_binding, squash_menv_deleting_range, op_upenv): Pass relevant Lisp function name to type_check. (lookup_global_var, lookup_sym_lisp1, lookup_fun, lookup_mac, lookup_symac, lookup_symac_lisp1): For these widely used functions, pass situational prefix in place of function name. They may get a funtion name argument in the future. * gc.c (gc_finalize): Pass function name to type_check. * lib.c (throw_mismatch): Take function nme argument, incorporate into mesage. (lcons_fun, c_flo, string_extend, symbol_name, symbol_package, get_package, package_name, func_get_form, func_get_env, func_set_env, vec_set_length, length_vec, size_vec, list_vec, lay_str_force, lay_str_force_upto, lazy_str_get_trailing_list, from, too, set_from, set_to): Pass relevant Lisp function name to type_check. (symbol_setname, symbol_visible): Pass indication of internal error into type_check, since this doesn't pertain to any Lisp function being wrong. * lib.h (throw_mismatch): Declaration updated. (type_check): Take new parameter and pass down to throw_mismatch. * signal.c (set_sig_handler): Pass name down to type_check.
* symbol_needs_prefix: take function name argument.Kaz Kylheku2018-11-071-3/+3
| | | | | | | | * lib.c (symbol_needs_prefix): New parameter. (unquote_star_check, obj_print_impl): Pass Lisp function name to symbol_needs_prefix. * lib.h (symbol_needs_prefix): Declaration updated.
* lazy strings: remove two type checks.Kaz Kylheku2018-11-071-2/+0
| | | | | | * lib.c (lazy_str_put, out_lazy_str): The very few calls to these functions already ensure that the object is a lazy string; let's drop the check.
* math: improve error diagnosis.Kaz Kylheku2018-11-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | More streamlined code, better identification of functions. * arith.c (not_number, not_integer, invalid_ops, invalid_op, divzero): New static functions. (num_to_buffer, bugnum_len, plus, minus, neg, abso, signum, mul, trunc1, mod, floordiv, round1, roundiv, divi, zerop, plusp, minusp, evenp, oddp, gt, lt, ge, le, numeq, expt, exptmod, floorf, ceili, sine, cosi, tang, asine, acosi, atang, loga, logten, logtwo, expo, sqroot, int_flo, flo_int, cum_norm_dist, inv_cum_norm): Establish function's Lisp name as self variable. Use new static functions for reporting common errors. Pass function name to new argument of c_flo function. * buf.c (buf_put_float, buf_put_double): Pass function's Lisp name to c_flo function. * ffi.c (ffi_float_put, ffi_double_put): Likewise. * lib.c (c_flo): Takes new argument, name of calling function. * lib.h (c_flo): Declaration updated. * stream.c (formatv): Pass function name to c_flo.
* lib: use type switch in some string functions.Kaz Kylheku2018-11-061-79/+67
| | | | | | * lib.c (length_str, c_str, length_str_gt, length_str_ge, length_str_lt, length_str_le): Streamline code into single switch on the type code of the object.
* gc: eliminate most uses of gc_mutated.Kaz Kylheku2018-11-061-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code is using gc_mutated in situations that resemble assignment: a value is stored into a slot in some object. These situations should be handled using the same logic as embodied in the gc_set function. This is because gc_set will consider both objects, and in many cases will not have to do anything special. E.g. if an immature object is stored into another immature object, or mature into immature, or mature into mature. Whereas gc_mutated is a "just in case" function which forces the garbage collector to traverse the indicated object, if that object is mature. In this patch we refactor gc_set to expose its underlying logic with a somewhat more flexible function called gc_assign_check. We put that behind a conditionally defined macro called setcheck, and then use that to replace invocations of the mut macro in various places. The only uses of gc_mutated that remain are in the bulk vector assignment and copy_struct: operations in which potentially many element values are migrated from one aggregate object to another, making it potentially expensive to do individual assignment checks. * gc.c (gc_assign_check): New function, formed from guts of gc_set. (gc_set): Now a trivial function, implemented via call to gc_assign_check. * gc.h (gc_assign_check): Declared. * lib.c (cons): Use setcheck instead of gc_mutated, since we are storing only two values into the existing cons: the car and the cdr. * struct.c (clear_struct): Use setcheck instead of gc_mutated, since we are just storing one value into the structure, the clear_val. The fact that we are storing it into multiple slots is irrelevant. * vm.c (vm_make_closure): Use setcheck instead of mut, using the new heap_vector as the child object with regard to the closure. Rationale: the only threat here is that when we allocate the heap vector, a GC is triggered which pushes the closure into the mature generation. Then the store of the heap vector into the closure is a wrong-way reference, with regard to generational GC. The elements in the vector are immaterial; they are older than both the closure and the vector, therefore their relationship to either object is a right-way reference. (vm_set, vm_sm_set): Replace mut by a setcheck between the vector from the display and the new value being stored in it. (vm_stab): Replace the gc_mutated check, which should have been a mut macro call, with a setcheck between the vm, and the binding being stored into the table. The gc_mutated should have been wrapped with an #if CONFIG_GEN_GC so we are fixing a build bug here: the code would have prevented TXR from being built with the generational GC disabled.
* compiler: bugfix: handle defpackage and such properly.Kaz Kylheku2018-11-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem is that the file compiler is emitting one big form that contains all of the compiled top-level forms. For obvious reasons, this doesn't work when that form contains symbols that are in a package which is defined by one of those forms; the compiled file will not load due to qualified symbols referencing a nonexistent package. The solution is to break up that big form when it contains forms that manipulate the package system in ways that possibly affect the read time of subsequent forms. * lib.c (delete_package): Use a non-destructive deletion on the *package-alist*, because we are going to be referring to this variable in the compiler to detect whether the list of packages has changed. * share/txr/stdlib/compiler.tl (%package-manip%): New global variable. This is a list of functions that manipulate the package system in suspicious ways. (user:compile-file): When compiling a form which is a call to any of the suspicious functions, add a :fence symbol into the compiled form list. Also do this if the evaluation of the compiled form modifies the *package-alist* variable. When emitting the list of forms into the output file, remove the :fence symbols and break it up into multiple lists along these fence boundaries. * txr.1: Documented the degenerate situation that can arise.
* buffers: implement copy-buf.Kaz Kylheku2018-11-041-0/+2
| | | | | | | | | | | * buf.c (copy_buf): New function. (buf_init): Register copy-buf intrinsic. * buf.h (copy_buf): Declared. * lib.c (copy): Handle BUF via copy_buf. * txr.1: Documented.
* hash: use full width unsigned type for hash values.Kaz Kylheku2018-07-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Throughout the hashing framework, hashes are reduced into the fixnum range, and returned as cnum. This is not necessary; only the hash-eql and hash-equal functions need to reduce hashes to fixnums. Let's make it ucnum everywhere else, using its full range (no reduction into the [0, NUM_MAX) range). * hash.c (struct hash_ops): hash_fun function pointer returns ucnum instead of cnum. (hash_double): Return unreduced ucnum. Obsolete #ifdef-s removed; the ucnum type gives us a pointer-wide unsigned integer on all platforms. (equal_hash, eql_hash): Return ucnum. Don't reduce values to fixnum range. Some of the way we combine hashes from recursive calls changes; we multiply by at most 2 not to lose too many bits. (eql_hash_op, cobj_eq_hash_op, hash_hash_op): Return ucnum. * hash.h (equal_hash): Declaration updated. * lib.c (cobj_handle_hash_op): Return value changes to ucnum. * lib.h (struct cobj_ops): Hash function pointer's return type changes. (cobj_eq_hash_op, cobj_handle_hash_op): Declarations updated. * struct.c (struct_inst_hash): Return value changes to ucnum.
* hashing: overhaul part 1.Kaz Kylheku2018-07-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hashing of buffers and character strings is being replaced with a seedable hash, providing a tool against denial of service attacks against hash tables. This commit lays most of the groundwork: most of the internal interface changes, and a new hashing implementation. What is missing is the mechanisms to do the seeding. * hash.c (struct hash_ops): Hash operation now takes a seed argument of type ucnum. (struct hash): New member, seed. (hash_str_limit): Default value changed to INT_MAX. A short value opens the gateway to an obvious collision attack whereby strings sharing the same 128 character prefix are entered into the same hash table, which will defeat any seedings strategy. (randbox): New static array. Values come from the Kazlib hash module, but are not used in exactly the same way. (hash_c_str, hash_buf): Now take a seed argument, and are rewritten. (equal_hash): Takes a seed, and passes it to hash_c_str, hash_buf and to recursive self calls. (eql_hash_op): New static function. Adapts the eql_hash operation, which doesn't take a seed, to the new interface that calls for a seed. (obj_eq_hash_op): Take a seed; ignore it. (hash_hash_op): Take a seed, pass it down to equal_hash. (hash_eql_ops): Wire hash functiono pointer to eql_hash_op instead of eql_hash. (make_hash): For now, intialize the hash's seed to zero. (make_similar_hash): Copy original hash's seed. (gethash_c, gethash_e, remhash): Pass hash table's seed to the hashing function. (hash_equal): Pass a seed of zero to equal_hash for now; this function will soon acquire an optional parameter for the seed. * hash.h (equal_hash): Declaration updated. * lib.c (cobj_handle_hash_op): Take seed argument, pass down. * lib.h (cobj_ops): Hash operation now takes seed. (cobj_eq_hash_op, cobj_handle_hash_op): Declarations updated. * struct.c (struct_inst_hash): Take seed argument, pass down. * tests/009/json.expected: Updated, because the hash table included in this output is now printed in a different order.
* C++ fixes related to recent Unicode work.Kaz Kylheku2018-05-181-1/+1
| | | | | | | * lib.c (chk_wrealloc): convert needs to be a coerce. * parser.l (grammar): Use yyg instead of yyscanner; the latter is the same pointer but of void * type.
* linenoise: switch to wide characters, support Unicode.Kaz Kylheku2015-09-221-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * lib.c (chk_wrealloc): New function. * lib.h (mem_t): Wrap with ifndef block. (MEM_T_DEFINED): New preprocessor symbol. (chk_wrealloc): Declared. * linenoise/linenoise.c (LINENOISE_MAX_DISP): Adjust to a reasonable value; just twice the number of abstract characters. The 8 factor had been chosen to cover the worst case that every character is mapped to a tab. (struct lino_state): Almost everything char typed turns to wchar_t. The TTY isn't referenced with Unix file descriptors, ifd and ofd, but abstract stream handles tty_ifs and tty_ofs. The ifs member isn't required any more since plain mode is handled via the tty_ifs stream. (mem_t): Declaration removed; now in linenoise.h. (chk_malloc, chk_realloc, chk_strdup_utf8): Declarations removed. (lino_os): New static structure. (nelem): New macro. (wcsnprintf): New static function. (enable_raw_mode, disable_raw_mode): Get Unix FD from stream using lino_os interface. (get_cursor_position, get_columns, handle_resize, record_undo, remove_noop_undo, restore_undo, undo_renumber_hist_idx, compare_completions, complete_line, lino_add_completion, next_hist_match, history_search, show_help, struct abuf, ab_append, ab_free, sync_data_to_buf, refresh_singleline, screen_rows, col_offset_in_str, refresh_multiline, scan_match_rev, scan_match_fwd, scan_fwd, find_nearest_paren, usec_delay, flash, yank_sel, delete_sel, edit_insert, edit_insert_str, edit_move_eol, edit_history_next, edit_delete, edit_backspace, edit_delete_prev_all, edit_delete_to_eol, edit_delete_line, edit_in_editor, edit, linenoise, lino_make, lino_cleanup. lino_free, free_hist, lino_hist_add, lino_hist_save, lino_set_result): Revised using streams, wide chars and lino_os interface. (lino_init): New function. * linenoise/linenoise.h (LINO_PAD_CHAR): New preprocessor symbol. (mem_t): Defined here. (MEM_T_DEFINED): New preprocessor symbol. (struct lino_os, lino_os_t): New structure. (lino_os_init): New macro. (struct lino_completions, lino_compl_cb_t, lino_atom_cb_t, lino_enter_cb_t): Switch to wchar_t. (lino_init): New function. (lino_add_completion, lino_make, linenoise, lino_hist_add, lino_hist_save, lino_hist_load, lino_set_result) * parser.c (find_matching_syms, provide_completions, provide_atom, is_balanced_line, repl): Adapt to wide character linenoise. (lino_fileno, lino_puts, lino_getch, lino_getl, lino_gets, lino_feof, lino_open, lino_open8, lino_fdopen, lino_close): New static functions. (linenoise_txr_binding): New static structure. (parse_init): Call lino_init, passing OS binding. * txr.1: Update text about the listener's limitations.
* compiler: replace "$" package hack.Kaz Kylheku2018-04-251-8/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | When compile-file writes emits the file, it does so with *package* bound to a temporary package named "$" so that all the symbols get fully qualified. Problem is, this is a valid package name and is added to the package list. While the package exists, symbols such as $:a could be interned. If such symbols occur in code being compiled, they get emitted using unqualified names. Let's introduce an internal interface for making an anonymous package which isn't on the list of package, and which has a name that results in bad syntax if it occurs in print. * eval.c (eval_init): Register sys:make-anon-package intrinsic. * lib.c (make_package_common): New static function. (make_package): Package construction and initialization code moved into make_package_common. (make_anon_package): New function. * lib.h (make_anon_package): Declared. * share/txr/stdlib/compiler.tl (usr:compile-file): When writing out translation, bind *package* to anonymous package from sys:make-anon-package.
* lib: new function vm-fun-p.Kaz Kylheku2018-04-071-0/+5
| | | | | | | | * eval.c (eval_init): vm-fun-p intrinsic registered. * lib.c (vm_fun_p): New function. * lib.h (vm_fun_p): Declared.
* vm: allow vm description to be callable as function.Kaz Kylheku2018-04-061-0/+4
| | | | | | | * lib.c (generic_funall): Handle vm-desc objects via vm_execute_toplevel. * vm.h (vm_desc_s, vm_closure_s): Declared.
* Application code is now in a package called pub.Kaz Kylheku2018-04-091-2/+4
| | | | | | | | | | | | | | | * lib.c (public_package): New variable. (obj_init): Protect public_package from gc. Initialize it with a package called "pub" which has the user package in its fallback list. * lib.h (public_package): Declared. * eval.c (eval_init): Initialize package_s to public_package rather than user_package, except in compat <= 190 mode. * txr.c (txr_main): Bind *package* to public_package rather than user_package, except in compat <= 190 mode.
* lib: get rid of preprocessor macros for packages.Kaz Kylheku2018-04-051-5/+5
| | | | | | | | | | | | | | | | | | | | | | | The identifiers user_package, system_package and keyword_package are preprocessor symbols that expand to other preprocessor symbols for no good reason. Time to get rid of this. * lib.c (system_package_var, keyword_package_var, user_package_var): Variables renamed to system_package, keyword_package and user_package. (symbol_package, keywordp, obj_init): Fix variable references to follow rename. * lib.h (keyword_package, user_package, system_package): Macros removed. (system_package_var, keyword_package_var, user_package_var): Variables renamed. * eval.c (eval_init): Fix variable references to follow rename. * parser.y (sym_helper): Likewise.
* printer: improve object formatting.Kaz Kylheku2018-04-051-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is an issue with the printer in that it produces output whereby objects continue on the same line after a multi-line object, e.g: (foo (foobly bar xyzzy quux) (oops same line)) rather than: (foo (foobly bar xyzzy quux) (oops same line)) There is a simple fix for this: set a flag to force a line break on the next width-check operation whenever an object has been broken into multiple lines. width-check can return a Boolean indication whether it generated a line break, and so aggregate object printing routines can tell whether their object has been broken into lines, and set the flag. * stream.h (struct strm_base): New member, force_break. (force_break): Declared. * stream.c (strm_base_init): Extent initializer to cover force_break flag. (put_string, put_char): Clear the force_break flag whenever we hit column zero. (width_check): If indent mode is on, and force_break is true, generate a break. Clear force_break. (force_break): New function. (stream_init): Register force-break intrinsic. * buf.c (buf_print): Set the force break flag if the buffer was broken into multiple lines. * hash.c (hash_print_op): Set the force break flag if the hash was broken into multiple lines. * lib.c (obj_print_impl): Same logic for lists. * struct.c (struct_inst_print): Same logic for structs. * tests/009/json.expected, tests/011/macros-2.expected, tests/012/struct.tl, tests/017/glob-zarray.expected: Update expected textual output to reflect new formatting.
* regex: read/print bug: escaped double quote.Kaz Kylheku2018-04-041-3/+8
| | | | | | | | | | | | | | | | | | | Because the regex printer wrongly uses out_str_char (for the sake of borrowing its semicolon-notation processing) when a regex prints, all characters that require escaping in a string literal get escaped, which includes the " character. Unfortunately the \" sequence which results is rejected by the regex parser. * lib.c (out_str_char): Kludge: add extra argument to distinguish regex use versus string use, and treat the double quote accordingly. (out_str_readable): Give 0 arg to new param of out_str_char. * lib.h (out_str_char): Declaration updated. * regex.c (print_class_char, print_rec): Pass 1 to new param of out_str_char.
* packages: fix package prefix read/print issue.Kaz Kylheku2018-04-031-2/+44
| | | | | | | | | | | | | | | | | | | | | | | | Suppose that we have two symbols of the same name, in two packages: foo:sym and bar:sym. Suppose that the bar package has foo in its package fallback list, and suppose bar is the current package. Then bar:sym prints without a package prefix, as just sym. However, this is potentially ambiguous. Suppose that bar:sym is written to a file as just sym. Then later the file is read into a fresh image in a situation in which bar:sym has not yet been interned, but foo:sym already exists. In this situation, sym will just resolve to foo:sym. The printer must detect this ambiguous situation. If a symbol is present in a package, but a same-named symbol is in the fallback list; or if a symbol is visible in the fallback list, but a same-named symbol is present in the package, then a package prefix should be printed. * lib.c (symbol_needs_prefix): New function. (unquote_star_check, obj_print_impl): Use symbol_needs_prefix rather than symbol_visible. * lib.h (symbol_needs_prefix): Declared.
* lib: elminate reduce_right from expt.Kaz Kylheku2018-03-291-1/+9
| | | | | | * lib.c (rexpt): New static function. (exptv): Create reversed arguments on the stack, then process with nary_op and the rexpt shim.
* lib: eliminate reduce-left from n-ary math ops.Kaz Kylheku2018-03-291-36/+44
| | | | | | | | | | | | | | | | Using reduce-left is inefficient; it conses up a list. We can decimate the stacked arguments without consing. * lib.c (nary_op): Replace reduce_left with iteration. (nary_simple_op): New function, variant of nary_op useable by functions that have a mandatory argument passed separately from the argument list. (minusv, divv): Replace reduce_left with iteration. (maxv, minv): Replace reduce_left with nary_simple_op. (abso_self): New static function. (gcdv, lcmv): Replace reduce_left with nary_op. * lib.h (nary_simple_op): Declared.
* vm: funcall wrappers need to check arg count.Kaz Kylheku2018-03-201-0/+16
| | | | | | | | | | | | | | There is the possibility that funcall1 through funcall3 will pass the wrong number of arguments to vm_execute_closure, which doesn't do any checks. This is only for the code paths which don't go through generic_funcall, when the function has no optional arguments. * lib.c (funcall1, funcall2, funcall3, funcall4): Check that the closure doesn't require more fixed parameters than we're giving it in the variadic case, or that it doesn't require a different number of fixed parameters we're giving it in the non-variadic case.
* lib: new ldiff function.Kaz Kylheku2018-03-201-1/+56
| | | | | | | | | | * eval.c (eval_init): Use the old ldiff function under compatibility with 190 or lower. * lib.c (ldiff): Rewritten. (ldiff_old): New function, copy of previous version of ldiff. * lib.h (ldiff_old): Declared.
* vm: variadic arg closures bug 1/3.Kaz Kylheku2018-03-191-1/+1
| | | | | | | * lib.c (func_vm): fix incorrect assignment of int boolean to one-bit-wide bitfield. The argument isn't zero or one; in fact the value 256 gets passed down representing true. This gets truncated to one bit which is zero.
* vm: handle FVM function type thorughout run-time.Kaz Kylheku2018-03-161-1/+46
| | | | | | | | | | | | | | | | * gc.c (mark_obj): Recognize FVM functions and mark their vm_desc. * lib.c (equal): Handle equality for FVM. If the environment pointers are equal, consider the functions equal. (funcall, funcall1, funcall2, funcall3, funcall4): Recognize and call FVM functions. However, there is a lack of robustness here that needs to be addressed: vm_execute_closure doesn't check whether there are too many or not enough arguments. Interpreted functions have a run-time check inside bind_args. (obj_print_impl): Don't print VM functions as #<intrinsic fun...> but rather #<vm fun>.
* vm: bugfix: wrong setup of closure param counts.Kaz Kylheku2018-03-151-1/+1
| | | | | | * lib.c (func_vm): The fixparam argument is the total number of fixed parameters, including optionals. That argument must be stored in the same-named member of the function structure.