txr - TXR: A data munging language.

	Commit message (Collapse)	Author	Age	Files	Lines
*	Optimize hash operation with unsafe car/cdr.	Kaz Kylheku	2019-02-14	1	-0/+2
	The associative lists that make up the chains of a hash table are guaranteed to be made of conses. We can use unsafe versions of car, cdr, rplaca and rplacd to speed up hash operations.
	* eval.c (op_dohash): Use unsafe operations on hash cell.
	* filter.c (trie_compress, regex_from_trie): Likewise.
	* hash.c (hash_equal_op, hash_print_op, hash_mark, hash_grow, hash_assoc, hash_assql, copy_hash_chain, gethash, inhash, gethash_n, sethash, remhash, hash_next, maphash, do_weak_tables, group_by, group_reduce, hash_keys_lazy, hash_keys, hash_values_lazy, hash_values, hash_pairs_lazy, hash_pairs, hash_alist_lazy, hash_uni, hash_diff, hash_symdiff, hash_isec, hash_subset, hash_update, hash_update_1, hash_revget): Likewise.
	* lib.c (us_rplaca, us_rplacd): New functions. (package_local_symbols, package_foreign_symbols, where, populate_obj_hash, obj_hash_merge): Use unsafe operations on hash cell.
	* lib.h (us_rplaca, us_rplacd): Declared.
	* parser.c (circ_backpatch, get_visible_syms): Use unsafe operations on hash cell.
	* struct.c (method_name, get_slot_syms): Likewise.
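The difference between a checked and an unchecked accessor can be sketched as follows. This is a minimal illustration on a simplified, hypothetical object model: the type `obj` and the functions `checked_car`, `unchecked_car` and `unchecked_rplaca` are invented for the example and are not TXR's actual definitions of car, us_car or us_rplaca.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified object model; not TXR's real representation. */
typedef enum { TYPE_CONS, TYPE_OTHER } type_tag;

typedef struct obj {
    type_tag type;
    struct obj *car, *cdr;
} obj;

/* Checked accessor: tests the type on every call. This sketch returns
   NULL on a mismatch; a real runtime would throw a type error. */
obj *checked_car(obj *o)
{
    if (o == NULL || o->type != TYPE_CONS)
        return NULL;
    return o->car;
}

/* Unchecked accessor: the caller guarantees that o is a cons, so no
   run-time test is made. The assert only documents the contract and
   disappears when NDEBUG is defined. */
obj *unchecked_car(obj *o)
{
    assert(o != NULL && o->type == TYPE_CONS);
    return o->car;
}

/* Unchecked destructive update of the car field, same contract. */
obj *unchecked_rplaca(obj *cell, obj *new_car)
{
    assert(cell != NULL && cell->type == TYPE_CONS);
    cell->car = new_car;
    return cell;
}
```

The unchecked forms are only appropriate where the data structure itself guarantees the type, as the commit states for the cells of a hash chain.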
*	symdiff: new function.	Kaz Kylheku	2019-02-14	1	-0/+3
	* eval.c (eval_init): Register symdiff intrinsic.
	* lib.c (symdiff): New function.
	* lib.h (us_car_p, us_cdr_p): New inline functions. (symdiff): Declared.
	* txr.1: Documented, also fixing issues not related to symdiff doc.
*	optimizing diff, isec and uni for non-lists.	Kaz Kylheku	2019-02-13	1	-0/+1
	Also, these functions now support hashes.
	* eval.c (eval_init): Register only the deprecated set-diff to the set_diff function. The diff intrinsic is now going to the new function named diff.
	* lib.c (diff): New function. (isec, uni): Rewritten to use seq_iter_t.
	* lib.h (diff): Declared.
	* txr.1: Documentation updated.
*	Framework for iterating over sequences.	Kaz Kylheku	2019-02-13	1	-0/+15
	This has been needed for a while. While we have seq_info for classifying sequences to nicely dispatch code into various cases, those cases duplicate code. The code base could benefit from generic traversal.
	* lib.c (seq_iter_get_nil, seq_iter_get_list, seq_iter_get_vec, seq_iter_get_hash): New static functions. (seq_iter_rewind, seq_iter_init): New functions.
	* lib.h (struct seq_iter, seq_iter_t): New struct type and its typedef name. (seq_iter_init, seq_iter_rewind): Declared. (seq_get): New inline function.
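The idea of generic traversal through a per-kind "get" function can be sketched like this. The sketch is an assumption-laden illustration, not the actual struct seq_iter: the type `iter`, its fields, and the functions `iter_init_vec`, `iter_init_list` and `sum_all` are hypothetical names, and the element type is reduced to int.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked list node standing in for a Lisp list. */
typedef struct node { int val; struct node *next; } node;

/* Hypothetical iterator: one "get" function pointer chosen at init time,
   plus the state needed by each sequence kind. */
typedef struct iter iter;
struct iter {
    int (*get)(iter *it, int *out);  /* 1 and stores item, or 0 at end */
    const int *vec;                  /* array traversal state */
    size_t len, idx;
    const node *cur;                 /* list traversal state */
};

int iter_get_vec(iter *it, int *out)
{
    if (it->idx >= it->len)
        return 0;
    *out = it->vec[it->idx++];
    return 1;
}

int iter_get_list(iter *it, int *out)
{
    if (it->cur == NULL)
        return 0;
    *out = it->cur->val;
    it->cur = it->cur->next;
    return 1;
}

void iter_init_vec(iter *it, const int *vec, size_t len)
{
    assert(it != NULL);
    it->get = iter_get_vec;
    it->vec = vec; it->len = len; it->idx = 0;
    it->cur = NULL;
}

void iter_init_list(iter *it, const node *head)
{
    assert(it != NULL);
    it->get = iter_get_list;
    it->vec = NULL; it->len = 0; it->idx = 0;
    it->cur = head;
}

/* A consumer written once, with no per-kind cases. */
long sum_all(iter *it)
{
    long total = 0;
    int x;
    while (it->get(it, &x))
        total += x;
    return total;
}
```

The consumer `sum_all` never inspects the sequence kind, which is the duplication the commit message says the framework is meant to remove.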
*	sum and prod take keyfun argument.	Kaz Kylheku	2019-02-02	1	-2/+2
	* eval.c (eval_init): Adjust registrations of sum and prod to be binary functions with an optional argument.
	* lib.c (nary_op_keyfun, sumv, prodv): New static functions. (sum, prod): Implement optional keyfun argument via sumv and prodv helpers.
	* lib.h (sum, prod): Declarations updated.
	* txr.1: Documentation updated.
*	Extend infrastructure for double_intptr_t.	Kaz Kylheku	2019-01-25	1	-0/+5
	We support an unsigned version of the type, and add functions for converting between Lisp values and both types.
	* arith.c (bignum_dbl_uipt): New function, unsigned companion to existing bignum_dbl_ipt. (c_dbl_num, c_dbl_unum): New functions.
	* arith.h (bignum_dbl_uipt, c_dbl_num, c_dbl_unum): Declared.
	* configure (superulong_t, SIZEOF_DOUBLE_INTPTR, DOUBLE_INTPTR_MAX, DOUBLE_INTPTR_MIN, DOUBLE_UINTPTR_MAX, double_uintptr_t): New definitions going into config.h.
	* lib.h (dbl_cnum, dbl_ucnum): New typedefs: double-sized analogs of cnum and ucnum.
	* mpi/mpi.c (mp_set_double_uintptr, mp_get_double_uintptr, mp_get_double_intptr): New functions. (s_mp_in_big_range): New static function. (mp_in_double_intptr_range, mp_in_double_uintptr_range): New functions.
	* mpi/mpi.h (mp_set_double_uintptr, mp_get_double_intptr, mp_get_double_uintptr, mp_in_double_intptr_range, mp_in_double_uintptr_range): Declared.
*	Copyright year bump 2019.	Kaz Kylheku	2019-01-16	1	-1/+1
	* LICENSE, LICENSE-CYG, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/asm.tl, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/compiler.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/doloop.tl, share/txr/stdlib/error.tl, share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl, share/txr/stdlib/package.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/pmac.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/trace.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/vm-param.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr: Extended Copyright line to 2019.
*	New function: square.	Kaz Kylheku	2019-01-05	1	-0/+1
	The square function calculates (* x x) but is faster for bignum integers by taking advantage of mp_sqr.
	* arith.c (square): New function.
	* eval.c (eval_init): Register square as intrinsic.
	* lib.h (square): Declared.
	* txr.1: Documented.
*	nzerop: new function.	Kaz Kylheku	2018-12-13	1	-0/+1
	* arith.c (nzerop): New function.
	* eval.c (eval_init): Register nzerop intrinsic.
	* lib.h (nzerop): Declared.
	* txr.1: Documented.
*	New range testing functions.	Kaz Kylheku	2018-11-27	1	-0/+2
	* eval.c (eval_init): Register in-range and in-range* intrinsics.
	* lib.c (in_range, in_range_star): New functions.
	* lib.h (in_range, in_range_star): Declared.
	* txr.1: Documented.
*	logxor: fix seriously broken function.	Kaz Kylheku	2018-11-25	1	-0/+1
	Reported by Guillaume le Vaillant.
	* arith.c (logxor): Fix broken behavior when the arguments are the same nonzero fixnum, or the same bignum object. (logxor_old): New function: verbatim copy of previous logxor.
	* eval.c (eval_init): Register logxor intrinsic to the broken function if compatibility is 202 or less.
	* txr.1: Compat note added.
*	math: remove redundant type checks from NUM.	Kaz Kylheku	2018-11-16	1	-0/+4
	* arith.c (plus, minus, neg, abso, signum, mul, trunc, mod, floordiv, plusp, minusp, evenp, oddp, gt, lt, ge, le, numeq, expt, exptmod, isqrt, gcd, flo_int, logand, logior, logxor, comp_trunc, lognot, logtrunc, sign_extend, ash, bit, logcount, tofloat, toint, width, poly, rpoly): Use the unchecked c_n rather than c_num on quantities that are known to be of NUM and CHR type.
	* lib.h (c_n): New inline function.
*	copy-fun: duplicate a function, with own environment.	Kaz Kylheku	2018-11-13	1	-0/+1
	* eval.c (deep_copy_env): New function. (eval_init): Register copy-fun intrinsic.
	* eval.h (deep_copy_env): Declared.
	* lib.c (copy_fun): New function.
	* lib.h (copy_fun): Declared.
	* vm.c (vm_copy_closure): New function.
	* vm.h (vm_copy_closure): Declared.
	* txr.1: Documented copy-fun.
*	Better identify functions that misuse COBJ-s and hashes.	Kaz Kylheku	2018-11-07	1	-3/+3
	In this patch, the cobj_handle, cobj_ops and variants of gethash get an additional argument to identify the caller. Many functions are updated to pass this down.
	* buf.c (buf_strm): Pass self name to cobj_handle.
	* eval.c (env_fbind, env_vbind, rt_defvarl, me_case): Pass self name to gethash_c or gethash_e. (load): Pass self name to read_eval_stream and read_compiled_file. (reg_symacro): Pass situation-identifying string to gethash_c.
	* ffi.c (ffi_type_struct_checked, ffi_closure_struct_checked, ffi_call_desc_checked, uni_struct_checked): Take self name parameter, and pass down to cobj_handle. (ffi_get_type, ffi_get_lisp_type): Take self name and pass down to ffi_type_struct_checked. (union_get_ptr): Take self name and pass to uni_struct_checked. (ffi_union_in, ffi_union_put): Pass self name to union_get_ptr. (ffi_type_compile): Pass self name to ffi_get_lisp_type. (ffi_make_call_desc): Pass self name to ffi_type_struct_checked, ffi_get_type and ffi_call_desc_checked. (ffi_make_closure): Pass self name to ffi_call_desc_checked. (ffi_closure_get_fptr): Take self name, pass to ffi_closure_struct_checked. (ffi_typedef, ffi_size, ffi_alignof, ffi_offsetof, ffi_arraysize, ffi_elemsize, ffi_elemtype, ffi_put_into, ffi_put, ffi_in, ffi_get, ffi_out, make_carray): Pass self name to ffi_closure_struct_checked. (carray_struct_checked): Take self name, pass to cobj_handle. (carray_set_length, carray_dup, carray_own, carray_free, carray_type, length_carray, copy_carray, carray_ptr, buf_carray, vec_carray, list_carray, carray_ref, carray_refset, carray_sub, carray_replace, carray_get_common, carray_put_common, unum_carray, num_carray, put_carray, fill_carray): Pass self name to carray_struct_checked. (carray_blank, carray_buf, carray_cptr): Pass self name to ffi_type_struct_checked. (carray_pun): Pass self name to carray_struct_checked and ffi_type_struct_checked. (make_union): Pass self name to ffi_type_struct_checked. (union_members, union_get, union_put, union_in, union_out): Pass self name to uni_struct_checked. (make_zstruct, zero_fill, put_obj, get_obj, fill_obj): Pass self-name to ffi_type_struct_checked.
	* ffi.h (ffi_closure_get_fptr, union_get_ptr): Declarations updated.
	* filter.c (trie_add): Pass self-name to gethash_l.
	* hash.c (make_similar_hash, copy_hash, hash_count, get_hash_userdata, set_hash_userdata, hash_begin, hash_next, hash_uni, hash_diff, hash_isec): Pass self name to cobj_handle. (gethash_c, gethash_e): Take self name parameter and pass down to cobj_handle. (gethash_f): Take self parameter and pass down to gethash_e. (gethash, inhash, gethash_n, sethash, pushhash, remhash, clearhash, hash_update_1): Pass self name to gethash_e or gethash_c.
	* hash.h (gethash_c, gethash_e, gethash_f): Declarations updated. (gethash_l): Take self name, and pass down to gethash_c.
	* lib.c (class_check): Take self name parameter and use in type mismatch diagnostic. (use_sym, unuse_sym, symbol_needs_prefix, find_symbol, intern, unintern, intern_fallback, unique, in, sel, obj_print_impl, populate_obj_hash, obj_hash_merge): Pass self name to gethash_f or gethash_l. (symbol_visible, obj_init): Pass situation-identifying string to gethash_e. (cobj_handle, cobj_ops): Take self name parameter and pass down to class_check.
	* lib.h (class_check, cobj_handle, cobj_ops): Declarations updated.
	* match.c (v_load): Pass self name to read_compiled_file and read_eval_stream.
	* parser.c (get_parser_impl): Take self name and pass to cobj_handle. (ensure_parser): Pass situation-identifying string to gethash_c. (parser_circ_def): Pass self-name to gethash_c. (lisp_parser_impl): Pass self name to get_parser_impl and class_check. (lisp_parse, nread, iread): Pass self-name to lisp_parser_impl. (read_file_common): Take self name parameter and pass down to get_parser_impl. (read_eval_stream, read_compiled_file): Take self name and pass down to read_file_common. (load_rcfile): Pass situation-identifying string to read_eval_stream. (get_visible_syms): Pass situation-identifying string to gethash_c. (parser_errors, parser_eof): Pass self name to cobj_handle.
	* parser.h (read_eval_stream, read_compiled_file): Declarations updated.
	* parser.y (rlset): Pass self name to gethash_c.
	* rand.c (make_random_state, random_state_get_vec, random_fixnum, random_float): Pass self name to cobj_handle.
	* regex.c (regex_source, regex_print, regex_run): Pass self-name to cobj_handle. (regex_machine_init): Take self name param and pass to cobj_handle. (search_regex, match_regex, match_regex_right, regex_prefix_match, read_until_match): Pass self-name to regex_machine_init.
	* stream.c (stdio_get_fd): Pass self name to cobj_handle. (generic_get_line): Get COBJ operations via unsafe, direct object access rather than cobj_ops. (set_mode_props): Get object handle via unsafe, direct object access. (stream_fd, sock_family, sock_type, sock_peer, set_sock_peer, get_string_from_stream, get_list_from_stream, stream_set_prop, stream_get_prop, close_stream, get_error, get_error_str, clear_error, get_line, get_char, get_byte, unget_char, unget_byte, put_buf, fill_buf, put_string, put_char, put_byte, flush_stream, seek_stream, truncate_stream, get_indent_mode, test_set_indent_mode, set_indent_mode, get_indent, set_indent, inc_indent, width_check, force_break, get_set_ctx, get_ctx): Pass self name to cobj_ops. (make_delegate_stream): Take self name parameter, pass down to cobj_ops. (record_adapter): Pass self name down to make_delegate_stream. (format): Pass self name to class_check.
	* struct.c (stype_handle): Pass self name to cobj_handle. (make_struct_type): Pass self name to class_check.
	* txr.c (read_eval_stream_noerr): Take self name parameter, pass to read_eval_stream. (txr_main): Pass situation-identifying string to read_compiled_file and read_eval_stream_noerr.
	* unwind.c (revive_cont): Pass self-name to cobj_handle.
	* vm.c (vm_desc_struct): Take self name parameter, pass to cobj_handle. (vm_desc_nlevels, vm_desc_nregs, vm_desc_bytecode, vm_desc_datavec, vm_desc_symvec, vm_execute_toplevel, vm_execute_closure, vm_closure_entry): Pass self name to vm_desc_struct. (vm_closure_struct): Take self name parameter, pass to cobj_handle.
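The pattern of threading a "self" name down to a low-level checker, so that the diagnostic names the Lisp-level caller, might look like the following. This is a hedged sketch: `cobj`, `handle_of` and `last_error` are hypothetical stand-ins, not the real cobj_handle or class_check, and the real code throws an exception rather than writing to a buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical opaque object: a class name plus a C handle. */
typedef struct {
    const char *cls;
    void *handle;
} cobj;

/* Hypothetical error sink, used here in place of exception throwing. */
char last_error[128];

/* Returns the handle if obj belongs to class cls. Otherwise records a
   diagnostic that begins with the caller's name, passed in as self. */
void *handle_of(const char *self, cobj *obj, const char *cls)
{
    assert(self != NULL && cls != NULL);
    if (obj == NULL || strcmp(obj->cls, cls) != 0) {
        snprintf(last_error, sizeof last_error,
                 "%s: object is not of type %s", self, cls);
        return NULL;
    }
    last_error[0] = '\0';
    return obj->handle;
}
```

Without the extra argument, every caller of the checker would produce the same message, and the user could not tell which function received the wrong object.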
*	lib: remove unused type checking functions.	Kaz Kylheku	2018-11-07	1	-2/+0
	* lib.c (type_check2, type_check3): Functions removed.
	* lib.h (type_check2, type_check3): Declarations removed.
*	type_check: take function name arg.	Kaz Kylheku	2018-11-07	1	-3/+3
	* arith.c (flo_int): Pass down name to type_check.
	* eval.c (copy_env, env_fbind, env_vbind, env_vb_to_fb, func_get_name, lexical_var_p, lexical_fun_p, lexical_lisp1_binding, squash_menv_deleting_range, op_upenv): Pass relevant Lisp function name to type_check. (lookup_global_var, lookup_sym_lisp1, lookup_fun, lookup_mac, lookup_symac, lookup_symac_lisp1): For these widely used functions, pass situational prefix in place of function name. They may get a function name argument in the future.
	* gc.c (gc_finalize): Pass function name to type_check.
	* lib.c (throw_mismatch): Take function name argument, incorporate into message. (lcons_fun, c_flo, string_extend, symbol_name, symbol_package, get_package, package_name, func_get_form, func_get_env, func_set_env, vec_set_length, length_vec, size_vec, list_vec, lazy_str_force, lazy_str_force_upto, lazy_str_get_trailing_list, from, to, set_from, set_to): Pass relevant Lisp function name to type_check. (symbol_setname, symbol_visible): Pass indication of internal error into type_check, since this doesn't pertain to any Lisp function being wrong.
	* lib.h (throw_mismatch): Declaration updated. (type_check): Take new parameter and pass down to throw_mismatch.
	* signal.c (set_sig_handler): Pass name down to type_check.
*	symbol_needs_prefix: take function name argument.	Kaz Kylheku	2018-11-07	1	-1/+1
	* lib.c (symbol_needs_prefix): New parameter. (unquote_star_check, obj_print_impl): Pass Lisp function name to symbol_needs_prefix.
	* lib.h (symbol_needs_prefix): Declaration updated.
*	math: improve error diagnosis.	Kaz Kylheku	2018-11-07	1	-1/+1
	More streamlined code, better identification of functions.
	* arith.c (not_number, not_integer, invalid_ops, invalid_op, divzero): New static functions. (num_to_buffer, bignum_len, plus, minus, neg, abso, signum, mul, trunc1, mod, floordiv, round1, roundiv, divi, zerop, plusp, minusp, evenp, oddp, gt, lt, ge, le, numeq, expt, exptmod, floorf, ceili, sine, cosi, tang, asine, acosi, atang, loga, logten, logtwo, expo, sqroot, int_flo, flo_int, cum_norm_dist, inv_cum_norm): Establish function's Lisp name as self variable. Use new static functions for reporting common errors. Pass function name to new argument of c_flo function.
	* buf.c (buf_put_float, buf_put_double): Pass function's Lisp name to c_flo function.
	* ffi.c (ffi_float_put, ffi_double_put): Likewise.
	* lib.c (c_flo): Takes new argument, name of calling function.
	* lib.h (c_flo): Declaration updated.
	* stream.c (formatv): Pass function name to c_flo.
*	gc: eliminate most uses of gc_mutated.	Kaz Kylheku	2018-11-06	1	-0/+2
	The code is using gc_mutated in situations that resemble assignment: a value is stored into a slot in some object. These situations should be handled using the same logic as embodied in the gc_set function. This is because gc_set will consider both objects, and in many cases will not have to do anything special. E.g. if an immature object is stored into another immature object, or mature into immature, or mature into mature. Whereas gc_mutated is a "just in case" function which forces the garbage collector to traverse the indicated object, if that object is mature. In this patch we refactor gc_set to expose its underlying logic with a somewhat more flexible function called gc_assign_check. We put that behind a conditionally defined macro called setcheck, and then use that to replace invocations of the mut macro in various places. The only uses of gc_mutated that remain are in the bulk vector assignment and copy_struct: operations in which potentially many element values are migrated from one aggregate object to another, making it potentially expensive to do individual assignment checks.
	* gc.c (gc_assign_check): New function, formed from guts of gc_set. (gc_set): Now a trivial function, implemented via call to gc_assign_check.
	* gc.h (gc_assign_check): Declared.
	* lib.c (cons): Use setcheck instead of gc_mutated, since we are storing only two values into the existing cons: the car and the cdr.
	* struct.c (clear_struct): Use setcheck instead of gc_mutated, since we are just storing one value into the structure, the clear_val. The fact that we are storing it into multiple slots is irrelevant.
	* vm.c (vm_make_closure): Use setcheck instead of mut, using the new heap_vector as the child object with regard to the closure. Rationale: the only threat here is that when we allocate the heap vector, a GC is triggered which pushes the closure into the mature generation. Then the store of the heap vector into the closure is a wrong-way reference, with regard to generational GC. The elements in the vector are immaterial; they are older than both the closure and the vector, therefore their relationship to either object is a right-way reference. (vm_set, vm_sm_set): Replace mut by a setcheck between the vector from the display and the new value being stored in it. (vm_stab): Replace the gc_mutated check, which should have been a mut macro call, with a setcheck between the vm, and the binding being stored into the table. The gc_mutated should have been wrapped with an #if CONFIG_GEN_GC so we are fixing a build bug here: the code would have prevented TXR from being built with the generational GC disabled.
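The case analysis described above (only a store of an immature object into a mature one needs special handling) can be sketched as a small write barrier. Everything here is hypothetical: `gobj`, `remembered` and `assign_check` are illustrative names, and what TXR's gc_assign_check actually records, and where, is not stated in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heap object with a generation number:
   0 = immature (young), 1 = mature (old). */
typedef struct {
    int generation;
} gobj;

/* Hypothetical remembered set: young objects referenced from old ones. */
#define REMEMBERED_MAX 16
gobj *remembered[REMEMBERED_MAX];
size_t remembered_count;

/* Called when child is stored into a slot of parent. Returns 1 if the
   store created a wrong-way (old to young) reference and was recorded,
   0 if nothing had to be done. */
int assign_check(gobj *parent, gobj *child)
{
    assert(parent != NULL);
    if (child == NULL)
        return 0;
    if (parent->generation > 0 && child->generation == 0) {
        assert(remembered_count < REMEMBERED_MAX);
        remembered[remembered_count++] = child;
        return 1;
    }
    /* immature into immature, mature into immature, mature into mature:
       no action needed */
    return 0;
}
```

Because the check looks at both objects, three of the four combinations cost only a comparison, which is the advantage the commit claims over unconditionally flagging the parent as mutated.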
*	hash: use full width unsigned type for hash values.	Kaz Kylheku	2018-07-06	1	-3/+3
	Throughout the hashing framework, hashes are reduced into the fixnum range, and returned as cnum. This is not necessary; only the hash-eql and hash-equal functions need to reduce hashes to fixnums. Let's make it ucnum everywhere else, using its full range (no reduction into the [0, NUM_MAX) range).
	* hash.c (struct hash_ops): hash_fun function pointer returns ucnum instead of cnum. (hash_double): Return unreduced ucnum. Obsolete #ifdef-s removed; the ucnum type gives us a pointer-wide unsigned integer on all platforms. (equal_hash, eql_hash): Return ucnum. Don't reduce values to fixnum range. Some of the way we combine hashes from recursive calls changes; we multiply by at most 2 not to lose too many bits. (eql_hash_op, cobj_eq_hash_op, hash_hash_op): Return ucnum.
	* hash.h (equal_hash): Declaration updated.
	* lib.c (cobj_handle_hash_op): Return value changes to ucnum.
	* lib.h (struct cobj_ops): Hash function pointer's return type changes. (cobj_eq_hash_op, cobj_handle_hash_op): Declarations updated.
	* struct.c (struct_inst_hash): Return value changes to ucnum.
*	hashing: overhaul part 1.	Kaz Kylheku	2018-07-04	1	-3/+3
	Hashing of buffers and character strings is being replaced with a seedable hash, providing a tool against denial of service attacks against hash tables. This commit lays most of the groundwork: most of the internal interface changes, and a new hashing implementation. What is missing are the mechanisms to do the seeding.
	* hash.c (struct hash_ops): Hash operation now takes a seed argument of type ucnum. (struct hash): New member, seed. (hash_str_limit): Default value changed to INT_MAX. A short value opens the gateway to an obvious collision attack whereby strings sharing the same 128 character prefix are entered into the same hash table, which will defeat any seeding strategy. (randbox): New static array. Values come from the Kazlib hash module, but are not used in exactly the same way. (hash_c_str, hash_buf): Now take a seed argument, and are rewritten. (equal_hash): Takes a seed, and passes it to hash_c_str, hash_buf and to recursive self calls. (eql_hash_op): New static function. Adapts the eql_hash operation, which doesn't take a seed, to the new interface that calls for a seed. (cobj_eq_hash_op): Take a seed; ignore it. (hash_hash_op): Take a seed, pass it down to equal_hash. (hash_eql_ops): Wire hash function pointer to eql_hash_op instead of eql_hash. (make_hash): For now, initialize the hash's seed to zero. (make_similar_hash): Copy original hash's seed. (gethash_c, gethash_e, remhash): Pass hash table's seed to the hashing function. (hash_equal): Pass a seed of zero to equal_hash for now; this function will soon acquire an optional parameter for the seed.
	* hash.h (equal_hash): Declaration updated.
	* lib.c (cobj_handle_hash_op): Take seed argument, pass down.
	* lib.h (cobj_ops): Hash operation now takes seed. (cobj_eq_hash_op, cobj_handle_hash_op): Declarations updated.
	* struct.c (struct_inst_hash): Take seed argument, pass down.
	* tests/009/json.expected: Updated, because the hash table included in this output is now printed in a different order.
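To illustrate what "seedable" means for a string hash, here is a minimal sketch that mixes a seed into the initial state of an FNV-1a style hash. This is explicitly not TXR's algorithm (which, per the commit, is built on a randbox table); `seeded_hash` and `bucket_of` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Seeded variant of 64-bit FNV-1a, used purely as an illustration.
   The seed perturbs the starting state, so the same string lands in
   different buckets under different seeds. */
uint64_t seeded_hash(const char *str, uint64_t seed)
{
    uint64_t h;
    assert(str != NULL);
    h = 14695981039346656037ULL ^ seed;  /* FNV offset basis mixed with seed */
    for (; *str != '\0'; str++) {
        h ^= (unsigned char) *str;
        h *= 1099511628211ULL;           /* FNV prime */
    }
    return h;
}

/* Maps a key to a bucket index for a table that carries its own seed. */
size_t bucket_of(const char *key, uint64_t seed, size_t nbuckets)
{
    assert(nbuckets > 0);
    return (size_t) (seeded_hash(key, seed) % nbuckets);
}
```

An attacker who does not know a table's seed cannot precompute a set of keys that all collide in it, which is the denial-of-service defense the commit describes. Note also that the whole string is hashed, in line with the change of hash_str_limit to INT_MAX.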
*	logcount: new function.	Kaz Kylheku	2018-05-18	1	-0/+1
	This is in ANSI CL; potentially useful and hard to implement efficiently in user code.
	* arith.c (logcount): New function.
	* eval.c (eval_init): Register logcount intrinsic.
	* lib.h (logcount): Declared.
	* mpi/mpi.c (s_mp_count_ones): New static function. (mp_count_ones): New function.
	* mpi/mpi.h (mp_count_ones): Declared.
	* txr.1: Documented.
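For fixed-width integers, the bit counting that logcount performs can be sketched as below. The sketch follows the ANSI CL definition (count 1 bits for non-negative integers, 0 bits in the two's complement form for negative ones) and assumes TXR's logcount matches it; `count_ones` and `logcount64` are hypothetical names, and the bignum path through mp_count_ones is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Counts set bits by repeatedly clearing the lowest set bit. */
int count_ones(uint64_t x)
{
    int n = 0;
    while (x != 0) {
        x &= x - 1;   /* clears the lowest 1 bit */
        n++;
    }
    assert(n >= 0 && n <= 64);
    return n;
}

/* CL-style logcount on a 64-bit integer: for a negative value, count
   the 0 bits of its two's complement form, i.e. the 1 bits of ~v. */
int logcount64(int64_t v)
{
    if (v < 0)
        return count_ones(~(uint64_t) v);
    return count_ones((uint64_t) v);
}
```

The loop in `count_ones` runs once per set bit, which is the kind of low-level trick that is awkward to express efficiently in user-level Lisp code.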
*	linenoise: switch to wide characters, support Unicode.	Kaz Kylheku	2015-09-22	1	-0/+4
	* lib.c (chk_wrealloc): New function.
	* lib.h (mem_t): Wrap with ifndef block. (MEM_T_DEFINED): New preprocessor symbol. (chk_wrealloc): Declared.
	* linenoise/linenoise.c (LINENOISE_MAX_DISP): Adjust to a reasonable value; just twice the number of abstract characters. The 8 factor had been chosen to cover the worst case that every character is mapped to a tab. (struct lino_state): Almost everything char typed turns to wchar_t. The TTY isn't referenced with Unix file descriptors, ifd and ofd, but abstract stream handles tty_ifs and tty_ofs. The ifs member isn't required any more since plain mode is handled via the tty_ifs stream. (mem_t): Declaration removed; now in linenoise.h. (chk_malloc, chk_realloc, chk_strdup_utf8): Declarations removed. (lino_os): New static structure. (nelem): New macro. (wcsnprintf): New static function. (enable_raw_mode, disable_raw_mode): Get Unix FD from stream using lino_os interface. (get_cursor_position, get_columns, handle_resize, record_undo, remove_noop_undo, restore_undo, undo_renumber_hist_idx, compare_completions, complete_line, lino_add_completion, next_hist_match, history_search, show_help, struct abuf, ab_append, ab_free, sync_data_to_buf, refresh_singleline, screen_rows, col_offset_in_str, refresh_multiline, scan_match_rev, scan_match_fwd, scan_fwd, find_nearest_paren, usec_delay, flash, yank_sel, delete_sel, edit_insert, edit_insert_str, edit_move_eol, edit_history_next, edit_delete, edit_backspace, edit_delete_prev_all, edit_delete_to_eol, edit_delete_line, edit_in_editor, edit, linenoise, lino_make, lino_cleanup, lino_free, free_hist, lino_hist_add, lino_hist_save, lino_set_result): Revised using streams, wide chars and lino_os interface. (lino_init): New function.
	* linenoise/linenoise.h (LINO_PAD_CHAR): New preprocessor symbol. (mem_t): Defined here. (MEM_T_DEFINED): New preprocessor symbol. (struct lino_os, lino_os_t): New structure. (lino_os_init): New macro. (struct lino_completions, lino_compl_cb_t, lino_atom_cb_t, lino_enter_cb_t): Switch to wchar_t. (lino_init): New function. (lino_add_completion, lino_make, linenoise, lino_hist_add, lino_hist_save, lino_hist_load, lino_set_result)
	* parser.c (find_matching_syms, provide_completions, provide_atom, is_balanced_line, repl): Adapt to wide character linenoise. (lino_fileno, lino_puts, lino_getch, lino_getl, lino_gets, lino_feof, lino_open, lino_open8, lino_fdopen, lino_close): New static functions. (linenoise_txr_binding): New static structure. (parse_init): Call lino_init, passing OS binding.
	* txr.1: Update text about the listener's limitations.
*	compiler: replace "$" package hack.	Kaz Kylheku	2018-04-25	1	-0/+1
	When compile-file emits the file, it does so with package bound to a temporary package named "$" so that all the symbols get fully qualified. Problem is, this is a valid package name and is added to the package list. While the package exists, symbols such as $:a could be interned. If such symbols occur in code being compiled, they get emitted using unqualified names. Let's introduce an internal interface for making an anonymous package which isn't on the list of packages, and which has a name that results in bad syntax if it occurs in print.
	* eval.c (eval_init): Register sys:make-anon-package intrinsic.
	* lib.c (make_package_common): New static function. (make_package): Package construction and initialization code moved into make_package_common. (make_anon_package): New function.
	* lib.h (make_anon_package): Declared.
	* share/txr/stdlib/compiler.tl (usr:compile-file): When writing out translation, bind package to anonymous package from sys:make-anon-package.
*	vm: de-inline opcode dispatch.	Kaz Kylheku	2018-04-25	1	-0/+2
	The vm_execute function is heavily inlined by gcc, and requires almost 500 bytes of stack space. The stack space really adds up when the vm re-enters itself recursively. Also, pointers to garbage can hide in areas of that bloated stack frame that are not being used by execution paths, adding to the spurious retention problem.
	* lib.h (NOINLINE): New preprocessor symbol.
	* vm.c (vm_prof, vm_frame, vm_sframe, vm_dframe, vm_end, vm_fin, vm_call, vm_apply, vm_gcall, vm_gapply, vm_movrs, vm_movsr, vm_movrr, vm_movrsi, vm_movsmi, vm_movrbi, vm_if, vm_ifq, vm_ifql, vm_swtch, vm_uwprot, vm_block, vm_no_block_err, vm_retsr, vm_retrs, vm_retrr, vm_abscsr, vm_catch, vm_handle, vm_getsym, vm_getbind, vm_setsym, vm_bindv, vm_close, vm_execute): Apply NOINLINE to functions.
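A macro of this kind, applied to the handlers of a dispatch loop, might be sketched as follows. The definition of NOINLINE shown here is an assumption based on the GCC `noinline` function attribute; the actual definition in lib.h is not given in this log. The tiny stack machine (`machine`, `do_push`, `do_add`, `execute`) is hypothetical and exists only to show where the macro goes.

```c
#include <assert.h>

/* Assumed definition: GCC-compatible compilers get the noinline
   attribute, others get nothing. */
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

enum opcode { OP_PUSH, OP_ADD, OP_END };

typedef struct {
    int stack[16];
    int sp;
} machine;

/* Each handler stays a separate function, so its locals live in its own
   short-lived frame rather than being merged into the dispatcher's. */
NOINLINE void do_push(machine *m, int operand)
{
    assert(m->sp < 16);
    m->stack[m->sp++] = operand;
}

NOINLINE void do_add(machine *m)
{
    assert(m->sp >= 2);
    m->sp--;
    m->stack[m->sp - 1] += m->stack[m->sp];
}

/* Dispatch loop: returns the top of stack at OP_END, -1 on a bad opcode. */
int execute(const int *code)
{
    machine m = { { 0 }, 0 };
    for (;;) {
        switch (*code++) {
        case OP_PUSH: do_push(&m, *code++); break;
        case OP_ADD:  do_add(&m); break;
        case OP_END:  return m.sp > 0 ? m.stack[m.sp - 1] : 0;
        default:      return -1;
        }
    }
}
```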
*	lib: new function vm-fun-p.	Kaz Kylheku	2018-04-07	1	-0/+1
	* eval.c (eval_init): vm-fun-p intrinsic registered.
	* lib.c (vm_fun_p): New function.
	* lib.h (vm_fun_p): Declared.
*	Application code is now in a package called pub.	Kaz Kylheku	2018-04-09	1	-1/+1
	* lib.c (public_package): New variable. (obj_init): Protect public_package from gc. Initialize it with a package called "pub" which has the user package in its fallback list.
	* lib.h (public_package): Declared.
	* eval.c (eval_init): Initialize package_s to public_package rather than user_package, except in compat <= 190 mode.
	* txr.c (txr_main): Bind package to public_package rather than user_package, except in compat <= 190 mode.
*	lib: get rid of preprocessor macros for packages.	Kaz Kylheku	2018-04-05	1	-4/+1
	The identifiers user_package, system_package and keyword_package are preprocessor symbols that expand to other preprocessor symbols for no good reason. Time to get rid of this.
	* lib.c (system_package_var, keyword_package_var, user_package_var): Variables renamed to system_package, keyword_package and user_package. (symbol_package, keywordp, obj_init): Fix variable references to follow rename.
	* lib.h (keyword_package, user_package, system_package): Macros removed. (system_package_var, keyword_package_var, user_package_var): Variables renamed.
	* eval.c (eval_init): Fix variable references to follow rename.
	* parser.y (sym_helper): Likewise.
*	regex: read/print bug: escaped double quote.	Kaz Kylheku	2018-04-04	1	-1/+1
	Because the regex printer wrongly uses out_str_char (for the sake of borrowing its semicolon-notation processing) when a regex prints, all characters that require escaping in a string literal get escaped, which includes the " character. Unfortunately the \" sequence which results is rejected by the regex parser.
	* lib.c (out_str_char): Kludge: add extra argument to distinguish regex use versus string use, and treat the double quote accordingly. (out_str_readable): Give 0 arg to new param of out_str_char.
	* lib.h (out_str_char): Declaration updated.
	* regex.c (print_class_char, print_rec): Pass 1 to new param of out_str_char.
*	packages: fix package prefix read/print issue.	Kaz Kylheku	2018-04-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Suppose that we have two symbols of the same name, in two packages: foo:sym and bar:sym. Suppose that the bar package has foo in its package fallback list, and suppose bar is the current package. Then bar:sym prints without a package prefix, as just sym. However, this is potentially ambiguous. Suppose that bar:sym is written to a file as just sym. Then later the file is read into a fresh image in a situation in which bar:sym has not yet been interned, but foo:sym already exists. In this situation, sym will just resolve to foo:sym. The printer must detect this ambiguous situation. If a symbol is present in a package, but a same-named symbol is in the fallback list; or if a symbol is visible in the fallback list, but a same-named symbol is present in the package, then a package prefix should be printed. * lib.c (symbol_needs_prefix): New function. (unquote_star_check, obj_print_impl): Use symbol_needs_prefix rather than symbol_visible. * lib.h (symbol_needs_prefix): Declared.
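The prefix rule described above can be sketched in C. This is a simplified model, not TXR's `symbol_needs_prefix`: the `struct package` layout, `present` and `needs_prefix` are hypothetical names, symbols are identified only by name plus home package, and the home package is assumed to actually contain the name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified package model (not TXR's data structures). */
struct package {
  const char *name;
  const char **syms;           /* names of symbols present in the package */
  size_t nsyms;
  struct package **fallback;   /* fallback list, searched in order */
  size_t nfallback;
};

/* Is a symbol of this name directly present in package p? */
static int present(const struct package *p, const char *name)
{
  size_t i;
  for (i = 0; i < p->nsyms; i++)
    if (strcmp(p->syms[i], name) == 0)
      return 1;
  return 0;
}

/* Returns 1 if the symbol `name` whose home package is `home` should be
   printed with a package prefix while `cur` is the current package. */
static int needs_prefix(const struct package *cur,
                        const struct package *home, const char *name)
{
  size_t i;

  assert(cur != NULL && home != NULL && name != NULL);

  if (home == cur) {
    /* Present in the current package: ambiguous if a fallback package
       holds a same-named symbol. */
    for (i = 0; i < cur->nfallback; i++)
      if (present(cur->fallback[i], name))
        return 1;
    return 0;
  }

  /* Symbol lives elsewhere: a same-named symbol in cur hides it. */
  if (present(cur, name))
    return 1;

  for (i = 0; i < cur->nfallback; i++) {
    if (cur->fallback[i] == home)
      return 0;                /* reached through the fallback list */
    if (present(cur->fallback[i], name))
      return 1;                /* an earlier fallback package hides it */
  }

  return 1;                    /* not visible at all from cur */
}
```

With packages foo and bar set up as in the commit message (bar current, foo in its fallback list, both holding `sym`), this model asks for a prefix on both `bar:sym` and `foo:sym`, while a name found only in foo prints bare.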
*	lib: eliminate reduce-left from n-ary math ops.	Kaz Kylheku	2018-03-29	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Using reduce-left is inefficient; it conses up a list. We can decimate the stacked arguments without consing. * lib.c (nary_op): Replace reduce_left with iteration. (nary_simple_op): New function, variant of nary_op useable by functions that have a mandatory argument passed separately from the argument list. (minusv, divv): Replace reduce_left with iteration. (maxv, minv): Replace reduce_left with nary_simple_op. (abso_self): New static function. (gcdv, lcmv): Replace reduce_left with nary_op. * lib.h (nary_simple_op): Declared.
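The idea of folding the arguments in a loop rather than through a list-building reduce can be illustrated with a small C sketch. The names `binop_fn`, `fold_args` and `fold_args_simple` are hypothetical, and the sketch works on plain `long` values rather than TXR's `val` objects and argument structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical binary operation type for the illustration. */
typedef long (*binop_fn)(long, long);

static long add2(long a, long b) { return a + b; }
static long max2(long a, long b) { return a > b ? a : b; }

/* Fold a binary operation over an argument array without allocating.
   With zero arguments, the supplied identity value is returned. */
static long fold_args(binop_fn op, long identity,
                      const long *args, size_t n)
{
  size_t i;
  long acc;

  assert(op != NULL);

  if (n == 0)
    return identity;

  acc = args[0];
  for (i = 1; i < n; i++)
    acc = op(acc, args[i]);
  return acc;
}

/* Variant for operations whose first argument is mandatory and passed
   separately from the rest, similar in spirit to the nary_simple_op
   mentioned above. */
static long fold_args_simple(binop_fn op, long first,
                             const long *rest, size_t n)
{
  size_t i;
  long acc = first;

  assert(op != NULL);

  for (i = 0; i < n; i++)
    acc = op(acc, rest[i]);
  return acc;
}
```

Both functions keep a single accumulator on the stack, so no intermediate list is created regardless of the argument count.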
*	lib: new ldiff function.	Kaz Kylheku	2018-03-20	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Use the old ldiff function under compatibility with 190 or lower. * lib.c (ldiff): Rewritten. (ldiff_old): New function, copy of previous version of ldiff. * lib.h (ldiff_old): Declared.
*	New: virtual machine with assembler.	Kaz Kylheku	2018-03-10	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit is the start of compiler work to make TXR Lisp execute faster. In six days of part time work, we now have a register-style virtual machine with 32 instructions, handling exceptions, unwind-protect, lexical closures, and global environment access/mutation. We have a complete assembler and disassembler for this machine. The assembler supports labels with forward referencing with backpatching, and features pseudo-ops: for instance the (mov ...) pseudo-instruction chooses one of three kinds of specific move instruction based on the operands. * Makefile (OBJS): Add vm.o. * eval.c (lookup_sym_lisp1): Static function becomes external; the virtual machine needs to use this to support that style of lookup. * genvmop.txr: New file. This is the generator for the "vmop.h" header. * lib.c (func_vm): New function. (generic_funcall): Handle the FVM function type via new vm_execute_closure function. In the variadic case, we want to avoid the argument copying which we do for the sake of C functions that get their fixed arguments directly, and then just the trailing arguments. Thus the code is restructured a bit in order to switch twice on the function type. (init): Call vm_init. * lib.h (functype_t): New enum member FVM. (struct func): New member in the .f union: vm_desc. (func_vm): Declared. * lisplib.c (set_dlt_entries_impl): New static function, formed from set_dlt_entries. (set_dlt_entries): Reduced to wrapper for set_dlt_entries_impl, passing in the user package. (set_dlt_entries_sys): New static function: like set_dlt_entries but targeting the sys package. (asm_instantiate, asm_set_entries): New static functions. (lisplib_init): Auto-load the sys:assembler class. * share/txr/stdlib/asm.tl: New file. * vm.c, vm.h, vmop.h: New files.
*	Require semicolon after static_{forward,def} macros.	Kaz Kylheku	2018-02-26	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	* lib.h (static_forward, static_def): At least the C version of these now require a trailing semicolon. * struct.c (struct_type_ops): Add required semicolon after static_def. * syslog.c (syslog_strm_ops): Add required semicolon after static_forward and after static_def.
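One way such macros can be written so that the caller must supply the trailing semicolon is shown below. This is a sketch under the assumption that the C versions simply prepend `static` to their argument; the structure `struct ops` and the functions `get_impl` and `call_get` are hypothetical, and the code is valid C only (it relies on tentative definitions, which C++ does not have).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed shape of the macros: neither supplies its own semicolon,
   so each use must be followed by one. */
#define static_forward(decl) static decl
#define static_def(def) static def

/* Hypothetical operations table used only for this illustration. */
struct ops {
  int (*get)(void);
};

/* Forward declaration (a tentative definition in C). */
static_forward(struct ops my_ops);

static int get_impl(void)
{
  return 42;
}

/* Uses the table before its initialized definition appears. */
static int call_get(void)
{
  assert(my_ops.get != NULL);
  return my_ops.get();
}

/* The actual definition, again with an explicit trailing semicolon. */
static_def(struct ops my_ops = { get_impl });
```

Requiring the semicolon at the use site makes each macro invocation read like an ordinary declaration.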
*	Copyright year bump 2018.	Kaz Kylheku	2018-02-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* LICENSE, LICENSE-CYG, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/doloop.tl, share/txr/stdlib/error.tl, share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl, share/txr/stdlib/package.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/pmac.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, win/cleansvg.txr: Extended Copyright line to 2018.
*	term: move near site of use.	Kaz Kylheku	2018-01-07	1	-1/+0
\| \| \| \| \| \| \| \| \|	* eval.c (term): Function here from lib.c, and changed to static. It is used only by iapply. * lib.c (term): Function moved to eval.c. * lib.h (term): Declaration removed.
*	listref_l: remove.	Kaz Kylheku	2018-01-06	1	-1/+0
\| \| \| \| \| \|	* lib.c (listref_l): Unused function removed. * lib.h (listref_l): Declaration removed.
*	ltail: unused function.	Kaz Kylheku	2018-01-02	1	-1/+0
\| \| \| \| \| \| \| \|	* lib.c (ltail): Function removed. This was introduced at the same time as lazy_appendv and used only by it. That function was rewritten a few months ago and doesn't use ltail. * lib.h (ltail): Declaration removed.
*	eliminate cdr_l use from implementation of last.	Kaz Kylheku	2018-01-02	1	-1/+1
\| \| \| \| \| \|	* lib.c (lastcons): Return value is just the last cons rather than a loc. The only caller of this function is last. (last): Adapt to the new lastcons.
*	New methods rplaca and rplacd.	Kaz Kylheku	2017-12-30	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Register rplaca and rplacd using new rplaca_s and rplacd_s symbol variables. * lib.c (rplaca_s, rplacd_s): New symbol variables. (rplaca): Handle struct object via rplaca method, if it has one, otherwise lambda-set, if it has that, or else error out. (rplacd): Handle struct object via rplacd method. * lib.h (rplaca_s, rplacd_s): Declared. * txr.1: Documented rplaca and rplacd methods.
*	prof: deal with overflowing mem counters.	Kaz Kylheku	2017-12-04	1	-0/+2
\| \| \| \| \| \| \| \|	* eval.c (op_prof): Deal with the cases when alloc_bytes_t value cannot be converted to a val in a single call to unum. * lib.h (SIZEOF_ALLOC_BYTES_T): New macro.
*	New function: grade.	Kaz Kylheku	2017-11-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Inspired by APL. * eval.c (eval_init): Register grade intrinsic. * lib.c (grade): New function. * lib.h (grade): Declared. * txr.1: Documented.
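For readers unfamiliar with the APL operation, a grade computes the indices that would put a sequence in sorted order, rather than sorting the sequence itself. The C sketch below assumes the ascending ("grade up") flavor over an `int` array; `grade_up` is a hypothetical name and the code makes no claim about how TXR's `grade` is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Fill idx[0..n-1] with the positions of the elements of vals in
   ascending order of value. A stable insertion sort on the index
   array is used: equal values keep their original relative order. */
static void grade_up(const int *vals, size_t *idx, size_t n)
{
  size_t i, j;

  assert(n == 0 || (vals != NULL && idx != NULL));

  for (i = 0; i < n; i++) {
    j = i;
    /* Shift indices of larger values one slot to the right. */
    while (j > 0 && vals[idx[j - 1]] > vals[i]) {
      idx[j] = idx[j - 1];
      j--;
    }
    idx[j] = i;
  }
}
```

For the values 30, 10, 20 the resulting index array is 1, 2, 0: the smallest value sits at position 1, the next at position 2, and the largest at position 0.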
*	bugfix: fixnum crackdown.	Kaz Kylheku	2017-09-13	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The purpose of this commit is to address certain situations in which code is wrongly relying on a cnum value being in the fixnum range (NUM_MIN to NUM_MAX), so that num_fast can safely be used on it. One wrong pattern is that c_num is applied to some Lisp value, and that value (or one derived from it arithmetically) is then passed to num_fast. The problem is that c_num succeeds on integers outside of the fixnum range. Some bignum values convert to a cnum successfully. Thus either num has to be used instead of num_fast, or else the original c_num attempt must be replaced with something that will fail if the original value isn't a fixnum. (In the latter case, any arithmetic on the fixnum cannot produce a value outside of that range). * buf.c (buf_put_bytes): The size argument here is not guaranteed to be in fixnum range: use num. * combi.c (perm_init_common): Throw if the sequence length isn't a fixnum. Thus the num_fast in perm_while_fun is correct, since the ci value is bounded by k, which is bounded by n. * hash.c (hash_grow): Remove dubious assertion which aborts the run-time if the hash table doubling overflows. Simply don't allow the modulus to grow beyond NUM_MAX. If doubling it makes it larger than NUM_MAX, then just don't grow the table. We need the modulus to be in fixnum range, so that uses of num_fast on the modulus value elsewhere are correct. (group_by, group_reduce): Use c_fixnum rather than c_num to extract a value that is later assumed to be a fixnum. * lib.c (c_fixnum): New function. (nreverse, reverse, remove_if, less, window_map_list, sort_vec, unique): Use c_fixnum rather than c_num to extract a value that is later assumed to be a fixnum. (string_extend): Use c_fixnum rather than c_num to extract a value that is later assumed to be a fixnum. Cap the string allocation size to fixnum range rather than INT_PTR_MAX. (cmp_str): The wcscmp function could return values outside of the fixnum range, so we must use num, not num_fast. * lib.h (c_fixnum): Declared.
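The two remedies described above, a strict range check before treating a machine integer as a fixnum and a capped doubling of the hash modulus, can be sketched in C. Everything here is a hypothetical model: `FIX_MAX` and `FIX_MIN` stand in for NUM_MAX and NUM_MIN under the assumption that two bits of the word are reserved for a tag, and `to_fixnum` reports failure through its return value where the real c_fixnum would throw.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t cnum;

/* Assumed fixnum range: a quarter of the native range, as if two bits
   of each word were reserved for a type tag. */
#define FIX_MAX (INTPTR_MAX / 4)
#define FIX_MIN (INTPTR_MIN / 4)

/* Does n fit the fixnum range, so that a fast tagging path is safe? */
static int fits_fixnum(cnum n)
{
  return n >= FIX_MIN && n <= FIX_MAX;
}

/* Strict extraction: succeeds only for fixnum-range values.
   Returns 1 and stores the value on success, 0 on failure. */
static int to_fixnum(cnum n, cnum *out)
{
  assert(out != NULL);
  if (!fits_fixnum(n))
    return 0;
  *out = n;
  return 1;
}

/* Capped growth: double the modulus only while the result stays in
   fixnum range; otherwise leave the table size unchanged. */
static cnum grow_modulus(cnum modulus)
{
  assert(modulus > 0);
  if (modulus > FIX_MAX / 2)
    return modulus;
  return modulus * 2;
}
```

Checking `modulus > FIX_MAX / 2` before multiplying avoids performing a doubling whose result would have to be rejected afterwards.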
*	Revising out-of-memory handling.	Kaz Kylheku	2017-08-18	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't want to be aborting on OOM, but throwing an exception. * lib.c (alloc_error_s): New symbol variable. (oom_realloc): Global variable removed. (oom): New static function. (chk_malloc, chk_malloc_gc_more, chk_calloc, chk_realloc): Call oom instead of removed oom_realloc handler. (env): Throw alloc-error rather than error by calling oom. (obj_init): Initialize alloc_error_s. (init): Drop function pointer argument; do not initialize removed oom_realloc. * lib.h (alloc_error_s): Declared. (oom_realloc): Declaration removed. (init): Declaration updated. * txr.1: Type tree diagram includes alloc-error.
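The shift from aborting to raising a catchable error can be modeled in plain C with `setjmp` and `longjmp` standing in for TXR's exception mechanism. All names below (`oom_catch`, `checked_alloc`, `try_alloc`, `always_fail`) are hypothetical, and the allocator is passed as a parameter only so that failure can be simulated.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator type, so a failing allocator can be injected. */
typedef void *(*alloc_fn)(size_t);

/* Innermost active handler for allocation failure, or NULL. */
static jmp_buf *oom_catch;

/* Simulated allocator that always reports failure. */
static void *always_fail(size_t size)
{
  (void) size;
  return NULL;
}

/* Central out-of-memory routine: transfer control to the handler
   instead of terminating, unless no handler is installed. */
static void oom(void)
{
  if (oom_catch != NULL)
    longjmp(*oom_catch, 1);
  abort();
}

/* Checked allocation: never returns NULL for a nonzero request. */
static void *checked_alloc(alloc_fn fn, size_t size)
{
  void *p = fn(size);
  if (p == NULL && size != 0)
    oom();
  return p;
}

/* Returns 1 if the allocation succeeded, 0 if the failure was caught. */
static int try_alloc(alloc_fn fn, size_t size)
{
  jmp_buf env;
  jmp_buf *saved = oom_catch;
  void *p;

  assert(fn != NULL);

  if (setjmp(env) != 0) {
    oom_catch = saved;        /* failure caught: restore outer handler */
    return 0;
  }

  oom_catch = &env;
  p = checked_alloc(fn, size);
  oom_catch = saved;
  free(p);
  return 1;
}
```

Routing every checked allocator through one `oom` function keeps the failure policy in a single place, which matches the structure the commit describes.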
*	New spl and tok: variants of tok-str and split-str.	Kaz Kylheku	2017-08-07	1	-0/+2
\| \| \| \| \| \| \| \|	* eval.c (eval_init): Register spl and tok intrinsics. * lib.c (spl, tok): New functions. * txr.1: Documented.
*	bugfix: n-ary arith functions must check single arg.	Kaz Kylheku	2017-08-05	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We are allowing calls like (* "a") and (+ "a") without diagnosing that the argument isn't of a valid type. Note that (max "a") is fine because min and max use the less function; they are not strictly numeric. * lib.c (nary_op): Beef up function with additional argument for type checking the unary case. (unary_num, unary_arith, unary_int): New static functions. (plusv, mulv, logandv, logiorv): Use new nary_op interface. (gtv, ltv, gev, lev, numeqv, numneq): Check the first number. * lib.h (nary_op): Declaration updated.
*	Add sum and prod convenience functions.	Kaz Kylheku	2017-08-05	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): prod and sum intrinsics registered. * lib.c (sum, prod): New functions. * lib.h (sum, prod): Declared. * txr.1: Documented.
*	lib: deprecate set-diff; extend set operations.	Kaz Kylheku	2017-07-26	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Register set-diff under two names: set-diff and diff. Register new isec and uni intrinsics. * lib.c (isec, uni): New functions. * lib.h (isec, uni): Declared. * txr.1: Documented new uni and isec functions, new diff function name, and the deprecation of set-diff and its order guarantee w.r.t the left sequence.
*	new function: nth	Kaz Kylheku	2017-07-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Just the ANSI CL nth for lists. * eval.c (eval_init): Register nth intrinsic. * lib.c (nth): New function. * lib.h (nth): Declared. * share/txr/stdlib/place.tl (nth): New place macro, trivially takes care of making nth an accessor. Place macros are terrific! * txr.1: Documented.
*	lib: new function, relate.	Kaz Kylheku	2017-07-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Register new intrinsic relate. * lib.c (do_relate, do_relate_dfl): New static functions. (relate): New function. * lib.h (relate): Declared. * txr.1: Documented.