summaryrefslogtreecommitdiffstats
path: root/parser.c
Commit message (Collapse)AuthorAgeFilesLines
* repl: bugfix: half-baked source auto loading in completion.Kaz Kylheku2021-10-251-5/+6
| | | | | | | | | | | | | | | | | | | | | | | The following behavior is observed. When we clean the compiled files using "make clean-tlo", then autoloading during completion does not work reliably for some symbols like dissassemble and compile. The symbols don't complete, and afterward, the functions remain undefined, and no longer autoload. The root cause is that when some modules are loaded form source, deferred warnings occur, due to code referring to symbols that are defined later. But the provide_completions function installs a catch for all exceptions, including deferred warnings. It thereby abruptly terminates loads which trigger deferred warnings, leaving them half-complete. The fix is to catch only errors. * parser.c (catch_error): New global variable. (load_rcfile): Use catch_error from now on instead of locally consing this. (provide_completions): Use catch_error instead of catch_all. (parse_init): gc-protect catch_error and initialize it.
* exceptions: hack to store errno in string object.Kaz Kylheku2021-09-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Basic idea: when we throw an exception that pertains to a system error which has an errno code, we can stick the errno into the memory area of the character string, into the wchar_t that immediately follows the null terminator. We can do this because strings track their actual allocation size. A pair of setter/getter functions to set and retrieve this value are provided, and all functions in the code which can set such a code are updated to do so, simply by calling the newly added uw_ethrowf that drop-in replaces for uw_throwf. * lib.[ch] (string_set_code, string_get_code): New functions. * unwind.[ch] (uw_ethrowf): New function. * eval.c (eval_init): Register string-set-code and string-get-code intrinsics. * ftw.c (ftw_wrap): Switch to uw_ethrowf. * parser.c (open_txr_file): Likewise. * socket.c (dgram_overflow): Store the ENOBUFS error in errno, and use uw_ethrowf instead uw_throwf. (dgram_get_byte_callback, dgram_flush, sock_bind, to_connect, open_sockfd, sock_connect, sock_listen, sock_accept, sock_shutdown, sock_timeout, socketpair_wrap): Switch to uw_ethrowf. * stream.c (dev_null_get_fd, stdio_maybe_read_error, stdio_maybe_error, stdio_close, pipe_close, open_directory, open_file, open_fileno, open_tail, fds_subst, open_subprocess, open_command, remove_path, rename_path, tmpfile_wrap, mkdtemp_wrap, mkstemp_wrap): Switch to uw_ethrowf. * sysif.c (mkdir_wrap, ensure_dir, chdir_wrap, getcwd_wrap, rmdir_wrap, mknod_wrap, mkfifo_wrap, chmod_wrap, do_chown, symlink_wrap, link_wrap, readlink_wrap, close_wrap, val exec_wrap, stat_impl, do_utimes, pipe_wrap, poll_wrap, getgroups_wrap, setuid_wrap, seteuid_wrap, setgid_wrap, setegid_wrap, setgroups_wrap, getresuid_wrap, setresuid_wrap, setresgid_wrap, crypt_wrap, uname_wrap, opendir_wrap, getrlimit_wrap, setrlimit_wrap): Likewise. * termios.c (tcgetattr_wrap, tcsetattr_wrap, tcsendbreak_wrap, tcdrain_wrap, tcflush_wrap, tcflow_wrap): Likewise. * tests/018/errno.tl: New file. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
* lookup_var: don't pass dyn_env explicitly.Kaz Kylheku2021-09-031-2/+2
| | | | | | | | | | | | | Since lookup_var(nil, ...) skips the environment and goes for dyn_env, there is no need to pass dyn_env explicitly; it's a bit of an anti-pattern. The argument is intended for lexical scopes. * eval.c (eval_exception, expand_eval, load): Pass nil to lookup_var instead of the current dynamic environment. * match.c (v_load): Likewise. * parser.c (txr_parse, read_eval_ret_last): Likewise.
* configure: implement full-repl option.Kaz Kylheku2021-08-201-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch unbundles the building of the full-featured REPL from HAVE_TERMIOS. We make it subject to its own configuration option CONFIG_FULL_REPL, which is 1 by default. This way, the downstream users or package maintainers can build TXR without the full-featured REPL even if HAVE_TERMIOS is 1, and the other termios material is built-in. * configure (full_repl): New variable. (help): Include full-repl in the help text. In the termios test, if we don't detect termios, then negate the full_repl variable. In the final config variable generation section, generate the CONFIG_FULL_REPL 1 define in config.h, if full_repl is true, ensuring it is subject to HAVE_TERMIOS, too. * linenoise/linenoise.c: Replace HAVE_TERMIOS with CONFIG_FULL_REPL. * linenoise/linenoise.h: Likewise. * parser.c: Likewise. * txr.c: Likewise and ... (if_termios): Macro renamed to if_full_repl. (if_full_repl): New macro. (opt_noninteractive): Use if_full_repl macro for initializing. (banner): Use if_ful_repl macro instead of if_termios.
* listener: additional reductions in non-termios build.Kaz Kylheku2021-08-201-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * linenoise/linenoise.c (struct lino_state): these members are now absent in non-termios build: atom_callback, ca_ctx, rawmode, mlmode, clip, result, buf, plen, pos, sel, end, len, dlen, dpos, dsel, dend, cols, oldrow, maxrows, history_index, need_resize, need_refresh, selmode, selinclusive, noninteractive and undo_stack. (lino_set_multiline, lino_get_multiline, lino_set_selinclusive, lino_get_selinculsive, lino_set_noninteractive, lino_get_noninteractive, lino_set_atom_cb): Functions now only defined in termios build. (linenoise): Adjustments for missing members in non-termios mode. (lino_make): We no longer need to set noninteractive to 1 in non-termios build. The flag no longer exists. (lino_copy, lino_cleanup): Avoid referencing nonexistent members in non-termios build. (lino_set_result): Another termios-only function. * linenoise/linenoise.h (lino_set_result, lino_clear_screen, lino_set_multiline, lino_get_multiline, lino_set_selinclusive, lino_get_selinculsive, lino_set_noninteractive, lino_get_noninteractive, lino_atom_cb_t, lino_set_atom_cb): Declare only in termios build. * parser.c (repl): Add #if HAVE_TERMIOS to avoid using linenoise features not available in non-termios build, and to remove any variables that thus become unused.
* listener: unbundle from termios.Kaz Kylheku2021-08-201-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit fixes the conceptual issue that when there is no termios support (HAVE_TERMIOS is absent/false), then there is no listener at all, even though the listener supports plain mode that doesn't require termios. * Makefile (linenoise/linenoise.o): Link in unconditionally, not subject to have_termios. * linenoise.c: Include termios-related header only if HAVE_TERMIOS. (struct lino_state): Define the completion_callback and orig_termios members only if HAVE_TERMIOS. (wcnsprintf, atexit_handler, enable_raw_mode, disable_raw_mode, get_cursor_position, get_columns, lino_clear_screen, refresh_line, handle_resize, generate_beep, delete_undo, free_undo_stack, record_undo, record_triv_undo, remove_noop_undo, restore_undo, undo_subst_hist_idx, undo_renumber_hist_idx, free_completions, sync_data_to_buf, compare_completions, complete_line, lino_set_completion_cb, lino_add_completion, next_hist_match, copy_display_params, history_search, ab_init, ab_append, ab_free, sync_data_to_buf, copy_display_params, refresh_singleline, col_offset_in_str, refresh_multiline, refresh_line, move_cursor_multiline, move_cursor, scan_match_rev, scan_rev, scan_match_fwd, scan_fwd, find_nearest_paren, usec_delay, paren_jump, flash, yank, yank_by_ptr, update_sel, clear_sel, yank_sel, delete_sel, edit_insert, edit_insert_str, edit_move_left, edit_move_right, edit_move_home, edit_move_sol, edit_move_end, edit_move_eol, edit_move_matching_paren, edit_history_next, edit_delete, edit_backspace, edit_delete_prev_all, edit_delete_to_eol, edit_delete_prev_word, edit_delete_line, tr, char, edit_in_editor, edit, sigwinch_handler): Functions defined only if HAVE_TERMIOS. (struct abuf, struct row_values): Struct types defined only if HAVE_TERMIOS. (screen_rows): Defined only if HAVE_TERMIOS. (linenoise): Support only noninteractive read loop unless HAVE_TERMIOS. (lino_make): If HAVE_TERMIOS is false, then set the noninteractive flag, so the linenoise function enters the plain-mode loop. (lino_cleanup, lino_hist_add): Add #ifdefs to avoid calling nonexistent functions when HAVE_TERMIOS is false. * linenoise/linenoise.h (struct lino_completions, lino_compl_cb_t): Define these types only if HAVE_TERMIOS. (lino_set_completion_cb, lino_add_completion): Declare only if HAVE_TERMIOS. * parser.c: Include linenoise/linenoise.h unconditionally. (report_security_problem, load_rcfile, repl_intr, read_eval_ret_last, get_home_path, repl_warning, is_balanced_line, hist_save): Now define regardless of HAVE_TERMIOS. (repl): Define regardless of HAVE_TERMIOS, but don't set completion or atom callback if HAVE_TERMIOS is false. * parser.h (repl): Declare unconditionally, not subject to HAVE_TERMIOS. * txr.c (if_termios): New macro. (opt_noninteractive): Initialize to 1 if HAVE_TERMIOS is false. (help): Text about entering into listener mode is always present now, even in a build withou HAVE_TERMIOS. (banner): Function is always defined. If we don't HAVE_TERMIOS, then the unused string literal that will never be printed is replaced by nil. (hint): Function removed. (txr_main): Blocks conditional on HAVE_TERMIOS that either call banner and go to the repl, or else call hint and exit, are reduced to unconditionally calling banner and going to the repl. All #if HAVE_TERMIOS blocks are similarly replaced with just the HAVE_TERMIOS case.
* license: reformat to fit 80 columns.Kaz Kylheku2021-08-161-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c, buf.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h: License reformatted. * lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.
* listener: prompt feature for plain mode.Kaz Kylheku2021-08-031-0/+4
| | | | | | | | | | | | | | | | | | The :prompt-on command will enable prompting in plain mode. * linenoise/linenoise.c (struct lino_state): New member, show_prompt. (line_enable_noninteractive_prompt): New function. (linenoise): In the plain mode loop, the show_prompt flag is on, show the prompt. For continuation lines, show a condensed prompt, which consists of the suffix of the full prompt, starting on the last non-whitespace character. * linenoise/linenoise.h (lino_enable_noninteractive): Declared. * parser.c (repl): Implement :prompt-on command which enables the above mode. * txr.1: Documented.
* parser: allow trailing commas in json, via opt-in flag.Kaz Kylheku2021-07-291-1/+5
| | | | | | | | | | | | | | | | | | | | | | | * parser.c (read_bad_json_s): New symbol variable. (parser_common_init): Propagate value of *read-bad-json* into read_bad_json flag in parser structure. (parser_init): Initialize read_bad_json_s and register the *read-bad-json* dynamic variable. * parser.h (struct parser): New member, read_bad_json. (read_bad_json_s): Declared. * parser.y (json_val): Support an opt_comma symbol just before the closing bracket or brace. (opt_comma): New nonterminal symbol. Recognizes ',' or nothing. Error is flagged if ',' is recognized, and *read-bad-json* is nil. * y.tab.c.shipped: Updated. * tests/010/json.tl: New tests. * txr.1: Documented.
* hash: change make_hash interface.Kaz Kylheku2021-07-221-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The make_hash function now takes the hash_weak_opt_t enumeration instead of a pair of flags. * hash.c (do_make_hash): Take enum argument instead of pair of flags. Just store the option; nothing to calculate. (weak_opt_from_flags): New static function. (tweak_hash): Function removed. (make_seeded_hash): Adjust to new do_make_hash interface with help from weak_opt_from_flags. (make_hash, make_eq_hash): Take enum argument instead of pair of flags. (hashv): Calculate hash_weak_opt_t enum from the extracted flags, pass down to make_eq_hash or make_hash. * hash.h (tweak_hash): Declration removed. (make_hash, make_eq_hash): Declarations updated. * eval.c (me_case, expand_switch): Update make_hash calls to new style. (eval_init): Update make_hash calls and get rid of tweak_hash calls. This renders the tweak_hash function unused. * ffi.c (make_ffi_type_enum, ffi_init): Update make_hash calls to new style. * filter.c (make_trie, trie_add, filter_init): Likewise. * lib.c (make_package_common, obj_init, obj_print): Likewise. * lisplib.c (lisplib_init): Likewise. * match.c (dir_tables_init): Likewise. * parser.c (parser_circ_def, repl, parse_init): Likewise. * parser.l (parser_l_init): Likewise. * struct.c (struct_init, get_slot_syms): Likewise. * sysif.c (get_env_hash): Likewise. * lex.yy.c.shipped, y.tab.c.shipped: Updated.
* hash: support both semantics of weak keys + values.Kaz Kylheku2021-07-211-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hash tables with weak keys and values now support a choice of both possible semantics: under and-semantics, an entry lapses when both the key and value are unreachable. Under or-semantics, an entry lapses if either the key or value is unreachable. The and-semantics is new. Until TXR 266, only or-semantics was supported. This will be the default: when a hash table is specified as :weak-keys and :weak-vals, it will have or-semantics. The keywords :weak-or and :weak-and specify weak keys and values, with the specific semantics. They are utually exclusive, but tolerate the presence of :weak-keys and :weak-vals. The make-hash function is being extended such that if its leftmost argument, <weak-keys>, is specified as one of the keywords :weak-and or :weak-or, then the hash table will have weak keys and values with the specified semantics, and the <weak-vals> argument is ignored (values are weak even if that argument is false). * eval.c (eval_init): Initially register the top_vb, top_mb, top_smb, special and builtin hashes as ordinary hashes: no weak keys or values. Then use tweak_hash to switch to weak keys+vals with and-semantics. We do it this way because the keywords are not yet initialized; we cannot use them. * hash.h (enum hash_flags, hash_flags_t): Moved to header. Member hash_weak_both renamed to hash_weak_or. New member hash_weak_and. (weak_and_k, weak_or_k): New keyword variables. (hash_print_op): Handle hash_weak_and by printing :weak-and. (hash_mark): Handle hash_weak_and by marking nothing, like hash_weak_or. (do_make_hash): Check first argument against the two new keywords and set flags accordingly. This function is called from eval_init before the keywords have been initialized, in which case weak_keys == weak_and_k is true when both are nil; we watch for that. (tweak_hash): Now returns void and takes a hash_flags_t argument which is simply planted. (do_wak_tables): Implement hash_weak_and case. Remove the compat 266 stuff from hash_weak_or. Compatibility is no longer required since we are not changing the default semantics of hash tables. Phew; that's a load of worry off the plate. (hashv): Parse the two new keywords, validate and provide semantics. (hash_init): Initialize weak_and_k and weak_or_k kewyords. * hash.h (enum hash_flags, hash_flags_t): Moved here now. (weak_and_k, weak_or_k): Declared. * lib.c (compat_fixup): Remove call to parse_compat_fixup. * parser.c (parse_init): Create stream_parser_hash with and-semantics. (parse_compat_fixup): Function removed. * parser.h (parse_compat_fixup): Declaration removed. * txr.1: Hash documentation updated.
* parse/eval: use weak-both hash tables.Kaz Kylheku2021-07-201-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | This addresses the problem that a4c376979d15323ad729e92e41ba43768e8dc163 tried to fix. * eval.c (eval_init): Make all the top-level binding tables, top_fb, top_vb, top_mb, top_smb, special and builtin, weak-both tables: keys and values are weak. This way, the entries disappear if both key and value are unreachable, even if they refer to each other. (eval_compat_fixup): In 266 or earlier compat mode, weak-both tables don't have the right semantics, so we tweak the tables to weak-key tables. * parser.c (parse_init): Same treatment for stream_parser_hash. We want an entry to disappear from the hash if neither the parser nor the stream are reachable. (parse_compat_fixup): New function. * parser.h (parse_compat_function): Declared. * hash.c, hash.h (tweak_hash): New function. * lib.c (compat_fixup): Call parse_compat_fixup.
* type: disallow structs using built-in type names.Kaz Kylheku2021-07-081-6/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a big commit motivated by the need to clean up the situation with built-in type symbols, COBJ objects and structs. The struct type system allows struct types to be defined for symbols like regex or str, which are used by built-in or cobj types. This is a bad thing. What is worse, structure instances are COBJ types which identify their type using the COBJ class symbol mechanism. There are places in the C implementation which assume that when a COBJ has a certain class symbol, it is of a certain expected type, which is totally different from and incompatible form a struct instance. User code can define a structure object which will fool that code. There are multiple things going on in this patch. The major theme is that the COBJ representation is changing. Instead of a class symbol, COBJ instances now carry a "struct cobj_class *" pointer. This pointer is obtained by registration via the cobj_register function. All modules must register their class symbols to obtain these class handles, which are then used in cobj() calls for instantiation. The CPTR type was identical to COBJ until now, except for the type tag. This is changing; CPTR objects will keep the old representation with the class symbol. commit 20fdfc6008297001491308849c17498c006fe7b4 Author: Kaz Kylheku <kaz@kylheku.com> Date: Thu Jul 8 19:17:39 2021 -0700 * ffi.h (carray_cls): Declared. * hash.h (hash_cls): Declared. (hash_early_init): Declared. * lib.h (struct cobj_class): New struct. (struct cobj): cls member changing to struct cobj_class *. (struct cptr): New struct, same as previous struct cobj. (union obj): New member cp of type struct cptr, for CPTR. (builtin_type): Declared. (class_check): Declaration moved closer to COBJ-related functions and updated. (cobj_register, cobj_register_super, cobj_class_exists): New functions declared. (cobjclassp, cobj_handle, cobj_ops): Declarations updated. * parser.h (parser_cls): Declared. * rand.h (random_state_cls): Declared. * regex.h (regex_cls): Declared. * stream.h (stream_cls, stdio_stream_cls): Declared. * struct.h (struct_cls): Declared. * tree.h (tree_cls, tree_iter_cls): Declared. * vm.h (vm_desc_cls): Declared. * buf.c (buf_strm, make_buf_stream): Pass stream_cls functions instead of stream_s class symbol. * chksum.c (sha256_ctx_cls, md5_ctx_cls): New static class handles. (sha256_begin, sha256_hash, sha256_end, md5_begin, md5_hash, md5_end): Pass class handles to instead of class symbols. (chksum_init): Initialize class handle variables. * ffi.c (ffi_type_cls, ffi_call_desc_cls, ffi_closure_cls, union_cls): New static class handles. (carray_cls): New global variable. (ffi_type_struct_checked, ffi_type_print_op, ffi_closure_struct_checked, ffi_closure_print_op, make_ffi_type_builtin, make_ffi_type_pointer, make_ffi_type_struct, make_ffi_type_union, make_ffi_type_array, make_ffi_type_enum, ffi_call_desc_checked, ffi_call_desc_print_op, ffi_make_call_desc, ffi_make_closure, carray_struct_checked, carray_print_op, make_carray, cptr_getobj, cptr_out, uni_struct_checked, make_union_common): Pass class handles instead of class symbols. (ffi_init): Initialize class handle variables. * filter.c (regex_from_trie): Use hash_cls class handle instead of hash_s. * gc.c (mark_obj): Split COBJ and CPTR cases since the representation is different. * hash.c (hash_cls, hash_iter_cls): New class handles. (make_similar_hash, copy_hash, gethash_c, gethash_e, remhash, clearhash, hash_count, get_hash_userdata, set_hash_userdata, hashp, hash_iter_init, hash_begin, hash_next, hash_peek, hash_reset, hash_reset, hash_uni, hash_diff, hash_symdiff, hash_isec): Pass class handles instead of class symbols. (hash_early_init): New function. (hash_init): Set the class symbols in the class handles that were created in hash_early_init at a time when these symbols did not exist. * lib.c (nelem): New macro. (cobj_class): New static array. (cobj_ptr): New static pointer. (cobj_hash): New static hash. (seq_iter_cls): New static class handle. (builtin_type_p): New function. (typeof): Struct instances now all carry the same symbol, struct, as their COBJ class symbol. To get their type, we must call struct_type_name. (subtypep): Rearrangement of two cases: let's make the reflexive case first. Adjust code for different location of COBJ class symbol. (seq_iter_init_with_info, seq_begin, seq_next, seq_reset, iter_begin, iter_more, iter_item, iter_step, iter_reset, make_like, list_collect, do_generic_funcall): Use class handles instead of class symbols. (class_check, cobj, cobjclassp, cobj_handle, cobj_ops): Take class handle argument instead of class symbol. (cobj_register, cobj_register_super, cobj_class_exists): New functions. (cobj_populate_hash): New static function. (cobj_print_op): Adjust for different location of class (cptr_print_op, cptr_typed, cptr_type, cptr_handle, cptr_get): cptr functions now refer to obj->cp rather than obj->co. (copy, length, sub, ref, refset, replace, dwim_set, dwim_del, obj_print): Use class handles for various COBJ types rather than class symbols. (obj_init): gc-protect cobj_hash. Initialize seq_iter_cls class symbol and cobj_hash. Populate cobj_hash as the last initialization step. (init): Call hash_early_init immediately after gc_init. diff --git a/lib.c b/lib.c * match.c (do_match_line): Refer to regex_cls class handle instead of regex_s.. * parser.c (parser_cls): New global class handle. (parse, parser_get_impl, lisp_parse_impl, txr_parse, parser_errors): Use class handles instead of class symbols. (parse_init): Initialize parser_cls. * rand.c (random_state_cls): New global class handle. (make_state, random_state_p, make_random_state, random_state_get_vec, random_fixnum, random_float, random): Use class handles instead of class symbols. (rand_init): Initialize random_state_cls. * regex.c (regex_cls): New global class handle. (chset_cls): New static class handle. (reg_compile_csets, reg_derivative, regex_compile, regexp, regex_source, regex_print, regex_run, regex_machine_init): Use class handles instead of class symbols. (regex_init): Initialize regex_cls and chset_cls. * socket.c (make_dgram_sock_stream): Use stream_cls class symbol instead of stream_s. * stream.c (stream_cls, stdio_stream_cls): New class handles. (make_null_stream, stdio_get_fd, make_stdio_stream_common, stream_fd, sock_family, sock_type, sock_peer, sock_set_peer, make_dir_stream, make_string_input_stream, make_string_byte_input_stream, make_strlist_input_stream, make_string_output_stream, make_strlist_output_stream, get_list_from_stream, make_catenated_stream, make_delegate_stream, make_delegate_stream, stream_set_prop, stream_get_prop, close_stream, get_error, get_error_str, clear_error, get_line, get_char, get_byte, get_bytes, unget_char, unget_byte, put_buf, fill_buf, fill_buf_adjust, get_line_as_buf, format, put_string, put_char, put_byte, flush_stream, seek_stream, truncate_stream, get_indent_mode, test_set_indent_mode, test_neq_set_indent_mode, set_indent_mode, get_indent, set_indent, inc_indent, width_check, force_break, set_max_length, set_max_depth): Use class handle instead of symbol. (stream_init): Initialize stream_cls and stdio_stream_cls. * struct.c (struct_type_cls, struct_cls): New class handles. (struct_init): Initialize struct_type_cls and struct_cls. (struct_handle): Static function moved to avoid forward declaration. (stype_handle): Refer to struct_type_cls class handle instead of struct_type_s symbol. Handle instance objects in addition to types. (make_struct_type): Throw error if a built-in type is being defined as a struct type. Refer to class handle instead of class symbol. (find_struct_type, allocate_struct, make_struct_impl, make_lazy_struct, copy_struct): Refer to class handle instead of class symbol. * strudel.c (make_struct_delegate_stream): Refer to stream_cls class handle instead of stream_s symbol. * sysif.c (dir_cls): New class handle. (poll_wrap): Use typep instead of subtypep, eliminating access to class symbol. (opendir_wrap, closedir_wrap, readdir_wrap): Use class handles instead of class symbols. (sysif_init): Initialize dir_cls. * syslog.c (make_syslog_stream): Refer to stream_cls class handle instead of stream_s symbol. * tree.c (tree_cls, tree_iter_cls): New class handles. (tree_insert_node, tree_lookup_node, tree_delete_node, tree_root, tree_equal_op, tree, copy_search_tree, make_similar_tree, treep, tree_begin, copy_tree_iter, replace_tree_iter, tree_reset, tree_next, tree_peek, tree_clear): Use class handle instead of class symbol. (tree_init): Initialize tree_cls and tree_iter_cls. * unwind.c (sys_cont_cls): New static class handle. (revive_cont, capture_cont): Use class handle instead of class symbol. (uw_late_init): Initialize sys_cont_cls. * vm.c (vm_desc_cls): New global class handle. (vm_closure_cls): New static class handle. (vm_desc_struct, vm_make_desc, vm_closure_struct, vm_make_closure, vm_copy_closure): Use class handle instead of class symbol. (vm_init): Initialize vm_desc_cls and vm_closure_cls.
* streams: tightening sloppy argument defaulting.Kaz Kylheku2021-07-011-23/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Numerous functions in TXR Lisp treat a nil argument for an optional parameter as if it were omitted. In the case of streams, this can cause problems. An accidental nil passed to an input function can cause it to read from standard input and hang. In this patch, argument defaulting is tighented for functions that perform I/O. It's mostly stream parameters, but not exclusively. * eval.c (prinl, pprinl): Use default_arg_strict to default the stream argument, and also re-use that value for the put_char call. * lib.c (lazy_stream_cons, print, pprint, put_json): Use default_arg_strict rather than default_arg. * parser.c (regex_parse, lisp_parse_impl, txr_parse): Tighten the defaulting of the input stream and error stream arguments, streamlining the logic at the same time. * stream.c (do_parse_mode): Use default_arg_strict for the mode string argument. (record_adapter, get_line, get_char, get_byte, get_bytes, unget_byte, put_buf, fill_buf, fill_buf_adjust, get_line_as_buf, put_string, put_char, put_byte, put_line, flush_stream, get_string): Use strict defaulting for stream argument. (mkstemp_wrap): Use strict defaulting for suffix.
* byacc: fix regression caused by yystype.Kaz Kylheku2021-06-261-1/+1
| | | | | | | * parser.c (lisp_parse_impl): Refer to YYSTYPE not yystype, which doesn't exist under byacc. Reported by Sergey Romanov, against CentOS 8.4 using byacc 1.9.20170709-4.el8; easily reproduces with 1.9.20140715 on Ubuntu 18.
* c_str now takes a self argument.Kaz Kylheku2021-06-231-13/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding a self parameter to c_str so that when a non-string occurs, the error is reported against a function. Legend: A - Pass existing self to c_str. B - Define self and pass to c_str and possibly other functions. C - Take new self parameter and pass to c_str and possibly other functions. D - Pass existing self to c_str and/or other functions. E - Define self and pass to other functions, not c_str. X - Pass nil to c_str. * buf.c (buf_strm_put_string, buf_str): B. * chksum.c (sha256_str, md5_str): C. (sha256_hash, md5_hash): D. * eval.c (load): D. * ffi.c (ffi_varray_dynsize, ffi_str_put, ffi_wstr_put, ffi_bstr_put): A. (ffi_char_array_put, ffi_wchar_array_put): C. (ffi_bchar_array_put): A. (ffi_array_put, ffi_array_out, ffi_varray_put): D. * ftw.c (ftw_wrap): A. * glob.c (glob_wrap): A. * lib.c (copy_str, length_str, coded_length,split_str_set, list_str, cmp_str, num_str, out_json_str, out_json_rec, display_width): B. (upcase_str, downcase_str, string_extend, search_str, do_match_str, do_rmatch_str, sub_str, replace_str, cat_str_append, split_str_keep, trim_str, int_str, chr_str, span_str, compl_span_str, break_str, length_str_gt, length_str_ge, length_str_lt, length_str_le, find, rfind, pos, rpos, mismatch, rmismatch): A. (c_str): Add self parameter and use in type mismatch diagnostic. If the parameter is nil, use "internal error". (flo_str): B, and correction to "flot-str" typo. (out_lazy_str, out_quasi_str, obj_print_impl): D. * lib.h (c_str): Declaration updated. * match.c (dump_var): X. (v_load): D. * parser.c (open_txr_file): C. (load_rcfile): E. (find_matching_syms, provide_atom): X. (hist_save, repl): B. * parser.h (open_txr_file): Declaration updated. * parser.y (chrlit): X. * regex.c (search_regex): A. * socket.c (getaddrinfo_wrap, sockaddr_pack): A. (dgram_put_string): B. (open_sockfd): D. (sock_connect): E. * stream.c (stdio_put_string, tail_strategy, vformat_str, open_directory, open_file, open_tail, remove_path, rename_path, tmpfile_wrap, mkdtemp_wrap, mkstemp_wrap): B. (do_parse_mode, parse_mode, make_string_byte_input_stream): B. (normalize_mode, normalize_mode_no_bin): E. (string_out_put_string, formatv, put_string, open_fileno, open_subprocess, open_command, base_name, dir_name, short_suffix, long_suffix): A. (run): D. (win_escape_cmd, win_escape_arg): X. * stream.h (parse_mode, normalize_mode, normalize_mode_no_bin): Declarations updated. * sysif.c (mkdir_wrap, do_utimes, dlopen_wrap, dlsym_wrap, dlvsym_wrap): A. (do_stat, do_lstat): C. (mkdir_nothrow_exists, ensure_dir): E. (chdir_wrap, rmdir_wrap, mkfifo_wrap, chmod_wrap, symlink_wrap, link_wrap, readlink_wrap, exec_wrap, getenv_wrap, setenv_wrap, unsetenv_wrap, getpwnam_wrap, getgrnam_wrap, crypt_wrap, fnmatch_wrap, realpath_wrap, opendir_wrap): B. (stat_impl): statfn pointer-to-function argument now takes self parameter. When calling it, we pass name. * syslog.c (openlog_wrap, syslog_wrapv): A. * time.c (time_string_local, time_string_utc, time_string_meth, time_parse_meth): A. (strptime_wrap): B. * txr.c (txr_main): D. * y.tab.c.shipped: Updated.
* read/get-json: reject trailing junk in string input.Kaz Kylheku2021-06-201-2/+16
| | | | | | | | | | | | | | | | | * parser.c (lisp_parse_impl): If parsing from string, check for trailing junk and diagnose. JSON parsing doesn't use lookahead because it doesn't have a.b syntax, so the recent_tok gives the last token that actually went into the syntax, and not a lookahead token. So in the case of JSON, we call yylex to see if there is any trailing token. * tests/010/json.tl: Extend get-json tests to more kinds of objects, and then replicate with trailing whitespace and trailing junk to provide coverage for these cases. * tests/012/parse.t: Slew of new read tests and iread also. * txr.1: Documented.
* listener: new --noprofile option.Kaz Kylheku2021-06-161-1/+1
| | | | | | | | | | | | | * parser.c (repl): Set the rcfile variable to nil if opt_noprofile is true, to suppress reading it. * txr.c (op_noprofile): New global variable. (help): Add help text. (txr_main): Recognize noprofile option and set variable. * txr.h (opt_noprofile): Declared. * txr.1: Documented.
* listener: complete macros and operators after quote.Kaz Kylheku2021-06-101-6/+11
| | | | | | | | | | | This fixes the problem that (doc 'wh[Tab] will not complete the macro name while. * parser.c (find_matching_syms): Recognize the kind context symbol 'Q', under which a which a macro or operator are eligible for completion. (provide_completions): Calculate kind as 'Q' if the previous character is a ' quote or ^ backquote.
* parser: new *read-unknown-structs* variable.Kaz Kylheku2021-06-081-1/+5
| | | | | | | | | | | | | | | | | | | | | * parser.c (read_unknown_structs_s): New symbol variable. (parser_common_init): Initialize read_unknown_structs flag member of the parser structure from the new special variable. (parse_init): Initialize read_unknown_struct_s variable. Register the *read-unknown-structs* dynamic variable. * parser.h (struct parser): New member, read_unknown_structs. (read_unknown_structs_s): Declared. * parser.y (struct): Generate the struct literal syntax not only for quasiquoted structures, but for structures with an unknown type name, if the read_unkonwn_structs flag is set. * txr.1: Documented. * share/txr/stdlib/doc-syms.tl: Regenerated. * y.tab.c.shipped: Regenerated.
* parser: gc bug in token.Kaz Kylheku2021-05-311-0/+2
| | | | | | | | | | | | | | The parser maintains some token objects for one parse job to the next in the tok_pushback and recent_tok arrays. When the recent_tok is assigned into the parser, it could be a wrong-way assignment: the parser is a gen 1 object, yet the token's semantic value of type val is a gen 0. * parser.c (lisp_parse_impl, txr_parse): Before re-enabling gc, indicate that the parser object may have been mutated by a wrong-way assignment using the mut macro. * txr.c (txr_main): Likewise.
* repl: syntax error diag improvement.Kaz Kylheku2021-05-291-1/+1
| | | | | | * parser.c (repl): syntax error exceptions carry some text as an argument, which we now print. This distinguishes end-of-stream syntax errors from real ones.
* parser: remove some parser access functions.Kaz Kylheku2021-05-281-32/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | * parser.c (lisp_parse_impl): Access pi->eof directly instead of going through parser_eof. We are using pi->errors in the same expression! This is the only place pi->eof is needed. (read_file_common): Don't call parser_eof. Assume that if error_val emerges, and errors is zero, it must be eof. (read_eval_ret_last): Simplify: the code actually does nothing specia for eof versus errors, so we just bail if error_var emerges. (get_parser): Function removed. (parse_errors): This is the only caller of get_parser, which now just calls gethash to fetch the parser. (parser_eof): Function removed. (parse_init): sys:get-parser, sys:parser-errors and sys:parser-eof intrinsics removed. The C function parser_errors still exists; it is used in a few places. * parser.h (get_parser, parser_eof): Declarations removed. * share/txr/stdlib/compiler.tl (compile-file-conditionally): Use the new parse-errors function instead of the removed sys:parser-errors. Since that works on the stream, we don't have to call sys:get-parser. Because it returns nil if there are no errors, we drop the > test.
* parser: provide parse-errors function.Kaz Kylheku2021-05-281-0/+14
| | | | | | | | | | * parser.c (parse_errors): New function. * parser.h (parse_errors): Declared. * txr.1: Documented. * share/txr/stdlib/doc-syms.tl: Updated.
* json: get-json function.Kaz Kylheku2021-05-281-7/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | * eval.c (eval_init): get-json intrinsic registered. * parser.c (prime_parser): Handle prime_json. (lisp_parse_impl): Take enum prime_parser argument directly instead of the interactive flag. (lisp_parse, nread, iread): Pass appropriate prime_parser value instead of the original flag. (get_json): New function. Like nread, but passes prime_json. * parser.h (enum prime_parser): New constant, prime_json. (get_json): Declared. * parser.l (prime_scanner): Handle prime_json. * parser.y (SECRET_ESCAPE_J): New terminal symbol. (spec): New productions around SECRET_ESCAPE_J for parsing JSON. * lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated. * txr.1: Documented. * share/txr/stdlib/doc-syms.tl: Updated.
* json: implement distinguished json quasiquote.Kaz Kylheku2021-05-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because #J<json> produces the (json ...) form that translates into quote, ^#J<json> yields a quasiquote around a quote. This has some disadvantages, because it requires an explicit eval in some situtions to do what the programmer wants. Here, we introduce an alternative: the syntax #J^<json> will produce a quasiquote instead of a quote. The new translation scheme is #J X -> (json quote <X>) #J^ X -> (json sys:qquote <X>) where <X> denotes the Lisp object translation of JSON syntax X. * parser.c (me_json): The quote symbol is now already in the json form, so all that is left to do here is to take the cdr to pop off the json symbol. * parser.l (JPUNC, NJPUNC): Allow ^ to be a punctuator in JSON mode. * parser.y (json): For regular #J, generate the new (json quote ...) syntax. Implement J# ^ which sets up the nonzero quasi_level around the processing of the JSON syntax, so that everything is in a quasiquote, finally producing the (json sys:qquote ...) syntax. * lex.yy.c.shipped, y.tab.c.shipped: Updated.
* New #J syntax for JSON objects in TXR Lisp.Kaz Kylheku2021-05-261-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (needs buffer literal error message cleanup) * parser.c (json_s): New symbol variable. (is_balanced_line): Follow braces out of initial state. This concession allows the listener to accept input like #J{"a":"b"}. (me_json): New static function (macro expander). The #J X syntax produces a (json Y) form, with the JSON syntax X translated to a Lisp object Y. If that is evaluated, this macro translates it to (quote Y). (parse_init): initialize json_s variable with interned symbol, and register the json macro. * parser.h (json_s): Declared. (end_of_json): Declared. * parser.l (num_esc): Treat u escape sequences in the same way as x. This function can then be used for handling the \u escapes in JSON string literals. (DIG19, JNUM, JPUNC, NJPUNC): New lex named patterns. (JSON, JLIT): New lex start conditions. (grammar): Recognize #J syntax, mapping to HASH_J token, which transitions into JSON start state. In JSON start state, handle all the elements: numbers, keywords, arrays and objects. Transition into JLIT state. In JLIT start state, handle all the elements of JSON string literals, including surrogate pair escapes. JSON literals share the fallback {UANY} fallback patter with other literals. (end_of_jason): New function. * parser.y (HASH_J, JSKW): New token symbols. (json, json_val, json_vals, json_pairs): New nonterminal symbols, and rules. (i_expr, n_expr): Generate json nonterminal, to hook the stuff into the grammar. (yybadtoken): Handle JKSW and HASH_J tokens. * lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.
* listener: complete on structs and FFI typedefs.Kaz Kylheku2021-05-251-7/+11
| | | | | | | | | * parser.c (find_matching_syms): Apply DeMorgan's on the symbol binding tests, to use sum of positive tests instead of product of negations. Check for struct and ffi type bindings in the default case. The default and '[' cases are rearranged so that the '[' case omits these, so as not to complete on a struct or FFI typedef after a [.
* listener: don't complete on unbound symbolsKaz Kylheku2021-05-181-4/+3
| | | | | | | | | | | | | | | | | | | | | This patch prevents Tab-completing on interned symbols that have no binding. The current behavior is, for instance: 1> 'hamsandwich hamsandwich 2> 'ham[Tab] 2> 'hamsandwich ;; completes The new behavior will not complete hamsandwich, because it has no binding as a function or variable. * parser.c (find_matching_syms): Treat the default case the same as after '[': a function or variable binding is required, or the symbol is not listed. Use fboundp instead of lookup_fun. They are the same, except in TXR 127 compat mode, which includes macros under fboundp.
* compiler: better code for global var definitions.Kaz Kylheku2021-05-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | * eval.c (rt_defvarl): More accurate self string. (rt_defv): New static function: like rt_defvarl but ensures that the new variable has a binding cell, and returns that cell instead of the hash cell. (op_defvarl): Take advantage of rt_defv to not have to cons up the binding cell. (eval_init): Register sys:rt-defv intrinsic. * parser.c (read_file_common): Compiled files are now version 7, so we must recognize them. We still load version 6 files because rt:defvarl still exists for them. * share/txr/stdlib/compiler.tl (expand-defvarl): Improve the generated code in two ways. Firstly, use the new sys:rt-defv, which returns the binding cell, so that the value can be stored into it with rplacd without having to cons up anything. Secondly, if there is no value expression, don't emit the code to do the assignment. (%tlo-ver%): Bump compiled file version to (7 0). * txr.1: Add note about TXR 260 loading version 7 and 6.
* parser: bug: handing of lex state in pushback tokens.Kaz Kylheku2021-05-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is fairly obscure. A repro test case is a file which contains: 3"foo" When the 3 is parsed, the " is also scanned as a lookahead token, and when that happens, the lexer shifts into the STRLIT state. At that point the parse job finishes for that top-level form. The next time the parser is called, it will prime the token stream by pushing the " token into it. But, the lex state is not put into the STRLIT. State. The result is that the parser obtains the " token, and then foo is lexically analyzed in the wrong state as a symbol. A syntax error occurs: symbol token in the middle of a string literal, instead of just a sequence of LITCHAR tokens, as expected. What we can do is associate a lex state with pushback tokens. If a pushback token has a nonzero lex state which is different from the current YYSTATE, then when that pushback token is consumed, we push that state also. * parser.h (struct yy_token): New member, yy_lex_state. * parser.c (parser_common_init): Initialize the new yy_lex_state member of every token member of the parser structure. * parser.l (yylex): When feeding a pushed token to the parser, if that token has a nonzero state, and the state is different from YYSTATE, we push that state. So for instance a pushed back " token will carry the STRLIT state, which is different from the NESTED state that will be in effect at the start of the parse job, and so it will be pushed, as if the " character had been scanned. Also, when we call the real yylex_impl, when we are storing the recenty seen token in recent_tok, we also store the current YYSTATE along with it. That's how tokens get associated with a state. The artificial tokens that are used for priming parsing like SECRET_ESCAPE_E are never associated with a nonzero state. * tests/012/syntax.tl: Some test cases that didn't pass before this. * lex.yy.c.shipped: Regenerated.
* tree: streamline iteration; provide high limit.Kaz Kylheku2021-05-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Getting rid of tree-begin-at and tree-reset-at. Now tree-begin takes two optional parameters, for specifying high and low range. * tree.c (struct tree_diter): New members, tree and highkey. We need tree due to requiring access to the less function. If the iterator has no highkey, the iterator itself is stored in that member to indicate this. (tree_iter_mark): Mark the tree and highkey. (tree_begin): Take optional lowkey and highkey arguments, initializing iterator acordingly. (tree_begin_at): Function removed. (copy_tree_iter, replace_tree_iter): Copy tree and highkey members. The latter require special handling due to the funny convention for indicating highkey absence. (tree_reset): Take optional lowkey and highkey arguments, configuring these in the iterator being reset. (tree_reset_at): Function removed. (tree_next, tree_peek): Implement highkey semantics. (sub_tree): Simplified: from and to arguments are just passed through to tree_begin, and there is no need for a separate loop which enforces the upper limit, that now being handled by the iterator itself. (tree_begin): Update registrations of tree-begin and tree-reset; remove tree-begin-at and tree-reset-at intrinsics. * tree.h (tree_begin_at, tree_reset_at): Declarations removed. (tree_begin, tree_reset): Declarations updated. * lib.c (seq_iter_rewind, seq_iter_init_with_info, where, populate_obj_hash): Default new optional arguments in tree_begin and tree_reset calls. * parser.c (circ_backpatch): Likewise. * tests/010/tree.tl: Affected cases updated. * txr.1: Documentation updated. * share/txr/stdlib/doc-syms.tl: Regenerated.
* compile/eval: more standard formatting for diags.Kaz Kylheku2021-03-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch eliminates parentheses from the error messages, as well as a leading ./ being added to relative paths. The word "warning: " is moved into the error message, so that it does not appear before the location. Example, when doing (compile-file "path/to/foo.tl"). Before patch: warning: (./path/to/foo.tl:37): unbound function foo After: path/to/foo.tl:37: warning: unbound function foo Now when I compile out of Vim, it nicely jumps to errors in Lisp code. * eval.c (eval_exception): Drop parentheses from error location, add colon. (eval_warn): Prepend "warning: " to format string. (eval_defr_warn): Drop parentheses from location, and prepend "warning: " to format string. * parser.c (repl-warning): Drop "warning:" prefix. * share/txr/stdlib/compiler.tl (open-compile-streams): Do not do parent substitution for relative paths if the parent path is the empty string "", to avoid inserting ./ onto relative paths in that case. * share/txr/stdlib/error.tl (sys:loc): Drop parentheses and space from location. (compile-error) Separate location with colon and space. (compile-warning, compile-defr-warning): Likewise and add "warning: " prefix. * unwind.c (uw_rthrow): Drop "warning: " prefix. (uw_warningf): Add "warning: " prefix. (uw_dump_deferred_warnings): Drop "warning: " prefix.
* compiler/vm: more compact frame size for closures.Kaz Kylheku2021-02-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Closures do not share t-registers with surrounding code; they do not store a value into such a register that code outside the closure would read and vice versa. When compiling closures, we can can temporarily reset the compiler's t-register allocator machinery to get low t-register values. Then, when executing the closure, we reserve space just for the registers it needs, not based off the containing vm description. Here we make a backwards-incompatible change. The VM close instruction needs an extra parameter indicating the number of t-regisers. This is stored into the closure and used for allocating the frame when it is dispatched. * parser.c (read_file_common): We read nothing but version 6 tlo files now. * share/txr/stdlib/asm.tl (op-close asm): Parse new ntreg argument from close syntax, and put it out as an extra word. Here is where we pay for this improvement in extra code size. (op-close dis): Extract the new argument from the machine code and add it to the disassembled format. * share/txr/stdlib/compiler.tl (compile-in-toplevel): Save and restore the t-reg discards list also. Don't bother with a gensym for the compiler; the argument is always a symbol, which we can use unhygienically like in with-var-spy. (compile-with-fresh-tregs): New macro based on compile-in-toplevel: almost the same but doesn't reset the level. (comp-lambda-impl): Use compile-with-fresh-tregs to compile the entire closure with a minimized register set. Place the treg-cntr into the closure instruction to indicate the number of registers the closure requires. * vm.c (struct vm): New member, nreg. (vm_make_closure): New parameter, nreg, stored into the closure. (vm_close): Extract a third opcode word, and pull the nreg value from the bottom half. Pass this to vm_make_closure. (vm_execute_closure, vm_funcall_common): Calculate frame size based on the closur's nreg rather than the VM description's. * txr.1: Document that the upcoming version 252 produces version 6.0 object files and only loads version 6.
* compiler: frame-eliminating optimization.Kaz Kylheku2021-02-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This optimization identifies let blocks whose variables are not captured by closures. The variables are relocated to registers and the frame M N ... end reg wrapping is removed. * parser.c (read_file_common): Load version 6 files. We remain backwards-compatible. * share/txr/stdlib/compiler.tl (var-spy, capture-var-spy): New structure types. (struct compiler): New slot, var-spies. (with-var-spy): New macro. (compiler (alloc-new-treg, unalloc-reg-count, push-var-spy, pop-var-spy)): New methods. (compiler (comp-atom, compt-setq, comp-list-setq, comp-lisp1-value)): Inform the spies in the spy notification stack about assignments and accesses. (compiler eliminate-frame): New method. (compiler comp-let): Use spies to determine which variables from this frame are captured, and if none are, then use eliminate-frame to rename all the variables to t-registers and drop the frame setup/teardown. (compiler comp-lambda): Set up a capture-var-spy which intercepts accesses and assignments within a lambda, and informs other spies about the captures. (%tlo-ver%): Bump compiled file version to to (6 0), because of some behavioral changes necessary in the VM. We might revert this if the issues are solved differently. * vm.c (vm_getz): Do not null out T registers. (vm_execute_toplevel, vm_execute_closure): Use zalloca to allocate the register part of the frame, so T registers are initialized to nil.
* Copyright year bump 2021.Kaz Kylheku2021-01-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * METALICENSE: 2020 copyrights bumped to 2021. Added note about SHA-256 routines from Colin Percival. * LICENSE, LICENSE-CYG, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/asm.tl, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/compiler.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/copy-file.tl, share/txr/stdlib/debugger.tl, share/txr/stdlib/defset.tl, share/txr/stdlib/doloop.tl, share/txr/stdlib/each-prod.tl, share/txr/stdlib/error.tl, share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl, share/txr/stdlib/package.tl, share/txr/stdlib/param.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/pmac.tl, share/txr/stdlib/quips.tl, share/txr/stdlib/save-exe.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/trace.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/vm-param.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2021.
* listener: fix completion regression.Kaz Kylheku2020-08-121-1/+1
| | | | | | | | | This problem was was introduced in TXR 239 in commit 1f54ad5cc1d384d0818a6bf6cec20a95ecc5a5ae. The completion for package-qualified symbols and keywords is wonky. * parser.c (find_matching_syms): The first argument to scat is the separator, so we must specify it as nil.
* listener: new *-1, *-2 ... *-20 macros.Kaz Kylheku2020-07-111-0/+13
| | | | | | | | | | | | * arith.h (minus_s): Declared. * eval.c (reg_symacro): Changing to external linkage. * eval.h (macro_time_s, reg_symacro): Declared. * parser.c (repl): Bind the *-1 to *-20 symbol macros. * txr.1: Documented.
* c_num: now takes self argument.Kaz Kylheku2020-06-291-13/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The c_num and c_unum functions now take a self argument for identifying the calling function. This requires changes in a large number of places. In a few places, additional functions acquire a self argument. The ffi module has the most extensive example of this. Some functions mention their name in a larger string, or have scattered literals giving their name; with the introduction of the self local variable, these are replaced by references to self. In the following changelog, the notation TS stands for "take self argument", meaning that the functions acquires a new "val self" argument. The notation DS means "define self": the functions in question defines a self variable, which they pass down. The notation PS means that the functions pass down an existing self variable to functions that now require it. * args.h (args_count): TS. * arith.c (c_unum, c_num): TS. (toint, exptv): DS. * buf.c (buf_check_len, buf_check_alloc_size, buf_check_index, buf_do_set_len, replace_buf, buf_put_buf, buf_put_i8, buf_put_u8, buf_put_char, buf_put_uchar, buf_get_bytes, buf_get_i8, buf_get_u8, buf_get_cptr, buf_strm_get_byte_callback, buf_strm_unget_byte, buf_swap32, str_buf, buf_int, buf_uint, int_buf, uint_buf): PS. (make_duplicate_buf, buf_shrink, sub_buf, buf_print, buf_pprint): DS. * chskum.c (sha256_stream_impl, sha256_buf, crc32_buf, md5_stream_impl, md5_buf): TS. (chksum_ensure_buf, sha256_stream, sha256, sha256_hash, md5_stream, md5, md5_hash): PS. (crc32_stream): DS. * combi.c (perm_while_fun, perm_gen_fun_common, perm_str_gen_fun, rperm_gen_fun, comb_vec_gen_fun, comb_str_gen_fun, rcomb_vec_gen_fun, rcomb_str_gen_fun): DS. * diff.c (dbg_clear, dbg_set, dbg_restore): DS. * eval.c (do_eval, gather_free_refs, maprodv, maprendv, maprodo, do_args_apf, do_args_ipf): DS. (op_dwim, me_op, map_common): PS. (prod_common): TS. * ffi.c (struct txr_ffi_type): release member TS. (make_ffi_type_pointer): PS and release argument TS. (ffi_varray_dynsize, ffi_array_in, ffi_array_put_common, ffi_array_get_common, ffi_varray_in, ffi_varray_null_term): PS. (ffi_simple_release, ffi_ptr_in_release, ffi_struct_release, ffi_wchar_array_get, ffi_array_release_common, ffi_array_release, ffi_varray_release): TS. (ffi_float_put, double_put, ffi_be_i16_put, ffi_be_u16_put, ffi_le_i16_put, ffi_le_u16_put, ffi_be_i32_put, ffi_be_u32_put, ffi_le_i32_put, ffi_sbit_put, ffi_ubit_put, ffi_buf_d_put, make_ffi_type_array, make_ffi_type_enum, ffi_type_compile, make_ffi_type_desc, ffi_make_call_desc, ffi_call_wrap, ffi_closure_dispatch_save, ffi_put_into, ffi_in, ffi_get, ffi_put, carray_set_length, carray_blank, carray_buf, carray_buf_sync, carray_cptr, carray_refset, carray_sub, carray_replace, carray_uint, carray_int): PS. (carray_vec, carray_list): DS. * filter.c (url_encode, url_decode, base64_stream_enc_impl): DS. * ftw.c (ftw_callback, ftw_wrap): DS. * gc.c (mark_obj, gc_set_delta): DS. * glob.c (glob_wrap): DS. * hash.c (equal_hash, eql_hash, eq_hash, do_make_hash, hash_equal, set_hash_traversal_limit, gen_hash_seed): DS. * itypes.c (c_i8, c_u8, c_i16, c_u16, c_i32, c_u32, c_i64, c_u64, c_short, c_ushort, c_int, c_uint, c_long, c_ulong): PS. * lib.c (seq_iter_rewind): TS and becomes internal. (seq_iter_init_with_info, seq_setpos, replace_str, less, replace_vec, diff, isec, obj_print_impl): PS. (nthcdr, equal, mkstring, mkustring, upcase_str, downcase_str, search_str, sub_str, cat_str, scat2, scat3, fmt_join, split_str_keep, split_str_set, trim_str, int_str, chr_int, chr_str, chr_str_set, vector, vecref, vecref_l, list_vec, copy_vec, sub_vec, cat_vec, lazy_str_put, lazy_str_gt, length_str_ge, length_str_lt, length_str_le, cptr_size_hint, cptr_int, out_lazy_str, out_quasi_str, time_string_local_time, time_string_utc, time_fields_local_time, time_fields_utc, time_struct_local, time_struct_utc, make_time, time_meth, time_parse_meth): DS. (init_str, cat_str_init, cat_str_measure, cat_str_append, vscat, time_fields_to_tm, time_struct_to_tm, make_time_impl): TS. * lib.h (seq_iter_rewind): Declaration removed. (c_num, c_unum, init_str): Declarations updated. * match.c (LOG_MISMATCH, LOG_MATCH): PS. (h_skip, h_coll, do_output_line, do_output, v_skip, v_fuzz, v_collect): DS. * parser.c (parser, circ_backpatch, report_security_problem, hist_save, repl, lino_fileno, lino_getch, lineno_getl, lineno_gets, lineno_open): DS. (parser_set_lineno, lisp_parse_impl): PS. * parser.l (YY_INPUT): PS. * rand.c (make_random_state): PS. * regex.c (print_rec): DS. (search_regex): PS. * signal.c (kill_wrap, raise_wrap, get_sig_handler, getitimer_wrap, setitimer_wrap): DS. * socket.c (addrinfo_in, sockaddr_pack, fd_timeout, to_connect, open_sockfd, sock_mark_connected, sock_timeout): TS. (getaddrinfo_wrap, dgram_set_sock_peer, sock_bind, sock_connect, sock_listen, sock_accept, sock_shutdown, sock_send_timeout, sock_recv_timeout, socketpair_wrap): DS. * stream.c (generic_fill_buf, errno_to_string, stdio_truncate, string_out_put_string, open_fileno, open_command, base_name, dir-name): DS. (unget_byte, put_buf, fill_buf, fill_buf_adjust, get_line_as_buf, formatv, put_byte, test_set_indent_mode, test_neq_set_indent_mode, set_indent_mode, set_indent, inc_indent, set_max_length, set_max_depth, open_subprocess, run ): PS. (fds_subst, fds_swizzle): TS. * struct.c (make_struct_type, super, umethod_args_fun): PS. (method_args_fun): DS. * strudel.c (strudel_put_buf, strudel_fill_buf): DS. * sysif.c (errno_wrap, exit_wrap, usleep_wrap, mkdir_wrap, ensure_dir, makedev_wrap, minor_wrap, major_wrap, mknod_wrap, mkfifo_wrap, wait_wrap, wifexited, wexitstatus, wifsignaled, wtermsig, wcoredump, wifstopped, wstopsig, wifcontinued, dup_wrap, close_wrap, exit_star_wrap, umask_wrap, setuid_wrap, seteuid_wrap, setgid_wrap, setegid_wrap, simulate_setuid_setgid, getpwuid_wrap, fnmatch_wrap, dlopen_wrap): DS. (chmod_wrap, do_chown, flock_pack, do_utimes, poll_wrap, setgroups_wrap, setresuid_wrap, setresgid_wrap, getgrgid_wrap): PS. (c_time): TS. * sysif.h (c_time): Declaration updated. * syslog.c (openlog_wrap, syslog_wrap): DS. * termios.c (termios_pack): TS. (tcgetattr_wrap, tcsetattr_wrap, tcsendbreak_wrap, tcdrain_wrap, tcflush_wrap, tcflow_rap, encode_speeds, decode_speeds): DS. * txr.c (compato, array_dim, gc_delta): DS. * unwind.c (uw_find_frames_by_mask): DS. * vm.c (vm_make_desc): PS. (vm_make_closure, vm_swtch): DS.
* itypes: remove silly itypes_little_endian.Kaz Kylheku2020-06-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Curiously, the itypes_little_endian variable was introduced in April 2017, in a the first commit implementing buffers. The variable was initialized, but never referenced. About a month after that, I introduced HAVE_LITTLE_ENDIAN config.h macro, which is always present and defined as 0 or 1. This was used right away in the implementation of FFI bitfields, and later on in other FFI work. In spite of HAVE_LITTLE_ENDIAN, almost exactly a year after itypes_little_endian was introduced, I mistakenly put that variable to use instead of recognizing it as superfluous and removing it in favor of using HAVE_LITTLE_ENDIAN. That is what I'm doing now. * itypes.c (itypes_little_endian): Variable removed. (itypes_init): Function removed, since its only job is to initialize this variable. * itypesl.h (itypes_little_endian, itypes_init): Declarations removed. * lib.c (init): Call to itypes_init removed. * parser.c (read_file_common): Instead of itypes_little_indian, use HAVE_LITTLE_ENDIAN, which the configuration script always exists and is defined as 0 or 1.
* listener: perms complaint for missing .txr_historyKaz Kylheku2020-06-191-2/+4
| | | | | | | | | * parser.c (repl): The listener wrongly complains that .txr_history has bad permissions when it doesn't exist. New users of TXR won't have a history file, so this looks sloppy. Since we load the history file regardless of the check, let's do the check in the case when the file has successfully loaded.
* Replace trivial format(nil, ...) with simpler ops.Kaz Kylheku2020-05-301-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * gencadr.txr (cadr_register): Use scat2 to glue two strings. * cadr.c: Regenerated. * lib.c (scat2, scat3): New functions. * lib.h (scat2, scat3): Declared. * liblib.c (place_instantiate, ver_instantiate, ifa_instantiate, txr_case_instantiate, with_resources_instantiate, path_test_instantiate, struct_instantiate, with_stream_instantiate, hash_instantiate, except_instantiate, type_instantiate, yield_instantiate, sock_instantiate, termios_instantiate, awk_instantiate, build_instantiate, trace_instantiate, getopts_instantiate, package_instantiate, getput_instantiate, tagbody_instantiate, pmac_instantiate, error_instantiate, keyparams_instantiate, ffi_instantiate, doloop_instantiate, stream_wrap_instantiate, asm_instantiate, compiler_instantiate, debugger_instantiate, op_instantiate, save_exe_instantiate, defset_instantiate, copy_file_instantiate): Use scat2 to glue two strings instead of format. * parser.c (find_matching_syms, hist_save, repl): Replace trivial uses of format with scat2 or scat3. * sysif.c (ensure_dir): Likewise. * txr.c (get_self_path, substitute_basename, sysroot, sysroot_init, parse_once_noerr, read_compiled_file_noerr, read_eval_stream_noerr): Likewise. * unwind.c (uw_unwind_to_exit_point): Likewise.
* Remove unnecessary #include directives.Kaz Kylheku2020-04-221-1/+0
| | | | | | | | | | Time for some spring cleaning. * args.c, arith.c, buf.c, cadr.c, chksum.c, debug.c, ftw.c, gc.c, gencadr.txr, glob.c, hash.c, lisplib.c, match.c, parser.c, parser.l, parser.y, rand.c, signal.c, stream.c, strudel.c, syslog.c, tree.c, unwind.c, utf8.c, vm.c: Numerous unnecessary #include directives removed.
* listener: completion for Unicode identifiers.Kaz Kylheku2020-04-171-2/+4
| | | | | | * parser.c (provide_completions): Recognize U+0080 and higer characters as token constituents, allowing completion to work for symbols which use these characters.
* txr-parse: release deferred warnings.Kaz Kylheku2020-04-141-2/+11
| | | | | | | * parser.c (txr_parse): The function must ensure that deferred warnings are released, if it is not wrapped in a larger translation unit. If *load-recursive* is false, we release the warnings after the parser.
* repl: improve dotfile security tests.Kaz Kylheku2020-04-091-5/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We test the .txr_history file for bad permissions also, not only .txr_profile. Though commands are not automatically executed out of .txr_history, a user could execute a harmful command due to not noticing the malicious modification. An additional useful diagnostic is added: if a dotfile is found to have the wrong permission, it's possible that this is due to a poor umask setting. We check for a weak umask and warn the user. Note: the .txr_history check doesn't use the open stream, therefore it is vulnerable to TOCTTOU race condition: the file looks good, but between the time we verify this and open the file to load it, the file has been replaced by a malicious one. * parser.c (report_security_problem): New static function, factored out of load_rcfile. Includes umask test. (load_rcfile): Call report_security_problem if the .txr_profile is writable to others. Also, no need to call stat any more; the path testing function now takes a stream argument. (repl): Check .txr_history for inappropriate writepermissions also and call report_security_problem if so. * sysif.c (umask_wrap): Change static function to external linkage. * sysif.c (umask_wrap): Declaration updated.
* exceptions: use uw_rthrow for non-error exceptions.Kaz Kylheku2020-04-071-2/+2
| | | | | | | | | | | | | | | | | | | * eval.c (eval_exception): This function is shared by warnings and errors. Use uw_throw. The eval_error caller already has an abort() after its eval_exception call, which makes that code path continue to be equivalent to uw_throw. The behavior changes for the other caller, eval_warn, which will now return if the warning is not handled. (eval_defr_warn, gather_free_refs, gather_free_refs_nw): Throw non-error exception with uw_rthrow. * match.c (v_throw, v_assert, h_assert): Use uw_rthrow for these directives, just like the throw function. * parser.c (repl_intr, repl_warning): Use uw_rthrow. * unwind.c (uw_muffle_warning, uw_release_deferred_warnings): Likewise.
* warning cleanup: GNU C++ initializer warnings.Kaz Kylheku2020-04-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | This is the eight and final round of an effort to enable GCC's -Wextra option. The C++ compiler, with -Wextra, doesn't like C's universal struct initializer { 0 }, individually complaining about all the remaining members not being initialized. What works in C++ is the { } initializer. Conditional definition to the rescue. * lib.h (all_zero_init): New macro which expands to { } under C++, and { 0 } under C. * lib.c (make_time_impl, epoch_tm, time_string_meth, time_parse_meth): Use all_zero_init. * parser.c (prime_parser): Likewise. * socket.c (sock_mark_connected): Likewise. * sysif.c (fcntl_wrap): Likewise. * termios.c (encode_speeds, decode_speeds): Likewise. * configure (diag_flags): Add -Wextra.
* warning cleanup: missing member initializers.Kaz Kylheku2020-04-051-7/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the sixth round of an effort to enable GCC's -Wextra option. Warnings about uninitialized members are addressed. I am not happy with what had to be done in linenoise.c. We just need a dummy circular list node for the lino_list, which we achieved with a dummy all zero struture, with statially initialized next and prev pointers. There are way too many members to initialize, including one that has struct termios type containing a nonportable set of members. On the plus size, the lino_list structure now moves into the BSS section, reducing the executable size slightly. * lib.c (cptr_ops): Initialize using cobj_ops_init, which has all the initializers already, and should have been used for this in the first place. * linenoise/linenoise.c (lino_list): Remove initializer, which eliminates the warning about some members not being initialized. (lino_init): Initialize the next and prev pointers here. * parser.c (parser_ops): Initialize with cobj_ops_init. * stream.h (stdio_mode_init_blank, stdio_mode_init_r, stdio_mode_init_rpb): Add initializer corresponding to redir array member of struct stdio_mode. * sysif.c (cptr_dl_ops): Use cobj_ops_init. * tree.c (tree_iter_init): Add initializer for the path array member of struct tree_iter. (tr_rebuild): Initialize all fields of the dummy object. Since it's a union, we just have to deal with the any member. There are two layouts for the obj_common fields based on whether CONFIG_GEN_GC is enabled. This is ugly, but occurs in one place only.
* warning cleanup: signed/unsigned in ternaries.Kaz Kylheku2020-04-051-1/+1
| | | | | | | | | | | | | | | | | | | | | This is the third round of an effort to enable GCC's -Wextra option. Instances of signed/unsigned mismatch between the branches of ternary conditionals are addressed. * ffi.c (pad_retval): Add cast into the consequent of the conditional so it yields size_t, like the alternative. * lib.c (split_str_keep): Likewise. (vector): Cast -1 to ucnum so it has the same type as the alloc_plus opposite to it. * parser.c (lino_getch): Add a cast to wint_t to match return value and opposite WEOF operand. * stream.c (generic_get_line): Likewise. * sysif.c (c_time): Convert both consequent and alternative to time_t to silence warning.