summaryrefslogtreecommitdiffstats
path: root/parser.y
Commit message (Collapse)AuthorAgeFilesLines
* C99: get rid of useless inline instantiations.Kaz Kylheku2019-05-021-5/+0
| | | | | | | | | | | * hash.c, lib.c, parser.y, unwind.c: Remove useless declarations that were believed to be C99 inline instantiations. This was mistakenly added at the time the Solaris issue was discovered that _XOPEN_SOURCE values of 600 or greater require compiling in C99 mode. We use "static inline" under C99 for inline functions; instantiation is not applicable at all to inline functions that don't have external linkage.
* parser: always use stream-associated parser for parse_once.Kaz Kylheku2019-04-211-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This refactoring is needed for fixing the off-by-one line number bug when the hash bang line is processed. * eval.c (load): Don't define parser locally; ensure there is one in the stream and use it. * match.c (v_load): Likewise. * parser.c (get_parser_impl): Renamed to parser_get_impl and changed from internal to external linkage. (ensure_parser): Changed to external linkage. (lisp_parser_impl, read_file_common): Follow rename of get_parser_impl. * parser.h (parse_once): Declaration updated. (parser_get_impl, ensure_parser): Declared. * parser.y (parse_once): Take self parameter; drop parser parameter. Ensure a parser to the stream, rather than declaring one locally. Don't clean up the parser when done, just let the stream clean it up. * txr.c (parse_once_noerr): Parser argument is dropped and not passed to parse_once. Program name is passed as self argument to parse_once. (txr_main): When parsing the TXR pattern query, don't define a parser locally; ensure there is one in the stream and use it, like in load and v_load.
* debugger: initial backtrace support.Kaz Kylheku2019-04-161-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * debug.c (debug_state): Switch to unsigned, since this is now a bitmask. (sys_print_backtrace_s): New symbol variable. (dbg_clear, dbg_set, dbg_restore): New static functions. (debug_init): Initialize sys_print_backtrace_s. Register dbg-clear, dbg-set, dbg-restore intrinsics. Register dbg-enable, dbg-step, dbg-backtrace and dbg-all bitmask variables, Lisp equivalents of DBG_ENABLE, DBG_SETP, DBG_BACKTRACE and DBG_ALL. (debug_dump_backtrace): New function. * debug.h (opt_debugger): Declaration removed. (debug_state): Declaration updated. (DBG_ENABLE, DBG_STEP, DBG_BACKTRACE, DBG_ALL): New preprocessor symbols. (debug_set_state): Inline function removed. (debug_clear, debug_set, debug_restore): New inline functions. (dbg_backtrace, dbg_fcall_begin, dbg_fcall_end): New macros. (debug_dump_backtrace): Declared. * eval.c (error_trace): Invoke debug_dump_backtrace if support is compiled in and backtraces are enabled. * lib.c (do_generic_funcall): New function, copy of generic_funcall. (generic_funcall): Now a wrapper for do_generic_funcall which registers fcall frames if backtrace support is enabled. (funcall, funcall1, funcall2, funcall3, funcall4): Route to slow generic_funcall path if backtraces are enabled. * lisplib.c (debugger_instantiate, debugger_set_entries): New static functions. (lisplib_init): Autload support for debug module via above new functions. (lisplib_try_load): Save and restore debugger state in new way using debug_set and debug_restore, with specific mask values. * parser.y (parse_once): Disable debugging in new way. * share/txr/stdlib/debug.tl New file. * sighal.h (EJ_DBG_MEMB, EJ_DBG_SAVE, EJ_DBG_REST): New macros for saving/restoring debug state. (EJ_OPT_MEMB, EJ_OPT_SAVE, EJ_OPT_REST): Reference the above macros to include debug state in extended jump context. * txr.c (help): Document --backtrace and that that -d implies --backtrace. (txr_main): Enable debugger using debug_set. Provide new --backtrace option to enable backtraces only. * unwind.c (args_s): New symbol variable. (fcall_frame_type): New static variable. (unwind_to_exit_point): Save pointer to original frame stack and restore it when calling error_trace. This is so that error_trace can walk the stack to collect a backtrace. (uw_find_frames_by_mask, uw_push_fcall): New functions. (uw_late_init): Initialize args_s and fcall_frame_type. gc-protect fcall_frame_type. Register uw-* variables corresponding to the UW_* frame types. * unwind.h (uw_frtype_t): New enum constant UW_FCALL. (struct uw_fcall): New frame structure. (union uw_frame): New member fc. (uw_push_fcall, uw_find_frames_by_mask): Declared.
* debug support: crude debugger removed.Kaz Kylheku2019-04-091-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * debug.c (debug_depth, debug_quit_s): Variables removed. (step_mode, next_depth, breakpoints, last_command, cols): Static variables removed. (debug_check): C99 inline instantiation removed. (help, show_bindings): Static functions removed. (debug): Function removed. (debug_set_state): Now takes one int argument, returns int. It's anticipated that the new debug system will have a simple on-off switch; there won't be a debug_depth hack. (debug_restore_state): Function removed. (debug_init): Emptied. * debug.h (debug_depth, debug_state_t): Declarations removed. (debug_enter, debug_leave, debug_return): Macros removed. (debug_check): Inline function removed. (debug_set_state): Declaration updated. (debug_restore_state): Declaration removed. (debug_frame, debug_end): Macros removed. * eval.c (do_eval, me_interp_macro): Debugging support scrubbed. * lisplib.c (lisplib_try_load): Adapt to debug_set_state interface change. * match.c (h_fun, do_match_line, v_fun, match_files, match_fun): Debugging support scrubbed. * parser.y (parse_once): Adapt to debug_set_state interface change. * protsym.c: Regenerated. * signal.h (debug_depth): Declaration removed. (EJ_DBG_MEMB, EJ_DBG_SAVE, EJ_DBG_REST): Macros removed. (EJ_OPT_MEMB, EJ_OPT_SAVE, EJ_OPT_REST): Reduced to unconditionally empty definitions for future use. * unwind.c (uw_push_debug): Function removed. * unwind.h (uw_frtype_t): UW_DBG enum member removed. (struct uw_debug): struct declaration removed. (union uw_frame): db member removed. (uw_push_debug): Declaration removed. * txr.1: Debugger doc removed.
* gethash_c: review uses and improve or replace.Kaz Kylheku2019-02-141-2/+1
| | | | | | | | | | | | | | | | | | | | * eval.c (env_fbind, env_vbind, reg_symacro): Use gethash_l instead of gethash_c to eliminate repeated cdr operations on the same cell. * hash.c (sethash): Since new_p is never used, eliminated it and use nulloc. (group_reduce): Use gethash_l instead of gethash_c. * lib.c (obj_init): Replace rplacd(gethash_c(...)) pattern whose return value is not used with with sethash. We lose some diagnosability here since sethash doesn't take a "self" argument. (obj_print_impl, obj_hash_merge): Use gethash_l instead of gethash_c. * parser.y (ensure_parser, parser_circ_def, get_visible_syms, rlset): Use gethash_l instead of gethash_c.
* Copyright year bump 2019.Kaz Kylheku2019-01-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * LICENSE, LICENSE-CYG, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/asm.tl, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/compiler.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/doloop.tl, share/txr/stdlib/error.tl, share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl, share/txr/stdlib/package.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/pmac.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/trace.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/vm-param.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr: Extended Copyright line to 2018.
* Eliminate ALLOCA_H.Kaz Kylheku2018-12-311-1/+1
| | | | | | | | | | | | | * configure: Instead of generating a definition of ALLOCA_H, generate the variable HAVE_ALLOCA_<name> with a value of 1, where <name> is one of stdlib, alloca or malloc. * alloca.h: New header. * args.c, eval.c, ffi.c ffi.c, ftw.c, hash.c, lib.c, match.c, parser.c, parser.y, regex.c, socket.c, stream.c, struct.c, sysif.c, syslog.c, termios.c, unwind.c, vm.c: Include "alloca.h" instead of ALLOCA_H.
* Better identify functions that misuse COBJ-s and hashes.Kaz Kylheku2018-11-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In this patch, the cobj_handle, cobj_ops and variants of gethash get an additional argument to identify the caller. Many functions are updated to pass this down. * buf.c (buf_strm): Pass self name to cobj_handle. * eval.c (env_fbind, env_vbind, rt_defvarl, me_case): Pass self name to gethash_c or gethash_e. (load): Pass self name to read_eval_stream and read_compiled_file. (reg_symacro): Pass situation-identifying string to gethash_c. * ffi.c (ffi_type_struct_checked, ffi_closure_struct_checked, ffi_call_desc_checked, uni_struct_checked): Take self name parameter, and pass down to cobj_handle. (ffi_get_type, ffi_get_lisp_type): Take self name and pass down to ffi_type_struct_checked. (union_get_ptr): Take self name and pass to uni_struct_checked. (ffi_union_in, ffi_union_put): Pass self name to union_get_ptr. (ffi_type_compile): Pass self name to ffi_get_lisp_type. (ffi_make_call_desc): Pass self name to ffi_type_struct_checked, ffi_get_type and ffi_call_desc_checked. (ffi_make_closure): Pass self name to ffi_call_desc_checked. (ffi_closure_get_fptr): Take self name, pass to ffi_closure_struct_checked. (ffi_typedef, ffi_size, ffi_alignof, ffi_offsetof, ffi_arraysize, ffi_elemsize, ffi_elemtype, ffi_put_into, ffi_put, ffi_in, ffi_get, ffi_out, make_carray): Pass self name to ffi_closure_struct_checked. (carray_struct_checked): Take self name, pass to cobj_handle. (carray_set_length, carray_dup, carray_own, carray_free, carray_type, length_carray, copy_carray, carray_ptr, buf_carray, vec_carray, list_carray, carray_ref, carray_refset, carray_sub, carray_replace, carray_get_common, carray_put_common, unum_carray, num_carray, put_carray, fill_carray): Pass self name to carray_struct_checked. (carray_blank, carray_buf, carray_cptr): Pass self name ffi_type_struct_checked. (carray_pun): Pass self name to carray_struct_checked and ffi_type_struct_checked. (make_union): Pass self name to ffi_type_struct_checked. (union_members, union_get, union_put, union_in, union_out): Pass self name to uni_struct_checked. (make_zstruct, zero_fill, put_obj, get_obj, fill_obj): Pass self-name to ffi_type_struct_checked. * ffi.h (ffi_closure_get_fptr, union_get_ptr): Declarations updated. * filter.c (trie_add): Pass self-name to gethash_l. * hash.c (make_similar_hash, copy_hash, hash_count, get_hash_userdata, set_hash_userdata, hash_begin, hash_next, hash_uni, hash_diff, hash_isec): Pass self name to cobj_handle. (gethash_c, gethash_e): Take self name parameter and pass down to cobj_handle. (gethash_f): Take self parameter and pass down to gethash_e. (gethash, inhash, gethash_n, sethash, pushhash, remhash, clearhash, hash_update_1): Pass self name to gethash_e or gethash_c. * hash.h (gethash_c, gethash_e, gethash_f): Declarations updated. (gethash_l): Take self name, and pass down to gethash_c. * lib.c (class_check): Take self name parameter and use in type mismatch diagnostic. (use_sym, unuse_sym, symbol_needs_prefix, find_symbol, intern, unintern, intern_fallback, unique, in, sel, obj_print_impl, populate_obj_hash, obj_hash_merge): Pass self name to gethash_f or gethash_l. (symbol_visible, obj_init): Pass situation-identifying string to gethash_e. (cobj_handle, cobj_ops): Take self name parameter and pass down to class_check. * lib.h (class_check, cobj_handle, cobj_ops): Declarations updated. * match.c (v_load): Pass self name to read_compiled_file and read_eval_stream. * parser.c (get_parser_impl): Take self name and pass to cobj_handle. (ensure_parser): Pass situation-identifying string to gethash_c. (parser_circ_def): Pass self-name to gethash_c. (lisp_parser_impl): Pass self name to get_parser_impl and class_check. (lisp_parse, nread, iread): Pass self-name to lisp_parser_impl. (read_file_common): Take self name parameter and pass down to get_parser_impl. (read_eval_stream, read_compiled_file): Take self name and pass down to read_file_common. (load_rcfile): Pass situation-identifying string to read_eval_streem. (get_visible_syms): Pass situation-identifying string to gethash_c. (parser_errors, parser_eof): Pass self name to cobj_handle. * parser.h (read_eval_stream, read_compiled_file): Declarations updated. * parser.y (rlset): Pass self name to gethash_c. * rand.c (make_random_state, random_state_get_vec,l random_fixnum, random_float): Pass self name to cobj_handle. * regex.c (regex_source, regex_print, regex_run): Pass self-name to cobj_handle. (regex_machine_init): Take self name param and pass to cobj_handle. (search_regex, match_regex, match_regex_right, regex_prefix_match, read_until_match): Pass self-name to regex_machine_init. * stream.c (stdio_get_fd): Pass self name to cobj_handle. (generic_get_line): Get COBJ operations via unsafe, diret object access rather than cobj_ops. (set_mode_props): Get object handle via unsafe, direct object access. (stream_fd, sock_family, sock_type, sock_peer, set_sock_peer, get_string_from_stream, get_list_from_stream, stream_set_prop, stream_get_prop, close_stream, get_error, get_error_str, clear_error, get_line, get_char, get_byte, unget_char, unget_byte, put_buf, fill_buf, put_string, put_char, put_byte, flush_stream, seek_stream, truncate_stream, get_indent_mode, test_set_indent_mode, set_indent_mode, get_indent, set_indent, inc_indent, width_check, force_break, get_set_ctx, get_ctx): Pass self name to cobj_ops. (make_delegate_stream): Take self name parameter, pass down to cobj_ops. (record_adapter): Pass self name down to make_delegate_stream. (format): Pass self name to class_check. * struct.c (stype_handle): Pass self name to cobj_handle. (make_struct_type): Pass self name to class_check. * txr.c (read_eval_stream_noerr): Take self name parameter, pass to read_eval_stream. (txr_main): Pass istuation-identifying string to read_compiled_file and read_eval_stream_noerr. * unwind.c (revive_cont): Pass self-name to cobj_handle. * vm.c (vm_desc_struct): Take self name parameter, pass to cobj_handle. (vm_desc_nlevels, vm_desc_nregs, vm_desc_bytecode, vm_desc_datavec, vm_desc_symvec, vm_execute_toplevel, vm_execute_closure, vm_closure_entry): Pass self name to vm_desc_struct. (vm_closure_struct): Take self name parameter, pass to cobj_handle.
* txr: support variable in postive match.Kaz Kylheku2018-05-221-0/+1
| | | | | | | | | | | | | In the @{var mod} syntax in the pattern language, allow mod to be a variable which contains a regex or integer, not just an integer or regex literal. * match.c (h_var): Check for modifier being a variable, and resolve it. * parser.y (modifiers): Allow a SYMTOK phrase. * txr.1: Documented.
* parser: propagate copyright to generated parser.Kaz Kylheku2018-04-161-2/+2
| | | | | | | | | | * parser.y: Move the copyright comment header into the %{ ... %} section so that it is copied to the generated parser, rathe than stripped away by the generator. The problem is that Bison adds a GPL header to the file wrongly implying that the whole thing is under the GPL (with a special exception). Without our copyright header there, it looks as if the whole file is from Bison.
* parser: show starting line of unterminated form.Kaz Kylheku2018-04-151-0/+5
| | | | | | * parser.y (parse): Note the line number before parsing. If the error seems to be bad termination, issue an extra message indicating the starting line number of the form.
* parser: @(if) hack in output must use usr package.Kaz Kylheku2018-04-101-6/+3
| | | | | | | | | | | * match.c (else_s, elif_s): New symbol variables. (syms_init): Initialize new variable with interned symbols. * match.h (else_s, elif_s): Declared. * parser.y (not_a_clause): Refer to if_s, else_s and elif_s, which are symbols in the usr package, instead of intering symbols in whatever package is current.
* lib: get rid of preprocessor macros for packages.Kaz Kylheku2018-04-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | The identifiers user_package, system_package and keyword_package are preprocessor symbols that expand to other preprocessor symbols for no good reason. Time to get rid of this. * lib.c (system_package_var, keyword_package_var, user_package_var): Variables renamed to system_package, keyword_package and user_package. (symbol_package, keywordp, obj_init): Fix variable references to follow rename. * lib.h (keyword_package, user_package, system_package): Macros removed. (system_package_var, keyword_package_var, user_package_var): Variables renamed. * eval.c (eval_init): Fix variable references to follow rename. * parser.y (sym_helper): Likewise.
* parser: don't generate special lits outside quasiquote.Kaz Kylheku2018-04-041-9/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The parser generates a sys:hash-lit, sys:struct-lit or sys:vector-lit whenever a hash, struct or vector literal contains unquotes. This allows the quasiquote expander to treat these objects as ordinary list structure when interpolating inside them, and then recognize these symbols and construct the implied real objects. The issue is that these literals are generated even if the unquotes occur outside of a backquote. For instance if a vector literal like #(,a) occurs out of the blue, not in any backquote, this is still a (sys:vector-lit (sys:unquote a)) and not an actual vector. The issue is compounded because this substitution takes place even if there is no actual comma or splice notation. Even the following is a sys:vector-lit: #((sys:unquote x)). In any case, it causes problems for compiled files, because such material can occur in the data vector of a compiled toplevel form. In this patch we modify the parser to keep track of the quasiquote/unquote level. The special literals are generated only when the object occurs inside a quasiquote. * parser.h (struct parser): New member, quasi_level. * parser.c (parser_common_init): Initialize the parser's new quasi_level member. * parser.y (vector, hash, struct): To decide whether to generate the special literal, don't just check whether unquotes occur in the list. Check that we are in a quasiquote, indicated by the quasiquoting level being positive. (i_expr, n_expr): Use a mid-rule actions on the quasiquote, unquote and splice rules to bump the quasiquoting level in one direction before recognizing the object, and then bump in the opposite direction when reducing the rule. (parse): Initialize quasi_level.
* parser: avoid consing for buf literals.Kaz Kylheku2018-04-031-12/+7
| | | | | | | | | | | * parser.y (buflit, buflit_items): Don't cons up a list of bytes in buflit_items which are then assembled into a buffer. Rather, the buflit_items rules construct and fill a buffer object directly. The buflit rule then just has to signal the end of the buffer literal to the lexer, and trim the buffer to the actual size. We will need this for efficient loading of compiled files, in which the virtual machine code is represented as a buffer literal.
* packages: drop no-fallback-list interning restriction.Kaz Kylheku2018-03-091-14/+1
| | | | | | | | | | | | | Removing the restriction that qualified pkg:sym syntax may not cause interning to take place if pkg has a nonempty fallback list. This serves no purpose, only hindering the flexibility of the package system. * parser.y (sym_helper): When processing a qualified symbol, if the package exists, just intern it. * txr.1: Revise all text which touched on the removed rule, to remove all traces of it from the documentation.
* Copyright year bump 2018.Kaz Kylheku2018-02-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * LICENSE, LICENSE-CYG, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/doloop.tl, share/txr/stdlib/error.tl, share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl, share/txr/stdlib/package.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/pmac.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, win/cleansvg.txr: Extended Copyright line to 2018.
* bugfix: don't record source loc of symbols and numbers.Kaz Kylheku2018-02-051-4/+22
| | | | | | | | | | | | | | | | | | This prevents a bug which manifests itself as a totally bogus file name and line number being reported in a diagnostic. The cause is that source loc info is recorded for an interned symbol when it is first encountered in one place in the code. Then that symbol occurs again in another place (perhaps a different file) in such a way that its source loc info is inherited into a surrounding generated form which now has incorrect source loc info: the location of the first occurrence of the symbol not of this form. Then when some error is reported against the form, the bogus source loc info is shown. * parser.y (rlviable): New static function. (rlset): Only record source loc for forms which satisfy rlviable.
* read, iread: source location recording now conditional.Kaz Kylheku2017-12-291-58/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recording of source location info incurs a time and space penalty. We don't want to impose this on programs which are just reading large amounts of Lisp data that isn't code. * eval.c (eval_init): Register lisp-parse and read functions to the newly introduced nread function rather than lisp_parse. lisp_parse continues to record source location info unconditionally. * parser.c (rec_source_loc_s): New symbol variable. (parser_common_init): Set the new member of the parser structure, rec_source_loc, according to the current value of the special var *rec-source-loc*. (lisp_parse_impl): New second argument, rlcp_p. If true, it overrides the rec_source_loc member of the parser structure to true. (lisp_parse): Pass true argument to rlcp_p parameter of lisp_parse_impl, so parsing via lisp_parse always records source loc info. (nread): New function. (iread): Pass true argument to rlcp_p parameter of lisp_parse_impl, so *rec-source-loc* controls whether source location info is recorded. (parse_init): Initilize rec_source_loc_s symbol variable, and register the *rec-source-loc* special var. * parser.h (struct parser): New member, rec_source_loc. (rec_source_loc_s, nread): Declared. * parser.y (rlcp_parser): New static function. Like rlcp but does nothing if parser->rec_source_loc is false. (rlc): New macro. (grammar): Replace rlcp uses with rlc, which expands to a call to rlcp_parser. (rlrec): Do nothing if source loc recording is not enabled in the parser. (make_expr, uref_helper): Replace rlcp with rlc. This is possible because these functions have a parser local variable that the macro expansion can refer to. (parse_once): Override rec_source_loc in the parser to 1, so that source loc info is always recorded when parsing is invoked through this function. * txr.1: Documented *rec-source-loc* and added text under read and iread.
* cleanup: remove unnecessary header includes.Kaz Kylheku2017-09-191-4/+0
| | | | | | | | | | * eval.c: doesn't need rand.h. * filter.c: doesn't need gc.h. * parser.l: doesn't need eval.h. * parser.y: doesn't need utf8.h, stream.h, args.h or cadr.h.
* parser: fix precedence of DOTDOT.Kaz Kylheku2017-09-071-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem is that a.b .. c.d parses as (qref a b..c d), which is useless and counterintuitive. Let's fix it, but with a backward compatibility switch to give more leeway to any hapless people out there whose code happens to depend on this unfortunate situation. We basically use two token numbers for the .. token: OLD_DOTDOT, and DOTDOT. Both are wired into the grammar. In backward compatibility mode, the lexer pumps out OLD_DOTDOT. Otherwise DOTDOT. * parser.l (grammar): When .. is scanned, return OLD_DOTDOT when in compatibility with 185 or earlier. Otherwise DOTDOT. * parser.y (OLD_DOTDOT): New terminal symbol; introduced at the same high precedence previously occupied by DOTDOT. (DOTDOT): Changes precedence to lower than '.' and UREFDOT. (n_expr): Two productions added involving OLD_DOTDOT. These are copy and paste of the existing productions involving DOTDOT; the only difference is that OLD_DOTDOT replaces DOTDOT. (yybadtoken): Handle OLD_DOTDOT. * txr.1: Compat notes added.
* txr -i honored despite parse-time exception.Kaz Kylheku2017-09-061-2/+2
| | | | | | | | | | | | | | | | | | | | | If an error is thrown while parsing a .txr file or while reading and evaluating the forms of a .tl file. * parser.y (parse_once, parse): Wording change in message when exception is caught. Only exceptions derived from error are caught. * txr.c (parse_once_noerr, read_eval_stream_noerr): New static functions. (txr_main): Use parse_once_noerr and read_eval_stream_noerr instead of parse_once and read_eval_stream. Don't exit if a TXR file has parser errors; in that situation, exit only if interactive mode is not requested, otherwise go interactive. Make sure *self-path* is registered to the name of the input source in this case also. * unwind.h (ignerr_func_body): New macro.
* parser: bugfix: empty buf literal problem.Kaz Kylheku2017-08-221-1/+2
| | | | | | | * parser.y (buflit): Fix neglect to call end_of_buflit in the empty buffer literal case, which precipitates syntax errors when an empty buffer literal #b'' is embedded in other syntax.
* parser: fix byacc regression related to hash-semi.Kaz Kylheku2017-08-171-0/+1
| | | | | | | | | | | | | | | | | | | | The "byacc_fool" rule needed a small update when in November 2016 it became involved in newly introduced #; (hash semicolon) syntax for commenting out. The problem is that when a top-level nested list expression is followed by #; then a byacc-generated parser throws a syntax error. This is because the byacc_fool production only generates a n_expr, or empty. Thus only those symbols are allowed which may begin a n_expr. The hash-semicolon token isn't one of them. * parser.y (byacc_fool): Add a production rule which generates a HASH_SEMI. Thus HASH_SEMI is now one of the terminals that may legally follow any sentential form that matches a byacc_fool rule after it.
* parser: more efficient treatment of string literals.Kaz Kylheku2017-08-171-27/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In this patch we switch the string literal parser from right recursion to left recursion, so that it doesn't require a Yacc stack depth proportional to the number of characters in the literal. Secondly, we build the string directly in a syntax-directed way, rather than building a list of characters and then walking the list to build a string. This was discovered as a byacc regression, though the fix is not for the sake of byacc but basic efficiency. The Sep 29, 2015 commit 111650e235ab2e529fa1529b1c9a23688a11cd1f "Implementation of static slots for structures." extended the string literal in the struct.tl test case in such a way that if the parser is generated by byacc rather than GNU Bison, the test case fails with a "yacc stack overflow". I haven't done any regression testing with byacc in over two years so I didn't notice this. Quasiliterals could use this treatment also. Word list literals benefit from this change, but they still use a Yacc stack depth proportional to the number of words, since the accumulation of words is right recursive. * parser.y (lit_char_helper): Static function removed. (restlitchar): New grammar nonterminal symbol. (strlit, quasi_item, wordslit): No need to call lit_char_helper. (litchars): A litchars is now either a single LITCHAR, or else a LITCHARS followed by a sequence of more. This sequence is a separate production rule called restlitchar, which is purely left recursive. (If litchars is made directly left recursive without this helper rule, intractable reduce/reduce and shift/reduce conflicts arise.)
* Bugfix: (sys:expr . atom) bad syntax out of parser.Kaz Kylheku2017-08-021-1/+1
| | | | | | | | | | | | | * parser.y (expand_meta): Fix incorrect conversion of (sys:var x) when x is a non-bindable term to (sys:expr . x). Should be (sys:expr x). This doesn't have that much of an impact, I don't think. It prevent certain degenerate forms from working like @(bind x @"str"). The bad thing is that this particular one has a silent problem: @"str" wrongly evaluates to #\s. Neverheless, this doesn't seem worth the addition of a compat flag test; the odds of someone depending on @"str" producing #\s in some pattern language code see vanishingly low.
* Continuing implementation of buffers.Kaz Kylheku2017-04-211-2/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | * Makefile (OBJS): New objects itypes.o and buf.o. * buf.c, buf.h: New files. * itypes.c, itypes.h: New files. * lib.c (obj_print_impl): Handle BUF via buf_print and buf_pprint. (init): Call itypes_init and buf_init. * parser.h (end_of_buflit): Declared. * parser.l (BUFLIT): New exclusive state. (grammar): New rules for recognizing start of buffer literal and its interior. (end_of_buflit): New function. * parser.y (HASH_B_QUOTE): New token. (buflit, buflit_items, buflit_item): New nonterminals and corresponding grammar rules. (i_expr, n_expr): These symbols now generate a buflit; a buffer literal is a kind of expression. (yybadtoken): Handle HASH_B_QUOTE case.
* parser: add some error cases to hash notations.Kaz Kylheku2017-04-081-1/+9
| | | | | | | | Produce better diagnostics for expressions like #[... or #Habc. * parser.y (vector, hash, struct, range): Add error productions.
* parser: refactor grammar to banish #[] etc.Kaz Kylheku2017-04-071-16/+26
| | | | | | | | | | | | | | | | | | | | Turns out we have an over-eager parser whcih recognizes invalid notion such as #[...], #S[...] and others. This is because the grammar nonterminal list is overloaded with phrase structures. The syntax of a vector literal, for instance, is '#' list, but a list can be '[' ... and other expressions. * parser.y (dwim, meta, compound): New non-terminal symbols. Dwim derives the square bracketed "dwim" expressons that were previously overloaded into list. Meta derives the @ exprs. compound is what list used to be. (list): Handle only (...) list expressions. (o_elem, modifiers): Derives compound rather than list, thus preserving existing behavior. (i_expr, n_expr): Likewise. All other uses references to the list nonterminal stay, thereby trimming the grammar of dubious expressions.
* parser: fix a...b syntax error.Kaz Kylheku2017-04-021-0/+6
| | | | | | | | | | | | | This issue has implications mainly for read/print consistency. The (rcons a .b) expression prints a...b, but that doesn't read back. The reason is that the . on .b isn't preceded by whitespace, and so isn't the UREFDOT token recognized in a n_expr. It's just the '.' token which is a syntax error in that situation. * parser.y (n_expr): New special case rule to handle the phrase pattern n_expr DOTDOT '.' n_expr which is now a syntax error.
* parser: support uref dot as top-level expr.Kaz Kylheku2017-03-151-0/+8
| | | | | * parser.y (hash_semi_oor_n_expr, hash_semi_or_i_expr): add grammar rules for leading dot.
* parser: factor repeated uref-related code.Kaz Kylheku2017-03-151-32/+20
| | | | | | | * parser.y (uref_helper): New static function. (list, i_dot_expr, n_expr, n_dot_expr): Replace most action code releated to unbound ref dot syntax with call to uref_helper.
* Add in-package directive.Kaz Kylheku2017-03-131-0/+4
| | | | | | | | | | | | * match.c (in_package_s): New symbol variable. (syms_init): Initialize in_package_s. * match.h (in_package_s): Declared. * parser.y (check_parse_time_action): Add case for in-package. Evaluate just with eval, as a case of the in-package macro. * txr.1: Documented.
* New directive: mdo.Kaz Kylheku2017-03-121-5/+11
| | | | | | | | | | | | | | | * eval.h (progn_s): Declarationa added. * match.c (mdo_s): New symbol variable. (syms_init): Initialize mdo_s. * match.h (mdo_s): Declared. * parser.y (check_for_include): Renamed to check_parse_time_action and implements mdo, not only include. (clauses_rev): Follow rename of function. * txr.1: Documented.
* uref: the a.b.c syntax extended to .a.b.cKaz Kylheku2017-03-061-13/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now it is possible to use a leading dot on the referencing dot syntax. This is the is the "unbound reference dot". It expands to the uref macro, which denotes an unbound-reference: it produces a function which takes an object as the argument, and curries the reference implied by the remaining arguments. * eval.c (uref_s): New global symbol variable. (eval_init): Intern uref symbol and init uref_s. * eval.h (uref_s): Declared. * lib.c (simple_qref_args_p): A qref expression is now also not simple if it contains an embedded uref, meaning that it cannot be rendered into the dot notation without ambiguity. (obj_print_impl): Support printing (uref a b c) as .a.b.c. * lisplib.c (struct_set_entries): Add uref to the list of autoload triggers for struct.tl. * parser.l (DOTDOT): Consume any leading whitespace as part of recognizing the DOTDOT token. Otherwise the new rule for UREFDOT, which matches (mandatory) leading space will take precedence, causing " .." to be scanned wrong. (UREFDOT): Rule for new kind of dot token, which is preceded by mandatory whitespace, and isn't consing dot (which has mandatory trailing whitespace too, matched by an earlier rule). * parser.y (UREFDOT): New token type. (i_dot_expr, n_dot_expr): New grammar rules. (list): Handle a leading dot on the first element of a list as a special case. Things are done this way because trying to work a UREFDOT into the grammar otherwise causes intractable conflicts. (i_expr): The ^, ' and , punctuators are now followed by an i_dot_expr, so that the expression can be an unbound dot. (n_expr): Same change as in i_expr, but using n_dot_expr. Plus new UREFDOT n_expr production. * share/txr/stdlib/struct.tl (uref): New macro. * txr.1: Documented.
* Harmonize code with previous commit.Kaz Kylheku2017-03-041-2/+3
| | | | | | | * parser.y (expand_repeat_rep_args): Use a sym local variable to avoid evaluating first(arg) twice, like the previous commit does in another case of this function.
* bugfix: :vars in output repeat not registered.Kaz Kylheku2017-03-041-3/+7
| | | | | | | | | | | | | | | | Test case: @(output) @ (repeat :vars (x (y 42)) @ (list x y) @ (end) @(end) x and y are spuriously reported as unbound variables in the (list x y) form. * parser.y (expand_repeat_rep_args): Do the missing calls to match_reg_var when processing :vars list.
* Support horizontal @(block), phase 1.Kaz Kylheku2017-02-151-0/+5
| | | | | | | | | | | | Unresolved issue: horizontal @(accept) terminating in a vertical @(block) or horizontal @(block) in a different line, or vertical @(accept) caught in horizontal context. * match.c (h_block, h_accept_fail): New functions. (dir_tables_init): Register horizontal @(block), @(accept) and @(fail). * parser.y (elem): Support BLOCK syntax.
* bugfix: :filter not handled right in output var.Kaz Kylheku2017-01-261-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This issue was fixed in quasiliterals only. Because of the implementation duplicity between output vars and quasiliteral vars, we have to fix it in two places. When the parser handles quasiliterals, it builds vars without expanding the contents. The quasiliteral expander takes care of recognzing (sys:var ...) forms and properly handles them and their attributes, avoiding expanding the argument of a :filter keyword. When the parser handles an o_var that is a braced variable, it calls expand on its contents right there, then builds the (sys:var ...) form from the expanded contents. Why don't we just call expand_quasi in the o_var rule to have a single (sys:var ...) form expanded exactly how it is done in quasiliterals. * eval.c (expand_quasi): Change static function to external. * eval.c (expand_quasi): Declared. * parser.y (o_var): Construct an unexpanded (sys:var ...) form, and then wrap it in a one-element list. This is a de-facto quasi-items list, which can be expanded by expand_quasi. Then we pull the car of the expansion to get our expanded var.
* bugfix: catch arguments not registered properly.Kaz Kylheku2017-01-231-1/+1
| | | | | | | | Symptom: variables appearing in a @(catch) are reported as unbound variables anyway. * parser.y (process_catch_exprs); The parameters are the second element of the catch form, not its rest.
* Bump copyright year to 2017.Kaz Kylheku2017-01-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | * LICENSE, LICENSE-CYG, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/except.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/package.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl: Add 2017 to all copyright headers and strings.
* parser bugfix: expand used instead of expand_forms.Kaz Kylheku2017-01-221-1/+1
| | | | | * parser.y (o_var): fix expand wrongly being called on a list of forms.
* Enable unbound warnings when expanding TXR code.Kaz Kylheku2017-01-221-12/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With this change, Lisp expansion-time warnings are no longer suppressed during the parsing of the TXR pattern language. Embedded Lisp expressions can refer to TXR pattern variables, which generates spurious warnings that must be suppressed. Since TXR pattern variables are dynamically introduced in a very flexible way, it's hard to do an exact job of this. We take the crude approach that warnings are suppressed for all pattern variables that appear anywhere in the TXR code. To do that, we identify, at parse time, all directives which can bind new variables, and register those variables as if they were tentative global defs, purging all pending warnings for them. * match.c (binding_directive_table): New static hash table. (match_reg_var, match_reg_params, match_reg_elem): New functions. (match_reg_var_rec): New static function. (dir_tables_init): gc-protect binding_directive_table, and populate its entries. * match.h (into_k, named_k): Declared. (match_reg_var, match_reg_params, match_reg_elem): Declared. * parser.y (process_catch_exprs): New static function. (elem): Call match_reg_elem for each basic directive, to process the variables in that directive according to its operator symbol. Do this for each compound form elem and variable elem. Te horizontal @(define) eleme has its own grammar production here, and we handle its parameter list in that rule. (define_clause): Handle the parameters of a vertical @(define). It binds pattern variables, and so we must suppress unbound warnings for those. (catch_clauses_opt): Process the parameters bound by @(catch) clauses. (output_clause): Suppress warnings for the variables nominated by any :into or :named argument. (expand_repeat_rep_args): Suppress warnings for :counter variable, and for :vars variables. (parse_once): Remove the warning-muffling handler frame set up around the yyparse call. * txr.c (txr_main): Suppress warnings for TXR variables defined using -D syntax on the command line. Dump deferred warnings after parsing a .txr file.
* Improve accuracy of expansion of repeat/rep args.Kaz Kylheku2017-01-221-8/+10
| | | | | | | * parser.y (expand_repeat_rep_args): Correctly handle situation when :counter or :vars appears as an argument to another keyword. (A warning might be generated here, since this situation is wrong.)
* bugfix: expand macros in a number of directives.Kaz Kylheku2017-01-211-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | This is the last round of changes on this topic, bringing proper macro expansion to the arguments to @(skip), @(fuzz), @(next), @(call), @(cat), @(load) and @(close). * match.c (match_expand_keyword_args): Only process the keyword arguments if they are followed by an argument. Process @(next) arguments here too: :list and :string take a Lisp expression, but :tlist and :var take an argument which is not a Lisp expression and must be handled properly. Also, expand any non-keyword expression. This handles the <source> argument of @(next). (match_expand_elem): New function. * match.h (match_expand_elem): Declared. * parser.h (expand_meta): Declared. * parser.y (expand_meta): Static function becomes external. (elem): Expand elem other than require or do using match_expand_elem. We don't fold require and do into this because match_expand_elem has a backward compat switch in it that doesn't apply to these.
* bugfix: expand dest arg of @(output).Kaz Kylheku2017-01-211-1/+12
| | | | | * parser.y (expand_form_ver): New inline function. (output_clause): If exprs are present, expand first one.
* Expand lisp forms in @(mod) and @(modlast) args.Kaz Kylheku2017-01-191-4/+15
| | | | | | * parser.y (expand_forms_ver): New function. (repeat_parts_opt, rep_parts_opt): Expand the exprs_opt that follow MOD or MODLAST.
* Bugfix: expand macros in collect, coll, gather.Kaz Kylheku2017-01-191-16/+24
| | | | | | | | | | | | | | | In the argument lists of @(collect)/@(repeat), @(coll)/@(rep) and @(gather), Lisp expressions can appear as arguments to keywords or for supplying default values for variables. These are not being macro-expanded. * match.c (match_expand_vars): New static function. (match_expand_keyword_args): New function. * match.h (match_expand_keyword_args): Declared. * parser.y (gather_clause, collect_clause, elem): Use new function in match.c to expand the argument lists.
* Eliminate rejection of empty clauses.Kaz Kylheku2017-01-081-87/+27
| | | | | | * parser.y (grammar): Remove all checks which raise a syntax error if a clause is empty. These reject some correct situations, getting in the programmer's way.
* Bugfix: incorrect quasi-quoting over #R syntax.Kaz Kylheku2016-12-251-4/+0
| | | | | | | | | | | | | | | | | | | The issue is that ^#R(,(+ 2 2) ,(+ 3 3)) produces 4..6 rather than #R(4..6). 4..6 is, of course, the syntax (rcons 4 6) which evaluates to a range. Here we want the range to which it evaluates, not the syntax. * eval.c (expand_qquote_rec): Handle the case when the qquoted_form is a range atom: expand the from and to parts, and generate a rcons expression. Though this seems to be opposite to the previous paragraph, it's the right thing. * parser.y (range): Drop the unquotes_occurs case which produces rcons syntax. Produce a range object, always. This is the source of the problem: a (rcons ...) expression was produced here which was just traversed by the qquote expander as list. It's the expander that must produce the rcons expression.