summaryrefslogtreecommitdiffstats
path: root/match.c
Commit message (Collapse)AuthorAgeFilesLines
* Copyright year bump 2024.Kaz Kylheku2024-01-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, autoload.c, autoload.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c, gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/csort.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/expander-let.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/glob.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/load-args.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2024.
* txr: bug in handling @{nil ...} variable match.Kaz Kylheku2023-12-171-1/+3
| | | | | | | | | | | | | | | | | | | | | | | In March 2012, b7f1f4c5bbea86e288b6a4d68595c1d2d07217bd introduced the feature that the @nil variable matches and discards. This was incompletely implemented. Some cases of a nil variable with modifiers fail to match. * match.c (dest_bind): This function must correctly handle the case when pattern is nil: it should just return bindings without extending them. If the pattern is any nonbindable symbol, it should indicate a failed match using t. The logic has not been touched since 2009, at which time an additional bogosity was introduced of calling funcall(testfun, pattern, value) when pattern is a non-bindable symbol. If value is a string, that could never work. Possibly the idea is that the value could come from a symbol-valued expression, such as one producing a keyword symbol. We are not going to support that, unless someone complains. * tests/000/nilvar.txr, tests/000/nilvar.expected: New files, providing a test case that fails without this commit.
* New @(push) directive.Kaz Kylheku2023-06-121-4/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | @(push) is like @(output), but feeds back into input. Use carefully. * parser.y (PUSH): New token. (output_push): New nonterminal symbol. (output_clause): Handle OUTPUT or PUSH via output_push. Some logic moved to output_helper. (output_helper): New function. Transforms both @(output) and @(push) directives. Checks both for valid keywords; push has only :filter. * parser.l (grammar): Recognize @(push similarly to other directives. * lib.[ch] (push_s): New symbol variable. * match.c (v_output_keys): Internal linkage changes to external. (v_push): New function. (v_parallel): We must fix the max_line algorithm not to use an initial value of zero, because lines can go negative thanks to @(push). We end up rejecting the pushed data. (v_collect): We can no longer assert that the data line number doesn't retreat. (dir_tables_init): Register push directive in table of vertical directives. * match.h (append_k, continue_k, finish_k): Existing symbol variables declared. (v_output_keys): Declared. * y.tab.c.shipped, * y.tab.h.shipped, * lex.yy.c.shipped: Updated. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
* load: now establishes a block named load.Kaz Kylheku2023-05-311-3/+4
| | | | | | | | | | | | | | | | | | | | | | Files loaded from the command line or via the load function or @(load) directive now have a block named load in scope. Thus (return-from load <expr>) may be used to abort a load when it is finished. The load function will then return the value of <expr>. * eval.c (load): Bind a load block around the whole thing and use the captured return value as the function's return value. * match.c (v_load): Bind the load block here too. Ensure that if the block return is taken, the ret variable contains next_spec_k, so that processing continues with whatever directive follows the @(load). * txr.c (txr_main): Bind the block in severl cases that load code. * txr.1: Documented.
* txr: bugfix, allow lazy lists in multi match.Kaz Kylheku2023-02-271-0/+1
| | | | | | | | | * match.c (do_match_line): Handle LCONS the same way as CONS. When a variable occurs whose value is a list of strings, that may be a lazy list. I ran into a problem using @(data x) to capture a list of strings, and then matching that with @x; the error being "unsupported object in spec", caused by the list's LCONS type not handled in this switch.
* Copyright year bump 2023.Kaz Kylheku2023-01-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, autoload.c, autoload.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c, gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2023.
* txr: close streams.Kaz Kylheku2022-08-291-15/+59
| | | | | | | | | | | | | | | | | | | * match.c (noclose_k): New keyword variable. (v_next_keys, v_output_keys): New static variables. (v_next_impl): Use v_next_keys in calculating alist, rather than freshly allocating it each time. Check for the new :noclose keyword; if it is missing, close any locally opened stream when done. (v_output): Refer to v_output_keys precalculated list rather than allocating it every time. (match_files): If a stream is opened in by a call to open_data_source from this function, then the stream is closed when this function returns. (syms_init): Intern the :noclose symbol. (plist_keys_init): New function. (match_init): Call plist_keys_init. * txr.1: Documented new :noclose option of @(next).
* load/@(load): use path_cat.Kaz Kylheku2022-04-251-3/+1
| | | | | | | * eval.c (load): Use path_cat and dir_name instead of ad hoc path munging. * match.c (v_load): Likewise.
* New: load can search multiple directories.Kaz Kylheku2022-04-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * eval.c (load_search_dirs_s): New symbol variable. (load): Initialize the name variable whose address is passed as the third argument of open_txr_file, which is now an in-out parameter. Pass t for the new search_dirs parameter, so that load benefits from the searching. (eval_init): Initialize load_search_dirs_s and register the *load-search-dirs* special variable. * eval.h (load_search_dirs_s): Declared. (load_search_dirs): New macro. * match.c (v_load): Initialize the variable passed as third argument of open_txr_file. * parser.c (open_txr_file): Take a new argument, search_dirs. If this is t, it tells the function "if the path is not found, then recurse on the *load-search-dirs* variable. Otherwise, if the value is not t, it is a list of the remaining directories to try. The existing parameter orig_in_resolved_out must now point to a location which is initialized. It is assumed to hold the original target that was passed to the load function. The first_try_path is a the path to actually try, derived from that one. Thus, the caller of open_txr_file gets to determine the initial try path using its own algorithm. Then any recursive calls that go through *load-search-dirs* will pass a first argument which is made of the original name, combined with a search dir. (load_rcfile): Pass pointer to initialized location as third argument of open-txr_file, and pass a nil value for search_dirs: no search takes place when looking for that file, which is at a single, fixed location. * parser.h (open_txr_file): Declaration updated. * txr.c (sysroot_init): Initialize *load-search-dirs*. (txr_main): Ensure third argument in all calls to open_txr_file points to initialized variable, with the correct value, and pass t for the search_dirs argument. * txr.1: Documented. * stdlib/doc-syms.tl: Updated. New: load can search multiple directories. * eval.c (load_search_dirs_s): New symbol variable. (load): Initialize the name variable whose address is passed as the third argument of open_txr_file, which is now an in-out parameter. Pass t for the new search_dirs parameter, so that load benefits from the searching. (eval_init): Initialize load_search_dirs_s and register the *load-search-dirs* special variable. * eval.h (load_search_dirs_s): Declared. (load_search_dirs): New macro. * match.c (v_load): Initialize the variable passed as third * argument of open_txr_file. * parser.c (open_txr_file): Take a new argument, search_dirs. If this is t, it tells the function "if the path is not found, then recurse on the *load-search-dirs* variable. Otherwise, if the value is not t, it is a list of the remaining directories to try. The existing parameter orig_in_resolved_out must now point to a location which is initialized. It is assumed to hold the original target that was passed to the load function. The first_try_path is a the path to actually try, derived from that one. Thus, the caller of open_txr_file gets to determine the initial try path using its own algorithm. Then any recursive calls that go through *load-search-dirs* will pass a first argument which is made of the original name, combined with a search dir. (load_rcfile): Pass pointer to initialized location as third argument of open-txr_file, and pass a nil value for search_dirs: no search takes place when looking for that file, which is at a single, fixed location. * parser.h (open_txr_file): Declaration updated. * txr.c (sysroot_init): Initialize *load-search-dirs*. (txr_main): Ensure third argument in all calls to open_txr_file points to initialized variable, with the correct value, and pass t for the search_dirs argument. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
* Use null_string throughout code base.Kaz Kylheku2022-02-051-3/+3
| | | | | | | | | | | | | | | | * eval.c (load): Use null_string instead of lit(""). * lib.c (obj_init): Likewise. * match.c (LOG_MATCH, LOG_MISMATCH, do_txeval): Likewise. * parser.c (regex_parse, lisp_parse_impl, find_matching_syms): Likewise. * stream.c (do_parse_mode): Likewise. * txr.c (sysroot_init): Likewise. (txr_main): Replace string(L"") with null_string.
* New function: match-fboundp.Kaz Kylheku2022-01-171-0/+5
| | | | | | | | | | | | | | | | | | User vapnik spaknik was asking in the mailing list whether there is an existence test for TXR pattern functions. Now there is. * eval.c (eval_init): Register match-fboundp intrinsic. * match.c (match_fbound): New function. * match.h (match_fbound): Declared. * tests/011/txr-case.txr: New test cases. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
* Copyright year bump 2022.Kaz Kylheku2022-01-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | *LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2022.
* freeform: bug: account for consumed prefix.Kaz Kylheku2022-01-041-1/+1
| | | | | | | | | | | | | | | | | | | | This was reported by user vapnik spaknik. The @(freeform), when reconstituting the unmatched trailing portion of the virtual line back into a list of lines, uses the abstract match position, neglecting to account for the fact that a prefix of the line may have been physically consumed to save memory. * match.c (v_freeform): When calling lazy_str_get_trailing_list, indicate the correct amount of prefix material, by subtracting, from the matching length, the base variable, which indicates how much of the prefix had been consumed. This consumption takes place above 4000 bytes, which is why the freeform test cases are not catching this. * tests/006/freeform-5.txr: New file. * tests/006/freeform-5.expected: New file.
* txr: do not ignore regex in positive match.Kaz Kylheku2021-12-271-39/+17
| | | | | | | | | | | | | | | | | * match.c (h_var): Refactor the logic here a bit. Without regard for whether the variable has a value, we dispatch the regex, fixed field and function cases. These handle the binding against the existing value. Then before all other cases, we check for the existing value and convert that to a literal text match. The effect of this is that now the regular expression is processed even if the variable has a value. * tests/010/span-var.txr: Last two test cases hardened a bit so they cannot fall through to a successful exit, if they invoke the wrong case. This is not related to this change. New test cases for regex span. * txr.1: Updated documentation and compatibility notes.
* txr: function span variable must match existing value.Kaz Kylheku2021-12-271-3/+231
| | | | | | | | | | | | | | | | | | | | * match.c (h_var_compat): New function; verbatim copy of existing h_var prior to this commit. (h_var): If a variable has an existing binding, but is a function spanning match, do not substitute it with text. Handle it with the ordinary case, in which we now use dest_bind instead of cons. (v_var): Similarly, here, we must also use dest_bind, rather than always freshly binding the variable. (match_compat_fixup): For 272 compatibility, substitute h_var_compat for h_var in the horizontal directive table. * tests/010/span-var.txr: New test cases. * txr.1: Documentation updated and also improved overall. The behavior when a variable has an existing value is clarified for the regex and fixed field case. Also update and condense compat notes for 272.
* txr: allow variable to span vertical function.Kaz Kylheku2021-12-261-5/+44
| | | | | | | | | | | | | | | * match.c (v_var_compat, v_var): New static functions. (match_files): No longer recognize v_var specially; it is now handled via vertical table. (dir_tables_init): Register a vertical sys:var directive also via v_var function. (match_compat_fixup): New function. * txr.c (compat): Call match_compat_fixup. * tests/010/span-var.txr: New file. * txr.1: Documented.
* lookup_var: don't pass dyn_env explicitly.Kaz Kylheku2021-09-031-1/+1
| | | | | | | | | | | | | Since lookup_var(nil, ...) skips the environment and goes for dyn_env, there is no need to pass dyn_env explicitly; it's a bit of an anti-pattern. The argument is intended for lexical scopes. * eval.c (eval_exception, expand_eval, load): Pass nil to lookup_var instead of the current dynamic environment. * match.c (v_load): Likewise. * parser.c (txr_parse, read_eval_ret_last): Likewise.
* load: new *load-hooks* feature.Kaz Kylheku2021-09-021-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | *load-hooks* lets a .txr, .tl or .tlo file specify actions to be taken when the loading of that file completes, whether normally or via an exception. They are also honored by process exit. For instance, with this, we can have a Lisp file that behaves like a script which cleans up after itself (e.g. removing temporary files) even if it is not run as a stand-alone program, but invoked via (load ...). Because it's not a stand-alone program, it cannot simply use the at-exit-call mechanism. The unwind-protect operator could be used, but it's inconvenient because it protects a single form. The *load-hooks* feature in effect protects all the top level forms of a load, similarly to unwind-protect. Also, unwind-protect does not guard against a process exit. (However, *load-hooks* does not guard against an abnormal exit, only normal termination). * eval.c (load_hooks_s): New symbol variable. (run_load_hooks): New function. (run_load_hooks_atexit): New static function. (load): bind *load-hooks* to nil around load. Implement the hooks processing via run_load_hooks, taking care to pass the load-time dynamic environment that has already been undone. (eval_init): Initialize load_hooks_s and register the *load-hooks* variable. Register run_load_hooks_atexit with atexit, so the current value of *load-hooks* is processed on process exit. * eval.h (load_hooks_s, run_load_hooks): Declared. * match.c (v_load): Similar changes as in load. * txr.c (txr_main): Run the load hooks with run_load_hooks immediately after processing the .txr or .tl file, before entering the listener. * tests/019/load-hook.tl: New directory and file * tests/load-hook.tl: New file. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
* license: reformat to fit 80 columns.Kaz Kylheku2021-08-161-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c, buf.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h: License reformatted. * lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.
* lazy-stream-cons: control close throwing behavior.Kaz Kylheku2021-08-071-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | * eval.c (eval_init): Update registrations of lazy-stream-cons and get-lines with one more optional argument. * lib.c (simple_lazy_stream_func_nt, lazy_stream_func_nt): New static functions. (lazy_stream_cons): Take a new argument, no_throw_close, defaulting it to nil. When calling close_stream directly, pass the inverted value of no_throw_close. Choose the new _nt functions for the lazy list if no_throw_close is true; those functions pass nil as the second argument of close_stream. * lib.h (lazy_stream_cons): Declaration updated. * match.c (v_next_impl, open_data_source, match_fun): Pass down the nothrow value to lazy_stream_cons, or else nil in situations when that is not applicable or there is no such value. Thus the :nothrow feature of v_next will now not only ensure that there is no exception when opening the stream but also when closing it. Unusual situations encountered when the lazy list reads from the stream still throw. * txr.1: Documented.
* txr: @(eof) takes argument for binding termination status.Kaz Kylheku2021-08-051-21/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We extend the matching context structures to keep track of the underlying stream from which lines are being taken via the lazy list. Then the implementation of the @(eof) directive, when it hits the eof condition, can use this stream to gain access to the termination status. * match.c (match_line_ctx, match_files_ctx): New member, stream. (ml_all): Take stream argument and initialize new member. (h_call, do_match_line): Pass stream argument to h_call. (mf_all, mf_file_data): Take stream argument and initialize new member. (mf_from_ml): Propagate stream from line context to file context. (freeform_prepare, v_next_impl, match_filter, match_fun, extract): Pass stream argument where now needed. (v_eof): Implement termination status binding via the stream stored in the context. (open_data_source): Store stream in match files context. * tests/010/eof-status.txr: New file. * tests/010/eof-status.expected: New file. * Makefile (tst/tests/010/eof-status.ok): -B option for new test. * txr.1: Documented eof directive, argument and all. * stdlib/doc-syms.tl: Updated.
* hash: change make_hash interface.Kaz Kylheku2021-07-221-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The make_hash function now takes the hash_weak_opt_t enumeration instead of a pair of flags. * hash.c (do_make_hash): Take enum argument instead of pair of flags. Just store the option; nothing to calculate. (weak_opt_from_flags): New static function. (tweak_hash): Function removed. (make_seeded_hash): Adjust to new do_make_hash interface with help from weak_opt_from_flags. (make_hash, make_eq_hash): Take enum argument instead of pair of flags. (hashv): Calculate hash_weak_opt_t enum from the extracted flags, pass down to make_eq_hash or make_hash. * hash.h (tweak_hash): Declration removed. (make_hash, make_eq_hash): Declarations updated. * eval.c (me_case, expand_switch): Update make_hash calls to new style. (eval_init): Update make_hash calls and get rid of tweak_hash calls. This renders the tweak_hash function unused. * ffi.c (make_ffi_type_enum, ffi_init): Update make_hash calls to new style. * filter.c (make_trie, trie_add, filter_init): Likewise. * lib.c (make_package_common, obj_init, obj_print): Likewise. * lisplib.c (lisplib_init): Likewise. * match.c (dir_tables_init): Likewise. * parser.c (parser_circ_def, repl, parse_init): Likewise. * parser.l (parser_l_init): Likewise. * struct.c (struct_init, get_slot_syms): Likewise. * sysif.c (get_env_hash): Likewise. * lex.yy.c.shipped, y.tab.c.shipped: Updated.
* type: disallow structs using built-in type names.Kaz Kylheku2021-07-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a big commit motivated by the need to clean up the situation with built-in type symbols, COBJ objects and structs. The struct type system allows struct types to be defined for symbols like regex or str, which are used by built-in or cobj types. This is a bad thing. What is worse, structure instances are COBJ types which identify their type using the COBJ class symbol mechanism. There are places in the C implementation which assume that when a COBJ has a certain class symbol, it is of a certain expected type, which is totally different from and incompatible form a struct instance. User code can define a structure object which will fool that code. There are multiple things going on in this patch. The major theme is that the COBJ representation is changing. Instead of a class symbol, COBJ instances now carry a "struct cobj_class *" pointer. This pointer is obtained by registration via the cobj_register function. All modules must register their class symbols to obtain these class handles, which are then used in cobj() calls for instantiation. The CPTR type was identical to COBJ until now, except for the type tag. This is changing; CPTR objects will keep the old representation with the class symbol. commit 20fdfc6008297001491308849c17498c006fe7b4 Author: Kaz Kylheku <kaz@kylheku.com> Date: Thu Jul 8 19:17:39 2021 -0700 * ffi.h (carray_cls): Declared. * hash.h (hash_cls): Declared. (hash_early_init): Declared. * lib.h (struct cobj_class): New struct. (struct cobj): cls member changing to struct cobj_class *. (struct cptr): New struct, same as previous struct cobj. (union obj): New member cp of type struct cptr, for CPTR. (builtin_type): Declared. (class_check): Declaration moved closer to COBJ-related functions and updated. (cobj_register, cobj_register_super, cobj_class_exists): New functions declared. (cobjclassp, cobj_handle, cobj_ops): Declarations updated. * parser.h (parser_cls): Declared. * rand.h (random_state_cls): Declared. * regex.h (regex_cls): Declared. * stream.h (stream_cls, stdio_stream_cls): Declared. * struct.h (struct_cls): Declared. * tree.h (tree_cls, tree_iter_cls): Declared. * vm.h (vm_desc_cls): Declared. * buf.c (buf_strm, make_buf_stream): Pass stream_cls functions instead of stream_s class symbol. * chksum.c (sha256_ctx_cls, md5_ctx_cls): New static class handles. (sha256_begin, sha256_hash, sha256_end, md5_begin, md5_hash, md5_end): Pass class handles to instead of class symbols. (chksum_init): Initialize class handle variables. * ffi.c (ffi_type_cls, ffi_call_desc_cls, ffi_closure_cls, union_cls): New static class handles. (carray_cls): New global variable. (ffi_type_struct_checked, ffi_type_print_op, ffi_closure_struct_checked, ffi_closure_print_op, make_ffi_type_builtin, make_ffi_type_pointer, make_ffi_type_struct, make_ffi_type_union, make_ffi_type_array, make_ffi_type_enum, ffi_call_desc_checked, ffi_call_desc_print_op, ffi_make_call_desc, ffi_make_closure, carray_struct_checked, carray_print_op, make_carray, cptr_getobj, cptr_out, uni_struct_checked, make_union_common): Pass class handles instead of class symbols. (ffi_init): Initialize class handle variables. * filter.c (regex_from_trie): Use hash_cls class handle instead of hash_s. * gc.c (mark_obj): Split COBJ and CPTR cases since the representation is different. * hash.c (hash_cls, hash_iter_cls): New class handles. (make_similar_hash, copy_hash, gethash_c, gethash_e, remhash, clearhash, hash_count, get_hash_userdata, set_hash_userdata, hashp, hash_iter_init, hash_begin, hash_next, hash_peek, hash_reset, hash_reset, hash_uni, hash_diff, hash_symdiff, hash_isec): Pass class handles instead of class symbols. (hash_early_init): New function. (hash_init): Set the class symbols in the class handles that were created in hash_early_init at a time when these symbols did not exist. * lib.c (nelem): New macro. (cobj_class): New static array. (cobj_ptr): New static pointer. (cobj_hash): New static hash. (seq_iter_cls): New static class handle. (builtin_type_p): New function. (typeof): Struct instances now all carry the same symbol, struct, as their COBJ class symbol. To get their type, we must call struct_type_name. (subtypep): Rearrangement of two cases: let's make the reflexive case first. Adjust code for different location of COBJ class symbol. (seq_iter_init_with_info, seq_begin, seq_next, seq_reset, iter_begin, iter_more, iter_item, iter_step, iter_reset, make_like, list_collect, do_generic_funcall): Use class handles instead of class symbols. (class_check, cobj, cobjclassp, cobj_handle, cobj_ops): Take class handle argument instead of class symbol. (cobj_register, cobj_register_super, cobj_class_exists): New functions. (cobj_populate_hash): New static function. (cobj_print_op): Adjust for different location of class (cptr_print_op, cptr_typed, cptr_type, cptr_handle, cptr_get): cptr functions now refer to obj->cp rather than obj->co. (copy, length, sub, ref, refset, replace, dwim_set, dwim_del, obj_print): Use class handles for various COBJ types rather than class symbols. (obj_init): gc-protect cobj_hash. Initialize seq_iter_cls class symbol and cobj_hash. Populate cobj_hash as the last initialization step. (init): Call hash_early_init immediately after gc_init. diff --git a/lib.c b/lib.c * match.c (do_match_line): Refer to regex_cls class handle instead of regex_s.. * parser.c (parser_cls): New global class handle. (parse, parser_get_impl, lisp_parse_impl, txr_parse, parser_errors): Use class handles instead of class symbols. (parse_init): Initialize parser_cls. * rand.c (random_state_cls): New global class handle. (make_state, random_state_p, make_random_state, random_state_get_vec, random_fixnum, random_float, random): Use class handles instead of class symbols. (rand_init): Initialize random_state_cls. * regex.c (regex_cls): New global class handle. (chset_cls): New static class handle. (reg_compile_csets, reg_derivative, regex_compile, regexp, regex_source, regex_print, regex_run, regex_machine_init): Use class handles instead of class symbols. (regex_init): Initialize regex_cls and chset_cls. * socket.c (make_dgram_sock_stream): Use stream_cls class symbol instead of stream_s. * stream.c (stream_cls, stdio_stream_cls): New class handles. (make_null_stream, stdio_get_fd, make_stdio_stream_common, stream_fd, sock_family, sock_type, sock_peer, sock_set_peer, make_dir_stream, make_string_input_stream, make_string_byte_input_stream, make_strlist_input_stream, make_string_output_stream, make_strlist_output_stream, get_list_from_stream, make_catenated_stream, make_delegate_stream, make_delegate_stream, stream_set_prop, stream_get_prop, close_stream, get_error, get_error_str, clear_error, get_line, get_char, get_byte, get_bytes, unget_char, unget_byte, put_buf, fill_buf, fill_buf_adjust, get_line_as_buf, format, put_string, put_char, put_byte, flush_stream, seek_stream, truncate_stream, get_indent_mode, test_set_indent_mode, test_neq_set_indent_mode, set_indent_mode, get_indent, set_indent, inc_indent, width_check, force_break, set_max_length, set_max_depth): Use class handle instead of symbol. (stream_init): Initialize stream_cls and stdio_stream_cls. * struct.c (struct_type_cls, struct_cls): New class handles. (struct_init): Initialize struct_type_cls and struct_cls. (struct_handle): Static function moved to avoid forward declaration. (stype_handle): Refer to struct_type_cls class handle instead of struct_type_s symbol. Handle instance objects in addition to types. (make_struct_type): Throw error if a built-in type is being defined as a struct type. Refer to class handle instead of class symbol. (find_struct_type, allocate_struct, make_struct_impl, make_lazy_struct, copy_struct): Refer to class handle instead of class symbol. * strudel.c (make_struct_delegate_stream): Refer to stream_cls class handle instead of stream_s symbol. * sysif.c (dir_cls): New class handle. (poll_wrap): Use typep instead of subtypep, eliminating access to class symbol. (opendir_wrap, closedir_wrap, readdir_wrap): Use class handles instead of class symbols. (sysif_init): Initialize dir_cls. * syslog.c (make_syslog_stream): Refer to stream_cls class handle instead of stream_s symbol. * tree.c (tree_cls, tree_iter_cls): New class handles. (tree_insert_node, tree_lookup_node, tree_delete_node, tree_root, tree_equal_op, tree, copy_search_tree, make_similar_tree, treep, tree_begin, copy_tree_iter, replace_tree_iter, tree_reset, tree_next, tree_peek, tree_clear): Use class handle instead of class symbol. (tree_init): Initialize tree_cls and tree_iter_cls. * unwind.c (sys_cont_cls): New static class handle. (revive_cont, capture_cont): Use class handle instead of class symbol. (uw_late_init): Initialize sys_cont_cls. * vm.c (vm_desc_cls): New global class handle. (vm_closure_cls): New static class handle. (vm_desc_struct, vm_make_desc, vm_closure_struct, vm_make_closure, vm_copy_closure): Use class handle instead of class symbol. (vm_init): Initialize vm_desc_cls and vm_closure_cls.
* txr: stack protection in pattern language.Kaz Kylheku2021-06-241-0/+4
| | | | | | | | | * txr.c (do_match_line, match_files): call gc_stack_check on entry. * tests/012/stack2.txr: New file. * tests/012/stack2.expected: New file.
* c_str now takes a self argument.Kaz Kylheku2021-06-231-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding a self parameter to c_str so that when a non-string occurs, the error is reported against a function. Legend: A - Pass existing self to c_str. B - Define self and pass to c_str and possibly other functions. C - Take new self parameter and pass to c_str and possibly other functions. D - Pass existing self to c_str and/or other functions. E - Define self and pass to other functions, not c_str. X - Pass nil to c_str. * buf.c (buf_strm_put_string, buf_str): B. * chksum.c (sha256_str, md5_str): C. (sha256_hash, md5_hash): D. * eval.c (load): D. * ffi.c (ffi_varray_dynsize, ffi_str_put, ffi_wstr_put, ffi_bstr_put): A. (ffi_char_array_put, ffi_wchar_array_put): C. (ffi_bchar_array_put): A. (ffi_array_put, ffi_array_out, ffi_varray_put): D. * ftw.c (ftw_wrap): A. * glob.c (glob_wrap): A. * lib.c (copy_str, length_str, coded_length,split_str_set, list_str, cmp_str, num_str, out_json_str, out_json_rec, display_width): B. (upcase_str, downcase_str, string_extend, search_str, do_match_str, do_rmatch_str, sub_str, replace_str, cat_str_append, split_str_keep, trim_str, int_str, chr_str, span_str, compl_span_str, break_str, length_str_gt, length_str_ge, length_str_lt, length_str_le, find, rfind, pos, rpos, mismatch, rmismatch): A. (c_str): Add self parameter and use in type mismatch diagnostic. If the parameter is nil, use "internal error". (flo_str): B, and correction to "flot-str" typo. (out_lazy_str, out_quasi_str, obj_print_impl): D. * lib.h (c_str): Declaration updated. * match.c (dump_var): X. (v_load): D. * parser.c (open_txr_file): C. (load_rcfile): E. (find_matching_syms, provide_atom): X. (hist_save, repl): B. * parser.h (open_txr_file): Declaration updated. * parser.y (chrlit): X. * regex.c (search_regex): A. * socket.c (getaddrinfo_wrap, sockaddr_pack): A. (dgram_put_string): B. (open_sockfd): D. (sock_connect): E. * stream.c (stdio_put_string, tail_strategy, vformat_str, open_directory, open_file, open_tail, remove_path, rename_path, tmpfile_wrap, mkdtemp_wrap, mkstemp_wrap): B. (do_parse_mode, parse_mode, make_string_byte_input_stream): B. (normalize_mode, normalize_mode_no_bin): E. (string_out_put_string, formatv, put_string, open_fileno, open_subprocess, open_command, base_name, dir_name, short_suffix, long_suffix): A. (run): D. (win_escape_cmd, win_escape_arg): X. * stream.h (parse_mode, normalize_mode, normalize_mode_no_bin): Declarations updated. * sysif.c (mkdir_wrap, do_utimes, dlopen_wrap, dlsym_wrap, dlvsym_wrap): A. (do_stat, do_lstat): C. (mkdir_nothrow_exists, ensure_dir): E. (chdir_wrap, rmdir_wrap, mkfifo_wrap, chmod_wrap, symlink_wrap, link_wrap, readlink_wrap, exec_wrap, getenv_wrap, setenv_wrap, unsetenv_wrap, getpwnam_wrap, getgrnam_wrap, crypt_wrap, fnmatch_wrap, realpath_wrap, opendir_wrap): B. (stat_impl): statfn pointer-to-function argument now takes self parameter. When calling it, we pass name. * syslog.c (openlog_wrap, syslog_wrapv): A. * time.c (time_string_local, time_string_utc, time_string_meth, time_parse_meth): A. (strptime_wrap): B. * txr.c (txr_main): D. * y.tab.c.shipped: Updated.
* txr: gather: report list of missing required vars.Kaz Kylheku2021-04-131-2/+8
| | | | | * match.c (v_gather): Identify all required variables that are missing, and list them all in the diagnostic.
* txr: tighten keyword arg parsing in @(next).Kaz Kylheku2021-02-221-13/+17
| | | | | | | | | | | | This fixes the bug of not allowing @(next :list nil). Also the bug of @(next :string) or @(next :string nil) reporting a nonsensical error message, instead of correctly complaining that nil isn't a string. * match.c (v_next_impl): Capture the conses returned by assoc, so we know whether a keyword is present. Then use those conses in tests which detect whether the keyword is present, rather than the values, which could be nil.
* txr: typo in comment.Kaz Kylheku2021-02-221-1/+1
| | | | * match.c (match_files): Fix bungled wording.
* txr: bugfix: give @(call) same semantics as direct call.Kaz Kylheku2021-02-221-10/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | The @(call) directive is buggy in the following ways, which cause an indirect call to behave differently from a direct call. It creates a new context, and so if the opening of a data source is deferred into the indirectly called function, that data source is lost when the indirect call terminates. Furthermore, if a data source is already established, there is no progress through the data: two consecutive @(call ...) directives operate on the same data. It also fails to implement vertical to horizontal fallback; if a function is not vertically defined, the directive fails. * match.c (v_call): Rewrite the core logic in the following way: we rewrite the indirect @(call) syntax into direct call syntax, substitute that into c->spec, and then just call v_fun. * tests/008/call-2.expected: New file. * tests/008/call-2.txr: New file. Test fails before this commit because both calls are matching against the same "A" element of the list.
* txr: pattern function calls are non-matching.Kaz Kylheku2021-02-211-22/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch causes TXR to treat calls to verticatl functions, as well as the @(call) directive to be considered non-matching directives, so that opening the data source is deferred. This allows included .txr files to call the funtions that they define, without the side effect of standard input being read. * match.c (open_data_source): Function refactored to reduce duplication. c->data is checked first, and if it is not t, nothing is done, making the function cheaper in the frequent case. The non_matching_dir condition changes. We now check that the first element of the first spec is a non-nil symbol. If it has a function binding as a vertical function, then that is considered non_matching. (dir_tables_init): Treat @(call) as a non-matching directive. * Makefile (tst/tests/008/no-stdin-hang.ok): Add -n argument for non-interactive, which will cause stdin to be read in that test case if there is a regression in this change. If make tests is run in a terminal, this will hang make tests. * tests/no-stdin-hang.txr: New file. * tests/no-stdin-hang.expected: New file.
* @(call): bugfix: matching doesn't continue.Kaz Kylheku2021-02-171-1/+2
| | | | | | | | | | | * match.c (v_call): This function must propagate the next_spec_k return value, rather than return a short-circuiting match object, which causes the parent to immediately succeed. * tests/008/call.txr: New test case, from Frank Schwidom. * tests/008/call.expected: Likewise.
* @(rebind): bugfix: don't clobber right side variable.Kaz Kylheku2021-01-301-1/+1
| | | | | | | | | | | | | | | | | | | | | * Makefile (tst/tests/000/binding.ok): Pass -B to txr for this new test. * match.c (v_rebind): Fix gaping copy-and-paste bug here, which causes rebind to take on the behavior of local/forget; it takes all symbols that appear as its arguments from the environment and produces an environment in which they don't exist. What we want is to remove the left variables from the environment, and since that is a nested pattern, the right way to do that is to flatten it. Bug reported by Frank Schwidom. * tests/000/binding.txr: New file. * tests/000/binding.expected: New file. * txr.1: Improve documentation of @(rebind), also making improvements in @(set) documentation.
* Copyright year bump 2021.Kaz Kylheku2021-01-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * METALICENSE: 2020 copyrights bumped to 2021. Added note about SHA-256 routines from Colin Percival. * LICENSE, LICENSE-CYG, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/asm.tl, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/compiler.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/copy-file.tl, share/txr/stdlib/debugger.tl, share/txr/stdlib/defset.tl, share/txr/stdlib/doloop.tl, share/txr/stdlib/each-prod.tl, share/txr/stdlib/error.tl, share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl, share/txr/stdlib/package.tl, share/txr/stdlib/param.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/pmac.tl, share/txr/stdlib/quips.tl, share/txr/stdlib/save-exe.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/trace.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/vm-param.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2021.
* txr, eval: eliminate some func_n1 calls.Kaz Kylheku2020-12-101-3/+3
| | | | | | | | | * match.c (do_output_line, do_repeat, v_deffiler): Replace func_n1(cdr) and func_n1(rest) with cdr_f. * eval.c (eval_init): Replace func_n1(car) and func_n1(cdr) with car_f and cdr_f. Should have been done in 2011 when this was done for the registrations of car and cdr.
* env: move function to sysif.cKaz Kylheku2020-10-161-0/+1
| | | | | | | | | | | | | | | | | * lib.c (env_list): Static variable removed; now appears in sysif.c (env): Function removed; now in sysif.c. (obj_init): Don't gc-protect env_list here any more. * lib.h (env): Declaration removed. * match.c: Must include "sysif.h" now for env. * sysif.c (env_list): Static variable moved here. (env): Function moved here. (sysif_init): env_list gc-protected here. * sysif.h (env): Declared.
* tags: address small issue with tag lookup.Kaz Kylheku2020-09-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Exuberant Ctags uses the full content of one line as the key to find a tag. A function declaration that is split into multiple lines can have a first line which is identical to the definition, as in: static int add(int a, int b); static int add(int a, int b) { return a + b; } Here, the search key which ctags uses for the add function is "static int add(int a,", taken from the definition. But it's exactly the same as a the first line of the declaration, and that is what Vim jumps to for that tag. A few function declarations in TXR have this issue. * eval.c (expand_params_rec, do_eval): Make the first line of the forward declaration different from the first line of the definition. * match.c (mf_all): Likewise. * struct.c (make_struct_type_compat): Likewise.
* txr: identify output repeat vars at parse timeKaz Kylheku2020-08-171-51/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | Up to now the @(repeat) and @(rep) directives have scanned their interior to find output variables. We now hoist that into the parser; the variables are found up-front and integrated into the abstract syntax. This work anticipates doing a more proper job of identifying free variables in Lisp expressions. * match.c (extract_vars): Delete static this function here. It moves into parser.y. (extract_bindings): Don't call extract_vars to obtain the variables occurring in the repeat body. Get it as a new argument, occur_vars. (do_output_line, do_repeat): Extract the new occur_vars element of the abstract syntax node and pass it to extract_bindings. * parser.y (extract_vars): New static function, moved here from match.c. (repeat_rep_helper): Scan over each of the repeat subclauses with extract vars. Catenate the resulting lists together and pass through uniq to squash repeated symbols. Then add the resulting vars as a new element of the returned syntax node.
* Remove unnecessary forward declarations.Kaz Kylheku2020-08-171-1/+0
| | | | | | * lib.c (length_proper_list): Forward decl removed. * match.c (do_match_line): Likewise.
* match.c: revert bad commits.Kaz Kylheku2020-08-141-158/+118
| | | | | | | | | | | | | | | | | | | This contains two reverts, folded into one. "txr: spurious retention in @(next)." commit fe1e960389a89f481d46c02aa040fdc762da735f. "txr: avoid by-value match_files_ctx passing." commit 3d7330b827d6e9cc0d9e87edd30388374cb45900. These are dependent. The second of these two (i.e. the one that was done first) is the problematic one. By-reference passing of contexts means contexts become shared objects between callers and callees. But this is incorrect, because mutations are going on, like c->spec = rest(c->spec) or c->data = rest(c->data). These mutations were done under the assumption of safety: that the caller is not affected due to by-value semantics of the entire context.
* Change noreturn to NORETURN.Kaz Kylheku2020-08-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The noreturn macro is respelled to harmonize with the upper-case INLINE and NOINLINE. * lib.h (noreturn): Rename to NORETURN. * arith.c (not_number, not_integer, invalid_ops, invalid_op): Declaration updated. * arith.c (do_mp_error): Likewise. * eval.c (eval_error, no_bindable_error, dotted_form_error): Likewise. * eval.h (eval_error): Likewise. * lib.c (unsup_obj, callerror, wrongargs): Likewise. * match.c (sem_error): Likewise. * stream.c (unimpl, unimpl_put_string, unimpl_put_char, unimpl_put_byte, unimpl_get_line, unimpl_get_char, unimpl_get_byte, unimpl_unget_char, unimpl_unget_byte, unimpl_put_buf, unimpl_fill_buf, unimpl_seek, unimpl_truncate, unimpl_get_sock_family, unimpl_get_sock_type, unimpl_get_sock_peer, unimpl_set_sock_peer): Likewise. * struct.c (no_such_struct, no_such_slot, no_such_static_slot): Likewise. * unwind.h (jmp_restore, uw_throw, uw_throwf, uw_errorf, uw_errorfv, type_mismatch): Likewise.
* txr: support @(if)/@(elif)/@(else) in @(output).Kaz Kylheku2020-07-071-0/+14
| | | | | | | | | | | | | | | | | | | This turns out to be way easier than I thought. * match.c (do_output_if): New static function. (do_output): Handle if via do_output_if. * parser.y (out_if_clause, out_elif_clauses_opt, out_else_clause_opt): New nonterminal symbols and grammar rules. (out_clause): Now produces out_if_clause. (not_a_clause): Remove ELIF and ELSE; these entries here cause conflicts now. Here, continue to recognize the Lisp if, which is distinguished by having at least two arguments. out_if_clause matches only a one-argument if, and a no-argumeent one that is diagnosed as erroneous. * txr.1: Documented.
* txr: factor repeat out of output.Kaz Kylheku2020-07-071-97/+107
| | | | | * match.c (do_repeat): New static function. (do_output): Repeat processing logic moved into do_repeat.
* c_num: now takes self argument.Kaz Kylheku2020-06-291-23/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The c_num and c_unum functions now take a self argument for identifying the calling function. This requires changes in a large number of places. In a few places, additional functions acquire a self argument. The ffi module has the most extensive example of this. Some functions mention their name in a larger string, or have scattered literals giving their name; with the introduction of the self local variable, these are replaced by references to self. In the following changelog, the notation TS stands for "take self argument", meaning that the functions acquires a new "val self" argument. The notation DS means "define self": the functions in question defines a self variable, which they pass down. The notation PS means that the functions pass down an existing self variable to functions that now require it. * args.h (args_count): TS. * arith.c (c_unum, c_num): TS. (toint, exptv): DS. * buf.c (buf_check_len, buf_check_alloc_size, buf_check_index, buf_do_set_len, replace_buf, buf_put_buf, buf_put_i8, buf_put_u8, buf_put_char, buf_put_uchar, buf_get_bytes, buf_get_i8, buf_get_u8, buf_get_cptr, buf_strm_get_byte_callback, buf_strm_unget_byte, buf_swap32, str_buf, buf_int, buf_uint, int_buf, uint_buf): PS. (make_duplicate_buf, buf_shrink, sub_buf, buf_print, buf_pprint): DS. * chskum.c (sha256_stream_impl, sha256_buf, crc32_buf, md5_stream_impl, md5_buf): TS. (chksum_ensure_buf, sha256_stream, sha256, sha256_hash, md5_stream, md5, md5_hash): PS. (crc32_stream): DS. * combi.c (perm_while_fun, perm_gen_fun_common, perm_str_gen_fun, rperm_gen_fun, comb_vec_gen_fun, comb_str_gen_fun, rcomb_vec_gen_fun, rcomb_str_gen_fun): DS. * diff.c (dbg_clear, dbg_set, dbg_restore): DS. * eval.c (do_eval, gather_free_refs, maprodv, maprendv, maprodo, do_args_apf, do_args_ipf): DS. (op_dwim, me_op, map_common): PS. (prod_common): TS. * ffi.c (struct txr_ffi_type): release member TS. (make_ffi_type_pointer): PS and release argument TS. (ffi_varray_dynsize, ffi_array_in, ffi_array_put_common, ffi_array_get_common, ffi_varray_in, ffi_varray_null_term): PS. (ffi_simple_release, ffi_ptr_in_release, ffi_struct_release, ffi_wchar_array_get, ffi_array_release_common, ffi_array_release, ffi_varray_release): TS. (ffi_float_put, double_put, ffi_be_i16_put, ffi_be_u16_put, ffi_le_i16_put, ffi_le_u16_put, ffi_be_i32_put, ffi_be_u32_put, ffi_le_i32_put, ffi_sbit_put, ffi_ubit_put, ffi_buf_d_put, make_ffi_type_array, make_ffi_type_enum, ffi_type_compile, make_ffi_type_desc, ffi_make_call_desc, ffi_call_wrap, ffi_closure_dispatch_save, ffi_put_into, ffi_in, ffi_get, ffi_put, carray_set_length, carray_blank, carray_buf, carray_buf_sync, carray_cptr, carray_refset, carray_sub, carray_replace, carray_uint, carray_int): PS. (carray_vec, carray_list): DS. * filter.c (url_encode, url_decode, base64_stream_enc_impl): DS. * ftw.c (ftw_callback, ftw_wrap): DS. * gc.c (mark_obj, gc_set_delta): DS. * glob.c (glob_wrap): DS. * hash.c (equal_hash, eql_hash, eq_hash, do_make_hash, hash_equal, set_hash_traversal_limit, gen_hash_seed): DS. * itypes.c (c_i8, c_u8, c_i16, c_u16, c_i32, c_u32, c_i64, c_u64, c_short, c_ushort, c_int, c_uint, c_long, c_ulong): PS. * lib.c (seq_iter_rewind): TS and becomes internal. (seq_iter_init_with_info, seq_setpos, replace_str, less, replace_vec, diff, isec, obj_print_impl): PS. (nthcdr, equal, mkstring, mkustring, upcase_str, downcase_str, search_str, sub_str, cat_str, scat2, scat3, fmt_join, split_str_keep, split_str_set, trim_str, int_str, chr_int, chr_str, chr_str_set, vector, vecref, vecref_l, list_vec, copy_vec, sub_vec, cat_vec, lazy_str_put, lazy_str_gt, length_str_ge, length_str_lt, length_str_le, cptr_size_hint, cptr_int, out_lazy_str, out_quasi_str, time_string_local_time, time_string_utc, time_fields_local_time, time_fields_utc, time_struct_local, time_struct_utc, make_time, time_meth, time_parse_meth): DS. (init_str, cat_str_init, cat_str_measure, cat_str_append, vscat, time_fields_to_tm, time_struct_to_tm, make_time_impl): TS. * lib.h (seq_iter_rewind): Declaration removed. (c_num, c_unum, init_str): Declarations updated. * match.c (LOG_MISMATCH, LOG_MATCH): PS. (h_skip, h_coll, do_output_line, do_output, v_skip, v_fuzz, v_collect): DS. * parser.c (parser, circ_backpatch, report_security_problem, hist_save, repl, lino_fileno, lino_getch, lineno_getl, lineno_gets, lineno_open): DS. (parser_set_lineno, lisp_parse_impl): PS. * parser.l (YY_INPUT): PS. * rand.c (make_random_state): PS. * regex.c (print_rec): DS. (search_regex): PS. * signal.c (kill_wrap, raise_wrap, get_sig_handler, getitimer_wrap, setitimer_wrap): DS. * socket.c (addrinfo_in, sockaddr_pack, fd_timeout, to_connect, open_sockfd, sock_mark_connected, sock_timeout): TS. (getaddrinfo_wrap, dgram_set_sock_peer, sock_bind, sock_connect, sock_listen, sock_accept, sock_shutdown, sock_send_timeout, sock_recv_timeout, socketpair_wrap): DS. * stream.c (generic_fill_buf, errno_to_string, stdio_truncate, string_out_put_string, open_fileno, open_command, base_name, dir-name): DS. (unget_byte, put_buf, fill_buf, fill_buf_adjust, get_line_as_buf, formatv, put_byte, test_set_indent_mode, test_neq_set_indent_mode, set_indent_mode, set_indent, inc_indent, set_max_length, set_max_depth, open_subprocess, run ): PS. (fds_subst, fds_swizzle): TS. * struct.c (make_struct_type, super, umethod_args_fun): PS. (method_args_fun): DS. * strudel.c (strudel_put_buf, strudel_fill_buf): DS. * sysif.c (errno_wrap, exit_wrap, usleep_wrap, mkdir_wrap, ensure_dir, makedev_wrap, minor_wrap, major_wrap, mknod_wrap, mkfifo_wrap, wait_wrap, wifexited, wexitstatus, wifsignaled, wtermsig, wcoredump, wifstopped, wstopsig, wifcontinued, dup_wrap, close_wrap, exit_star_wrap, umask_wrap, setuid_wrap, seteuid_wrap, setgid_wrap, setegid_wrap, simulate_setuid_setgid, getpwuid_wrap, fnmatch_wrap, dlopen_wrap): DS. (chmod_wrap, do_chown, flock_pack, do_utimes, poll_wrap, setgroups_wrap, setresuid_wrap, setresgid_wrap, getgrgid_wrap): PS. (c_time): TS. * sysif.h (c_time): Declaration updated. * syslog.c (openlog_wrap, syslog_wrap): DS. * termios.c (termios_pack): TS. (tcgetattr_wrap, tcsetattr_wrap, tcsendbreak_wrap, tcdrain_wrap, tcflush_wrap, tcflow_rap, encode_speeds, decode_speeds): DS. * txr.c (compato, array_dim, gc_delta): DS. * unwind.c (uw_find_frames_by_mask): DS. * vm.c (vm_make_desc): PS. (vm_make_closure, vm_swtch): DS.
* pattern lang: vertical-horizontal fallback regression.Kaz Kylheku2020-06-291-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Commit c9cab7138636c6c1d6e47f8d1a4053bec2dd0ad4, on Feb 1, 2019, breaks the following case: @(define horiz)whatever@(end) @(horiz) The code assumes that if v_fun returned :decline, then the function doesn't exist. That is false, because the function may have a horizontal definition, as in the above example. That's why in the original code, nothing was done in this case to just allow the flow to proceed to the horizontal fallback, where the call will be tried as a horizontal function. The small problem with that was lack of a diagnosis when the function actually doesn't exist (neither vertical nor horizontal). In that case, the horizontal fallback will expect a line of data. If no data arrives, then the undefined function call is undiagnosed. The idea behind c9cab7138636c6c1d6e47f8d1a4053bec2dd0ad4 was to provide a diagnostic immediately. The right way to do that is to check for a horizontal definition of the function. If there isn't one, then error out, otherwise fall back on the horizontal processing.
* Remove unnecessary #include directives.Kaz Kylheku2020-04-221-2/+0
| | | | | | | | | | Time for some spring cleaning. * args.c, arith.c, buf.c, cadr.c, chksum.c, debug.c, ftw.c, gc.c, gencadr.txr, glob.c, hash.c, lisplib.c, match.c, parser.c, parser.l, parser.y, rand.c, signal.c, stream.c, strudel.c, syslog.c, tree.c, unwind.c, utf8.c, vm.c: Numerous unnecessary #include directives removed.
* load: release warnings before throwing exception.Kaz Kylheku2020-04-141-1/+3
| | | | | | | | | | | | * eval.c (load): When we parse TXR code, let's not release warnings unconditionally. Let's do that when throwing an exception though due to parse errors. If the load is not recursed it will release warnings at the bottom of the function. * match.c (v_load): Consistently with load, release deferred warnings if throwing exception due to the parse having failed.
* txr: spurious retention in @(next).Kaz Kylheku2020-04-111-3/+16
| | | | | | | | | | | | | | * match.c (mf_file_lazy): New static function. The lazy list is created here and stored directly into the data field of the context structure. Function is marked NOINLINE because on an older system with gcc 4.4.5, it didn't solve the problem. We need the function to have a stack frame so any spurious copies of the linked list go into a frame that disappears when the function returns. (v_next_impl): Use mf_file_lazy instead of mf_file_data. (open_data_source): Also make this function INLINE just in case because it contains calls to lazy_stream_cons.
* txr: avoid by-value match_files_ctx passing.Kaz Kylheku2020-04-111-117/+144
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Passing the match_files_ctx structure by value looks nice in the code, but it is contributing to a long standing false retention issue. This test case runs in constant memory: $ yes | txr -c '@(skip) n' skip scans through the lazy list of "y" lines looking for an "n" that never comes. This version should also run in constant memory, but shows unbounded memory growth. $ yes | txr -c '@(next *stdin*) @(skip) n' This patch doesn't fix it, but it moves things in that direction. * match.c (mf_all, mf_args, mf_data, mf_spec, mf_spec_bindings, mf_file_data, mf_from_ml): Take pointer to structure which to initialize and return that pointer, instead of initializing a local structure and returning it by value. (match_files): Take pointer to context rather than copy. (h_call, do_match_line, v_fuzz, v_block, v_next_impl, v_parallel, v_gather, v_collect, v_bind, hv_trampoline, v_try, v_fun, v_if, v_assesrt, v_load, v_call, match_filter, match_fun, extract): Adjust to by-pointer context handling of mf_all, match_files and other functions.
* exceptions: improve non-error @(throw) and @(assert).Kaz Kylheku2020-04-071-3/+4
| | | | | | | | | | | | | | | When @(throw) generates a non-error exception that is unhandled, we just want it to continue. In the same situation, an @(assert) should behave as a failed match; that is, the failure of the query material that follows the assert, which activated it, should propagate through the assert. * match.c (v_throw): Return next_spec_k if uw_rthrow returns. (v_assert, h_assert): Return nil if uw_rthrow returns. * txr.1: Expanded @(throw) and @(assert) documentation with discussion of unhandled exceptions.
* exceptions: use uw_rthrow for non-error exceptions.Kaz Kylheku2020-04-071-3/+3
| | | | | | | | | | | | | | | | | | | * eval.c (eval_exception): This function is shared by warnings and errors. Use uw_throw. The eval_error caller already has an abort() after its eval_exception call, which makes that code path continue to be equivalent to uw_throw. The behavior changes for the other caller, eval_warn, which will now return if the warning is not handled. (eval_defr_warn, gather_free_refs, gather_free_refs_nw): Throw non-error exception with uw_rthrow. * match.c (v_throw, v_assert, h_assert): Use uw_rthrow for these directives, just like the throw function. * parser.c (repl_intr, repl_warning): Use uw_rthrow. * unwind.c (uw_muffle_warning, uw_release_deferred_warnings): Likewise.