path: root/match.c
Commit message | Author | Age | Files | Lines
* @(next) takes Lisp expression as source now. | Kaz Kylheku | 2015-11-20 | 1 | -1/+14
  * match.c (v_next): Evaluate the source expression as TXR Lisp, unless it is a meta-expression or meta-variable, or the compatibility option is set to 124 or lower. In those cases treat it as an expression of the TXR Pattern language.
  * txr.1: Updated documentation of @(next) and all relevant examples of @(next) everywhere. Added compatibility notes.
* @(rep) as shorthand for @(coll :vars nil). | Kaz Kylheku | 2015-11-20 | 1 | -2/+11
  * match.c (h_coll): Check for rep symbol, and handle similarly to v_coll. Use symbol in error message. (dir_tables_init): Bind rep symbol to h_coll.
  * parser.y (elems): Don't generate rep_elem phrase structure for the sake of catching "rep outside of output"; this production now conflicts with the intent to allow this. (elem): Add various REP productions which are clones of COLL.
  * txr.1: Documented new @(rep) usage.
* Use symbol in diagnostics in collect. | Kaz Kylheku | 2015-11-20 | 1 | -9/+9
  * match.c (v_collect): Don't use hard-coded "collect" in diagnostics because the symbol can be repeat.
* Implementing *print-base* and ~d format directive. | Kaz Kylheku | 2015-11-14 | 1 | -36/+36
  * debug.c (show_bindings): Use ~d for level, so as not to be influenced by *print-base*. (debug): Use ~d for line numbers.
  * lib.c (gensym): Use ~d conversion specifier for formatting gensym counter into symbol name.
  * match.c (LOG_MISMATCH, LOG_MATCH): Use ~d for line number references. (h_skip, h_coll, h_fun, h_chr, match_line_completely, v_skip, v_fuzz, v_gather, v_collect, v_output, v_filter, v_fun, v_assert, v_load, v_line, h_assert, open_data_source): Use ~d for line refs, number of iterations, errno values.
  * parser.c (repl): Use ~d for prompt line numbers, numbered variables and the expr-<n> string in error messages.
  * parser.l (yyerrorf, source_loc_str): Use ~d for line numbers.
  * stream.c (print_base_s): New symbol variable. (formatv): Implement *print-base*. (stdio_maybe_read_error, stdio_maybe_error, stdio_close, pipe_close, open_directory, open_file, open_fileno, open_tail, open_process, run, remove_path): Use ~d for errno values. (stream_init): Initialize print_base_s and register *print-base* special variable.
  * sysif.c (mkdir_wrap, ensure_dir, getcwd_wrap, mknod_wrap, chmod_wrap, symlink_wrap, link_wrap, readlink_wrap, excec_wrap, stat_impl, pipe_wrap, poll_wrap, getgroups_wrap, setuid_wrap, seteuid_wrap, setgid_wrap): Use ~d for errno values and system function results.
  * txr.1: Documented *print-base* and ~d conversion specifier.
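The reason for switching diagnostics to ~d can be sketched in plain C. This is only an illustration, not TXR's formatter: `print_base`, `format_in_base`, `format_a` and `format_d` are hypothetical names standing in for the *print-base* variable, a base-sensitive directive, and the always-decimal ~d directive.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the *print-base* special variable. */
static int print_base = 10;

/* Format n in the given base (2..16) into out, truncating if needed. */
static void format_in_base(char *out, size_t size, unsigned long n, int base)
{
  char tmp[sizeof n * 8 + 1];  /* enough for binary digits plus NUL */
  size_t i = sizeof tmp;

  assert(base >= 2 && base <= 16 && size > 0);

  tmp[--i] = '\0';
  do {
    tmp[--i] = "0123456789abcdef"[n % (unsigned long) base];
    n /= (unsigned long) base;
  } while (n != 0);

  strncpy(out, tmp + i, size - 1);
  out[size - 1] = '\0';
}

/* Base-sensitive formatting: follows whatever print_base is set to. */
static void format_a(char *out, size_t size, unsigned long n)
{
  format_in_base(out, size, n, print_base);
}

/* Decimal-only formatting, in the spirit of ~d: ignores print_base. */
static void format_d(char *out, size_t size, unsigned long n)
{
  format_in_base(out, size, n, 10);
}
```

With `print_base` set to 16, `format_a` renders 255 as `ff` while `format_d` still renders it as `255`, which is the behavior wanted for line numbers and errno values.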
* Pattern vars accessed from Lisp now dynamic. | Kaz Kylheku | 2015-11-04 | 1 | -4/+29
  * eval.c (set_dyn_env): Static function becomes external.
  * eval.h (set_dyn_env): Declared.
  * match.c (eval_with_bindings, eval_progn_with_bindings): Evaluate Lisp code in null lexical environment. Instead install the pattern variables as dynamic, so they shadow global variables. A compatibility check for 121 or earlier provides the old behavior.
  * txr.1: Document scoping rules, and added compatibility notes.
* Factor out excessive uw_set_match_context calls. | Kaz Kylheku | 2015-11-03 | 1 | -51/+40
  We only need to stash the TXR matcher's context in an environment frame when there is the possibility that Lisp code may be called, or filters which re-enter the matcher directly or through Lisp code.
  * match.c (eval_with_bindings, eval_progn_with_bindings): New static functions. (dest_bind): Use eval_with_bindings instead of five lines of boilerplate code. (h_chr): No need to stash context around dest_bind; lower levels take care of it. (subst_vars): Do set up match context around the body of this function, for the sake of Lisp calls and filtering in format_field. Use eval_with_bindings. (do_txeval): Remove match context environment frame setup around subst_vars. Use eval_with_bindings, too. (do_output_line): Use eval_with_bindings. (v_output): No match context environment frame around do_output. (v_do, v_require): Use eval_progn_with_bindings instead of five line boilerplate. (v_line): No match context frame around dest_bind. (h_do): Use eval_progn_with_bindings.
* New range type, distinct from cons cell. | Kaz Kylheku | 2015-11-01 | 1 | -1/+3
  * eval.c (eval_init): Register intrinsic functions rcons, rangep, from and to.
  * gc.c (mark_obj): Traverse RNG objects. (finalize): Handle RNG in switch.
  * hash.c (equal_hash, eql_hash): Hashing for RNG objects.
  * lib.c (range_s, rcons_s): New symbol variables. (code2type): Handle RNG type. (eql, equal): Equality for ranges. (less_tab_init): Table extended to cover RNG. (less): Semantics defined for ranges. (rcons, rangep, from, to): New functions. (obj_init): range_s and rcons_s variables initialized. (obj_print_impl): Produce #R notation for ranges. (generic_funcall, dwim_set): Recognize range objects for indexing.
  * lib.h (enum type): New enum member, RNG. MAXTYPE redefined to RNG value. (TYPE_SHIFT): Increased to 5 since there are now 16 type codes. (struct range): New struct type. (union obj): New member rn, of type struct range. (range_s, rcons_s, rcons, rangep, from, to): Declared. (range_bind): New macro.
  * parser.l (grammar): New rule for recognizing the #R sequence as HASH_R token.
  * parser.y (HASH_R): New terminal symbol. (range): New nonterminal symbol. (n_expr): Derives the new range symbol. The n_expr DOTDOT n_expr rule produces rcons expression rather than cons.
  * match.c (format_field): Recognize rcons syntax in fields which is now what ranges translate to. Also recognize range object.
  * tests/013/maze.tl (neigh): Fix code which destructures range as a cons. That can't be done any more.
  * txr.1: Document ranges.
* Stop using C library setjmp/longjmp. | Kaz Kylheku | 2015-10-25 | 1 | -1/+0
  TXR is moving to custom assembly-language routines. This is mainly motivated by a very dubious thing done in the GNU C Library setjmp and longjmp in the name of security. Evidently, glibc's setjmp "mangles" certain pointer values which are stored into the jmp_buf buffer. It's been that way since 2005, evidently. This means that, firstly, all along, the use of setjmp in gc.c to get registers into a buffer so they can be scanned has not actually worked properly.
  More importantly, this pointer mangling in setjmp and longjmp is very hostile to a stack copying implementation of delimited continuations. The reason is that continuations contain jmp_buf buffers, which get relocated in the process of capturing and reviving a continuation. Any pointers in a jmp_buf which point into the captured stack segment have to be fixed up to point into the relocated location. Mangled pointers make this difficult, requiring hacks which are specific to glibc and the machine architecture. We might as well implement a clean, well-behaved setjmp and longjmp.
  * Makefile (jmp.o): New object file. (dbg/%.o, opt/%.o): New rules for .S prerequisites.
  * args.c, arith.c, cadr.c, combi.c, debug.c, eval.c, filter.c, glob.c, hash.c, lib.c, match.c, parser.c, rand.c, regex.c, signal.c, stream.c, struct.c, sysif.c, syslog.c, txr.c, unwind.c, utf8.c: Removed <setjmp.h> include.
  * gc.c: Switch to struct jmp and jmp_save, instead of jmp_buf and setjmp.
  * jmp.S: New source file.
  * signal.h (struct jmp): New struct type. (jmp_save, jmp_restore): New function declarations denoting assembly language routines in jmp.S. (extended_jmp_buf): Uses struct jmp instead of setjmp. (extended_setjmp): Use jmp_save instead of setjmp. (extended_longjmp): Use jmp_restore instead of longjmp.
* Fix cascading message for unbound vars in pattern language. | Kaz Kylheku | 2015-09-27 | 1 | -2/+4
  Commit 5ab2b46a on 2011-10-06 introduced a hack for suppressing redundant location information being tacked on to an unbound variable error (in the TXR pattern language, not TXR Lisp). This ugly hack broke along the way when uw_throw was changed so that exception arguments are always lists, because it still expects the exception object to be a string. (The breaking change took place in 55cc8493, on 2015-02-06).
  * match.c (do_txeval): In exception catch, exc is a list, and not a string.
* C++: don't use int constant as enum initializer. | Kaz Kylheku | 2015-09-09 | 1 | -1/+1
  * match.c (complex_open): Initialize member close of fpip_t using fpip_close enum constant.
* TXR 105 regression: real-time stream not used on tty. | Kaz Kylheku | 2015-09-07 | 1 | -6/+11
  When the -n option was introduced, on Mar 29, 2015, the change didn't take into account that the hacky complex_open function takes the C stdin stream directly, and not the std_input Lisp stream. After that change, only the std_input is automatically marked for real-time input if standard input is a tty, and not any stream later opened from stdin.
  * match.c (enum fpip_close): New enum member, fpip_close_stream. (struct fpip): New member s, for smuggling through a stream. (complex_open): If name is "-", then plant std_input or std_output as the s member of fpip_t, rather than planting stdin or stdout as the f member. (complex_open_failed): Check for nil stream also. (complex_snarf, complex_stream): Handle stream case.
* Go into repl after processing txr file also. | Kaz Kylheku | 2015-09-07 | 1 | -5/+6
  * match.c (extract): Return match result as cons, rather than int termination status.
  * match.h (extract): Declaration updated.
  * txr.c (txr_main): Handle result cons. If repl mode is selected, pass bindings from car(result) to repl. Otherwise use match success indication in cdr(result) to determine termination status.
* Replace two-step initialization of args with macros. | Kaz Kylheku | 2015-08-24 | 1 | -4/+2
  * args.h (args_init_list, args_init): Return the struct args * pointer. (args_decl_list, args_decl): New macros.
  * eval.c (apply, do_eval, expand_macro, op_dwim, op_catch, mapcarl, lazy_mapcarl): Switch to new macros.
  * hash.c (hashl): Likewise.
  * lib.c (generic_funcall, lazy_appendl, maxl, minl, funcall, funcall1, funcall2, funcall3, funcall4, transpose, juxtv, do_and, do_or, do_iff, unique): Likewise.
  * match.c (h_fun, v_fun): Likewise.
  * stream.c (vformat): Likewise.
  * syslog.c (syslog_wrap): Likewise.
* Use of new args for function calls in interpreter. | Kaz Kylheku | 2015-08-23 | 1 | -2/+9
  * args.c (args_copy_to_list): New function.
  * args.h (ARGS_MIN): New preprocessor symbol. (args_add_list): New inline function. (args_copy_to_list): Declared.
  * debug.c (debug): Args in debug frame are now struct args *. Pull them out nondestructively for printing using args_copy_to_list.
  * eval.c (do_eval_args): Fill struct args argument list rather than returning evaluated list. Dot position evaluation is handled by installing the dot position value as args->list. (do_eval): Allocate args of at least ARGS_MAX for the call to do_eval_args. Then use generic_funcall to invoke the function rather than apply. (eval_args_lisp1): Modified similarly to do_eval_args. (eval_lisp1): New static function. (expand_macro): Construct struct args argument list for the sake of debug_frame. (op_dwim): Allocate args which are filled by eval_args_lisp1, and applied to the function/object with generic_funcall. The object expression is separately evaluated with eval_lisp1.
  * match.c (h_fun, v_fun): Construct struct args arglist for the sake of debug_frame call.
  * unwind.c (uw_push_debug): args argument becomes struct args *.
  * unwind.h (struct uw_debug): args member becomes struct args *. (uw_push_debug): Declaration updated.
  * txr.1: Update documentation about dot position argument in function calls. (list . a) now works, which previously didn't.
* Crafting a better parser-priming hack. | Kaz Kylheku | 2015-08-12 | 1 | -0/+1
  The method of inserting a character sequence which generates a SECRET_TOKEN_E token is being replaced with a purely token based method. Because we don't manipulate the input stream, the lexer is not involved. We don't have to flush its state and deal with the carry-over of the yy_hold_char.
  This comes about because recent changes expose a weakness in the old scheme. Now that a top-level expression can have the form expr.expr, it means that the Yacc parser reads one token ahead, to see whether there is a dot or something else. This lookahead token is discarded. We must re-create it when we call yyparse again.
  This re-creation is done by creating a custom yylex function, which can maintain pushback tokens. We can prime this array of pushback tokens to generate the SECRET_TOKEN_E, as well as to re-inject the lookahead symbol that was thrown away by the previous yyparse. To know which lookahead symbol to re-inject is simple: the scanner just keeps a copy of the most recent token that it returns to the parser. When the parser returns, that token must be the lookahead one.
  The tokens we keep now in the parser structure are subject to garbage collection, and so we must mark them. Since the YYSTYPE union has no type field, a new API is opened up into the garbage collector to help implement a conservative GC technique.
  * gc.c (gc_is_heap_obj): New function.
  * gc.h (gc_is_heap_obj): Declared.
  * match.c: Include y.tab.h. This is now needed by any module that needs to instantiate a parser_t structure, because members of type YYSTYPE occur in the structure. (parser.h can still be included without y.tab.h, but only an incomplete declaration for the parser structure is then given, and a few functions are not declared.)
  * parser.c (yy_tok_mark): New static function. (parser_mark): Mark the recent token and the pushback tokens. (parser_common_init): Initialize the recent token, the pushback tokens, and the pushback stack index. (pushback_token): New static function. (prime_parser): hold_byte argument removed. Body considerably simplified. The catenated stream trick is no longer required. All we do here is set up two pushback tokens and prime the scanner, if necessary, so it is in the right start state for Lisp.
  * parser.l (YY_DECL): Take over definition of scanning function, renaming to yylex_impl, so we can implement yylex. (grammar): Rule which produces SECRET_ESCAPE_E token removed. (reset_scanner): Function removed. (yylex): New function.
  * parser.h (struct parser): Now only forward-declared unless y.tab.h has been included. New members, recent_tok, tok_pushback and tok_idx. (yyset_hold_char): Declared. (reset_scanner): Declaration removed. (yylex): Declared (if y.tab.h included). (prime_parser): Declaration updated. (prime_scanner): Declared.
  * Makefile: express new dependency on existence of y.tab.h of txr.o, match.o and parser.o.
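The pushback scheme described in this message can be sketched in a few lines of C. This is a simplified model rather than the parser.c code: tokens are plain ints, the "lexer" reads from a zero-terminated array, and the names `struct scanner`, `lex_impl`, `lex`, `prime` and the token codes are hypothetical (only `pushback_token` echoes a name from the commit).

```c
#include <assert.h>

/* Hypothetical token codes; a real parser takes these from y.tab.h. */
enum { TOK_EOF = 0, TOK_SECRET_E = 1, TOK_DOT = 2, TOK_IDENT = 3 };

#define PUSHBACK_MAX 4

struct scanner {
  const int *input;            /* stand-in lexer input: 0-terminated token array */
  int pushback[PUSHBACK_MAX];  /* stack of pushed-back tokens */
  int idx;                     /* number of tokens currently pushed back */
  int recent;                  /* most recent token handed to the parser */
};

/* Stand-in for the generated scanning function. */
static int lex_impl(struct scanner *s)
{
  return *s->input ? *s->input++ : TOK_EOF;
}

static void pushback_token(struct scanner *s, int tok)
{
  assert(s->idx < PUSHBACK_MAX);
  s->pushback[s->idx++] = tok;
}

/* Wrapper seen by the parser: drain pushback first, then the real lexer. */
static int lex(struct scanner *s)
{
  int tok = s->idx > 0 ? s->pushback[--s->idx] : lex_impl(s);
  s->recent = tok;
  return tok;
}

/* Prime for the next parse: re-inject the lookahead token that the previous
   parse consumed and discarded, then push the secret start token on top so
   that it is delivered first. */
static void prime(struct scanner *s)
{
  if (s->recent != TOK_EOF)
    pushback_token(s, s->recent);
  pushback_token(s, TOK_SECRET_E);
}
```

Because the input stream itself is never touched, nothing in the lexer's buffering has to be flushed or patched between parses; only the small token stack changes.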
* * eval.c (force): Default the new second argument of source_loc_str. | Kaz Kylheku | 2015-08-04 | 1 | -4/+5
  (eval_error): Derive location of error from the last_form_evaled, if form doesn't have it. (eval_init): Re-register source-loc-str as binary with an optional arg.
  * match.c (debuglf, sem_error, file_err, typed_error): Default new argument of source_loc_str.
  * parser.h (source_loc_str): Declaration updated.
  * parser.l (source_loc_str): Take second argument which specifies alternative value if the source loc info is not found.
  * unwind.c (uw_throw): Simplify code thanks to source_loc_str default argument.
  * txr.1: Document new argument of source-loc-str.
* Hash-bang support for .tl files. | Kaz Kylheku | 2015-07-02 | 1 | -1/+1
  * parser.c (read_eval_stream): New boolean argument to request hash bang support.
  * parser.h (read_eval_stream): Declaration updated.
  * eval.c (sys_load): Pass new third argument to read_eval_stream, to decline hash bang support.
  * match.c (v_load): Likewise.
  * txr.c (txr_main): Request hash bang support from read_eval_stream. Thus files referenced from the txr command line can have a #! line, which is ignored.
* @(load) and @(include) now load Lisp code. | Kaz Kylheku | 2015-06-12 | 1 | -29/+36
  * match.c (v_load): Check txr_lisp_p flag coming out of open_txr_file and handle the Lisp case using read_eval_stream.
  * parser.c (read_eval_stream, get_parser, parser_errors): New functions.
  * parser.h (read_eval_stream, get_parser, parser_errors): Declared.
* Preparing for lisp loading. | Kaz Kylheku | 2015-06-10 | 1 | -1/+2
  * parser.c (open_txr_file): Rewritten to take new argument which indicates whether to treat an unsuffixed file as TXR or TXR Lisp, and is updated to indicate which is the case by looking at the suffix.
  * parser.h (open_txr_file): Declaration updated.
  * match.c (v_load): Follow change in open_txr_file.
  * txr.c (txr_main): Likewise.
* * match.c (v_load): Call parse_once rather than parse. | Kaz Kylheku | 2015-06-07 | 1 | -1/+1
  * parser.c (regex_parse): Likewise.
  * txr.c (txr_main): Likewise.
  * parser.h (parse): Declaration updated. (parse_once): Declared.
  * parser.y (parse_once): New function, same as old parse implementation. (parse): Becomes one argument function which works with a previously initialized parser and continues the parse.
* Lighter weight debug instrumentation. | Kaz Kylheku | 2015-05-22 | 1 | -4/+4
  This speeds up the TXR Lisp interpreter, because do_eval sets up a debug frame and uses debug_return.
  * debug.c (debug_block_s): Symbol removed. (debug_init): Remove initialization of debug_block_s.
  * debug.h (debug_block_s): Declaration removed. (debug_enter): Do not establish a named block or a catch block; no time-wasting unwind stack manipulation at all. The debug_depth variable is managed by the extended setjmp context now. Provide a return value variable, and a well-defined name to branch to to exit from the debug block. (debug_return): Do not use heavy-weight uw_block_return; simply set the return variable and branch to debug_return_out label.
  * signal.h (EJ_DBG_MEMB, EJ_DBG_SAVE, EJ_DBG_REST, EJ_OPT_MEMB, EJ_OPT_SAVE, EJ_OPT_REST): New macros. (extended_jmp_buf): Define optional global state variables using EJ_OPT_MEMB. (extended_setjmp): Save and restore optional globals using EJ_OPT_SAVE and EJ_OPT_RESTORE. Now debug_depth is saved and restored if debugging support is compiled in.
  * match.c (open_data_source): Remove bogus debug_return invocations which were uncovered here by changes to the macro.
  * eval.c (do_eval, expand_macro): debug_return must now be after debug_end, because it won't dynamically clean up frames that it doesn't know about. The set_dyn_env is no longer unreachable in expand_macro; it is now necessary because debug_return isn't doing the longjmp that previously restored dyn_env.
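The "set a return variable and branch to a label" technique can be illustrated with a miniature set of macros. These are not the debug.h definitions: the macro bodies, the `debug_result_` variable and the `classify` function are assumptions made for the sketch; only the names debug_enter, debug_return, debug_return_out and debug_depth come from the message above.

```c
#include <assert.h>

/* Hypothetical global nesting counter. */
static int debug_depth;

/* Open a debug block: declare the result variable and bump the depth. */
#define debug_enter                         \
  {                                         \
    int debug_result_ = 0;                  \
    ++debug_depth;                          \
    {

/* Leave the block cheaply: store the value and jump to the exit label.
   No unwinding machinery is involved. */
#define debug_return(VAL)                   \
  do {                                      \
    debug_result_ = (VAL);                  \
    goto debug_return_out;                  \
  } while (0)

/* Close the debug block: single exit point that undoes the bookkeeping. */
#define debug_leave                         \
    }                                       \
    debug_return_out:                       \
    --debug_depth;                          \
    return debug_result_;                   \
  }

/* Example function with several exits, all funneled through one label. */
static int classify(int x)
{
  debug_enter
  if (x < 0)
    debug_return(-1);
  if (x == 0)
    debug_return(0);
  debug_return(1);
  debug_leave
}
```

A plain `return` inside such a block would skip the bookkeeping at the exit label, which is the class of bug that later entries in this log fix in match_fun and open_data_source.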
* Slight internal representation change of string-only exceptions. | Kaz Kylheku | 2015-02-06 | 1 | -4/+1
  One upshot of all this is that (throw 'foo "msg") now does exactly the same thing as (throwf 'foo "msg"). A message-only exception really is a one-string exception argument list ("message ..."), like the documentation says.
  * unwind.h (struct uw_catch): exception member renamed to args. (uw_catch): Macro follows structure member rename.
  * eval.c (op_catch): Removed now unnecessary kludge of turning non-list exception argument list into a one-element argument list.
  * match.c (v_try): Similar hack to the one in op_catch removed here.
  * unwind.c (uw_unwind_to_exit_point, uw_push_catch): Follows rename of exception member. (uw_throw): The exception parameter is renamed to args. The kludge removed from op_catch re-appears here, because numerous calls to uw_throw just pass a string as args. It's less of a kludge here because this is the master entry point to exception processing, and it straightens out the representation right away. The exception arguments or message are printed in a clearer way.
* Update copyright notices from 2014 to 2015. | Kaz Kylheku | 2015-02-01 | 1 | -1/+1
  * arith.c, arith.h, combi.c, combi.h, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, hash.c, hash.h, lib.c, lib.h, match.c, match.h, parser.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, sysif.c, sysif.h, syslog.c, syslog.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Update.
  * LICENSE, METALICENSE: Likewise.
* * match.c (h_trailer): Bugfix: not returning new variable | Kaz Kylheku | 2015-01-05 | 1 | -2/+2
  bindings captured in trailer section. Ouch!
* * Makefile: Removing trailing spaces. | Kaz Kylheku | 2014-10-24 | 1 | -41/+41
  (GREP_CHECK): New macro. (enforce): Rewritten using GREP_CHECK, with new checks.
  * arith.c, combi.c, debug.c, eval.c, filter.c, gc.c, hash.c, lib.c, lib.h, match.c, parser.l, parser.y, rand.c, regex.c, signal.c, signal.h, stream.c, syslog.c, txr.c, unwind.c, utf8.c: Remove trailing spaces.
* Source file inclusion implemented: needed for macros. | Kaz Kylheku | 2014-10-20 | 1 | -6/+17
  * match.c (include_s): New symbol variable. (v_load): Function extended to handle include semantics. (include): External wrapper function for doing inclusion via v_load. (syms_init): include_s initialized.
  * match.h (include_s): Declared. (include): Declared.
  * parser.y (check_for_include): New static function. (clauses_rev): Use check_for_include to replace @(include ..) directive.
  * txr.1: Documented include.
  * genvim.txr: Added include symbol.
  * txr.vim: Regenerated.
* * match.c (match_fun): Bugfix: replace incorrect plain return | Kaz Kylheku | 2014-10-19 | 1 | -1/+1
  with debug_return. This causes a stray debug frame to be left on the environment stack which turns to garbage, leading to an invalid longjmp in another debug_return elsewhere which tries to use that frame. This was diagnosed by valgrind indicating accesses below the stack frame, and also by glibc "longjmp causes uninitialized stack frame" abort.
* * match.c (mf_all): Drop data_lineno parameter. Initialize | Kaz Kylheku | 2014-10-18 | 1 | -26/+24
  the corresponding member based on whether or not data is nil. (do_match_line, mf_from_ml, match_filter, match_fun, extract): Do not pass starting line number argument to mf_all. This fixes a bug when the line number at @(eof) for an empty file comes out as zero. (mf_args, v_skip, v_fuzz, v_next, v_gather, v_collect, open_data_source, match_files): Use zero and one instead of num(0) and num(1).
* * match.c (v_eof): Bugfix: we are at EOF not only when | Kaz Kylheku | 2014-10-17 | 1 | -1/+1
  the remaining data is nil but when it is (nil). This happens for interactive streams.
* * match.c (dest_bind): Remove the restriction of not allowing | Kaz Kylheku | 2014-10-17 | 1 | -2/+14
  @(expr ...) and @var on the left side of a bind. This is useful, and necessary for @(line @(lisp expr)) to work: matching computed line numbers and character positions.
  * txr.1: Document use of Lisp on left hand side of bind, that there is a restriction on the left hand side of a set, and that Lisp can be used in a line or chr directive for computed matches.
* Converting cast expressions to macros that are retargetted | Kaz Kylheku | 2014-10-17 | 1 | -71/+71
  to C++ style casts when compiling as C++.
  * lib.h (strip_qual, convert, coerce): New casting macros. (TAG_MASK, tag, type, wli_noex, auto_str, static_str, litptr, num_fast, chr, lit_noex, nil, nao): Use cast macros.
  * arith.c (mul, isqrt_fixnum, bit): Use cast macros.
  * configure (INT_PTR_MAX): Define using cast macro.
  * debug.c (debug_init): Use cast macro.
  * eval.c (do_eval, expand_macro, reg_op, reg_mac, eval_init): Use cast macros.
  * filter.c (filter_init): Use cast macro.
  * gc.c (more, mark_obj, in_heap, mark, sweep_one, unmark): Use cast macros.
  * hash.c (hash_double, equal_hash, eql_hash, hash_equal_op, hash_hash_op, hash_print_op, hash_mark, make_hash, make_similar_hash, copy_hash, gethash_c, gethash, gethash_f, gethash_n, remhash, hash_count, get_hash_userdata, set_hash_userdata, hash_iter_destroy, hash_iter_mark, hash_begin, hash_uni, hash_diff, hash_isec): Use cast macros.
  * lib.c (code2type, chk_malloc, chk_malloc_gc_more, chk_calloc, chk_realloc, chk_strdup, num, c_num, string, mkstring, mkustring, upcase_str, downcase_str, string_extend, sub_str, cat_str, trim_str, c_chr, vector, vec_set_length, copy_vec, sub_vec, cat_vec, cobj_print_op, obj_init): Likewise.
  * match.c (do_match_line, hv_trampoline, match_files, dir_tables_init): Likewise.
  * parser.l (grammar): Likewise.
  * parser.y (parse): Likewise.
  * rand.c (make_state, make_random_state, random_fixnum, random): Likewise.
  * regex.c (CHAR_SET_L2_LO, CHAR_SET_L2_HI, CHAR_SET_L1_LO, CHAR_SET_L1_HI, CHAR_SET_L0_LO, CHAR_SET_L0_HI, L0_full, L0_fill_range, L1_full, L1_fill_range, L1_contains, L1_free, L2_full, L2_fill_range, L2_contains, L2_free, L3_fill_range, L3_contains, L3_free, char_set_create, char_set_cobj_destroy, nfa_state_accept, nfa_state_empty, nfa_state_single, nfa_state_wild, nfa_state_set,
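A sketch of how such retargetable cast macros might be written follows. The macro names strip_qual, convert and coerce come from the message above; the definitions shown are an assumption about their shape, and the three helper functions are hypothetical usage examples.

```c
#include <assert.h>
#include <stdint.h>

/* When compiled as C++, each macro maps to the narrowest matching C++ cast,
   so the compiler rejects a cast that does more than intended.
   When compiled as C, all three collapse to an ordinary cast. */
#ifdef __cplusplus
#define strip_qual(TYPE, EXPR) (const_cast<TYPE>(EXPR))
#define convert(TYPE, EXPR) (static_cast<TYPE>(EXPR))
#define coerce(TYPE, EXPR) (reinterpret_cast<TYPE>(EXPR))
#else
#define strip_qual(TYPE, EXPR) ((TYPE) (EXPR))
#define convert(TYPE, EXPR) ((TYPE) (EXPR))
#define coerce(TYPE, EXPR) ((TYPE) (EXPR))
#endif

/* Value conversion between arithmetic types. */
static int truncate_to_int(double d)
{
  return convert(int, d);
}

/* Removing a const qualifier without changing the pointed-to type. */
static char *drop_const(const char *s)
{
  return strip_qual(char *, s);
}

/* Reinterpreting a pointer as an integer. */
static uintptr_t pointer_bits(const void *p)
{
  return coerce(uintptr_t, p);
}
```

The benefit shows up only in a C++ build: a `convert` that would silently strip `const` in C becomes a compile error there, so each cast documents and enforces its intent.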
* * match.c (dest_bind): More detailed log message for variable | Kaz Kylheku | 2014-10-16 | 1 | -1/+2
  mismatch.
* New @(line) and @(chr) directives. | Kaz Kylheku | 2014-10-16 | 1 | -1/+48
  * match.c (line_s): New variable. (h_chr, v_line): New static functions. (syms_init): line_s initialized. (dir_tables_init): Register v_line and h_chr.
  * match.h (line_s): Declared.
  * txr.1: Document @(line) and @(chr) directives.
  * txr.vim: Regenerated.
* * match.c (subst_vars): Fix buggy rendering of TXR Lisp expressions | Kaz Kylheku | 2014-10-15 | 1 | -5/+24
  that evaluate to lists. For instance `@(list)` renders to the string "nil", and `@(list 1 2)` renders as "(1 2)". The desired behavior is "" and "1 2", respectively. (do_output_line): In output directives, there is a similar problem. A @(list) in the middle of an output block turns to nil, and a @(list 1 2) renders in parentheses as (1 2). Furthermore, there is the additional problem that no filtering is applied to the interpolated value. These behaviors are subject to the compatibility option, since they change the externally visible behavior of TXR programs.
  * txr.1: Document that empty lists in @(output) variable substitutions turn into nothing. Document value of 100 for -C option, describing the above issue.
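The desired rendering rule (an empty list becomes the empty string, a non-empty list becomes its items separated by spaces, with no parentheses) can be sketched as below. This is an illustration over C strings, not the subst_vars code; `render_list` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Render n items into out, separated by single spaces and without
   surrounding parentheses. An empty list yields "". Returns the number
   of characters written, not counting the terminating NUL. If the buffer
   is too small, rendering stops before the item that does not fit. */
static size_t render_list(char *out, size_t size,
                          const char *const *items, size_t n)
{
  size_t used = 0;

  assert(size > 0);
  out[0] = '\0';

  for (size_t i = 0; i < n; i++) {
    int w = snprintf(out + used, size - used, "%s%s",
                     i ? " " : "", items[i]);
    if (w < 0 || (size_t) w >= size - used) {
      out[used] = '\0';  /* drop the partially written item */
      break;
    }
    used += (size_t) w;
  }

  return used;
}
```

For the two cases in the message, a list of "1" and "2" renders as `1 2` and an empty list renders as nothing at all.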
* Eliminating the extra list wrapping applied to regular | Kaz Kylheku | 2014-10-03 | 1 | -28/+30
  expression objects in the syntax tree. The parser just puts out a #<regex ...> instead of (#<regex ...> regex-syntax).
  * eval.c (do_eval): We no longer need the hack of treating (#<regex> ...) as a special form which evaluates to #<regex>. (expand): We no longer have to skip over regex syntax, so the case is removed.
  * match.c (h_var, do_txeval, do_match_line): regexp cases are no longer subcases of consp but stand on their own. In do_match_line, we introduce a COBJ case into the type switch for regexes.
  * parser.y: regexes are now compiled in the regex and lisp_regex grammar rules instead of the dependent rules, and are not wrapped in extra syntax.
* * match.c (h_var): Fix regression introduced in 2014-08-11 | Kaz Kylheku | 2014-10-03 | 1 | -4/+3
  commit. The incompleteness of that change broke the case of an unbound variable followed by a bound variable. The value of the second variable was still being wrapped in the old complicated representation before being pushed to the front of the spec.
  * txr.1: Replace bogus text which says that variables are not bound to regexes, and so regex matches from variable substitutions do not arise. This works fine after this change.
* * match.c (v_load): Fix regression introduced in 94: broken @(load). | Kaz Kylheku | 2014-08-29 | 1 | -1/+1
* Uprooting stupidities in handling of output variables. | Kaz Kylheku | 2014-08-13 | 1 | -5/+1
  * parser.y (o_elems_transform): Remove useless function which was only unwrapping the strange parse of output vars. (o_elems_opt, rep_elem, quasilit, wordsqlit): Eliminate o_elems_transform call. (o_var, q_var): Eliminate the phrase structure rules which match an extra o_elem or quasi_item, and which incorporate them into the var syntax tree element. Place the modifiers into the third position, not fourth.
  * eval.c (subst_vars): Eliminate handling of "pat" element. Actually that was not even there thanks to o_elems_transform being applied: dead code. Pull modifiers from the third element of the var form now, not fourth.
  * match.c (subst_vars): Similar changes as in the eval.c subst_vars function. Here the pat variable is even more obviously useless; if it is not nil, it is just punted back to the spec.
* Fix regression in previous change: we must match a compound text | Kaz Kylheku | 2014-08-13 | 1 | -8/+13
  element whole, and not break it up.
  * match.c (search_match): Take a spec argument. (h_var): Turn a text element into a one-element spec and process with search_match.
  * txr.1: Updated text about matching of variables followed by a directive or function, and about consecutive variables via directive.
* When a variable is delimited by some form other than | Kaz Kylheku | 2014-08-12 | 1 | -95/+110
  the contents of a variable, fixed string or regex, we now use the entire tail of the specline to find the match. So for instance @var@(trailer)foo works as intuition might expect.
  * match.c (search_form): Static function removed. (search_match): New static function based on search_form. Does not handle regexes, and does not update c->bindings. (h_var): Renamed local variable pat to next. Added a few missing rlcp's. Combined the cases when pat is a cons to one block so consp isn't repeatedly tested. Function now handles a var followed by (sys:text ...) elements specially; the first element of the text block is pulled out and matched. Implemented "var delimiting spec" general case which matches the entire tail of the spec at successive character positions until a match is found, and the skipped text goes into the variable.
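The "var delimiting spec" search can be modeled as a loop that tries the rest of the spec at successive character positions. The following is a simplified sketch, not the h_var code: the spec tail is reduced to a callback, and `tail_matcher`, `bind_var_before_tail` and `tail_foo` are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/* A tail matcher reports whether the remainder of the spec matches the
   line starting at character position pos. */
typedef int (*tail_matcher)(const char *line, size_t pos);

/* Try the tail at successive positions, beginning at start. On success,
   return the length of the skipped text, which is what gets bound to the
   variable. Return -1 if the tail matches nowhere. */
static long bind_var_before_tail(const char *line, size_t start,
                                 tail_matcher tail)
{
  size_t len = strlen(line);

  assert(start <= len);

  for (size_t pos = start; pos <= len; pos++)
    if (tail(line, pos))
      return (long) (pos - start);

  return -1;
}

/* Example tail, loosely modeled on "@(trailer)foo": it requires the text
   "foo" at pos. */
static int tail_foo(const char *line, size_t pos)
{
  return strncmp(line + pos, "foo", 3) == 0;
}
```

On the line `abcfoo`, the tail first matches at position 3, so the variable receives the three skipped characters `abc`.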
* First cut at restructuring how variable matching works in the pattern | Kaz Kylheku | 2014-08-11 | 1 | -22/+15
  language. The goal is to remove the strict behavior of using only one element of context after a variable. variable form at parse time: we unravel that first.
  * parser.y (grammar): Simplifying the phrase structure rule for the var element. All the variants that have a trailing elem are removed. The abstract syntax changes; the modifier moves to the third position in the list.
  * match.c (h_var): Matching change: the element which follows a variable is now pulled from the specline rather than the variable syntax, which is how it should have been done in the first place. The modifiers are pulled from a different spot in the variable syntax.
* Big switch to reentrant lexing and parsing. | Kaz Kylheku | 2014-08-02 | 1 | -5/+7
  * parser.l (YY_INPUT): Stop relying on removed yyin_stream; refer to stream via yyextra. (yyin_stream, lineno, errors, spec_file_str, prepared_error_message): Global variables removed. (yyget_column, yyset_column): Missing prototypes not generated by flex in bison bridge mode have to be added by us to avoid warning. (yyerror): Takes parser and scanner as parameters. Prepared error message is now in the parser context. Calls to other error handling functions receive scanner context. (yyerr): New function. (yyerrorf, yyerrprepf): Takes scanner argument, chases extra data to get to parser, and refers to parser variables instead of globals. (num_esc): Scanner argument added. (%option reentrant, %option bison-bridge, %option extra-type): New flex options. (grammar): yyscanner added everywhere. (end_of_char): Takes scanner argument. (parse_init): Removed references to yyin_stream and prepared_error_message. (parse_reset): Function renamed to open_txr_file. Returns results via pointers instead of setting global variables. (regex_parse, lisp_parse): Use reentrant parser interface.
  * parser.y (yyerror): Prototype removed. (yylex): Prototype moved after grammar, with new arguments. (sym_helper, define_transform): Take scanner argument. (make_expr): Takes parser argument. (rlrec): New static function. (rl): Function turned into macro. (mkexp, symhlpr): New macros. (%pure-parser, %parse-param, %lex-param): New Yacc options. (grammar): Actions re-worked for reentrance. Parser and scanner contexts are passed down to helper functions, in some cases via the three new macros. The result of the parse is stored in the syntax_tree member of the parser_t structure instead of a global. The yylex function receives the scanner instance. (get_spec): Function removed. (parse): New function.
  * parser.h (lineno, errors, yyin_stream, spec_file_str): Declarations removed. (parser_t): New struct. (yyerr): New function declared. (yyparse, yyerror, yyerrorf, end_of_regex, end_of_char, yylex, yylex_destroy): Declarations updated.
* * Makefile, arith.c, arith.h, combi.c, combi.h, configure, debug.c,  Kaz Kylheku  2014-07-23  1  -16/+16
| | | | | | | | debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, hash.c, hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, syslog.c, syslog.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Synchronize license header with LICENSE.
* * match.c (subst_vars): Bugfix: I neglected to apply the  Kaz Kylheku  2014-07-22  1  -1/+1
| | | | | filter which is in effect to the result of interpolating a TXR Lisp expression, oops!
* * match.c (v_do, v_require): Set up and tear down environment frame,  Kaz Kylheku  2014-07-15  1  -2/+11
| | | | | | | like other situations that evaluate TXR Lisp from the pattern language. Otherwise obscure things will go wrong. (h_do): Same as above, and additionally, add the forgotten call to install the bindings into the match context.
* * match.c (h_eol): Fix broken horizontal @(eol).  Kaz Kylheku  2014-07-15  1  -1/+1
| | | | | It should be returning next_spec_k, rather than bindings, which indicate a complete match.
* Optimization: add missing tail updates to some list  Kaz Kylheku  2014-06-20  1  -1/+1
| | | | | | | | | | | collecting loops. * lib.c (tuples_func, where, sel): Catch return value of list_collect and update tail variable. * match.c (do_txeval): Likewise. * parser.y (expand_meta): Likewise for list_collect_nconc.
* * Makefile: Install share/txr/stdlib/*.txr material.  Kaz Kylheku  2014-06-12  1  -1/+2
| | | | | | | | | | | * match.c (do_txeval): If a variable is not in the bindings, fall back on treating it as a TXR Lisp dynamic variable. This allows us to refer to the stdlib variable from a quasistring in a @(load ...) directive. * txr.c (sysroot_init): Register new variable, *txr-version*. * share/txr/stdlib/ver.txr: New file.
* * match.c (v_load): use the abs_path_p function instead of  Kaz Kylheku  2014-06-12  1  -1/+1
| | | | | | | | | | | | | checking for leading slash. * stream.c (abs_path_p): New function. (stream_init): Register abs_path_p as abs-path-p. * stream.h (abs_path_p): Declared. * txr.1: Documented abs-path-p. * dep.mk: Updated.
* The dumping of bindings and printing of false must now  Kaz Kylheku  2014-06-09  1  -2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | be explicitly requested by the -B option. * match.c (opt_nobindings): Variable removed. (opt_print_bindings): New variable. (extract): Print bindings or "false" if opt_print_bindings is true. * stream.c (output_produced): Variable removed. (stdio_put_string, stdio_put_char, stdio_put_byte): Remove update of output_produced. * stream.h (output_produced): Declaration removed. * txr.1: Documentation updated. * txr.c (txr_main): Option 'b' does nothing. 'B', 'l', 'a', and '--lisp-bindings' set opt_print_bindings to 1. * txr.h (opt_nobindings): Declaration removed. (opt_print_bindings): Declared. * unwind.c (uw_throw): When exiting due to a query error or file error, print false when opt_print_bindings is true.