| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
| |
* match.c (h_fun): New function.
(match_line): Rearranged not to do hash lookup if the directive is a
regex or list. If hash lookup fails, try it as a horizontal function.
(h_define): New function. Handles horizontal function syntax embedded
in line.
(v_define): Handle the horizontal function syntax occurring
on a line by itself. The function info is now stored as a cons cell
whose car is the vertical function and cdr the horizontal one.
(v_fun): Adjust to new function storage convention.
(dir_tables_init): h_define entered in table.
* parser.y: Added syntax for horizontal define.
|
| |
* match.c (gather_s): New keyword variable.
(v_gather): New function.
(syms_init): gather_s initialized.
(dir_tables_init): v_gather entered into table.
* match.h (gather_s): Declared.
* parser.l: GATHER token scanning added.
* parser.y: GATHER token added. gather_clause nonterminal added.
* txr.1: New directive documented.
* txr.vim: gather keyword introduced.
|
| |
* parser.l (prepared_error_message): New static variable.
(yyerror): Emit and clear prepared error message.
(yyerrprepf): New static function.
(yybadtoken): Function moved into parser.y.
(grammar): For irrecoverable lexical errors, stash error message
with yyerrprepf and return the special error token ERRTOK to generate a
syntax error. I could find no other interface to the parser to make it
cleanly exit.
* parser.y (ERRTOK): New terminal symbol, does not appear anywhere
in the grammar.
(spec): Bail after 8 errors, recover to nearest newline, and
use yyerrok to clear error situation.
(YYEOF): Provided by Bison, conditionally defined for other yacc-s.
(yybadtoken): Function moved from parser.l. Checks for the next
token being YYEMPTY or YYEOF, and also handles ERRTOK.
* stream.c (vformat_to_string): New function.
(format): If stream is nil, format to string and return it.
* stream.h (vformat_to_string): Declared.
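The nil-stream behavior of format described above can be sketched in plain C as follows. This is only an illustration under assumptions: the names format_sketch and vformat_to_new_string are hypothetical, a NULL FILE pointer stands in for a nil stream, and the actual txr functions operate on its own stream and string objects rather than on FILE and char buffers.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: format into a freshly malloc'd string.
   Returns NULL on failure; the caller frees the result. */
static char *vformat_to_new_string(const char *fmt, va_list ap)
{
    va_list ap2;
    va_copy(ap2, ap);
    int n = vsnprintf(NULL, 0, fmt, ap2);   /* first pass: measure */
    va_end(ap2);
    if (n < 0)
        return NULL;
    char *buf = malloc((size_t) n + 1);
    if (buf != NULL)
        vsnprintf(buf, (size_t) n + 1, fmt, ap);  /* second pass: write */
    return buf;
}

/* Hypothetical front end: with a stream, print and return NULL;
   with a NULL stream, return the formatted text instead. */
char *format_sketch(FILE *stream, const char *fmt, ...)
{
    va_list ap;
    char *result = NULL;
    assert(fmt != NULL);
    va_start(ap, fmt);
    if (stream != NULL)
        vfprintf(stream, fmt, ap);
    else
        result = vformat_to_new_string(fmt, ap);
    va_end(ap);
    return result;
}
```

A caller that wants the text rather than output passes NULL as the stream and frees the returned buffer when done.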
|
| |
from %right associativity clause.
|
| |
should only denote multiple spaces, not mixtures of spaces and
tabs. We have to be careful with tabs because they can be
semantically different from spaces (e.g. file with tab delimited
fields which can be blank, empty or have leading or trailing spaces.)
* txr.1: Updated.
|
| |
differences between expected and actual test output.
* parser.l (yybadtoken): Handle new terminal symbol, SPACE.
New rule for producing SPACE token out of an extent of
tabs and spaces.
* parser.y (SPACE): New terminal symbol.
(o_var): New nonterminal. I noticed that the var rule was
being used for output elements, and the var rule refers to
elem rather than o_elem. A new o_var rule is a simplified
duplicate of var.
(elem): Handle SPACE token. Transform to regex if it is
a single space, otherwise to literal text.
(o_elem): Handle SPACE token in output.
* tests/001/query-2.txr: This query depends on matching
single spaces and so needs to use escapes.
* tests/001/query-4.txr, tests/001/query-4.expected: New test
case, based on query-2.txr. It produces the same output,
but is simpler thanks to the new semantics of space.
* txr.1: Documented.
|
| |
nested lists. This is in anticipation of future features.
* lib.c (expr_s): New symbol variable.
(obj_init): expr_s initialized.
* lib.h (expr_s): Declared.
* match.c (dest_bind): Now takes linenum. Tests for the meta-syntax
denoted by the system symbols var_s and expr_s, and throws an
error.
(eval_form): Similar error checks added. Also, hack: do not add
file and line number to an exception which begins with a '('
character; just re-throw it. This suppresses duplicate line
number addition when this throw occurs across some nestings.
(match_files): Updated calls to dest_bind.
* parser.l (yybadtoken): Handle new token kinds, METAVAR and METAPAR.
(grammar): Refactoring among patterns: TOK broken into
SYM and NUM, NTOK introduced, unused NUM_END removed.
Rule for @( producing METAPAR in nested state.
* parser.y (METAVAR, METAPAR): New tokens.
(meta_expr): New nonterminal.
(expr): meta_expr and METAVAR productions handled.
|
| |
hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y,
regex.c, regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c,
unwind.h, utf8.c, utf8.h: Updated e-mail address.
|
| |
Bugfix in cases inside coll: was not collecting bindings.
Bugfix for until inside coll: was not seeing bindings
from main clause.
* lib.c (ldiff): New function.
* lib.h (ldiff): Declared.
* match.c (match_line): Implemented last clause. Fixed cases
handling by moving misplaced termination check.
(match_files): Implemented last clause.
* parser.y (until_last): New nonterminal symbol.
(collect_clause): Refactored syntax to support until and last.
(elem): Likewise.
* txr.1: Updated.
|
| |
syntax tree of o_elems constituent, leading to problems with
consecutive variables in a @(rep).
|
| |
cases_clause, choose_clause, elem): Regression bug fix: bad list calls
in parser, lacking nao terminator.
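The class of bug fixed here, a variadic call missing its terminator, can be illustrated with a small C sketch. The names count_args and END_OF_ARGS are hypothetical; END_OF_ARGS merely plays the role that the nao terminator plays in the list calls mentioned above, and the sketch counts arguments instead of building a list.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical sentinel standing in for a nao-style terminator.
   It is a distinct address, so NULL remains a valid element. */
static const char end_marker;
#define END_OF_ARGS (&end_marker)

/* Count the arguments up to, but not including, the sentinel.
   Omitting END_OF_ARGS at the call site is undefined behavior:
   va_arg keeps reading past the real arguments. */
size_t count_args(const char *first, ...)
{
    size_t n = 0;
    va_list ap;
    va_start(ap, first);
    for (const char *arg = first; arg != END_OF_ARGS;
         arg = va_arg(ap, const char *))
        n++;
    va_end(ap);
    return n;
}
```

The compiler cannot check that the sentinel is present, which is presumably why such a regression can slip into several grammar actions at once.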
|
| |
2011-09-28 commit which introduced the double var match.
* match.c (match_line): Handle case where modifier is t.
* parser.y (var_op): Produce modifier as (t) rather than t.
|
| |
* match.c (choose_s, longest_k, shortest_k): New variables.
(match_line, match_files): Introduced choose directive.
(match_init): Initialize new variables.
* match.h (choose_s): Declared.
* parser.l (yybadtoken): Handle CHOOSE.
(CHOOSE): Clause added for returning this token.
* parser.y: Added #include "match.h".
(CHOOSE): New token symbol.
(choose_clause): New nonterminal symbol.
(clause): choose_clause added.
(all_clause, some_clause, none_clause, maybe_clause,
cases_clause): Abstract syntax tree tweaked.
(choose_clause): New syntax.
(elem): Abstract syntax trees tweaked for many clauses.
New CHOOSE clauses.
(out_clause): New error case for choose_clause.
|
| |
none, maybe and cases directives.
(match_files): Recognize horizontal version of these directives
by the presence of the extra symbol t and do not process.
Also, bugfix in the all directive: not resetting the
all_match flag when short circuiting out.
* parser.y (clause_parts_h, additional_parts_h): New nonterminals.
(elem): New clauses added.
|
| |
(match_line): Keyword arguments in coll implemented.
(match_init): chars_k variable initialized.
* parser.l (COLL): Lexical syntax changed to allow for
argument material.
* parser.y (elem): Coll syntax rewritten for arguments.
* txr.1: Updated.
|
| |
symbol variables.
(match_lines): Keyword arguments in collect implemented.
(match_init): New function.
* match.h (match_init): Declared.
* parser.l (COLLECT): Lexical syntax changed for COLLECT to
allow for argument material.
* parser.y (%union): obj renamed to val.
(exprs_opt): New nonterminal.
(collect_clause): Rewritten for arguments.
* txr.c (main): Call to match_init introduced.
|
| |
regex variables which also have nested variables.
Previously this code was assuming that the cases were
mutually exclusive, and the parser happened to work that way.
Also, added support for a "double var" match which occurs
when an unbound variable is followed by a regex variable.
This case should be allowed because it makes sense.
It's similar to a variable followed by a regex, except
that the regex is also a variable binding.
* parser.y (o_elems_transform): New function.
(o_elems_opt, o_elems_opt2, quasilit): Transform o_elems with new
function. This is needed because subst_vars doesn't
deal with the nested var syntax for consecutive variables.
(var): New syntax case '{' IDENT exprs '}' elem. This
allows consecutive variables to be nested in all cases.
|
| |
These must have exactly the same precedence as
IDENT for this to work right, of course.
|
| |
terminals was causing @foo@foo to be parsed differently
from @foo@{foo}. We need consecutive variables to be
specially folded in the syntax under a single var_s
node.
|
| |
backslash codes for single backslash. Output clause can be empty.
* parser.l (char_esc): Backslash handled.
Use internal_error rather than abort.
(REGCHAR, LITCHAR): Backslash added to lexical syntax.
* parser.y (output_clause): Allow empty output clause.
|
| |
* filter.c, filter.h: New files.
* Makefile (OBJS): filter.o added.
* gc.c (mark_obj): Mark new alloc field of string objects.
* hash.c (struct hash): New member, userdata.
(hash_mark): Mark new userdata member of hash.
(make_hash): Initialize userdata.
(get_hash_userdata, set_hash_userdata, hashp): New functions.
* hash.h (get_hash_userdata, set_hash_userdata, hashp): New functions
declared.
* lib.c (getplist, string_extend, cobjp): New functions.
(string_own, string, string_utf8): Initialize new alloc field to nil.
(mkstring, mkustring): Initialize new alloc field to actual size.
(length_str): When length is computed and cached, also compute
and cache alloc.
(init): Call filter_init.
* lib.h (struct string): New member, alloc.
(num_fast): Macro converted to inline function.
(getplist, string_extend, cobjp): New functions declared.
* match.c (match_line): Follows change of modifier s-exp syntax.
(format_field): New parameter, filter.
New modifier syntax parsed. Filter retrieved, and applied.
(subst_vars): New parameter, filter. Filter is either applied
in this function or passed to format_field, as needed.
(eval_form): Pass nil to new parameter of subst_vars.
(do_output_line): New parameter, filter. Passed down to subst_vars.
(do_output): New parameter, filter. Passed down to do_output_line.
(match_files): Pass nil filter to subst_vars in cat directive.
Output directive refactored to parse keywords, extract the
filter and pass down to do_output.
* parser.y (regex): Generate (sys:regex regex syntax ...)
instead of (regex syntax ...).
(elem, expr): Updated w.r.t. regex syntax change.
(var): Cases '{' IDENT regex '}' and '{' IDENT NUMBER '}'
are removed. New syntax '{' IDENT exprs '}' to handle these
more generally and allow for keywords.
* txr.1: Updated.
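The idea behind the new alloc field and string_extend, tracking allocated size separately from length so a string can grow in place, can be sketched roughly as below. Everything here is an assumption for illustration: struct str_buf, str_buf_init and str_buf_extend are hypothetical names, the buffer uses plain char rather than txr's string representation, and the doubling growth policy is a common choice, not something the entry above states.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string object that tracks both the
   length in use and the allocated capacity. */
struct str_buf {
    char *data;    /* NUL-terminated text */
    size_t len;    /* characters in use, excluding the NUL */
    size_t alloc;  /* bytes allocated for data, including the NUL */
};

/* Create a buffer holding a copy of text; alloc starts at the
   actual size. Returns 0 on success, -1 on allocation failure. */
int str_buf_init(struct str_buf *sb, const char *text)
{
    assert(sb != NULL && text != NULL);
    sb->len = strlen(text);
    sb->alloc = sb->len + 1;
    sb->data = malloc(sb->alloc);
    if (sb->data == NULL)
        return -1;
    memcpy(sb->data, text, sb->alloc);
    return 0;
}

/* Append tail, growing geometrically (an assumed policy) so that
   repeated appends do not reallocate every time. */
int str_buf_extend(struct str_buf *sb, const char *tail)
{
    assert(sb != NULL && tail != NULL);
    size_t tail_len = strlen(tail);
    size_t needed = sb->len + tail_len + 1;
    if (needed > sb->alloc) {
        size_t new_alloc = sb->alloc * 2;
        if (new_alloc < needed)
            new_alloc = needed;
        char *p = realloc(sb->data, new_alloc);
        if (p == NULL)
            return -1;
        sb->data = p;
        sb->alloc = new_alloc;
    }
    memcpy(sb->data + sb->len, tail, tail_len + 1);  /* copies the NUL too */
    sb->len += tail_len;
    return 0;
}
```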
|
| |
lib.h, match.c, match.h, parser.h, parser.l, parser.y, regex.c,
regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h: Updated copyright year.
|
| |
Leading :nothrow with trailing material is an error now.
* txr.1: Updated. Made note of errors in pipes being asynchronous.
|
| |
|
| |
|
| |
|
| |
|
| |
of empty [] into regterm, via empty derivation.
|
| |
to match no character and [^] as its complement,
being synonymous with the wildcard dot.
|
| |
|
| |
|
| |
being treated as a non-complemented set of two characters.
|
| |
|
| |
|
| |
This turns out to be easy to do in NFA land.
The complement of an NFA has exactly the same number
and configuration of states and transitions, except
that the states have an inverted meaning; and furthermore,
failed character transitions are routed to an extra
state (which in this implementation is permanently
allocated and shared by all regexes). The regex &
is implemented trivially using De Morgan's law.
Also, bugfix: regular expressions like A|B|C are allowed
now by the syntax, rather than constituting syntax error.
Previously, this would have been entered as (A|B)|C.
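The De Morgan reduction mentioned above, A & B rewritten as ~(~A | ~B), can be sketched with a toy matcher tree in C. This is not the NFA representation the entry describes: struct matcher, m_match and m_and are hypothetical names, and the only primitive is a substring test, chosen so that complement and union are trivial to evaluate.

```c
#include <string.h>

/* Hypothetical matcher tree with only OR and NOT as combinators. */
enum m_kind { M_CONTAINS, M_OR, M_NOT };

struct matcher {
    enum m_kind kind;
    const char *text;            /* used by M_CONTAINS */
    const struct matcher *a, *b; /* operands of M_OR; M_NOT uses a */
};

/* Return 1 if the whole string s is in the language of m, else 0. */
int m_match(const struct matcher *m, const char *s)
{
    switch (m->kind) {
    case M_CONTAINS: return strstr(s, m->text) != NULL;
    case M_OR:       return m_match(m->a, s) || m_match(m->b, s);
    case M_NOT:      return !m_match(m->a, s);
    }
    return 0;
}

/* Build A & B as ~(~A | ~B) in caller-provided storage of four
   nodes, so no dedicated AND node kind is needed. */
const struct matcher *m_and(struct matcher node[4],
                            const struct matcher *a,
                            const struct matcher *b)
{
    node[0] = (struct matcher) { M_NOT, NULL, a, NULL };
    node[1] = (struct matcher) { M_NOT, NULL, b, NULL };
    node[2] = (struct matcher) { M_OR, NULL, &node[0], &node[1] };
    node[3] = (struct matcher) { M_NOT, NULL, &node[2], NULL };
    return &node[3];
}
```

Once complement and union exist, intersection costs nothing extra in the evaluator, which is the sense in which the reduction is trivial.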
|
| |
in chrlit. Include <stdlib.h> for abort.
|
| |
The parser.y file includes "utf8.h", which uses the type wint_t.
It also includes "lib.h" which uses "wchar_t". But it fails
to include any headers which define these types.
The generated y.tab.c picks up wchar_t by the Bison-inserted
inclusion of <stdlib.h>, so that's how we got that. But wint_t does not
come from any of the headers---if they are standard-conforming.
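For reference, a translation unit that uses both types can include the standard headers explicitly instead of relying on what a generated parser happens to pull in. In standard C, wchar_t is available from <stddef.h> (among others) and wint_t from <wchar.h>. The function to_wint below is a hypothetical helper, shown only to exercise both types.

```c
#include <stddef.h>  /* wchar_t */
#include <wchar.h>   /* wint_t, WEOF */

/* Hypothetical helper: widen a character value to wint_t,
   or return WEOF when no character is available. */
wint_t to_wint(int have_char, wchar_t ch)
{
    return have_char ? (wint_t) ch : WEOF;
}
```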
|
| |
that as an object to vformat, resulting in #<garbage: ...>
output.
|
| |
have a _s suffix.
|
| |
can be converted to a type long and vice versa. The configure
script tries to detect the appropriate type to use. Also,
some run-time checking is performed in the streams module
to detect which conversion specifier strings to use for
printing numbers.
|
| |
a system package instead of being hacked with the $ prefix.
Keyword symbols are provided. In the matcher, evaluation
is tightened up. Keywords, nil and t are not bindable, and
errors are thrown if attempts are made to bind them.
Destructuring in dest_bind is strict in the number of items.
String streams are exploited to print bindings to objects
that are not strings or characters. Numerous bugfixes.
|
| |
a value to $$.
|
| |
we wouldn't have to declare object variables at all, so why
use an obtuse syntax to do so?)
|
| |
This is incomplete. There are too many dependencies on
wide character support from the C stream I/O library,
and implicit use of some encoding which may not be UTF-8.
The regex code does not handle wide characters properly.
Character type is still int in some places, rather than wchar_t.
Test suite passes though.
|
| |
Lazy strings implemented, incompletely.
Changed string function to implicitly strdup; non-strdup
version changed to string_own. Fixed wrong uses of strdup
rather than chk_strdup.
Functions added to regex module to provide regex matching
as a state machine to which characters are fed.
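The shape of a matcher to which characters are fed, as opposed to one that scans a complete string, might look like the following C sketch. It is hand-written for the single fixed pattern a+b and assumes nothing about the real regex module's interface; struct ab_machine, ab_init, ab_feed and the feed_result values are all hypothetical.

```c
/* Hypothetical result codes for a push-style matcher. */
enum feed_result { FEED_MORE, FEED_MATCH, FEED_FAIL };

/* States: 0 start, 1 seen one or more 'a', 2 matched, 3 dead. */
struct ab_machine { int state; };

void ab_init(struct ab_machine *m)
{
    m->state = 0;
}

/* Feed one character and report where the match stands. */
enum feed_result ab_feed(struct ab_machine *m, char ch)
{
    switch (m->state) {
    case 0:
        m->state = (ch == 'a') ? 1 : 3;
        break;
    case 1:
        m->state = (ch == 'a') ? 1 : (ch == 'b') ? 2 : 3;
        break;
    default:
        m->state = 3;  /* input after a match or a failure is dead */
        break;
    }
    return m->state == 2 ? FEED_MATCH
         : m->state == 3 ? FEED_FAIL
         : FEED_MORE;
}
```

Because the caller supplies characters one at a time, such an interface does not need the whole input to exist up front, which seems a natural fit for the lazy strings introduced in the same change.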
|
| |
|