summaryrefslogtreecommitdiffstats
path: root/parser.l
Commit message (Collapse)AuthorAgeFilesLines
* Extending syntax to allow for @VAR and @(...) forms insideKaz Kylheku2011-10-061-5/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | nested lists. This is in anticipation of future features. * lib.c (expr_s): New symbol variable. (obj_init): expr_s initialized. * lib.h (expr_s): Declared. * match.c (dest_bind): Now takes linenum. Tests for the meta-syntax denoted by the system symbols var_s and expr_s, and throws an error. (eval_form): Similar error checks added. Also, hack: do not add file and line number to an exception which begins with a '(' character; just re-throw it. This suppresses duplicate line number addition when this throw occurs across some nestings. (match_files): Updated calls to dest_bind. * parser.l (yybadtoken): Handle new token kind, METAVAR and METAPAR. (grammar): Refactoring among patterns: TOK broken into SYM and NUM, NTOK introduced, unused NUM_END removed. Rule for @( producing METAPAR in nested state. * parser.y (METAVAR, METAPAR): New tokens. (meta_expr): New nonterminal. (expr): meta_expr and META_VAR productions handled.
* * LICENSE, Makefile, configure, filter.c, filter.h, gc.c, gc.h, hash.c,Kaz Kylheku2011-10-041-1/+1
| | | | | | hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y, regex.c, regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Updated e-mail address.
* New directive: choose.Kaz Kylheku2011-10-011-0/+7
| | | | | | | | | | | | | | | | | | | | | | * match.c (choose_s, longest_k, shortest_k): New variables. (match_line, match_files): Introduced choose directive. (match_init): Initialize new variables. * match.h (choose_s): Declared. * parser.l (yybadtoken): Handle CHOOSE. (CHOOSE): Clause added for returning this token. * parser.y: Added #include "match.h". (CHOOSE): New token symbol. (choose_clause): New nonterminal symbol. (clause): choose_clause added. (all_clause, some_clause, none_clause, maybe_clause, cases_clause): Abstract syntax tree tweaked. (choose_clause): New syntax. (elem): Abstract syntax trees tweaked for many clauses. New CHOOSE clauses. (out_clause): New error case for choose_clause.
* * parser.l: Implemented backslash continuations in SPECIALKaz Kylheku2011-09-301-13/+27
| | | | | | state, regexes and string literals. * txr.1: Documented.
* * match.c (chars_k): New variable.Kaz Kylheku2011-09-291-2/+2
| | | | | | | | | | | | (match_line): Keyword arguments in coll implemented. (match_init): chars_k variable initialized. * parser.l (COLL): Lexical syntax changed to allow for argument material. * parser.y (elem): Coll syntax rewritten for arguments. * txr.1: Updated.
* * match.c (mingap_k, maxgap_k, gap_k, times_k, lines_k): NewKaz Kylheku2011-09-291-2/+2
| | | | | | | | | | | | | | | | | symbol variables. (match_lines): Keyword arguments in collect implemented. (match_init): New function. * match.h (match_init): Declared. * parser.l (COLLECT): Lexical syntax changed for COLLECT to allow for argument material. * parser.y (%union): obj renamed to val. (exprs_opt): New nonterminal. (collect_clause): Rewritten for arguments. * txr.c (main): Call to match_init introduced.
* Bugfixes: Consistent escaping in various literals. DoubleKaz Kylheku2011-09-261-6/+10
| | | | | | | | | | backslash codes for single backslash. Output clause can be empty. * parser.l (char_esc): Backslash handled. Use internal_error rather than abort. (REGCHAR, LITCHAR): Backslash added to lexical syntax. * parser.y (output_clause): Allow empty output clause.
* * LICENSE, Makefile, configure, gc.c, gc.h, hash.c, hash.h, lib.c,Kaz Kylheku2011-09-231-1/+1
| | | | | | lib.h, match.c, match.h, parser.h, parser.l, parser.y, regex.c, regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Updated copyright year.
* Bump copyrights to 2010.Kaz Kylheku2010-10-051-1/+1
|
* Resolving parser conflicts.Kaz Kylheku2010-01-191-1/+1
|
* Implemented non-greedy operator.Kaz Kylheku2010-01-151-1/+1
|
* Bugfix: allow unescaped / to be used in regex character classes.Kaz Kylheku2010-01-131-5/+9
|
* Implemented the regular expression ~ and & operators.Kaz Kylheku2010-01-051-4/+4
| | | | | | | | | | | | | | | This turns out to be easy to do in NFA land. The complement of an NFA has exactly the same number and configuration of states and transitions, except that the states have an inverted meaning; and furthermore, failed character transitions are routed to an extra state (which in this impelmentation is permanently allocated and shared by all regexes). The regex & is implemented trivially using DeMorgan's. Also, bugfix: regular expressions like A|B|C are allowed now by the syntax, rather than constituting syntax error. Previously, this would have been entered as (A|B)|C.
* Remove unnecessary cast.Kaz Kylheku2009-12-091-1/+1
|
* * parser.l (YYINPUT): Fix signed/unsigned comparison.Kaz Kylheku2009-12-091-2/+3
|
* * parser.l (YY_NO_UNPUT): Removed superfluous #define. This is notKaz Kylheku2009-12-031-2/+0
| | | | | | | needed because suppressing generation of unput is requested via the %option. In scanners generated by the legacy version of flex, 2.5.4, still widely in use. this redundancy leads to a multiple #define YY_NO_UNPUT and a compiler warning.
* * parser.l: Use flex options to suppress generation of theKaz Kylheku2009-11-281-0/+2
| | | | | unused functons yyunput and yyinput, thus getting rid of some compiler diagnostics.
* Fixed broken yyerrorf. It was still taking char *, and passingKaz Kylheku2009-11-241-1/+1
| | | | | that as an object to vformat, resulting in #<garbage: ...> output.
* Improving portability. It is no longer assumed that pointersKaz Kylheku2009-11-231-3/+7
| | | | | | | | can be converted to a type long and vice versa. The configure script tries to detect the appropriate type to use. Also, some run-time checking is performed in the streams module to detect which conversions specifier strings to use for printing numbers.
* Introducing symbol packages. Internal symbols are now inKaz Kylheku2009-11-211-4/+11
| | | | | | | | | | a system package instead of being hacked with the $ prefix. Keyword symbols are provided. In the matcher, evaluation is tightened up. Keywords, nil and t are not bindeable, and errors are thrown if attempts are made to bind them. Destructuring in dest_bind is strict in the number of items. String streams are exploited to print bindings to objects that are not strings or characters. Numerous bugfixes.
* Changing ``obj_t *'' occurences to a ``val'' typedef. (Ideally,Kaz Kylheku2009-11-201-4/+4
| | | | | we wouldn't have to declare object variables at all, so why use an obtuse syntax to do so?)
* Fix total breakage of yyerror and yyerrorf.Kaz Kylheku2009-11-181-2/+3
|
* More removal of C99 wide character I/O, and tightening upKaz Kylheku2009-11-171-3/+3
| | | | of standard conformance.
* Big round of changes to switch the code base to use the streamKaz Kylheku2009-11-161-45/+47
| | | | | | | | | | | | | | | | | abstraction instead of directly using C standard I/O, to eliminate most uses of C formatted I/O, and fix numerous bugs, such variadic argument lists which lack a terminating ``nao'' sentinel. Bug 28033 is addressed by this patch, since streams no longer provide printf-compatible formatting. The native formatter is extended with some additional capabilities to take over. The work on literal objects is expanded and they are now used throughout the code base. Fixed bad realloc in string output stream: reallocating by number of wide chars rather than bytes.
* Previous commit broke UTF-8 lexing, by changing the get_charKaz Kylheku2009-11-131-2/+2
| | | | | | | | semantics on the input stream to wide character input. Also, reading a query the command line (-c) must read bytes from a UTF-8 encoding of the string. We introduce a new get_byte function which can extract bytes from streams which provide it.
* Continuing wchar_t conversion. Making sure all stdio callsKaz Kylheku2009-11-121-41/+41
| | | | | use wide character functions so that there is no illicit mixing. (But the goal is to replace this usage with txr streams).
* Documenting extended characters in man page.Kaz Kylheku2009-11-121-0/+15
| | | | Cleaned up some more issues related to extended characters.
* Big conversion to wide characters and UTF-8 support.Kaz Kylheku2009-11-111-34/+50
| | | | | | | | | This is incomplete. There are too many dependencies on wide character support from the C stream I/O library, and implicit use of some encoding which may not be UTF-8. The regex code does not handle wide characters properly. Character type is still int in some places, rather than wchar_t. Test suite passes though.
* Kill tabs with spaces (how did they sneak in?).Kaz Kylheku2009-11-041-12/+12
| | | | Fix possible use of uninitialized ch.
* txr-015 2009-10-15txr-015Kaz Kylheku2017-07-311-0/+523