author | Kaz Kylheku <kaz@kylheku.com> | 2015-08-12 06:59:15 -0700 |
---|---|---|
committer | Kaz Kylheku <kaz@kylheku.com> | 2015-08-12 06:59:15 -0700 |
commit | 08bd6d07429bfaa2abd6ddccc4812272eb0b08cb (patch) | |
tree | 1f6056e7e25e69b24e120fc491d5ea512686c219 /parser.h | |
parent | 4da607e09383e71134c5ba1622f3c31803f8ea9b (diff) | |
Crafting a better parser-priming hack.

The method of inserting a character sequence which generates a
SECRET_ESCAPE_E token is being replaced with a purely token-based
method.

Because we don't manipulate the input stream, the lexer is not
involved. We don't have to flush its state and deal with the carry-over
of the yy_hold_char.

This comes about because recent changes expose a weakness in the old
scheme. Now that a top-level expression can have the form expr.expr,
the Yacc parser must read one token ahead, to see whether there
is a dot or something else. When yyparse returns, this lookahead token
is discarded, so we must re-create it when we call yyparse again. This
re-creation is done by creating a custom yylex function, which can
maintain pushback tokens. We can prime this array of pushback tokens to
generate the SECRET_ESCAPE_E, as well as to re-inject the lookahead
symbol that was thrown away by the previous yyparse. Knowing which
lookahead symbol to re-inject is simple: the scanner just keeps a copy
of the most recent token that it returns to the parser. When the parser
returns, that token must be the lookahead one.
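
The pushback scheme described above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code from this commit: the token codes, `semval_t`, `struct lexer_state`, `lex_impl`, `push_token`, `lex` and `prime` are hypothetical stand-ins for the Yacc token codes, YYSTYPE, the parser structure, yylex_impl, pushback_token, yylex and prime_parser, and the toy "scanner" just walks an array of token codes.

```c
#include <assert.h>

/* Hypothetical token codes; in a real build these come from y.tab.h. */
enum { TOK_EOF = 0, TOK_SECRET_E = 1000, TOK_IDENT, TOK_NUMBER };

/* Stand-in for the YYSTYPE semantic value union. */
typedef union { long num; const char *str; } semval_t;

struct token {
  int code;
  semval_t lval;
};

struct lexer_state {
  struct token recent;       /* most recent token returned to the parser */
  struct token pushback[4];  /* LIFO stack of tokens to replay */
  int idx;                   /* number of tokens currently on the stack */
  const int *input;          /* toy input: token codes ending in TOK_EOF */
};

/* Stand-in for the generated scanner: hands out the next token code. */
static int lex_impl(semval_t *lval, struct lexer_state *st)
{
  int code = *st->input;

  if (code != TOK_EOF)
    st->input++;
  lval->num = 0;
  return code;
}

static void push_token(struct lexer_state *st, struct token tok)
{
  assert (st->idx < 4);
  st->pushback[st->idx++] = tok;
}

/* The function the parser calls. Pushed-back tokens are replayed first;
   in both branches the token being returned is remembered. */
static int lex(semval_t *lval, struct lexer_state *st)
{
  if (st->idx > 0) {
    st->recent = st->pushback[--st->idx];
    *lval = st->recent.lval;
  } else {
    st->recent.code = lex_impl(lval, st);
    st->recent.lval = *lval;
  }
  return st->recent.code;
}

/* Called before each parse. The previous parse's lookahead (if any) goes
   on the stack first, so the secret token, pushed last, comes out first. */
static void prime(struct lexer_state *st)
{
  struct token secret = { TOK_SECRET_E, { 0 } };

  if (st->recent.code != TOK_EOF)
    push_token(st, st->recent);
  push_token(st, secret);
}
```

With the input TOK_IDENT, TOK_NUMBER, a first parse that consumes TOK_IDENT and reads TOK_NUMBER as lookahead leaves TOK_NUMBER in `recent`. Calling `prime` again queues TOK_NUMBER beneath the secret token, so the second parse sees the secret token followed by TOK_NUMBER and no input is lost.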
The tokens we keep now in the parser structure are subject to garbage
collection, and so we must mark them. Since the YYSTYPE union has no
type field, a new API is opened up into the garbage collector to help
implement a conservative GC technique.
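
The conservative technique mentioned above can be illustrated with a toy heap. This is only a sketch under assumptions: the fixed-size `heap` array, `obj_t`, `semval_t`, `is_heap_obj`, `mark_obj` and `mark_semval` are invented for the example, and the real gc_is_heap_obj may perform its check differently. The idea is that an untagged union is treated as a possible pointer and is marked only if its bits happen to identify a heap object.

```c
#include <assert.h>
#include <stdint.h>

/* Toy heap: a single fixed array of equally sized objects. */
typedef struct obj { int marked; } obj_t;

#define HEAP_SIZE 8
static obj_t heap[HEAP_SIZE];

/* Hypothetical analogue of gc_is_heap_obj: true only if the address lies
   inside the heap array and falls exactly on an object boundary. */
static int is_heap_obj(const void *ptr)
{
  uintptr_t p = (uintptr_t) ptr;
  uintptr_t lo = (uintptr_t) &heap[0];
  uintptr_t hi = (uintptr_t) &heap[HEAP_SIZE];

  return p >= lo && p < hi && (p - lo) % sizeof (obj_t) == 0;
}

/* Exact marking: the caller guarantees a genuine heap object. */
static void mark_obj(obj_t *o)
{
  assert (is_heap_obj(o));
  o->marked = 1;
}

/* Stand-in for YYSTYPE: nothing records which member is live. */
typedef union { obj_t *val; intptr_t num; } semval_t;

/* Conservative marking: interpret the bits as a pointer, and mark only
   when they pass the heap test. */
static void mark_semval(semval_t sv)
{
  if (is_heap_obj(sv.val))
    mark_obj(sv.val);
}
```

The cost of this approach is the usual one for conservative marking: an integer whose bits coincide with a heap address keeps that object alive for one more collection cycle, which is harmless but not exact.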
* gc.c (gc_is_heap_obj): New function.
* gc.h (gc_is_heap_obj): Declared.
* match.c: Include y.tab.h. This is now needed by any module
that needs to instantiate a parser_t structure, because members
of type YYSTYPE occur in the structure. (parser.h can still be included
without y.tab.h, but only an incomplete declaration for the parser
structure is then given, and a few functions are not declared.)
* parser.c (yy_tok_mark): New static function.
(parser_mark): Mark the recent token and the pushback tokens.
(parser_common_init): Initialize the recent token, the
pushback tokens, and the pushback stack index.
(pushback_token): New static function.
(prime_parser): hold_byte argument removed. Body considerably
simplified. The catenated stream trick is no longer required.
All we do here is set up two pushback tokens and prime the scanner,
if necessary, so it is in the right start state for Lisp.
* parser.l (YY_DECL): Take over definition of scanning function, renaming
to yylex_impl, so we can implement yylex.
(grammar): Rule which produces SECRET_ESCAPE_E token removed.
(reset_scanner): Function removed.
(yylex): New function.
* parser.h (struct parser): Now only forward-declared unless y.tab.h
has been included. New members, recent_tok, tok_pushback and tok_idx.
(yyset_hold_char): Declared.
(reset_scanner): Declaration removed.
(yylex): Declared (if y.tab.h included).
(prime_parser): Declaration updated.
(prime_scanner): Declared.
* Makefile: Express new dependency of txr.o, match.o and parser.o
on the existence of y.tab.h.
Diffstat (limited to 'parser.h')
-rw-r--r-- | parser.h | 25 |
1 files changed, 21 insertions, 4 deletions
@@ -31,7 +31,16 @@ typedef struct yyguts_t scanner_t;
 typedef void *yyscan_t;
 #endif
 
-typedef struct {
+typedef struct parser parser_t;
+
+#ifdef SPACE
+
+struct yy_token {
+  int yy_char;
+  YYSTYPE yy_lval;
+};
+
+struct parser {
   val parser;
   cnum lineno;
   int errors;
@@ -41,7 +50,11 @@ typedef struct {
   val syntax_tree;
   yyscan_t yyscan;
   scanner_t *scanner;
-} parser_t;
+  struct yy_token recent_tok;
+  struct yy_token tok_pushback[4];
+  int tok_idx;
+};
+#endif
 
 extern const wchar_t *spec_file;
 extern val form_to_ln_hash;
@@ -53,14 +66,18 @@ void yyerrorf(scanner_t *scanner, val s, ...);
 void yybadtoken(parser_t *, int tok, val context);
 void end_of_regex(scanner_t *scanner);
 void end_of_char(scanner_t *scanner);
-int reset_scanner(scanner_t *scanner);
+#ifdef SPACE
+int yylex(YYSTYPE *yylval_param, yyscan_t yyscanner);
+#endif
 int yylex_init(yyscan_t *pscanner);
 int yylex_destroy(yyscan_t scanner);
 parser_t *yyget_extra(yyscan_t scanner);
 void yyset_extra(parser_t *, yyscan_t);
+void yyset_hold_char(yyscan_t, int);
 void parser_l_init(void);
 void open_txr_file(val spec_file, val *txr_lisp_p, val *name, val *stream);
-void prime_parser(parser_t *, int hold_byte, val name);
+void prime_parser(parser_t *, val name);
+void prime_scanner(scanner_t *);
 int parse_once(val stream, val name, parser_t *parser);
 int parse(parser_t *parser, val name);
 val source_loc(val form);
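
The `#ifdef SPACE` guard in parser.h can be read as an instance of the opaque-type pattern: the typedef is always visible, while the structure body appears only when the token definitions are in scope. The sketch below assumes that `SPACE` is a macro which y.tab.h defines. The value 258, the simplified YYSTYPE, the reduced member list and the function `parser_error_count` are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for including y.tab.h: a token macro plus YYSTYPE.
   Delete these two lines to model a module that skips y.tab.h. */
#define SPACE 258
typedef union { long num; void *ptr; } YYSTYPE;

/* Always available: an incomplete type is enough to pass pointers. */
typedef struct parser parser_t;

/* This prototype compiles with or without the full definition. */
int parser_error_count(const parser_t *p);

#ifdef SPACE
/* Only modules that saw the token definitions get the layout, since
   the members of type YYSTYPE need that type to be complete. */
struct yy_token {
  int yy_char;
  YYSTYPE yy_lval;
};

struct parser {
  int errors;
  struct yy_token recent_tok;
  struct yy_token tok_pushback[4];
  int tok_idx;
};

/* Member access is legal here because the type is complete. */
int parser_error_count(const parser_t *p)
{
  assert (p != NULL);
  return p->errors;
}
#endif
```

A module that never includes y.tab.h can still hold and pass `parser_t *` values, but cannot declare a `parser_t` variable or touch its members, which matches the restriction noted in the ChangeLog entry for match.c.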