author Kaz Kylheku <kaz@kylheku.com> 2015-08-12 06:59:15 -0700
committer Kaz Kylheku <kaz@kylheku.com> 2015-08-12 06:59:15 -0700
commit 08bd6d07429bfaa2abd6ddccc4812272eb0b08cb
tree 1f6056e7e25e69b24e120fc491d5ea512686c219 /parser.h
parent 4da607e09383e71134c5ba1622f3c31803f8ea9b
Crafting a better parser-priming hack.
The method of inserting a character sequence which generates a SECRET_ESCAPE_E token is being replaced with a purely token-based method. Because we don't manipulate the input stream, the lexer is not involved. We don't have to flush its state and deal with the carry-over of the yy_hold_char.

This comes about because recent changes expose a weakness in the old scheme. Now that a top-level expression can have the form expr.expr, the Yacc parser reads one token ahead to see whether there is a dot or something else. This lookahead token is discarded when yyparse returns, so we must re-create it when we call yyparse again.

This re-creation is done by a custom yylex function, which can maintain pushback tokens. We can prime this array of pushback tokens to generate the SECRET_ESCAPE_E, as well as to re-inject the lookahead symbol that was thrown away by the previous yyparse. Knowing which lookahead symbol to re-inject is simple: the scanner just keeps a copy of the most recent token that it returns to the parser. When the parser returns, that token must be the lookahead one.

The tokens we now keep in the parser structure are subject to garbage collection, and so we must mark them. Since the YYSTYPE union has no type field, a new API is opened up into the garbage collector to help implement a conservative GC technique.

* gc.c (gc_is_heap_obj): New function.
* gc.h (gc_is_heap_obj): Declared.
* match.c: Include y.tab.h. This is now needed by any module that needs to instantiate a parser_t structure, because members of type YYSTYPE occur in the structure. (parser.h can still be included without y.tab.h, but only an incomplete declaration for the parser structure is then given, and a few functions are not declared.)
* parser.c (yy_tok_mark): New static function. (parser_mark): Mark the recent token and the pushback tokens. (parser_common_init): Initialize the recent token, the pushback tokens, and the pushback stack index. (pushback_token): New static function. (prime_parser): hold_byte argument removed. Body considerably simplified. The catenated stream trick is no longer required. All we do here is set up two pushback tokens and prime the scanner, if necessary, so it is in the right start state for Lisp.
* parser.l (YY_DECL): Take over definition of scanning function, renaming to yylex_impl, so we can implement yylex. (grammar): Rule which produces SECRET_ESCAPE_E token removed. (reset_scanner): Function removed. (yylex): New function.
* parser.h (struct parser): Now only forward-declared unless y.tab.h has been included. New members, recent_tok, tok_pushback and tok_idx. (yyset_hold_char): Declared. (reset_scanner): Declaration removed. (yylex): Declared (if y.tab.h included). (prime_parser): Declaration updated. (prime_scanner): Declared.
* Makefile: Express new dependency of txr.o, match.o and parser.o on the existence of y.tab.h.
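The pushback scheme described above can be sketched in isolation. This is a minimal illustration, not the code from parser.l or parser.c: the YYSTYPE union, the token codes, the canned-input fields (input, pos) and the names parser_init, sketch_yylex and prime_tokens are all hypothetical stand-ins. Only the struct yy_token layout and the recent_tok, tok_pushback and tok_idx members follow the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: in the real code, YYSTYPE and the token
   codes come from the Yacc-generated y.tab.h. */
typedef union { int num; const char *str; } YYSTYPE;
enum { TOK_EOF = 0, SECRET_ESCAPE_E = 300, TOK_NUM = 301 };
#define MAX_PUSHBACK 4

struct yy_token {
  int yy_char;      /* token code */
  YYSTYPE yy_lval;  /* semantic value */
};

struct parser {
  struct yy_token recent_tok;                 /* last token given to the parser */
  struct yy_token tok_pushback[MAX_PUSHBACK]; /* small stack of tokens to replay */
  int tok_idx;                                /* number of stacked tokens */
  const int *input;                           /* hypothetical canned token codes */
  int pos;                                    /* hypothetical read position */
};

void parser_init(struct parser *p, const int *input)
{
  static const struct yy_token none = { 0, { 0 } };
  int i;
  p->recent_tok = none;
  for (i = 0; i < MAX_PUSHBACK; i++)
    p->tok_pushback[i] = none;
  p->tok_idx = 0;
  p->input = input;
  p->pos = 0;
}

/* Hypothetical stand-in for the generated scanner (yylex_impl).
   The semantic value is simply the position of the token. */
int yylex_impl(YYSTYPE *lval, struct parser *p)
{
  int tok = p->input[p->pos];
  lval->num = p->pos;
  if (tok != TOK_EOF)
    p->pos++;
  return tok;
}

void pushback_token(struct parser *p, const struct yy_token *tok)
{
  assert(p->tok_idx < MAX_PUSHBACK);
  p->tok_pushback[p->tok_idx++] = *tok;
}

/* Wrapper in the role of the new yylex: serve pushed-back tokens
   first, otherwise scan; always remember what was returned. */
int sketch_yylex(YYSTYPE *lval, struct parser *p)
{
  if (p->tok_idx > 0)
    p->recent_tok = p->tok_pushback[--p->tok_idx];
  else
    p->recent_tok.yy_char = yylex_impl(&p->recent_tok.yy_lval, p);
  *lval = p->recent_tok.yy_lval;
  return p->recent_tok.yy_char;
}

/* Priming before each parse: push the discarded lookahead first and
   the secret token last, so the secret token is delivered first. */
void prime_tokens(struct parser *p)
{
  struct yy_token secret = { SECRET_ESCAPE_E, { 0 } };
  struct yy_token lookahead = p->recent_tok;
  if (lookahead.yy_char != TOK_EOF)  /* nothing to replay at start or EOF */
    pushback_token(p, &lookahead);
  pushback_token(p, &secret);
}
```

Because the stack is last-in, first-out, the order of the two pushes is what guarantees that the parser sees the secret token before the replayed lookahead. Skipping the replay when the recent token code is zero is an assumption of this sketch, covering both a fresh parser and end of input.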
Diffstat (limited to 'parser.h')
-rw-r--r-- parser.h 25
1 files changed, 21 insertions, 4 deletions
diff --git a/parser.h b/parser.h
index 9b336d70..bedebfbe 100644
--- a/parser.h
+++ b/parser.h
@@ -31,7 +31,16 @@ typedef struct yyguts_t scanner_t;
typedef void *yyscan_t;
#endif
-typedef struct {
+typedef struct parser parser_t;
+
+#ifdef SPACE
+
+struct yy_token {
+ int yy_char;
+ YYSTYPE yy_lval;
+};
+
+struct parser {
val parser;
cnum lineno;
int errors;
@@ -41,7 +50,11 @@ typedef struct {
val syntax_tree;
yyscan_t yyscan;
scanner_t *scanner;
-} parser_t;
+ struct yy_token recent_tok;
+ struct yy_token tok_pushback[4];
+ int tok_idx;
+};
+#endif
extern const wchar_t *spec_file;
extern val form_to_ln_hash;
@@ -53,14 +66,18 @@ void yyerrorf(scanner_t *scanner, val s, ...);
void yybadtoken(parser_t *, int tok, val context);
void end_of_regex(scanner_t *scanner);
void end_of_char(scanner_t *scanner);
-int reset_scanner(scanner_t *scanner);
+#ifdef SPACE
+int yylex(YYSTYPE *yylval_param, yyscan_t yyscanner);
+#endif
int yylex_init(yyscan_t *pscanner);
int yylex_destroy(yyscan_t scanner);
parser_t *yyget_extra(yyscan_t scanner);
void yyset_extra(parser_t *, yyscan_t);
+void yyset_hold_char(yyscan_t, int);
void parser_l_init(void);
void open_txr_file(val spec_file, val *txr_lisp_p, val *name, val *stream);
-void prime_parser(parser_t *, int hold_byte, val name);
+void prime_parser(parser_t *, val name);
+void prime_scanner(scanner_t *);
int parse_once(val stream, val name, parser_t *parser);
int parse(parser_t *parser, val name);
val source_loc(val form);
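The yy_lval members added to struct parser hold a YYSTYPE union with no type tag, which is why the commit message mentions conservative marking. The idea can be sketched as follows, under stated assumptions: the miniature heap, obj_t, the two-member YYSTYPE and the names is_heap_obj and tok_mark are hypothetical, and the real gc_is_heap_obj and yy_tok_mark in gc.c and parser.c are not shown in this diff and may work differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature heap of fixed-size objects. */
typedef struct obj { int marked; int payload; } obj_t;
#define HEAP_SIZE 8
obj_t heap[HEAP_SIZE];

/* Hypothetical untagged union: nothing records which member is live. */
typedef union { obj_t *val; intptr_t num; } YYSTYPE;

/* A word is treated as an object reference only if it points into the
   heap and lands exactly on an object boundary. */
int is_heap_obj(const void *ptr)
{
  uintptr_t addr = (uintptr_t) ptr;
  uintptr_t lo = (uintptr_t) &heap[0];
  uintptr_t hi = lo + sizeof heap;
  return addr >= lo && addr < hi && (addr - lo) % sizeof (obj_t) == 0;
}

/* Conservative marking: interpret the union as a pointer and mark the
   target only if it passes the heap test. */
void tok_mark(YYSTYPE *lval)
{
  obj_t *candidate;
  assert(lval != NULL);
  candidate = lval->val;
  if (is_heap_obj(candidate))
    candidate->marked = 1;
}
```

The trade-off is the usual one for conservative techniques: an integer whose bits happen to match a heap address keeps that object alive unnecessarily, but a genuine reference is never missed.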