| Commit message (Collapse) | Author | Age | Files | Lines |
| |
A leading dot can now be used with the referencing
dot syntax. This is the "unbound reference dot". It
expands to the uref macro, which denotes an unbound reference:
it produces a function which takes an object as the argument,
and curries the reference implied by the remaining arguments.
* eval.c (uref_s): New global symbol variable.
(eval_init): Intern uref symbol and init uref_s.
* eval.h (uref_s): Declared.
* lib.c (simple_qref_args_p): A qref expression is now
also not simple if it contains an embedded uref, meaning
that it cannot be rendered into the dot notation without
ambiguity.
(obj_print_impl): Support printing (uref a b c) as .a.b.c.
* lisplib.c (struct_set_entries): Add uref to the list of
autoload triggers for struct.tl.
* parser.l (DOTDOT): Consume any leading whitespace as part
of recognizing the DOTDOT token. Otherwise the new rule
for UREFDOT, which matches (mandatory) leading space
will take precedence, causing " .." to be scanned wrong.
(UREFDOT): Rule for new kind of dot token, which is
preceded by mandatory whitespace, and isn't consing
dot (which has mandatory trailing whitespace too,
matched by an earlier rule).
* parser.y (UREFDOT): New token type.
(i_dot_expr, n_dot_expr): New grammar rules.
(list): Handle a leading dot on the first element of a list as
a special case. Things are done this way because trying to
work a UREFDOT into the grammar otherwise causes intractable
conflicts.
(i_expr): The ^, ' and , punctuators are now followed by
an i_dot_expr, so that the expression can be an unbound
dot.
(n_expr): Same change as in i_expr, but using n_dot_expr.
Plus new UREFDOT n_expr production.
* share/txr/stdlib/struct.tl (uref): New macro.
* txr.1: Documented.
| |
* parser.l (grammar): Add rules which capture two symbols
glued together, and diagnose as bad token. Of course a
legitimate symbol token can be divided into two that are glued
together. This rule is placed after the legitimate symbol
matching rule, so that if a token can be interpreted as a
single symbol token or as two, the first interpretation is
taken.
| |
* parser.l (grammar): Add a rule that if a floating-point
(of the type that ends in decimal digits with an optional
exponent) is immediately followed by a period which is
not followed by another period (range syntax), it is
trailing junk. For instance 1.0.3 or .2.$, or
1.0. followed by no other input.
| |
* LICENSE, LICENSE-CYG, METALICENSE, Makefile, args.c, args.h,
arith.c, arith.h, cadr.c, cadr.h, combi.c, combi.h, configure,
debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, ftw.c,
ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S,
lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h,
parser.c, parser.h, parser.l, parser.y, rand.c, rand.h,
regex.c, regex.h, signal.c, signal.h, stream.c, stream.h,
struct.c, struct.h, sysif.c, sysif.h, syslog.c, syslog.h,
termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h, share/txr/stdlib/awk.tl,
share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl,
share/txr/stdlib/conv.tl, share/txr/stdlib/except.tl,
share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl,
share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl,
share/txr/stdlib/package.tl, share/txr/stdlib/path-test.tl,
share/txr/stdlib/place.tl, share/txr/stdlib/socket.tl,
share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl,
share/txr/stdlib/termios.tl, share/txr/stdlib/txr-case.tl,
share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl,
share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl:
Add 2017 to all copyright headers and strings.
| |
This is uncovered by compiling with g++ using
-Wold-style-cast.
* mpi/mpi.c (mp_get_intptr): Use convert macro.
* parser.l (num_esc): Likewise.
Also in one of the rules producing REGCHAR.
* struct.c (static_slot_set, static_slot_ens_rec,
get_equal_method): Use coerce macro for int to pointer
conversion.
* sysif.c (setgroups_wrap): Use convert macro.
* termios.c (termios_unpack, termios_pack): Likewise.
* txr.c (sysroot_init): Likewise.
| |
* parser.l: A stray printf was committed in November 2015.
The spurious output only occurs when certain invalid
floating-point syntax is encountered.
| |
Weakness uncovered by fuzzing with AFL (fast) 2.30b.
The failing test case is regex syntax like
[\1111111...111abc], where the bad character escape
allows an invalid, negatively valued character object to
escape out of the parser into the system, leading to an
out-of-bounds array access in the char set code in the regex
compiler.
* parser.l (num_esc): Make sure that an out-of-range
character is mapped to zero. Set up a default value of
zero for the return variable. If the character token has
too many digits, don't pass them through strtol at all,
which will produce a garbage value. Then in the final
range check, actually replace the value with zero if it
is out of range: issuing a diagnostic is not enough.
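
The defensive conversion described above can be sketched roughly
as follows. This is an illustration only, not the actual num_esc
code: the function name num_esc_sketch, the MAX_ESC_DIGITS limit
and the MAX_CODE_POINT bound are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical limits, chosen for illustration only. */
#define MAX_ESC_DIGITS 8
#define MAX_CODE_POINT 0x10FFFFL

/* Convert the digits of a numeric character escape in the given
 * base (8 or 16). Returns the character code, or 0 if the escape
 * is malformed or out of range. */
long num_esc_sketch(const char *digits, int base)
{
    long val = 0;   /* default result for every failure path */
    char *end = 0;
    size_t len;

    assert(base == 8 || base == 16);
    len = strlen(digits);

    /* Too many digits: do not hand them to strtol at all. */
    if (len == 0 || len > MAX_ESC_DIGITS)
        return 0;

    errno = 0;
    val = strtol(digits, &end, base);

    /* Trailing junk or overflow: fall back to zero. */
    if (*end != '\0' || errno == ERANGE)
        return 0;

    /* The final range check replaces the value with zero rather
     * than only reporting the problem. */
    if (val < 0 || val > MAX_CODE_POINT)
        return 0;

    return val;
}
```

The point of the design is that no failure path can leak a negative
or oversized value to later stages that use it as an array index.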
| |
* parser.l (BTKEY, NTKEY): Renamed to BTKWUN and NTKWUN
("keyword and uninterned") respectively. Include an
optional match for the # character.
(BTOK, NTOK): Refer to BTKWUN and NTKWUN respectively.
* parser.y (sym_helper): Implement uninterned symbols
by detecting when the package name string is "#"
and handling specially.
* txr.1: Documented package prefixes and uninterned
symbols.
| |
* parser.c (parser_circ_ref): Don't generate the
circular reference if circular suppression is in
effect.
* parser.h (struct parser): New member, circ_suppress.
We use this for suppressing the generation of
circular #n# references in erased objects.
* parser.l (grammar): Scan #; producing HASH_SEMI token.
* parser.y (HASH_SEMI): New token.
(hash_semis_n_expr, hash_semis_i_expr, ignored_i_exprs,
ignored_n_exprs): New nonterminals, needed for supporting
the use of #; in front of top-level forms.
(spec): Use hash_semis_n_expr and hash_semis_i_expr
instead of n_expr and i_expr.
(r_expr): Support object erasure within nested syntax.
(yybadtoken): Handle HASH_SEMI token.
(parse): Initialize new circ_suppress member of parser
struct to zero.
* txr.1: Documented.
* genvim.txr (txr_ign_par, txr_ign_bkt, txr_ign_par_interior,
txr_ign_bkt_interior): New regions for colorizing erased
objects (partial support).
(txr_list, txr_bracket, txr_mlist, txr_mbrackets): Include
erased objects by including regions txr_ign_par and
txr_ign_bkt.
* txr.vim, tl.vim: Regenerated.
| |
This commit implements the parse-side support
for handling a notation that exists in ANSI
Common Lisp for specifying objects with cycles
and shared substructure.
* parser.h (struct parser): New members, circ_ref_hash
and circ_count.
(circref_s, parser_resolve_circ, parser_circ_def,
parser_circ_ref): Declared.
* parser.c (circref_s): New symbol variable.
(parser_mark): Visit the new circ_ref_hash member of the
parser structure.
(parser_common_init): Initialize new members
circ_ref_hash and circ_count of parser structure.
(patch_ref, circ_backpatch): New static functions.
(parser_resolve_circ, parser_circ_def, parser_circ_ref): New
functions.
(circref): New static function.
(parse_init): Initialize circref_s as sys:circref symbol.
Register sys:circref function.
* parser.l (grammar): Scan #<num>= and #<num># notation as
tokens, extracting their numeric value.
* parser.y (HASH_N_EQUALS, HASH_N_HASH): New token types.
(i_expr, n_expr): Adding phrases for hash-equalsign and
hash-hash syntax.
(yybadtoken): Handle new token types in switch.
(parse_once): Call parser_resolve_circ after parsing
to rewrite any remaining #<num># references in the
structure to the objects they denote.
(parse): Reset new struct parser members to initial
state. Call parser_resolve_circ after parsing
to rewrite any remaining #<num># references.
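
The idea of defining a label and later back-patching references
to it can be sketched as follows. This is a simplified model and
not the parser.c code: the cell type, the fixed-size label table
and the names circ_def_sketch and circ_backpatch_sketch are all
assumptions, and only the spine of a single list is patched.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LABELS 8   /* hypothetical table size */

struct cell {
    int is_ref;          /* nonzero: placeholder standing for #n# */
    int label;           /* label number, meaningful when is_ref */
    struct cell *car;
    struct cell *cdr;
};

static struct cell *label_table[MAX_LABELS];

/* Record the object introduced by #n=. */
void circ_def_sketch(int n, struct cell *obj)
{
    assert(n >= 0 && n < MAX_LABELS);
    label_table[n] = obj;
}

/* Map a placeholder to the object it denotes; other cells and
 * null pointers pass through unchanged. */
static struct cell *resolve(struct cell *c)
{
    if (c != 0 && c->is_ref) {
        assert(c->label >= 0 && c->label < MAX_LABELS);
        assert(label_table[c->label] != 0);
        return label_table[c->label];
    }
    return c;
}

/* Walk the spine of a list and replace placeholders found in the
 * car or cdr positions by the labeled objects. */
void circ_backpatch_sketch(struct cell *list)
{
    struct cell *p;

    for (p = list; p != 0 && !p->is_ref; p = p->cdr) {
        p->car = resolve(p->car);
        if (p->cdr != 0 && p->cdr->is_ref) {
            p->cdr = resolve(p->cdr);
            break;   /* the tail now points back into the structure */
        }
    }
}
```

With a two-cell list whose tail is the placeholder for label 1,
patching turns the structure into the cycle written #1=(a b . #1#).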
| |
* Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c,
combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c,
filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h,
jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c,
parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h,
share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl,
share/txr/stdlib/cadr.tl, share/txr/stdlib/conv.tl,
share/txr/stdlib/except.tl, share/txr/stdlib/hash.tl,
share/txr/stdlib/ifa.tl, share/txr/stdlib/path-test.tl,
share/txr/stdlib/place.tl, share/txr/stdlib/socket.tl,
share/txr/stdlib/struct.tl, share/txr/stdlib/termios.tl,
share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl,
share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl,
share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h,
stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c,
syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c,
unwind.h, utf8.c, utf8.h: Revert to verbatim 2-Clause BSD.
| |
* parser.l (grammar): Recognize {WS}* between @
and ; (or the legacy #) in comments.
* txr.1: Documentation updated.
| |
The current behavior is that there is no lex rule for this, so such a
byte gets echoed.
* parser.l (grammar): Add fallback rule to match one byte
in SREGEX state and turn it into 0xDCxx character.
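
A minimal sketch of the byte mapping named above, assuming the
byte is simply placed in the low eight bits of a 0xDCxx code. The
function names are hypothetical, and the reverse mapping is shown
only to illustrate that the original byte remains recoverable.

```c
#include <assert.h>
#include <wchar.h>

/* Map a byte that matched no other rule to a code in the range
 * 0xDC00..0xDCFF, keeping the byte in the low 8 bits. */
wchar_t invalid_byte_to_char(unsigned char byte)
{
    return (wchar_t) (0xDC00 | byte);
}

/* Recover the original byte from such a code. */
unsigned char char_to_invalid_byte(wchar_t ch)
{
    assert(ch >= 0xDC00 && ch <= 0xDCFF);
    return (unsigned char) (ch & 0xFF);
}
```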
| |
* parser.l (num_esc): Check for converted value being
out of the range of wchar_t or beyond 0x10FFFF, whichever
is less.
| |
* parser.l (grammar): The newline character is incorrectly
handled by the same rule under the SREGEX and REGEX states.
In the SREGEX state, just return it as a REGCHAR, not
forgetting to increment the line number.
| |
* parser.l: Remove trailing whitespace.
| |
* parser.l (grammar): Drop colon from unrecognized escape
message. "bad character in directive" handles various cases to
avoid printing junk to the terminal. Basic message harmonizes
with the one in the yybadtoken function in the parser.
Non-UTF-8 byte printed as TXR hex integer literal.
| |
* parser.l (yyerrprepf): Replace wrong bare assignment
to parser->prepared_msg with proper set macro which
handles the mutation of a mature generation object
such that it points to a baby object.
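
For readers unfamiliar with why a bare assignment is wrong here,
the following is a generic sketch of a generational write barrier.
It is not the set macro from this code base: the object layout,
the remembered set and the name set_field are assumptions used
only to show the technique.

```c
#include <assert.h>
#include <stddef.h>

enum generation { GEN_BABY, GEN_MATURE };

struct obj {
    enum generation gen;
    struct obj *field;
    int remembered;      /* already recorded in the remembered set */
};

#define REMEMBERED_MAX 16   /* hypothetical capacity */
static struct obj *remembered_set[REMEMBERED_MAX];
static size_t remembered_count;

/* Store val into *slot, which lives inside owner. If a mature
 * object is made to point at a baby object, record the owner so
 * that a collection of the young generation can still find the
 * baby through it. */
void set_field(struct obj *owner, struct obj **slot, struct obj *val)
{
    if (owner->gen == GEN_MATURE && val != 0 &&
        val->gen == GEN_BABY && !owner->remembered)
    {
        assert(remembered_count < REMEMBERED_MAX);
        remembered_set[remembered_count++] = owner;
        owner->remembered = 1;
    }
    *slot = val;
}

size_t remembered_size(void)
{
    return remembered_count;
}
```

A plain "owner->field = val" skips the bookkeeping, so a collector
that scans only young objects and the remembered set could miss
the new object and reclaim it while it is still reachable.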
| |
* match.c (mandatory_k): New keyword variable.
(h_coll, v_gather, v_collect): Implement :mandatory logic.
(syms_init): Initialize mandatory_k.
* parser.l (grammar): The UNTIL and LAST tokens must be
matched similarly to collect, without consuming the
closing parenthesis, allowing a list of items to be parsed
between the symbol and the closure, in the NESTED state.
* parser.y (gather_clause, collect_clause, elem,
repeat_parts_opt, rep_parts_opt): Adjust to new until/last
syntax. In the matching productions, the abstract syntax
changes to incorporate the options. In the output productions,
we throw an error if options are present.
* txr.1: Documented :mandatory for collect, coll and gather.
| |
* LICENSE, METALICENSE, Makefile, args.c, args.h, arith.c,
arith.h, cadr.c, cadr.h, combi.c, combi.h, configure,
debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, gc.c,
gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h,
lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h,
parser.l, parser.y, rand.c, rand.h, regex.c, regex.h,
share/txr/stdlib/cadr.tl, share/txr/stdlib/except.tl,
share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl,
share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl,
share/txr/stdlib/struct.tl, share/txr/stdlib/txr-case.tl,
share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl,
share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl,
signal.c, signal.h, stream.c, stream.h, struct.c, struct.h,
sysif.c, sysif.h, syslog.c, syslog.h, txr.1, txr.c, txr.h,
unwind.c, unwind.h, utf8.c, utf8.h: Add 2016 copyright.
* linenoise/LICENSE, linenoise/linenoise.c,
linenoise/linenoise.h: Bump one principal author's copyright
from 2014 to 2015. The code is based on a snapshot of 2015
upstream work.
| |
* debug.c (show_bindings): Use ~d for level, so as
not to be influenced by *print-base*.
(debug): Use ~d for line numbers.
* lib.c (gensym): Use ~d conversion specifier
for formatting gensym counter into symbol name.
* match.c (LOG_MISMATCH, LOG_MATCH): Use ~d for
line number references.
(h_skip, h_coll, h_fun, h_chr, match_line_completely, v_skip,
v_fuzz, v_gather, v_collect, v_output, v_filter, v_fun,
v_assert, v_load, v_line, h_assert, open_data_source): Use ~d
for line refs, number of iterations, errno values.
* parser.c (repl): Use ~d for prompt line numbers,
numbered variables and the expr-<n> string in error
messages.
* parser.l (yyerrorf, source_loc_str): Use ~d for line
numbers.
* stream.c (print_base_s): New symbol variable.
(formatv): Implement *print-base*.
(stdio_maybe_read_error, stdio_maybe_error, stdio_close,
pipe_close, open_directory, open_file, open_fileno, open_tail,
open_process, run, remove_path): Use ~d for errno values.
(stream_init): Initialize print_base_s and register
*print-base* special variable.
* sysif.c (mkdir_wrap, ensure_dir, getcwd_wrap, mknod_wrap,
chmod_wrap, symlink_wrap, link_wrap, readlink_wrap,
exec_wrap, stat_impl, pipe_wrap, poll_wrap, getgroups_wrap,
setuid_wrap, seteuid_wrap, setgid_wrap): Use ~d for
errno values and system function results.
* txr.1: Documented *print-base* and ~d conversion specifier.
| |
The read function no longer works like it used to on an
interactive terminal because of the support for .. and .
syntax on a top-level expression.
The iread function is provided which uses a modified syntax
that doesn't support these operators on a top-level
expression. The parser thus doesn't look one token ahead,
and so iread can return immediately.
* eval.c (eval_init): Register iread intrinsic function.
* parser.c (prime_parser): Only push back the recently seen
token when priming for a regular Lisp read. Handle
the prime_interactive method by preparing a SECRET_ESCAPE_I
token.
(lisp_parse_impl): New static function, formed from previous
lisp_parse. Takes a boolean argument indicating interactive
mode.
(prime_parser_post): New function.
(lisp_parse): Now a wrapper for lisp_parse_impl which
passes a nil to indicate noninteractive read.
(iread): New function.
* parser.h (enum prime_parser): New member, prime_interactive.
(scrub_scanner, iread, prime_parser_post): Declared.
* parser.l (prime_scanner): Handle the prime_interactive case
the same way as prime_lisp.
(scrub_scanner): New function.
* parser.y (SECRET_ESCAPE_I): New token type.
(i_expr): New nonterminal symbol. Like n_expr, but doesn't
support dot or dotdot operators, except in nested
subexpressions.
(spec): Handle SECRET_ESCAPE_I by way of i_expr.
(sym_helper): Before freeing the token lexeme, call
scrub_scanner. If the token is registered as the scanner's
most recently seen token, the scanner must forget that
registration, because it is no longer valid.
(parse): Call prime_parser_post.
* txr.1: Documented iread.
| |
* eval.c (eval_init): Register intrinsic functions rcons,
rangep, from and to.
* gc.c (mark_obj): Traverse RNG objects.
(finalize): Handle RNG in switch.
* hash.c (equal_hash, eql_hash): Hashing for RNG objects.
* lib.c (range_s, rcons_s): New symbol variables.
(code2type): Handle RNG type.
(eql, equal): Equality for ranges.
(less_tab_init): Table extended to cover RNG.
(less): Semantics defined for ranges.
(rcons, rangep, from, to): New functions.
(obj_init): range_s and rcons_s variables initialized.
(obj_print_impl): Produce #R notation for ranges.
(generic_funcall, dwim_set): Recognize range objects for indexing.
* lib.h (enum type): New enum member, RNG. MAXTYPE redefined
to RNG value.
(TYPE_SHIFT): Increased to 5 since there are now 16 type
codes.
(struct range): New struct type.
(union obj): New member rn, of type struct range.
(range_s, rcons_s, rcons, rangep, from, to): Declared.
(range_bind): New macro.
* parser.l (grammar): New rule for recognizing
the #R sequence as HASH_R token.
* parser.y (HASH_R): New terminal symbol.
(range): New nonterminal symbol.
(n_expr): Derives the new range symbol.
The n_expr DOTDOT n_expr rule produces rcons expression rather
than cons.
* match.c (format_field): Recognize rcons syntax in fields
which is now what ranges translate to. Also recognize range
object.
* tests/013/maze.tl (neigh): Fix code which destructures
range as a cons. That can't be done any more.
* txr.1: Document ranges.
| |
* parser.l: Different text needed for ).1 and a.1 cases,
because the insertion of a zero cannot fix the latter.
Might as well make the messages more detailed.
| |
This is needed for multi-line mode with CR line breaks.
It also makes TXR tolerant when code is ported among
systems with different line endings.
* parser.l (NL): New lex named pattern, matching three
possible line terminators: CR, NL or CR-NL.
(grammar): In places where \n was previously matched,
use {NL}. In a few places where \n is in a character
class, add \r. In one place (comment matching), the
the pattern . which implicitly doesn't match newlines
had to be replaced with [^\r\n].
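
The three-way line terminator rule can be illustrated outside of
lex with a small counting function. This is only a sketch of the
matching behavior (CR, NL and CR-NL each count as one line break);
the function name count_line_breaks is an assumption and nothing
like it is claimed to exist in parser.l.

```c
#include <assert.h>
#include <stddef.h>

/* Count line terminators in text, treating CR, NL and CR-NL each
 * as a single line break. */
size_t count_line_breaks(const char *text)
{
    size_t count = 0;

    assert(text != 0);

    while (*text != '\0') {
        if (*text == '\r') {
            count++;
            /* CR-NL is one terminator: skip the NL that follows. */
            if (text[1] == '\n')
                text++;
        } else if (*text == '\n') {
            count++;
        }
        text++;
    }

    return count;
}
```

Note that the pair is only merged in the CR-NL order; an NL followed
by a CR counts as two line breaks.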
| |
* parser.l (yyerrorf): Don't print the program prefix
and parentheses, except if compatibility with 114 or older is
requested. The main motivation for this is the repl, where the
program prefix is not informative. The new format is also a
de facto standard which is compatible with other parsers.
Vim understands it directly.
* txr.1: Documented.
| |
* parser.l (grammar): Recognize '.' token in
BRACED state also.
* genvim.txr: @{obj.slot ...} syntax highlighting support.
Include txr_dot and txr_dotdot in txr_bracevar region.
| |
* args.c (args_cat_zap): New function.
* args.h (args_cat_zap): Declared.
* eval.c (struct_lit_s): New symbol variable.
(eval_init): Initialize struct_lit_s.
* eval.h (struct_lit_s): Declared.
* gc.c (finalize): If a symbol has a struct slot
hash attached to it, we must free it when
the symbol is reclaimed.
* lib.c (make_sym): Initialize symbol's slot_cache pointer
to null.
(copy): Copy structure objects.
(init): Call struct_init to initialize struct module.
* lib.h (SLOT_CACHE_SIZE): New preprocessor symbol.
(slot_cache_line_t, slot_cache_t): New typedefs.
(struct sym): New member, slot_cache.
* lisplib.c (struct_set_entries, struct_instantiate): New
static functions.
(lisplib_init): Register new functions in dl_table.
* parser.y (HASH_S): New terminal symbol.
(struct): New grammar rule.
(n_expr): Derive struct.
(yybadtoken): Map HASH_S to #S string.
* parser.l (grammar): Recognize #S and return HASH_S token.
* share/txr/stdlib/place.tl (slot): New defplace.
* share/txr/stdlib/struct.tl: New file.
* struct.c: New file.
* struct.h: New file.
* Makefile (OBJS): Adding struct.o.
| |
* parser.l (SREGEX): New start state, for stand-alone regex parsing.
(grammar): All REGEX state rules are active in the SREGEX state also.
The rule for the / character returns a REGCHAR if in the SREGEX
state, so it is treated as an ordinary character.
* txr.1: Updated regex-parse documentation about the treatment of
the slash. Also added notes about double escaping when a string literal
is passed to regex-parse.
| |
* parser.l (grammar): Change order of rule which recognizes FLODOT with
a one-character trailing context other than a dot, and the rule which
diagnoses trailing junk. The issue is that this order gives the wrong
interpretation to 123.E, treating it as 123. followed by E rather than
trailing junk, like in the case of 123.0E or 123.B.
* txr.1: Adding the valid example 1.E5. Removing references to dot as
consing dot. Fixed documentation which says that 1.E is 1 followed by
a consing dot and E. The wrong behavior in fact produced 1.0 followed
by E. No consing dot semantics.
| |
* parser.h (enum prime_parser): New enum.
(prime_parser, prime_scanner, parse): Declarations updated with new
argument.
* parser.c (prime_parser): New argument of enum prime_parser type.
Select appropriate secret token for regex and Lisp case. Pass prime
selector down to prime_scanner.
(regex_parse): Do not prepend secret escape to string. Do not use
parse_once function; instead do the parser init and cleanup here and
use the parse function.
(lisp_parse): Pass new argument to parse, configuring the parser to be
primed for Lisp parsing.
* parser.l (grammar): Rule producing SECRET_ESCAPE_R removed.
(prime_scanner): New argument. Pop the scanner state down to INITIAL.
Then unconditionally switch to appropriate state based on priming
configuration.
* parser.y (parse): New argument for priming selection, passed down to
prime_parser.
| |
The method of inserting a character sequence which generates a
SECRET_ESCAPE_E token is being replaced with a purely token based
method.
Because we don't manipulate the input stream, the lexer is not
involved. We don't have to flush its state and deal with the carry-over
of the yy_hold_char.
This comes about because recent changes expose a weakness in the old
scheme. Now that a top-level expression can have the form expr.expr, it
means that the Yacc parser reads one token ahead, to see whether there
is a dot or something else. This lookahead token is discarded. We must
re-create it when we call yyparse again. This re-creation is done by
creating a custom yylex function, which can maintain pushback tokens.
We can prime this array of pushback tokens to generate the
SECRET_ESCAPE_E, as well as to re-inject the lookahead symbol that was
thrown away by the previous yyparse. To know which lookahead symbol to
re-inject is simple: the scanner just keeps a copy of the most recent
token that it returns to the parser. When the parser returns, that
token must be the lookahead one.
The tokens we keep now in the parser structure are subject to garbage
collection, and so we must mark them. Since the YYSTYPE union has no
type field, a new API is opened up into the garbage collector to help
implement a conservative GC technique.
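
The token pushback scheme described above can be sketched as
follows. This is a simplified model rather than the code in
parser.l: the tok and tok_source types, the array-backed stand-in
scanner and the names scan_impl and next_token are assumptions.
Only the idea is taken from the description: a wrapper hands out
pushed-back tokens first and remembers the most recent token.

```c
#include <assert.h>

#define TOK_PUSHBACK_MAX 4   /* hypothetical capacity */

struct tok {
    int code;    /* token type; 0 means end of input */
    int value;   /* stand-in for the semantic value */
};

struct tok_source {
    struct tok recent;                       /* last token handed out */
    struct tok pushback[TOK_PUSHBACK_MAX];   /* stack of re-injected tokens */
    int tok_idx;                             /* number of pushed-back tokens */
    const int *raw;                          /* stand-in for the real scanner */
};

/* Stand-in for the generated scanner: reads token codes from a
 * zero-terminated array. */
static struct tok scan_impl(struct tok_source *src)
{
    struct tok t = { 0, 0 };

    if (*src->raw != 0)
        t.code = *src->raw++;
    return t;
}

/* Push a token so that it is returned before any scanned input. */
void pushback_token(struct tok_source *src, struct tok t)
{
    assert(src->tok_idx < TOK_PUSHBACK_MAX);
    src->pushback[src->tok_idx++] = t;
}

/* Wrapper called in place of the scanner: drain the pushback stack
 * first, then fall through to scanning. Always remember the token
 * that was returned. */
struct tok next_token(struct tok_source *src)
{
    if (src->tok_idx > 0)
        src->recent = src->pushback[--src->tok_idx];
    else
        src->recent = scan_impl(src);
    return src->recent;
}
```

Because the pushback area is a stack, the token that must be seen
first is pushed last: the saved lookahead token goes in first and
the priming token second.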
* gc.c (gc_is_heap_obj): New function.
* gc.h (gc_is_heap_obj): Declared.
* match.c: Include y.tab.h. This is now needed by any module
that needs to instantiate a parser_t structure, because members
of type YYSTYPE occur in the structure. (parser.h can still be included
without y.tab.h, but only an incomplete declaration for the parser
structure is then given, and a few functions are not declared.)
* parser.c (yy_tok_mark): New static function.
(parser_mark): Mark the recent token and the pushback tokens.
(parser_common_init): Initialize the recent token, the
pushback tokens, and the pushback stack index.
(pushback_token): New static function.
(prime_parser): hold_byte argument removed. Body considerably
simplified. The catenated stream trick is no longer required.
All we do here is set up two pushback tokens and prime the scanner,
if necessary, so it is in the right start state for Lisp.
* parser.l (YY_DECL): Take over definition of scanning function, renaming
to yylex_impl, so we can implement yylex.
(grammar): Rule which produces SECRET_ESCAPE_E token removed.
(reset_scanner): Function removed.
(yylex): New function.
* parser.h (struct parser): Now only forward-declared unless y.tab.h
has been included. New members, recent_tok, tok_pushback and tok_idx.
(yyset_hold_char): Declared.
(reset_scanner): Declaration removed.
(yylex): Declared (if y.tab.h included).
(prime_parser): Declaration updated.
(prime_scanner): Declared.
* Makefile: Express new dependency of txr.o, match.o and parser.o
on the existence of y.tab.h.
| |
These look like integers involved in qref dot syntax.
* parser.l (DOTFLO): New pattern definition.
(grammar): New rules for detecting cramped floating
literals.
| |
a.b.(expr ...).c -> (qref a b (expr ...) c)
Consing dot requires whitespace.
* eval.c (qref_s): New symbol global variable.
(eval_init): Initialize qref_s.
* eval.h (qref_s): Declared.
* parser.l (REQWS): New pattern definition, required whitespace.
(grammar): New rules to scan CONSDOT (space required on both
sides) and LAMBDOT (space required after).
* parser.y (CONSDOT, LAMBDOT): New token types.
(list): (. n_expr) rule replaced with LAMBDOT and CONSDOT.
(r_exprs): r_exprs . n_expr consing dot rule replaced with CONSDOT.
(n_expr): New n_expr . n_expr rule introduced here for producing
qref expressions.
(yybadtoken): Handle CONSDOT and LAMBDOT.
* txr.1: Documented qref dot.
| |
* parser.l (BTREG, NTREG): Allow an empty string
symbol name with a nonempty package name.
Without this, abc: parses as abc :.
| |
(eval_error): Derive location of error from
the last_form_evaled, if form doesn't have it.
(eval_init): Re-register source-loc-str as binary with an optional arg.
* match.c (debuglf, sem_error, file_err, typed_error): Default new
argument of source_loc_str.
* parser.h (source_loc_str): Declaration updated.
* parser.l (source_loc_str): Take second argument which specifies
alternative value if the source loc info is not found.
* unwind.c (uw_throw): Simplify code thanks to source_loc_str
default argument.
* txr.1: Document new argument of source-loc-str.
| |
word list literals and word list quasiliterals, except
in <= 109 compatibility mode. An escaped newline in
these literals, together with surrounding whitespace,
now produces a single space, except in <= 109
compatibility mode.
* txr.1: Documented new rules for WLL's and QLL's,
and added compatibility notes.
| |
Test case: file containing 4(prinl 3). Scanner consumes 4 and (.
The ( is lost when the scanner is reset for the next call to yyparse,
resulting in just prinl being read and interpreted as a variable.
* parser.c (prime_parser): If present, append hold byte to priming
string. Takes parser_t * instead of parser, and returns void now.
* parser.l (reset_scanner): Now returns int value, the value
of the scanner's yy_hold_char variable which is nonzero when
the scanner is hanging on to an unmatched byte of input.
* parser.h (reset_scanner, prime_parser): Declarations updated.
* parser.y (parse): Pass hold byte returned by reset_scanner to
prime_parser.
| |
* parser.c (parser_destroy): New GC finalizer static function.
(parser_ops): Register parser_destroy.
(parser_common_init): New function, shared by parse and parse_once.
Initializes embedded scanner.
(parser_cleanup): New function, shared by parse_once and
parser_destroy.
(parser): Use parser_common_init.
* parser.h (parser_t): New member, yyscan.
(reset_scanner, parser_common_init): Declared.
* parser.l (reset_scanner): New function.
* parser.y (parse_once): Use parser_common_init, and
thus perform only a few initializations. Do not
define scanner as a local variable.
(parse): Call reset_scanner instead of
yylex_init since the scanner is being reused,
and for the same reason do not call yylex_destroy.
GC will do that now.
| |
* parser.l (grammar): Scan a METANUM token in the
BRACED state also. This allows us to correctly
reference op arguments in a quasiliteral, as in
`foo @{@1 [1..2] ","} bar`.
| |
* parser.l (%option): Remove nounput option since we need
yyunput.
(grammar): Rule for matching hex and octal escape in SPECIAL
state recognizes optional semicolon. In 109 compatibility,
this is pushed back into the stream, otherwise consumed.
* txr.1: Updated documentation, including compat notes.
* genvim.txr (txr_char): Include optional semicolon in
match. Corrected some errors where 8 and 9 were being
included as matches for octal digits.
(txr_error): Default match for \x or \o not followed
by digits.
| |
* parser.l (char_esc): Recognize \@ escape.
(grammar): Add a rule for a \@ escape in quasiliterals,
and quasi word list literals.
| |
* parser.l: Only shift to QSPECIAL state when @ is followed
by a trailing context consisting of certain characters.
Not every kind of Lisp object syntax can be introduced
with @ in a quasiliteral. Adding a rule to produce an
error when @ appears that is not followed by an allowed
character.
| |
* parser.l: Do not try to recognize floating-point literals
in QSPECIAL state; that is not possible because @134.3
in a quasiliteral parses as a METANUM followed by ".3".
On the other hand, recognize METANUM literals in QSPECIAL state,
so that @@123 scans. Recognize @ as a token in QSPECIAL state,
so @@abc will scan. When transitioning from QSILIT and QWLIT
states to QSPECIAL upon scanning @, return a @ token, which
is now parsed in the grammar.
* parser.y (quasi_meta_helper): New static function.
(q_var): Do not handle SYMTOK any more, only the braced
variable syntax. SYMTOK is handled as a n_expr.
Braced vars are handled with explicit '@' token, which
is now produced by the scanner when it shifts from QSILIT
to QSPECIAL.
(quasi_item): No longer necessary to recognize various
forms here such as quotes and splices. Just recognize a n_expr,
preceded by '@'.
| |
* parser.l (REGOP): New regex alias for matching all regex
special characters.
(grammar): Several rules for regex special characters merged
together. New rule introduced to match a special character
after a backslash, making it literal. The old rule which makes
literal any character after a backslash now throws an error,
unless version 105 compatibility is selected.
* txr.1: Documented this behavior change.
| |
* parser.l: Consolidate rules for recognizing quote, unquote, and
quasiquote. An effect of this is that quasiquotes can now occur in
braces and in string quasiliterals.
* parser.y (quasi_item): Support quotes and quasiquotes as quasi items:
that is to say, objects denoted by @ in a quasiliteral.
| |
* parser.l: Combining the handling of hex, octal and binary numeric
literals into a single rule. Implementing an additional rule which
diagnoses such tokens that have trailing junk. Thus, something
like #x1F2AZ is now invalid syntax.
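
The effect of the trailing junk rule can be illustrated with a
small validator. This is a sketch of the accepted syntax only, not
the lex rule itself: the function name radix_literal_ok is an
assumption, and sign handling is deliberately left out.

```c
#include <assert.h>
#include <ctype.h>

/* Is ch a valid digit in the given base (2, 8 or 16)? */
static int digit_ok(int ch, int base)
{
    switch (base) {
    case 2:
        return ch == '0' || ch == '1';
    case 8:
        return ch >= '0' && ch <= '7';
    default:
        return isxdigit((unsigned char) ch) != 0;
    }
}

/* Returns 1 if tok is a #x, #o or #b prefix followed only by
 * digits valid in that base, and 0 otherwise. */
int radix_literal_ok(const char *tok)
{
    int base;
    const char *p;

    assert(tok != 0);

    if (tok[0] != '#')
        return 0;

    switch (tok[1]) {
    case 'x': base = 16; break;
    case 'o': base = 8; break;
    case 'b': base = 2; break;
    default: return 0;
    }

    p = tok + 2;
    if (*p == '\0')
        return 0;       /* prefix with no digits */

    for (; *p != '\0'; p++)
        if (!digit_ok(*p, base))
            return 0;   /* trailing junk such as the Z in #x1F2AZ */

    return 1;
}
```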
| |
moved here from parser.l.
* parser.l (open_txr_file, regex_parse, lisp_parse): Functions
moved from here to parser.c.
| |
* parser.h (parser_s): Declared.
(parse_init): Declaration removed.
(parser_l_init): Declared.
* parser.l (parse_init): Function renamed to parser_l_init.
* parser.c: New file.
| |
* eval.c (last_form_expanded): New variable.
(do_expand): New static function; contains previous expand
function.
(expand): Becomes a wrapper for do_expand, with re-entry
counting.
(eval_init): GC-protect last_form_expanded.
* eval.h (last_form_expanded): Declared.
* parser.l (regex_parse, lisp_parse): Just use a simple word for
the name of the regex or string parse location, not the entire
expression itself.
* unwind.c (uw_throw): Check whether expansion was going on
when the unhandled exception was thrown and print additional
information.