| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
@(push) is like @(output), but feeds back into input.
Use carefully.
* parser.y (PUSH): New token.
(output_push): New nonterminal symbol.
(output_clause): Handle OUTPUT or PUSH via output_push.
Some logic moved to output_helper.
(output_helper): New function. Transforms both @(output)
and @(push) directives. Checks both for valid keywords;
push has only :filter.
* parser.l (grammar): Recognize @(push similarly to other
directives.
* lib.[ch] (push_s): New symbol variable.
* match.c (v_output_keys): Internal linkage changes to external.
(v_push): New function.
(v_parallel): We must fix the max_line algorithm not to
use an initial value of zero, because line numbers can go
negative thanks to @(push); otherwise we end up rejecting
the pushed data.
(v_collect): We can no longer assert that the data line
number doesn't retreat.
(dir_tables_init): Register push directive in table of
vertical directives.
* match.h (append_k, continue_k, finish_k): Existing symbol
variables declared.
(v_output_keys): Declared.
* y.tab.c.shipped,
* y.tab.h.shipped,
* lex.yy.c.shipped: Updated.
* txr.1: Documented.
* stdlib/doc-syms.tl: Updated.
|
|
|
|
|
|
|
|
|
| |
* parser.l (YY_FATAL_ERROR): New macro.
(lex_irrecovarable_error): New function.
(parser_l_init): Take address of yy_fatal_error and cast to
void, to suppress warning that the function is unused.
* lex.yy.c.shipped: Updated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h,
args.c, args.h, arith.c, arith.h, autoload.c, autoload.h,
buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h,
chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure,
debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c,
filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c,
gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S,
lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c,
linenoise/linenoise.h, match.c, match.h, parser.c, parser.h,
parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h,
regex.c, regex.h, signal.c, signal.h, socket.c, socket.h,
stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl,
stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl,
stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl,
stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl,
stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl,
stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl,
stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl,
stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl,
stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl,
stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl,
stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl,
stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl,
stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl,
stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl,
stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h,
struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h,
syslog.c, syslog.h, termios.c, termios.h, time.c, time.h,
tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr,
y.tab.c.shipped: Copyright year bumped to 2023.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* parser.l (remove_char): New static function.
(DIGSEP, XDIGSEP, NUMSEP, FLOSEP, XNUMSEP, ONUMSEP,
BNUMSEP, ONUM, BNUM): New named lex patterns.
(FLODOT): Use DIGSEP instead of DIG.
(ONUM): Use ODIG instead of [0-7].
(BNUM): Use BDIG instead of [0-1].
(grammar): New rule for producing NUMBER from decimal
token with commas based on BNUMSEP instead of BNUM.
This is a copy and paste, so that the BNUM rule doesn't
have to deal with the comma removal and isn't slowed down by it.
For the octal, binary and hex, we just switch to
BNUMSEP, ONUMSEP and XNUMSEP, so they all go through
one case.
Floating point numbers are also handled with a copy
pasted case using FLOSEP.
* tests/012/syntax.tl: New test cases.
* txr.1: Documented.
* genvim.txr (alpha-noe, digsep, hexsep, octsep, binsep): New
variables.
(txr_pnum, txr_xnum, txr_onum, txr_bnum, txr_num): Integrate
separating commas. Some bugs fixed in txr_num, some simplifications,
better txr_badnum pattern.
* lex.yy.c.shipped: Updated.
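A rough illustration of the accepted syntax (hypothetical examples;
the exact comma-placement rules are as documented in txr.1):
  1,000,000      ;; reads as the integer 1000000
  #xFFFF,FFFF    ;; reads as the integer 4294967295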
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h,
args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c,
cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h,
combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h,
ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h,
glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S,
lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c,
linenoise/linenoise.h, lisplib.c, lisplib.h, match.c, match.h,
parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h,
rand.c, rand.h, regex.c, regex.h, signal.c, signal.h,
socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl,
stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl,
stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl,
stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl,
stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl,
stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl,
stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl,
stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl,
stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl,
stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl,
stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl,
stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl,
stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl,
stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl,
stdlib/with-resources.tl, stdlib/with-stream.tl,
stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h,
strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h,
termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1,
txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h,
vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year
bumped to 2022.
|
|
|
|
|
|
|
|
|
|
|
| |
* parser.l (NJPUNC): This inverted class lexical category must
exclude the carriage return character \r, otherwise it matches
it. The JSON keywords true, false and null are recognized as
sequences of NJPUNC. If we don't exclude \r from NJPUNC, it
looks like a symbol constituent, so the keyword plus the \r forms
an unrecognized JSON keyword.
* lex.yy.c.shipped: Updated.
|
|
|
|
|
|
|
| |
* parser.l: Remove rule matching double quote in JSON state.
This follows a rule which matches ., so it is unreachable. The
rule is necessary, but an identical copy of it already appears
earlier.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c,
buf.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h,
combi.c, combi.h, debug.c, debug.h, eval.c, eval.h, ffi.c,
ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c,
glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c,
lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c,
parser.h, parser.l, parser.y, rand.c, rand.h, regex.c,
regex.h, signal.c, signal.h, socket.c, socket.h,
stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl,
stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl,
stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl,
stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl,
stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl,
stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl,
stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl,
stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl,
stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl,
stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl,
stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl,
stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl,
stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl,
stdlib/with-resources.tl, stdlib/with-stream.tl,
stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h,
strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h,
termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.c,
txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h:
License reformatted.
* lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The make_hash function now takes the hash_weak_opt_t
enumeration instead of a pair of flags.
* hash.c (do_make_hash): Take enum argument instead of pair of
flags. Just store the option; nothing to calculate.
(weak_opt_from_flags): New static function.
(tweak_hash): Function removed.
(make_seeded_hash): Adjust to new do_make_hash interface with
help from weak_opt_from_flags.
(make_hash, make_eq_hash): Take enum argument instead of pair
of flags.
(hashv): Calculate hash_weak_opt_t enum from the extracted
flags, pass down to make_eq_hash or make_hash.
* hash.h (tweak_hash): Declration removed.
(make_hash, make_eq_hash): Declarations updated.
* eval.c (me_case, expand_switch): Update make_hash
calls to new style.
(eval_init): Update make_hash calls and get rid of tweak_hash
calls. This renders the tweak_hash function unused.
* ffi.c (make_ffi_type_enum, ffi_init): Update make_hash calls
to new style.
* filter.c (make_trie, trie_add, filter_init): Likewise.
* lib.c (make_package_common, obj_init, obj_print): Likewise.
* lisplib.c (lisplib_init): Likewise.
* match.c (dir_tables_init): Likewise.
* parser.c (parser_circ_def, repl, parse_init): Likewise.
* parser.l (parser_l_init): Likewise.
* struct.c (struct_init, get_slot_syms): Likewise.
* sysif.c (get_env_hash): Likewise.
* lex.yy.c.shipped, y.tab.c.shipped: Updated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each time the scanner processes a floating-point token,
it allocates a string object, just so it can call flo_str.
The object is then garbage. Let's stop doing that.
* lib.c (flo_str_utf8): New function, closely based on flo_str.
Takes a char * string.
* lib.h (flo_str_utf8): Declared.
* parser.l (out_of_range_float): Take the token as a const
char * string instead of a Lisp string, so we can just pass
yytext to this function.
(grammar): Use flo_str_utf8 instead of flo_str, and pass yytext
to out_of_range_float.
* lex.yy.c.shipped: Updated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The big comment I added above end_of_json_unquote summarizes
the issue. This issue has been uncovered by some test cases in
a JSON test suite, not yet committed.
* parser.l <JMARKER>: New start condition. Used as a reliable
marker in the start condition stack, based on which
end_of_json_unquote can intelligently fix up the stack.
(JSON): In the transitions to the quasiquote scanning NESTED
state, push the JMARKER start condition before NESTED.
(JMARKER): The lexer should never read input in the JMARKER
state. If we don't put in a rule to catch this, then if it ever
happens, the lexer will just copy the source code to standard
output. I ran into this during debugging.
(end_of_json_unquote): Rewrite the start condition stack
intelligently based on what the Lisp lookahead token has done
to it, so parsing can smoothly continue.
* lex.yy.c.shipped: Regenerated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* eval.c (eval_init): get-json intrinsic registered.
* parser.c (prime_parser): Handle prime_json.
(lisp_parse_impl): Take enum prime_parser argument
directly instead of the interactive flag.
(lisp_parse, nread, iread): Pass appropriate prime_parser
value instead of the original flag.
(get_json): New function. Like nread, but passes prime_json.
* parser.h (enum prime_parser): New constant, prime_json.
(get_json): Declared.
* parser.l (prime_scanner): Handle prime_json.
* parser.y (SECRET_ESCAPE_J): New terminal symbol.
(spec): New productions around SECRET_ESCAPE_J for parsing
JSON.
* lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.
* txr.1: Documented.
* share/txr/stdlib/doc-syms.tl: Updated.
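Rough usage sketch (assuming get-json accepts a string source the
way nread does, and that JSON numbers come back as floats; printed
forms approximate):
  (get-json "[1, 2, 3]")       ;; -> #(1.0 2.0 3.0)
  (get-json "{\"a\": true}")   ;; -> #H(() ("a" t))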
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The JSON null will map to the Lisp null symbol. I thought
about using : but that could cause surprises; for instance, when it's
passed to functions as an optional argument, it will trigger
the default value.
* parser.l (JSON): Add rules for producing null keyword.
* txr.1: Documented.
* lex.yy.c.shipped: Updated.
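For example (a sketch):
  #J null    ;; -> the symbol null, rather than nil or the : keyword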
|
|
|
|
|
|
|
|
|
|
|
| |
* parser.l <JLIT>: Convert the \u0000 escape sequence to the U+DC00
code point, the pseudo-null. Also include JLIT
in the rule for catching bad bytes that are not
matched by {UANYN}.
* txr.1: Document this treatment as extensions to JSON.
* lex.yy.c.shipped: Updated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because #J<json> produces the (json ...) form that translates
into quote, ^#J<json> yields a quasiquote around a quote. This
has some disadvantages, because it requires an explicit eval
in some situations to do what the programmer wants.
Here, we introduce an alternative: the syntax #J^<json> will
produce a quasiquote instead of a quote.
The new translation scheme is
#J X -> (json quote <X>)
#J^ X -> (json sys:qquote <X>)
where <X> denotes the Lisp object translation of JSON syntax X.
* parser.c (me_json): The quote symbol is now already in the
json form, so all that is left to do here is to take the cdr
to pop off the json symbol.
* parser.l (JPUNC, NJPUNC): Allow ^ to be a punctuator in
JSON mode.
* parser.y (json): For regular #J, generate the new (json
quote ...) syntax. Implement #J^ which sets up the nonzero
quasi_level around the processing of the JSON syntax, so that
everything is in a quasiquote, finally producing the
(json sys:qquote ...) syntax.
* lex.yy.c.shipped, y.tab.c.shipped: Updated.
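For instance (a sketch; the exact printed form of the resulting
hash may differ):
  (let ((x 42))
    #J^{"answer" : ~x})   ;; -> #H(() ("answer" 42))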
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* parser.h (end_of_json_unquote): Declared.
* parser.l (JPUNC, NJPUNC): Add ~ and * characters to set of
JSON punctuators.
(grammar): Allow closing brace character in NESTED, SPECIAL
and QSPECIAL states to be a token. This is because it occurs
as a lookahead character in this situation #J{"foo":~expr}.
The lexer switches from the JSON to the NESTED start state
when it scans the ~ token, so that expr is treated as Lisp.
But then } is consumed as a lookahead token by the parser in
that same mode; when we pop back to JSON mode, the } token has
already been scanned in NESTED mode.
We add two new rules in JSON mode to the lexer to recognize
the ~ unquote and ~* splicing unquote. Both have to push the
NESTED start condition.
(end_of_json_unquote): New function.
* parser.y (JSPLICE): New token.
(json_val): Logic for unquoting. The array and hash rules must
now be prepared to deal with json_vals and json_pairs
producing a list object instead of a hash or vector. That is
the signal that the data contains active quasiquotes and must
be translated to the special literal syntax for quasiquoted
vectors and hashes. Here we also add the rules for ~ and ~*
unquoting syntax, including managing the lexer's transition
back to the JSON start condition.
(json_vals, json_pairs): We add the logic here to recognize
unquotes in quasiquoting state. This is more clever than the
way it is done in the Lisp areas of the grammar. If no
quasiquotes occur, we construct a vector or hash,
respectively, and add to it. If unquotes occur and if we
are nested in a quasiquote, we switch the object to a list,
and continue it that way.
(yybadtoken): Handle JSPLICE.
* lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.
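For instance, syntax along these lines now lexes and parses (a
sketch; convenient evaluation of such forms is the subject of the
later change that adds #J^):
  ^#J{"foo" : ~(+ 1 2)}
  ^#J["a", ~*(list "b" "c")]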
|
|
|
|
|
|
|
|
|
|
| |
* parser.l (HASH_N_EQUALS, HASH_N_HASH): Recognize these
tokens in the JSON start state also.
* parser.y (json_val): Add the circular syntax, exactly like
it is done for n_expr and i_expr. And it works!
* lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.
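For instance (a sketch):
  #J[#1="shared", #1#]   ;; both elements are the same string object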
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(needs buffer literal error message cleanup)
* parser.c (json_s): New symbol variable.
(is_balanced_line): Follow braces out of initial state.
This concession allows the listener to accept input
like #J{"a":"b"}.
(me_json): New static function (macro expander). The #J X
syntax produces a (json Y) form, with the JSON syntax X
translated to a Lisp object Y. If that is evaluated,
this macro translates it to (quote Y).
(parse_init): initialize json_s variable with interned symbol,
and register the json macro.
* parser.h (json_s): Declared.
(end_of_json): Declared.
* parser.l (num_esc): Treat u escape sequences in the same way
as x. This function can then be used for handling the \u
escapes in JSON string literals.
(DIG19, JNUM, JPUNC, NJPUNC): New lex named patterns.
(JSON, JLIT): New lex start conditions.
(grammar): Recognize #J syntax, mapping to HASH_J token,
which transitions into JSON start state.
In JSON start state, handle all the elements: numbers,
keywords, arrays and objects. Transition into JLIT state.
In JLIT start state, handle all the elements of JSON string
literals, including surrogate pair escapes.
JSON literals share the {UANY} fallback pattern with
other literals.
(end_of_json): New function.
* parser.y (HASH_J, JSKW): New token symbols.
(json, json_val, json_vals, json_pairs): New nonterminal
symbols, and rules.
(i_expr, n_expr): Generate json nonterminal, to hook the
stuff into the grammar.
(yybadtoken): Handle JSKW and HASH_J tokens.
* lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped:
Updated.
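Roughly, the resulting behavior (a sketch, assuming JSON numbers map
to floats; printed forms approximate):
  #J{"a" : "b"}        ;; -> #H(() ("a" "b"))
  #J[1, true, false]   ;; -> #(1.0 t nil)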
|
|
|
|
|
|
|
| |
* parser.l (BUFLIT): When reporting a bad character, do not
show it in the form of an escape sequence.
* lex.yy.c.shipped: Updated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is fairly obscure. A repro test case is a file which
contains:
3"foo"
When the 3 is parsed, the " is also scanned as a lookahead
token, and when that happens, the lexer shifts into the
STRLIT state. At that point the parse job finishes for
that top-level form.
The next time the parser is called, it will prime the token
stream by pushing the " token into it. But the lex state is
not put into the STRLIT state. The result is that the parser
obtains the " token, and then foo is lexically analyzed in the
wrong state as a symbol. A syntax error occurs: symbol token
in the middle of a string literal, instead of just a sequence
of LITCHAR tokens, as expected.
What we can do is associate a lex state with pushback tokens.
If a pushback token has a nonzero lex state which is different
from the current YYSTATE, then when that pushback token is
consumed, we push that state also.
* parser.h (struct yy_token): New member, yy_lex_state.
* parser.c (parser_common_init): Initialize the new
yy_lex_state member of every token member of the parser
structure.
* parser.l (yylex): When feeding a pushed token to the parser,
if that token has a nonzero state, and the state is different
from YYSTATE, we push that state. So for instance a pushed
back " token will carry the STRLIT state, which is different
from the NESTED state that will be in effect at the start of
the parse job, and so it will be pushed, as if the " character
had been scanned. Also, when we call the real yylex_impl,
when we are storing the recently seen token in recent_tok, we
also store the current YYSTATE along with it. That's how
tokens get associated with a state. The artificial tokens that
are used for priming parsing like SECRET_ESCAPE_E are never
associated with a nonzero state.
* tests/012/syntax.tl: Some test cases that didn't pass
before this.
* lex.yy.c.shipped: Regenerated.
|
|
|
|
|
|
|
|
|
|
| |
* parser.l (grammar): Just like we do in SREGEX, allow an
arbitrary byte in REGEX, mapping it to the DCxx range.
Do the same inside string literals of all types.
* lex.yy.c.shipped: Updated.
* tests/012/parse.tl: New tests.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main idea in this commit is to change a behavior of the
lexer, and take advantage of it in the parser. Currently, the
lexer recognizes a {UANYN} pattern in two places. That
pattern matches a UTF-8 character. The lexeme is passed to
the decoder, which is expected to produce exactly one wide
character. If the UTF-8 is bad (for instance, a code in the
surrogate pair range U+DCxx) then the decoder will produce
multiple characters. In that case, these rules return ERRTOK
instead of a LITCHAR or REGCHAR. The idea is: why don't we
just return those characters as a TEXT token? Then we can
just incorporate that into the literal or regex.
* parser.l (grammar): If a UANYN lexeme decodes to multiple
characters instead of the expected one, then produce a
TEXT token instead of complaining about invalid UTF-8 bytes.
* parser.y (regterm): Recognize a TEXT item as a regterm,
converting its string value to a compound node in the regex
AST, so it will be correctly treated as a fixed pattern.
(chrlit): If a hash-backslash is followed by a TEXT token,
which can happen now, that is invalid; we diagnose that
as invalid UTF-8.
(quasi_item): Remove TEXT rule, because the litchars
constituent now generates TEXT.
(litchars, restlistchar): Recognize TEXT item, similarly to
regterm.
* tests/012/parse.tl: New file.
* tests/012/parse.expected: Likewise.
|
|
|
|
|
|
|
| |
* parser.l (grammar): Because the \x pattern requires one or
more digits after it, if they are not present, we simply
report \x as an unrecognized escape. It's better if we
diagnose it properly as a \x that is not followed by digits.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Whereas @a..@b parses and transforms to (rcons @a @b),
@(a)..@(a) goes to @(rcons a @(a)).
* parser.l (grammar): Under 248 compatibility or lower, the @
character now produces the OLD_AT token. Otherwise it produces
the '@' character, as before.
* parser.y (OLD_AT): New token replaces the '@' at the old
low precedence position. '@' is now at the highest precedence,
together with OLD_DOTDOT. (We don't care about interactions
between '@' and OLD_DOTDOT, because OLD_DOTDOT only exists in
185 compatibility, in which '@' is OLD_AT).
(meta): The two rules have to be unfortunately duplicated for
OLD_AT, since there is no BNF OR operator in Yacc.
* txr.1: Compat note added.
* lex.yy.c.shipped: Updated.
* y.tab.c.shipped, y.tab.h.shipped: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* METALICENSE: 2020 copyrights bumped to 2021. Added note
about SHA-256 routines from Colin Percival.
* LICENSE, LICENSE-CYG, Makefile, alloca.h, args.c, args.h,
arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c,
chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h,
configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h,
filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h,
hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped,
lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h,
lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h,
parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c,
regex.h, share/txr/stdlib/asm.tl, share/txr/stdlib/awk.tl,
share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl,
share/txr/stdlib/compiler.tl, share/txr/stdlib/conv.tl,
share/txr/stdlib/copy-file.tl, share/txr/stdlib/debugger.tl,
share/txr/stdlib/defset.tl, share/txr/stdlib/doloop.tl,
share/txr/stdlib/each-prod.tl, share/txr/stdlib/error.tl,
share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl,
share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl,
share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl,
share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl,
share/txr/stdlib/package.tl, share/txr/stdlib/param.tl,
share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl,
share/txr/stdlib/pmac.tl, share/txr/stdlib/quips.tl,
share/txr/stdlib/save-exe.tl, share/txr/stdlib/socket.tl,
share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl,
share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl,
share/txr/stdlib/trace.tl, share/txr/stdlib/txr-case.tl,
share/txr/stdlib/type.tl, share/txr/stdlib/vm-param.tl,
share/txr/stdlib/with-resources.tl,
share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl,
signal.c, signal.h, socket.c, socket.h, stream.c, stream.h,
struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h,
syslog.c, syslog.h, termios.c, termios.h, time.c, time.h,
tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr,
y.tab.c.shipped: Copyright year bumped to 2021.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It looks like Flex has a "never-interactive" option which does
what the batch option should be doing: it removes the use of
isatty.
Historically, it was an %option only with no corresponding
command-line option; newer Flex has a --never-interactive
command line option also.
The batch option actually makes very little difference in the
output. If never-interactive is used, then the batch option
makes no difference.
* parser.l (%option): Add never-interactive option, remove
batch.
(VER, FLEX_VER, YY_NO_UNISTD_H, isatty, no_isatty): Remove all
these macros: under the never-interactive option, there are
no isatty calls and no inclusion of <unistd.h>.
|
|
|
|
|
|
|
|
|
| |
* parser.l (VER, FLEX_VER): New macros.
(isatty): Define the macro differently for Flex 2.5.36 and
older, to counteract the skeleton's "extern int isatty(int)"
declaration. This involves a dummy global variable.
(no_isatty): New dummy global, only under Flex 2.5.36 and
older.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The flex-generated scanner wastefully calls isatty(0) whenever
it is initialized, even though we don't read from a stdio
stream. Even (read "abc") will call isatty(0), which is
unacceptable.
Moreover, this happens even if Flex is told to operate in
batch mode rather than interactive.
With the following change, the isatty calls are gone.
Let's switch Flex to batch mode, too.
* parser.l: Remove <unistd.h> inclusion. Define isatty as a
macro that returns zero. Add the batch option to the scanner.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Flex-generated scanners contain a #include <unistd.h>
directive, whose purpose seems to be to ensure that isatty is
declared. This #include is in a bad place, long after all our
headers have been included. We define macros in lib.h and
elsewhere that can interfere with system headers.
The Homebrew distro people have run into this problem on OS/X.
Our lib.h defines a "noreturn" macro (that should arguably be
named NORETURN, to match the style for INLINE). A "noreturn"
macro interferes with __attribute__((noreturn)) syntax in
system headers. Our main strategy for not causing that problem
is to include all system headers first, before local headers.
* parser.l: include <unistd.h> right after "config.h", if
HAVE_UNISTD_H is defined. Define YY_NO_UNISTD_H to suppress
the flex-generated #include.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The c_num and c_unum functions now take a self argument for
identifying the calling function. This requires changes in a
large number of places.
In a few places, additional functions acquire a self
argument. The ffi module has the most extensive example of
this.
Some functions mention their name in a larger string, or have
scattered literals giving their name; with the introduction of
the self local variable, these are replaced by references to
self.
In the following changelog, the notation TS stands for "take
self argument", meaning that the functions acquires a new "val
self" argument. The notation DS means "define self": the functions
in question defines a self variable, which they pass down.
The notation PS means that the functions pass down an existing
self variable to functions that now require it.
* args.h (args_count): TS.
* arith.c (c_unum, c_num): TS.
(toint, exptv): DS.
* buf.c (buf_check_len, buf_check_alloc_size, buf_check_index,
buf_do_set_len, replace_buf, buf_put_buf, buf_put_i8,
buf_put_u8, buf_put_char, buf_put_uchar, buf_get_bytes,
buf_get_i8, buf_get_u8, buf_get_cptr,
buf_strm_get_byte_callback, buf_strm_unget_byte, buf_swap32,
str_buf, buf_int, buf_uint, int_buf, uint_buf): PS.
(make_duplicate_buf, buf_shrink, sub_buf, buf_print,
buf_pprint): DS.
* chksum.c (sha256_stream_impl, sha256_buf, crc32_buf,
md5_stream_impl, md5_buf): TS.
(chksum_ensure_buf, sha256_stream, sha256, sha256_hash,
md5_stream, md5, md5_hash): PS.
(crc32_stream): DS.
* combi.c (perm_while_fun, perm_gen_fun_common,
perm_str_gen_fun, rperm_gen_fun, comb_vec_gen_fun,
comb_str_gen_fun, rcomb_vec_gen_fun, rcomb_str_gen_fun): DS.
* debug.c (dbg_clear, dbg_set, dbg_restore): DS.
* eval.c (do_eval, gather_free_refs, maprodv, maprendv,
maprodo, do_args_apf, do_args_ipf): DS.
(op_dwim, me_op, map_common): PS.
(prod_common): TS.
* ffi.c (struct txr_ffi_type): release member TS.
(make_ffi_type_pointer): PS and release argument TS.
(ffi_varray_dynsize, ffi_array_in, ffi_array_put_common,
ffi_array_get_common, ffi_varray_in, ffi_varray_null_term): PS.
(ffi_simple_release, ffi_ptr_in_release, ffi_struct_release,
ffi_wchar_array_get, ffi_array_release_common,
ffi_array_release, ffi_varray_release): TS.
(ffi_float_put, double_put, ffi_be_i16_put, ffi_be_u16_put,
ffi_le_i16_put, ffi_le_u16_put, ffi_be_i32_put, ffi_be_u32_put,
ffi_le_i32_put, ffi_sbit_put, ffi_ubit_put, ffi_buf_d_put,
make_ffi_type_array, make_ffi_type_enum, ffi_type_compile,
make_ffi_type_desc, ffi_make_call_desc, ffi_call_wrap,
ffi_closure_dispatch_save, ffi_put_into, ffi_in, ffi_get,
ffi_put, carray_set_length, carray_blank, carray_buf,
carray_buf_sync, carray_cptr, carray_refset, carray_sub,
carray_replace, carray_uint, carray_int): PS.
(carray_vec, carray_list): DS.
* filter.c (url_encode, url_decode, base64_stream_enc_impl): DS.
* ftw.c (ftw_callback, ftw_wrap): DS.
* gc.c (mark_obj, gc_set_delta): DS.
* glob.c (glob_wrap): DS.
* hash.c (equal_hash, eql_hash, eq_hash, do_make_hash,
hash_equal, set_hash_traversal_limit, gen_hash_seed): DS.
* itypes.c (c_i8, c_u8, c_i16, c_u16, c_i32, c_u32, c_i64,
c_u64, c_short, c_ushort, c_int, c_uint, c_long, c_ulong): PS.
* lib.c (seq_iter_rewind): TS and becomes internal.
(seq_iter_init_with_info, seq_setpos, replace_str, less,
replace_vec, diff, isec, obj_print_impl): PS.
(nthcdr, equal, mkstring, mkustring, upcase_str, downcase_str,
search_str, sub_str, cat_str, scat2, scat3, fmt_join,
split_str_keep, split_str_set, trim_str, int_str, chr_int,
chr_str, chr_str_set, vector, vecref, vecref_l, list_vec,
copy_vec, sub_vec, cat_vec, lazy_str_put, lazy_str_gt,
length_str_ge, length_str_lt, length_str_le, cptr_size_hint,
cptr_int, out_lazy_str, out_quasi_str, time_string_local_time,
time_string_utc, time_fields_local_time, time_fields_utc,
time_struct_local, time_struct_utc, make_time, time_meth,
time_parse_meth): DS.
(init_str, cat_str_init, cat_str_measure, cat_str_append,
vscat, time_fields_to_tm, time_struct_to_tm, make_time_impl): TS.
* lib.h (seq_iter_rewind): Declaration removed.
(c_num, c_unum, init_str): Declarations updated.
* match.c (LOG_MISMATCH, LOG_MATCH): PS.
(h_skip, h_coll, do_output_line, do_output, v_skip, v_fuzz,
v_collect): DS.
* parser.c (parser, circ_backpatch, report_security_problem,
hist_save, repl, lino_fileno, lino_getch, lineno_getl,
lineno_gets, lineno_open): DS.
(parser_set_lineno, lisp_parse_impl): PS.
* parser.l (YY_INPUT): PS.
* rand.c (make_random_state): PS.
* regex.c (print_rec): DS.
(search_regex): PS.
* signal.c (kill_wrap, raise_wrap, get_sig_handler,
getitimer_wrap, setitimer_wrap): DS.
* socket.c (addrinfo_in, sockaddr_pack, fd_timeout,
to_connect, open_sockfd, sock_mark_connected,
sock_timeout): TS.
(getaddrinfo_wrap, dgram_set_sock_peer, sock_bind,
sock_connect, sock_listen, sock_accept, sock_shutdown,
sock_send_timeout, sock_recv_timeout, socketpair_wrap): DS.
* stream.c (generic_fill_buf, errno_to_string, stdio_truncate,
string_out_put_string, open_fileno, open_command, base_name,
dir_name): DS.
(unget_byte, put_buf, fill_buf, fill_buf_adjust,
get_line_as_buf, formatv, put_byte, test_set_indent_mode,
test_neq_set_indent_mode, set_indent_mode, set_indent,
inc_indent, set_max_length, set_max_depth, open_subprocess, run): PS.
(fds_subst, fds_swizzle): TS.
* struct.c (make_struct_type, super, umethod_args_fun): PS.
(method_args_fun): DS.
* strudel.c (strudel_put_buf, strudel_fill_buf): DS.
* sysif.c (errno_wrap, exit_wrap, usleep_wrap, mkdir_wrap,
ensure_dir, makedev_wrap, minor_wrap, major_wrap, mknod_wrap,
mkfifo_wrap, wait_wrap, wifexited, wexitstatus, wifsignaled,
wtermsig, wcoredump, wifstopped, wstopsig, wifcontinued,
dup_wrap, close_wrap, exit_star_wrap, umask_wrap, setuid_wrap,
seteuid_wrap, setgid_wrap, setegid_wrap,
simulate_setuid_setgid, getpwuid_wrap, fnmatch_wrap,
dlopen_wrap): DS.
(chmod_wrap, do_chown, flock_pack, do_utimes, poll_wrap,
setgroups_wrap, setresuid_wrap, setresgid_wrap, getgrgid_wrap): PS.
(c_time): TS.
* sysif.h (c_time): Declaration updated.
* syslog.c (openlog_wrap, syslog_wrap): DS.
* termios.c (termios_pack): TS.
(tcgetattr_wrap, tcsetattr_wrap, tcsendbreak_wrap,
tcdrain_wrap, tcflush_wrap, tcflow_wrap, encode_speeds,
decode_speeds): DS.
* txr.c (compato, array_dim, gc_delta): DS.
* unwind.c (uw_find_frames_by_mask): DS.
* vm.c (vm_make_desc): PS.
(vm_make_closure, vm_swtch): DS.
|
|
|
|
|
|
|
|
|
|
| |
Time for some spring cleaning.
* args.c, arith.c, buf.c, cadr.c, chksum.c, debug.c, ftw.c,
gc.c, gencadr.txr, glob.c, hash.c, lisplib.c, match.c,
parser.c, parser.l, parser.y, rand.c, signal.c, stream.c,
strudel.c, syslog.c, tree.c, unwind.c, utf8.c, vm.c: Numerous
unnecessary #include directives removed.
|
|
|
|
|
| |
* parser.l (YY_INPUT): Must use coerce because we are changing
from char * to unsigned char *.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As a result of this change, the startup time is reduced.
The command txr -e '(compile-toplevel nil)' shows a 54%
speedup: around 110 milliseconds down from around 170.
Programs that read large amounts of TXR Lisp data will
benefit.
* parser.l (YY_INPUT): Use new get_bytes function instead of
get_byte to read a buffer at a time.
* stream.c (get_bytes): New function.
* stream.h (get_bytes): Declared.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c,
args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h,
chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c,
combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c,
ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c,
glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c,
lib.h, linenoise/linenoise.c, linenoise/linenoise.h,
lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h,
parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c,
regex.h, share/txr/stdlib/asm.tl, share/txr/stdlib/awk.tl,
share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl,
share/txr/stdlib/compiler.tl, share/txr/stdlib/conv.tl,
share/txr/stdlib/debugger.tl, share/txr/stdlib/defset.tl,
share/txr/stdlib/doloop.tl, share/txr/stdlib/error.tl,
share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl,
share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl,
share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl,
share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl,
share/txr/stdlib/package.tl, share/txr/stdlib/param.tl,
share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl,
share/txr/stdlib/pmac.tl, share/txr/stdlib/save-exe.tl,
share/txr/stdlib/socket.tl, share/txr/stdlib/stream-wrap.tl,
share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl,
share/txr/stdlib/termios.tl, share/txr/stdlib/trace.tl,
share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl,
share/txr/stdlib/vm-param.tl,
share/txr/stdlib/with-resources.tl,
share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl,
signal.c, signal.h, socket.c, socket.h, stream.c, stream.h,
struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h,
syslog.c, syslog.h, termios.c, termios.h, tree.c, tree.h,
txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c,
vm.h, vmop.h, win/cleansvg.txr: Extended copyright notices
to 2020.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* lib.c (obj_print_impl): Render the new syntactic conventions
introduced in qref/uref back into the .? syntax. The printers
for qref and uref are united into a single implementation to
reduce code proliferation.
* parser.l (grammar): Produce new tokens OREFDOT and UOREFDOT.
* parser.y (OREFDOT, UREFDOT): New terminal symbols.
(n_expr): Handle .? syntax via the new OREFDOT and UOREFDOT
token via qref_helper and uoref_helper. Logic for the existing
referencing dot is moved into the new qref_helper function.
(n_dot_expr): Handle .? syntax via uoref_helper.
(uoref_helper, qref_helper): New static functions.
* share/txr/stdlib/struct.tl (qref): Handle the new case when
the expression which gives the object is (t expr).
Handle the new case when the first argument after the object
has this form, and is followed by more arguments. Both
these cases emit the right conditional code.
(uref): Handle the leading .? syntax indicated by a leading t
by generating a lambda which checks its argument for nil.
Transformations to qref handle the other cases.
* txr.1: Documentation updated in several places.
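For instance (a sketch; the struct and slot names are made up):
  (defstruct pt nil x y)
  (let ((p (new pt x 1 y 2)) (q nil))
    (list p.?x q.?x))   ;; -> (1 nil), where q.x would have thrown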
|
|
|
|
|
| |
* parser.l (parser_l_init): Assign an eq-based hash to
form_to_ln_hash.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adding binary search trees based on the new tnode cell. The
scapegoat algorithm is used, which requires no additional
storage in a cell. In the future we may go to something else,
like red-black trees, and carve out a bit in the tag field
of the cell for the red/black color.
Tree cells store only single key objects, not key/value pairs.
However, which part of the key object is compared is
determined by a custom key function stored in the tree
container. For instance, tree nodes can be cons cells, and car
can be used as the key function; the cdr then stores an
associated value.
Trees have a printed notation
#T(<props> <key>*)
where <props> is a list of up to three items:
<props> ::= ([<key-fn> [<less-fn> [<equal-fn>]]])
key-fn, less-fn and equal-fn are function names.
If they are missing or nil, they default, respectively, to
identity, less and equal.
For security, the printed notation is machine-readable only if
these options are symbols, not lambda expressions.
Furthermore, the symbols must be listed in the special
variable *tree-fun-whitelist*.
* eval.c (less_s): New symbol variable.
(eval_init): Initialize less_s.
* eval.h (less_s): Declared.
* parser.l (grammar): New #T token recognized, mapped to
HASH_T.
* parser.y (HASH_T): New terminal symbol.
(tree): New non-terminal symbol.
(i_expr, n_expr): Add tree to productions.
(fname_helper): New static function.
(yybadtoken): Map HASH_T to "#T".
* protsym.c: Tweaked accidentally; remove.
* tree.c (TREE_DEPTH_MAX): New macro.
(struct tree): New struct type.
(enum tree_iter_state): New enumeration.
(struct tree_iter): New struct type.
(tree_iter_init): New macro.
(tree_s, tree_fun_whitelist_s): New symbol variables.
(tn_size, tn_size_one_child, tn_lookup, tn_find_next,
tn_flatten, tn_build_tree, tr_rebuild,
tr_find_rebuild_scapegoat, tr_insert, tr_lookup, tr_do_delete,
tr_delete, tree_insert_node, tree_insert, tree_lookup_node,
tree_lookup, tree_delete, tree_root, tree_equal_op,
tree_print_op, tree_mark, tree_hash_op): New static functions.
(tree_ops): New static struct.
(tree): New function.
(tree_init): Initialize tree_s and tree_fun_whitelist_s symbol
variables. Register intrinsic functions tree,
tree-insert-node, tree-insert, tree-lookup-node, tree-lookup,
tree-delete, tree-root. Register special variable
*tree-fun-whitelist*.
* tree.h (tree_s, tree_fun_whitelist_s, tree): Declared.
(tree_fun_whitelist): New macro.
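A rough usage sketch (hedged; the exact printed form of the tree
object may differ):
  (let ((tr (tree)))
    (tree-insert tr 3)
    (tree-insert tr 1)
    (tree-insert tr 2)
    (tree-lookup tr 2))   ;; -> 2; tr now prints roughly as #T(() 1 2 3)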
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Binary search tree nodes are being added as a basic heap data
type. The C type tag is TNOD, and the Lisp type is tnode.
Binary search tree nodes have three elements: a key, a left
child and a right child.
The printed notation is #N(key left right). Quasiquoting
is supported: ^#N(,foo ,bar) but not splicing.
Because tnodes have three elements, they fit into TXR's
four-word heap cell, not requiring any additional memory
allocation.
These nodes are going to be the basis for a binary search tree
container, which will use the scapegoat tree algorithm for
maintaining balance.
* tree.c, tree.h: New files.
* Makefile (OBJS): Adding tree.o.
* eval.c (expand_qquote_rec): Recurse through tnode cells,
so unquotes work inside #N syntax.
* gc.c (finalize): Add TNOD to no-op case in switch; tnodes
don't require finalization.
(mark_obj): Traverse tnode cell.
* hash.c (equal_hash): Add TNOD case.
* lib.c (tnode_s): New symbol variable.
(seq_kind_tab): New entry for TNOD, mapping to SEQ_NOTSEQ.
(code2type, equal): Handle TNOD.
(obj_init): Initialize tnode_s variable.
(obj_print_impl, populate_obj_hash): Handle TNOD.
(init): Call tree_init function in tree.c.
* lib.h (enum type, type_t): New enumeration TNOD.
(struct tnod): New struct type.
(union obj, obj_t): New union member tn of type struct tnod.
(tnode_s): Declared.
* parser.c (circ_backpatch): Handle TNOD, so circular
notation works through tnode cells.
* parser.l (grammar): Recognize #N prefix, mapping to
HASH_N token.
* parser.y (HASH_N): New grammar terminal symbol.
(tnode): New nonterminal symbol.
(i_expr, n_expr): Add tnode cases to productions.
(yybadtoken): Map HASH_N to "#N" string.
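For instance (a sketch based on the notation described above):
  '#N(2 #N(1 nil nil) #N(3 nil nil))  ;; key 2, children with keys 1 and 3
  ^#N(,(+ 1 1) nil nil)               ;; unquote supplies the key 2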
|
|
|
|
|
|
|
| |
* parser.l (grammar): Fix incorrect format strings that are
supposed to show an invalid input character in hex.
For instance the character 1 comes out as #\x 1 rather
than #\x01.
|
|
|
|
|
|
|
|
|
|
| |
* parser.l (out_of_range_float): New static function.
(grammar): Check for flo_str returning nil in several places;
that value is returned for out of range floats. This is not
documented!
* txr.1: Document that flo-str returns nil for out-of-range
floats.
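For instance (a sketch):
  (flo-str "1.5")        ;; -> 1.5
  (flo-str "1.0e99999")  ;; -> nil, because the value is out of range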
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A null byte in regex and string literals is being processed as
a #\nul instead of correctly turning into #\pnul. Bad UTF-8 is
not being rejected.
* parser.l (REGCHAR, LITCHAR): Use utf8_from_buffer to
properly convert yytext using its true length, rather than
utf8_from which assumes a null-terminated string. Thus
null bytes (including the case of a yytext being single NUL)
are handled properly. Check that the result is exactly one
character (null-terminated buffer, two characters wide).
* utf8.c (utf8_from): Unused function removed.
* utf8.h (utf8_from): Declaration removed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* LICENSE, LICENSE-CYG, METALICENSE, Makefile, args.c, args.h,
arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, combi.c,
combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c,
ffi.h, filter.c, filter.h, ftw.h, gc.c, gc.h, glob.c, glob.h,
hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h,
lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h,
parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c,
regex.h, share/txr/stdlib/asm.tl, share/txr/stdlib/awk.tl,
share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl,
share/txr/stdlib/compiler.tl, share/txr/stdlib/conv.tl,
share/txr/stdlib/doloop.tl, share/txr/stdlib/error.tl,
share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl,
share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl,
share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl,
share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl,
share/txr/stdlib/package.tl, share/txr/stdlib/path-test.tl,
share/txr/stdlib/place.tl, share/txr/stdlib/pmac.tl,
share/txr/stdlib/socket.tl, share/txr/stdlib/stream-wrap.tl,
share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl,
share/txr/stdlib/termios.tl, share/txr/stdlib/trace.tl,
share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl,
share/txr/stdlib/vm-param.tl, share/txr/stdlib/with-resources.tl,
share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl,
signal.c, signal.h, socket.c, socket.h, stream.c, stream.h,
struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h,
syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h,
unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h,
win/cleansvg.txr: Extended Copyright line to 2019.
|
|
|
|
|
|
|
| |
* lib.c (chk_wrealloc): convert needs to be a coerce.
* parser.l (grammar): Use yyg instead of yyscanner;
the latter is the same pointer but of void * type.
|
|
|
|
|
|
|
|
|
|
| |
* parser.l (unicode_ident): New static function.
(BSCHR, NSCHR): Include UONLY match.
(grammar): Use unicode_ident function to validate tokens
obtained from BTOK and NTOK.
* txr.1: Documented changing definition of bident
and lident.
|
|
|
|
|
|
|
| |
* parser.l (SYM, TOK): These are just aliases for the same
pattern. SYM is referenced only in one place; TOK in multiple
places, so let's remove SYM and replace that one reference
by TOK.
|
|
|
|
|
| |
* parser.l (directive_tok): Fix printing of duplicate package
prefix on symbol.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* LICENSE, LICENSE-CYG, METALICENSE, Makefile, args.c, args.h,
arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, combi.c,
combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c,
ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c,
glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c,
lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c,
parser.h, parser.l, parser.y, protsym.c, rand.c, rand.h,
regex.c, regex.h, share/txr/stdlib/awk.tl,
share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl,
share/txr/stdlib/conv.tl, share/txr/stdlib/doloop.tl,
share/txr/stdlib/error.tl, share/txr/stdlib/except.tl,
share/txr/stdlib/ffi.tl, share/txr/stdlib/getopts.tl,
share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl,
share/txr/stdlib/ifa.tl, share/txr/stdlib/keyparams.tl,
share/txr/stdlib/op.tl, share/txr/stdlib/package.tl,
share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl,
share/txr/stdlib/pmac.tl, share/txr/stdlib/socket.tl,
share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl,
share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl,
share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl,
share/txr/stdlib/with-resources.tl,
share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl,
signal.c, signal.h, socket.c, socket.h, stream.c, stream.h,
struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h,
syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h,
unwind.c, unwind.h, utf8.c, utf8.h, win/cleansvg.txr:
Extended Copyright line to 2018.
|
|
|
|
|
|
|
|
|
|
| |
* eval.c: doesn't need rand.h.
* filter.c: doesn't need gc.h.
* parser.l: doesn't need eval.h.
* parser.y: doesn't need utf8.h, stream.h, args.h or cadr.h.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The problem is that a.b .. c.d parses as (qref a b..c d),
which is useless and counterintuitive.
Let's fix it, but with a backward compatibility switch to give
more leeway to any hapless people out there whose code happens
to depend on this unfortunate situation.
We basically use two token numbers for the .. token:
OLD_DOTDOT, and DOTDOT. Both are wired into the grammar. In
backward compatibility mode, the lexer pumps out OLD_DOTDOT.
Otherwise DOTDOT.
* parser.l (grammar): When .. is scanned, return OLD_DOTDOT
when in compatibility with 185 or earlier. Otherwise DOTDOT.
* parser.y (OLD_DOTDOT): New terminal symbol; introduced at
the same high precedence previously occupied by DOTDOT.
(DOTDOT): Changes precedence to lower than '.' and UREFDOT.
(n_expr): Two productions added involving OLD_DOTDOT.
These are copy and paste of the existing productions involving
DOTDOT; the only difference is that OLD_DOTDOT replaces
DOTDOT.
(yybadtoken): Handle OLD_DOTDOT.
* txr.1: Compat notes added.
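To illustrate the new behavior (a sketch):
  a.b .. c.d   ;; now parses as (rcons (qref a b) (qref c d))
               ;; previously: (qref a (rcons b c) d)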
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This issue was revealed as a garbage line number in
an unbound variable warning diagnostic, where the variable
occurs in a quasi word list literal. A small test case
is (list #`@var`) where var is unbound.
The fix is, in the lexer, to set the yylval->lineno for all
tokens which are declared as <lineno> in the grammar file,
for which doing so has been neglected. We do this even for
those tokens whose line number values are never accessed in any
rule; the need could arise in the future.
* parser.l (grammar): Set the yylval->lineno for the tokens
HASH_BACKSLASH, HASH_B_QUOTE, HASH_SLASH, WORDS, WSPLICE,
QWORDS and QWSPLICE.
|