* doc: fix bad deindent under copy-path-rec. (Kaz Kylheku, 2021-05-28; 1 file changed, -0/+1)
  * txr.1: Add missing .IP after .RE to return the indentation.

* json: tojson function. (Kaz Kylheku, 2021-05-28; 5 files changed, -0/+90)
  * eval.c (eval_init): tojson intrinsic registered.
  * lib.c (tojson): New function.
  * lib.h (tojson): Declared.
  * txr.1: Documented.
  * share/txr/stdlib/doc-syms.tl: Updated.

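  A rough illustration of what the new function does (hypothetical listener
  session; the exact spacing and number formatting of the output string may
  differ):

    1> (tojson "hello")
    "\"hello\""
    2> (tojson #("a" "b"))
    "[\"a\",\"b\"]"
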
* json: two parser bugfixes. (Kaz Kylheku, 2021-05-28; 2 files changed, -266/+268)
  * parser.y (json): Add forgotten call to end_of_json in the '^' production.
    (json_val): Pass zero to vector, not 0, which is nil.
  * y.tab.c.shipped: Updated.

* json: indentation support for printing. (Kaz Kylheku, 2021-05-28; 1 file changed, -5/+47)
  * lib.c (out_json_rec): Save, establish and restore indentation when
    printing [ ] and { } notation. Enforce line breaks, and force a line
    break after the object if one occurred in the object.
    (out_json): Turn on indentation if it is off (but not if it is forced
    off). Restore after doing the object.

* json: printing support. (Kaz Kylheku, 2021-05-28; 2 files changed, -0/+195)
  First cut, without line breaks or indentation.
  * lib.c (out_json_str, out_json_rec, out_json): New static functions.
    (obj_print_impl): Hook in JSON printing via out_json.
  * txr.1: Add notes about output to JSON extensions.

* json: support forgotten null object. (Kaz Kylheku, 2021-05-28; 3 files changed, -3227/+3273)
  The JSON null will map to the Lisp null symbol. I thought about using :
  but that could cause surprises; for instance, when it is passed to a
  function as an optional argument, it will trigger the default value.
  * parser.l (JSON): Add rules for producing the null keyword.
  * txr.1: Documented.
  * lex.yy.c.shipped: Updated.

* json: handling for bad UTF-8 bytes, NUL and \u0000. (Kaz Kylheku, 2021-05-28; 3 files changed, -3181/+3189)
  * parser.l <JLIT>: Convert the \u0000 sequence to the U+DC00 code point,
    the pseudo-null. Also include JLIT in the rule for catching bad bytes
    that are not matched by {UANYN}.
  * txr.1: Document this treatment as extensions to JSON.
  * lex.yy.c.shipped: Updated.

* json: hash issues with quasiquoting. (Kaz Kylheku, 2021-05-28; 3 files changed, -1721/+1750)
  There are two problems. One is that in #J{~foo:"bar"} the foo expression
  is parsed while the scanner is in Lisp mode, and the : token is grabbed
  by the parser as a lookahead token in the same mode. Thus it comes back
  as a SYMTOK (the colon symbol). We fix this by recognizing the colon
  symbol as a synonym of the colon character, but only if it is spelled
  ":" in the syntax: we look at the lexeme.

  The second problem is that even if we fix the above, ~foo produces a
  sys:unquote form which is rejected because it is not a string. We fix
  this in the most straightforward way: by deleting the code which
  restricts keys to strings, an extension I had already thought about
  making anyway.

  * parser.y (json_col): New non-terminal. This recognizes either a SYMTOK
    or a ':' token, with the semantic restriction that the SYMTOK's lexeme
    must be ":".
    (json_vals): Use yybadtok instead of a generic error.
    (json_pairs): Do not require keys to be strings. Use json_col instead
    of ':', and also yybadtok instead of a generic error.
  * y.tab.c.shipped: Updated.

* vim: syntax highlighting for #J syntax. (Kaz Kylheku, 2021-05-27; 1 file changed, -8/+46)
  * genvim.txr (dig19, bvar, dir, list): New variables.
    (txr_bracevar, tl_bracevar, tl_directive, txr_list, txr_bracket,
    txr_mlist, txr_mbracket): Use variables to specify common contents.
    JSON stuff added.
    (txr_ign_tok): Specify contents using @list.
    (txr_jkeyword, txr_jerr, txr_jpunc, txr_jesc, txr_juesc, txr_jnum):
    New matches.
    (txr_junqlist, txr_jstring, txr_jarray, txr_jhash): New regions.

* doc: document json syntax support. (Kaz Kylheku, 2021-05-27; 2 files changed, -124/+280)
  * txr.1: Documented #J, #J^ and json macro.
  * share/txr/stdlib/doc-syms.tl: Updated.

* json: omission in quasiquoted array. (Kaz Kylheku, 2021-05-27; 2 files changed, -258/+266)
  * parser.y (json_val): Handle json_vals being a list, indicating
    quasiquoting. We must nreverse it and turn it into a sys:vector-lit
    form.
  * y.tab.c.shipped: Updated.

* json: implement distinguished json quasiquote. (Kaz Kylheku, 2021-05-27; 5 files changed, -3932/+3954)
  Because #J<json> produces the (json ...) form that translates into
  quote, ^#J<json> yields a quasiquote around a quote. This has some
  disadvantages, because it requires an explicit eval in some situations
  to do what the programmer wants.

  Here, we introduce an alternative: the syntax #J^<json> will produce a
  quasiquote instead of a quote. The new translation scheme is

    #J X   ->  (json quote <X>)
    #J^ X  ->  (json sys:qquote <X>)

  where <X> denotes the Lisp object translation of JSON syntax X.

  * parser.c (me_json): The quote symbol is now already in the json form,
    so all that is left to do here is to take the cdr to pop off the json
    symbol.
  * parser.l (JPUNC, NJPUNC): Allow ^ to be a punctuator in JSON mode.
  * parser.y (json): For regular #J, generate the new (json quote ...)
    syntax. Implement #J^, which sets up the nonzero quasi_level around
    the processing of the JSON syntax, so that everything is in a
    quasiquote, finally producing the (json sys:qquote ...) syntax.
  * lex.yy.c.shipped, y.tab.c.shipped: Updated.

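  A minimal sketch of how the two reader prefixes are meant to be used
  (illustrative only; x is a hypothetical binding, and the printed results
  of the forms are not shown):

    ;; plain #J: the whole JSON text is quoted data
    #J[1, 2, 3]

    ;; #J^: quasiquote, so ~ can inject the value of a Lisp expression
    (let ((x "b"))
      #J^["a", ~x])
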
* json: support quasiquoting. (Kaz Kylheku, 2021-05-27; 6 files changed, -5159/+5410)
  * parser.h (end_of_json_unquote): Declared.
  * parser.l (JPUNC, NJPUNC): Add ~ and * characters to the set of JSON
    punctuators.
    (grammar): Allow the closing brace character in the NESTED, SPECIAL
    and QSPECIAL states to be a token. This is because it occurs as a
    lookahead character in this situation: #J{"foo":~expr}. The lexer
    switches from the JSON to the NESTED start state when it scans the ~
    token, so that expr is treated as Lisp. But then } is consumed as a
    lookahead token by the parser in that same mode; when we pop back to
    JSON mode, the } token has already been scanned in NESTED mode. We add
    two new rules in JSON mode to the lexer to recognize the ~ unquote and
    ~* splicing unquote. Both have to push the NESTED start condition.
    (end_of_json_unquote): New function.
  * parser.y (JSPLICE): New token.
    (json_val): Logic for unquoting. The array and hash rules must now be
    prepared to deal with json_vals and json_pairs producing a list object
    instead of a hash or vector. That is the signal that the data contains
    active quasiquotes and must be translated to the special literal
    syntax for quasiquoted vectors and hashes. Here we also add the rules
    for ~ and ~* unquoting syntax, including managing the lexer's
    transition back to the JSON start condition.
    (json_vals, json_pairs): We add the logic here to recognize unquotes
    in the quasiquoting state. This is more clever than the way it is done
    in the Lisp areas of the grammar. If no quasiquotes occur, we
    construct a vector or hash, respectively, and add to it. If unquotes
    occur and we are nested in a quasiquote, we switch the object to a
    list, and continue it that way.
    (yybadtoken): Handle JSPLICE.
  * lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.

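  The kind of input this makes possible, sketched from the example in the
  commit text (expr and items are hypothetical bindings):

    ;; ~ unquotes a Lisp expression as a JSON value
    #J^{"foo": ~expr}

    ;; ~* is the splicing unquote, inserting the elements of a sequence
    ;; into a JSON array
    #J^[1, ~*items, 4]
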
* json: extension: allow circle notation. (Kaz Kylheku, 2021-05-26; 4 files changed, -4746/+4808)
  * parser.l (HASH_N_EQUALS, HASH_N_HASH): Recognize these tokens in the
    JSON start state also.
  * parser.y (json_val): Add the circular syntax, exactly like it is done
    for n_expr and i_expr. And it works!
  * lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.

* New #J syntax for JSON objects in TXR Lisp. (Kaz Kylheku, 2021-05-26; 7 files changed, -5447/+6269)
  (needs buffer literal error message cleanup)

  * parser.c (json_s): New symbol variable.
    (is_balanced_line): Follow braces out of the initial state. This
    concession allows the listener to accept input like #J{"a":"b"}.
    (me_json): New static function (macro expander). The #J X syntax
    produces a (json Y) form, with the JSON syntax X translated to a Lisp
    object Y. If that is evaluated, this macro translates it to (quote Y).
    (parse_init): Initialize the json_s variable with the interned symbol,
    and register the json macro.
  * parser.h (json_s): Declared.
    (end_of_json): Declared.
  * parser.l (num_esc): Treat u escape sequences in the same way as x.
    This function can then be used for handling the \u escapes in JSON
    string literals.
    (DIG19, JNUM, JPUNC, NJPUNC): New lex named patterns.
    (JSON, JLIT): New lex start conditions.
    (grammar): Recognize the #J syntax, mapping to the HASH_J token, which
    transitions into the JSON start state. In the JSON start state, handle
    all the elements: numbers, keywords, arrays and objects. Transition
    into the JLIT state. In the JLIT start state, handle all the elements
    of JSON string literals, including surrogate pair escapes. JSON
    literals share the {UANY} fallback pattern with other literals.
    (end_of_json): New function.
  * parser.y (HASH_J, JSKW): New token symbols.
    (json, json_val, json_vals, json_pairs): New nonterminal symbols, and
    rules.
    (i_expr, n_expr): Generate the json nonterminal, to hook the stuff
    into the grammar.
    (yybadtoken): Handle JSKW and HASH_J tokens.
  * lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.

* scanner: tweak buffer literal error message. (Kaz Kylheku, 2021-05-26; 2 files changed, -2/+2)
  * parser.l (BUFLIT): When reporting a bad character, do not show it in
    the form of an escape sequence.
  * lex.yy.c.shipped: Updated.

* Version 260 (tag: txr-260) (Kaz Kylheku, 2021-05-26; 6 files changed, -290/+326)
  * RELNOTES: Updated.
  * configure, txr.1: Bumped version and date.
  * share/txr/stdlib/ver.tl: Bumped.
  * txr.vim, tl.vim: Regenerated.

* tests: fix vtest being a hindrance to error finding. (Kaz Kylheku, 2021-05-25; 3 files changed, -12/+15)
  * tests/common.tl (vtest): Only if the expected expression is :error or
    (quote :error) do we wrap the expansion and evaluation of the test
    expression with exception handling, because only then do we expect an
    error. When the expected expression is anything else, we don't
    intercept any errors, and so problems in test cases are easier to
    debug now.
  * tests/012/struct.tl: In one case we must initialize the
    *gensym-counter* to 4 to compensate for the change in vtest, to get
    the same gensym numbers in the output.

* listener: complete on structs and FFI typedefs. (Kaz Kylheku, 2021-05-25; 1 file changed, -7/+11)
  * parser.c (find_matching_syms): Apply De Morgan's law to the symbol
    binding tests, to use a sum of positive tests instead of a product of
    negations. Check for struct and FFI type bindings in the default case.
    The default and '[' cases are rearranged so that the '[' case omits
    these, so as not to complete on a struct or FFI typedef after a [.

* window-map: add tests, improve doc, add examples. (Kaz Kylheku, 2021-05-25; 2 files changed, -23/+83)
  * tests/012/seq.tl: New tests.
  * txr.1: Improve documentation of window-map's :wrap and :reflect. Add
    examples.

* doc: issue in identity function heading. (Kaz Kylheku, 2021-05-25; 2 files changed, -3/+4)
  * txr.1: Fix heading repeating identity instead of listing identity and
    identity*.
  * share/txr/stdlib/doc-syms.tl: Regenerated.

* doc: functions apply to arguments not vice versa. (Kaz Kylheku, 2021-05-25; 1 file changed, -27/+32)
  * txr.1: Fix numerous instances of text which uses the wording that
    arguments are applied to a function. A few of the changes repair
    wording that was entirely botched.

* doc: maintenance in description of toint, tofloat. (Kaz Kylheku, 2021-05-25; 1 file changed, -4/+6)
  * txr.1: Improve wording, eliminate superfluous comma.

* window-map: broken :wrap and :reflect. (Kaz Kylheku, 2021-05-25; 2 files changed, -11/+54)
  * lib.c (window_map_list): Rewrite the :wrap and :reflect support. The
    main issue with these is that they only sample items from the front
    of the input list and generate both flanks of the boundary from that
    prefix; :reflect is additionally buggy due to applying nreverse to a
    sub which can return the original sequence.
  * tests/012/seq.tl: Some test coverage for window-map.

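  A sketch of the behavior the fix is aiming for, assuming window-map's
  documented (range boundary function sequence) argument order; the values
  shown are illustrative and not taken from the test suite:

    ;; with :wrap, the flanks of the boundary windows come from the
    ;; opposite end of the sequence, not from its front
    (window-map 1 :wrap (fun list) '(1 2 3 4))
    ;; expected: ((4 1 2) (1 2 3) (2 3 4) (3 4 1))
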
* matcher: allow hash pattern to omit values. (Kaz Kylheku, 2021-05-24; 3 files changed, -8/+59)
  The @(hash ...) operator now allows key-only patterns like (42) or (@x),
  where x could be bound or unbound. This has separate semantics from when
  a value is present.
  * share/txr/stdlib/match.tl (compile-hash-match): Implement.
  * tests/011/patmatch.tl: Test.
  * txr.1: Document.

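  One reading of the literal-key case, as a hypothetical example (h stands
  for some hash object; the unbound-variable case has its own semantics
  and is not shown):

    ;; key-only pattern: match if the key 42 is present, whatever its value
    (when-match @(hash (42)) h :found-42)
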
* matcher: fix funny comma placement. (Kaz Kylheku, 2021-05-24; 1 file changed, -2/+2)
  * share/txr/stdlib/match.tl (compile-hash-match): Fix unquoting comma
    that had been strangely moved to the previous line.

* parser: improve diagnostic for unterminated exprs. (Kaz Kylheku, 2021-05-24; 2 files changed, -4/+10)
  * parser.y (parse): When issuing the diagnostic indicating the likely
    starting line of the unterminated expression, instead of mentioning
    that line in the diagnostic text, let's just issue the diagnostic
    against that line. The programmer's text editor can then jump to that
    line.
  * y.tab.c.shipped: Updated.

* compiler: bugfix: warnings deferred too far. (Kaz Kylheku, 2021-05-23; 1 file changed, -1/+4)
  Because with-compilation-unit is keyed off of *load-recursive*, when
  compilation happens in the context of a load (a top-level form in a
  loaded file calls compile-file or compile-update-file), warnings are
  deferred until the end of the load. That might never occur if the load
  doesn't complete, because, say, the image quits for some reason.

  If the following is the content of a file which is loaded:

    (compile-file "foo")
    (exit 0)

  then warnings during the compilation are not issued when compile-file
  terminates, and will never be issued because of the termination due to
  the exit call.

  * share/txr/stdlib/compiler.tl (*in-compilation-unit*): New special
    variable.
    (with-compilation-unit): Use *in-compilation-unit* to determine when a
    compilation unit has ended, and dump all the deferred warnings then.
    We will bind *load-recursive* because that is required for deferring
    warnings.

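  A sketch of the kind of grouping with-compilation-unit provides
  (hypothetical file names; the point is that deferred warnings are dumped
  when the outermost unit ends, even if a later top-level form exits):

    (with-compilation-unit
      (compile-file "foo")
      (compile-file "bar"))
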
* ffi: fix crash: carray argument type. (Kaz Kylheku, 2021-05-22; 1 file changed, -1/+1)
  * ffi.c (make_ffi_type_pointer): Set the by_value_in flag only if the in
    function has been specified. Otherwise tft->in is a null pointer and
    will be used if this pointer type appears as an argument.

* eval: bugfix: expand keys in case{q,ql,qual}*. (Kaz Kylheku, 2021-05-21; 1 file changed, -2/+2)
  * eval.c (me_case): When we evaluate the keys of a caseq*, caseql* or
    casequal* construct, we must use expand_eval. I ran into this problem
    trying to use constants defined as symbol macros as keys.

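  The kind of usage that motivated the fix, as a hypothetical sketch (the
  names red, green and color are invented for illustration):

    (defsymacro red 0)
    (defsymacro green 1)

    (let ((color 1))
      (caseql* color
        (red "stop")
        (green "go")
        (t "unknown")))  ;; keys are macro-expanded before being evaluated,
                         ;; so this selects "go"
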
* doc: improvements in ARGUMENTS AND OPTIONS. (Kaz Kylheku, 2021-05-21; 1 file changed, -14/+80)
  * txr.1: Round out the documentation with various missing details, or
    details that bear repeating in more than one place in the document.

* txr/doc: refer to arguments, not data-files. (Kaz Kylheku, 2021-05-21; 2 files changed, -2/+2)
  * txr.1: The arguments after the script-file are not necessarily data
    files; they can have any meaning.
  * txr.c (help): Also adjust the help text.

* txr: match help text wording to doc. (Kaz Kylheku, 2021-05-21; 1 file changed, -6/+6)
  * txr.c (help): Refer to "script-file" rather than "query-file", just
    like the documentation.

* quips: new TTY joke, and take on familiar saying. (Kaz Kylheku, 2021-05-21; 1 file changed, -0/+2)
  * share/txr/stdlib/quips.tl (%quips%): Entries added.

* mpi: bug converting most negative 64 bit value. (Kaz Kylheku, 2021-05-21; 1 file changed, -2/+3)
  * mpi/mpi.c (s_mp_in_big_range): If the value is negative, extend the
    range. This is exactly the same fix as what was applied to mp_in_range
    in 2019 in commit 11b5c567124a61d8e8249a0fbcce47f2688573c6. This
    function should have been fixed at the same time. The corresponding
    test cases now pass.

* match: binary-integer conv tests for #x-8000... (Kaz Kylheku, 2021-05-21; 1 file changed, -0/+21)
  * tests/016/arith.tl: Test providing coverage for the most negative
    two's complement integer, #x-800...00, in various sizes. The 64 bit
    cases are failing.

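  For reference, the 64-bit boundary value these tests exercise, shown in
  a hypothetical listener session (plain integer arithmetic only):

    1> (- (expt 2 63))
    -9223372036854775808
    2> #x-8000000000000000
    -9223372036854775808
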
* mpi: incorrect unsigned integer extraction. (Kaz Kylheku, 2021-05-21; 1 file changed, -4/+6)
  * mpi/mpi.c (mp_get_uintptr, mp_get_double_uintptr): Fix loops which
    shift and mask the bignum digits together in the wrong way. The
    post-iteration shift is also wrong. We are fine in mp_get_uintptr
    because the affected code is inside an #if that isn't actually in
    effect: bignum digits are pointer-sized. The bug in
    mp_get_double_uintptr affects the conversion of bignums to 64 bits on
    32 bit platforms.

* mpi: bug in range test predicates. (Kaz Kylheku, 2021-05-21; 1 file changed, -3/+3)
  * mpi/mpi.c (mp_in_range, s_mp_in_big_range): The ptrnd calculation here
    is wrong; it adds together dissimilar units: bits and bytes. In the
    case of mp_in_range, we are okay by fluke, because the calculation
    works out to 1 anyway. We would not be okay if an mp_digit were half
    the size of a pointer. In s_mp_in_big_range we have a problem. On 32
    bit platforms, ptrnd is wrongly calculated as 1 rather than 2, and so
    values perfectly in range are rejected.

* math: add some tests related to integer conversion. (Kaz Kylheku, 2021-05-21; 1 file changed, -0/+50)
  * tests/016/arith.tl: Add tests covering the fixnum/bignum knee, and FFI
    operations of various sizes that provide coverage of various
    conversion routines.

* listener: don't complete on unbound symbols. (Kaz Kylheku, 2021-05-18; 1 file changed, -4/+3)
  This patch prevents Tab-completing on interned symbols that have no
  binding. The current behavior is, for instance:

    1> 'hamsandwich
    hamsandwich
    2> 'ham[Tab]
    2> 'hamsandwich    ;; completes

  The new behavior will not complete hamsandwich, because it has no
  binding as a function or variable.

  * parser.c (find_matching_syms): Treat the default case the same as
    after '[': a function or variable binding is required, or the symbol
    is not listed. Use fboundp instead of lookup_fun. They are the same,
    except in TXR 127 compat mode, which includes macros under fboundp.

* doc: macrolet doesn't contain top-level forms. (Kaz Kylheku, 2021-05-17; 1 file changed, -0/+28)
  * txr.1: For eval and for compilation, document that symacrolet and
    macrolet do not enclose multiple top-level forms.

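  A small sketch of the rule being documented (hypothetical macro name;
  each top-level form is processed separately, so the macrolet's scope
  ends with its own form):

    (macrolet ((mac () ''hello))
      (mac))          ;; fine: (mac) is inside the macrolet form

    (mac)             ;; not fine at top level: mac is not defined here
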
* doc: deindent top-level form rules. (Kaz Kylheku, 2021-05-17; 1 file changed, -1/+0)
  * txr.1: Because we are not in indented paragraphs, we don't need the
    opening .RS.

* doc: a round of documentation fixes. (Paul A. Patience, 2021-05-17; 1 file changed, -98/+190)
  * txr.1: Hyphenation, punctuation, spelling and formatting fixes
    throughout the document.

* doc: incorrect precedence of repeat special clauses. (Kaz Kylheku, 2021-05-17; 1 file changed, -16/+28)
  * txr.1: The precedence among the repeat clauses is documented
    incorrectly: the @(mod) clause has a lower precedence than @(modlast)
    and @(last). Redocumenting this area for better clarity, and
    mentioning why @(empty) isn't in the precedence list. This issue was
    reported by Paul A. Patience with a patch, which I reworked.

* doc: deffi, defplace: syntax heading. (Paul A. Patience, 2021-05-17; 1 file changed, -4/+4)
  * txr.1: Use let-like small indentation for the clauses of defplace.
    Formatting adjusted slightly by K. K. Remove spurious defmacro element
    in deffi syntax.

* doc: rewrite flawed doc for sme operator. (Kaz Kylheku, 2021-05-16; 1 file changed, -15/+42)
  * txr.1: The documentation of the semantics of sme contradicts itself by
    neglecting to specify that the middle part of the input is searched
    for a match for the middle pattern mpat. Let's fix this by giving
    detailed semantics in bulleted form.

* lib: sys_rplacd misnamed parameter. (Kaz Kylheku, 2021-05-14; 1 file changed, -3/+3)
  * lib.c (sys_rplacd): Change parameter name from new_car to new_cdr, for
    obvious reasons.

* compiler: better code for global var definitions. (Kaz Kylheku, 2021-05-14; 4 files changed, -10/+33)
  * eval.c (rt_defvarl): More accurate self string.
    (rt_defv): New static function: like rt_defvarl but ensures that the
    new variable has a binding cell, and returns that cell instead of the
    hash cell.
    (op_defvarl): Take advantage of rt_defv to not have to cons up the
    binding cell.
    (eval_init): Register the sys:rt-defv intrinsic.
  * parser.c (read_file_common): Compiled files are now version 7, so we
    must recognize them. We still load version 6 files because rt:defvarl
    still exists for them.
  * share/txr/stdlib/compiler.tl (expand-defvarl): Improve the generated
    code in two ways. Firstly, use the new sys:rt-defv, which returns the
    binding cell, so that the value can be stored into it with rplacd
    without having to cons up anything. Secondly, if there is no value
    expression, don't emit the code to do the assignment.
    (%tlo-ver%): Bump the compiled file version to (7 0).
  * txr.1: Add a note about TXR 260 loading versions 7 and 6.

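  A rough sketch of the improvement to the generated code, under the
  assumptions stated in the commit (illustrative only; not the literal
  compiler output):

    (defvarl *limit* 100)
    ;; old scheme: cons up a binding cell and hand it to the runtime
    ;; new scheme: sys:rt-defv returns the binding cell, so the compiled
    ;; code can do, in effect:
    ;;   (rplacd (sys:rt-defv '*limit*) 100)
    ;; and (defvarl *limit*) with no value emits no assignment at all.
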
* doc: sort doc-syms, and html-decode. (Kaz Kylheku, 2021-05-13; 2 files changed, -2049/+2063)
  * genman.txr: Generate doc-syms as a sorted list fed to hash-from-pairs.
    Now the symbols won't jump around so much whenever we update it. Also,
    the names must be HTML-decoded. For instance "str<" was being stored
    as "str&lt;", causing (doc 'str<) to fail. We use TXR @(output) to
    adjust the formatting as if it were maintained by hand.
  * share/txr/stdlib/doc-syms.tl: Regenerated.

* Version 259 (tag: txr-259) (Kaz Kylheku, 2021-05-13; 6 files changed, -865/+920)
  * RELNOTES: Updated.
  * configure, txr.1: Bumped version and date.
  * share/txr/stdlib/ver.tl: Bumped.
  * txr.vim, tl.vim: Regenerated.