txr - TXR: A data munging language.

	Commit message	Author	Age	Files	Lines
*	Change return value spec for base64 stream functions.	Kaz Kylheku	2017-08-22	2	-9/+27
\| \| \| \| \| \| \| \| \|	* filter.c (base64_stream_enc, base64_stream_dec): Count bytes encoded or decoded (using a fast integral counter which efficiently overflows to a Lisp value that may be a bignum). * txr.1: Doc updated.
*	buffers: fix infinite loop in buf_grow.	Kaz Kylheku	2017-08-22	1	-6/+8
\| \| \| \| \| \| \|	* buf.c (buf_grow): When size is zero and len is nonzero, the loop doesn't terminate. Replace silly loop with straightforward calculation: grow buffer by 25%, capped at INT_PTR_MAX, or grow to the length, whichever is larger.
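The growth rule described in this commit can be sketched roughly as follows. This is only an illustration, not the actual buf_grow code: the function name grow_size is hypothetical, and the standard INTPTR_MAX stands in for TXR's INT_PTR_MAX. Because the result is computed in one step, a zero starting size cannot cause the non-termination that the old loop suffered from.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not TXR's buf_grow): given the current allocation
   size and the length that must fit, return the new allocation size. */
static intptr_t grow_size(intptr_t size, intptr_t len)
{
    const intptr_t max = INTPTR_MAX;  /* assumed stand-in for INT_PTR_MAX */
    intptr_t delta, grown;

    assert(size >= 0 && len >= 0);

    delta = size / 4;                 /* 25% growth step */

    /* Saturate at max instead of overflowing. */
    grown = (size > max - delta) ? max : size + delta;

    /* Take whichever is larger: the grown size or the requested length. */
    return (len > grown) ? len : grown;
}
```

With a size of 0 and a length of 10, the 25% step contributes nothing and the requested length wins, so the function returns 10 immediately.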
*	parser: bugfix: empty buf literal problem.	Kaz Kylheku	2017-08-22	1	-1/+2
\| \| \| \| \| \| \|	* parser.y (buflit): Fix neglect to call end_of_buflit in the empty buffer literal case, which precipitates syntax errors when an empty buffer literal #b'' is embedded in other syntax.
*	Default the length argument of truncate-stream.	Kaz Kylheku	2017-08-21	2	-2/+19
\| \| \| \| \| \| \| \| \| \|	* stream.c (truncate_stream): If the len argument is missing, default to the current position, obtained by using the seek operation. (stream_init): Fix up registration of truncate-stream for one optional argument. * txr.1: Documentation of truncate-stream updated.
*	Update and expose base64 stream functions.	Kaz Kylheku	2017-08-18	3	-22/+107
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* filter.c (base64_stream_enc): Change return value behavior. Return in unlimited mode, or number of bytes encoded. (get_base64_char): Stop reading when an invalid character is encountered, push it back and return 0. (b64_code): Don't throw for invalid characters. This case now only occurs if 0 is passed in. (base64_stream_dec): Drop nchars argument. Read until get_base64_char returns 0 due to EOF or an invalid character. (base64_decode): Don't pass third arg to base64_stream_dec. (filter_init): base64-stream-enc and base64-stream-dec intrinsics registered. * filter.h (base64_stream_dec): Declaration updated. * txr.1: Documented.
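The "decode until EOF or an invalid character" behavior can be illustrated with a small sketch over a C string instead of a TXR stream. All names here (b64_val, b64_decode_prefix) are hypothetical and not taken from filter.c. As a simplifying assumption, the '=' padding character is treated like any other non-alphabet character, so decoding simply stops there.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: map a base64 alphabet character to its 6-bit value,
   or return -1 for anything else, including NUL and '='. */
static int b64_val(int ch)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    const char *p = (ch != 0) ? strchr(tbl, ch) : NULL;
    return p ? (int) (p - tbl) : -1;
}

/* Hypothetical: decode src into out until the end of the string or the
   first non-base64 character. Returns the number of bytes decoded and
   stores the stopping position in *end, so the offending character is
   left unconsumed for the caller. */
static size_t b64_decode_prefix(const char *src, unsigned char *out,
                                const char **end)
{
    size_t n = 0;
    unsigned acc = 0;   /* bit accumulator */
    int bits = 0;       /* number of valid low bits in acc */

    for (;; src++) {
        int v = b64_val((unsigned char) *src);
        if (v < 0)
            break;      /* end of input or invalid character */
        acc = (acc << 6) | (unsigned) v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out[n++] = (unsigned char) ((acc >> bits) & 0xFF);
            acc &= (1u << bits) - 1;   /* keep only the leftover bits */
        }
    }

    *end = src;
    return n;
}
```

The caller must supply an output array large enough for the decoded data; three bytes for every four input characters is sufficient.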
*	Revising out-of-memory handling.	Kaz Kylheku	2017-08-18	5	-31/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't want to be aborting on OOM, but throwing an exception. * lib.c (alloc_error_s): New symbol variable. (oom_realloc): Global variable removed. (oom): New static function. (chk_malloc, chk_malloc_gc_more, chk_calloc, chk_realloc): Call oom instead of removed oom_realloc handler. (env): Throw alloc-error rather than error by calling oom. (obj_init): Initialize alloc_error_s. (init): Drop function pointer argument; do not initialize removed oom_realloc. * lib.h (alloc_error_s): Declared. (oom_realloc): Declaration removed. (init): Declaration updated. * txr.1: Type tree diagram includes alloc-error.
*	vec-set-length maintenance.	Kaz Kylheku	2017-08-17	1	-4/+13
\| \| \| \| \| \| \|	* lib.c (vec_set_length): Check new length against INT_PTR_MAX rather than size_t limit. We want to keep the length a fixnum. If the allocation needs to increase, grow it by 25%, not by doubling it.
*	Rewriting string-extend.	Kaz Kylheku	2017-08-17	1	-26/+25
\| \| \| \| \| \| \| \| \| \|	* lib.c (string_extend): Restructure internals with these goals: no loop: calculate needed space in one step; if the allocation needs to grow, then grow it by 25% or step it up to exactly the needed size, whichever of the two is larger. Overflow check against INT_PTR_MAX, since len and alloc fields of string are not fixnums but integers extracted with c_num.
*	parser: fix byacc regression related to hash-semi.	Kaz Kylheku	2017-08-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "byacc_fool" rule needed a small update when in November 2016 it became involved in newly introduced #; (hash semicolon) syntax for commenting out. The problem is that when a top-level nested list expression is followed by #; then a byacc-generated parser throws a syntax error. This is because the byacc_fool production only generates a n_expr, or empty. Thus only those symbols are allowed which may begin a n_expr. The hash-semicolon token isn't one of them. * parser.y (byacc_fool): Add a production rule which generates a HASH_SEMI. Thus HASH_SEMI is now one of the terminals that may legally follow any sentential form that matches a byacc_fool rule after it.
*	parser: more efficient treatment of string literals.	Kaz Kylheku	2017-08-17	1	-27/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In this patch we switch the string literal parser from right recursion to left recursion, so that it doesn't require a Yacc stack depth proportional to the number of characters in the literal. Secondly, we build the string directly in a syntax-directed way, rather than building a list of characters and then walking the list to build a string. This was discovered as a byacc regression, though the fix is not for the sake of byacc but basic efficiency. The Sep 29, 2015 commit 111650e235ab2e529fa1529b1c9a23688a11cd1f "Implementation of static slots for structures." extended the string literal in the struct.tl test case in such a way that if the parser is generated by byacc rather than GNU Bison, the test case fails with a "yacc stack overflow". I haven't done any regression testing with byacc in over two years so I didn't notice this. Quasiliterals could use this treatment also. Word list literals benefit from this change, but they still use a Yacc stack depth proportional to the number of words, since the accumulation of words is right recursive. * parser.y (lit_char_helper): Static function removed. (restlitchar): New grammar nonterminal symbol. (strlit, quasi_item, wordslit): No need to call lit_char_helper. (litchars): A litchars is now either a single LITCHAR, or else a LITCHARS followed by a sequence of more. This sequence is a separate production rule called restlitchar, which is purely left recursive. (If litchars is made directly left recursive without this helper rule, intractable reduce/reduce and shift/reduce conflicts arise.)
*	Get continuations working on aarch64.	Kaz Kylheku	2017-08-16	2	-15/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* unwind.c (UW_CONT_FRAME_BEFORE, UW_CONT_FRAME_AFTER): New preprocessor symbols. (revive_cont): The "frame slack" zero-filled padding logic is replaced by capturing an actual part of the real stack before the uw_stack unwind frame. On aarch64, there is content there we must actually capture. Experiment shows that exactly 128 bytes is enough, and that corresponds to the frame_slack value. (capture_cont): Capture UW_CONT_FRAME_BEFORE bytes before the uw_stack unwind frame location. Also, the "capture_extra" is replaced by UW_CONT_FRAME_AFTER constant, to harmonize. * unwind.h (UW_FRAME_ALIGN): New preprocessor symbol. (union uw_frame): On aarch64, we ask the compiler, via a GCC-specific attribute syntax, to align the address of frame objects to a 16 byte boundary. This solves a crash in the continuation code. Continuation capture is keyed to unwind frame addresses. When a captured continuation is revived onto the stack, the delta between its original address and the revive address must be a multiple of 16. The reason is that this preserves the stack and frame pointer alignment. Certain instructions on the aarch64 require the stack pointer to be 16 byte aligned. There are other ways we could achieve this, but the easiest is to align the original frames to 16 bytes.
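The alignment argument above comes down to simple address arithmetic: if both the original and the revive frame addresses sit on 16 byte boundaries, their difference is a multiple of 16, so relocating the captured stack segment leaves every aligned pointer inside it aligned. The sketch below only illustrates that arithmetic. FRAME_ALIGN, align_down and delta_preserves_alignment are hypothetical names, not symbols from unwind.c or unwind.h.

```c
#include <stdint.h>

#define FRAME_ALIGN 16   /* assumed alignment, matching the 16 bytes above */

/* Hypothetical: round an address down to the nearest aligned boundary. */
static uintptr_t align_down(uintptr_t addr)
{
    return addr & ~(uintptr_t) (FRAME_ALIGN - 1);
}

/* Hypothetical: return 1 if moving a frame from orig to revive shifts it
   by a multiple of FRAME_ALIGN, and 0 otherwise. */
static int delta_preserves_alignment(uintptr_t orig, uintptr_t revive)
{
    uintptr_t delta = (orig > revive) ? orig - revive : revive - orig;
    return delta % FRAME_ALIGN == 0;
}
```

Two addresses produced by align_down always satisfy delta_preserves_alignment, which is why aligning the original frames is enough.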
*	Port to aarch64 (ARM 8).	Kaz Kylheku	2017-08-16	3	-2/+72
\| \| \| \| \| \| \| \| \| \| \| \|	Continuations don't work yet. * gc.c (STACK_TOP_EXTRA_WORDS): New macro. (mark): On aarch64, we must include four words above the stack top. Some live root pointers sometimes hide there which are not in any of the callee-saved registers that end up in the machine context via jmp_save. * jmp.S (jmp_save, jmp_restore): Implement for aarch64.
*	Allow character inputs in some bit operations.	Kaz Kylheku	2017-08-16	2	-1/+43
\| \| \| \| \| \| \| \| \|	* arith.c (logand, logior, logxor): Allow one operand to be a character, if the opposite operand is a fixnum integer. The result is a character. (bit): Allow the value being tested to be a character. * txr.1: Updated.
*	ffi: new FFI type I/O functions.	Kaz Kylheku	2017-08-16	3	-0/+139
\| \| \| \| \| \| \| \| \|	* ffi.c (put_obj, get_obj, fill_obj): New functions. (ffi_init): put-obj, get-obj, fill-obj intrinsics registered. * ffi.h (put_obj, get_obj, fill_obj): Declared. * txr.1: Documented.
*	buf: provide way to work with on-stack buffers.	Kaz Kylheku	2017-08-16	2	-3/+7
\| \| \| \| \| \| \| \|	* buf.c (init_borrowed_buf): New function. (make_borrowed_buf): Reduced to wrapper around init_borrowed_buf. * buf.h (init_borrowed_buf): Declared.
*	buf: new buffer stream.	Kaz Kylheku	2017-08-14	5	-0/+397
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* buf.c (struct buf_strm): New struct type. (buf_strm_mark, buf_strm_put_byte_callback, buf_strm_put_string, buf_strm_put_char, buf_strm_put_byte, buf_strm_get_byte_callback, buf_strm_get_char, buf_strm_get_byte, buf_strm_unget_char, buf_strm_unget_byte, buf_strm_seek, buf_strm_truncate, buf_strm_get_prop, buf_strm_set_prop, buf_strm_get_error, buf_strm_get_error_str): New static functions. (buf_strm_ops): New static struct. (buf_strm): New static function. (make_buf_stream, get_buf_from_stream): New functions. (buf_init): Register new intrinsic functions make-buf-stream and get-buf-from-stream. Call fill_stream_ops on new buf_strm_ops to fill default operations in place of function pointers that have been left null. * buf.h (make_buf_stream, get_buf_from_stream): Declared. * lisplib.c (with_stream_set_entries): Add with-out-buf-stream and with-in-buf-stream to auto-load symbols for with-stream.tl module. * share/txr/stdlib/with-stream.tl (with-out-buf-stream, with-in-buf-stream): New macros. * txr.1: New section about buffer streams.
*	buf: tiny code improvement.	Kaz Kylheku	2017-08-14	1	-1/+1
\| \| \| \| \|	* buf.c (buf_grow): Use the previously calculated delta value, rather than re-evaluating the equivalent expression.
*	bugfix: buf-put-uchar	Kaz Kylheku	2017-08-14	1	-1/+1
\| \| \| \| \|	* buf.c (buf_put_uchar): Fix wrong conversion that is causing this function to reject values in the 128-255 range.
*	bugfix: seek-stream :from-end not working.	Kaz Kylheku	2017-08-14	1	-1/+1
\| \| \| \| \|	* stream.h (enum strm_whence): Fix strm_end and strm_start being duplicate values; strm_end must map to SEEK_END.
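The kind of bug fixed here is easy to reproduce in miniature. If two enumerators of a "whence" enum share a value, a seek relative to the end silently becomes a seek relative to the start. The sketch below is an assumption-laden illustration, not the contents of stream.h: the enum name whence and its members are hypothetical, and it assumes each member is defined directly in terms of the standard SEEK_* constants.

```c
#include <stdio.h>

/* Hypothetical enum: each member is tied to a distinct stdio constant,
   so "end" can never collide with "start". */
enum whence {
    whence_start = SEEK_SET,
    whence_cur = SEEK_CUR,
    whence_end = SEEK_END
};

/* Hypothetical helper: return the length of a seekable stream by seeking
   to its end, or -1 on failure. The position is left at the end. */
static long stream_length(FILE *f)
{
    if (fseek(f, 0L, whence_end) != 0)
        return -1L;
    return ftell(f);
}
```

Had whence_end carried the same value as whence_start, stream_length would seek to offset 0 and report a length of 0 for every stream.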
*	base64 functions: factor out stream filtering internals.	Kaz Kylheku	2017-08-09	2	-15/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The base64_encode and base64_decode functions internally work with streams. This change factors out those internals into separate functions (with the intent that these will be usefully exposed, in another commit). * filter.c (base64_stream_enc): New function, made out of the internals of base64_encode. (base64_encode): Simple wrapper for base64_stream_enc. (base64_stream_dec): New function, made out of the internals of base64_decode. (base64_decode): Simple wrapper for base64_stream_dec. * filter.h (base64_stream_enc, base64_stream_dec): Declared.
*	ffi: new buf-carray function.	Kaz Kylheku	2017-08-08	3	-0/+29
\| \| \| \| \| \| \| \| \|	* ffi.c (buf_carray): New function. (ffi_init): Registered buf-carray intrinsic. * ffi.h (buf_carray): Declared. * txr.1: Documented.
*	New divides function.	Kaz Kylheku	2017-08-07	3	-0/+53
\| \| \| \| \| \| \| \| \|	* arith.c (divides): New function. (arith_init): Intrinsic registered. * arith.h (divides): Declared. * txr.1: Documented.
*	Make len a synonym for length.	Kaz Kylheku	2017-08-07	2	-2/+10
\| \| \| \| \| \| \|	* eval.c (eval_init): Register the same function under length and len. * txr.1: Documented.
*	New spl and tok: variants of tok-str and split-str.	Kaz Kylheku	2017-08-07	4	-0/+86
\| \| \| \| \| \| \| \|	* eval.c (eval_init): Register spl and tok intrinsics. * lib.c (spl, tok): New functions. * txr.1: Documented.
*	tok-str requires two arguments, not just one.	Kaz Kylheku	2017-08-07	1	-1/+1
\| \| \| \|	* eval.c (eval_init): Fix incorrect registration of tok-str.
*	bugfix: n-ary arith functions must check single arg.	Kaz Kylheku	2017-08-05	2	-9/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We are allowing calls like (* "a") and (+ "a") without diagnosing that the argument isn't of a valid type. Note that (max "a") is fine because min and max use the less function; they are not strictly numeric. * lib.c (nary_op): Beef up function with additional argument for type checking the unary case. (unary_num, unary_arith, unary_int): New static functions. (plusv, mulv, logandv, logiorv): Use new nary_op interface. (gtv, ltv, gev, lev, numeqv, numneq): Check the first number. * lib.h (nary_op): Declaration updated.
*	Add sum and prod convenience functions.	Kaz Kylheku	2017-08-05	4	-0/+60
\| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): prod and sum intrinsics registered. * lib.c (sum, prod): New functions. * lib.h (sum, prod): Declared. * txr.1: Documented.
*	New functions digpow and digits.	Kaz Kylheku	2017-08-05	3	-0/+143
\| \| \| \| \| \| \| \| \| \| \|	* arith.c (digcommon): New static function. (digpow, digits): New functions. (arith_init): New digpow and digits intrinsic functions registered. * arith.h (digpow, digits): Declared. * txr.1: New functions documented.
*	Bugfix: (sys:expr . atom) bad syntax out of parser.	Kaz Kylheku	2017-08-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	* parser.y (expand_meta): Fix incorrect conversion of (sys:var x) when x is a non-bindable term to (sys:expr . x). Should be (sys:expr x). This doesn't have that much of an impact, I don't think. It prevents certain degenerate forms from working like @(bind x @"str"). The bad thing is that this particular one has a silent problem: @"str" wrongly evaluates to #\s. Nevertheless, this doesn't seem worth the addition of a compat flag test; the odds of someone depending on @"str" producing #\s in some pattern language code seem vanishingly low.
*	Bi-directional string tree match for non-vars.	Kaz Kylheku	2017-08-02	2	-2/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is an inconsistency in @(bind) in that given @(bind x y) where x is a variable, both directions are tried for a string tree match. x could be tree of strings and y a string atom, or vice versa. But if x is just an atom, or a Lisp evaluation, then only one direction is tried. @(bind @(list "a" "b") "a") succeeds, but @(bind "a" @(list "a" "b")) fails. * match.c (dest_bind): Test both directions in the scalar and Lisp evaluated cases of the left hand side. Subject to compatibility, just in case. * txr.1: Compat note added.
*	doc: aret description refers to ret.	Kaz Kylheku	2017-08-02	1	-1/+1
\| \| \| \| \|	* txr.1: Fix description of aret, which wrongly refers to the ret macro.
*	doc: explain qref and uref.	Kaz Kylheku	2017-08-02	1	-6/+17
\| \| \| \| \|	* txr.1: Under the qref and uref operators, mention what these stand for and how the terminology is derived.
*	doc: note about global symbol macros.	Kaz Kylheku	2017-08-02	1	-0/+4
\| \| \| \| \|	* txr.1: Adding note that a symbol cannot be a global symbol macro and global variable at the same time.
*	doc: missing formatting in qref equivalence.	Kaz Kylheku	2017-08-02	1	-0/+2
\| \| \| \| \|	* txr.1: equivalence between .(qref ...) and (uref ...) now typeset properly in monospaced font.
*	doc: mention listener in Lisp intro.	Kaz Kylheku	2017-08-02	1	-4/+6
\| \| \| \| \| \|	* txr.1: Remove clumsy "firstly, secondly, thirdly" because we need a "fourthly" which is too much. Intro now mentions that Lisp evaluation is also possible via the listener.
*	doc: cross reference different #b.	Kaz Kylheku	2017-08-02	1	-0/+13
\| \| \| \| \|	* txr.1: Note under #b binary number syntax that #b is also used for buffer literals, and vice versa.
*	bugfix: spurious nils in pad function's output.	Kaz Kylheku	2017-08-02	1	-5/+6
\| \| \| \| \| \|	* eval.c (pad): Incoming sequence must be nullified, otherwise empty vectors and strings produce a spurious nil. This affects the weave function, which uses pad.
*	genvim: ^ is constituent of identifiers.	Kaz Kylheku	2017-08-01	1	-1/+1
\| \| \| \| \|	* genvim.txr (iskeyword): add ^ character. Now r^ and others are colorized properly.
*	Evaluate doloop forms in an implicit tagbody.	Kaz Kylheku	2017-07-31	2	-12/+22
\| \| \| \| \| \| \| \| \| \|	This eliminates one incompatibility between doloop and ANSI CL do. * share/txr/stdlib/doloop.tl (sys:expand-doloop): Wrap body in tagbody form. * txr.1: Documentation updated.
*	doc: note about label symbols in tagbody.	Kaz Kylheku	2017-07-31	1	-0/+11
\| \| \| \| \|	* txr.1: Note added that a tagbody label may be any symbol whatsoever.
*	Small code cleanup in tagbody.	Kaz Kylheku	2017-07-31	1	-4/+3
\| \| \| \| \| \| \|	* share/txr/stdlib/tagbody.tl (tagbody): Reduce unnecessary use of DWIM brackets to parentheses in calculation of bblocks. Remove entry-lbl local variable, propagating its initform to its one and only use site.
*	bugfix: tagbody mustn't expose anonymous block.	Kaz Kylheku	2017-07-30	1	-8/+9
\| \| \| \| \| \| \| \|	* share/txr/stdlib/tagbody.tl (tagbody): Use progn for the trivial case, and in the ordinary case, the sys:for-op special form directly rather than the for loop macro. sys:for-op doesn't introduce a block; the for macro is doing that.
*	Optimize trivial tagbody.	Kaz Kylheku	2017-07-30	1	-35/+37
\| \| \| \| \| \| \| \| \|	* share/txr/stdlib/tagbody.tl (tagbody): If the body contains no labels, then emit a simple block. Note that we should just be emitting a progn here; however, there is a bug in tagbody in that there is an anonymous block. This is not documented, and a consequence of the looping construct used. So for now we preserve that behavior in the reduced case.
*	listener: handle incomplete buf literals.	Kaz Kylheku	2017-07-30	1	-1/+23
\| \| \| \| \|	* parser.c (is_balanced_line): Handle #b'...' syntax with some new states and transitions.
*	New macros doloop and doloop*.	Kaz Kylheku	2017-07-30	3	-0/+236
\| \| \| \| \| \| \| \| \| \|	* lisplib.c (doloop_set_entries, doloop_instantiate): New functions. (lisplib_init): Register autoload for doloop macros. * share/txr/stdlib/doloop.tl: New file. * txr.1: Documented.
*	doc: grammar under Ranges.	Kaz Kylheku	2017-07-29	1	-1/+1
\| \| \| \|	* txr.1: Superfluous article a deleted, and sentence reworded.
*	genvim: flag trailing junk in #x #o #b literals.	Kaz Kylheku	2017-07-29	1	-9/+14
\| \| \| \| \| \| \| \| \| \| \|	* genvim.txr (txr_pnum): New match; matches a superset of the #x, #o and #b literals with the inclusion of trailing alphanumeric junk. Highlighted as Error. (txr_xnum, txr_onum, txr_bnum): New match categories, formed by renaming the previous #x, #o and #b matches. These are contained in txr_pnum, highlighted as Number. (txr_bracevar, txr_directive, txr_list, txr_bracket, txr_mlist, txr_mbracket): Include txr_pnum.
*	doc: struct literals: bad syntax synopsis.	Kaz Kylheku	2017-07-29	1	-1/+1
\| \| \| \|	* txr.1: Fix incorrect #H prefix which should of course be #S.
*	doc: grammar in setuid section.	Kaz Kylheku	2017-07-29	1	-1/+1
\| \| \| \|	* txr.1: anything code -> any code.
*	genvim: flag invalid # syntax.	Kaz Kylheku	2017-07-28	1	-0/+2
\| \| \| \| \| \| \|	* genvim.txr (txr_error): New match in this category for # followed by something other than H, S or R. Some characters other than these are valid after #, but are covered by explicit matches that occur later.