txr - TXR: A data munging language.

	Commit message	Author	Age	Files	Lines
*	buf: compression tests.	Kaz Kylheku	2022-05-30	1	-0/+18
	* buf.c (buf_compress): Let's use the level value of -1 if not specified, so Zlib defaults it to 6, or whatever. * tests/012/buf.tl: New tests. * txr.1: Note that -1 is a valid level value and that is the default.
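The level convention described above belongs to zlib itself, so it can be sketched outside TXR. The following is only an illustration using Python's standard zlib binding, not the buf_compress code; the sample data and variable names are made up for the example.

```python
import zlib

# Hypothetical sample input; any repetitive byte string compresses well.
data = b"TXR: a data munging language. " * 64

# Z_DEFAULT_COMPRESSION is the constant -1. Passing it asks zlib to pick
# its own default level instead of hard-coding one in the caller.
default_stream = zlib.compress(data, zlib.Z_DEFAULT_COMPRESSION)

# An explicit level, for comparison.
level_six_stream = zlib.compress(data, 6)

# Whatever level zlib chooses, decompression must restore the input.
round_trip_ok = zlib.decompress(default_stream) == data

print(len(data), len(default_stream), len(level_six_stream), round_trip_ok)
```

Deferring to -1 means the caller does not have to know which level the library currently treats as its default.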
*	utf8: bugfix: trailing char fragment ignored.	Kaz Kylheku	2022-05-20	1	-0/+6
	After "years of trouble-free operation" a bug in the UTF-8 decoder was found, which violates its property that any sequence of bytes will decode to some kind of string, which will encode to the original bytes. When the UTF-8 data prematurely ends in the middle of a valid character, the decoder just drops that data as if it didn't exist. So for instance the two-byte sequence E6 BC should decode to "\xDCE6\xDCBC", since it is a fragment of a three-byte UTF-8 sequence. It actually decodes to the empty string. * utf8.c (utf8_from_buffer): When the buffer is exhausted, if we are not in the utf8_init state, it means we were in the middle of a UTF-8 sequence. Walk the bytes from the backtrack point to the end of the buffer and store them into the string as U+DCxx codes. * tests/012/buf.tl: Tests added for this via buf-str, str-buf.
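The round-trip property described above can be illustrated with Python's "surrogateescape" error handler, which uses a similar U+DCxx convention for undecodable bytes. This is an analogy only: it is not TXR's decoder, and the two schemes may differ in other details.

```python
# E6 BC is the first two bytes of a three-byte UTF-8 sequence.
fragment = bytes([0xE6, 0xBC])

# Each undecodable byte 0xNN is mapped to the code point U+DCNN,
# so the truncated character is preserved rather than dropped.
decoded = fragment.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes.
reencoded = decoded.encode("utf-8", errors="surrogateescape")

# With the third byte present, the sequence decodes normally.
complete = bytes([0xE6, 0xBC, 0xBC]).decode("utf-8")

print([hex(ord(ch)) for ch in decoded], reencoded == fragment, hex(ord(complete)))
```

A decoder that silently discards the fragment would produce the empty string here, and the re-encoded bytes would no longer match the input.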
*	buf: bugfix: int-buf, uint-buf refer to alloc size.	Kaz Kylheku	2021-05-04	1	-0/+4
	* buf.c (int_buf, uint_buf): Refer to the buffer length b->len rather than the underlying allocation size b->size. Referring to b->size will not only produce the wrong value when it is larger than len, but b->size can be null for a borrowed buffer, producing a crash. * tests/012/buf.tl: Tests.
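The difference between a buffer's length and its allocation size can be sketched with a toy model. Everything here is hypothetical: the Buf class, its fields and both conversion functions are stand-ins for illustration, not the structures in buf.c, and big-endian byte order is an assumption of the example.

```python
class Buf:
    """Toy buffer: `length` bytes are in use, `size` is the allocation.

    A size of None models a borrowed buffer that has no allocation
    size of its own (an assumption made for this sketch).
    """

    def __init__(self, payload, capacity=None):
        self.length = len(payload)
        self.size = capacity
        padding = 0 if capacity is None else capacity - self.length
        # Unused tail of the allocation, filled with zero bytes.
        self.data = bytes(payload) + bytes(padding)


def uint_buf(buf):
    # Correct: convert only the bytes that are in use.
    return int.from_bytes(buf.data[:buf.length], "big")


def uint_buf_wrong(buf):
    # Buggy variant: converts the whole allocation, padding included,
    # and cannot work at all when there is no allocation size.
    if buf.size is None:
        raise TypeError("borrowed buffer has no allocation size")
    return int.from_bytes(buf.data[:buf.size], "big")


owned = Buf(b"\x01\x02", capacity=8)
borrowed = Buf(b"\x01\x02")
print(uint_buf(owned), uint_buf_wrong(owned), uint_buf(borrowed))
```

In this model the buggy variant reads six padding bytes as part of the integer, and it fails outright on the borrowed buffer, which parallels the two failure modes named in the commit message.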