Commit message / Author / Age / Files / Lines
* ffi: rework endian-type rput/rget routines on big endian.Kaz Kylheku2025-02-151-24/+40
| | | | | | | | | | | | | * ffi.c (ffi_be_i16_rput, ffi_be_i16_rget, ffi_be_u16_rput, ffi_be_u16_rget, ffi_be_i32_rput, ffi_be_i32_rget, ffi_be_u32_rput, ffi_be_u32_rget, ffi_le_i16_rput, ffi_le_i16_rget, ffi_le_u16_rput, ffi_le_u16_rget, ffi_le_i32_rput, ffi_le_i32_rget, ffi_le_u32_rput, ffi_le_u32_rget): Rewritten to avoid memory clearing, memsets, pointer arithmetic and use of helper functions. The big endian rput and rget functions just wrap the non-endian versions. The ones which need byte swapping work in terms of a full ffi_arg word. For instance, to prepare a 16 bit big endian unsigned return value we byte swap the uint16_t, then convert to ffi_arg.
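A minimal sketch of the "byte swap, then widen" step described above, assuming 64-bit `word_t` and `sword_t` types as stand-ins for libffi's `ffi_arg` and `ffi_sarg`. The function names are hypothetical and are not the routines in ffi.c.

```c
#include <stdint.h>

/* Hypothetical stand-ins for ffi_arg / ffi_sarg on a 64-bit target. */
typedef uint64_t word_t;
typedef int64_t sword_t;

/* Exchange the two bytes of a 16-bit value. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t) ((v << 8) | (v >> 8));
}

/* Unsigned case: swap first, then zero-extend to the full word. */
static word_t swapped_u16_to_word(uint16_t v)
{
    return (word_t) swap16(v);
}

/* Signed case: swap the raw bits, reinterpret as int16_t, then
   sign-extend to the full word. */
static sword_t swapped_i16_to_word(int16_t v)
{
    return (sword_t) (int16_t) swap16((uint16_t) v);
}
```

Working on the whole word means no separate clearing of the unused bytes is needed: the widening conversion defines every bit of the result.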
* ffi: big endian: broken be-int16 closure return.Kaz Kylheku2025-02-131-1/+1
| | | | | | | * ffi.c (ffi_be_i16_rput): We need to memset the remaining parts of the 64 bit word to 0, like in all the other ffi_be_xxx_put functions that are less than 64 bits wide. Also, the (void) tft cast is removed since tft is used.
* ffi: big endian: broken int8 and uint8 return values.Kaz Kylheku2025-02-131-4/+6
| | | | | | | | | | | | * ffi.c (ffi_i8_rput, ffi_i8_rget, ffi_u8_rput, ffi_u8_rget): These functions are not doing the correct job; they are just casting the pointer to the target type, like on little endian. The big endian rget must fetch the entire 64 bit word (ffi_arg) and convert its value to the target type. If it's a character value, the actual bits are found at *(src + 7) not at *src. The rput function must do the reverse; convert the value to the 64 bit ffi_arg and store that.
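The whole-word convention described above can be sketched as follows. This is only an illustration: `word_t` is an assumed 64-bit stand-in for `ffi_arg`, and `u8_rput_word`, `u8_rget_word` and `host_is_little_endian` are hypothetical names, not the functions in ffi.c.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for libffi's ffi_arg on a 64-bit target. */
typedef uint64_t word_t;

/* Store a uint8_t return value by widening it to a full word. */
static void u8_rput_word(void *dst, uint8_t val)
{
    word_t w = (word_t) val;
    memcpy(dst, &w, sizeof w);
}

/* Fetch the full word and narrow it to the target type. */
static uint8_t u8_rget_word(const void *src)
{
    word_t w;
    memcpy(&w, src, sizeof w);
    return (uint8_t) w;
}

/* Returns 1 if the host stores the least significant byte first. */
static int host_is_little_endian(void)
{
    word_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a little endian host the value byte lands at offset 0 of the buffer, which is why a plain pointer cast happens to work there. On a big endian host it lands at offset 7, matching the `*(src + 7)` observation in the commit message.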
* vm: missed cases of signal check in backwards branchKaz Kylheku2025-02-071-6/+15
| | | | | | | | | | | | Only the JMP instruction is checking for a backwards branch and calling sig_check_fast() so that a loop can be interrupted by Ctrl-C. The compiler can optimize that so that a backwards jump is performed by an instruction in the IF family. * vm.c (vm_if, vm_ifq, vm_ifql): Check for a backwards branch and call sig_check_fast. Also, eliminate the redundant call to vm_insn_bigop, which is just a masking macro. The ip variable is already the result of vm_insn_bigop.
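A minimal sketch of the idea, using a toy interpreter rather than the TXR virtual machine. The opcodes, the `insn` layout and `check_signals` are all hypothetical; the point is only that a conditional branch whose target is at or before the current instruction polls for signals, just as an unconditional jump would.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy machine with one counter register. */
enum op { OP_DEC, OP_IFNZ, OP_END };

struct insn {
    enum op op;
    size_t target;   /* branch target, used by OP_IFNZ only */
};

static unsigned long sig_checks;   /* counts simulated signal polls */

static void check_signals(void)
{
    sig_checks++;
}

/* Run the code and return the final counter value. */
static long run(const struct insn *code, long counter)
{
    size_t ip = 0;

    for (;;) {
        struct insn in = code[ip];

        switch (in.op) {
        case OP_DEC:
            counter--;
            ip++;
            break;
        case OP_IFNZ:
            if (counter != 0) {
                /* Backward branch: this may be a loop, so poll. */
                if (in.target <= ip)
                    check_signals();
                ip = in.target;
            } else {
                ip++;
            }
            break;
        case OP_END:
            return counter;
        }
    }
}
```

Without the poll in the conditional branch, a loop compiled down to a single backward conditional branch would never reach a signal check and could not be interrupted.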
* read-until-match: streamline get_char calls.Kaz Kylheku2025-01-301-3/+7
| | | | | | | | | | | | | Similarly to what was done in get_csv, we optimize the use of get_char and unget_char. I see a 7.5% speed improvement in a simple benchmark of the awk macro with the record separator rs set to #/\n/. * regex.c (scan_until_common): Obtain the strm_ops operations of the stream, and pull the low level get_char and unget_char virtual operations from it. Call these directly in the loop. Thereby, we avoid all the type checking overhead in these functions.
* get-csv: further get-char optimization.Kaz Kylheku2025-01-301-15/+6
| | | | | | | | | | | Another 5-6% gained from this. * stream.c (us_get_char, us_unget_char): Static functions removed. (get_csv): Retrieve the get_char and unget_char pointers from the strm_ops structure outside of the loop, and then just call these pointers. Careful: the unget_char virtual has reversed parameters compared to the global function.
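The hoisting pattern used in this and the preceding read-until-match commit can be sketched as below. The `stream` and `stream_ops` types here are hypothetical and much simpler than TXR's `strm_ops`; the sketch only shows fetching the operation pointers once, outside the loop, and calling them directly.

```c
#include <stddef.h>

struct stream;

/* Hypothetical operations table, loosely modeled on the strm_ops idea. */
struct stream_ops {
    int (*get_char)(struct stream *);          /* returns -1 at end */
    void (*unget_char)(struct stream *, int);
};

struct stream {
    const struct stream_ops *ops;
    const char *data;   /* null-terminated input */
    size_t pos;
};

static int str_get_char(struct stream *s)
{
    unsigned char ch = (unsigned char) s->data[s->pos];
    if (ch == 0)
        return -1;
    s->pos++;
    return ch;
}

static void str_unget_char(struct stream *s, int ch)
{
    (void) ch;   /* the string stream only needs to step back */
    if (s->pos > 0)
        s->pos--;
}

static const struct stream_ops string_ops = { str_get_char, str_unget_char };

/* Count characters before the delimiter; the delimiter is pushed back. */
static size_t count_until(struct stream *s, int delim)
{
    /* Fetch the operations once, outside of the loop. */
    int (*get)(struct stream *) = s->ops->get_char;
    void (*unget)(struct stream *, int) = s->ops->unget_char;
    size_t n = 0;
    int ch;

    while ((ch = get(s)) != -1) {
        if (ch == delim) {
            unget(s, ch);
            break;
        }
        n++;
    }
    return n;
}
```

The trade-off is that the caller must already know the object is a valid stream, since the per-call type checks of the public entry points are bypassed.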
* get-csv: use unsafe version string-extend.Kaz Kylheku2025-01-303-11/+49
| | | | | | | | | | | | | | | Another almost 16% speedup. * lib.c (us_length_STR): New static function. (string_extend): Use us_length_STR, since we know the object is of type STR. (us_string_extend_STR_CHR): New function. (length_str): Handle STR case via us_length_STR. * lib.h (us_string_extend_STR_CHR): Declared. * stream.c (get_csv): Use us_string_extend_STR_CHR instead of string_extend.
* string-extend: don't use set macro to update length.Kaz Kylheku2025-01-301-1/+1
| | | | | | | * lib.c (string_extend): We know that num_fast + delta is in the fixnum range, because we checked this condition. So we can just assign it without informing the garbage collector. This yields about a 16% speedup in get-csv.
* awk: add CSV support.Kaz Kylheku2025-01-303-3/+82
| | | | | | | | | | | * stdlib/awk (awk-state upd-rec-to-f): Handle a new case of fs being the keyword symbol :csv, producing a field-splitting lambda that calls get-csv. * tests/015/awk-basic.tl: Several new test cases for this CSV feature. * txr.1: Documented.
* get-csv: speed up with unsafe get-char.Kaz Kylheku2025-01-301-4/+17
| | | | | | | | | I'm seeing about a 6% improvement in get-csv from this. * stream.c (us_get_char, us_unget_char): New static functions, which assume all arguments have correct type. (get_csv): If we use source_opt, validate that it's a stream with class_check. Use us_get_char and us_unget_char.
* cobj: optimize subclass checks based on depth 1 assumptionKaz Kylheku2025-01-301-10/+6
| | | | | | | | | | | | | | | | | Nowhere in the image do we have cobj_class inheritance deeper than one. No class has a superclass which itself has a superclass. Based on this, we can eliminate loops coded to handle the general case. * lib.c (sutypep, cobjclassp): Do not iterate to chase the chain of super pointers. Do the subclass check based on the assumption that there is at most a super pointer to a class which itself then has no super. (cobj_register_super): Assert if the situation occurs that a class is registered with a super that is not a root. All these calls take place on startup, so if the assumption is wrong, the assert will be 100% reproducible.
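A minimal sketch of a depth-one subclass check guarded by a registration-time assertion. The `cls` structure and both function names are hypothetical; they only illustrate the assumption described above, not the cobj_class code in lib.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical class descriptor with at most one level of inheritance. */
struct cls {
    const char *name;
    struct cls *super;
};

/* Registration enforces the invariant: a superclass must be a root. */
static void cls_register_super(struct cls *cls, struct cls *super)
{
    assert(super->super == NULL);
    cls->super = super;
}

/* With depth limited to one, no loop over the super chain is needed. */
static int cls_subclass_p(const struct cls *cls, const struct cls *target)
{
    return cls == target || cls->super == target;
}
```

Because registration happens in one place, a violation of the assumption fails immediately at startup instead of producing a wrong answer from the loop-free check later.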
* vector: ensure minimum alloc size.Kaz Kylheku2025-01-291-6/+7
| | | | | | | | | | | | | | Like in a recent commit for mkstring, we impose a minimum allocation size of 6 for vectors, which means 8 cells together with the two information words at the base of the vector. * lib.c (vec_own): Take an alloc parameter in addition to the length, which is stored in v[vec_alloc]. (vector): Impose a minimum alloc size of 6. (copy_vec, nested_vec_of_v): Pass alloc parameter to vec_own which is the same as the length parameter; i.e. no behavior change for these functions.
* string-extend: grow faster.Kaz Kylheku2025-01-291-2/+2
| | | | | | | * lib.c (string_extend): When more space is needed in the string, grow by 50% rather than 25%. This speeds up code at the expense of some wasted space. Wasted space can be dealt with by the final flag in programs where it matters.
* mkstring: minimum 7 char alloc size.Kaz Kylheku2025-01-291-2/+3
| | | | | | * lib.c (mkstring): Do not allocate less than 8 characters, including null terminator, to the string. This speeds up code which builds up strings from empty, one character at a time.
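A sketch of an allocation policy combining a minimum capacity of 8 (including the terminator) with 50% growth, as described in this commit and the string-extend commit above it. The function names and the exact arithmetic are assumptions for illustration; overflow handling is omitted.

```c
#include <stddef.h>

/* Minimum capacity in characters, including the null terminator. */
enum { MIN_ALLOC = 8 };

/* Capacity for a new string of len characters plus terminator. */
static size_t initial_alloc(size_t len)
{
    size_t need = len + 1;
    return need < MIN_ALLOC ? MIN_ALLOC : need;
}

/* New capacity when the string must hold `need` elements, terminator
   included. Grows by 50%, or straight to `need` if that is larger. */
static size_t grown_alloc(size_t alloc, size_t need)
{
    size_t bigger = alloc + alloc / 2;

    if (alloc >= need)
        return alloc;
    return bigger < need ? need : bigger;
}
```

Starting at 8 and growing geometrically means a string built one character at a time is reallocated only a logarithmic number of times relative to its final length.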
* build: remove HAVE_MALLOC_USABLE_SIZE.Kaz Kylheku2025-01-293-66/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The malloc_usable_size use in the STR type actually makes operations like string_extend substantially slower. It is faster to store the allocated size locally. Originally, on platforms that have malloc_usable_size, we were able to use the word of memory reclaimed from the string type to store a cached hash code. But that logic was reverted in 2022, so there is no such benefit. * configure (have_malloc_usable_size): Variable removed. Test for the malloc_usable_size function removed. (HAVE_MALLOC_USABLE_SIZE, HAVE_MALLOC_NP): Do not define these preprocessor symbols. * lib.c (HAVE_MALLOC_NP_H): Do not test for this variable to include <malloc_np.h>. (string_own, string, string_utf8, mkstring, mkustring, string_extend, string_finish, string_set_code, string_get_code, length_str): Eliminate #ifdefs on HAVE_MALLOC_USABLE_SIZE. * lib.h (struct wstring): Eliminate #ifdef on MALLOC_USABLE_SIZE, so alloc member is unconditionally defined on all platforms.
* awk: use prepared lambdas for field separation.Kaz Kylheku2025-01-283-64/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Handle field separations with lambdas, similarly to record separation. The idea is that we replace the rec-to-f method, which contains a cond statement checking the variables for which field separation discipline applies, with a lambda which is updated whenever any of those variables change. * awk.tl (awk-state): New instance slot, rec-to-f. (awk-state :postinit): Call new upd-rec-to-f method so that rec-to-f is populated with the default field separating lambda. (awk-state rec-to-f): Method removed. (awk-state upd-rec-to-f): New method, based on rec-to-f. This doesn't perform the field separation, but returns a lambda which will perform it. (awk-state loop): We must call upd-rec-to-f whenever we change par-mode, because it influences field separation. (awk-mac-let): Replace the symbol macros fs, ft, fw and kfs with new implementations that use the reactive slot mechanism provided by rslot. Whenever the awk macro assigns any of these, the upd-rec-to-f method will be called. * tests/015/awk-basic.tl: New file. These basic tests of field separation pass before and after this change. * tests/common.tl (otest, motest): New macros.
* doc: *print-flo-format*: show string value in quotes.Kaz Kylheku2025-01-251-1/+1
| | | | | * txr.1: The example possible value "~3,4f" should be shown as a string literal in quotes.
* get-csv: bugfix: return nil on EOF.Kaz Kylheku2025-01-243-5/+40
| | | | | | | | | | | | | | | | | * stream.c (get_csv): Let's add a new state init. If get_char returns nil and we are in the init state, let's bail to a nil return. While we are at it, let's not allocate the record or string until we read at least one character. If we read a character in the init state, let's allocate those two objects, and then change to the rfield state and fall through to it to handle the character. * tests/010/csv.tl: Fix one incorrect test: (tocsv "") now returns nil, as it should. Add tests for multiple record extraction, also covering missing line termination on the last record as well as CR-LF termination. * txr.1: Documented nil return conditions.
* New functions for producing CSV.Kaz Kylheku2025-01-244-0/+111
| | | | | | | | | | | | * stream.c (put_csv, tocsv): New functions. (stream_init): put-csv and tocsv intrinsics registered. * stream.h (put_csv, tocsv): Declared. * tests/010/csv.tl (mtest-pcsv): New macro. New test cases. * txr.1: Documented.
* doc: split Data Interchange Support section.Kaz Kylheku2025-01-241-1/+3
| | | | * txr.1: Split Data Interchange Support into JSON and CSV.
* get-csv: refactor into switches.Kaz Kylheku2025-01-211-19/+26
| | | | | | | * stream.c (get_csv): All cases handle end-of-stream the same way, so we check for nil outside of the case switch. Then only characters need to be handled, so we can call c_chr(ch) and switch on it.
* get-csv: rewrite in C.Kaz Kylheku2025-01-214-91/+79
| | | | | | | | | | | | | | * autoload.c (csv_set_entries, csv_instantiate): Functions removed. (autoload_init): Autoload registration for stdlib/csv removed. * stdlib/csv.tl: File removed. * stream.c (get_csv): New function. (stream_init): Register get-csv intrinsic. * stream.h (get_csv): Declared.
* get-csv: use symbols for states.Kaz Kylheku2025-01-211-45/+44
| | | | | | * csv.tl (get-csv): Since there are only three states, there is no jump table optimization. We might as well use keyword symbols for the states rather than integers.
* get-csv: simplify implementation by CR-LF folding.Kaz Kylheku2025-01-211-30/+7
| | | | | | | | | * stdlib/csv.tl (get-csv): Pre-process the input by a small state machine that maps CR-LF sequences to LF. Then we don't have to recognize #\return anywhere in the state machine and can delete the cr and qcr states, as well as all the code recognizing #\return and branching to those states.
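The CR-LF folding pre-processor can be sketched as a two-state filter. The original was written in TXR Lisp over a stream; this version works on a C string, and its handling of a lone CR (passed through unchanged) is an assumption, not something the commit message specifies.

```c
#include <stddef.h>

/* Copy src to dst, mapping each CR-LF pair to a single LF.
   dst must have room for strlen(src) + 1 bytes.
   Returns the number of characters written, excluding the terminator. */
static size_t fold_crlf(const char *src, char *dst)
{
    size_t n = 0;
    int pending_cr = 0;   /* state: previous character was a CR */

    for (; *src != '\0'; src++) {
        char ch = *src;

        if (pending_cr) {
            pending_cr = 0;
            if (ch != '\n')
                dst[n++] = '\r';   /* lone CR passes through */
        }
        if (ch == '\r')
            pending_cr = 1;        /* hold back until we see what follows */
        else
            dst[n++] = ch;
    }
    if (pending_cr)
        dst[n++] = '\r';           /* CR at end of input */
    dst[n] = '\0';
    return n;
}
```

With line endings normalized up front, the parser proper only ever has to recognize LF as a record terminator.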
* New function: get-csv.Kaz Kylheku2025-01-214-0/+373
| | | | | | | | | | | | | * autoload.c (csv_set_entries, csv_instantiate): New static functions. (autoload_init): Register autoload of stdlib/csv module via new functions. * stdlib/csv.tl: New file. * tests/010/csv.tl: Likewise. * txr.1: Documented.
* New macros for enumerated constants.Kaz Kylheku2025-01-204-0/+174
| | | | | | | | | | | | | * autoload.c (enum_set_entries, enum_instantiate): New static functions. (autoload_init): Register autoload of stdlib/enum module via new functions. * stdlib/enum.tl: New file. * tests/016/enum.tl: Likewise. * txr.1: Documented.
* Makefile: tidy up clean targets.Kaz Kylheku2025-01-171-3/+12
| | | | | | | | | | | | | * Makefile (clean): Do not remove the run.sh file here. That is a temporary file that install-tests should be cleaning away when done, which we can remove in distclean. (clean-doc, clean-tests): New targets, which let us clean up generated documentation files and the test-generated state information in tst. (distclean): Do not remove the documentation stuff here; rely on clean-doc. Also depend on clean-tests. Do remove run.sh here. (install-tests): Add missing .PHONY. Remove run.sh.
* lflow/lopip: optimize one argument situations via lop1.Kaz Kylheku2025-01-171-21/+25
| | | | | | | | | | | | | | | | | | | | | | | In an opip pipeline, only the first pipeline element can receive multiple arguments. The subsequent elements receive the single return value from the previous element. Therefore if it is a left-inserting pipeline created by lopip, only the first element needs to use lop. The others can use lop1, resulting in an optimization. Furthermore in the flow/lflow macros, even the first function in the pipeline is called with one argument: the result of the input expression. So in the case of lflow, every element of the pipe that would translate to lop can go to lop1 instead. * stdlib/op.tl (sys:opip-expand): Calculate a local variable called opsym-rest which determines which op symbol we use for the recursive call. This is the same as the incoming opsym, except in the case when opsym is lop, in which case we substitute lop1. (sys:lopip1): New macro, like lopip but uses lop1 for the first element also. (lflow): Expand to sys:lopip1 rather than lopip.
* New macro: lop1.Kaz Kylheku2025-01-174-3/+66
| | | | | | | | | | | | | * autoload.c (op_set_entries): Autoload on lop1 symbol. * stdlib/op.tl (sys:op-expand): Add lop1 case. (sys:opip-expand): Add lop1 to the list of operators that are recognized and specially treated. (lop1): New macro. * tests/012/op.tl: New tests. * txr.1: Documented.
* json: flat must override all effects of :standardKaz Kylheku2025-01-163-14/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | * stream.h (struct json_opts): Member flat removed. I noticed that !jo.flat was always being tested together with jo.fmt == json_fmt_standard. Except for a few places where the code only tested for json_fmt_standard, resulting in flat output, but some extra spaces. What distinguishes flat mode now is simply that we disable stream indentation. * lib.c (out_json_rec): Remove tests for !jo.flat. (out_json): Remove initialization of jo.flat member. In this function we set up indentation on the stream resulting in multi-line mode (existing behavior). (put_json): Remove initialization of jo.flat member. If flat mode is requested, then it overrides the format to json_fmt_default. I.e. json_fmt_standard corresponding to :standard is only in effect if flat is not requested. In this function we set up indentation on the stream if flat mode isn't requested, otherwise we disable indentation (existing behavior, enough to make flat work). * tests/010/json.tl: Tests for flat mode, :standard formatting, and combination of both.
* put_json: bug: incorrect defaulting of flat argument.Kaz Kylheku2025-01-151-2/+3
| | | | | | | | | | This also affects put_jsonl and tojson. * lib.c (put_json): The flat argument must be properly defaulted. Without this we are treating it as true when it is missing due to the convention that missing args are signaled by the : symbol. This bug breaks the ability to use the :standard value for *print-json-format*.
* lop: don't insert args when metas present.Kaz Kylheku2025-01-083-26/+114
| | | | | | | | | | | | | | | | | | | | | The lop macro is inconsistent from op in that it inserts the trailing function arguments on the left even if arguments are explicitly given in the form via @1, @2, ... or @rest. This change makes lop equivalent to op in all situations when these metas are given. * stdlib/op.tl (compat-225, compat-298): New top-level variables. (op-expand): local variable compat replaced by references to compat-225. If compat-298 is *not* in effect, then metas are checked for first in the cond, preventing the lop transformation from taking place. * tests/012/op.tl: Test cases for lop, combinations of do with lop and a few for op also. * txr.1: Redocumented, added compat notes.
* Copyright year bump 2025.Kaz Kylheku2025-01-01141-143/+143
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, autoload.c, autoload.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c, gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/comp-opts.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/csort.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/expander-let.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/glob.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/load-args.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright bumped to 2025.
* match-case: bugfix in conversion to casequal.Kaz Kylheku2024-12-202-2/+3
| | | | | | | | | | | | | * stdlib/match.tl (match-case-to-casequal): the (do inc dfl-cnt) action has a problem: it inserts an implicit extra parameter to the invocation of inc, which crashes the + addition due to that parameter being the matching @nil object. We don't need this entire case because it handles @nil, which also matches the following case for (sys:var ...), since @nil is (sys:var nil). That case has the same action of incrementing dfl-cnt. * tests/011/patmatch.tl: Test case added.
* Version 298.txr-298Kaz Kylheku2024-12-174-5/+17
| | | | | | | | | | * RELNOTES: Updated. * configure (txr_ver): Bumped version. * stdlib/ver.tl (lib-version): Bumped. * txr.1: Bumped version and date.
* listener: regression: txr bails when .txr-profile missing.Kaz Kylheku2024-12-161-0/+3
| | | | | * parser.c (load_rcfile): When neither ~/.txr-profile nor ~/.txr_profile exist, then bail. Do not pass nil to abs-path-p and other functions.
* Version 297.txr-297Kaz Kylheku2024-12-166-735/+771
| | | | | | | | | | | | * RELNOTES: Updated. * configure (txr_ver): Bumped version. * stdlib/ver.tl (lib-version): Bumped. * txr.1: Bumped version and date. * txr.vim, tl.vim: Regenerated.
* bug: string range length signed/unsigned.Kaz Kylheku2024-12-161-1/+2
| | | | | | | | * lib.c (length_str_range): On platforms where wchar_t is unsigned, we calculate bogus values for reversed ranges. On Android, gcc warns about the code, and the recently added tests fail. Let's cast the characters to long before doing the subtraction, which is the argument type of labs.
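A simplified sketch of the signedness fix, reduced to single characters rather than whole strings. `uchar_t` is a hypothetical stand-in for an unsigned `wchar_t`, and `char_range_len` is not the lib.c function; it only shows why the operands are converted to `long` before subtracting.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a platform whose wchar_t is unsigned. */
typedef uint32_t uchar_t;

/* Number of characters in the inclusive range from..to, in either
   direction. Casting before the subtraction keeps the difference
   signed, so a reversed range does not wrap around. */
static long char_range_len(uchar_t from, uchar_t to)
{
    return labs((long) to - (long) from) + 1;
}
```

If the subtraction were done in the unsigned type, a reversed range such as z down to a would wrap to a large positive value before `labs` ever saw it. The `+ 1` reflects that the ranges are inclusive, so equal endpoints give a length of one.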
* string ranges: bug: ranges of length 1.Kaz Kylheku2024-12-153-2/+24
| | | | | | | | | * lib.c (seq_iter_init_with_info): String ranges are inclusive. We must not assume that a range whose endpoints are the same is empty; we must check that case for the endpoints being strings. * tests/012/seq.tl: New tests.
* tests for string range length.Kaz Kylheku2024-12-151-0/+12
| | | | * tests/012/seq.tl: New tests.
* lib: fix g++ warning in map_common.Kaz Kylheku2024-12-111-1/+1
| | | | | | * eval.c (map_common): use the all_zero_init macro, defined differently for C and C++ for initializing a struct to all zero.
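A sketch of a zero-initializer macro that expands differently under C and C++. `ZERO_INIT` and `struct map_state` are hypothetical names chosen for illustration; the actual definition of all_zero_init in the TXR sources may differ.

```c
/* Hypothetical macro in the spirit of all_zero_init: C++ accepts an
   empty brace list, while C before C23 requires at least one
   initializer inside the braces. */
#ifdef __cplusplus
#define ZERO_INIT { }
#else
#define ZERO_INIT { 0 }
#endif

/* Hypothetical structure standing in for the one in map_common. */
struct map_state {
    int count;
    void *items[4];
};

/* Every member of the returned structure is zero or a null pointer. */
static struct map_state new_map_state(void)
{
    struct map_state st = ZERO_INIT;
    return st;
}
```

Keeping the spelling behind one macro lets the same source compile cleanly as both C and C++ without sprinkling conditionals at each initialization site.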
* quips: jab at Weller.Kaz Kylheku2024-12-081-0/+1
| | | | * stdlib/quips.tl (%quips%): New one.
* switch to .txr-profile and .txr-historyKaz Kylheku2024-11-143-19/+74
| | | | | | | | | | | | | | | | | | | | The profile and history files should have used hyphens from the beginning. Let's switch to that but continue to work with the old files if present, as an obsolescent feature. * parser.c (open_txr_file): Treat files with .txr-profile suffix as Lisp. (load_rcfile): Arguments rearranged. This function now needs the home directory and the existence test function, but does not need the profile file name. It tries .txr-profile first, then .txr_profile. (repl): Call load_rcfile in the new way. Try two history files: first .txr-history and then .txr_history. Remember which one was used so the same one is saved. If neither file existed at startup, then save new history into .txr-history.
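The fallback logic described above can be sketched with two small helpers. The names are hypothetical, and the existence checks are passed in as flags so that the sketch stays independent of any file system API.

```c
#include <stddef.h>

/* For loading: prefer the new name, fall back to the legacy one.
   Returns NULL when neither file exists. */
static const char *pick_for_load(const char *preferred, int have_preferred,
                                 const char *legacy, int have_legacy)
{
    if (have_preferred)
        return preferred;
    if (have_legacy)
        return legacy;
    return NULL;
}

/* For saving: keep using whichever file was loaded at startup;
   if none was loaded, start using the preferred name. */
static const char *pick_for_save(const char *loaded, const char *preferred)
{
    return loaded != NULL ? loaded : preferred;
}
```

Remembering which file was loaded avoids silently splitting a user's history between the old and new file names.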
* ffi: bug: flexible structure size calculation.Kaz Kylheku2024-10-071-3/+2
| | | | | | | | | | | * ffi.c (make_ffi_type_struct): We must calculate the size of a flexible structure the way GCC does it. We cannot simply truncate it at the offset of the member. Rather, the size is calculated in the usual way. The alignment of the array is taken into account for the purpose of determining what is the most aligned member of the structure, and then padding is added, if required. Thus, the size may exceed the offset of that member.
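A small illustration of why the size of a flexible structure can exceed the offset of its flexible member. The structure `flex` and the helper names are hypothetical; the check compares a hand-computed size against what the compiler reports, on the assumption of a GCC-compatible layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure ending in a flexible array member. */
struct flex {
    uint64_t id;
    char tag;
    char data[];
};

/* Round n up to the next multiple of align. */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Size by the usual rule: the flexible member contributes no bytes,
   but the structure is still padded out to its overall alignment. */
static size_t flex_size_by_rule(void)
{
    return round_up(offsetof(struct flex, data), _Alignof(struct flex));
}
```

On a typical 64-bit ABI the member `data` starts at offset 9 while the structure size is 16, so truncating the size at the member's offset would under-report it by the trailing padding.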
* copy: now handles range objects.Kaz Kylheku2024-10-014-22/+20
| | | | | | | | | | | | | | | | Ranges are iterable, denoting abstract sequences. The copy function now copies a range by constructing the array. This is useful when copy is used for the purpose of obtaining a mutable copy. For example, (shuffle 0..100) will now work, returning a shuffled vector of the integers from 0 to 99. * lib.c (copy): Handle RNG case via vec_seq. * tests/012/seq.tl, * tests/012/sort.tl: New test cases. * txr.1: Documented. Documentation for the copy function improved.
* refset, replace: adjust diagnostic for unsupported object.Kaz Kylheku2024-09-301-2/+2
| | | | | | | * lib.c (refset, replace): Word the bad object diagnostic in terms of it not being a modifiable sequence. This covers cases when the object is something abstractly iterable, like a range. We don't want to say that it's not a sequence.
* regex: closure operations don't output epsilon states.Kaz Kylheku2024-09-151-6/+12
| | | | | | | | | | | | * regex.c (nfa_closure, nfa_move_closure): Do not add epsilon states to the output array. We only need to add them to the stack for the spanning traversal. Epsilon states are not real states; they are just a representation of the concept of transitioning to multiple states at the same time. When we add them to the output, they just end up being ignored when nfa_move_closure is called again on that set, since it only cares about states that have real transitions for a character.
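A minimal sketch of an epsilon closure that uses epsilon states only for traversal and leaves them out of the result. The state representation, `MAX_STATES` and the function name are hypothetical and far simpler than the NFA in regex.c.

```c
#include <stddef.h>

enum kind { EPS, CHR, ACCEPT };

/* Hypothetical NFA state: up to two successors, -1 when unused. */
struct state {
    enum kind kind;
    int ch;        /* character matched, CHR states only */
    int next[2];
};

#define MAX_STATES 16   /* the sketch assumes at most this many states */

/* Compute the epsilon closure of in[0..nin). Epsilon states go on the
   traversal stack but are not written to out. Returns the output count. */
static size_t closure(const struct state *nfa, const int *in, size_t nin,
                      int *out)
{
    int stack[MAX_STATES];
    unsigned char visited[MAX_STATES] = { 0 };
    size_t sp = 0, nout = 0, i;

    for (i = 0; i < nin; i++) {
        if (!visited[in[i]]) {
            visited[in[i]] = 1;
            stack[sp++] = in[i];
        }
    }

    while (sp > 0) {
        int s = stack[--sp];

        if (nfa[s].kind == EPS) {
            int k;
            /* Follow epsilon edges; do not record the state itself. */
            for (k = 0; k < 2; k++) {
                int t = nfa[s].next[k];
                if (t >= 0 && !visited[t]) {
                    visited[t] = 1;
                    stack[sp++] = t;
                }
            }
        } else {
            out[nout++] = s;   /* real state: part of the result set */
        }
    }
    return nout;
}
```

Each state is pushed at most once, so the stack never exceeds the number of states, and the resulting set contains only states that a later move step could actually act on.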
* regex: bugfix: not taking full advantage of REGM_MATCH_DONE.Kaz Kylheku2024-09-151-2/+2
| | | | | | | | | | | | * regex.c (nfa_has_transitions): The logical disjunction here is wrong. We would like to test whether a state has transitions if it is not an epsilon state. The code which uses this macro doesn't care about epsilon states, even though they have transitions; those are squeezed out by transitive closure. The wrong condition here makes the code think that an NFA set has transitions when it does not, preventing the result code REGM_MATCH_DONE from being produced, which can spare a character from being consumed from a stream.
* regex: misleading comment.Kaz Kylheku2024-09-151-1/+2
| | | | | | * regex.c (nfa_move_closure): Fix misleading wording in comment. The state which matches the character (has a transition for it) is not the one added to the move set.
* read-until-match: fix regression.Kaz Kylheku2024-09-143-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | Commit 9aa751c8a4f845ef2d2bba091c81ffeded941afd broke things. This fix affects the functions read-until-match, scan-until-match and count-until-match, which share an implementation. * regex.c (scan_until_common): In the REGM_MATCH_DONE and REGM_MATCH cases, we must push the character onto the local stack, before doing the match = stack assignment. Otherwise, it's possible that the stack is empty and so no match is recorded. The REGM_FAIL case will then behave as if no match was found, consuming a character and continuing. * txr.1: Codify an existing behavior: only non-empty matches for the regex are considered by read-until-match. * tests/015/regex.tl: New file. I am amazed to discover that we don't seem to have a test suite for regexes at all. Putting the tests here which confirm this fix and provide coverage for some edge cases in read-until-match.