txr - TXR: A data munging language.

	Commit message (Collapse)	Author	Age	Files	Lines
*	del/replace with index-list: fix semantics.	Kaz Kylheku	2023-07-18	1	-11/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit does two things. The replace function, implemented under the hood by four specializations: replace-list, replace-vec, replace-str and replace-buf, will handle the index-list case a little differently. This is needed to fix the ability of the del macro work on place designated by an index list, such as: (del [sequence '(1 3 5 6)] which now deletes elements 1, 3, 5 and 6 from the sequence, and returns a sequence of those items. The underlying implementation uses replace with an index-list, which is now capable of deleting items. Previously, replace would stop processing the index list when the replacement-sequence corresponding to the index list ran out of items. Now, when the replacement-sequence runs out of items, the remaining index-list sequence elements specify items to be deleted. For instance if str holds "abcdefg" then: (set [str '(1 3 5)] "xy") will change str to "axcyeg". Elements 1 and 3 are replaced by x and y, respectively. Element 5, the letter f, is deleted, because the replacement "xy" has no element corresponding to 5. * lib.c (replace_list, replace_str, replace_vec): Implement new deleteion semantics for the case when the replacement sequence runs out of items. * buf.c (replace_buf): Likewise. * tests/010/seq.txr: Some new test cases here for deletion. * tests/010/seq.expected: Updated. * txr.1: Documented new semantics of replace, including a new restriction that if elements are being deleted, the indices should be monotonically increasing regardless of the type of the sequence (not only list). A value of 289 for the -C option documented, which restores the previous behavior of replace (breaking deletion by index-list, unfortunately: you don't always get to simulate an old version of TXR while using new features.)
*	printer: print (sys:vector-list ()) as #() not #nil.	Kaz Kylheku	2023-07-17	1	-1/+4
\| \| \| \| \|	* lib.c (obj_print_impl): Check for this case and handle. The #nil syntax is not readable.
*	unique: use sequence iteration	Kaz Kylheku	2023-07-10	1	-20/+8
\| \| \| \| \|	* lib.c (unique): Use seq_iter_t rather than dividing into list and vector cases.
*	Bug: ranges not treated as iterable in some situations.	Kaz Kylheku	2023-06-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	* lib.c (seq_kind_tab): Map the RNG type to SEQ_VECLIKE rather than SEQ_NOTSEQ. This one liner change causes ranges to be rejected for iteration needlessly. For instance, we can now do (take 42 "00".."99"). Note that ranges are not literally vector-like; they don't support the ref operation. The effect of this change is simply that they are not rejected due to being SEQ_NOTSEQ. The iteration code already handles RNG specially, so the SEQ_VECLIKE classification doesn't actually matter.
*	Callable integers become assignable places.	Kaz Kylheku	2023-06-30	1	-30/+66
\| \| \| \| \| \| \| \| \|	* lib.c (dwim_set): Handle seq argument being an integer or range. * tests/012/callable.tl: A few tests. * txr.1: Documented.
*	New: callable integers and ranges.	Kaz Kylheku	2023-06-28	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* lib.c (do_generic_funcall): Allow integers and ranges to be function callable. They take one argument and index into it or extract a slice. In the case of ranges, this is a breaking change. Ranges can already be used in the function position in some limited ways that are not worth preserving. * tests/012/callable.tl: New file. * tests/012/iter.tl: Here we fix two instances of breakage. Using txr -C 288 will restore the behaviors previously tested here. * txr.1: Documented.
*	equal: bug: broken equality substitution.	Kaz Kylheku	2023-06-28	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	* lib.c (equal): Several cases which react to the type of the left argument have a default path which wrongly short-circuits to an early return. All these cases must break through to the logic at the end of the function which tests the right side for a possible equality substitution. * tests/012/struct.tl: One breaking test cases added. equal was found to return nil for two structures that have equal lists as their equality substitute.
*	ssort: gc bug in vector case.	Kaz Kylheku	2023-06-28	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	* gc.[ch] (gc_prot_array_alloc): Return the COBJ via new pointer argument. * lib.c (ssort_vec): Capture the object from gt_prot_array_alloc into a local variable, That makes it visible to the garbage collector, so it won't be prematurely reclaimed. Since we don't use or return that object, we need to use gc_hint.
*	New @(push) directive.	Kaz Kylheku	2023-06-12	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	@(push) is like @(output), but feeds back into input. Use carefully. * parser.y (PUSH): New token. (output_push): New nonterminal symbol. (output_clause): Handle OUTPUT or PUSH via output_push. Some logic moved to output_helper. (output_helper): New function. Transforms both @(output) and @(push) directives. Checks both for valid keywords; push has only :filter. * parser.l (grammar): Recognize @(push similarly to other directives. * lib.[ch] (push_s): New symbol variable. * match.c (v_output_keys): Internal linkage changes to external. (v_push): New function. (v_parallel): We must fix the max_line algorithm not to use an initial value of zero, because lines can go negative thanks to @(push). We end up rejecting the pushed data. (v_collect): We can no longer assert that the data line number doesn't retreat. (dir_tables_init): Register push directive in table of vertical directives. * match.h (append_k, continue_k, finish_k): Existing symbol variables declared. (v_output_keys): Declared. * y.tab.c.shipped, * y.tab.h.shipped, * lex.yy.c.shipped: Updated. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	lib: string bug spotted on Solaris.	Kaz Kylheku	2023-06-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	* lib.c (string_finish): On platforms where we do not HAVE_MALLOC_USABLE_SIZE, there is a compiler diagnostic. The code is inappropriately using the set macro and mkloc to assign the st->st.alloc member, which is just a number (cnum type), and not a val. The set macro is only needed when mutating a gc heap object to point to another gc heap object.
*	New functions keep-keys-if, separate-keys.	Kaz Kylheku	2023-06-07	1	-1/+135
\| \| \| \| \| \| \| \| \| \| \|	* lib.[ch] (keep_keys_if, separate_keys): New functions. * eval.c (eval_init): keep-keys-if, separate-keys intrinsics registered. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	lib: fix issue uncovered by recent vm CALL insn change.	Kaz Kylheku	2023-05-24	1	-5/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The functions funcall1 through funcall4, when invoking a VM function, are not defending against the case when there are more arguments than the function can take. As a result, some :mass-delegate tests in tests/012/oop.tl are failing. They expect an :error result, but the calls are succeeding in spite of passing too many parameters via the delegate interface. The tests/012/lambda.tl suite should catch this, but it has unfortunate weaknesses. * lib.c (funcall1, funcall2, funcall3, funcall4): When dispatching the general VM case via vm_execute_closure, check that if the closure has fewer fixed parameters than arguments we are passing, it must be variadic, or else there is an error. * tests/012/lambda.tl (call-lambda-fixed): New function. Unlike call-lambda, which uses the apply dot syntax, this switches on the argument list shape and dispatches direct calls. These compile to the CALL instruction cases with four arguments or less which will exercise funcall, funcall1, ... funcall4. Also, adding some missing test cases that probe behavior with excess arguments.
*	sort: optimizations eliding keyfun and access.	Kaz Kylheku	2023-05-03	1	-15/+45
\| \| \| \| \| \| \| \|	* lib.c (quicksort): Avoid calls to keyfun when it's known to be identity, (mergesort): Likewise. Also, avoid redundant accesses to the vector when merging, for that index which has not moved between iterations.
*	gc: use single allocation for prot_array.	Kaz Kylheku	2023-05-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* gc.c (prot_array): Add self pointer; arr member becomes flexible array. (prot_array_mark): We now check the handle itself for null, because the whole thing is freed. (prot_array_free): Function removed. (prot_array_ops): Wire cobj_destroy_free_op in place of prot_array_free. This fixes a memory leak because prot_array_free was not freeing the handle, only the array. (gc_prot_array_alloc): Fix to allocate everything in one swoop and store the self-pointer in the named member rather than arr[-1]. The self argument is not required; we drop it. The size argument cannot be anywhere near INT_PTR_MAX, because such an array wouldn't fit into virtual memory, so it is always safe to add a small value to the size. (prot_array_free): Obtain the self-pointer, and free the handle, replacing it with a null pointer. * gc.h (gc_prot_array_alloc): Declaration updated. * lib.c (ssort_vec): Don't pass self to gc_prot_array_alloc. * lib.h (container): New macro.
*	sort: support stable sorting via ssort and snsort.	Kaz Kylheku	2023-05-02	1	-0/+105
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For array-like objecgts, these objects use an array-based merge sort, using an auxiliary array equal in size to the original array. To provide the auxiliary array, a new kind of very simple vector-like object is introduced into the gc module: protected array. This looks like a raw dynamic C array of val type, returned as a val . Under the hood, there is a heap object there, which makes the array traversable by the garbage collector. The whole point of this exercise is to make the new mergesort function safe even if the caller-supplied functions misbehave in such a way that the auxiliary array holds the only references to heap objects. gc.c (struct prot_array): New struct, (prot_array_cls): New static variable. (gc_late_init): Register COBJ class, retaining in prot_array_cls. (prot_array_mark, prot_array_free): New static functions. (prot_array_ops): New static structure. (prot_array_alloc, prot_array_free): New functions. * gc.h (prot_array_alloc, prot_array_free): Declared. * lib.c (mergesort, ssort_vec): New static function. (snsort, ssort): New functions. * lib.h (snsort, ssort): Declared. * tests/010/sort.tl: Cover ssort. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	sort: correct name in error reporting.	Kaz Kylheku	2023-05-02	1	-4/+3
\| \| \| \| \| \|	* lib.c (sort_vec): Take self argument instead of assuming that we are sort; this can be called by nsort. (nsort, sort): Pass self to sort_vec.
*	sort: replace Lomuto partitioning with Hoare	Kaz Kylheku	2023-05-01	1	-30/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I'm seeing numbers aobut the same performance on a sorted vector of integers, and 21% faster on vector of N random integers in the range [0, N). Also, this original algorithm handles well the case of an array consisting of a repeated value. The code we are replacing degrates to quadratic time. * lib.c (med_of_three, middle_pivot): We don't use the return value, so don't calculate and return one. (quicksort): Revise to Hoare: scanning from both ends of the array, exchanging elements. * tests/010/sort.tl: New file. We test sort with lists and vectors from length zero to eight, all permutations.
*	match: ^#S() and ^#H(()) patterns must work	Kaz Kylheku	2023-04-29	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Quasiquote patterns not containing unquotes are not working, because the parser transforms them into quoted objects. For instance ^#S(time) becomes the form (quote #S(time)) and not the form (sys:qquote (sys:struct-lit time)). The pattern matching compiler doesn't treat quote specially, only sys:qquote. * parser.y (unquotes_occur): Function removed. (vector, hash, struct, tree, json_vals, json_pairs): Remove use of unquotes_occur. Thus vector, hash, struct, tree and JSON syntax occurring within a backquote will be turned into a special literal whether or not it contains unquotes. * lib.c (obj_print_impl): Do not print the form (sys:hash-lit) as #Hnil, but #H(). * stdlib/match.tl (transform-qquote): Add a case which will handle ^#H(), as if it were ^H(()). Bugfix in the ^H(() ...) case. The use of @(coll) means it fails to match the empty syntax when no key/value pairs are specified, whereas @(all) respects vacuous truth. * test/011/patmatch.tl: A few tests. * y.tab.shipped, y.tab.h.shipped: Updated.
*	printer: print tree as #T(...) beyond max depth.	Kaz Kylheku	2023-03-23	1	-0/+3
\| \| \| \| \| \| \|	* lib.c (obj_print_impl): For consistenfcy with other aggregates---lists, vectors and hashes---when the maximum depth has been exceeded we should likewise print binary search tree objects as #T(...).
*	printer: [] shouldn't print as [. nil].	Kaz Kylheku	2023-03-23	1	-2/+4
\| \| \| \| \| \| \| \| \| \|	* lib.c (obj_print_impl): In the case when dwim has no args, and the logic short circuits to a closing brace, bypassing the loop, we should only use the dot notation if the terminating atom is other than nil. * tests/012/readprint.tl: Tests added.
*	ignerr: fix unused warning	Kaz Kylheku	2023-03-21	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ignerr intrinsic macro generates code that has an unused variable. We fix it by turning it into a gensym, since unused warnings aren't generated for gensyms. * eval.c (unused_arg_s): New static variable. (me_ignerr): Use the value of unused_arg_s instead of error_s, for the argument of the catch clause. (eval_init): gc-protect unused_arg_s. (eval_late_init): New function in which we initialized unused_arg_s. The gensym function cannot be used during eval_init. * eval.h (eval_late_init): Declared. * lib.c (init): Call eval_late_init after some other late initializations.
*	Copyright year bump 2023.	Kaz Kylheku	2023-01-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, autoload.c, autoload.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c, gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2023.
*	cat-str/join/join-with: allow nested sequences	Kaz Kylheku	2022-10-25	1	-62/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The measure/allocate/catenate functions which underlie the cat-str implementation are streamlined, simplifying the code. At the same time, they handle nested sequences of string/character items. * lib.c (struct cat_str): New member, seen_one. This flips from 0 to 1 after the first item has been seen in the cat_str_measure pass or cat_str_append pass. Each item other than the first is preceded by a separator. (cat_str_measure, cat_str_append): The more_p argument is dropped. We account for the separator with the help of the new seen_one flag, which allows us to easily recurse over items that are sequences. (cat_str_alloc): Reset the seen_one flag in preparation for the cat_str_append pass. (cat_str, vscat, scat2, scat3, join_with): Simplified. * tests/015/split.tl: New tests. * txr.1: Redocumented.
*	args: don't use alloca for const size cases.	Kaz Kylheku	2022-10-15	1	-16/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* args.h (args_decl_list): This macro now handles only constant values of N. It declares an anonyous container struct type which juxtaposes the struc args header with exactly N values. This is simply defined as a local variable without alloca. (args_decl_constsize): Like args_decl, but requiring a constant N; implemented via args_decl_list. (args_decl_list_dyn): New name for the old args_decl_list which calls alloca. No places in the code depend on this at all, except the definition of args_decl. (args_decl): Retargeted to args_decl_list_dyn. There is some inconsistency in the macro naming in that args_decl_constsize depends on args_decl_list, and args_decl depends on arg_decl_list_dyn. This was done to minimize diffs. Most direct uses of args_decl_list have a constant size, but a large number of args_decl uses do not have a constant size. * eval.c (op_catch): Use args_decl_constsize. * ffi.c (ffi_struct_in, ffi_struct_get, union_out): Likewise. * ftw.c (ftw_callback): Likewise. * lib.c (funcall, funcall1, funcall2, funcall3, funcall4, uniq, relate): Likewise. * socket.c (sockaddr_in_unpack, sockaddr_in6_unpack, sockaddr_un_unpack): Likewise. * stream.c (formatv): Likewise. * struct.c (struct_from_plist, struct_from_args, make_struct_lit): Likewise. * sysif.c (termios_unpack): Likewise. * time.c (broken_time_struct): Likewise.
*	funcall: consolidate VM fun handling.	Kaz Kylheku	2022-10-14	1	-34/+80
\| \| \| \| \| \| \|	* lib.c (funcall, funcall1, funcall2, funcall3, funcall4): Handle FVM case separately, regardless of the f.variadic flag. If the VM function is variadic, the call can still use the special cases.
*	funcall: handle optargs in funcall helpers.	Kaz Kylheku	2022-10-14	1	-8/+85
\| \| \| \| \| \|	* lib.c (funcall, funcall1, funcall2, funcall3, funcall4): Handle some situations when the function is a built-in with optional args.
*	funcall: don't route to generic_fun on optargs.	Kaz Kylheku	2022-10-14	1	-5/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* lib.c (funcall, funcall1, funcall2, funcall3, funcall4): Do not go through the generic_funcall slow path just because the target function has optional arguments. It's possible that the call is supplying all of the required arguments. Let's try it like that and then if it doesn't work and there are optionals, check again and go the generic_funcall route. This might not be an overall improvement by itself, if we end up going to generic_funcall in more cases than not. However, this change paves the way for more changes: handling some cases of optargs in these helpers.
*	json: support standard-style formatting.	Kaz Kylheku	2022-10-11	1	-36/+91
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stream.c (standard_k, print_json_format_s): New symbol variables. (stream_init): New variables initialized. * stream.h (enum json_fmt): New enum. (standard_k, print_json_format_s): Declared. * lib.c (out_json_rec): Take enum json_fmt param, and pass it recursively. Printing for vector and dictionaries reacts to argument value. (out_json, put_json): Examine value of special var print-json-format and calculate enum json_fmt value from this. Pass to out_json_rec. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	put-json: restore indent on unwinding.	Kaz Kylheku	2022-10-11	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following situation is observed in the listener. 1> (put-json #H(() (a 1))) { print: invalid object a in JSON during evaluation at expr-1:1 of form (put-json #H(() (a 1))) 1> 1> An indent established in the aborted JSON print job has been left in the stream. * lib.c (put_json): Save the indent value also, not only the mode. Restore the indent mode and value on unwinding, not just on a normal exit from out_json_rec, similiarly to what obj_print does.
*	strings: revert caching of hash value.	Kaz Kylheku	2022-10-08	1	-24/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Research indicates that this is something useful in languages that abuse strings for implementing symbols. We have interned symbols. * lib.h (struct string): Remove hash member. * lib.c (string_own, string, string_utf8, mkustring, string_extend, replace_str, chr_str_set): Remove all initializations and updates of the removed hash member. * hash.c (equal_hash): Do not cache string hash value.
*	strings: take advantage of malloc_usable_size	Kaz Kylheku	2022-10-06	1	-13/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On platforms which have the malloc_usable_size function, we don't have to store the allocated size of an object; malloc provides us the allocated size (which may be larger than we requested). Here we take advantage of this for strings. And since we don't have to store the string allocated size any more, we use that field for something else: storing the hash code (for seed zero). This can speed up some hashing operations. * configure (have_malloc_usable_size): New variable. Configure test for have_malloc_usable size. We have to try several header files, too. We set the configure variable HAVE_MALLOC_USABLE_SIZE, and possibly HAVE_MALLOC_H or HAVE_MALLOC_NP_H. * lib.h (struct string): If HAVE_MALLOC_USABLE_SIZE is true, we define a member called hash insetad of alloc. Also, we change alloc to cnum. * lib.c: Include <malloc_np.h> if HAVE_MALLOC_NP_H is defined. (string_own, string, string_utf8, mkstring, mkustring, init_str, string_extend, string_finish, string_set_code, string_get_code, length_str, replace_str, chr_str_set): Fix code for both cases. On platforms with malloc_usable_size, we have the allocated size from malloc, so we don't have to retrieve it from the object or store it. Any operations which mutate the string must reset the hash field to zero; zero means "hash has not been calculated". * hash.c (equal_hash): Just retrive a string's hash value, if it is nonzero, otherwise calculate, cache it and return it. * gc.c (mark_obj): The alloc member of struct string is a machine integer now; no need to mark it.
*	seq-iter: bugfix: floating-point ranges.	Kaz Kylheku	2022-09-15	1	-24/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* lib.c (seq_iter_get_range_bignum): Static function renamed to seq_iter_get_range_number because it in fact generalizes to numbers. (seq_iter_peek_range_bignum): Renamed to seq_iter_peek_range_number. (seq_iter_get_rev_range_bignum): Renamed to seq_iter_get_rev_range_number. (seq_iter_peek_rev_range_bignum): Renamed to seq_iter_peek_rev_range_number. (si_range_bignum_ops): Renamed to si_range_number_ops. (si_rev_range_bignum_ops): Renamed to si_rev_range_number_ops. (seq_iter_init_with_info): Handle ranges where the from value is floating-point. Also, if the from-value is bignum that fits into cnum range, we now try to handle that as a cnum range. * tests/012/iter.tl: New tests.
*	Implement NaN boxing.	Kaz Kylheku	2022-09-13	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On platforms with 64 bit pointers, and therefore 64-bit-wide TXR values, we can use a representation technique which allows double floating-point values to be unboxed. Fixnum integers are reduced from 62 bits to 50, and there is a little more complexity in the run-time type checking and dispatch which costs extra cycles. The support is currently off by default; it must be explicitly enabled with ./configure --nan-boxing. * lib.h (NUM_MAX, NUM_MIN, NUM_BIT): Define separately for NaN boxing. (TAG_FLNUM, TAG_WIDTH, NAN_TAG_BIT, NAN_TAG_MASK, TAG_BIGMASK, TAG_BIGSHIFT, NAN_FLNUM_DELTA): New preprocessor symbols. (enum type, type_t): The FLNUM enumeration constant moves to just after LIT, so that its value is the same as TAG_FLNUM. (struct flonum): Does not exist under NaN boxing. (union obj): No fl member under NaN boxing. (tag, is_ptr): Separately defined for NaN boxing. (is_flo): New function under NaN boxing. (tag_ex): New function. It's like tag, but identifies floating-point values as TAG_FLNUM. The tag function continues to map them to TAG_PTR, which is wrong under NaN boxing, but needed in order not to separately write tons of cases in the arith.c module. (type): Use tag_ex, so TAG_FLNUM is handled, if it exists. (auto_str, static_str, litptr, num_fast, chr, c_n, c_u): Different definition for NaN boxing. (c_ch, c_f): New function. (throw_mismatch): Attribute with NORETURN. (nao): Separate definition for NaN boxing. * lib.c (seq_kind_tab): Reorder initializer to follow enum reordering. (seq_iter_rewind): use c_n and c_ch functions, since type checking has been done in those cases. The self parameter is no longer needed. (iter_more): use c_ch on CHR object. (equal): Use c_f accessor to get double value rather than assuming there is a struct flonum representation. (stringp): Use tag_ex, otherwise a floating-point number is identified as TAG_PTR. (diff, isec, isecp): Don't pass removed self parameter to seq_iter_rewind. * arith.c (c_unum, c_dbl_num, c_dbl_unum, plus, minus, signum, gt, lt, ge, le, numeq, logand, logior, logxor, logxor_old, bit, bitset, tofloat, toint, width, c_num, c_fixnum): Extract floating-point value using c_f accessor. Handle CHR type separately from NUM because the storage representation is no longer identical; CHR values have a two bit tag over bits where NUM has ordinary value bits. NUM is tagged at the NaN level with the upper 14 bits being 0xFFFC. The remaining 50 bits are the value. (flo): Construct unboxed float under NaN boxing by taking image of double as a 64 bit value, and adding the delta offset, then casting to the val pointer type. (c_flo): Separate implementation for NaN boxing. (integerp, numberp): Use tag_ex. * buf.c (str_buf, buf_int): Separate CHR and NUM cases, like in numerous arith.c functions. * chksum.c (sha256_hash, md5_hash): Use c_ch accessor for CHR value. * hash.c (equal_hash, eql_hash): Handle CHR separately. Use c_f accessor for floating-point value. (eq_hash): Use tag_ex and handle TAG_FLNUM value under NaN boxing. Handle CHR separately from NUM. * ffi.c (ffi_float_put, ffi_double_put, carray_uint, carray_int): Handle CHR and NUM separately. * stream.c (formatv): Use c_f accessor. * configure: disable automatic selection of NaN boxing on 64 bit platforms, for now. Add test whether -Wno-strict-aliasing is supported by the compiler, performed only if NaN boxing is enabled. We need to disable this warning because it goes off on the code that reinterprets an integer as a double and vice versa.
*	Reduce proliferation of TAG_SHIFT.	Kaz Kylheku	2022-09-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	* arith.c (num_to_buffer, c_unum, c_dbl_num, c_dbl_unum, c_num, c_fixnum): Use c_n inline function instead of open coding exactly the same thing. * lib.c (c_chr): Likewise. * struct.c (make_struct_type, lookup_slot, lookup_static_slot_desc, static_slot_p): Likewise.
*	syntax: read and print [. x] and [. @x].	Kaz Kylheku	2022-09-08	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* lib.c (obj_print_impl): Handle (dwim . atom) syntax by printing [. atom]. Note that (dwim . @var) and (dwim . @(expr)) already print as [. @var] and [. @(expr)]; this is not new. But none of these forms are supported by reading without the accompanying change to the parser. * parser.y (dwim): Handle the [. expr] and [ . expr] syntax, so that forms like [. a] and [. @a] have print-read consistency. The motivation is to be able to [. @args] in pattern matching to match a DWIM forms; I tried that and was surprised to have it blow up in my face. * tests/012/readprint.tl: New test file. Future printer/parser changes will be tested here. Historically, changes to the syntax have not been consistently unit-tested. * y.tab.c.shipped: Regenerated.
*	New macro: close-lazy-streams.	Kaz Kylheku	2022-08-28	1	-1/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* lib.c (lazy_stream_s): New symbol variable. (lazy_streams_binding): New static variable. (lazy_stream_register): New static function (lazy_stream_cons): If the stream is associated with a lazy cons, register it with lazy_stream_register. (obj_init): gc-protect lazy_streams_binding variable. Intern the sys:lazy-streams symbol. * lib.h (lazy_streams_s): Declared. * eval.c (eval_init): Register sys:lazy-streams special variable. * stdlib/getput.tl (close-lazy-streams): New macro. * autoload.c (getput_set_entries): Trigger autload on close-lazy-streams symbol. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	New function: search-all	Kaz Kylheku	2022-08-17	1	-9/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): search-all intrinsic registered. * lib.c (search_common): New Boolean argument all, indicating whether all positions are to be returned. We must handle this in the two places where empty key and sequence are handled, and also in the main loop. A trick is used: the found variable is now bound by list_collect_decl, but not used for collecting unless all is true. (search, rsearch, contains): Pass 0 for all argument of search_common. (search_all): New function. * lib.h (search_all): Declared. * tests/012/seq.tl: New tests. * txr.1: Documented. * stdlib/doc-syms.tl: Regenerated.
*	stringp: rewrite.	Kaz Kylheku	2022-07-28	1	-5/+13
\| \| \| \| \| \|	* lib.c (stringp): Examine tag and then type separately, rather than using the canned type function. This leads to slightly nicer code, shorter by a couple of instructions.
*	New function: count.	Kaz Kylheku	2022-07-18	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The general count function, with keyfun and testfun, is noticeably absent. Let's implement it. * lib.[ch] (count): New function. * eval.c (eval_init): Register count intrinsic. * tests/012/seq.tl: Some tests for count. * txr.1: Add count to count-if section. Revise documentation based on pos/pos-if. * stdlib/doc-syms.tl: Updated.
*	New function: str	Kaz Kylheku	2022-06-12	1	-0/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The str function is like mkstring but allows a fill pattern to be specified. * eval.c (eval_init): str intrinsic registered. * lib.[ch[ (str): New function. * tests/015/str.tl: New file. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	New: spln and tokn functions.	Kaz Kylheku	2022-05-30	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of trying to work the new count parameter into the spl and tok functions, it's better to make new ones. * eval.c (eval_init): spln and tokn intrinsics registered. * lib.[ch] (spln, tokn): New functions. * tests/015/split.tl: New test cases. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	tok-str: takes count argument.	Kaz Kylheku	2022-05-28	1	-4/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Update registration of tok-str. * lib.c (tok_str): New argument, count_opt. Implemented in the compat 155 case; what the heck. (tok): Pass nil to new parameter of tok_str. * lib.h (tok_str): Declaration updated. * tests/015/split.tl: New tests. * txr.1: Documented.
*	split-str: new count parameter.	Kaz Kylheku	2022-05-17	1	-7/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Fix up registration of split-str to account for new parameter. * lib.c (split_str_keep): Implement new optional count argument. (spl): Pass nil value to split_str_keep for new argument. I'd like this function to benefit from this argument also, but the design isn't settled. (split_str): Pass nil argument to split_str_keep. * lib.h (split_str_keep): Declaration updated. * tests/015/split.tl: New tests. * txr.1: Documented.
*	Print ([...] . @var) and ([...] . @(expr)) notation.	Kaz Kylheku	2022-05-11	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \|	This change fixes objects like (@a @b . @c) being printed as (@a @b sys:var c). This is piggybacked into the logic which renders dotted unquotes. In other words, we are already printing (x . ,y) in that from rather than (x sys:unquote y); we just recognize sys:var, and sys:expr in the same code. * lib.c (obj_print_impl): Recognize dotted metavariables and metaexpressions similarly to dotted unquotes.
*	New function: isecp.	Kaz Kylheku	2022-03-30	1	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Register isecp intrinsic. * lib.c (isecp): New function. * lib.h (isecp): Declared. * stdlib/compiler.tl (lambda-apply-transform, dump-compiled-objects): Use isecp instead of isec, since the actual intersection of symbols isn't needed, only whether it exists. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	New function: partition-if.	Kaz Kylheku	2022-02-23	1	-0/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Register partition-if intrinsic. * lib.c (partition_if_countdown_funv, partition_if_func): New functions. (partition_if): New function. * lib.h (partition_if): Declared. * tests/012/seq.tl: New test cases. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	find-max: convert to seq_info iteration.	Kaz Kylheku	2022-02-22	1	-64/+17
\| \| \| \| \| \| \| \| \| \| \| \|	* lib.c (find_max): Simplify into a single loop rather than handling various sequence types specially. This means it works for all iterable objects now. * txr.1: find-max documentation updated; discussion of hash tables removed, since the described behavior is the one expected for hash tables as iterables. * tests/012/seq.tl: Add some test coverage.
*	New functions: find-max-key and find-min-key.	Kaz Kylheku	2022-02-21	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Register find-max-key and find-min-key intrinsics. * lib.c (find_max_key, find_min_key): New functions. * lib.h (find_max_key, find_min_key): Declared. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	ffi: move socket stuff to socket module.	Kaz Kylheku	2022-02-17	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* alloca.h (zalloca): Macro moved here from ffi.c; it's useful to any code that wants to do a zero-filled alloca, and socket.c needs it now. * ffi.c (HAVE_SOCKETS): All includes conditional on HAVE_SOCKETS removed. (zalloca): Macro removed; moved to alloca.h. (ffi_type_struct_checked, ffi_type_lookup): Static functions changed to external linkage. (ffi_type_size, ffi_type_put, ffi_type_get): New functions, used by external module that has incomplete definition of struct txr_ffi_type. (type_by_size): New static array, moved out of ffi_init_extra_types function. (ffi_type_by_size): New function. (ffi_init_extra_types): type_by_size array relocated to file scope. (sock_opt, sock_set_opt): Moved to socket.c, and adjusted to use newly developed external access to needed ffi mechanisms. (ffi_init): Numerous definitions related to sockets removed; these go to socket.c. * ffi.h (struct txr_ffi_type): Declared here now as incomplete type. (ffi_type_struct_checked, ffi_type_size, ffi_type_put, ffi_type_get, ffi_type_lookup, ffi_type_by_size): Declared. * lib.c (init): Call new function sock_init. * socket.c (sock_opt, sock_set_opt): New functions, moved from ffi.c, and slightly adapted to work with external interfaces exposed by ffi.c. (sock_init): New function. This performs unconditional initializations not keyed to the lazy loading lisplib.c mechanism. Here we create the socklen-t FFI type. FFI types lookup doesn't trigger lazy loading, so we do it this way; the alternative would be to introduce lazy load triggering to FFI type lookup, just for this one type. (sock_load_init): All the socket function and variable registrations move here from ffi_init.
*	all: wrong self name.	Kaz Kylheku	2022-02-13	1	-1/+1
\| \| \| \|	* lib.c (all_satisfy): self should be "all" rather than "some".