Commit message  (Author, Age; Files, Lines)
* parser: remove some wasteful string object allocations.  (Kaz Kylheku, 2024-07-22; 4 files, -211/+236)
    * lib.c (int_str_wc): New function, made out of int_str. This can be used by the parser to work with a wchar_t * string without having to create a string object. (int_str): Implemented in terms of int_str_wc.
    * parser.l (grammar): Remove string_own calls from numerous rule bodies that use int_str to return a number. These rules now capture the wchar_t string, pass it to int_str_wc and then immediately free it, whereas string_own allocates an extra object and leaves it to the garbage collector.
    * lex.yy.c.shipped: Regenerated.
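The capture, convert, free pattern described for the parser rules can be sketched in plain C. This is only an illustration under stated assumptions: dup_token and token_to_long are hypothetical names, and the standard wcstol stands in for int_str_wc, whose real signature is not shown in the commit message.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical stand-in for a lexer capturing a token as a heap-allocated
 * wide string. Returns NULL if allocation fails. */
static wchar_t *dup_token(const wchar_t *text)
{
    size_t n = wcslen(text) + 1;
    wchar_t *copy = malloc(n * sizeof *copy);
    if (copy != NULL)
        wmemcpy(copy, text, n);
    return copy;
}

/* Convert the captured token to a number and release it at once, so that
 * no wrapper object is left behind for a collector to reclaim later.
 * wcstol is used here only as a stand-in for the real conversion. */
static long token_to_long(wchar_t *token, int base)
{
    long value;
    if (token == NULL)
        return 0;
    value = wcstol(token, NULL, base);
    free(token);
    return value;
}
```

The point of the design is that the temporary string's lifetime ends inside the rule body instead of being handed to the garbage collector.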
* vim: #: and : notation in symbols not handled right.  (Kaz Kylheku, 2024-07-21; 1 file, -1/+1)
    * genvim.txr (tl_ident): Move the pattern which matches the # and #: symbols above the more general one, because Vim's syntax match definitions work such that the last definition which matches wins, even if it's not the longest match. Not regenerating the txr.vim and tl.vim files; that will be done at release time.
* doc: fix miscoded backslashes in examples.  (Kaz Kylheku, 2024-07-20; 1 file, -5/+5)
    * txr.1: In examples for append, whereq and tuples*, fix backslashes not encoded as the \e sequence, causing improper rendering.
* oop: special methods to handle missing slots.  (Kaz Kylheku, 2024-07-19; 4 files, -8/+158)
    * struct.h (slotset_s, static_slot_s, static_slot_set_s): New symbol variables declared. (enum special_slot): New enum symbols: slot_m, slotset_m, static_slot_m, static_slot_set_m.
    * struct.c (slotset_s, static_slot_s, static_slot_set_s): New symbol variables. (special_sym): Associate new symbols with new enums. (struct_init): Intern slotset, static-slot and static-slot-set symbols, initializing the variables. Change the registrations of the same-named functions to use the variables. (slot, maybe_slot, slotset, static_slot, static_slot_set): In the no-such-slot case, check for the special method and call it.
    * tests/012/oop.tl: New tests.
    * txr.1: Documented.
* struct: slot warning only for bindable symbols.  (Kaz Kylheku, 2024-07-17; 1 file, -1/+2)
    * stdlib/struct.tl (sys:check-slot): Don't issue the diagnostic "<obj> isn't the name of a struct slot" for slots that are not bindable symbols like obj."abc".
* doc: fix misleading claim about (. pattern).  (Kaz Kylheku, 2024-07-16; 1 file, -1/+3)
    * txr.1: We cannot say that (. pattern) is not a list pattern, since it is indistinguishable from pattern, which could itself be a list pattern.
* doc: misleading info about macro param lists.  (Kaz Kylheku, 2024-07-16; 1 file, -5/+4)
    * txr.1: In the "Comparison to Macro Parameter Lists" section, which compares structural pattern matching to macro parameter lists, we remove the outdated claim that all positions in a macro parameter pattern must bind a variable. This has not been true since t has been supported as a way to avoid binding a variable.
* New functions: find-maxes and find-mins.  (Kaz Kylheku, 2024-07-16; 5 files, -2/+86)
    * eval.c (eval_init): New intrinsic functions find-maxes and find-mins.
    * lib.[ch] (find_maxes, find_mins): New functions.
    * tests/012/seq.tl: New tests.
    * txr.1: Documented.
* json: new special var *print-json-type*.  (Kaz Kylheku, 2024-07-12; 4 files, -11/+45)
    This variable controls whether we emit the "__type" key for structures.
    * lib.c (out_json_rec): React to the new variable, via the flag in the json_opts structure: include the "__type" key only if it is requested. (out_json, put_json): Initialize the type flag in the json_opts according to the *print-json-type* dynamic variable.
    * stream.c (print_json_type_s): New symbol variable. (stream_init): print_json_type_s initialized, and corresponding special variable registered, with initial value t.
    * stream.h (struct json_opts): New bitfield member, type. (print_json_type_s): Declared.
    * txr.1: Documented.
* New functions related to where function.  (Kaz Kylheku, 2024-07-11; 5 files, -0/+176)
    * eval.c (eval_init): Register intrinsics wheref, whereq, whereql and wherequal.
    * lib.c (wheref_fun): New static function. (wheref, whereq, whereql, wherequal): New functions.
    * lib.h (wheref, whereq, whereql, wherequal): Declared.
    * tests/012/seq.tl: New tests.
    * txr.1: Documented.
* doc: partition, split, split*: clarifications about indices  (Kaz Kylheku, 2024-07-10; 1 file, -2/+12)
    * txr.1: Clarify what repeated values mean in partition, since they are allowed. For split/split*, clarify that indices have to be strictly increasing after negative indices are displaced by the sequence length, and that the behavior is unspecified otherwise.
* partition, split, split*: bug handling negative indices.  (Kaz Kylheku, 2024-07-10; 2 files, -22/+35)
    * lib.c (partition_func, split_func, split_star_func): When negative indices occur after the sequence has already been shortened, the conversion to positive must take into account the base. This must be added so that the positive index produced is relative to the original length of the input sequence. When index_rebased is calculated, the base is subtracted out again. If we based the positive index off the shortened length, it's as if we are subtracting base twice.
    * tests/012/seq.tl: Dubious test cases for split* are replaced with the new results that make sense. Additional test cases are added which cover this issue, for not only split* but split and partition.
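The arithmetic behind this fix can be sketched as follows. The function name and parameters are hypothetical and do not mirror the actual code in lib.c; the sketch only shows why the base has to be added before it is subtracted.

```c
/* Hypothetical sketch of the index arithmetic.
 *
 * base:      number of elements already consumed from the front
 * remaining: length of the shortened sequence
 *
 * A negative index is first made relative to the ORIGINAL length, which
 * is remaining + base, and only then rebased by subtracting base.
 * Converting against `remaining` alone and then subtracting base would
 * in effect subtract base twice. */
static long sketch_rebase_index(long index, long base, long remaining)
{
    if (index < 0)
        index += remaining + base;   /* relative to the original length */
    return index - base;             /* relative to the shortened sequence */
}
```

For example, with an original length of 10 and 4 elements already consumed, index -2 denotes absolute position 8, which is position 4 of the shortened sequence. The faulty calculation would have produced -2 + 6 - 4 = 0.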
* split, split*: fix poor behavior for beyond-length indices.  (Kaz Kylheku, 2024-07-10; 2 files, -30/+32)
    * lib.c (split_func, split_star_func): Ignore indices greater than the length of the sequence, the same as negative indices are ignored which don't become nonnegative after adding the length.
    * tests/012/seq.tl: Fix questionable test cases, which now confirm the right behavior.
* partition: add negative index tests.  (Kaz Kylheku, 2024-07-10; 1 file, -0/+36)
    * tests/012/seq.tl: New tests.
* split*: fix for far negative indices.  (Kaz Kylheku, 2024-07-10; 2 files, -1/+33)
    * lib.c (split_star_func): In empty index case, convert sequence via sub(seq, zero, t), so that ranges are properly expanded. This was done in split_func and partition_func in recent commits.
    * tests/012/seq.tl: Test cases added.
* split*: add missing tests.  (Kaz Kylheku, 2024-07-10; 1 file, -0/+125)
    * tests/012/seq.tl: Add tests for split* that were lost in some editing. Corresponding tests exist for split already and for partition.
* split: fix for far negative indices.  (Kaz Kylheku, 2024-07-10; 2 files, -1/+29)
    * lib.c (split_func): In empty index case, convert sequence via sub(seq, zero, t), so that ranges are properly expanded. This was done in partition_func in the previous commit.
    * tests/012/seq.tl: Test cases added.
* split, split*, partition: tests, fixes.  (Kaz Kylheku, 2024-07-10; 2 files, -2/+412)
    * lib.c (partition_func): In empty index list case, run the sequence through sub(seq, zero, t) so that ranges are expanded: e.g. 1..3 becomes (1 2). The corresponding code in split_func and split_star_func also needs this fix, but the current test cases don't reproduce a problem. (partition_split_common): Likewise here.
    * tests/012/seq.tl: Tests for split, split* and partition. Some tests have questionable results. We accept these as they are for now; will address these.
* partition, split, split*: infinite looping regression.  (Kaz Kylheku, 2024-07-08; 1 file, -6/+9)
    This is a bug introduced in 9cfa3435 on 2024-02-24. The underlying cause is lack of test coverage for these functions.
    * lib.c (partition_func, split_func, split_star_func): The original code iterated through the indices using the pop macro, thus extracting the next index and stepping in one step. The iter_begin rewrite wrongly moved the iter_step into one of the cases. The index iteration must be stepped in the case where the loop is continued via continue, otherwise an infinite loop results.
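The shape of this bug is easy to reproduce in a loop that separates fetching an item from stepping past it. The following is a generic sketch with hypothetical names, not the lib.c code: the step has to happen on the continue path too.

```c
#include <stddef.h>

/* Hypothetical sketch: count the indices that fall within [0, length],
 * skipping the others with continue. Because fetching and stepping are
 * separate operations (unlike a pop), every path that continues the loop
 * must perform the step itself. */
static size_t sketch_count_usable(const long *indices, size_t count, long length)
{
    size_t pos = 0, usable = 0;

    while (pos < count) {
        long index = indices[pos];       /* fetch without stepping */

        if (index < 0 || index > length) {
            pos++;                       /* omit this and the loop never ends */
            continue;
        }

        usable++;
        pos++;                           /* step on the normal path */
    }

    return usable;
}
```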
* sub: don't produce an iterator.  (Kaz Kylheku, 2024-07-07; 2 files, -4/+4)
    Having the sub function yield an iterator in some cases is a defective requirement, causing problems like this:

      1> (partition 1..10 '(2 3))
      ((1 2) (3) #<seq-iter: a24c380>)

    With fix:

      1> (partition 1..10 '(2 3))
      ((1 2) (3) (4 5 6 7 8 9))

    * lib.c (sub_iter): When the interval is open and we are operating on a sequence via iter-begin, do not return an iterator. Convert it to a lazy list. Not subjecting this to -C compat flag; I can't imagine anyone writing code to depend on this, rather than stepping around it as a bug.
    * txr.1: Documentation updated.
* json: fix flat-p argument in put-json and put-jsons.  (Kaz Kylheku, 2024-07-06; 2 files, -33/+40)
    The flat-p flag is not being passed through the recursion. Some of the formatting code emits newlines unconditionally. Instead of passing down the enum json_fmt, we pass down a new structure which carries the flat-p flag also.
    * stream.h (struct json_opts): New struct.
    * lib.c (out_json_rec): Take a struct json_opts argument instead of enum json_fmt. Handle the flat flag to avoid generating line breaks. (out_json, put_json): Prepare json_opts structure and pass to out_json_rec.
* json: support printing structs in JSON format.  (Kaz Kylheku, 2024-07-06; 2 files, -0/+78)
    * lib.c (out_json_sym): New static function. (out_json_rec): Handle structp.
    * txr.1: Documented.
* hash: fix: equal hashes being reduced modulo NUM_MAX.  (Kaz Kylheku, 2024-07-04; 2 files, -5/+1)
    Almost exactly 6 years ago, commit 612f4f57e made hashes use the full width of the ucnum type. However, several reductions of the form hash &= NUM_MAX were forgotten.
    * hash.c (hash_hash_op): Eliminate reductions modulo NUM_MAX.
    * struct.c (struct_inst_hash): Likewise.
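The effect of a leftover masking step can be shown with a small sketch. Everything here is hypothetical: sk_ucnum stands in for the ucnum type, SK_NUM_MAX is an assumed mask a few bits narrower than the word (the real NUM_MAX value is not given in the commit message), and the combining formula is arbitrary.

```c
#include <stdint.h>

typedef uintptr_t sk_ucnum;               /* hypothetical stand-in for ucnum */

/* Assumed mask that is narrower than the full word. */
#define SK_NUM_MAX (UINTPTR_MAX >> 3)

/* Combine two hash values using the full width of the type. */
static sk_ucnum sk_hash_full(sk_ucnum a, sk_ucnum b)
{
    return a * 31u + b;
}

/* The same combination followed by a forgotten reduction: the high bits
 * of the result are discarded, shrinking the range of possible hashes. */
static sk_ucnum sk_hash_reduced(sk_ucnum a, sk_ucnum b)
{
    return (a * 31u + b) & SK_NUM_MAX;
}
```

Small inputs give the same result either way; the two functions only diverge once the high bits come into play, which is why such a leftover reduction can go unnoticed for a long time.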
* regex: eliminate unnecessary stack push.  (Kaz Kylheku, 2024-07-03; 1 file, -1/+0)
    * regex.c (scan_until_common): In the REGM_MATCH case, there is no need to push the current character into the unget stack. That character is not supposed to be pushed back, and it won't be because it's below the match point; the stack node just ends up recycled at the end of the function.
* regex: don't consume input past final match.  (Kaz Kylheku, 2024-07-03; 1 file, -26/+61)
    The read-until-match functions and the two others in the same family always read a character beyond the characters matched by the regex. This will cause blocking behavior in cases where a TTY or network socket has provided a matching record delimiter already, using a trivial, fixed-length regex. Similar behavior is seen in GNU Awk also, with its RS (record separator); let's fix it in our world.
    We introduce a REGM_MATCH_DONE result code, which, like REGM_MATCH, indicates that the state machine is in an acceptance state. Unlike REGM_MATCH it also indicates that no more transitions are possible. For instance, for a regex like #/ab|c/, the REGM_MATCH_DONE code will be indicated when the input "ab" is seen, or the input "c" is seen. Any additional characters will cause a mismatch. This indication makes it possible for the caller to avoid reading more characters from an input source.
    * regex.c (enum regm_result, regm_result_t): New REGM_MATCH_DONE enum member. (nfa_has_transitions): New macro. (nfa_closure, nfa_move_closure): New pointer-to-int parameter more. This is set to true only if one or more states in the output state have transitions. (nfa_run): Initialize new local variable more and pass to nfa_closure and nfa_move_closure. Break out of the character feeding loop if more is zero. (regex_machine_reset): Pass more parameter to nfa_closure. (regex_machine_feed): Pass more parameter to nfa_move_closure. When returning REGM_MATCH, if more is false, return REGM_MATCH_DONE. In the derivatives implementation, we report REGM_MATCH_DONE when the derivative we have calculated is null. (search_regex, match_regex): Break loop on REGM_MATCH_DONE, and avoid feeding the null character in that case. (match_regex_right): Likewise, and also handle the REGM_MATCH_DONE case specially at the end. We need to check whether the match reached the end of the string (is anchored to the right). If not, we continue the search. (regex_prefix_match): Break loop on REGM_MATCH_DONE. (scan_until_common): If we hit REGM_MATCH_DONE, break out of the loop and proceed straight to the out_match block, indicating that no characters need to be pushed back from the stack.
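The idea of a "match, and nothing more can follow" result can be sketched with a tiny hand-built automaton for ab|c. This is an illustration only: the sk_ names, the edge table and the result codes are hypothetical and unrelated to the NFA and derivative machinery in regex.c.

```c
#include <stddef.h>

enum sk_result { SK_FAIL, SK_INCOMPLETE, SK_MATCH, SK_MATCH_DONE };

struct sk_edge { int from; char ch; int to; };

/* Hand-built automaton for ab|c:  0 -a-> 1 -b-> 2,  0 -c-> 2. */
static const struct sk_edge sk_edges[] = {
    { 0, 'a', 1 }, { 1, 'b', 2 }, { 0, 'c', 2 }
};
#define SK_EDGE_COUNT (sizeof sk_edges / sizeof sk_edges[0])

static const int sk_accepting[] = { 0, 0, 1 };   /* indexed by state */

/* Does any edge leave this state? */
static int sk_has_transitions(int state)
{
    size_t i;
    for (i = 0; i < SK_EDGE_COUNT; i++)
        if (sk_edges[i].from == state)
            return 1;
    return 0;
}

/* Feed one character. An accepting state with no outgoing edges is
 * reported as SK_MATCH_DONE, telling the caller to stop reading. */
static enum sk_result sk_feed(int *state, char ch)
{
    size_t i;
    for (i = 0; i < SK_EDGE_COUNT; i++) {
        if (sk_edges[i].from == *state && sk_edges[i].ch == ch) {
            *state = sk_edges[i].to;
            if (!sk_accepting[*state])
                return SK_INCOMPLETE;
            return sk_has_transitions(*state) ? SK_MATCH : SK_MATCH_DONE;
        }
    }
    return SK_FAIL;
}

/* Number of characters taken from the input before the scan stops. */
static size_t sk_chars_read(const char *input)
{
    int state = 0;
    size_t nread = 0;

    while (input[nread] != '\0') {
        enum sk_result res = sk_feed(&state, input[nread++]);
        if (res == SK_MATCH_DONE || res == SK_FAIL)
            break;
    }
    return nread;
}
```

With this sketch, scanning "abxyz" stops after two characters; the 'x' is never requested from the input, which is the property that matters for a TTY or socket.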
* New functions: cshuffle and cnshuffle.  (Kaz Kylheku, 2024-07-01; 4 files, -11/+90)
    These functions find random cyclic permutations.
    * eval.c (eval_init): Register cshuffle and cnshuffle intrinsics.
    * lib.c (nshuffle_impl): New static function, formed out of nshuffle. (nshuffle): Now wrapper around nshuffle_impl. (shuffle): Also wraps nshuffle_impl rather than nshuffle. (cnshuffle, cshuffle): New functions.
    * lib.h (cnshuffle, cshuffle): Declared.
    * txr.1: Documented new functions. Also added warning about limitations on permutation reachability in relation to PRNG state size.
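One well-known way to generate a random cyclic permutation is Sattolo's algorithm, a variant of the Fisher-Yates shuffle in which an element is never swapped with itself. The commit message does not say which algorithm lib.c uses, so the following is only a sketch of that technique; the function name is hypothetical, and rand() with a modulo is used for brevity even though it is slightly biased.

```c
#include <stdlib.h>

/* Sattolo's algorithm, sketched on an int array. Applied to the identity
 * arrangement 0..count-1, the result, read as the mapping i -> items[i],
 * is a single cycle that passes through every element. */
static void sketch_cyclic_shuffle(int *items, size_t count)
{
    size_t i;

    for (i = count; i > 1; i--) {
        /* Pick j in [0, i-2]: strictly below slot i-1, never equal to it.
         * Allowing j == i-1 would give the ordinary Fisher-Yates shuffle. */
        size_t j = (size_t) rand() % (i - 1);
        int tmp = items[i - 1];
        items[i - 1] = items[j];
        items[j] = tmp;
    }
}
```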
* build: split tlos into three groups rather than two.  (Kaz Kylheku, 2024-06-30; 1 file, -2/+5)
    It seems that there are several more .tlo files that we should compile earlier for a better build time.
    * Makefile (STDLIB_MIDDLE_TLOS): New variable. We include error.tlo in here because a new circular dependency has been revealed involving usr:catch. (STDLIB_LATE_TLOS): Also exclude STDLIB_MIDDLE_TLOS. (all): Depend on STDLIB_MIDDLE_TLOS between the early and late ones.
* txr: @(data) must disable recycling.  (Kaz Kylheku, 2024-06-30; 1 file, -0/+1)
    * match.c (v_data): Set c->top to zero; after capturing the c->data pointer to a variable, we must no longer forcibly recycle the head of the data as we march down.
* lib: rcyc_cons set type to CONS.  (Kaz Kylheku, 2024-06-29; 1 file, -0/+1)
    * lib.c (rcyc_cons): Reset type of recycled cons to CONS, in case the object is a LCONS.
* txr: bugfix: reset top flag when copying context.  (Kaz Kylheku, 2024-06-29; 1 file, -0/+1)
    * match.c (match_files): This function sometimes receives a copy of a top-marked context. The copy must not be top-marked, or very bad things happen.
* doc: eq, eql, equal cleanup.  (Kaz Kylheku, 2024-06-29; 1 file, -8/+20)
    * txr.1: Mention that floating point numbers may be boxed or unboxed, and so may or may not be comparable with eq. Remove superfluous adjectives like actually and slightly.
* doc: typo in new text about file descriptor option.  (Kaz Kylheku, 2024-06-29; 1 file, -1/+1)
    * txr.1: Fix "descriptr" typo.
* txr: better implementation of previous change.  (Kaz Kylheku, 2024-06-29; 1 file, -42/+25)
    We drop the global variable because all it's doing is marking a particular match_files_ctx structure as being top. We can do that with a flag inside the structure.
    * match.c (match_files_ctx): Remove struct tag; that was added while developing the previous patch and ended up not being required. New member, top. (mf_current): Static variable removed. (mf_all, mf_args, mf_data, mf_spec, mf_spec_bindings, mf_file, mf_file_data, mf_from_ml): Initialize new member to zero. (step_data): Check c->top being true rather than c == mf_current. (v_next_impl): We don't have to save and restore anything here; just set nc.top to 1. (extract): We remove the uw_simple_catch_begin block, since there is no global to restore. Just set c.top to 1.
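The flag-based scheme can be sketched with an ordinary linked list and a free list. The sk_ names and data layout are hypothetical and much simpler than the real match_files_ctx and cons cells; the sketch only shows that a context marked top recycles each cell it steps past, while a copy with the flag cleared merely advances.

```c
#include <stddef.h>

struct sk_cons { int item; struct sk_cons *rest; };

struct sk_ctx {
    struct sk_cons *data;   /* current position in the data list */
    int top;                /* nonzero: backtracking cannot revisit old cells */
};

static struct sk_cons *sk_free_list;   /* recycled cells, ready for reuse */

static void sk_recycle(struct sk_cons *cell)
{
    cell->rest = sk_free_list;
    sk_free_list = cell;
}

/* Advance one cell. Only a top-marked context may recycle the cell it
 * leaves behind; any other context may still be backtracked over, so
 * the cell has to stay intact. */
static void sk_step_data(struct sk_ctx *c)
{
    struct sk_cons *head = c->data;

    if (head == NULL)
        return;

    c->data = head->rest;

    if (c->top)
        sk_recycle(head);
}
```

A copy of a top context is made safe simply by clearing its flag, which corresponds to the related fix that resets top when the context is copied.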
* txr: real solution for spurious retention problem.  (Kaz Kylheku, 2024-06-29; 1 file, -24/+56)
    We take advantage of the nested, recursive nature of the pattern language. Whenever a new data context is initiated, we indicate that that context is current, in a dynamic variable. Then, in the various data scanning directives (scan, collect, repeat, gather), when performing the c->data = rest(c->data) step to march down the lazy list, check whether the context c is the current one. If that is the case, we know that backtracking is not possible, and so we can safely pass the c->data cons to the rcyc_cons function for recycling. Otherwise we do the c->data = rest(c->data) only. It's already the case that whenever backtracking is necessary, a new, disposable context is allocated which is thrown away on failure. Those contexts are never registered as current, and so never match.
    * match.c (match_files_ctx): Introduce a tag name to the structure, struct match_files_ctx. Remove the volatile qualifier from the data member, and put it back in the original order. New member up for linking these structures into a stack. (mf_current): New static variable. (step_data): New static function. This is where we do the retention-resistant step of c->data. (v_skip, v_fuzz, v_gather, v_collect, match_files_byref): Use step_data function rather than c->data = rest(c->data). (v_next_impl, extract): Register the new scanning context as current by assigning it to mf_current, taking care to save and restore the previous value, even if the matching is abandoned by an exception.
* Version 295.  [tag: txr-295]  (Kaz Kylheku, 2024-06-28; 7 files, -1346/+1440)
    * RELNOTES: Updated.
    * configure (txr_ver): Bumped version.
    * stdlib/ver.tl (lib-version): Bumped.
    * txr.1: Bumped version and date.
    * txr.vim, tl.vim: Regenerated.
    * protsym.c: Regenerated.
* match: new @(scan-all) operator.  (Kaz Kylheku, 2024-06-28; 4 files, -3/+57)
    This is like @(scan) but collects all matches over the suffixes of the list.
    * autoload.c (match_set_entries): Intern scan-all symbol.
    * stdlib/match.tl (compile-scan-all-match): New function. (compile-match): Dispatch compile-scan-all-match on scan-all symbol.
    * tests/011/patmatch.tl: Tests for scan-all and also missing tests for scan.
    * txr.1: Documented.
* match: bad indentation.  (Kaz Kylheku, 2024-06-27; 1 file, -4/+4)
    * stdlib/match.tl (compile-scan-match): Fix wrong indentation of let* body.
* genprotsym: fix bugs.  (Kaz Kylheku, 2024-06-27; 1 file, -5/+16)
    * genprotsym.txr: Do not glob .c files; use "git ls-files" to only pick up tracked files. Skip the protsym.c file itself. Parenthesize any if guards that have || operators in them. Rearrange the output so that @{gpp "&&"} expressions actually operate on lists and not strings.
* packages: streamline is-a-string testing.  (Kaz Kylheku, 2024-06-27; 1 file, -28/+19)
    * lib.c (make_sym, make_package, use_sym_as, find_symbol, find_symbol_fb, intern_intrinsic, intern_fallback_intrinsic): Replace stringp test with dummy calls to c_str, for the side effect of the type check.
* open-process: new ?fdno option for selecting stream fd.  (Kaz Kylheku, 2024-06-26; 3 files, -8/+61)
    If, for instance, ?2 is specified in the mode string argument of open-process and related functions, this means that the file descriptor 2 of the process will be used as the data source (or sink) for the stream that is returned by the function. With this feature we can easily read the standard error of a process while leaving its standard output unredirected.
    * stream.c (do_parse_mode): Parse the ? mode option. (open_subprocess): Check for the presence of the alternative file descriptor in the stdio_mode structure, and use it instead of STDIN_FILENO or STDOUT_FILENO.
    * stream.h (struct stdio_mode): New member, streamfd. (stdio_mode_init_blank, stdio_mode_init_r, stdio_mode_init_rpb, stdio_mode_init_blank, stdio_mode_init_r, stdio_mode_init_rpb): Update initializer macros to cover the new member, setting it to the default value -1 (not specified).
    * txr.1: Documented.
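How such an option might be picked out of a mode string can be sketched as follows. This is not the do_parse_mode code: the sk_ names are hypothetical, only the ? option is handled, and accepting more than one decimal digit after ? is an assumption. The -1 default follows the commit message.

```c
#include <ctype.h>

struct sk_mode {
    int streamfd;   /* -1 means "not specified", as in the commit message */
};

/* Scan a mode string such as "r?2" for the ? option. Returns 1 on
 * success and 0 if ? is not followed by a digit. All other mode
 * characters are skipped; this sketch does not validate them. */
static int sk_parse_streamfd(const char *mode, struct sk_mode *out)
{
    out->streamfd = -1;

    for (; *mode != '\0'; mode++) {
        if (*mode == '?') {
            int fd = 0;

            if (!isdigit((unsigned char) mode[1]))
                return 0;

            while (isdigit((unsigned char) mode[1])) {
                fd = fd * 10 + (mode[1] - '0');
                mode++;
            }

            out->streamfd = fd;
        }
    }

    return 1;
}

/* Choose the descriptor to connect the stream to: the requested one if
 * present, otherwise the caller's default (e.g. standard output). */
static int sk_effective_fd(const struct sk_mode *m, int default_fd)
{
    return m->streamfd != -1 ? m->streamfd : default_fd;
}
```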
* iter-begin: handle FLNUM.  (Kaz Kylheku, 2024-06-26; 2 files, -0/+19)
    * lib.c (iter_begin, iter_more, iter_item, iter_step, iter_reset, copy_iter): Handle FLNUM like NUM, so that we don't wastefully return a dynamic iterator object.
    * tests/012/iter.tl: Test cases for numeric and character iteration. Test cases for iter-begin on some basic types. copy-iter test for floats.
* copy-iter: test for common types.  (Kaz Kylheku, 2024-06-26; 1 file, -0/+18)
    * tests/012/iter.tl: Test copy-iter for lists, vectors, integers, characters, strings, string ranges, numeric ranges.
* copy-iter: test that the combi iterators copy.  (Kaz Kylheku, 2024-06-26; 1 file, -0/+12)
    * tests/015/comb.tl: New tests.
* copy-iter: some tests.  (Kaz Kylheku, 2024-06-26; 1 file, -2/+15)
    * tests/012/oop-seq.tl: Add tests that verify that an OOP iter without a copy method cannot be copied with copy-iter, and that one which has the method can be copied, in accordance with the requirements.
* txr: deal with spurious retention problem.  (Kaz Kylheku, 2024-06-24; 1 file, -28/+33)
    When we open a data source using @(next), or use one implicitly at the top level of the script, we would like a top-level scanning construct such as @(collect) or @(repeat), which does not backtrack, not to consume memory when moving through a large file. I experimented with ways to fix this in the past that were ineffective, but I think I hit upon a working approach.
    * match.c (match_files_ctx): Make the data field (pointer to lazy list data source) volatile. (match_files_byref): New function, based on converting match_files to take a context by pointer rather than by value. (match_files): By-value wrapper for match_files_byref that most constructs use. (v_next_impl): When opening a stream source, use match_files_byref to avoid possible duplication of the structure. Without the volatile in match_files_ctx, this doesn't squelch all spurious retention.
* combi: fix permi and rpermi; impl combi, rcombi; test.  (Kaz Kylheku, 2024-06-24; 4 files, -32/+265)
    * combi.c (permi_get, permi_peek): Fix algorithm. (permi_mark): New static function. (permi_ops): Reference permi_mark for mark operation. (permi): Initialize it->ul.next to nao as required by new get/peek algorithm. (rpermi_get, rpermi_peek): Fix algorithm. (rpermi_mark): New static function. (rpermi_ops): Reference permi_mark for mark operation. (rpermi): Initialize it->ul.next to nao as required by new get/peek algorithm. (combi_get, combi_peek, combi_mark, combi_clone): New static functions. (combi_ops): New static structure. (combi): New function. (rcombi_get, rcombi_peek, rcombi_mark, rcombi_clone): New static functions. (rcombi_ops): New static structure. (rcombi): New function.
    * combi.h (combi, rcombi): Declared.
    * tests/015/comb.tl: New tests.
* doc: permi, rpermi, combi, rcombi.  (Kaz Kylheku, 2024-06-21; 1 file, -4/+36)
    * txr.1: Document new iterator-based combinatoric functions.
* lib: replace generic errors with more specific errors.  (Kaz Kylheku, 2024-06-20; 1 file, -78/+90)
    * lib.c (unsup_obj, iter_step, last, nthcdr, list_collect, list_collect_nconc, list_collect_append, list_collect_nreconc, list_collect_revappend, nreverse, reverse, replace_list, lazy_appendv, tuples, tuples_star, chk_grow_vec, chk_manage_vec, chk_wrealloc, chk_substrdup, chk_substrdup_utf8, chk_strdup_8bit, chk_xalloc, endp, mkstring, mkustring, string_extend, replace_str, replace_str, cat_str_measure, fmt_str_sep, split_str_keep, spln, tok_str, tokn, cmp_str, int_str, chr_str, chr_str_set, chr_str_set, symbol_package, make_package, use_sym_as, find_symbol, find_symbol_fb, intern_intrinsic, intern_fallback_intrinsic, get_current_package, func_interp, func_get_form, callerror, vec_set_length, vecref, vecref_l, replace_vec, replace_obj, fill_vec, cat_vec, int_cptr, calc_win_size, mismatch, rmismatch, refset, dwim_set, dwim_del, rangeref): Replace error in exceptions with more specific error like type_error, range_error, numeric_error or alloc_error.
* combi: replace some zerop tests.  (Kaz Kylheku, 2024-06-20; 1 file, -6/+6)
    * combi.c (rperm, comb_init): Replace zerop(k) tests with k == zero which is more specific and faster, testing only for the integer zero that we care about.
* string ranges: individual positions are ascending/descending.  (Kaz Kylheku, 2024-06-20; 3 files, -45/+36)
    In this patch we get rid of the wrongheaded notion that a string range, as such, is ascending or descending. In fact, the corresponding character positions are individually ascending or descending.
    * lib.c (seq_iter_get_range_str): Either increment or decrement the character in the step string depending on whether that position is in order or reversed. (seq_iter_get_rev_range_str): Static function removed. (si_rev_range_str_ops): Static structure removed. (seq_iter_init_with_info): For string ranges, use si_range_str_ops regardless of the strings being lexicographically reversed.
    * tests/012/iter.tl: New test case.
    * txr.1: Redocumented string ranges.
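Per-position stepping can be sketched with plain char buffers. This is an illustration under assumptions, not the seq_iter_get_range_str code: the function name is hypothetical, the two endpoint strings are assumed to have equal length, and the rightmost position is assumed to vary fastest, odometer style.

```c
#include <string.h>

/* Advance cur to the next string of the range from..to. Each position
 * moves toward its own endpoint, so one position may ascend while
 * another descends. Returns 1 if cur was advanced, or 0 if cur was
 * already the last string (in which case cur wraps back to from). */
static int sk_range_step(char *cur, const char *from, const char *to)
{
    size_t i = strlen(cur);

    while (i-- > 0) {
        if (cur[i] != to[i]) {
            /* Direction is decided per position, not for the whole string. */
            cur[i] += (from[i] < to[i]) ? 1 : -1;
            return 1;
        }
        cur[i] = from[i];   /* position exhausted: reset it and carry left */
    }

    return 0;
}
```

Under these assumptions, stepping from "AC" toward "BA" visits AC, AB, AA, BC, BB, BA: the first position ascends while the second descends.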