| Commit message (Collapse) | Author | Age | Files | Lines |
| |
* regex.c (scan_until_common): in the REGM_MATCH case, there
is no need to push the current character into the unget stack.
That character is not supposed to be pushed back, and it won't
be because it's below the match point; the stack node just
ends up recycled at the end of the function.
|
The read-until-match functions and the two others in the same
family always read a character beyond the characters matched
by the regex. This causes blocking behavior in cases where
a TTY or network socket has already provided a matching record
delimiter, using a trivial, fixed-length regex.
Similar behavior is seen in GNU Awk also, with its RS (record
separator); let's fix it in our world.
We introduce a REGM_MATCH_DONE result code, which, like
REGM_MATCH, indicates that the state machine is in an acceptance
state. Unlike REGM_MATCH it also indicates that no more
transitions are possible.
For instance, for a regex like #/ab|c/, the REGM_MATCH_DONE
code will be indicated when the input "ab" is seen, or the
input "c" is seen. Any additional characters will cause a
mismatch. This indication makes it possible for the caller to
avoid reading more characters from an input source.
* regex.c (enum regm_result, regm_result_t): New
REGM_MATCH_DONE enum member.
(nfa_has_transitions): New macro.
(nfa_closure, nfa_move_closure): New pointer-to-int parameter
more. This is set to true only if one or more states in
the output state have transitions.
(nfa_run): Initialize new local variable more and pass to
nfa_closure and nfa_move_closure. Break out of the character
feeding loop if more is zero.
(regex_machine_reset): Pass more parameter to nfa_closure.
(regex_machine_feed): Pass more parameter to nfa_move_closure.
When returning REGM_MATCH, if more is false, return
REGM_MATCH_DONE. In the derivatives implementation, we report
REGM_MATCH_DONE when the derivative we have calculated is
null.
(search_regex, match_regex): Break loop on REGM_MATCH_DONE,
and avoid feeding the null character in that case.
(match_regex_right): Likewise, and also handle the
REGM_MATCH_DONE case specially at the end. We need to check
whether the match reached the end of the string (is anchored
to the right). If not, we continue the search.
(regex_prefix_match): Break loop on REGM_MATCH_DONE.
(scan_until_common): If we hit REGM_MATCH_DONE, break out
of the loop and proceed straight to the out_match block,
indicating that no characters need to be pushed back from
the stack.
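A minimal sketch of the idea, not the regex.c code: the automaton for
ab|c is built by hand, and the sk_ names and result codes are
hypothetical stand-ins for the REGM_ codes. It shows how a caller can
stop consuming input as soon as the machine reports an accepting state
with no outgoing transitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes, modeled on the ones named in the message. */
typedef enum {
  SK_FAIL,        /* no match is possible any more */
  SK_INCOMPLETE,  /* not accepting yet; more input could still match */
  SK_MATCH,       /* accepting, and a longer match is still possible */
  SK_MATCH_DONE   /* accepting, and no further transitions exist */
} sk_result_t;

/* Hand-built automaton for ab|c: state 0 = start, 1 = seen "a",
   2 = accepting dead end (seen "ab" or "c"), 3 = failed. */
typedef struct { int state; } sk_machine_t;

static void sk_reset(sk_machine_t *m) { m->state = 0; }

static sk_result_t sk_feed(sk_machine_t *m, char ch)
{
  switch (m->state) {
  case 0:
    if (ch == 'a') { m->state = 1; return SK_INCOMPLETE; }
    if (ch == 'c') { m->state = 2; return SK_MATCH_DONE; }
    break;
  case 1:
    if (ch == 'b') { m->state = 2; return SK_MATCH_DONE; }
    break;
  }
  m->state = 3;
  return SK_FAIL;
}

/* Returns the length of the match at the start of input (0 if none) and
   stores in *nread how many characters had to be read to decide. */
static size_t sk_match_prefix(const char *input, size_t *nread)
{
  sk_machine_t m;
  size_t i = 0, last_match = 0;

  assert(input != NULL && nread != NULL);
  sk_reset(&m);
  while (input[i] != '\0') {
    sk_result_t r = sk_feed(&m, input[i++]);
    if (r == SK_MATCH || r == SK_MATCH_DONE)
      last_match = i;
    if (r == SK_FAIL || r == SK_MATCH_DONE)
      break;  /* MATCH_DONE: stop here, no read-ahead is needed */
  }
  *nread = i;
  return last_match;
}
```

For the input "abX" the sketch reads two characters and never touches
the "X", which is the property that avoids blocking on a TTY or socket.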
|
These functions find random cyclic permutations.
* eval.c (eval_init): Register cshuffle and cnshuffle
intrinsics.
* lib.c (nshuffle_impl): New static function, formed out of
nshuffle.
(nshuffle): Now a wrapper around nshuffle_impl.
(shuffle): Also wraps nshuffle_impl rather than nshuffle.
(cnshuffle, cshuffle): New functions.
* lib.h (cnshuffle, cshuffle): Declared.
* txr.1: Documented new functions. Also added warning
about limitations on permutation reachability in relation
to PRNG state size.
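One well-known way to produce a random cyclic permutation is Sattolo's
algorithm. The sketch below assumes that technique; the message does
not say which algorithm nshuffle_impl uses. The names cyclic_shuffle
and is_single_cycle are hypothetical, and rand() is used only for
brevity (its modulo bias and small state are exactly the kind of PRNG
limitation the documentation warning is about).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Rearrange a[0..n-1] in place so that, starting from the identity
   arrangement, the result is a permutation with a single cycle
   (Sattolo's algorithm). */
static void cyclic_shuffle(int *a, size_t n)
{
  size_t i;

  assert(a != NULL || n == 0);
  for (i = n; i > 1; i--) {
    /* Pick j with 0 <= j < i - 1. Excluding i - 1 itself is the only
       difference from an ordinary Fisher-Yates shuffle. */
    size_t j = (size_t) rand() % (i - 1);
    int tmp = a[i - 1];
    a[i - 1] = a[j];
    a[j] = tmp;
  }
}

/* Returns 1 if perm, a permutation of 0..n-1, consists of one cycle. */
static int is_single_cycle(const int *perm, size_t n)
{
  size_t steps = 0, pos = 0;

  if (n == 0)
    return 0;
  do {
    pos = (size_t) perm[pos];
    steps++;
  } while (pos != 0 && steps <= n);
  return steps == n;
}
```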
|
It seems that there are several more .tlo files that we should
compile earlier for a better build time.
* Makefile (STDLIB_MIDDLE_TLOS): New variable. We include
error.tlo in here because a new circular dependency has been
revealed involving usr:catch.
(STDLIB_LATE_TLOS): Also exclude STDLIB_MIDDLE_TLOS.
(all): Depend on STDLIB_MIDDLE_TLOS between the early and late
ones.
|
* match.c (v_data): Set c->top to zero; after capturing
the c->data pointer to a variable, we must no longer
forcibly recycle the head of the data as we march down.
|
* lib.c (rcyc_cons): Reset type of recycled cons to CONS,
in case the object is a LCONS.
|
* match.c (match_files): This function sometimes receives a
copy of a top-marked context. The copy must not be top-marked,
or very bad things happen.
|
* txr.1: Mention that floating point numbers may be boxed or
unboxed, and so may or may not be comparable with eq. Remove
superfluous adjectives like actually and slightly.
|
* txr.1: Fix "descriptr" typo.
|
We drop the global variable because all it's doing is marking
a particular match_files_ctx structure as being top. We can
do that with a flag inside the structure.
* match.c (match_files_ctx): Remove struct tag; that was
added while developing the previous patch and ended up not
being required. New member, top.
(mf_current): Static variable removed.
(mf_all, mf_args, mf_data, mf_spec, mf_spec_bindings, mf_file,
mf_file_data, mf_from_ml): Initialize new member to zero.
(step_data): Check c->top being true rather than c ==
mf_current.
(v_next_impl): We don't have to save and restore anything
here; just set nc.top to 1.
(extract): We remove the uw_simple_catch_begin block,
since there is no global to restore. Just set c.top to 1.
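A reduced sketch of the flag-in-structure approach, under the
assumption that a plain singly linked list stands in for the lazy
list. The types ctx_t and node_t and the helpers are hypothetical; only
the name step_data and the member top come from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct node { struct node *next; int line; } node_t;

/* Hypothetical stand-in for the matching context. */
typedef struct {
  node_t *data;  /* current position in the input list */
  int top;       /* nonzero: nothing can backtrack past this context */
} ctx_t;

static int recycled;  /* counts recycled nodes, for demonstration only */

static void recycle_node(node_t *n) { free(n); recycled++; }

/* Advance the context by one item. Only a top context may recycle the
   node it leaves behind; any other context might be abandoned on
   failure, after which the caller still needs the old head. */
static void step_data(ctx_t *c)
{
  node_t *head = c->data;

  assert(head != NULL);
  c->data = head->next;
  if (c->top)
    recycle_node(head);
}

/* Builds the list 1, 2, ..., n. */
static node_t *make_list(int n)
{
  node_t *head = NULL;

  while (n > 0) {
    node_t *nd = malloc(sizeof *nd);
    assert(nd != NULL);
    nd->line = n--;
    nd->next = head;
    head = nd;
  }
  return head;
}
```

A by-value copy of a top context has to clear the flag before it is
stepped; otherwise the copy would recycle nodes that the original
still points to.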
|
We take advantage of the nested, recursive nature of the
pattern language. Whenever a new data context is initiated, we
indicate that that context is current, in a dynamic variable.
Then, in the various data scanning directives (scan, collect,
repeat, gather), when performing the c->data = rest(c->data)
step to march down the lazy list, check whether the context c
is the current one. If that is the case, we know that
backtracking is not possible, and so we can safely pass the
c->data cons to the rcyc_cons function for recycling.
Otherwise we do the c->data = rest(c->data) only.
It's already the case that whenever backtracking is necessary,
a new, disposable context is allocated which is thrown away on
failure. Those contexts are never registered as current, and
so never match.
* match.c (match_files_ctx): Introduce a tag name to the
structure, struct match_files_ctx. Remove the volatile
qualifier from the data member, and put it back in the
original order. New member up for linking these structures
into a stack.
(mf_current): New static variable.
(step_data): New static function. This is where we do the
retention-resistant step of c->data.
(v_skip, v_fuzz, v_gather, v_collect, match_files_byref):
Use step_data function rather than c->data = rest(c->data).
(v_next_impl, extract): Register the new scanning context
as current by assigning it to mf_current, taking care
to save and restore the previous value, even if the
matching is abandoned by an exception.
|
* RELNOTES: Updated.
* configure (txr_ver): Bumped version.
* stdlib/ver.tl (lib-version): Bumped.
* txr.1: Bumped version and date.
* txr.vim, tl.vim: Regenerated.
* protsym.c: Regenerated.
|
This is like @(scan) but collects all matches over the
suffixes of the list.
* autoload.c (match_set_entries): Intern scan-all symbol.
* stdlib/match.tl (compile-scan-all-match): New function.
(compile-match): Dispatch compile-scan-all-match on scan-all
symbol.
* tests/011/patmatch.tl: Tests for scan-all and also missing
tests for scan.
* txr.1: Documented.
|
* stdlib/match.tl (compile-scan-match): Fix wrong indentation
of let* body.
|
* genprotsym.txr: Do not glob .c files; use "git ls-files" to
only pick up tracked files. Skip the protsym.c file itself.
Parenthesize any if guards that have || operators in them.
Rearrange the output so that @{gpp "&&"} expressions actually
operate on lists and not strings.
|
* lib.c (make_sym, make_package, use_sym_as, find_symbol,
find_symbol_fb, intern_intrinsic, intern_fallback_intrinsic):
replace stringp test with dummy calls to c_str, for the
side effect of the type check.
|
If, for instance, ?2 is specified in the mode string argument
of open-process and related functions, this means that the
file descriptor 2 of the process will be used as the data
source (or sink) for the stream that is returned by the
function. With this feature we can easily read the standard
error of a process while leaving its standard output
unredirected.
* stream.c (do_parse_mode): Parse the ? mode option.
(open_subprocess): Check for the presence of the alternative
file descriptor in the stdio_mode structure, and use it
instead of STDIN_FILENO or STDOUT_FILENO.
* stream.h (struct stdio_mode): New member, streamfd.
(stdio_mode_init_blank, stdio_mode_init_r,
stdio_mode_init_rpb): Update initializer macros to cover the
new member, setting it to the default value -1 (not
specified).
* txr.1: Documented.
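A rough sketch of how such an option could be parsed and then
consulted, not the do_parse_mode code. The grammar here is an
assumption (just "r" or "w" followed by an optional ? and decimal
digits), and mode_sketch_t, parse_mode_sketch and pipe_target_fd are
hypothetical names. The -1 default mirrors the "not specified" value
mentioned above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, reduced parsed mode; streamfd == -1 means unspecified. */
typedef struct {
  int read, write;
  int streamfd;
} mode_sketch_t;

/* Parses "r" or "w", optionally followed by ?<digits>.
   Returns 1 on success, 0 on a malformed mode string. */
static int parse_mode_sketch(const char *s, mode_sketch_t *m)
{
  assert(s != NULL && m != NULL);
  m->read = m->write = 0;
  m->streamfd = -1;

  if (*s == 'r')
    m->read = 1;
  else if (*s == 'w')
    m->write = 1;
  else
    return 0;
  s++;

  if (*s == '?') {
    int fd = 0;
    s++;
    if (!isdigit((unsigned char) *s))
      return 0;  /* ? must be followed by at least one digit */
    while (isdigit((unsigned char) *s))
      fd = fd * 10 + (*s++ - '0');
    m->streamfd = fd;
  }
  return *s == '\0';
}

/* Which descriptor of the child should be connected to the pipe? */
static int pipe_target_fd(const mode_sketch_t *m)
{
  if (m->streamfd != -1)
    return m->streamfd;
  return m->read ? 1 : 0;  /* STDOUT_FILENO for reading, else STDIN_FILENO */
}
```

With the mode "r?2" the sketch selects descriptor 2, so the parent
reads the child's standard error while standard output is untouched.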
|
* lib.c (iter_begin, iter_more, iter_item, iter_step,
iter_reset, copy_iter): Handle FLNUM like NUM, so that we
don't wastefully return a dynamic iterator object.
* tests/012/iter.tl: Test cases for numeric and character
iteration. Test cases for iter-begin on some basic types.
copy-iter test for floats.
|
* tests/012/iter.tl: Test copy-iter for lists, vectors,
integers, characters, strings, string ranges, numeric ranges.
|
* tests/015/comb.tl: New tests.
|
* tests/012/oop-seq.tl: Add tests that verify that an OOP iter
without a copy method cannot be copied with copy-iter,
and that one which has the method can be copied, in
accordance with the requirements.
|
When we open a data source using @(next), or use one
implicitly at the top level of the script, we would like
a top-level scanning construct such as @(collect) or @(repeat),
which does not backtrack, not to consume memory when moving
through a large file.
I experimented with ways to fix this in the past that
were ineffective, but I think I hit upon a working approach.
* match.c (match_files_ctx): Make the data field (pointer to
lazy list data source) volatile.
(match_files_byref): New function, based on converting
match_files to take a context by pointer rather than by value.
(match_files): By-value wrapper for match_files_byref that
most constructs use.
(v_next_impl): When opening a stream source, use
match_files_byref to avoid possible duplication of the
structure. Without the volatile in match_files_ctx, this
doesn't squelch all spurious retention.
|
* combi.c (permi_get, permi_peek): Fix algorithm.
(permi_mark): New static function.
(permi_ops): Reference permi_mark for mark operation.
(permi): Initialize it->ul.next to nao as required by
new get/peek algorithm.
(rpermi_get, rpermi_peek): Fix algorithm.
(rpermi_mark): New static function.
(rpermi_ops): Reference permi_mark for mark operation.
(rpermi): Initialize it->ul.next to nao as required
by new get/peek algorithm.
(combi_get, combi_peek, combi_mark, combi_clone): New static
functions.
(combi_ops): New static structure.
(combi): New function.
(rcombi_get, rcombi_peek, rcombi_mark, rcombi_clone): New
static functions.
(rcombi_ops): New static structure.
(rcombi): New function.
* combi.h (combi, rcombi): Declared.
* tests/015/comb.tl: New tests.
|
* txr.1: document new iterator-based combinatoric
functions.
|
* lib.c (unsup_obj, iter_step, last, nthcdr, list_collect,
list_collect_nconc, list_collect_append, list_collect_nreconc,
list_collect_revappend, nreverse, reverse, replace_list,
lazy_appendv, tuples, tuples_star, chk_grow_vec,
chk_manage_vec, chk_wrealloc, chk_substrdup,
chk_substrdup_utf8, chk_strdup_8bit, chk_xalloc, endp,
mkstring, mkustring, string_extend, replace_str, replace_str,
cat_str_measure, fmt_str_sep, split_str_keep, spln, tok_str,
tokn, cmp_str, int_str, chr_str, chr_str_set, chr_str_set,
symbol_package, make_package, use_sym_as, find_symbol,
find_symbol_fb, intern_intrinsic, intern_fallback_intrinsic,
get_current_package, func_interp, func_get_form, callerror,
vec_set_length, vecref, vecref_l, replace_vec, replace_obj,
fill_vec, cat_vec, int_cptr, calc_win_size, mismatch,
rmismatch, refset, dwim_set, dwim_del, rangeref):
Replace error in exceptions with more specific error
like type_error, range_error, numeric_error or alloc_error.
|
* combi.c (rperm, comb_init): Replace zerop(k)
tests with k == zero which is more specific and
faster, testing only for the integer zero that
we care about.
|
In this patch we get rid of the wrongheaded notion that a
string range, as such, is ascending or descending. In fact,
the corresponding character positions are individually
ascending or descending.
* lib.c (seq_iter_get_range_str): Either increment or
decrement the character in the step string depending on
whether that position is in order or reversed.
(seq_iter_get_rev_range_str): Static function removed.
(si_rev_range_str_ops): Static structure removed.
(seq_iter_init_with_info): For string ranges, use
si_range_str_ops regardless of the strings being
lexicographically reversed.
* tests/012/iter.tl: New test case.
* txr.1: Redocumented string ranges.
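The per-position rule can be sketched as follows. This is an
illustration, not seq_iter_get_range_str: it assumes odometer-style
stepping in which the rightmost position varies fastest and an
exhausted position wraps back to its starting character. The function
name str_range_step is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Advances cur to the next string of the range from..to; all three
   strings must have the same length. Returns 1 if cur was advanced, or
   0 if cur was already the last element (cur is then reset to from). */
static int str_range_step(char *cur, const char *from, const char *to)
{
  size_t i = strlen(cur);

  assert(strlen(from) == i && strlen(to) == i);
  while (i-- > 0) {
    if (cur[i] != to[i]) {
      /* The direction is decided per position, not for the whole range. */
      cur[i] += (from[i] < to[i]) ? 1 : -1;
      return 1;
    }
    cur[i] = from[i];  /* position exhausted: wrap and carry to the left */
  }
  return 0;
}
```

Under these assumptions the range "ac".."ca" steps through "ac", "ab",
"aa", "bc" and so on: the first position ascends while the second
descends.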
|
* combi.c (rperm, comb, rcomb): In the default
case for generic sequences, check k, like
in the other cases and return the special
case result.
* tests/015/comb.tl: New tests.
|
For calculating the length of a range, we can't just do
numeric subtractions because it fails for string ranges.
* lib.c (length_str_range, length_rng): New static functions.
(length): Use length_rng for ranges.
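To illustrate why subtraction is not enough: if a string range steps
each character position independently between its two endpoint
characters (an assumption about the semantics, not something stated
here), its length is a product over positions. The function below is a
hypothetical sketch, not length_str_range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Number of strings in the range from..to, assuming every position
   independently runs between from[i] and to[i] inclusive, in either
   direction. Overflow of the product is not handled in this sketch. */
static size_t str_range_length(const char *from, const char *to)
{
  size_t i, n = strlen(from), len = 1;

  assert(strlen(to) == n);
  for (i = 0; i < n; i++)
    len *= (size_t) abs(to[i] - from[i]) + 1;
  return len;
}
```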
|
* txr.1: Revise text which claims that when iter-begin
is invoked on an existing iterator, the returned iterator
may share state with it. We recently fixed that with the cloning.
|
* combi.c (permi_ops, rpermi_ops): Change external linkage to
internal.
|
* combi.c (rpermi_get, rpermi_peek, rpermi_clone): New static
functions.
(rpermi_ops): New static structure.
(rpermi): New function.
* combi.h (rpermi): Declared.
* eval.c (eval_init): Register rpermi intrinsic.
|
* combi.c (permi_iter): Don't need gc_hint(obj) since we are
returning that value.
|
The quant_fun blindly trusts that the state object has
the correct type. But the function environment is mutable.
To fix this, we end up switching the state from cptr_typed
to cobj because cptr_typed has a weak form of safety based
on symbols that is better suited for FFI stuff. For anything
built into the language, we want it to be bulletproof.
* arith.c (quant_state_s): New symbol variable.
(quant_cls): New static variable.
(psq_ops): Rename to quant_ops.
(quant_fun): Parameter renamed to state. We use
cobj_handle to get the pointer to the context structure,
using the quant_cls class.
(quantile): Use cobj to create the context object, rather
than cptr_typed.
(arith_init): Initialize quant_state_s and quant_cls.
|
* gc.c (gc_prot_array_s): New symbol variable.
(gc_late_init): Initialize gc_prot_array_s. Use it when
registering prot_array_cls. The _s variables are gc-protected
by registrations in the protsym.c module which gets regularly
updated, at least before every software release.
The cobj class array is not traversed by gc.
|
Several seq_iter_t kinds of objects cannot be correctly
bitwise copied, because they point to an iterator object
that cannot be shared.
* lib.c (seq_iter_clone_op): New static function.
(si_hash_ops, si_tree_ops, si_oop_ops, si_fast_oop_ops):
Use seq_iter_clone_op, which uses the copy function
to duplicate it->ui.iter after doing a bitwise copy of
the structure.
* lib.h (seq_iter_ops_init_full): New macro.
|
* hash.c (hash_iter_ops): Use copy_hash_iter as the clone
operation.
(copy_hash_iter): New function.
* hash.h (copy_hash_iter): Declared.
* tests/010/hash.tl: New tests.
* txr.1: Documented.
|
* lib.c (iter_reset): Propagate sinf variable and
seq_info call which initializes it into the scopes
where it is used.
|
* lib.c (copy_iter): Use the copy method for arguments which
are structures, or else return just the objects if they
implement list-like sequences. Error out otherwise.
For an argument that is not an iterator or structure, error
out if it is not a number, nil, or a list-like sequence.
* txr.1: Documented.
|
* lib.h (struct cobj_ops): New function pointer, clone.
(cobj_ops_init, cobj_ops_init_ex): Add clone argument to macros.
* lib.c (seq_iter_cobj_ops): Use copy_iter as the clone operation.
(cptr_ops): Use copy_cptr as clone operation.
(copy): Replace if statements with a check of whether the COBJ
has a clone operation. If so, we use it to copy the object.
* struct.h (enum special_slot): New member, copy_m.
* struct.c (copy_s): New symbol variable.
(special_sym): Associate copy_m enum value with copy symbol.
(struct_init): Initialize copy_s with interned symbol.
(struct_inst_clone): New static function.
(struct_type_ops): Specify no clone operation via null pointer.
(struct_inst_ops): Specify struct_inst_clone as clone
operation.
* arith.c (psq_ops): Indicate no clone operation via null pointer.
* buf.c (buf_strm_ops): Likewise.
* chksum.c (sha1_ops, sha256_ops, md5_ops): Likewise.
* ffi.c (ffi_type_builtin_ops, ffi_type_struct_ops,
ffi_type_ptr_ops, ffi_type_enum_ops, ffi_closure_ops,
union_ops): Likewise.
(carray_borrowed_ops, carray_owned_ops, carray_mmap_ops):
Specify copy_carray as clone operation.
* gc.c (prot_array_ops): Indicate no clone operation via
null pointer.
* gzip.c (gzio_ops_rd, gzip_ops_wr): Likewise.
* hash.c (hash_iter_ops): Likewise.
(hash_ops): Specify copy_hash as clone operation.
* parser.c (parser_ops): Indicate no clone operation via
null pointer.
* rand.c (random_state_clone): New static function.
(random_state_ops): Use random_state_clone as clone function.
* regex.c (char_set_obj_ops, regex_obj_ops): Indicate no clone
operation via null pointer.
* socket.c (dgram_strm_ops): Likewise.
* stream.c (null_ops, stdio_ops, tail_ops, pipe_ops,
dir_ops, string_in_ops, byte_in_ops, strlist_in_ops,
string_out_ops, strlist_out_ops, cat_stream_ops,
record_adapter_ops): Likewise.
* strudel.c (strudel_ops): Likewise.
* sysif.c (cptr_dl_ops, opendir_ops): Likewise.
* syslog.c (syslog_strm_ops): Likewise.
* unwind.c (cont_ops): Likewise.
* vm.c (vm_desc_ops, vm_closure_ops): Likewise.
* tree.c (tree_ops): Use copy_search_tree for clone
operation.
(tree_iter_ops): Use copy_tree_iter for clone operation.
* genchksum.txr: Changes in chksum.c specified in one
place here.
* tests/012/oop.tl: Couple of new tests.
* txr.1: Documented.
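The shape of the change can be sketched like this. It is a simplified
model with hypothetical names (ops_sketch_t, copy_obj, counter_ops,
stream_ops), not the cobj_ops structure itself: every class supplies
either a clone function or a null pointer, and the generic copy
dispatches through the table instead of testing for each class.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct obj obj_t;

/* Hypothetical, reduced operations table with an optional clone hook. */
typedef struct {
  const char *name;
  obj_t *(*clone)(const obj_t *);  /* NULL: this class cannot be copied */
} ops_sketch_t;

struct obj {
  const ops_sketch_t *ops;
  void *handle;
};

/* Generic copy. Returns NULL where a real implementation would
   presumably raise an error for an uncopyable object. */
static obj_t *copy_obj(const obj_t *o)
{
  assert(o != NULL && o->ops != NULL);
  return o->ops->clone ? o->ops->clone(o) : NULL;
}

/* Example class: an int counter whose clone duplicates the state. */
static obj_t *counter_clone(const obj_t *o)
{
  obj_t *c = malloc(sizeof *c);
  int *state = malloc(sizeof *state);

  assert(c != NULL && state != NULL);
  *state = *(const int *) o->handle;
  c->ops = o->ops;
  c->handle = state;
  return c;
}

static const ops_sketch_t counter_ops = { "counter", counter_clone };
static const ops_sketch_t stream_ops = { "stream", NULL };
```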
|
* eval.c (eval_init): Register copy-iter intrinsic.
* lib.[ch] (copy_iter): New function.
|
* stdlib/quips.tl (%quips%): Remove quip about lecithin;
it does not wear well.
|
* lib.c (seq_iter_mark_oop, seq_iter_mark_cat): New static
functions.
(si_oop_ops, si_fast_oop_ops): Use seq_iter_mark_oop instead
of the generic one, because we need to mark the next field,
not only the iter.
(si_cat_ops): Use seq_iter_mark_cat, since we need to mark
only the second field, dargs.
* lib.h (seq_iter_ops_init_mark): New macro.
|
* eval.c (eval_init): Register permi intrinsic.
* combi.c (permi_get, permi_peek, permi_clone): New static
functions.
(permi_ops): New static structure.
(permi_iter): New static function.
(permi): New function.
* combi.h (permi): Declared.
* lib.h (struct seq_iter_ops): New function pointer, clone.
(seq_iter_ops_init, seq_iter_ops_init_nomark): Initialize
new member.
(seq_iter_ops_init_clone): New macro.
(seq_iter_cls): Existing external name declared.
(seq_iter_cobj_ops, seq_iter_mark_op): Previously internal
names declared external.
* lib.c (seq_iter_mark_op, seq_iter_cobj_ops): Static variables
become extern.
(seq_iter_clone): New static function.
(seq_iter_init_with_info): Use seq_iter_clone instead of assuming
we can trivially clone an iterator state bitwise.
|
The seq_iter_ops static structure is an instance of cobj_ops.
Its name is the same identifier as that of struct
seq_iter_ops, which is not related. This is confusing.
* lib.c (seq_iter_ops): Structure renamed to seq_iter_obj_ops.
(seq_begin, iter_begin, iter_dynamic, iter_catv): References
to object updated to new name.
|
* combi.c (perm_str): We don't have to convert the string
to a list and then vector, since we have vec_seq.
|
* combi.c (check_k): New static function.
(perm, rperm, comb, rcomb): Replace copy pasted code
with call to check_k.
|
In a recent commit, the defaulting of the separator in quasiliteral
variable formatting was moved down into the fmt_cat routine.
One stray case remains in subst_vars.
* eval.c (subst_vars): A call to fmt_cat is specifying a separator
value consisting of a single space. This is wrong, preventing
fmt_cat from defaulting it in different ways according to type.
|
In this commit, output variables in the TXR Pattern language and
in TXR Lisp quasiliterals now support separator strings for values
that are strings and buffers. Values which are buffers appear
differently: they are rendered as a sequence of lower case hex
digit pairs. When a string-valued variable specifies a separator,
the separator appears between characters of the string value.
Previously, the separator was ignored. When a buffer-valued
variable specifies a separator, the separator appears between
pairs of digits, not between digits. For instance, if ethaddr
is a variable holding #b'08:00:27:79:c7:f5', then the quasiliteral
`@ethaddr` produces "08002779c7f5" whereas `@{ethaddr ":"}`
produces "08:00:27:79:c7:f5".
* buf.[ch] (buf_str_sep): New function.
* lib.[ch] (fmt_str_sep): New function.
* eval.c (fmt_cat): If the argument is a string, and a separator
is present, replace the value with the result of calling
fmt_str_sep. If the argument is a buffer, and a separator is
present, use buf_str_sep to convert to a string, otherwise
use tostringp.
* txr.1: Section on Output Variables updated.
* tests/012/readprint.tl: New tests.
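The buffer rendering described above can be sketched in plain C. The
function hex_with_sep is a hypothetical stand-in working on a raw byte
array; it is not buf_str_sep, which operates on TXR objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd string with the bytes of buf as lower-case hex
   digit pairs, with sep between pairs. The caller frees the result. */
static char *hex_with_sep(const unsigned char *buf, size_t len,
                          const char *sep)
{
  static const char digits[] = "0123456789abcdef";
  size_t seplen = strlen(sep);
  size_t total = len * 2 + (len > 0 ? (len - 1) * seplen : 0) + 1;
  char *out = malloc(total), *p = out;
  size_t i;

  assert(out != NULL);
  for (i = 0; i < len; i++) {
    if (i > 0) {
      /* The separator goes between pairs, never inside a pair. */
      memcpy(p, sep, seplen);
      p += seplen;
    }
    *p++ = digits[buf[i] >> 4];
    *p++ = digits[buf[i] & 0xf];
  }
  *p = '\0';
  return out;
}
```

Called on the six bytes of the Ethernet address above with ":" as the
separator, the sketch yields "08:00:27:79:c7:f5"; with an empty
separator it yields "08002779c7f5".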
|
* lib.c (interpose): non-list cases consolidated into
one, which uses generic iteration and building.
* tests/012/seq.tl: New tests.
|