| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we compile the regex expression (compound "str*"), calling
regex-source on the compiled regex object yields "str*".
That, of course, is treated as regex character syntax if
fed back to regex-compile, and the * becomes an operator.
We want the source to be (compound "str*").
This happens because the AST optimizer reduces
(compound X) -> X.
* regex.c (regex_compile): If the optimized expression is just
a character string atom S, then for the purposes of
maintaining the source code, convert it to (compound S).
|
|
|
|
|
|
|
|
| |
* regex.c (regex_range_full_fun, regex_range_left_fun,
regex_range_right_fun, regex_range_search_fun): New functions.
(regex_init): Register fr^$, fr^, fr$ and frr intrinsics.
* txr.1: Documented.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is some infrastructure which will support *print-circle*.
* lib.h (struct strm_ctx): Forward declared.
(struct cobj_ops): Add context parameter to print function
pointer.
(cobj_print_op, obj_print_impl): Add context parameter to
declarations.
* hash.c (hash_print_op): Take context argument and
pass it down in obj_print_impl calls.
* lib.c (cobj_print_op, out_quasi_str): Likewise.
(obj_print_impl): Likewise, and also pass to
COBJ print method.
(obj_print, obj_pprint): Pass null pointer
as context argument to obj_print_impl.
* regex.c (regex_print): Take context parameter and ignore it.
* socket.c (dgram_print): Likewise.
* stream.h (struct strm_ctx): New struct type.
(struct strm_base): New ctx member, pointer to struct
strm_ctx.
(stream_print_op): Add context parameter to declaration.
(get_set_ctx, get_ctx): Declared.
* stream.c (strm_base_init): Add null pointer to initializer.
(strm_base_cleanup): Add assertion against context pointer
being non-null: that indicates that some stream operation
installed a context pointer and neglected to restore it to
null before returning, which is bad because context will be
stack allocated.
(stream_print_op, stdio_stream_print, cat_stream_print): Take
context parameter and ignore it.
(get_set_ctx, get_ctx): New functions.
* struct.c (struct_type_print): Take context parameter and
ignore it.
(struct_inst_print): Take context parameter and pass
down to obj_print_impl.
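
The context discipline described for stream.c above can be sketched roughly as follows. This is a self-contained model, not TXR's code: the toy_ctx and toy_stream types and all toy_ functions are hypothetical stand-ins, and their signatures are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct strm_ctx and struct strm_base. */
struct toy_ctx { int depth; };
struct toy_stream { struct toy_ctx *ctx; };

/* Install a new context pointer and return the previous one. */
struct toy_ctx *toy_get_set_ctx(struct toy_stream *s, struct toy_ctx *ctx)
{
  struct toy_ctx *old = s->ctx;
  s->ctx = ctx;
  return old;
}

/* A print operation that installs a stack-allocated context for its
   duration and restores the previous pointer before returning. */
int toy_print_op(struct toy_stream *s)
{
  struct toy_ctx local = { 1 };
  struct toy_ctx *saved = toy_get_set_ctx(s, &local);
  int depth = s->ctx->depth;   /* work that consults the context */
  toy_get_set_ctx(s, saved);   /* restore: local is about to go away */
  return depth;
}

/* Cleanup asserts that no operation left a dangling context behind. */
void toy_stream_cleanup(struct toy_stream *s)
{
  assert(s->ctx == NULL);
}
```

If toy_print_op forgot the restoring call, the stream would keep a pointer into a dead stack frame, which is the situation the assertion is meant to catch.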
|
|
|
|
|
|
|
|
|
|
|
| |
Since much regex code assumes these are binary, the easiest
and briefest approach is to implement a code transformation
pass which rewrites n-ary forms into binary.
* regex.c (reg_nary_unfold, reg_nary_to_bin): New
functions.
(regex_compile): Put raw sexp through reg_nary_to_bin
to expand the nary syntax.
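
The kind of rewrite involved can be illustrated with a small sketch that works on printed forms rather than on TXR objects. The function name toy_nary_to_bin is hypothetical, and nesting to the right is an assumption made for the illustration; the direction used by reg_nary_unfold is not stated here.

```c
#include <stdio.h>
#include <stddef.h>

/* Writes the binary form of (op args[0] ... args[n-1]) into out,
   nesting to the right. Returns the number of characters written,
   or -1 if the buffer is too small or n < 1. */
int toy_nary_to_bin(const char *op, const char *const *args, int n,
                    char *out, size_t size)
{
  int head, tail;

  if (n < 1)
    return -1;

  if (n == 1) {
    /* A single argument stands for itself. */
    head = snprintf(out, size, "%s", args[0]);
    return (head < 0 || (size_t) head >= size) ? -1 : head;
  }

  /* Emit "(op first " and then the binary form of the remaining args. */
  head = snprintf(out, size, "(%s %s ", op, args[0]);
  if (head < 0 || (size_t) head >= size)
    return -1;

  tail = toy_nary_to_bin(op, args + 1, n - 1, out + head, size - head);
  if (tail < 0 || (size_t) (head + tail) + 1 >= size)
    return -1;

  out[head + tail] = ')';
  out[head + tail + 1] = '\0';
  return head + tail + 1;
}
```

Under these assumptions, the arguments a, b and c with the operator or come out as (or a (or b c)), so code that only understands two-argument forms can process the result unchanged.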
|
|
|
|
|
|
|
|
|
|
| |
* regex.c (reg_expand_nongreedy, reg_compile_csets):
Generalize the compound_s case slightly by referring
to sym rather than hard-coded compound_s. Then handle
most of the regex operators under this same case.
Their semantics are not relevant to the expansions
being performed in these functions: all their arguments
are regexes to be recursed over.
|
|
|
|
|
|
|
|
|
| |
* regex.c (range_regex_all, regex_range_all): New functions.
(regex_init): Register rra intrinsic function.
* regex.h (range_regex_all, regex_range_all): Declared.
* txr.1: Documented rra.
|
|
|
|
|
|
|
|
|
|
| |
* regex.c (regex_range_search): New function.
(regex_init): Register regex_range_search as rr intrinsic.
* regex.h (regex_range_search): Declared.
* txr.1: Documented rr, and added reference to it
in description of regex-range.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* regex.c (search_regex): Handle negative starting positions
according to the convention elsewhere and fail excessively
negative ones. Consistently fail on starting positions
exceeding the length of the string. Handle zero length
matches by reporting them against the start position
or position one past the last character, based on the
value of from-end.
* txr.1: search-regex documentation updated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c,
combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c,
filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h,
jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c,
parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h,
share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl,
share/txr/stdlib/cadr.tl, share/txr/stdlib/conv.tl,
share/txr/stdlib/except.tl, share/txr/stdlib/hash.tl,
share/txr/stdlib/ifa.tl, share/txr/stdlib/path-test.tl,
share/txr/stdlib/place.tl, share/txr/stdlib/socket.tl,
share/txr/stdlib/struct.tl, share/txr/stdlib/termios.tl,
share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl,
share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl,
share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h,
stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c,
syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c,
unwind.h, utf8.c, utf8.h: Revert to verbatim 2-Clause BSD.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* regex.c (match_regex): Bail if pos is too positive,
beyond length of string.
(match_regex_right): Include the pos == end case in the
iteration, so we can match an empty suffix of the string. The
inner loop guard takes care of not feeding any characters from
the string into the regex machine in this case; we just feed
the terminating zero to get the final state.
(match_regst): Normalize a negative pos, otherwise the sub_str
calculation will be junk, since match_regex returns a
normalized position. After normalizing, check that if the
position is still negative, the match must fail.
(match_regst_right_old, match_regst_right): Use zero rather
than t as the range end in sub_str. That way if len is
zero and neg(len) produces zero, an empty string will
be sliced out. For negative values, the zero serves
as one position beyond the last char, just like t.
(do_match_full_offs, regex_match_full, regex_range_full,
regex_range_left): Fail match if normalized starting
pos is negative.
(regex_range_right): Fix completely bogus calculation of the
returned range in the case when the end position defaults to
the string length.
|
|
|
|
| |
* regex.c (puts_clear_flag): Fix bad indentation.
|
|
|
|
|
|
|
|
|
| |
* regex.c (regex_source): New function.
(regex_init): regex-source intrinsic registered.
* regex.h (regex_source): Declared.
* txr.1: Documented.
|
|
|
|
|
| |
* regex.c (print_rec): Fix checking arg1
for consp but accessing arg2.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* regex.c (do_match_full, do_match_full_offs, do_match_left,
do_match_left_offs, do_match_right, do_match_right_offs):
New static functions.
(regex_match_full_fun, regex_match_right_fun,
regex_match_full, regex_match_left, regex_match_right,
regex_range_full, regex_range_left, regex_range_right):
New functions.
(regex_init): Register f^$, f^, f$, m^$, m^, m$, r^$,
r^ and r$ intrinsics.
* regex.h (regex_match_full_fun, regex_match_right_fun,
regex_match_full, regex_match_left, regex_match_right,
regex_range_full, regex_range_left, regex_range_right):
Declared.
* txr.1: Documented new functions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The way the end-position argument works in match-regex-right
and match-regst-right is poorly considered. It basically
enforces a constraint that there is a match which ends
at that position and does not go beyond. This patch changes
it to work right: the functions test that the regex matches
up to that position, as if the string ended there.
* regex.c (match_regex_right_old): New static function,
identical to the previous match_regex_right.
Since we won't ever be using this inside TXR from any
other module, we don't make it external.
(match_regex_right): Rewritten to new semantics.
(match_regst_right_old): New static function;
provides the semantics of the old match_regst_right
based on match_regex_right_old.
(regex_init): Register match-regex-right and match-regst-right
intrinsics to the match_regex_right_old and
match_regst_right_old functions if compatibility <= 150 is
requested. Otherwise they go to the rewritten new
functions.
* txr.1: Documentation updated, and compat notes
added.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The documentation says that match-regex returns the length.
Actually, it returns the position after the last character
matched. This makes a difference when the match doesn't begin
at character zero.
The actual behavior is that of the match_regex C function
which has behaved that way since the dawn of TXR, and
internals depend on it behaving that way.
So the internal function is being retained, and a new function
is being registered as the match-regex intrinsic. The choice
of binding for match-regex is subject to the compatibility
option.
The behavior of match-regst is also being fixed since its
return value is incorrect due to this issue. Since its
return value makes no sense at all (does not represent
the matched text), it is not subject to the compatibility
option; it is just fixed to conform with the documentation.
* regex.c (match_regex_len): New function.
(match_regst): Keep using match_regex, but use its return
value properly. This simplifies the range extraction code,
which is why match_regex works that way in the first place.
(regex_init): Register match-regex to match_regex_len,
unless compatibility <= 150 is requested; then register
to match_regex.
* regex.h (match_regex_len): Declared.
* txr.1: Compatibility notes added.
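
The difference between the two return conventions can be shown with a toy matcher. This is only a sketch: toy_match_end and toy_match_len are hypothetical names, and a run of digits stands in for a real regex.

```c
#include <ctype.h>

/* Toy stand-in for the internal matcher: matches one or more digits
   starting at pos and returns the position one past the last matched
   character, or -1 if nothing matches. The caller is assumed to pass
   a pos that lies within the string. */
long toy_match_end(const char *str, long pos)
{
  long end = pos;
  while (isdigit((unsigned char) str[end]))
    end++;
  return end > pos ? end : -1;
}

/* Length-returning wrapper in the spirit of match_regex_len. */
long toy_match_len(const char *str, long pos)
{
  long end = toy_match_end(str, pos);
  return end < 0 ? -1 : end - pos;
}
```

The two results coincide only when pos is zero; for a match starting at position 2, the end position and the length differ by 2.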
|
|
|
|
|
|
|
|
|
| |
* regex.c (regsub): Allow the second argument to be
a function, which is called with str as an argument,
and returns a range which indicates what part of
the string is to be replaced, or else nil.
* txr.1: Documented functional argument of regsub.
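
A rough model of the functional argument, using plain C strings: the callback receives the string and reports which part is to be replaced, or that nothing is. The names toy_range, toy_first_spaces and toy_regsub_fn are hypothetical, and a found flag stands in for the nil return.

```c
#include <stdio.h>
#include <string.h>

/* Half-open range [from, to); found == 0 plays the role of nil. */
struct toy_range { long from, to; int found; };
typedef struct toy_range (*toy_range_fn)(const char *str);

/* Example range function: locate the first run of spaces. */
struct toy_range toy_first_spaces(const char *str)
{
  struct toy_range r = { 0, 0, 0 };
  const char *p = strchr(str, ' ');
  if (p != NULL) {
    r.from = (long) (p - str);
    r.to = r.from + (long) strspn(p, " ");
    r.found = 1;
  }
  return r;
}

/* Replace the part of str selected by fn with repl. Returns 1 if a
   replacement was made, 0 if str was copied unchanged, -1 on overflow. */
int toy_regsub_fn(toy_range_fn fn, const char *repl, const char *str,
                  char *out, size_t size)
{
  struct toy_range r = fn(str);
  int n;

  if (!r.found)
    n = snprintf(out, size, "%s", str);
  else
    n = snprintf(out, size, "%.*s%s%s", (int) r.from, str, repl, str + r.to);

  if (n < 0 || (size_t) n >= size)
    return -1;
  return r.found;
}
```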
|
|
|
|
|
|
|
|
|
| |
* regex.c (match_regex, match_regex_right): Detect
a negative start or end position, respectively,
and add the string length to it. If it is still
negative, bail with nil.
* txr.1: Documented.
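
The normalization described above amounts to the following; toy_normalize_pos is a hypothetical helper over plain integers, with -1 standing in for the nil return.

```c
/* Returns the normalized position, or -1 if the position cannot be used. */
long toy_normalize_pos(long pos, long len)
{
  if (pos < 0)
    pos += len;   /* -1 denotes the last character, and so on */
  if (pos < 0)
    return -1;    /* still negative: bail */
  return pos;
}
```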
|
|
|
|
|
|
|
|
| |
* eval.c (eval_init): Remove all regex-related function
registrations from here.
* regex.c (regex_init): Move regex-related function
registrations here.
|
|
|
|
| |
* regex.c (reg_optimize): Implement ~~R -> R reduction.
|
|
|
|
|
|
|
|
| |
* regex.c (reg_optimize): Based on the reasoning in the
previous commit, we can also statically optimize a
complement whose argument is the t regex: match nothing.
We convert that to match everything: the .* regex.
Now (regex-compile "~[]") -> #/.*/.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The form (match-regex "xy" #/~ab/) should return 2 (full
match) because "xy" is in the complement of the set { "ab" }.
It wrongly returns 1.
* regex.c (reg_derivative): Handle the case when
the derivative of the complement's constituent expression
yields nil. This means that the complemented regex matches
the input. In this case, the complement must lapse to the .+
regex: match one or more characters. That is to say, if the
input has at least one more character, there is a match, which
covers all such characters. Otherwise there is no match: the
input matches the complemented regex. In the t case, the
return value is also wrong. If the complemented regex hits
a brick wall (matches nothing, not even the empty string),
the correct complement is "match everything": the .* regex.
Not the match empty string regex!
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't have to flip between two arrays, since the
nfa_closure and nfa_move_closure can write the
output set into the same array.
* regex.c (struct nfa_machine): Replace flip and flop
members with a single set.
(nfa_closure, nfa_move_closure): out array parameter removed;
in renamed to set. References to in and out simply replaced
with set.
(nfa_run): Allocate one set instead of two, plus the stack.
Remove code to swap the two pointers on each iteration.
(regex_machine_reset): Prepare initial closure in the one
and only set array.
(regex_machine_init): Allocate set array, rather than flip and
flop.
(regex_machine_cleanup): Free set array and null out pointer
rather than flip and flop arrays.
(regex_machine_feed): Pass just the set to the
nfa_move_closure function. Remove flip/flop pointer swapping.
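
Why one array suffices can be seen in a simplified sketch of an epsilon closure that extends its input set in place. This is not the code in regex.c: the toy_state layout and toy_closure are hypothetical, and the sketch uses the set array itself as the work list instead of a separate stack.

```c
/* Hypothetical NFA state: at most two epsilon successors. */
struct toy_state {
  int eps[2];        /* indices of epsilon successors, -1 if absent */
  unsigned visited;  /* stamp of the last closure that reached this state */
};

/* Extends set[0..n) in place to its epsilon closure and returns the
   new count. Assumes set has room for every state, contains no
   duplicates, and that stamp differs from any stamp stored earlier. */
int toy_closure(struct toy_state *nfa, int *set, int n, unsigned stamp)
{
  int i, k;

  for (i = 0; i < n; i++)
    nfa[set[i]].visited = stamp;

  for (i = 0; i < n; i++) {        /* n grows as states are appended */
    for (k = 0; k < 2; k++) {
      int next = nfa[set[i]].eps[k];
      if (next >= 0 && nfa[next].visited != stamp) {
        nfa[next].visited = stamp;
        set[n++] = next;           /* output lands in the same array */
      }
    }
  }
  return n;
}
```

Because every state is stamped before it is appended, cycles of epsilon transitions terminate and no state is stored twice.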
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* regex.c (struct nfa_machine_t): Remove move and clos
array pointers, replace with flip and flop. Remove
nmove member.
(nfa_move): Static function removed.
(nfa_move_closure): New static function, based on nfa_move and
logic from nfa_closure.
(nfa_run): Use nfa_move_closure and flip between two
arrays.
(regex_machine_reset): Remove reference to nmove member
in nfa_machine_t. Prepare initial closure in flip array.
(regex_machine_init): Allocate flip and flop arrays,
rather than removed move and clos.
(regex_machine_cleanup): Free flip and flop arrays and
zero out the pointers, rather than removed move and clos.
(regex_machine_feed): Replace nfa_move and nfa_closure
with combined nfa_move_closure from flip to flop,
and exchange of flip and flop arrays.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Although we are garbage-collected, being able to clean up on
shutdown is nevertheless useful for uncovering leaks. Leaks
can occur, for instance, due to neglect to free out-of-heap
satellite data from objects that are reclaimed by gc.
This feature is long overdue.
* arith.c, arith.h (arith_free_all): New function.
* gc.c, gc.h (gc_free_all): New function.
* lib.c (init): Remove program name parameter and
redundant initialization of progname global variable.
* lib.h (progname): Superfluous declaration removed.
This is already declared in txr.h.
(init): Declaration updated.
* regex.c (char_set_destroy): Do not check the static
allocation flag here; just destroy the object.
Do check for a null pointer, though.
(char_set_cobj_destroy): This cobj destructor now
checks the static flag of the char set object and
avoids freeing it. Thus our char set singletons are
left alone by gc, but our global freeing function
takes care of them.
(wide_cs): New static variable moved out of
wide_display_char_p to static scope.
(regex_free_all): New function.
* regex.h (regex_free_all): Declared.
* txr.c (progname): const qualifier and initializer removed.
(main): Ensure progname is always dynamically allocated, even
in the argv[0] == 0 case. Do not pass progname to init;
it doesn't take that argument any more.
(free_all): New static function.
(txr_main): Implement --free-all option.
* txr.h (progname): Declaration updated.
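
The split of responsibilities between the two char set destructors can be sketched like this. The toy_cset type and both functions are hypothetical models of the arrangement described above, not the definitions in regex.c.

```c
#include <stdlib.h>

/* Hypothetical char set object with a static-allocation flag. */
struct toy_cset { int is_static; int *data; };

/* Unconditional destroy: tolerates a null pointer, frees everything.
   This is what a global free-all routine would call. */
void toy_cset_destroy(struct toy_cset *cs)
{
  if (cs == NULL)
    return;
  free(cs->data);
  free(cs);
}

/* Destructor used on behalf of the collector: leaves static singletons
   alone. Returns 0 if the object was skipped, 1 otherwise. */
int toy_cset_gc_destroy(struct toy_cset *cs)
{
  if (cs != NULL && cs->is_static)
    return 0;
  toy_cset_destroy(cs);
  return 1;
}
```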
|
|
|
|
|
|
|
|
|
|
|
| |
* lib.c (rcyc_pop): Just assume that *plist points to a cons
and access the fields directly.
(rcyc_cons): Don't bother with rplacd.
(rcyc_list): Don't bother with set macro.
* regex.c (read_until_match): Defensive coding: locally
ensure that rcyc_pop won't be called on a nil stack,
which will now segfault.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* regex.c (read_until_match): Use rcyc_pop instead of pop
to move the conses to the recycle list. We know these
are not shared with anything. Add additional logic
to completely recycle the stack.
* socket.c (dgram_get_char): Use rcyc_pop to
get the character from the push-back list.
* stream.c (stdio_get_char): Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* regex.c (read_until_match): New argument, include_match.
Three times repeated termination code refactored into block
reached by forward goto.
(regex_init): Registration of read-until-match updated.
* regex.h (read_until_match): Declaration updated.
* stream.c (struct record_adapter_base): New member,
include_match.
(record_adapter_get_line): Pass match to read_until_match
as new argument.
(record_adapter): New argument, include_match.
(stream_init): Update registration of record-adapter.
* stream.h (record_adapter): Declaration updated.
* txr.1: Updated.
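
The shape of that refactoring, with several exit conditions funneled into one termination block through a forward goto, looks roughly like this. The function toy_read_until is a hypothetical analogue working on C strings with a single delimiter character rather than a stream and a regex.

```c
#include <stddef.h>

/* Copies characters from in to out until delim is seen, the input ends,
   or out is full. If include_match is nonzero, the delimiter is copied
   too. Returns the number of characters stored, not counting the
   terminator. size must be at least 1. */
size_t toy_read_until(const char *in, char delim, int include_match,
                      char *out, size_t size)
{
  size_t n = 0;

  for (;; in++) {
    if (*in == '\0')
      goto done;               /* end of input */
    if (n + 1 >= size)
      goto done;               /* room left only for the terminator */
    if (*in == delim) {
      if (include_match)
        out[n++] = *in;
      goto done;               /* delimiter found */
    }
    out[n++] = *in;
  }

done:
  out[n] = '\0';               /* termination code, written once */
  return n;
}
```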
|
|
|
|
|
| |
* regex.c (read_until_match): Completely rewrite broken,
unsalvageable, garbage logic.
|
|
|
|
|
|
|
| |
* arith.c, cadr.c, debug.c, eval.c, filter.c, gencadr.txr, glob.c,
hash.c, linenoise/linenoise.c, lisplib.c, match.c, parser.c, rand.c,
regex.c, signal.c, stream.c, struct.c, sysif.c, syslog.c, txr.c,
unwind.c, utf8.c: Remove unnecessary header files.
|
|
|
|
|
| |
* regex.c (print_rec): Handle '[' and ']' in backslash-adding
switch.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* lib.c (out_str_char): Static function becomes extern.
* lib.h (out_str_char): Declared.
* regex.c (puts_clear_flag, putc_clear_flag): New static
functions.
(print_class_char): Take semicolon flag argument.
Use out_str_char to render characters not escaped locally.
Clear the semicolon flag.
(paren_print_rec): Take semicolon flag argument, and pass it
down. Clear it when printing parentheses.
(print_rec): Take semicolon flag argument, and pass
down to lower level functions. Use putc_clear_flag and
puts_clear_flag instead of put_string and put_char.
Use out_str_char for char object not escaped locally.
(regex_print): Define semi_flag and pass it down
to print_rec.
|
|
|
|
| |
* regex.c (print_class_char): Add missing character cases.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* regex.c (read_until_match): New function.
(regex_init): Registered read-until-match intrinsic.
* regex.h (read_until_match): Declared.
* stream.c (struct delegate_base): New struct type.
(delegate_base_mark, delegate_put_string, delegate_put_char,
delegate_put_byte, delegate_get_char, delegate_get_byte,
delegate_unget_char, delegate_unget_byte, delegate_close,
delegate_flush, delegate_seek, delegate_truncate,
delegate_get_prop, delegate_set_prop, delegate_get_error,
delegate_get_error_str, delegate_clear_error,
make_delegate_stream): New static functions.
(struct record_adapter_base): New struct type.
(record_adapter_base_mark, record_adapter_mark_op,
record_adapter_get_line): New static functions.
(record_adapter_ops): New static structure.
(record_adapter): New function.
(stream_init): Registered record-adapter intrinsic.
* stream.h (record_adapter): Declared.
* txr.1: Documented read-until-match and record-adapter.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* LICENSE, METALICENSE, Makefile, args.c, args.h, arith.c,
arith.h, cadr.c, cadr.h, combi.c, combi.h, configure,
debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, gc.c,
gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h,
lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h,
parser.l, parser.y, rand.c, rand.h, regex.c, regex.h,
share/txr/stdlib/cadr.tl, share/txr/stdlib/except.tl,
share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl,
share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl,
share/txr/stdlib/struct.tl, share/txr/stdlib/txr-case.tl,
share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl,
share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl,
signal.c, signal.h, stream.c, stream.h, struct.c, struct.h,
sysif.c, sysif.h, syslog.c, syslog.h, txr.1, txr.c, txr.h,
unwind.c, unwind.h, utf8.c, utf8.h: Add 2016 copyright.
* linenoise/LICENSE, linenoise/linenoise.c,
linenoise/linenoise.h: Bump one principal author's copyright
from 2014 to 2015. The code is based on a snapshot of 2015
upstream work.
|
|
|
|
|
|
|
|
|
|
|
| |
* regex.c (range_regex): Return range.
(search_regst): Use appropriate accessors on
range returned by range_regex.
* lib.c (tok_where): Destructure range returned by
range_regex, using range_bind.
* txr.1: Documented changed behavior.
|
|
|
|
|
|
|
|
| |
* regex.c (search_regex): In the Sep 7 2015 commit
titled "Don't use prot1 for temporary gc protection",
a rel1 call was left behind, causing an assert whenever
the function is used for a successful "from end"
search.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TXR is moving to custom assembly-language routines.
This is mainly motivated by a very dubious thing done in the
GNU C Library setjmp and longjmp in the name of security.
Evidently, glibc's setjmp "mangles" certain pointer values
which are stored into the jmp_buf buffer. It's been that way
since 2005, evidently. This means that, firstly, all along,
the use of setjmp in gc.c to get registers into a buffer so
they can be scanned has not actually worked properly. More
importantly, this pointer mangling in setjmp and longjmp is
very hostile to a stack copying implementation of delimited
continuations. The reason is that continuations contain
jmp_buf buffers, which get relocated in the process of
capturing and reviving a continuation. Any pointers in a
jmp_buf which point into the captured stack segment have to be
fixed up to point into the relocated location. Mangled
pointers make this difficult, requiring hacks which are
specific to glibc and the machine architecture. We might as
well implement a clean, well-behaved setjmp and longjmp.
* Makefile (jmp.o): New object file.
(dbg/%.o, opt/%.o): New rules for .S prerequisites.
* args.c, arith.c, cadr.c, combi.c, debug.c,
eval.c, filter.c, glob.c, hash.c, lib.c, match.c, parser.c,
rand.c, regex.c, signal.c, stream.c, struct.c, sysif.c,
syslog.c, txr.c, unwind.c, utf8.c: Removed <setjmp.h>
include.
* gc.c: Switch to struct jmp and jmp_save, instead
of jmp_buf and setjmp.
* jmp.S: New source file.
* signal.h (struct jmp): New struct type.
(jmp_save, jmp_restore): New function declarations
denoting assembly language routines in jmp.S.
(extended_jmp_buf): Uses struct jmp instead of
setjmp.
(extended_setjmp): Use jmp_save instead of setjmp.
(extended_longjmp): Use jmp_restore instead of
longjmp.
|
|
|
|
|
|
|
| |
* regex.c (reg_optimize): If the empty regex is and-ed with
another regex, that other regex must be nullable, otherwise
the and matches nothing. This is captured in some new
reductions for the and operator.
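
The reasoning behind that reduction can be written down over a toy regex representation. The toy_kind and toy_re types and both functions are hypothetical, and the actual reductions in reg_optimize may be more elaborate than this sketch.

```c
/* Hypothetical regex node kinds: empty matches only the empty string,
   reject matches nothing, char matches one character, star is R*. */
enum toy_kind { TOY_EMPTY, TOY_REJECT, TOY_CHAR, TOY_STAR };

struct toy_re {
  enum toy_kind kind;
  const struct toy_re *arg;   /* operand of TOY_STAR, otherwise unused */
};

/* A regex is nullable if it matches the empty string. */
int toy_nullable(const struct toy_re *r)
{
  return r->kind == TOY_EMPTY || r->kind == TOY_STAR;
}

/* Reduce (and empty R): the only string the empty regex matches is the
   empty string, so the intersection is empty if R is nullable and
   matches nothing otherwise. */
enum toy_kind toy_and_with_empty(const struct toy_re *r)
{
  return toy_nullable(r) ? TOY_EMPTY : TOY_REJECT;
}
```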
|
|
|
|
|
|
| |
* regex.c (reg_optimize): No need to check reg_matches_all in
and optimization case because the argument object has already
been reduced that way by reg_optimize recursion.
|
|
|
|
|
|
| |
* regex.c (reg_compl_char_p): New static function.
(reg_optimize): Optimize various cases of the
or operator: (R|) -> R?, (a|b) -> [ab] and others.
|
|
|
|
|
|
| |
* regex.c (reg_optimize): Simplify compounded
uses of repetition operators: RR* -> R, R+? -> R*
and so on.
|
|
|
|
|
| |
* regex.c (print_rec): Bugfix: handle symbols in character
class syntax.
|
|
|
|
|
| |
* regex.c (reg_optimize): Transform ~.*c to (.*[^c])?
and ~c.* to ([^c].*)? where c is a single-character match.
|
|
|
|
|
|
|
| |
* regex.c (reg_single_char_p, invert_single): New static
functions.
(reg_optimize): Simplify complement operator optimizations
using new functions.
|
|
|
|
|
| |
* regex.c (reg_optimize): [a] -> a. Also take advantage
of this where the complement case generates [a].
|
|
|
|
|
| |
* regex.c (reg_optimize): Recognize and transform several
cases: ~c -> ([^c]?|..+); ~[^c] -> ([c]?|..+); and ~.*c.* -> [^c]*.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* regex.c (dv_compile_regex): Replaced by two functions,
reg_expand_nongreedy and reg_compile_csets.
(reg_expand_nongreedy, reg_compile_csets): New static
functions.
(reg_optimize): New static function.
(regex_compile): Expand nongreedy syntax in incoming regex,
and then optimize it before deciding whether to use NFA or
derivatives. If derivatives are used, compile the
character sets in the regex to character set objects.
(regex_init): Register some intrinsic functions for debugging,
sys:reg-expand-nongreedy and sys:reg-optimize.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The t regex means "match nothing". This patch allows the NFA
compiler to handle it. This will be necessary for an upcoming
regex optimizer which can put out such an object. Also, the
recursive regex printer can print the object now.
* regex.c (nfa_kind_t): New enum member, nfa_reject.
(nfa_state_reject): New static function.
(nfa_compile_regex): Compile t regex into a reject
state which cannot reach its corresponding acceptance
state.
(nfa_map_states): Handle nfa_reject case in switch, similarly
to nfa_accept: nothing to transition into.
(print_rec): Render the t regex as the empty character class [].
|
|
|
|
|
|
| |
* regex.c (nfa_compile_regex, dv_compile_regex, reg_nullable,
reg_matches_all, reg_derivative, regex_requires_dv): Throw an
exception for the bad operator case.
|