summaryrefslogtreecommitdiffstats
path: root/regex.c
Commit message (Collapse)AuthorAgeFilesLines
* Recycle conses in unget-char and read-until-match.Kaz Kylheku2016-04-201-3/+7
| | | | | | | | | | | | * regex.c (ead_until_match): Use rcyc_pop instead of pop to move the conses to the recycle list. We know these are not shared with anything. Adding additional logic to completely recycle the stack. * socket.c (dgram_get_char): Use rcyc_pop to get the character from the push-back list. * stream.c (stdio_get_char): Likewise.
* read-until-match can optionally keep matched text.Kaz Kylheku2016-04-201-21/+19
| | | | | | | | | | | | | | | | | | | | * regex.c (read_until_match): New argument, include_match. Three times repeated termination code refactored into block reached by forward goto. (regex_init): Registration of read-until-match updated. * regex.h (read_until_match): Declaration updated. * stream.c (struct record_adapter_base): New member, include_match. (record_adapter_get_line): Pass match to read_until_match as new argument. (record_adapater): New argument, include_match. (stream_init): Update registration of record-adapter. * stream.h (record_adapter): Declaration updated. * txr.1: Updated.
* Fix broken read_until_match.Kaz Kylheku2016-04-191-17/+51
| | | | | * regex.c (read_until_match): Completely rewrite broken, unsalvageable, garbage logic.
* Header file cleanup.Kaz Kylheku2016-01-221-1/+0
| | | | | | | * arith.c, cadr.c, debug.c, eval.c, filter.c, gencadr.txr, glob.c, hash.c, linenoise/linenoise.c, lisplib.c, match.c, parser.c, rand.c, regex.c, signal.c, stream.c, struct.c, sysif.c, syslog.c, txr.c, unwind.c, utf8.c: Remove unncessary header files.
* Regex printing not escaping [ and ].Kaz Kylheku2016-01-121-1/+2
| | | | | * regex.c (print_rec): Handle '[' and ']' in backslash-adding switch.
* Print control chars in regexes using \x.Kaz Kylheku2016-01-121-53/+70
| | | | | | | | | | | | | | | | | | | | * lib.c (out_str_char): Static function becomes extern. * lib.h (out_str_char): Declared. * regex.c (puts_clear_flag, putc_clear_flag): New static functions. (print_class_char): Take semicolon flag argument. Use out_str_char to render characters not escaped locally. Clear the semicolon flag. (paren_print_rec): Take semicolon flag argument, and pass it down. Clear it when printing parentheses. (print_rec): Take semicolon flag argument, and pass down to lower level functions. Use putc_clear_flag and puts_clear_flag instead of put_string and put_char. Use out_str_char for char object not esaped locally. (regex_print): define semi_flag and pass it down to print_rec.
* regex_print: [ and ] in char class must be escaped.Kaz Kylheku2016-01-121-1/+1
| | | | * regex.c (print_class_char): Add missing character cases.
* Record-delimiting stream adapter.Kaz Kylheku2016-01-011-0/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | * regex.c (read_until_match): New function. (regex_init): Registered read-until-match intrinsic. * regex.h (read_until_match): Declared. * stream.c (struct delegate_base): New struct type. (delegate_base_mark, delegate_put_string, delegate_put_char, delegate_put_byte, delegate_get_char, delegate_get_byte, delegate_unget_char, delegate_unget_byte, delegate_close, delegate_flush, delegate_seek, delegate_truncate, delegate_get_prop, delegate_set_prop, delegate_get_error, delegate_get_error_str, delegate_clear_error, make_delegate_stream): New static functions. (struct record_adapter_base): New struct type. (record_adapter_base_mark, record_adapter_mark_op, record_adapter_get_line): New static functions. (record_adapter_ops): New static structure. (record_adapter): New function. (stream_init): Registered record-adapter intrinsic. * stream.h (record_adapter): Declared. * txr.1: Documented read-until-match and record-adapter.
* Copyright year bump.Kaz Kylheku2015-12-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | * LICENSE, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/cadr.tl, share/txr/stdlib/except.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c, syslog.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Add 2016 copyright. * linenoise/LICENSE, linenoise/linenoise.c, linenoise/linenoise.h: Bump one principal author's copyright from 2014 to 2015. The code is based on a snapshot of 2015 upstream work.
* range-regex returns range, not cons.Kaz Kylheku2015-12-071-2/+2
| | | | | | | | | | | * regex.c (range_regex): Return range. (search_regst): Use appropriate accessors on range returned by range_regex. * lib.c (tok_where): Destructure range returned by range_regex, using range_bind. * txr.1: Documented changed behavior.
* Fix serious regression in search_regex.Kaz Kylheku2015-11-061-3/+1
| | | | | | | | * regex.c (search_regex): In the Sep 7 2015 commit titled "Don't use prot1 for temporary gc protection", a rel1 call was left behind, causing an assert whenever the function is used for a succesful "from end" search.
* Stop using C library setjmp/longjmp.Kaz Kylheku2015-10-251-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TXR is moving to custom assembly-language routines. This is mainly motivated by a very dubious thing done in the GNU C Library setjmp and longjmp in the name of security. Evidently, glibc's setjmp "mangles" certain pointer values which are stored into the jmp_buf buffer. It's been that way since 2005, evidently. This means that, firstly, all along, the use of setjmp in gc.c to get registers into a buffer so they can be scanned has not actually worked properly. More importantly, this pointer mangling in setjmp and longjmp is very hostile to a stack copying implementation of delimited continuations. The reason is that continuations contain jmp_buf buffers, which get relocated in the process of capturing and reviving a continuation. Any pointers in a jmp_buf which point into the captured stack segment have to be fixed up to point into the relocated location. Mangled pointers make this difficult, requiring hacks which are specific to glibc and the machine architecture. We might as well implement a clean, well-behaved setjmp and longjmp. * Makefile (jmp.o): New object file. (dbg/%.o, opt/%.o): New rules for .S prerequisites. * args.c, arith.c, cadr.c, combi.c, cadr.c, combi.c, debug.c, eval.c, filter.c, glob.c, hash.c, lib.c, match.c, parser.c, rand.c, regex.c, signal.c, stream.c, struct.c, sysif.c, syslog.c, txr.c, unwind.c, utf8.c: Removed <setjmp.h> include. * gc.c: Switch to struct jmp and jmp_save, instead of jmp_buf and setjmp. * jmp.S: New source file. * signal.h (struct jmp): New struct type. (jmp_save, jmp_restore): New function declarations denoting assembly language routines in jmp.S. (extended_jmp_buf): Uses struct jmp instead of setjmp. (extended_setjmp): Use jmp_save instead of setjmp. (extended_longjmp): Use jmp_restore instead of longjmp.
* Additional reductions for and.Kaz Kylheku2015-09-291-0/+6
| | | | | | | * regex.c (reg_optimize): If the empty regex is and-ed with another regex, that other regex must be nullable, otherwise the and matches nothing. This is captured in some new reductions for the and operator.
* Simplify and optimization.Kaz Kylheku2015-09-291-4/+1
| | | | | | * regex.c (reg_optimize): No need to check reg_matches_all in and optimization case because the argument object has already been reduced that way by reg_optimize recursion.
* Optimize some cases of the regex branch operator.Kaz Kylheku2015-09-291-0/+43
| | | | | | * regex.c (reg_compl_char_p): New static function. (reg_optimize): Optimize various cases of the or operator: (R|) -> R?, (a|b) -> [ab] and others.
* Some optimizations for * ? and +.Kaz Kylheku2015-09-291-4/+21
| | | | | | * regex.c (regex_optimize): Simplify compounded uses of repetition operators: RR* -> R, R+? -> R* and so on.
* Regex printer fails on \w, \s or \d in char class.Kaz Kylheku2015-09-291-0/+2
| | | | | regex.c (print_rec): Bugfix: handle symbols in character class syntax.
* More complement optimizations.Kaz Kylheku2015-09-281-0/+19
| | | | | * regex.c (reg_optimize): Transform ~.*c to (.*[^c])? and ~c.* to ([^c].*)? where c is a single-character match.
* Streamline some regex optimizations.Kaz Kylheku2015-09-281-15/+48
| | | | | | | * regex.c (reg_single_char_p, invert_single): New static functions. (reg_optimize): Simplify complement operator optimizations using new functions.
* Optimization for one-character range.Kaz Kylheku2015-09-271-2/+7
| | | | | * regex.c (reg_optimize): [a] -> a. Also take advantage of this where the complement case generates [a].
* Optimize complement operator more.Kaz Kylheku2015-09-271-0/+28
| | | | | * regex.c (reg_optimize): Recognize and transform several cases: ~c -> ([^c]?|..+); ~[^c] -> ([c]?|..+); and ~.*c.* -> [^c]*.
* S-exp level regex optimization.Kaz Kylheku2015-09-271-32/+156
| | | | | | | | | | | | | | * regex.c (dv_compile_regex): Replaced by two functions, reg_expand_nongreedy and reg_compile_csets. (reg_expand_nongreedy, reg_compile_csets): New static functions. (reg_optimize): New static function. (regex_compile): Expand nongreedy syntax in incoming regex, and then optimize it before deciding whether to use NFA or derivatives. If derivatives are used, compile the character sets in the regex to character set objects. (regex_init): Register some intrinsic functions for debugging, sys:reg-expand-nongreedy and sys:reg-optimize.
* Support t regex in NFA compiler and in printer.Kaz Kylheku2015-09-271-1/+16
| | | | | | | | | | | | | | | | The t regex means "match nothing". This patch allows the NFA compiler to handle it. This will be necessary for an upcoming regex optimizer which can put out such an object. Also, the recursive regex printer can print the object now. * regex.c (nfa_kind_t): New enum member, nfa_reject. (nfa_state_reject): New static function. (nfa_compile_regex): Compile t regex into a reject state which cannot reach its corresponding acceptance state. (nfa_map_states): Handle nfa_reject case in switch, similarly to nfa_accept: nothing to transition into. (print_rec): Render the t regex as the empty character class [].
* Replace internal_error with exception throws in regex.Kaz Kylheku2015-09-271-7/+7
| | | | | | * regex.c (nfa_compile_regex, dv_compile_regex, reg_nullable, reg_matches_all, reg_derivative, regex_requires_dv): Throw an exception for the bad operator case.
* Bug in complement case of reg_matches_all.Kaz Kylheku2015-09-271-1/+2
| | | | | | * regex.c (reg_matches_all): A complement matches all if its argument matches nothing, not if its argument is anything but the empty match nil.
* regex: major optimization for complement operator.Kaz Kylheku2015-09-241-1/+46
| | | | | | | | | | | This change a huge improvement for expressions that use complement, directly or via the non-greedy % operator. * regex.c (reg_matches_all): New static function. (reg_derivative): When the dervative is applied to a complement expression, identify situations when the remaining expression cannot possibly match anything, and convert them to the t expression.
* Regex state-marking counter wraparound bug.Kaz Kylheku2015-09-151-1/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a NFA regex goes through more than 4.29 billion state transitions, the state coloring "visited" marker wraps around. There could still exist states with old values at or near zero, which destroys the correctness of the closure calculations. * regex.c (nfa_handle_wraparound): New static function. The wraparound situation is handled by detecting when the next marker value is UINT_MAX. When this happens, we visit all states, marking them to UINT_MAX. Then we visit them again, marking them to zero, and set the next marker value to 1. (nfa_free): Added comment about why we don't have a wraparound check, in case it isn't obvious. (nfa_run): Check for wraparound before eveyr nfa_closure call. (regex_machine_reset): Check for wraparound before nfa_closure call. Fix: store the counter back in the start state's visited field. (regex_machine_init): Initialize the n.visited field of the regex machine structure to zero. Not strictly necessary, since it's initialized moments later in regex_machine_reset, but good form. (regex_machine_feed): Check for wraparound before nfa_closure call.
* Use alloca for some temporary arrays in regex module.Kaz Kylheku2015-09-151-11/+5
| | | | | * regex.c (nfa_free): Use alloca for array of all states. (nfa_run): Use alloca for move, closure and stack arrays.
* Remove limit on NFA state size and allocate tightly.Kaz Kylheku2015-09-151-62/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | * regex.c (struct regex): New member, nstates. (NFA_SET_SIZE): Preprocessor symbol removed. (struct nfa_machine): New member, nstates. (nfa_all_states): Function removed. (nfa_map_states): New static function. (nfa_count_one, nfa_count_states, nfa_collect_one): New static functions. (nfa_free): Takes nstates argument. Calculate array of all states using nfa_map_states over nfa_collect_one rather than nfa_all_states. The array is tightly allocated. Also the spanning tree traversal needs just one root, nfa.start. It's not clear why nfa_all_states used nfa.start and nfa.accept as roots. (nfa_closure): Takes nstates parameter; array bounds checking performed tightly against nstates rather than NFA_SET_SIZE. (nfa_move): Check against NFA_SET_SIZE removed. (nfa_run): Take nstates argument. Allocate arrays tightly. Pass nstates to nfa_closure. (regex_destroy): Pass regex->nstates to nfa_free. (regex_compile): Initialize regex->nstates. (regex_run): Pass regex->nstates to nfa_run. (regex_machine_reset): Pass nstates to nfa_closure. (regex_machine_init): Initialize n.nstates member of regex machine. Allocate arrays tightly. (regex_machine_feed): Pass nstates to nfa_closure.
* Fix memory leak in regexes.Kaz Kylheku2015-09-141-1/+1
| | | | | | * regex.c (nfa_free): The visited marker must be incremented, otherwise nfa_all_states will only collect start and accept.
* Don't use prot1 for temporary gc protection.Kaz Kylheku2015-09-071-3/+1
| | | | | | | | | | | | * lib.c (split_str, split_str_set, list_str, int_str): Use gc_hint rather than prot1/rel1. More efficient, doesn't use space in the prot_stack array. * regex.c (search_regex): Likewise. * stream.c (vformat_str, formatv, run): Likewise. In formatv, rel1 wasn't being called in the uw_unwind block, so this fixes a bug.
* Count East Asian Wide and Full Fidth chars as two columns.Kaz Kylheku2015-08-101-0/+66
| | | | | | | | | | | | | | | | | * regex.c (create_wide_cs): New static function. (wide_display_char_p): New function. * regex.h (wide_display_char_p): Declared. * stream.c (put_string, put_char): Use wide_display_char_p to determine whether an extra column need be counted. Also bugfix: iswprint evidently cannot be relied to work over the entire Unicode range, at least not in the C locale. Glibc's version and is reporting valid Japanese characters as unprintable on Ubuntu. As a hack we instead check for control characters and invert the result: control chars are unprintable. * tests/009/json.expected: Updated.
* Pass pretty flag to cobj print operation.Kaz Kylheku2015-08-011-2/+3
| | | | | | | | | | | | | | | | | | | | | * hash.c (hash_print_op): Take third argument, and call cobj_print_impl rather than cobj_print. * lib.c (cobj_print_op): Take third argument. The object class is * printed with obj_print_impl. (obj_print_impl): Static function becomes extern. Passes its pretty flag argument to cobj print virtual function. * lib.h (cobj_ops): print takes third argument. (cobj_print_op): Declaration updated. (obj_print_impl): Declared. * regex.c (regex_print): Takes third argument, and ignores it. * stream.c (stream_print_op, stdio_stream_print, cat_stream_print): Take third argument, and ignore it. * stream.h (stream_print_op): Declaration updated.
* Correction to COBJ initialization pattern.Kaz Kylheku2015-07-301-2/+2
| | | | | | | | | | | | | In fact, the previosuly documented process is not correct and still leaves a corruption problem under generational GC (which has been the default for some time). * HACKING: Document flaw in the initialization pattern previously thought to be correct, and show fix. * hash.c (copy_hash): Fix instance of incorrect pattern. * regex.c (regex_compile): Likewise.
* Bugfix: throwing error when trying to print valid regexps.Kaz Kylheku2015-04-191-1/+1
| | | | | | * regex.c (print_rec): Only dianose "bad object in regex syntax" for some atom other than nil, which denotes an empty (sub)expression, like what results from #// or #/a|/ and such.
* * regex.c (match_regex_right): Bugfix: zero length matchesKaz Kylheku2015-02-201-1/+1
| | | | | should return zero length, rather than nil. This is achieved by trying the match at one past the last character.
* String-returning wrappers for some regex matching functions.Kaz Kylheku2015-02-201-0/+21
| | | | | | | | | | | * eval.c (eval_init): Register search-regst, match-regst and match-regst-right intrinsics. * regex.c (search_regst, match_regst, match_regst_right): New functions. * regex.h (search_regst, match_regst, match_regst_right): Declared. * txr.1: Documented new variants.
* * regex.c (print_rec): A compound must use parentheses forKaz Kylheku2015-02-151-2/+8
| | | | elements which have a higher precedence than catenation.
* Update copyright notices from 2014 to 2015.Kaz Kylheku2015-02-011-1/+1
| | | | | | | | | | | * arith.c, arith.h, combi.c, combi.h, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, hash.c, hash.h, lib.c, lib.h, match.c, match.h, parser.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, sysif.c, sysif.h, syslog.c, syslog.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Update. * LICENSE, METALICENSE: Likewise.
* Use macro to initialize cobj_ops.Kaz Kylheku2015-01-291-14/+10
| | | | | | | | | | * lib.h (cobj_ops_init): New macro. * hash.c (hash_ops, hash_iter_ops): Initialize with cobj_ops_init. * rand.c (random_state_ops): Likewise. * regex.c (char_set_obj_ops, regex_obj_ops): Likewise.
* * Makefile: Removing trailing spaces.Kaz Kylheku2014-10-241-16/+16
| | | | | | | | | | (GREP_CHECK): New macro. (enforce): Rewritten using GREP_CHECK, with new checks. * arith.c, combi.c, debug.c, eval.c, filter.c, gc.c, hash.c, lib.c, * lib.h, match.c, parser.l, parser.y, rand.c, regex.c, signal.c, * signal.h, stream.c, syslog.c, txr.c, unwind.c, utf8.c: Remove trailing spaces.
* Converting cast expressions to macros that are retargettedKaz Kylheku2014-10-171-67/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to C++ style casts when compiling as C++. * lib.h (strip_qual, convert, coerce): New casting macros. (TAG_MASK, tag, type, wli_noex, auto_str, static_str, litptr, num_fast, chr, lit_noex, nil, nao): Use cast macros. * arith.c (mul, isqrt_fixnum, bit): Use cast macros. * configure (INT_PTR_MAX): Define using cast macro. * debug.c (debug_init): Use cast macro. * eval.c (do_eval, expand_macro, reg_op, reg_mac, eval_init): Use cast macros. * filter.c (filter_init): Use cast macro. * gc.c (more, mark_obj, in_heap, mark, sweep_one, unmark): Use cast macros. * hash.c (hash_double, equal_hash, eql_hash, hash_equal_op, hash_hash_op, hash_print_op, hash_mark, make_hash, make_similar_hash, copy_hash, gethash_c, gethash, gethash_f, gethash_n, remhash, hash_count, get_hash_userdata, set_hash_userdata, hash_iter_destroy, hash_iter_mark, hash_begin, hash_uni, hash_diff, hash_isec): Use cast macros. * lib.c (code2type, chk_malloc, chk_malloc_gc_more, chk_calloc, chk_realloc, chk_strdup, num, c_num, string, mkstring, mkustring, upcase_str, downcase_str, string_extend, sub_str, cat_str, trim_str, c_chr, vector, vec_set_length, copy_vec, sub_vec, cat_vec, cobj_print_op, obj_init): Likewise. * match.c (do_match_line, hv_trampoline, match_files, dir_tables_init): Likewise. * parser.l (grammar): Likewise. * parser.y (parse): Likewise. * rand.c (make_state, make_random_state, random_fixnum, random): Likewise. * regex.c (CHAR_SET_L2_LO, CHAR_SET_L2_HI, CHAR_SET_L1_LO, CHAR_SET_L1_HI, CHAR_SET_L0_LO, CHAR_SET_L0_HI, L0_full, L0_fill_range, L1_full, L1_fill_range, L1_contains, L1_free, L2_full, L2_fill_range, L2_contains, L2_free, L3_fill_range, L3_contains, L3_free, char_set_create, char_set_cobj_destroy, nfa_state_accept, nfa_state_empty, nfa_state_single, nfa_state_wild, nfa_state_set,
* C++ upkeep.Kaz Kylheku2014-10-141-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TXR's support for compiling as C++ pays off: C++ compiler finds serious bugs introduced in August 2 ("Big switch to reentrant lexing and parsing"). The yyerror function was being misused; some of the calls reversed the scanner and parser arguments. Since one of the two parameters is void *, this reversal wasn't caught. * parser.l (yyerror): Fix first two arguments being reversed. (num_esc): Change previously correct call to yyerror to follow reversed arguments, so that it stays correct. * parser.y (%parse-param): Change order of these directives so that the scnr parameter is before the parser parameter. This causes the yacc-generated calls to yyerror to have the arguments in the correct order. It also has the effect of changing the signature of yyparse, reversing its parameters. (parse): Update call to yyparse to new argument order. * parser.h (yyparse): Declaration removed. (yyerror): Declaration updated. * regex.c (regex_kind_t): New enum typedef. (struct regex): Use regex_kind_t rather than an enum inside the struct, which has different scoping rules under C++. * txr.c (get_self_path): Fix signed/unsigned warning.
* Version 99.txr-99Kaz Kylheku2014-10-051-0/+1
| | | | | | | | | | | | * RELNOTES: Updated. * configure, txr.1: Bumped version. * share/txr/stdlib/ver.txr: Likewise * Makefile: Improve binary packaging rules. * regex.c: #include <stdarg.h> added.
* Printing of regular expression objects implemented.Kaz Kylheku2014-10-041-1/+151
| | | | | | | | * regex.c (regex_print): New static function. (regex_obj_ops): Registered regex_print. (print_class_char, paren_print_rec, print_rec): New static functions. * dep.mk: Regenerated.
* Keep regex source code in regex objects, in anticipationKaz Kylheku2014-10-041-2/+13
| | | | | | | | | of pretty-printing. Fix object construction bugs. * regex.c (struct regex): New member, source. (regex_mark): Ensure source is visited by garbage collector. (regex_compile): Store regex_sexp in source. Fix violations of section 3.2 of HACKING document.
* Using unified COBJ representation for both regex kinds,Kaz Kylheku2014-10-021-29/+42
| | | | | | | | | | | | | | | | | | | | | | | | rather than the list-based notation for derivative-based regexes, and an encapsulated COBJ for NFA-based regexes. * lib.c (compiled_regex_s): Variable removed. (obj_init): Initialization of compiled_regex_s removed. * lib.h (compiled_regex_s): Declaration removed. * regex.c (struct regex, regex_t): New type. (regex_destroy): Object is now a regex_t, not nfa_t. (regex_mark): New function. (regex_obj_ops): Register regex_mark operation. (reg_nullable, reg_derivative): Remove cases that handles compiled_regex_s. (regex_compile): Output of dv_compile_regex becomes a cobj nwo. Output of nfa_compile_regex must be embedded in regex_t structure. (regexp): Drop the check for compiles_regex_s. (regex_nfa): Function removed. (regex_run, regex_machine_init): Use cobj_handle to retrieve regex_t * pointer and dispatch appropriate code based on regex->kind.
* GC correctness fixes: make sure we pin down objects for which we borrowKaz Kylheku2014-08-251-1/+8
| | | | | | | | | | | | | | | low level C pointers, while we execute code that can cons memory. * lib.c (list_str): Protect the str argument. (int_str): Likewise. * regex.c (search_regex): protect the haystack string, while using the h pointer to its data, since regex_run can use the derivative-based engine which conses. * stream.c (vformat_str): Protect str argument, since put_char might conceivably cons. (vformat): Protect fmtstr.
* * Makefile, arith.c, arith.h, combi.c, combi.h, configure, debug.c,Kaz Kylheku2014-07-231-16/+16
| | | | | | | | debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, hash.c, hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, syslog.c, syslog.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Synchronize license header with LICENSE.
* * eval.c (eval_init): register range_regex and tok_whereKaz Kylheku2014-06-261-0/+13
| | | | | | | | | | | | | | as intrinsics. * lib.c (tok_where): New function. * lib.h (tok_where): Declared. * regex.c (range_regex): New function. * regex.h (range_regex): Declared. * txr.1: Documented tok-where and range-regex.