| Commit message (Collapse) | Author | Age | Files | Lines |
| |
(regterm): REGTOKEN production factored out to regtoken.
(regclass): Reverted prior commit's changes.
(regclassterm): Reverted prior commit, removing REGTOKEN
production for character classes, and introduced a regtoken
production. So now the keyword symbols are part of the
character class abstract syntax.
(regtoken): New production rule.
* regex.c (regex_space_chars): Converted to internal linkage.
(char_set_compile): Handle token keywords in character class
abstract syntax.
* regex.h (regex_space_chars): External declaration removed.
| |
* lib.c (init): Call regex_init.
* parser.l: Return new REGTOKEN kind.
* parser.y (REGTOKEN): New token type.
(regterm): Translate REGTOKEN to keyword.
(regclass): Restructured to handle inherited nodes as lists.
(regclassterm): Produce $$ as list. Add handling for REGTOKEN
occurring inside character class by expanding it. This might not
be the best approach.
(yybadtoken): Handle REGTOKEN in switch.
* regex.c (struct any_char_set, struct small_char_set,
struct displaced_char_set, struct large_char_set,
struct xlarge_char_set): New bitfield member, stat.
(char_set_create): New parameter for indicating static char set.
(char_set_destroy): Do not free a static char set.
(char_set_compile): Pass zero to new parameter of char_set_create.
(spaces): New static array.
(space_cs, digit_cs, word_cs, cspace_cs, cdigit_cs, cword_cs): New
static pointers to char_set_t.
(init_special_char_sets, nfa_compile_given_set): New static functions.
(nfa_compile_regex, dv_compile_regex): Handle new character set token
keywords.
(space_k, digit_k, word_char_k, cspace_k, cdigit_k, cword_char_k,
regex_space_chars): New variables.
(regex_init): New function.
* regex.h (space_k, digit_k, word_char_k, cspace_k, cdigit_k,
cword_char_k, regex_space_chars, regex_init): Declared.
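
The static-set flag described above can be illustrated with a minimal sketch. This is not the code from regex.c: the type `sketch_set` and the functions `sketch_set_create` and `sketch_set_destroy` are hypothetical names invented for illustration, and only the idea (a one-bit `stat` member that makes the destroy function leave shared sets alone) is taken from the entry.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a character set object. */
typedef struct {
    unsigned stat : 1;        /* 1 = shared static set, never freed */
    unsigned char bits[32];   /* membership bitmap for 256 characters */
} sketch_set;

/* Allocate a zeroed set; is_static marks it as shared and permanent. */
static sketch_set *sketch_set_create(int is_static)
{
    sketch_set *cs = calloc(1, sizeof *cs);
    assert(cs != NULL);
    cs->stat = is_static ? 1u : 0u;
    return cs;
}

/* Free the set unless it is static. Returns 1 if freed, 0 otherwise. */
static int sketch_set_destroy(sketch_set *cs)
{
    if (cs == NULL || cs->stat)
        return 0;
    free(cs);
    return 1;
}
```

With a flag like this, a set for a predefined class can be built once at initialization and referenced from many compiled regexes, while sets compiled from an ordinary character class are released as usual.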
| |
* eval.c (cons_find): New function.
(expand_op): Use cons_find rather than tree_find to look for
rest_gensym.
* regex.c (regsub): Rearranged arguments so that the string
is last. This is better for partial evaluation via the op
operator.
* regex.h (regsub): Updated declaration.
|
| |
* regex.c (regsub): New function.
* regex.h (regsub): Declared.
* txr.1: Doc stub added.
| |
* arith.h: Likewise.
* debug.c: Added copyright header.
* debug.h: Updated copyright year.
* eval.c: Likewise.
* eval.h: Likewise.
* filter.c: Likewise.
* filter.h: Likewise.
* gc.c: Likewise.
* gc.h: Likewise.
* hash.c: Likewise.
* hash.h: Likewise.
* lib.c: Likewise.
* lib.h: Likewise.
* match.c: Likewise.
* match.h: Likewise.
* parser.h: Likewise.
* regex.c: Likewise.
* regex.h: Likewise.
* stream.c: Likewise.
* stream.h: Likewise.
* txr.c: Likewise, and e-mail address.
* txr.h: Updated copyright year.
* unwind.c: Likewise.
* unwind.h: Likewise.
| |
* parser.h: Do not include <stdio.h>.
* regex.c: Include <limits.h>.
* regex.h: Do not include <limits.h>.
| |
hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y,
regex.c, regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c,
unwind.h, utf8.c, utf8.h: Updated e-mail address.
| |
lib.h, match.c, match.h, parser.h, parser.l, parser.y, regex.c,
regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h: Updated copyright year.
|
| |
|
| |
| |
in regex module not exposed in header. Etc.
| |
can be taken advantage of for better diagnostics.
| |
can be converted to a type long and vice versa. The configure
script tries to detect the appropriate type to use. Also,
some run-time checking is performed in the streams module
to detect which conversion specifier strings to use for
printing numbers.
| |
we wouldn't have to declare object variables at all, so why
use an obtuse syntax to do so?)
|
| |
| |
Most of the changes are in the area of representing sets.
Also, a bug was found in the compilation of regex character sets:
ranges straddling two adjacent blocks of 32 characters were
not being added to the character set. However, ranges falling
within a single 32-character block, or spanning three or more such
blocks, worked properly. This bug is not tickled by common ranges
such as A-Z, or 0-9, which land within a single 32-character block.
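
The three cases named above (a range inside one block, a range straddling two adjacent blocks, and a range spanning three or more) can be sketched as follows. This is an illustrative sketch only, not the code from regex.c and not a reconstruction of the original bug: the type `block_set` and the functions `bit_range`, `block_set_add_range` and `block_set_contains` are hypothetical, and a 256-character set stored as eight 32-bit words is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_BITS 32u
#define SET_CHARS  256u
#define SET_BLOCKS (SET_CHARS / BLOCK_BITS)

/* Hypothetical bitmap set: one bit per character, 32 characters per word. */
typedef struct {
    uint32_t bits[SET_BLOCKS];
} block_set;

/* Mask with bits lo..hi (inclusive) set; requires lo <= hi <= 31. */
static uint32_t bit_range(unsigned lo, unsigned hi)
{
    uint32_t upto_hi = (hi == 31u) ? UINT32_MAX
                                   : ((UINT32_C(1) << (hi + 1u)) - 1u);
    uint32_t below_lo = (UINT32_C(1) << lo) - 1u;
    return upto_hi & ~below_lo;
}

/* Add every character in from..to (inclusive) to the set. */
static void block_set_add_range(block_set *s, unsigned from, unsigned to)
{
    unsigned b0, b1, b;

    assert(from <= to && to < SET_CHARS);
    b0 = from / BLOCK_BITS;
    b1 = to / BLOCK_BITS;

    if (b0 == b1) {
        /* Case 1: the whole range lies within one block. */
        s->bits[b0] |= bit_range(from % BLOCK_BITS, to % BLOCK_BITS);
        return;
    }

    /* Cases 2 and 3: partial first block, then partial last block. */
    s->bits[b0] |= bit_range(from % BLOCK_BITS, 31u);
    s->bits[b1] |= bit_range(0u, to % BLOCK_BITS);

    /* Whole blocks in between; the loop body never runs when the
       two blocks are adjacent, which is the straddling case. */
    for (b = b0 + 1u; b < b1; b++)
        s->bits[b] = UINT32_MAX;
}

/* Nonzero if ch is a member of the set. */
static int block_set_contains(const block_set *s, unsigned ch)
{
    return (int) ((s->bits[ch / BLOCK_BITS] >> (ch % BLOCK_BITS)) & 1u);
}
```

A range such as 30 to 34 exercises the straddling case, since 30 and 31 fall in the first block and 32 to 34 in the second, whereas A-Z (65 to 90) stays inside the block covering 64 to 95.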
|
| |
This is incomplete. There are too many dependencies on
wide character support from the C stream I/O library,
and implicit use of some encoding which may not be UTF-8.
The regex code does not handle wide characters properly.
Character type is still int in some places, rather than wchar_t.
Test suite passes though.
| |
Regexps can be bound to variables.
New freeform directive.
| |
Bugfixes.
| |
Lazy strings implemented, incompletely.
Changed string function to implicitly strdup; non-strdup
version changed to string_own. Fixed wrong uses of strdup
rather than chk_strdup.
Functions added to regex module to provide regex matching
as a state machine to which characters are fed.
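
The idea of a matcher that is fed one character at a time can be sketched as below. This is a toy illustration, not the interface added to the regex module: the names `feed_result`, `abc_machine`, `abc_machine_init`, `abc_machine_feed` and `abc_match_prefix` are hypothetical, and the machine is hard-wired to the single pattern `ab*c` rather than compiled from a regex.

```c
#include <assert.h>
#include <stddef.h>

/* Result of feeding one character to the machine. */
typedef enum { FEED_FAIL, FEED_INCOMPLETE, FEED_MATCH } feed_result;

/* Hypothetical hand-built machine for the pattern ab*c. */
typedef struct {
    int state;          /* 0 = start, 1 = seen a b*, 2 = accepted, -1 = dead */
    size_t fed;         /* characters accepted so far */
    size_t last_match;  /* length of the longest matching prefix, 0 if none */
} abc_machine;

static void abc_machine_init(abc_machine *m)
{
    m->state = 0;
    m->fed = 0;
    m->last_match = 0;
}

/* Advance the machine by one character and report where it stands. */
static feed_result abc_machine_feed(abc_machine *m, int ch)
{
    assert(m != NULL);
    if (m->state == 0 && ch == 'a') {
        m->state = 1;
        m->fed++;
        return FEED_INCOMPLETE;
    }
    if (m->state == 1 && ch == 'b') {
        m->fed++;
        return FEED_INCOMPLETE;
    }
    if (m->state == 1 && ch == 'c') {
        m->state = 2;
        m->fed++;
        m->last_match = m->fed;
        return FEED_MATCH;
    }
    m->state = -1;      /* no transition: the machine is dead from here on */
    return FEED_FAIL;
}

/* Feed a string until the machine fails; return the longest match length. */
static size_t abc_match_prefix(const char *str)
{
    abc_machine m;
    abc_machine_init(&m);
    for (; *str != '\0'; str++)
        if (abc_machine_feed(&m, (unsigned char) *str) == FEED_FAIL)
            break;
    return m.last_match;
}
```

The point of this shape is that the caller owns the loop: characters can come from a string, a stream, or a lazily produced source, and the caller can stop as soon as the machine reports failure.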
| |
and used for matching. This Just Works because of
the way match_line treats variables.