| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* regex.c (create_wide_cs): New static function.
(wide_display_char_p): New function.
* regex.h (wide_display_char_p): Declared.
* stream.c (put_string, put_char): Use wide_display_char_p
to determine whether an extra column need be counted. Also bugfix:
iswprint evidently cannot be relied to work over the entire Unicode
range, at least not in the C locale. Glibc's version and is reporting
valid Japanese characters as unprintable on Ubuntu. As a hack we
instead check for control characters and invert the result: control
chars are unprintable.
* tests/009/json.expected: Updated.
|
|
|
|
|
|
|
|
|
|
|
| |
* eval.c (eval_init): Register search-regst, match-regst
and match-regst-right intrinsics.
* regex.c (search_regst, match_regst, match_regst_right): New functions.
* regex.h (search_regst, match_regst, match_regst_right): Declared.
* txr.1: Documented new variants.
|
|
|
|
|
|
|
|
|
|
|
| |
* arith.c, arith.h, combi.c, combi.h, debug.c, debug.h, eval.c, eval.h,
filter.c, filter.h, gc.c, gc.h, hash.c, hash.h, lib.c, lib.h,
match.c, match.h, parser.h, rand.c, rand.h, regex.c, regex.h,
signal.c, signal.h, stream.c, stream.h, sysif.c, sysif.h, syslog.c,
syslog.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h:
Update.
* LICENSE, METALICENSE: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
intrinsics.
* lib.c (chr_isblank, chr_isunisp): New functions.
* lib.h (chr_isblank, chr_isunisp): Declared.
* regex.h (spaces): Declaration for existing variable added.
* txr.1: Documented chr-isblank and chr-isunisp.
* genvim.txr: Add missing sysif.c.
* txr.vim: Regenerated.
|
|
|
|
|
|
|
|
| |
debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, hash.c,
hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y,
rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c,
stream.h, syslog.c, syslog.h, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h: Synchronize license header with LICENSE.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
as intrinsics.
* lib.c (tok_where): New function.
* lib.h (tok_where): Declared.
* regex.c (range_regex): New function.
* regex.h (range_regex): Declared.
* txr.1: Documented tok-where and range-regex.
|
|
|
|
|
|
|
|
| |
to something more useful.
* regex.h (match_regex_right): Change name of parameter.
* txr.1: Documented match-regex-right.
|
|
|
|
|
|
| |
* regex.h (match_regex_right): Declared.
* eval.c (eval_init): Register match_regex_right as instrinsic.
|
|
|
|
| |
Fixing some errors in copyright comments.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to reflect that it has two arguments now.
* parser.y (grammar): Update calls to regex_compile to
pass two arguments. Since we don't expect regex_compile to
parse, we specify the error stream as nil.
(spec): The "secret syntax" for a regex is simplified
not to include the slashes. This provides better diagnostics for
unterminated syntax and requires less string processing to generate.
Also, the form returned doesn't have the regex symbol
consed onto it, which parse_regex just throws away.
* regex.c (regex_compile): Now takes a stream argument.
* regex.h (regex_compile): Declaration updated.
* txr.1: Updated
|
|
|
|
|
|
|
| |
* regex.h (regex_compile): Don't call argument
regex_sexp, since it can be a string.
* txr.1: Updated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(regterm): REGTOKEN production factored out to regtoken.
(regclass): Reverted prior commmit's changes.
(regclassterm): Reverted prior commit, removing REGTOKEN
production for character classes, and introduced a regtoken
production. So now the keyword symbols are part of the
character class abstract syntax.
(regtoken): New production rule.
* regex.c (regex_space_chars): Converted to internal linkage.
(char_set_compile): Handle token keywords in character class
abstract syntax.
* regex.h (regex_space_chars): External declaration removed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* lib.c (init): Call regex_init.
* parser.l: return new REGTOKEN kind.
* parser.y (REGTOKEN): New token type.
(REGTERM): Translate REGTERM to keyword.
(regclass): Restructured to handle inherited nodes as lists.
(regclassterm): Produce $$ as list. Add handling for REGTOKEN
occurring inside character class by expanding it. This might not
be the best approach.
(yybadtoken): Handle REGTOKEN in switch.
* regex.c (struct any_char_set, struct small_char_set,
struct displaced_char_set, struct large_char_set,
struct xlarge_char_set): New bitfield member, stat.
(char_set_create): New parameter for indicating static char set.
(char_set_destroy): Do not free a static char set.
(char_set_compile): Pass zero to new parameter of char_set_create.
(spaces): New static array.
(space_cs, digit_cs, word_cs, cspace_cs, cdigit_cs, cword_cs): New
static pointers to char_set_t.
(init_special_char_sets, nfa_compile_given_set): New static function.
(nfa_compile_regex, dv_compile_regex): Handle new character set token
keywords.
(space_k, digit_k, word_char_k, cspace_k, cdigit_k, cword_char_k,
regex_space_chars): New variables.
(regex_init): New function.
* regex.h (space_k, digit_k, word_char_k, cspace_k, cdigit_k,
cword_char_k, regex_space_chars, regex_init): Declared.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* eval.c (cons_find): New function.
(expand_op): Use cons_find rather than tree_find to look for
rest_gensym.
* regex.c (regsub): Rearranged arguments so that the string
is last. This is better for partial evaluaton via the op
operator.
* regex.h (regsub): Updated declaration.
|
|
|
|
|
|
|
|
| |
* regex.c (regsub): New function.
* regex.h (regsub): Declared.
* txr.1: Doc stub added.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* arith.h: Likewise.
* debug.c: Added copyright header.
* debug.h: Updated copyright year.
* eval.c: Likewise.
* eval.h: Likewise.
* filter.c: Likewise.
* filter.h: Likewise.
* gc.c: Likewise.
* gc.h: Likewise.
* hash.c: Likewise.
* hash.h: Likewise.
* lib.c: Likewise.
* lib.h: Likewise.
* match.c: Likewise.
* match.h: Likewise.
* parser.h: Likewise.
* regex.c: Likewise.
* regex.h: Likewise.
* stream.c: Likewise.
* stream.h: Likewise.
* txr.c: Likewise, and e-mail address.
* txr.h: Updated copyright year.
* unwind.c: Likewise.
* unwind.h: Likewise.
|
|
|
|
|
|
|
|
| |
* parser.h: Do not include <stdio.h>
* regex.c: Include <limits.h>
* regex.h: Do not include <limits.h>
|
|
|
|
|
|
| |
hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y,
regex.c, regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c,
unwind.h, utf8.c, utf8.h: Updated e-mail address.
|
|
|
|
|
|
| |
lib.h, match.c, match.h, parser.h, parser.l, parser.y, regex.c,
regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h: Updated copyright year.
|
| |
|
| |
|
|
|
|
| |
in regex module not exposed in header. Etc.
|
|
|
|
| |
can be taken advantage of for better diagnostics.
|
|
|
|
|
|
|
|
| |
can be converted to a type long and vice versa. The configure
script tries to detect the appropriate type to use. Also,
some run-time checking is performed in the streams module
to detect which conversions specifier strings to use for
printing numbers.
|
|
|
|
|
| |
we wouldn't have to declare object variables at all, so why
use an obtuse syntax to do so?)
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Most of the changes are in the area of representing sets.
Also, a bug was found in the compilation of regex character sets:
ranges straddling two adjacent blocks of 32 characters were
not being added to the character set. However, ranges falling
within a single 32 block, or spanning three or more such blocks,
worked properly. This bug is not tickled by common ranges
such as A-Z, or 0-9, which land within a 32 block.
|
|
|
|
|
|
|
|
|
| |
This is incomplete. There are too many dependencies on
wide character support from the C stream I/O library,
and implicit use of some encoding which may not be UTF-8.
The regex code does not handle wide characters properly.
Character type is still int in some places, rather than wchar_t.
Test suite passes though.
|
|
|
|
|
|
| |
Regexps can be bound to variables.
New freeform directive.
|
|
|
|
| |
Bugfixes.
|
|
|
|
|
|
|
|
|
|
|
| |
Lazy strings implemented, incompletely.
Changed string function to implicitly strdup; non-strdup
version changed to string_own. Fixed wrong uses of strdup
rather than chk_strdup.
Functions added to regex module to provide regex matching
as a state machine to which characters are fed.
|
|
|
|
|
| |
and used for matching. This Just Works because of
the way match_line treats variables.
|
|
|