path: root/regex.c
Commit message | Author | Age | Files | Lines
* Changes to make the code portable to C++ compilers, which can be taken advantage of for better diagnostics. | Kaz Kylheku | 2009-11-24 | 1 | -9/+9
* Renaming global variables that denote symbols, such that they have a _s suffix. | Kaz Kylheku | 2009-11-24 | 1 | -16/+16
* Improving portability. It is no longer assumed that pointers can be converted to a type long and vice versa. The configure script tries to detect the appropriate type to use. Also, some run-time checking is performed in the streams module to detect which conversion specifier strings to use for printing numbers. | Kaz Kylheku | 2009-11-23 | 1 | -5/+6
* Introducing symbol packages. Internal symbols are now in a system package instead of being hacked with the $ prefix. Keyword symbols are provided. In the matcher, evaluation is tightened up. Keywords, nil and t are not bindable, and errors are thrown if attempts are made to bind them. Destructuring in dest_bind is strict in the number of items. String streams are exploited to print bindings to objects that are not strings or characters. Numerous bugfixes. | Kaz Kylheku | 2009-11-21 | 1 | -1/+2
* Changing ``obj_t *'' occurrences to a ``val'' typedef. (Ideally, we wouldn't have to declare object variables at all, so why use an obtuse syntax to do so?) | Kaz Kylheku | 2009-11-20 | 1 | -22/+22
* Following-up on diagnostics obtained by running code through C++ compiler. Idea: allocator functions return char * instead of void *, like malloc did in classic pre-ANSI C. That way we are forced to use a cast except when the target pointer is char * already. | Kaz Kylheku | 2009-11-18 | 1 | -8/+8
* Warning fixes. | Kaz Kylheku | 2009-11-17 | 1 | -1/+1
* * regex.c (nfa_all_states, nfa_closure): visited parameter should be unsigned. | Kaz Kylheku | 2009-11-17 | 1 | -2/+2
* Regular expression module updated to do unicode character sets. Most of the changes are in the area of representing sets. Also, a bug was found in the compilation of regex character sets: ranges straddling two adjacent blocks of 32 characters were not being added to the character set. However, ranges falling within a single 32 block, or spanning three or more such blocks, worked properly. This bug is not tickled by common ranges such as A-Z, or 0-9, which land within a 32 block. | Kaz Kylheku | 2009-11-12 | 1 | -49/+433
* Big conversion to wide characters and UTF-8 support. This is incomplete. There are too many dependencies on wide character support from the C stream I/O library, and implicit use of some encoding which may not be UTF-8. The regex code does not handle wide characters properly. Character type is still int in some places, rather than wchar_t. Test suite passes though. | Kaz Kylheku | 2009-11-11 | 1 | -3/+3
* Version 019 (tag: txr-019). Regexps can be bound to variables. New freeform directive. | Kaz Kylheku | 2009-11-03 | 1 | -7/+7
* Got regex working over lazy strings from freeform. Bugfixes. | Kaz Kylheku | 2009-11-02 | 1 | -25/+82
* Start of implementation for freestyle matching. Lazy strings implemented, incompletely. Changed string function to implicitly strdup; non-strdup version changed to string_own. Fixed wrong uses of strdup rather than chk_strdup. Functions added to regex module to provide regex matching as a state machine to which characters are fed. | Kaz Kylheku | 2009-11-02 | 1 | -0/+76
* Trivial change allows regexps to be bound to variables, and used for matching. This Just Works because of the way match_line treats variables. | Kaz Kylheku | 2009-10-30 | 1 | -0/+5
* txr-015 2009-10-15 (tag: txr-015) | Kaz Kylheku | 2017-07-31 | 1 | -7/+10
* txr-011 2009-09-25 (tag: txr-011) | Kaz Kylheku | 2017-07-31 | 1 | -0/+631
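
The 2009-11-12 entry in this log describes a bug in which character ranges straddling two adjacent blocks of 32 characters were not added to the set. The sketch below shows one way to insert a range into a bitmap made of 32-bit words so that the adjacent-block case is handled. It is not the code from regex.c: the names charset_t, charset_clear, charset_contains and charset_add_range, and the 256-character limit, are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: 256 characters stored as eight 32-bit words. */
#define CHARSET_BITS  256u
#define WORD_BITS     32u
#define CHARSET_WORDS (CHARSET_BITS / WORD_BITS)
#define ALL_ONES      ((uint32_t) 0xFFFFFFFFu)

typedef struct {
    uint32_t word[CHARSET_WORDS];
} charset_t;

/* Empty the set. */
void charset_clear(charset_t *set)
{
    memset(set->word, 0, sizeof set->word);
}

/* Return 1 if ch is a member of the set, 0 otherwise. */
int charset_contains(const charset_t *set, unsigned ch)
{
    assert(ch < CHARSET_BITS);
    return (int) ((set->word[ch / WORD_BITS] >> (ch % WORD_BITS)) & 1u);
}

/* Add every character in the inclusive range [lo, hi]. */
void charset_add_range(charset_t *set, unsigned lo, unsigned hi)
{
    unsigned first, last, w;
    uint32_t lo_mask, hi_mask;

    assert(lo <= hi && hi < CHARSET_BITS);

    first = lo / WORD_BITS;               /* word holding lo */
    last = hi / WORD_BITS;                /* word holding hi */
    lo_mask = ALL_ONES << (lo % WORD_BITS);                 /* bits >= lo */
    hi_mask = ALL_ONES >> (WORD_BITS - 1u - hi % WORD_BITS); /* bits <= hi */

    if (first == last) {
        /* Range lies within a single 32-character block. */
        set->word[first] |= lo_mask & hi_mask;
        return;
    }

    /* Two or more blocks: partial first word, full middle words,
       partial last word. When the blocks are adjacent the loop body
       never runs, but both partial words are still filled in. */
    set->word[first] |= lo_mask;
    for (w = first + 1; w < last; w++)
        set->word[w] = ALL_ONES;
    set->word[last] |= hi_mask;
}
```

With this structure, a range such as 30 to 34, which crosses from the first block into the second, takes the same path as a range spanning three or more blocks, so the adjacent-block case needs no separate branch that could be forgotten.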