| Commit message | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
that we don't clobber the null terminator in the target string, or try
to read past the end of the source data. This affects the @(freeform)
directive.
|
| |
|
| |
|
|
|
|
|
|
| |
the type codes of spuriously reached nodes; reached objects
will not be removed by weak processing and so it's better
to just detect those situations and short-circuit.
|
|
|
|
|
|
|
|
| |
Exponential memory consumption behavior was observed when
matching the input aaaaaa....
against the regex a?a?a?a?....aaaa....
The fix is to eliminate common subexpressions
from the derivative for the or operator.
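
One plausible reading of this fix is that when both branches of an or
expression derive to the same expression, the result X|X is collapsed
to X instead of being built as a new node, so duplicated branches stop
multiplying with each input character. The sketch below illustrates
only that collapsing step. The node type and the names re_new, re_equal
and re_or are hypothetical and are not taken from the project's source.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical regex node; not the project's actual representation. */
typedef enum { RE_EMPTY, RE_CHAR, RE_CAT, RE_OR } re_kind;

typedef struct re {
    re_kind kind;
    int ch;                   /* used by RE_CHAR only */
    struct re *left, *right;  /* used by RE_CAT and RE_OR */
} re;

/* Allocate one node; the sketch never frees its nodes. */
re *re_new(re_kind kind, int ch, re *left, re *right)
{
    re *r = malloc(sizeof *r);
    assert(r != NULL);
    r->kind = kind;
    r->ch = ch;
    r->left = left;
    r->right = right;
    return r;
}

/* Structural equality of two regex trees. */
int re_equal(const re *a, const re *b)
{
    if (a == b)
        return 1;
    if (a == NULL || b == NULL || a->kind != b->kind)
        return 0;
    switch (a->kind) {
    case RE_EMPTY:
        return 1;
    case RE_CHAR:
        return a->ch == b->ch;
    default: /* RE_CAT, RE_OR: compare both children */
        return re_equal(a->left, b->left) && re_equal(a->right, b->right);
    }
}

/* Build (a | b), returning a unchanged when both branches are identical. */
re *re_or(re *a, re *b)
{
    if (re_equal(a, b))
        return a;
    return re_new(RE_OR, 0, a, b);
}
```

A derivative routine for the or operator would call re_or on the two
derived branches rather than allocating the or node directly.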
|
| |
|
| |
|
|
|
|
| |
in a code path where it sometimes isn't.
|
|
|
|
|
|
|
| |
discovered on Red Hat EL 4 with gcc 3.4.3.
In the collect loop, set car(success) to nil.
Somehow the generated code hangs on to the last
matching position for a regex, preventing GC.
|
| |
|
|
|
|
|
|
| |
on 32 bit x86 Fedora. This happens because the lazy list variable
``data'' in the match_files function is optimized to a register,
but a stale value of that variable persists in the backing storage.
|
|
|
|
| |
variable.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
of cases to reduce consing. In reg_derivative_list, we avoid
consing the full or expression if either branch is t, and
also save a cons when the first element has a null derivative.
In reg_derivative the oneplus and zeroplus cases are split,
since zeroplus can re-use the input expression, when it's
just a one-character match, deriving nil.
|
|
|
|
|
|
|
|
| |
case whereby R%S matches nothing at all when S is not empty
but equivalent to empty, or more generally when S is nullable.
A much nicer definition is ``the intersection of R* and
the set of all strings that do not contain a non-empty substring
that matches S, followed by S''.
|
|
|
|
| |
of empty [] into regterm, via empty derivation.
|
|
|
|
|
| |
to match no character and [^] as its complement,
being synonymous with the wildcard dot.
|
| |
|
| |
|
|
|
|
|
| |
Correct wrong text: all operators can take an empty regex.
Clarify escaping rules within a character class.
|
|
|
|
| |
Correct wrong text: all operators can take an empty regex.
|
| |
|
|
|
|
| |
from text during HTML conversion.
|
| |
|
| |
|
|
|
|
| |
taking a double derivative of the first item.
|
|\ |
|
| | |
|
| | |
|
|/
|
|
|
|
| |
Revised description of regex operators. Added section
on intersection and complement, which may not be familiar
to regex users.
|
|
|
|
| |
algebraic reductions in the derivative for the operator.
|
| |
|
|
|
|
| |
being treated as a non-complemented set of two characters.
|
|
|
|
|
|
|
| |
NFA or derivatives. The default behavior is NFA, with
derivatives used if the regular expression contains
uses of complement or intersection. The --dv-regex
option forces derivatives always.
|
| |
|
| |
|
|
|
|
|
| |
regex operations (complement, intersection).
The documentation for the syntax extensions is retained.
|
| |
|
|\
| |
| |
| |
| | |
Conflicts:
ChangeLog
|
| |
| |
| |
| |
| |
| |
| |
| | |
in the middle of statement block.
* lib.h (TAG_MASK): Becomes type cnum rather than long.
(nao): Based off 1 rather than -1 to avoid left shift of
negative number.
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This turns out to be easy to do in NFA land.
The complement of an NFA has exactly the same number
and configuration of states and transitions, except
that the states have an inverted meaning; and furthermore,
failed character transitions are routed to an extra
state (which in this implementation is permanently
allocated and shared by all regexes). The regex &
is implemented trivially using De Morgan's laws.
Also, bugfix: regular expressions like A|B|C are allowed
now by the syntax, rather than constituting a syntax error.
Previously, this would have had to be entered as (A|B)|C.
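
The idea of inverting the meaning of states and routing failed
transitions to an extra state can be sketched with a tiny hand-written
automaton. This is a simplified, deterministic illustration for the
regex ab*, not the project's NFA code. The names START, SEEN_A, SINK,
step and match are made up for the example.

```c
#include <assert.h>

/* States of a small automaton for "ab*"; SINK is the extra state. */
enum { START = 0, SEEN_A = 1, SINK = 2 };

/* Any transition the automaton does not define is routed to SINK. */
int step(int state, char c)
{
    assert(state >= START && state <= SINK);
    if (state == START && c == 'a')
        return SEEN_A;
    if (state == SEEN_A && c == 'b')
        return SEEN_A;
    return SINK;
}

/* Run the automaton over s; with complemented set, acceptance is inverted. */
int match(const char *s, int complemented)
{
    int state = START;
    int accepting;

    for (; *s != '\0'; s++)
        state = step(state, *s);

    accepting = (state == SEEN_A);
    return complemented ? !accepting : accepting;
}
```

The states and transitions are the same in both modes; only the
interpretation of the final state changes, and the sink state, which
never accepts in the original, always accepts in the complement. With
complement available, A&B can then be expressed as the complement of
(complement of A)|(complement of B).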
|
| |
|
|
|
|
|
|
|
|
| |
* hash.h (sethash): Declared.
* lib.c (cobj_handle): New function.
* lib.h (cobj_handle): Declared.
|
|
|
|
|
|
| |
no null pointer check over struct cobj_ops operations.
New typechecking function for COBJ objects.
|
| |
|
|
|
|
|
| |
from now on, which is compatible with unsigned char *.
No implicit conversion to or from this type, in C or C++.
|
|
|
|
|
|
|
| |
(more): Update heap_min_bound and heap_max_bound.
(in_heap): Do early rejection tests on the pointer. If it's
not aligned, or it's completely outside of the bounding
box of the heap area, short circuit to false.
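
The early rejection tests described for in_heap can be sketched
roughly as follows. This assumes a single contiguous heap block and
a made-up object type. Only the names heap_min_bound and heap_max_bound
come from the entry above; obj_sketch, heap_block, HEAP_OBJS and
in_heap_sketch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object type standing in for the real heap object. */
typedef struct { long words[4]; } obj_sketch;

#define HEAP_OBJS 64

/* One heap block; a real collector would track several and widen the bounds. */
obj_sketch heap_block[HEAP_OBJS];
obj_sketch *heap_min_bound = heap_block;
obj_sketch *heap_max_bound = heap_block + HEAP_OBJS;

int in_heap_sketch(const void *ptr)
{
    uintptr_t p = (uintptr_t) ptr;
    uintptr_t lo = (uintptr_t) heap_min_bound;
    uintptr_t hi = (uintptr_t) heap_max_bound;

    assert(lo <= hi);

    /* Early rejection 1: a misaligned pointer cannot be an object (C11 _Alignof). */
    if (p % _Alignof(obj_sketch) != 0)
        return 0;

    /* Early rejection 2: outside the bounding box of the heap area. */
    if (p < lo || p >= hi)
        return 0;

    /* Full test: the pointer must land exactly on an object boundary. */
    return (p - lo) % sizeof (obj_sketch) == 0;
}
```

Both early tests are cheap integer comparisons, so most non-heap values
found during a conservative scan are rejected before any per-heap
search is needed.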
|