| Commit message | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
| |
If the virtual line is partially matched, the remainder of
the line is folded back into list form. In this case, the
data line number must be incremented. Otherwise the calling
context may conclude that no progress was made, and
skip a line of input. I.e. the unmatched part of the input
is a new line, even if there had originally
been no line break at that point.
|
|
|
|
|
|
|
|
|
| |
* lib.c (split_str_sep): New function.
(split_str): Semantics changed; the second argument
is not a set of separator characters (like in split_str_sep)
but rather a separator string. Fixed bug: if the input
string is empty, the output list is empty. This caused
infinite looping behavior in @(freeform).
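
A minimal sketch of splitting on a separator string rather than on a set of separator characters. It assumes the fixed behavior is that an empty input produces one empty piece instead of no pieces. The names split_on_string and copy_piece are hypothetical and are not the lib.c functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of lib.c: copies n bytes into a fresh
   null-terminated string. */
static char *copy_piece(const char *start, size_t n)
{
    char *piece = malloc(n + 1);
    assert(piece != NULL);
    memcpy(piece, start, n);
    piece[n] = '\0';
    return piece;
}

/* Splits str on every occurrence of the whole string sep.
   Stores at most max pieces in out and returns how many were stored;
   any input left over after max pieces is dropped.
   An empty str yields one empty piece rather than zero pieces. */
size_t split_on_string(const char *str, const char *sep,
                       char **out, size_t max)
{
    size_t count = 0;
    size_t sep_len = strlen(sep);

    assert(sep_len > 0); /* an empty separator would never advance */

    while (count < max) {
        const char *hit = strstr(str, sep);
        if (hit == NULL) {
            /* No further separator: the rest is the last piece. */
            out[count++] = copy_piece(str, strlen(str));
            break;
        }
        out[count++] = copy_piece(str, (size_t) (hit - str));
        str = hit + sep_len;
    }
    return count;
}
```

With the separator ", " the input "a, b,c" gives the two pieces "a" and "b,c", since the lone comma is not the full separator string. A caller that loops until a splitter returns at least one piece would spin forever on empty input if the result were an empty list, which is the kind of failure the entry above describes.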
|
|
|
|
|
|
| |
that we don't clobber the null terminator in the target string, or try
to read past the end of the source data. This affects the @(freeform)
directive.
|
| |
|
| |
|
|
|
|
|
|
| |
the type codes of spuriously reached nodes; reached objects
will not be removed by weak processing and so it's better
to just detect those situations and short-circuit.
|
|
|
|
|
|
|
|
| |
Exponential memory consumption behavior was observed when
matching the input aaaaaa....
against the regex a?a?a?a?....aaaa....
The fix is to eliminate common subexpressions
from the derivative for the or operator.
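
The effect can be sketched with a toy derivative matcher whose or constructor refuses to add an alternative that is already present. This is an illustration only: the node layout, the structural equality test and all function names are assumptions, and the real code may remove common subexpressions differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef enum { NIL, EPS, CHR, OPT, SEQ, OR } kind_t;

typedef struct rx {
    kind_t kind;
    char ch;                  /* CHR only */
    const struct rx *l, *r;   /* OPT uses l; SEQ and OR use both */
} rx_t;

/* Nodes are never freed: acceptable for a sketch only. */
static const rx_t *node(kind_t k, char ch, const rx_t *l, const rx_t *r)
{
    rx_t *n = malloc(sizeof *n);
    assert(n != NULL);
    n->kind = k; n->ch = ch; n->l = l; n->r = r;
    return n;
}

static bool equal(const rx_t *a, const rx_t *b)
{
    if (a == b) return true;
    if (a == NULL || b == NULL || a->kind != b->kind) return false;
    if (a->kind == CHR) return a->ch == b->ch;
    return equal(a->l, b->l) && equal(a->r, b->r);
}

/* True if x already occurs among the alternatives of the or-tree alts. */
static bool has_alt(const rx_t *alts, const rx_t *x)
{
    if (alts->kind == OR)
        return has_alt(alts->l, x) || has_alt(alts->r, x);
    return equal(alts, x);
}

const rx_t *mk_chr(char c) { return node(CHR, c, NULL, NULL); }
const rx_t *mk_opt(const rx_t *x) { return node(OPT, 0, x, NULL); }

const rx_t *mk_seq(const rx_t *a, const rx_t *b)
{
    if (a->kind == NIL || b->kind == NIL) return node(NIL, 0, NULL, NULL);
    if (a->kind == EPS) return b;
    if (b->kind == EPS) return a;
    return node(SEQ, 0, a, b);
}

/* Drops nil branches and any alternative that is already present. */
const rx_t *mk_or(const rx_t *a, const rx_t *b)
{
    if (a->kind == NIL) return b;
    if (b->kind == NIL) return a;
    if (b->kind == OR) return mk_or(mk_or(a, b->l), b->r);
    return has_alt(a, b) ? a : node(OR, 0, a, b);
}

static bool nullable(const rx_t *r)
{
    switch (r->kind) {
    case EPS: case OPT: return true;
    case SEQ: return nullable(r->l) && nullable(r->r);
    case OR:  return nullable(r->l) || nullable(r->r);
    default:  return false;
    }
}

static const rx_t *deriv(const rx_t *r, char c)
{
    switch (r->kind) {
    case CHR: return node(r->ch == c ? EPS : NIL, 0, NULL, NULL);
    case OPT: return deriv(r->l, c);
    case OR:  return mk_or(deriv(r->l, c), deriv(r->r, c));
    case SEQ: {
        const rx_t *d = mk_seq(deriv(r->l, c), r->r);
        return nullable(r->l) ? mk_or(d, deriv(r->r, c)) : d;
    }
    default:  return node(NIL, 0, NULL, NULL);
    }
}

static size_t size(const rx_t *r)
{
    return r == NULL ? 0 : 1 + size(r->l) + size(r->r);
}

/* Matches s against r; *peak receives the largest expression size seen. */
bool rx_match(const rx_t *r, const char *s, size_t *peak)
{
    *peak = size(r);
    for (; *s != '\0'; s++) {
        size_t n;
        r = deriv(r, *s);
        n = size(r);
        if (n > *peak) *peak = n;
    }
    return nullable(r);
}
```

For a regex made of n optional a's followed by n mandatory a's, each derivative here is an or of distinct suffixes of the original expression, so its size stays polynomial in n. If mk_or always allocated a new OR node, the same suffixes would be copied again at every step.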
|
| |
|
| |
|
|
|
|
| |
in a code path where it sometimes isn't.
|
|
|
|
|
|
|
| |
discovered on Red Hat EL 4 with gcc 3.4.3.
In the collect loop, set car(success) to nil.
Somehow the generated code hangs on to the last
matching position for a regex, preventing GC.
|
| |
|
|
|
|
|
|
| |
on 32 bit x86 Fedora. This happens because the lazy list variable
``data'' in the match_files function is optimized to a register,
but a stale value of that variable persists in the backing storage.
|
|
|
|
| |
variable.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
of cases to reduce consing. In reg_derivative_list, we avoid
consing the full or expression if either branch is t, and
also save a cons when the first element has a null derivative.
In reg_derivative the oneplus and zeroplus cases are split,
since zeroplus can re-use the input expression, when it's
just a one-character match, deriving nil.
|
|
|
|
|
|
|
|
| |
case whereby R%S matches nothing at all when S is not empty
but equivalent to empty, or more generally when S is nullable.
A much nicer definition is ``the intersection of R* and
the set of all strings that do not contain a non-empty substring
that matches S, followed by S''.
|
|
|
|
| |
of empty [] into regterm, via empty derivation.
|
|
|
|
|
| |
to match no character and [^] as its complement,
being synonymous with the wildcard dot.
|
| |
|
| |
|
|
|
|
|
| |
Correct wrong text: all operators can take an empty regex.
Clarify escaping rules within a character class.
|
|
|
|
| |
Correct wrong text: all operators can take an empty regex.
|
| |
|
|
|
|
| |
from text during HTML conversion.
|
| |
|
| |
|
|
|
|
| |
taking a double derivative of the first item.
|
|\ |
|
| | |
|
| | |
|
|/
|
|
|
|
| |
Revised description of regex operators. Added section
on intersection and complement, which may not be familiar
to regex users.
|
|
|
|
| |
algebraic reductions in the derivative for the operator.
|
| |
|
|
|
|
| |
being treated as a non-complemented set of two characters.
|
|
|
|
|
|
|
| |
NFA or derivatives. The default behavior is NFA, with
derivatives used if the regular expression contains
uses of complement or intersection. The --dv-regex
option forces derivatives always.
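
The selection rule described here can be sketched as a walk over the regex syntax tree. Everything below is hypothetical apart from the --dv-regex option named above: the tree layout, the enum names and choose_backend are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { RX_CHAR, RX_SEQ, RX_OR, RX_STAR, RX_COMPL, RX_AND } rx_kind_t;

typedef struct rx {
    rx_kind_t kind;
    const struct rx *left, *right;   /* NULL where an operand is absent */
} rx_t;

typedef enum { BACKEND_NFA, BACKEND_DERIV } backend_t;

/* True if the tree uses complement or intersection anywhere. */
static bool needs_derivatives(const rx_t *r)
{
    if (r == NULL)
        return false;
    if (r->kind == RX_COMPL || r->kind == RX_AND)
        return true;
    return needs_derivatives(r->left) || needs_derivatives(r->right);
}

/* force_derivatives stands in for the --dv-regex command line option. */
backend_t choose_backend(const rx_t *r, bool force_derivatives)
{
    assert(r != NULL);
    if (force_derivatives || needs_derivatives(r))
        return BACKEND_DERIV;
    return BACKEND_NFA;
}
```

Deciding per regex keeps the NFA path for ordinary expressions, and only expressions that contain complement or intersection fall back to derivatives unless the option forces it.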
|
| |
|
| |
|
|
|
|
|
| |
regex operations (complement, intersection).
The documentation of the syntax extensions is retained.
|
| |
|
|\
| |
| |
| |
| | |
Conflicts:
ChangeLog
|
| |
| |
| |
| |
| |
| |
| |
| | |
in the middle of statement block.
* lib.h (TAG_MASK): Becomes type cnum rather than long.
(nao): Based off 1 rather than -1 to avoid left shift of
negative number.
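
A small illustration of why the base value matters. In C, left-shifting a negative signed value is undefined behavior, while shifting a positive value that does not overflow is well defined. The macro names and the tag width below are hypothetical and are not the lib.h definitions.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_SHIFT 2   /* hypothetical tag width, an assumption */
#define TAG_MASK  (((intptr_t) 1 << TAG_SHIFT) - 1)

/* Sentinel built from 1: the shifted operand is positive, so the shift
   is well defined. Writing (-1 << TAG_SHIFT) instead would shift a
   negative value, which the C standard leaves undefined. */
#define SENTINEL ((intptr_t) 1 << TAG_SHIFT)

/* True if the low tag bits of word are all zero. */
int has_zero_tag(intptr_t word)
{
    return (word & TAG_MASK) == 0;
}
```

Either base leaves the low tag bits of the sentinel clear, so the change only affects whether the expression is defined by the standard.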
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This turns out to be easy to do in NFA land.
The complement of an NFA has exactly the same number
and configuration of states and transitions, except
that the states have an inverted meaning; and furthermore,
failed character transitions are routed to an extra
state (which in this implementation is permanently
allocated and shared by all regexes). The regex &
is implemented trivially using DeMorgan's.
Also, bugfix: regular expressions like A|B|C are allowed
now by the syntax, rather than constituting syntax error.
Previously, this would have been entered as (A|B)|C.
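
The De Morgan construction of & can be sketched by modelling each language as a membership predicate instead of an NFA, which is a deliberate simplification. The types, make_and and the two leaf predicates are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef bool (*pred_fn)(const char *);

typedef enum { LEAF, NOT, OR } kind_t;

typedef struct lang {
    kind_t kind;
    pred_fn test;                 /* LEAF only */
    const struct lang *left;      /* NOT and OR */
    const struct lang *right;     /* OR only */
} lang_t;

static lang_t make_node(kind_t k, pred_fn t, const lang_t *l, const lang_t *r)
{
    lang_t n;
    n.kind = k; n.test = t; n.left = l; n.right = r;
    return n;
}

/* Membership test: complement simply inverts the answer. */
bool accepts(const lang_t *l, const char *s)
{
    assert(l != NULL);
    switch (l->kind) {
    case LEAF: return l->test(s);
    case NOT:  return !accepts(l->left, s);
    case OR:   return accepts(l->left, s) || accepts(l->right, s);
    }
    return false;
}

/* A & B built as ~(~A | ~B); the caller supplies storage for 4 nodes. */
const lang_t *make_and(const lang_t *a, const lang_t *b, lang_t nodes[4])
{
    nodes[0] = make_node(NOT, NULL, a, NULL);
    nodes[1] = make_node(NOT, NULL, b, NULL);
    nodes[2] = make_node(OR, NULL, &nodes[0], &nodes[1]);
    nodes[3] = make_node(NOT, NULL, &nodes[2], NULL);
    return &nodes[3];
}

/* Two sample languages used only for demonstration. */
bool starts_with_a(const char *s) { return s[0] == 'a'; }

bool ends_with_b(const char *s)
{
    size_t n = strlen(s);
    return n > 0 && s[n - 1] == 'b';
}
```

Because the intersection is expressed purely with complement and or, no separate matching code is needed for it. In this model "ab" is accepted by the combined language while "a" and "b" are not.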
|
| |
|
|
|
|
|
|
|
|
| |
* hash.h (sethash): Declared.
* lib.c (cobj_handle): New function.
* lib.h (cobj_handle): Declared.
|
|
|
|
|
|
| |
no null pointer check over struct cobj_ops operations.
New typechecking function for COBJ objects.
|
| |
|