| Commit message | Author | Age | Files | Lines |
| |
* parser.h (parse_init): Declared.
* parser.l (parse_init): New function.
* txr.c (main): Call parse_init.
(txr_main): No need to gc-protect yyin_stream since parse_init does it.
| |
* filter.c (struct filter_par): wchar_t becomes wchli_t.
* lib.h (wchli_t): New type: an incomplete structure type,
so that a pointer to this type is incompatible with anything else.
(wli): Macro produces const wchli_t * pointer instead of
const wchar_t *.
(auto_str, static_str): Accept a const wchli_t * instead
of const wchar_t *, making it impossible to misuse these
functions by passing in a literal.
* stream.c (string_out_put_char): These type changes showed
this hack to have a bug. Confronted with the need to cast
from const wchar_t * to const wchli_t *, it's obvious that
the conversion has to be done properly with the + 1 in the
one platform case, but not the other.
* txr.c (version): Type changed to const wchli_t.
* txr.h (version): Declaration updated.
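
The incomplete-type trick described in this entry can be sketched as
follows. This is a minimal illustration, not the project's actual code:
the struct tag wli_tag and the function lit_length are hypothetical
names, and the wli definition shown here is an assumed simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Incomplete structure type: it is never defined, so a pointer to it
   is incompatible with every other object pointer type. */
typedef struct wli_tag wchli_t;   /* tag name is hypothetical */

/* Assumed simplification of the wli macro: paste the L prefix onto
   the literal and mark the result with the distinct pointer type. */
#define wli(lit) ((const wchli_t *) (L ## lit))

/* Hypothetical consumer that accepts only marked literals. */
static size_t lit_length(const wchli_t *lit)
{
  assert (lit != NULL);
  /* Converting back to the original type recovers the characters. */
  return wcslen((const wchar_t *) lit);
}
```

With these definitions, lit_length(wli("abc")) compiles cleanly, whereas
lit_length(L"abc") passes a const wchar_t * where a const wchli_t * is
expected, which the compiler must diagnose.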
| |
TODO: there should be some type safety with the new wli macro
so that if it is forgotten, there will be a diagnostic.
* configure (lit_align): New configuration variable
and configuration test. Generates LIT_ALIGN in config.h.
Fixed the integer-holds-pointer test for the different output
from the nm program on Cygwin. The arrays become common symbols
marked C which do not show an offset attribute, only size:
one less column.
* filter.c (to_html_table, from_html_table): Wrap wide string
literals with the wli macro. This must be done from now on for
all literals and initializers of arrays that are going to be
directly converted to type-tagged val-s.
* lib.h (wli): New macro.
(auto_str, static_str, litptr, lit_noex): Handle wide literals on
platforms where they are aligned to only two bytes, such that we don't
have two bits in the pointer. We can still add our 11 bit type tag, but
then when recovering the pointer to the data, we may have
to fix up the pointer.
* parser.l: Another portability issue here. Flex generates a scanner
which has #include <unistd.h> in the middle, after the source file's
own #includes which can introduce macros. On Cygwin, there is some
hygiene problem whereby our "noreturn" macro causes the <unistd.h>
header to generate bad syntax and fail to compile. Stupid Cygwin
and even stupider flex! The workaround is to include <unistd.h>
at the top in the flex source.
* stream.c (string_out_put_char): This is one more place where
the string literal handling hack spreads.
* txr.c (version): Wrap string in wli.
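
One plausible way to realize the pointer fix-up mentioned above is
sketched here. Everything in it is an assumption made for illustration:
the names tag_lit, untag_lit, lit_a and lit_b are hypothetical, "11" is
read as the binary bit pattern of a two-bit tag, 16-bit units stand in
for a two-byte wchar_t, and each literal is assumed to carry a leading
zero pad and a doubled terminator.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_MASK ((uintptr_t) 3)   /* two low bits hold the tag */
#define TAG_LIT  ((uintptr_t) 3)   /* binary 11 marks a literal */

/* Tag a pointer to the first character of a padded literal. The unit
   just before that character must be a zero pad. */
static uintptr_t tag_lit(const uint16_t *first_char)
{
  assert (first_char != 0);
  return (uintptr_t) first_char | TAG_LIT;
}

/* Recover the data pointer. Masking off the tag rounds the address
   down to a multiple of four; if that lands on the zero pad, step
   forward by one unit to reach the first character again. */
static const uint16_t *untag_lit(uintptr_t tagged)
{
  const uint16_t *p = (const uint16_t *) (tagged & ~TAG_MASK);
  return (*p == 0) ? p + 1 : p;
}

/* Example data, forced to four-byte alignment so that both
   placements of the first character can be exercised. */
static _Alignas(4) const uint16_t lit_a[] = { 0, 'h', 'i', 0, 0 };
static _Alignas(4) const uint16_t lit_b[] = { 0, 0, 'h', 'i', 0, 0 };
```

In lit_a the first character sits two bytes past a four-byte boundary,
so untagging lands on the pad and the fix-up applies. In lit_b it sits
exactly on a boundary and no fix-up is needed. The doubled terminator
keeps an empty string readable after the fix-up.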
| |
hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y,
regex.c, regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c,
unwind.h, utf8.c, utf8.h: Updated e-mail address.
| |
New eof directive.
Fixes in skip directive to work very well with eof.
Consecutive variable matching semantics improved; concept of double
variable match introduced for unbound variable followed by
regex variable.
Directives collect and coll have keyword arguments for more control
over their behavior.
Parallel directives (all, some, none, ...) are available in
horizontal mode.
New choose directive for selecting one of numerous alternatives.
GC bugfix in new filtering code.
The code has an issue compiling with GNU C++ instead of C,
which is something that is supported by this project.
Not a release-blocking issue. Not easy to fix without
restructuring some code.
* txr.c (version): Bumped.
* txr.1: Bumped version and set date.
* configure (txr_ver): Bumped.
| |
symbol variables.
(match_lines): Keyword arguments in collect implemented.
(match_init): New function.
* match.h (match_init): Declared.
* parser.l (COLLECT): Lexical syntax changed for COLLECT to
allow for argument material.
* parser.y (%union): obj renamed to val.
(exprs_opt): New nonterminal.
(collect_clause): Rewritten for arguments.
* txr.c (main): Call to match_init introduced.
| |
lib.h, match.c, match.h, parser.h, parser.l, parser.y, regex.c,
regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h: Updated copyright year.
| |
NFA or derivatives. The default behavior is NFA, with
derivatives used if the regular expression contains
uses of complement or intersection. The --dv-regex
option forces derivatives always.
| |
from now on, which is compatible with unsigned char *.
No implicit conversion to or from this type, in C or C++.
| |
gc failing to mark a local variable in txr_main.
| |
in regex module not exposed in header. Etc.
| |
Valgrind protection of free blocks. This works independently
of --gc-debug.
| |
have a _s suffix.
| |
can be converted to a type long and vice versa. The configure
script tries to detect the appropriate type to use. Also,
some run-time checking is performed in the streams module
to detect which conversion specifier strings to use for
printing numbers.
| |
a system package instead of being hacked with the $ prefix.
Keyword symbols are provided. In the matcher, evaluation
is tightened up. Keywords, nil and t are not bindable, and
errors are thrown if attempts are made to bind them.
Destructuring in dest_bind is strict in the number of items.
String streams are exploited to print bindings to objects
that are not strings or characters. Numerous bugfixes.
| |
we wouldn't have to declare object variables at all, so why
use an obtuse syntax to do so?)
| |
of standard conformance.
| |
abstraction instead of directly using C standard I/O,
to eliminate most uses of C formatted I/O,
and fix numerous bugs, such as variadic argument lists which
lack a terminating ``nao'' sentinel.
Bug 28033 is addressed by this patch, since streams no longer provide
printf-compatible formatting. The native formatter is extended with
some additional capabilities to take over.
The work on literal objects is expanded and they are now used
throughout the code base.
Fixed bad realloc in string output stream: reallocating by number
of wide chars rather than bytes.
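
The realloc bug class fixed above can be illustrated with a minimal
sketch of a growable wide-character buffer. The struct wbuf and the
function wbuf_put_char are hypothetical names, not the stream module's
actual code, and overflow checking is omitted for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical growable wide-character buffer. */
struct wbuf {
  wchar_t *data;
  size_t size;   /* capacity in wide characters, not bytes */
  size_t fill;   /* characters stored, excluding the terminator */
};

/* Appends ch and keeps the buffer null-terminated.
   Returns 1 on success, 0 if memory could not be obtained. */
static int wbuf_put_char(struct wbuf *b, wchar_t ch)
{
  assert (b != NULL);

  if (b->fill + 2 > b->size) {   /* need room for ch and terminator */
    size_t new_size = b->size ? b->size * 2 : 8;
    /* realloc takes a byte count, so the character count must be
       scaled by the element size. Passing new_size alone would
       under-allocate and lead to writes past the end of the block. */
    wchar_t *p = realloc(b->data, new_size * sizeof *p);
    if (p == NULL)
      return 0;
    b->data = p;
    b->size = new_size;
  }

  b->data[b->fill++] = ch;
  b->data[b->fill] = 0;
  return 1;
}
```

A buffer initialized as { NULL, 0, 0 } can be passed straight to
wbuf_put_char, since realloc with a null pointer behaves like malloc.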
| |
Bumped version numbers, and cleaned up trailing whitespace from some files.
| |
Test suite exercises -c now.
txr.c (txr_main): If the script specified with -c is not terminated
by a newline, just add a newline. On the shell command line, it's a
nuisance to have to add the extra line before closing the quote.
It's also awkward in scripting, because the shell (or at
least Bash 3.0) does not produce a final terminating newline in command
substitution syntax like -c "$(cat file)". The last newline in
the file is trimmed, and has to be explicitly added in the script
itself, which is wrong in the case when the file is empty.
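
The newline handling described above might look roughly like the
following sketch. The function name ensure_final_newline is
hypothetical, and giving an empty script a newline too is an assumption:
the entry does not say how the empty case is handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of script that is guaranteed to end
   in a newline, or NULL if allocation fails. The caller frees it. */
static char *ensure_final_newline(const char *script)
{
  size_t len;
  int missing;
  char *copy;

  assert (script != NULL);

  len = strlen(script);
  /* Assumption: an empty script also counts as "not terminated". */
  missing = (len == 0 || script[len - 1] != '\n');

  copy = malloc(len + missing + 1);
  if (copy == NULL)
    return NULL;

  memcpy(copy, script, len);
  if (missing)
    copy[len++] = '\n';
  copy[len] = '\0';
  return copy;
}
```

With this in place, a caller can pass the -c argument through unchanged,
whether or not the shell stripped the trailing newline during command
substitution.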
| |
semantics on the input stream to wide character input.
Also, reading a query from the command line (-c) must
read bytes from a UTF-8 encoding of the string.
We introduce a new get_byte function which can extract bytes
from streams which provide it.
| |
use wide character functions so that there is no illicit
mixing. (But the goal is to replace this usage with txr streams).
| |
so that the C library streams do the encoding. Once the program
is weaned from C library wide character stream I/O, this can go away.
| |
This is incomplete. There are too many dependencies on
wide character support from the C stream I/O library,
and implicit use of some encoding which may not be UTF-8.
The regex code does not handle wide characters properly.
Character type is still int in some places, rather than wchar_t.
Test suite passes though.
| |
is needed for pipes that terminate abnormally or return failed
termination. Pipe and stdio streams have an extra description field
so they are printed in a readable way.
| |
by the integer constant 0 rather than a proper null pointer constant.
| |
bottom is unreliable due to the unpredictable allocation order of local
variables.
| |
Regexps can be bound to variables.
New freeform directive.
| |
Lazy strings implemented, incompletely.
Changed string function to implicitly strdup; non-strdup
version changed to string_own. Fixed wrong uses of strdup
rather than chk_strdup.
Functions added to regex module to provide regex matching
as a state machine to which characters are fed.
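
The idea of matching by feeding characters to a state machine can be
shown with a toy example. This is not the regex module's interface: the
names machine, machine_reset and machine_feed are hypothetical, and the
machine is hard-wired to the single pattern a+b rather than compiled
from a regex.

```c
#include <assert.h>
#include <stddef.h>

enum feed_result { FEED_FAIL, FEED_INCOMPLETE, FEED_MATCH };

/* Hypothetical machine for the fixed pattern a+b.
   States: 0 start, 1 seen one or more a, 2 accepted, -1 dead. */
struct machine {
  int state;
};

static void machine_reset(struct machine *m)
{
  assert (m != NULL);
  m->state = 0;
}

/* Advance the machine by one character and report where it stands.
   The caller decides when to stop feeding. */
static enum feed_result machine_feed(struct machine *m, int ch)
{
  assert (m != NULL);

  switch (m->state) {
  case 0:
    m->state = (ch == 'a') ? 1 : -1;
    break;
  case 1:
    m->state = (ch == 'a') ? 1 : (ch == 'b') ? 2 : -1;
    break;
  default:
    /* Nothing may follow an accepted or dead state. */
    m->state = -1;
    break;
  }

  if (m->state == 2)
    return FEED_MATCH;
  return (m->state == -1) ? FEED_FAIL : FEED_INCOMPLETE;
}
```

The point of this shape is that the caller owns the input loop, so
characters can come from a stream one at a time without first being
collected into a string.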
|