path: root/txr.c
Commit message (Author, Age, Files, Lines)
* Version 041 [txr-041] (Kaz Kylheku, 2011-10-30, 1 file, -1/+1)
* Bugfix: prepared_error_message variable needs to be gc-protected. (Kaz Kylheku, 2011-10-26, 1 file, -1/+1)
    * parser.h (parse_init): Declared.
    * parser.l (parse_init): New function.
    * txr.c (main): Call parse_init. (txr_main): No need to gc-protect yyin_stream since parse_init does it.
* Version 040 [txr-040] (Kaz Kylheku, 2011-10-20, 1 file, -1/+1)
* Version 039 [txr-039] (Kaz Kylheku, 2011-10-10, 1 file, -1/+1)
* Following up to previous commit's TODO. (Kaz Kylheku, 2011-10-09, 1 file, -1/+1)
    * filter.c (struct filter_par): wchar_t becomes wchli_t.
    * lib.h (wchli_t): New type: an incomplete structure type, so that a pointer to this type is incompatible with anything else. (wli): Macro produces const wchli_t * pointer instead of const wchar_t *. (auto_str, static_str): Accept a const wchli_t * instead of const wchar_t *, making it impossible to misuse these functions by passing in a literal.
    * stream.c (string_out_put_char): These type changes showed this hack to have a bug. Confronted with the need to cast from const wchar_t * to const wchli_t *, it's obvious that the conversion has to be done properly with the + 1 in the one platform case, but not the other.
    * txr.c (version): Type changed to const wchli_t.
    * txr.h (version): Declaration updated.
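The incomplete-type trick described in this commit can be illustrated with a minimal sketch. The names `lit_t`, `LIT` and `lit_length` are hypothetical stand-ins for `wchli_t`, `wli` and the functions that accept wrapped literals; the alignment fix-up (the "+ 1" case) mentioned in the message is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for wchli_t: the struct is declared but never
   defined, so a pointer to it is incompatible with const wchar_t *. */
typedef struct lit_tag lit_t;

/* Hypothetical analogue of wli: pastes an L prefix onto a string
   literal and casts the result to the tagged pointer type. */
#define LIT(s) ((const lit_t *) L ## s)

/* Accepts only wrapped literals; passing a bare L"..." here makes the
   compiler issue an incompatible-pointer diagnostic. */
static size_t lit_length(const lit_t *lit)
{
    assert(lit != NULL);
    /* Recover the character data; no pointer fix-up in this sketch. */
    return wcslen((const wchar_t *) lit);
}
```

A call such as `lit_length(LIT("txr"))` compiles cleanly, while `lit_length(L"txr")` draws a diagnostic, which is the type safety the earlier TODO asked for.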
* Ported to Cygwin. (Kaz Kylheku, 2011-10-09, 1 file, -1/+1)
    TODO: there should be some type safety with the new wli macro so that if it is forgotten, there will be a diagnostic.
    * configure (lit_align): New configuration variable and configuration test. Generates LIT_ALIGN in config.h. Fixed the integer-holds-pointer test for the different output from the nm program on Cygwin. The arrays become common symbols marked C which do not show an offset attribute, only size: one less column.
    * filter.c (to_html_table, from_html_table): Wrap wide string literals with the wli macro. This must be done from now on for all literals and initializers of arrays that are going to be directly converted to type tagged val-s.
    * lib.h (wli): New macro. (auto_str, static_str, litptr, lit_noex): Handle wide literals on platforms where they are aligned to only two bytes, such that we don't have two bits in the pointer. We can still add our 11 bit type tag, but then when recovering the pointer to the data, we may have to fix up the pointer.
    * parser.l: Another portability issue here. Flex generates a scanner which has #include <unistd.h> in the middle, after the source file's own #includes which can introduce macros. On Cygwin, there is some hygiene problem whereby our "noreturn" macro causes the <unistd.h> header to generate bad syntax and fail to compile. Stupid Cygwin and even stupider flex! The workaround is to include <unistd.h> at the top in the flex source.
    * stream.c (string_out_put_char): This is one more place where the string literal handling hack spreads.
    * txr.c (version): Wrap string in wli.
* * LICENSE, Makefile, configure, filter.c, filter.h, gc.c, gc.h, hash.c, (Kaz Kylheku, 2011-10-04, 1 file, -1/+1)
    hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y, regex.c, regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Updated e-mail address.
* Version 038 (Kaz Kylheku, 2011-10-01, 1 file, -1/+1)
    New eof directive.
    Fixes in skip directive to work very well with eof.
    Consecutive variable matching semantics improved; concept of double variable match introduced for unbound variable followed by regex variable.
    Directives collect and coll have keyword arguments for more control over their behavior.
    Parallel directives (all, some, none, ...) are available in horizontal mode.
    New choose directive for selecting one of numerous alternatives.
    GC bugfix in new filtering code.
    The code has an issue compiling with GNU C++ instead of C, which is something that is supported by this project. Not a release-blocking issue. Not easy to fix without restructuring some code.
    * txr.c (version): Bumped.
    * txr.1: Bumped version and set date.
    * configure (txr_ver): Bumped.
* * match.c (mingap_k, maxgap_k, gap_k, times_k, lines_k): New (Kaz Kylheku, 2011-09-29, 1 file, -0/+1)
    symbol variables. (match_lines): Keyword arguments in collect implemented. (match_init): New function.
    * match.h (match_init): Declared.
    * parser.l (COLLECT): Lexical syntax changed for COLLECT to allow for argument material.
    * parser.y (%union): obj renamed to val. (exprs_opt): New nonterminal. (collect_clause): Rewritten for arguments.
    * txr.c (main): Call to match_init introduced.
* Version 037. (Kaz Kylheku, 2011-09-26, 1 file, -1/+1)
* * LICENSE, Makefile, configure, gc.c, gc.h, hash.c, hash.h, lib.c, (Kaz Kylheku, 2011-09-23, 1 file, -1/+1)
    lib.h, match.c, match.h, parser.h, parser.l, parser.y, regex.c, regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Updated copyright year.
* Version 036. [txr-036] (Kaz Kylheku, 2011-09-22, 1 file, -1/+1)
* Version 035. (Kaz Kylheku, 2010-10-05, 1 file, -1/+1)
* Bump copyrights to 2010. (Kaz Kylheku, 2010-10-05, 1 file, -1/+1)
* Version 034. [txr-034] (Kaz Kylheku, 2010-02-28, 1 file, -1/+1)
* Version 033. (Kaz Kylheku, 2010-01-26, 1 file, -1/+1)
* Version 032. (Kaz Kylheku, 2010-01-25, 1 file, -1/+1)
* Version 031. [txr-031] (Kaz Kylheku, 2010-01-25, 1 file, -1/+1)
* Version 030. [txr-030] (Kaz Kylheku, 2010-01-19, 1 file, -1/+1)
* Version 029. (Kaz Kylheku, 2010-01-18, 1 file, -1/+1)
* Version 028. (Kaz Kylheku, 2010-01-16, 1 file, -1/+1)
* Dynamically determine which regex implementation to use: (Kaz Kylheku, 2010-01-13, 1 file, -0/+4)
    NFA or derivatives. The default behavior is NFA, with derivatives used if the regular expression contains uses of complement or intersection. The --dv-regex option forces derivatives always.
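The selection rule stated in this commit can be sketched as a walk over a regex syntax tree. This is only an illustration under assumptions: the node layout, the operator names and `choose_engine` are hypothetical and are not taken from the regex module; `force_dv` stands for the effect of the --dv-regex option.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical regex syntax tree; not the project's actual types. */
enum re_op { RE_CHAR, RE_CAT, RE_OR, RE_STAR, RE_COMPL, RE_AND };

struct re_node {
    enum re_op op;
    const struct re_node *left;   /* first operand, or NULL */
    const struct re_node *right;  /* second operand, or NULL */
};

enum re_engine { ENGINE_NFA, ENGINE_DERIVATIVES };

/* True if complement or intersection occurs anywhere in the tree. */
static int uses_compl_or_and(const struct re_node *n)
{
    if (n == NULL)
        return 0;
    if (n->op == RE_COMPL || n->op == RE_AND)
        return 1;
    return uses_compl_or_and(n->left) || uses_compl_or_and(n->right);
}

/* NFA by default; derivatives when forced or when the expression
   needs operators the NFA construction does not cover. */
static enum re_engine choose_engine(const struct re_node *re, int force_dv)
{
    assert(re != NULL);
    if (force_dv || uses_compl_or_and(re))
        return ENGINE_DERIVATIVES;
    return ENGINE_NFA;
}
```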
* Eliminate the void * disease. Generic pointers are of mem_t * (Kaz Kylheku, 2009-12-04, 1 file, -1/+1)
    from now on, which is compatible with unsigned char *. No implicit conversion to or from this type, in C or C++.
* Version 027. [txr-027] (Kaz Kylheku, 2009-12-03, 1 file, -1/+1)
* Fix for failing test suite on MIPS machine, due to (Kaz Kylheku, 2009-12-03, 1 file, -2/+2)
    gc failing to mark a local variable in txr_main.
* Code cleanup. All private functions static. Private stuff (Kaz Kylheku, 2009-11-28, 1 file, -4/+4)
    in regex module not exposed in header. Etc.
* Version 026. [txr-026] (Kaz Kylheku, 2009-11-26, 1 file, -1/+1)
* More Valgrind support. New option --vg-debug which turns on (Kaz Kylheku, 2009-11-25, 1 file, -0/+11)
    Valgrind protection of free blocks. This works independently of --gc-debug.
* Version 025 (Kaz Kylheku, 2009-11-24, 1 file, -1/+1)
* Renaming global variables that denote symbols, such that they (Kaz Kylheku, 2009-11-24, 1 file, -2/+2)
    have a _s suffix.
* Improving portability. It is no longer assumed that pointers (Kaz Kylheku, 2009-11-23, 1 file, -0/+1)
    can be converted to a type long and vice versa. The configure script tries to detect the appropriate type to use. Also, some run-time checking is performed in the streams module to detect which conversion specifier strings to use for printing numbers.
* Introducing symbol packages. Internal symbols are now in (Kaz Kylheku, 2009-11-21, 1 file, -3/+5)
    a system package instead of being hacked with the $ prefix. Keyword symbols are provided. In the matcher, evaluation is tightened up. Keywords, nil and t are not bindable, and errors are thrown if attempts are made to bind them. Destructuring in dest_bind is strict in the number of items. String streams are exploited to print bindings to objects that are not strings or characters. Numerous bugfixes.
* Changing ``obj_t *'' occurrences to a ``val'' typedef. (Ideally, (Kaz Kylheku, 2009-11-20, 1 file, -24/+24)
    we wouldn't have to declare object variables at all, so why use an obtuse syntax to do so?)
* Version 024. [txr-024] (Kaz Kylheku, 2009-11-19, 1 file, -1/+1)
* Version 023. (Kaz Kylheku, 2009-11-18, 1 file, -1/+1)
* More removal of C99 wide character I/O, and tightening up (Kaz Kylheku, 2009-11-17, 1 file, -2/+1)
    of standard conformance.
* Version 022. [txr-022] (Kaz Kylheku, 2009-11-17, 1 file, -1/+1)
* Big round of changes to switch the code base to use the stream (Kaz Kylheku, 2009-11-16, 1 file, -66/+73)
    abstraction instead of directly using C standard I/O, to eliminate most uses of C formatted I/O, and fix numerous bugs, such as variadic argument lists which lack a terminating ``nao'' sentinel.
    Bug 28033 is addressed by this patch, since streams no longer provide printf-compatible formatting. The native formatter is extended with some additional capabilities to take over.
    The work on literal objects is expanded and they are now used throughout the code base.
    Fixed bad realloc in string output stream: reallocating by number of wide chars rather than bytes.
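The realloc bug named at the end of this message is a common one: the size argument of realloc is in bytes, so an element count must be multiplied by the element size. A minimal sketch of a growable wide-character buffer follows; `struct wbuf` and `wbuf_put` are hypothetical names, not the string output stream's actual code.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical growable wide string; size and fill count wchar_t
   elements, not bytes. */
struct wbuf {
    wchar_t *data;
    size_t size;
    size_t fill;
};

/* Appends one character, keeping the buffer null-terminated.
   Returns 1 on success, 0 if memory could not be obtained. */
static int wbuf_put(struct wbuf *b, wchar_t ch)
{
    assert(b != NULL);
    if (b->fill + 1 >= b->size) {
        size_t new_size = b->size ? b->size * 2 : 16;
        /* realloc takes bytes: scale the element count by the element
           size. Passing new_size alone is the kind of bug described. */
        wchar_t *p = realloc(b->data, new_size * sizeof *p);
        if (p == NULL)
            return 0;
        b->data = p;
        b->size = new_size;
    }
    b->data[b->fill++] = ch;
    b->data[b->fill] = L'\0';
    return 1;
}
```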
* Version 021 preparation. [txr-021] (Kaz Kylheku, 2009-11-15, 1 file, -2/+2)
    Bumped version numbers, and cleaned up trailing whitespace from some files.
* Allow -c scripts to not have a trailing newline. (Kaz Kylheku, 2009-11-13, 1 file, -0/+3)
    Test suite exercises -c now.
    txr.c (txr_main): If the script specified with -c is not terminated by a newline, just add a newline. On the shell command line, it's a nuisance to have to add the extra line before closing the quote. It's also awkward in scripting, because the shell (or at least Bash 3.0) does not produce a final terminating newline in command substitution syntax like -c "$(cat file)". The last newline in the file is trimmed, and has to be explicitly added in the script itself, which is wrong in the case when the file is empty.
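The behavior described for -c can be sketched as a small helper that copies the script text and appends a newline only when one is missing. The function name is hypothetical and this is not the code in txr_main. Treating an empty script as lacking a newline is an assumption here, since the message does not spell out that case.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of script that ends in '\n'.
   The caller frees the result. Returns NULL if allocation fails.
   Assumption: an empty script also receives a newline. */
static char *with_trailing_newline(const char *script)
{
    size_t len;
    int add_nl;
    char *copy;

    assert(script != NULL);
    len = strlen(script);
    add_nl = (len == 0 || script[len - 1] != '\n');

    copy = malloc(len + add_nl + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, script, len);
    if (add_nl)
        copy[len++] = '\n';
    copy[len] = '\0';
    return copy;
}
```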
* Previous commit broke UTF-8 lexing, by changing the get_char (Kaz Kylheku, 2009-11-13, 1 file, -1/+1)
    semantics on the input stream to wide character input. Also, reading a query from the command line (-c) must read bytes from a UTF-8 encoding of the string. We introduce a new get_byte function which can extract bytes from streams which provide it.
* Continuing wchar_t conversion. Making sure all stdio calls (Kaz Kylheku, 2009-11-12, 1 file, -61/+62)
    use wide character functions so that there is no illicit mixing. (But the goal is to replace this usage with txr streams).
* Whitespace. (Kaz Kylheku, 2009-11-12, 1 file, -0/+1)
* * txr.c (main): call setlocale to set the LC_CTYPE to en_US.UTF-8, (Kaz Kylheku, 2009-11-11, 1 file, -0/+2)
    so that the C library streams do the encoding. Once the program is weaned from C library wide character stream I/O, this can go away.
* Big conversion to wide characters and UTF-8 support. (Kaz Kylheku, 2009-11-11, 1 file, -33/+34)
    This is incomplete. There are too many dependencies on wide character support from the C stream I/O library, and implicit use of some encoding which may not be UTF-8. The regex code does not handle wide characters properly. Character type is still int in some places, rather than wchar_t. Test suite passes though.
* Throw exception on stream error during close, or I/O operations. This (Kaz Kylheku, 2009-11-06, 1 file, -2/+2)
    is needed for pipes that terminate abnormally or return failed termination. Pipe and stdio streams have an extra description field so they are printed in a readable way.
* Version 020. [txr-020] (Kaz Kylheku, 2009-11-01, 1 file, -1/+1)
* Bug ID 27895: Calls to protect have an argument list terminated (Kaz Kylheku, 2009-11-01, 1 file, -2/+2)
    by the integer constant 0 rather than a proper null pointer constant.
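The hazard behind this bug is that a bare 0 in a variadic argument list is passed as an int, while the callee reads a pointer with va_arg; where the two types differ in size or representation, the behavior is undefined. A minimal sketch, using a hypothetical `register_roots` function rather than the project's protect, shows the safe calling pattern:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

#define MAX_ROOTS 32

/* Hypothetical table of registered locations. */
static int *root_table[MAX_ROOTS];
static int root_count = 0;

/* Registers each pointer argument until a null pointer is seen.
   Returns the number of pointers registered by this call. */
static int register_roots(int *first, ...)
{
    va_list ap;
    int *p;
    int added = 0;

    va_start(ap, first);
    for (p = first; p != NULL; p = va_arg(ap, int *)) {
        assert(root_count < MAX_ROOTS);
        root_table[root_count++] = p;
        added++;
    }
    va_end(ap);
    return added;
}
```

The list should be ended with a typed null pointer, as in `register_roots(&a, &b, (int *) 0)`. Writing `register_roots(&a, &b, 0)` passes an int, which is the pattern this commit removes.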
* Bug ID 27899: Garbage collection problem: method of locating stack (Kaz Kylheku, 2009-11-01, 1 file, -4/+10)
    bottom is unreliable due to the unpredictable allocation order of local variables.
* Version 019 [txr-019] (Kaz Kylheku, 2009-11-03, 1 file, -1/+1)
    Regexps can be bound to variables. New freeform directive.