path: root/ChangeLog
Commit message  Author  Age  Files  Lines
* Improved support for broken unicode.  Kaz Kylheku  2011-10-10  1  -0/+26
| Regex support for extra-large character sets is not compiled in if wchar_t is not wide enough for it. The UTF-8 code now throws exceptions when encountering characters that it cannot represent, instead of silently ignoring the situation and continuing with incorrectly computed data. * regex.c (FULL_UNICODE): New macro. (CHAR_SET_L3, CHAR_SET_L2_LO, CHAR_SET_L2_HI): Only defined if full unicode is available. (CHSET_XLARGE, cset_L3_t, struct xlarge_char_set, L2_full, L3_fill_range, L3_contains): Ditto. (union char_set): Member x1 present only under FULL_UNICODE. (char_set_destroy, char_set_add, char_set_add_range, char_set_contains): CHSET_XLARGE cases only available on FULL_UNICODE. (char_set_compile): Default cst variable to CHSET_LARGE. * utf8.c (FULL_UNICODE): New macro. (conversion_error): New function. (utf8_from_uc): Throw error if not FULL_UNICODE and character is outside the BMP. (utf8_decode): Likewise.
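The decoding behavior described in this entry can be illustrated with a minimal sketch. This is not the code in utf8.c: the name `sketch_utf8_decode` and the run-time `full_unicode` flag (standing in for the build-time FULL_UNICODE macro) are assumptions made for illustration, and a return value of 0 stands in for the exception that txr throws.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not txr's utf8.c. Decodes one UTF-8 sequence from a
 * NUL-terminated buffer. Stores the code point in *out and returns the number
 * of bytes consumed, or 0 to signal a conversion error. Overlong-form and
 * surrogate checks are omitted for brevity. */
static size_t sketch_utf8_decode(const unsigned char *src, int full_unicode,
                                 unsigned long *out)
{
  unsigned char c = src[0];
  unsigned long cp;
  size_t len, i;

  if (c < 0x80) {
    *out = c;
    return 1;
  } else if ((c & 0xE0) == 0xC0) {
    len = 2; cp = c & 0x1F;
  } else if ((c & 0xF0) == 0xE0) {
    len = 3; cp = c & 0x0F;
  } else if ((c & 0xF8) == 0xF0) {
    if (!full_unicode)
      return 0;            /* outside the BMP: a narrow wchar_t cannot hold it */
    len = 4; cp = c & 0x07;
  } else {
    return 0;              /* stray continuation byte or invalid lead byte */
  }

  for (i = 1; i < len; i++) {
    if ((src[i] & 0xC0) != 0x80)
      return 0;            /* truncated sequence; also stops at the NUL */
    cp = (cp << 6) | (src[i] & 0x3F);
  }

  *out = cp;
  return len;
}

/* Usage: the same input succeeds or fails depending on the build flavor. */
static void sketch_utf8_demo(void)
{
  static const unsigned char euro[] = { 0xE2, 0x82, 0xAC, 0 };
  static const unsigned char beyond_bmp[] = { 0xF0, 0x9F, 0x98, 0x80, 0 };
  unsigned long cp = 0;

  assert(sketch_utf8_decode(euro, 0, &cp) == 3 && cp == 0x20AC);
  assert(sketch_utf8_decode(beyond_bmp, 1, &cp) == 4 && cp == 0x1F600);
  assert(sketch_utf8_decode(beyond_bmp, 0, &cp) == 0);  /* reported, not ignored */
}
```

The point of the sketch is the early `return 0`: the failure is surfaced to the caller instead of a truncated value being stored.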
* * HACKING: Documented portability hacks for narrow wchar_t.  Kaz Kylheku  2011-10-10  1  -0/+4
|
* Version 039  (tag: txr-039)  Kaz Kylheku  2011-10-10  1  -0/+52
|
* One more swing at this with the axe.  Kaz Kylheku  2011-10-09  1  -0/+9
| | | | | | | * lib.h (wini, wref): New macros. * stream.c (string_out_put_char): Rewritten with macros to eliminate preprocessor #if test.
* * lib.h (wli, lit_noex): We need null characters on both ends  Kaz Kylheku  2011-10-09  1  -0/+8
| so that this hack is correct for null strings. When recovering the wchar_t pointer from a null literal object, we will increment unconditionally, since it always points to a null character. We end up skipping past null terminator #1, but safely landing on #2.
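One plausible reading of this entry can be sketched as follows. The macro `SKETCH_PADDED` and the function `sketch_recover` are hypothetical, and the recovery rule (skip one character whenever the pointer lands on a null) is an assumption drawn from the description above, not a copy of lib.h.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical sketch: a wide literal padded with a null character on both
 * ends. The stored pointer may refer either to the leading pad or to the
 * first real character, depending on how it had to be adjusted. */
#define SKETCH_PADDED(lit) (L"\0" lit L"\0")

/* Assumed recovery rule: a pointer that lands on a null is advanced by one. */
static const wchar_t *sketch_recover(const wchar_t *p)
{
  return (*p == 0) ? p + 1 : p;
}

static void sketch_padded_demo(void)
{
  const wchar_t *abc = SKETCH_PADDED(L"abc");   /* \0 a b c \0 \0 */
  const wchar_t *empty = SKETCH_PADDED(L"");    /* \0 \0 \0 */

  /* Non-empty text: both candidate pointers recover the same string. */
  assert(wcscmp(sketch_recover(abc), L"abc") == 0);
  assert(wcscmp(sketch_recover(abc + 1), L"abc") == 0);

  /* Empty text: the pointer always sees a null, so it is always advanced.
   * From index 1 it skips terminator #1 and lands on #2, which exists only
   * because of the trailing pad. */
  assert(wcscmp(sketch_recover(empty), L"") == 0);
  assert(wcscmp(sketch_recover(empty + 1), L"") == 0);
}
```

Without the trailing pad, the last case would read one element past the end of the array.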
* Following up to previous commit's TODO.  Kaz Kylheku  2011-10-09  1  -0/+24
| | | | | | | | | | | | | | | | | | | | | | * filter.c (struct filter_par): wchar_t becomes wchli_t. * lib.h (wchli_t): New type: an incomplete structure type, so that a pointer to this type is incompatible with anything else. (wli): Macro produces const wchli_t * pointer instead of const wchar_t *. (auto_str, static_str): Accept a const wchli_t * instead of const wchar_t *, making it impossible to misuse these functions by passing in a literal. * stream.c (string_out_put_char): These type changes showed this hack to have a bug. Confronted with the need to cast from const wchar_t * to const wchli_t *, it's obvious that the conversion has to be done properly with the + 1 in the one platform case, but not the other. * txr.c (version): Type changed to const wchli_t. * txr.h (version): Declaration updated.
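The type-safety technique named in this entry, a pointer to an incomplete structure type, can be sketched in a few lines. All names prefixed with `sketch_` or `SKETCH_` are hypothetical, and the macro is simplified: it omits the padding and pointer adjustment that the real wli macro performs on some platforms.

```c
#include <assert.h>
#include <wchar.h>

/* Declared but never defined. No other pointer type converts to
 * "const sketch_wchli_t *" without an explicit cast. */
typedef struct sketch_wchli sketch_wchli_t;

/* The one sanctioned way to produce the marked pointer type. */
#define SKETCH_WLI(lit) ((const sketch_wchli_t *) (lit))

/* Accepts only marked literals and hands back the underlying text. */
static const wchar_t *sketch_static_str(const sketch_wchli_t *marked)
{
  return (const wchar_t *) marked;
}

static void sketch_wli_demo(void)
{
  const wchar_t *s = sketch_static_str(SKETCH_WLI(L"abc"));
  assert(wcscmp(s, L"abc") == 0);

  /* sketch_static_str(L"abc");
   * Forgetting the macro passes a const wchar_t *, which is an incompatible
   * pointer type here and draws a compiler diagnostic. */
}
```

Because the structure is never completed, the marked pointer cannot be dereferenced or indexed by accident; it can only be passed along or cast back.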
* Ported to Cygwin.  Kaz Kylheku  2011-10-09  1  -0/+39
| TODO: there should be some type safety with the new wli macro so that if it is forgotten, there will be a diagnostic. * configure (lit_align): New configuration variable and configuration test. Generates LIT_ALIGN in config.h. Fixed the integer-holds-pointer test for the different output from the nm program on Cygwin. The arrays become common symbols marked C which do not show an offset attribute, only size: one less column. * filter.c (to_html_table, from_html_table): wrap wide string literals with the wli macro. This must be done from now on for all literals and initializers of arrays that are going to be directly converted to type tagged val-s. * lib.h (wli): New macro. (auto_str, static_str, litptr, lit_noex): Handle wide literals on platforms where they are aligned to only two bytes, such that we don't have two bits in the pointer. We can still add our 11 bit type tag, but then when recovering the pointer to the data, we may have to fix up the pointer. * parser.l: Another portability issue here. Flex generates a scanner which has #include <unistd.h> in the middle, after the source file's own #includes which can introduce macros. On Cygwin, there is some hygiene problem whereby our "noreturn" macro causes the <unistd.h> header to generate bad syntax and fail to compile. Stupid Cygwin and even stupider flex! The workaround is to include <unistd.h> at the top in the flex source. * stream.c (string_out_put_char): This is one more place where the string literal handling hack spreads. * txr.c (version): Wrap string in wli.
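Why two-byte alignment is a problem for a tag kept in the low bits of a pointer can be shown with a small sketch. The names and the tag value below are assumptions for illustration only; the union is just a portable way to obtain a 4-byte-aligned wide string on common platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <wchar.h>

#define SKETCH_TAG_MASK ((uintptr_t) 3)  /* the two low bits of the word */
#define SKETCH_TAG_LIT  ((uintptr_t) 3)  /* hypothetical tag for literals */

/* A tag fits only if the two low address bits are free (4-byte alignment). */
static int sketch_has_room_for_tag(uintptr_t addr)
{
  return (addr & SKETCH_TAG_MASK) == 0;
}

static uintptr_t sketch_tag_lit(const wchar_t *text)
{
  return (uintptr_t) text | SKETCH_TAG_LIT;
}

static int sketch_is_lit(uintptr_t value)
{
  return (value & SKETCH_TAG_MASK) == SKETCH_TAG_LIT;
}

static const wchar_t *sketch_untag(uintptr_t value)
{
  return (const wchar_t *) (value & ~SKETCH_TAG_MASK);
}

/* The uint32_t member forces 4-byte alignment on common platforms. */
static const union { wchar_t text[4]; uint32_t force_align; }
  sketch_aligned = { L"abc" };

static void sketch_tag_demo(void)
{
  uintptr_t v;

  assert(sketch_has_room_for_tag((uintptr_t) sketch_aligned.text));
  v = sketch_tag_lit(sketch_aligned.text);
  assert(sketch_is_lit(v));
  assert(wcscmp(sketch_untag(v), L"abc") == 0);

  /* An address such as 0x1002 is 2-byte aligned only: bit 1 is already set,
   * so tagging and then masking would yield 0x1000, the wrong address. That
   * is the case in which the recovered pointer needs a fix-up. */
  assert(!sketch_has_room_for_tag((uintptr_t) 0x1002));
}
```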
* * dep.mk: Regenerated. Too easy to neglect this file.  Kaz Kylheku  2011-10-09  1  -0/+4
|
* * match.c (vars_to_bindings): Regression fix: recent commit  Kaz Kylheku  2011-10-09  1  -0/+6
| | | | | caused test failure. An empty list not treated as a valid collect variable list.
* * configure: Fixed indentation.  Kaz Kylheku  2011-10-09  1  -0/+4
|
* * txr.1: Removed references to obsolete @(next) variant.  Kaz Kylheku  2011-10-08  1  -0/+4
|
* * match.c (vars_to_bindings): New function.  Kaz Kylheku  2011-10-08  1  -0/+7
| | | | | | (match_line): keyword argument :vars implemented for coll. * txr.1: Documented :vars.
* * match.c (vars_k): New symbol variable.  Kaz Kylheku  2011-10-08  1  -0/+6
| | | | | (match_files): Implemented :vars in collect. (match_init): New symbol variable initialized.
* * txr.1: Augment example of @/.*/ being used to skip to the  Kaz Kylheku  2011-10-08  1  -0/+6
| | | | | end of the line with @(skip) which is now better style, since it avoids reaching for regexes.
* * match.c (match_line): Skip directive bugfix. If skip is the  Kaz Kylheku  2011-10-08  1  -0/+6
| | | | | last item on the line, it must match the whole line by returning success.
* * match.c (mintimes_k, maxtimes_k): New keyword variables.  Kaz Kylheku  2011-10-08  1  -0/+10
| | | | | | | | | (match_line): Implemented :mintimes and :maxtimes, changing the semantics of :times. (match_files): Likewise. (match_init): New keyword variables initialized. * txr.1: Updated.
* * HACKING: Formatting.  Kaz Kylheku  2011-10-08  1  -0/+4
|
* * match.c (match_files): Fixed spectacular bug in function calling,  Kaz Kylheku  2011-10-07  1  -0/+9
| | | | | | | | dating back to before October 2009 when txr was put into git. Basically, unbound variables were not handled right after the function return, due to the increment step being wrongly written as ``piter = cdr(aiter)'' in the for loop that processes the ub_p_a_pairs. Evil cut and paste!
* * match.c (greedy_k): New keyword symbol variable.  Kaz Kylheku  2011-10-07  1  -0/+9
| | | | | | | | (match_line): Greedy skip implemented. (match_files): Likewise. (match_init): New keyword symbol variable initialized. * txr.1: Updated.
* * lib.c (eol_s): New symbol variable.  Kaz Kylheku  2011-10-07  1  -0/+14
| (obj_init): New variable initialized. * lib.h (eol_s): Declared. * match.c (match_line): Implemented horizontal skip and a new eol directive. (match_lines): Vertical skip defers to horizontal skip if there is trailing material. * txr.1: Updated.
* * lib.c (flatten_helper): Function removed.  Kaz Kylheku  2011-10-07  1  -0/+5
| | | | (flatten): Recurse directly, using func_n1.
* * txr.1: fixed wrong word.  Kaz Kylheku  2011-10-07  1  -0/+4
|
* Extending syntax to allow for @VAR and @(...) forms inside  Kaz Kylheku  2011-10-06  1  -0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | nested lists. This is in anticipation of future features. * lib.c (expr_s): New symbol variable. (obj_init): expr_s initialized. * lib.h (expr_s): Declared. * match.c (dest_bind): Now takes linenum. Tests for the meta-syntax denoted by the system symbols var_s and expr_s, and throws an error. (eval_form): Similar error checks added. Also, hack: do not add file and line number to an exception which begins with a '(' character; just re-throw it. This suppresses duplicate line number addition when this throw occurs across some nestings. (match_files): Updated calls to dest_bind. * parser.l (yybadtoken): Handle new token kind, METAVAR and METAPAR. (grammar): Refactoring among patterns: TOK broken into SYM and NUM, NTOK introduced, unused NUM_END removed. Rule for @( producing METAPAR in nested state. * parser.y (METAVAR, METAPAR): New tokens. (meta_expr): New nonterminal. (expr): meta_expr and META_VAR productions handled.
* Renaming the currying combinators according to new scheme.  Kaz Kylheku  2011-10-06  1  -0/+12
| | | | | | | | | | * lib.c (bind2): Function renamed to curry_12_2. (bind2other): Function renamed to curry_12_1. (do_bind_2, do_bind2other): Helpers renamed likewise. (tree_find): Follows rename of bind2. * match.c (match_files): deffilter code follows bind2 rename to curry_12_2.
* * lib.c (funcall3, curry_123_2): New functions.  Kaz Kylheku  2011-10-06  1  -0/+23
| (do_curry_123_2): New static function. * lib.h (funcall3, curry_123_2): Declared. * match.c (subst_vars): Bugfix: throw error on unbound variable instead of ignoring the situation. This bug caused unbound variables in quasiliterals to be silently ignored. (eval_form): Function changed to three argument form, so that it takes a line number for reporting errors. Restructured to catch the new unbound variable exception from subst_vars, and re-throw it with a line number. Also, throws exception now instead of returning nil if it itself detects an unbound variable. Uses of eval_form no longer have to test the return value for nil, but just assume it worked. (match_lines): Currying calls to eval form updated to use curry_123_2. Test of eval return value eliminated. In function calls, eval isn't used for reducing symbol arguments to values, because it now throws in the unbound case, and it's not worth setting up a catch for this. Instead, assoc is used directly.
* * match.c (match_files): In function calls, the deletion of  Kaz Kylheku  2011-10-05  1  -0/+6
| | | | | the unbound variable from the argument list can be done with a destructive operation since that list is a copy.
* * LICENSE, Makefile, configure, filter.c, filter.h, gc.c, gc.h, hash.c,  Kaz Kylheku  2011-10-04  1  -0/+7
| | | | | | hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y, regex.c, regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Updated e-mail address.
* * match.c (match_line, match_files): Another correction to how bindings  Kaz Kylheku  2011-10-04  1  -0/+14
| | | | | | | | | | | | | are handled in collect/coll. New bindings from the main clause and last clause must override old bindings. This is done by some additional set difference operations based on symbol identity. Otherwise it is possible to end up with multiple bindings for the same symbol, which is untidy. If the collect clause scrubs a variable with forget and re-binds it, then combining that environment with the previous bindings will create a duplicate. Also, fixed a serious bug with the bindings from the last clause; the append was wrongly put into the loop that processes the collected lists.
* * lib.c (acons): New function.  Kaz Kylheku  2011-10-04  1  -0/+18
| | | | | | | | | | | | | | | | | (set_diff): Optimize common case: list1 and list2 are the same, or list2 is substructure of list1. Situations in which this won't be the case for variable bindings are rare. * lib.h (acons): Declared. * match.c (match_line): Use acons rather than acons_new, when binding variables that we know are new (the symbol is unbound). When computing the set difference over bindings, use cons cell equality, rather than symbol equality. Symbol equality is wrong because a binding can be removed, and then a new binding can be introduced using the same symbol. This must be treated as a different binding.
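The difference between symbol equality and cell equality in the set difference can be demonstrated with a minimal sketch. Everything here is hypothetical: bindings are plain structs referenced from arrays rather than txr cons cells, and `sketch_set_diff` is a naive quadratic stand-in, not the set_diff in lib.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a binding cell: a symbol name and a value. */
struct sketch_binding { const char *sym; int val; };
typedef const struct sketch_binding *sketch_ref;
typedef int (*sketch_eq_fn)(sketch_ref, sketch_ref);

/* Identity of the cell itself. */
static int sketch_same_cell(sketch_ref a, sketch_ref b)
{
  return a == b;
}

/* Equality of the bound symbol only. */
static int sketch_same_symbol(sketch_ref a, sketch_ref b)
{
  return strcmp(a->sym, b->sym) == 0;
}

/* Copies into out every element of list1 that has no match in list2 under
 * eq, and returns how many were copied. */
static size_t sketch_set_diff(const sketch_ref *list1, size_t n1,
                              const sketch_ref *list2, size_t n2,
                              sketch_eq_fn eq, sketch_ref *out)
{
  size_t i, j, count = 0;

  for (i = 0; i < n1; i++) {
    int found = 0;
    for (j = 0; j < n2 && !found; j++)
      found = eq(list1[i], list2[j]);
    if (!found)
      out[count++] = list1[i];
  }
  return count;
}

static void sketch_set_diff_demo(void)
{
  struct sketch_binding old_x = { "x", 1 }, y = { "y", 2 }, new_x = { "x", 3 };
  sketch_ref before[2], after[2], fresh[2];

  /* x is forgotten and then bound again, so "after" holds a different cell
   * for the same symbol. */
  before[0] = &old_x; before[1] = &y;
  after[0] = &new_x;  after[1] = &y;

  /* Symbol equality misses the re-binding entirely. */
  assert(sketch_set_diff(after, 2, before, 2, sketch_same_symbol, fresh) == 0);

  /* Cell identity reports the new binding of x. */
  assert(sketch_set_diff(after, 2, before, 2, sketch_same_cell, fresh) == 1);
  assert(fresh[0] == &new_x);
}
```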
* Bugfixes to the semantics of binding environments, which  Kaz Kylheku  2011-10-04  1  -0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | were broken in the face of deletions (local, forget). For some stupid reason, I had written a destructive routine for removing elements from an association list, and used it as the basis for the local and forget directives. * lib.c (eq_f, car_f): New variables. (identity_tramp, equal_tramp): Obsolete functions removed. (apply): Broken function disabled at run time. (funcall, funcall1, funcall2): Throw meaningful error instead of aborting. (alist_remove_test): New static function. (alist_remove, alist_remove1): Rewritten to be functional rather than destructive. (alist_nremove, alist_nremove1): Destructive functions, using previous implementations of alist and alist_nremove. (do_sort): Recurses directly rather than via sort. That was probably why this helper was introduced! (find, set_diff): New functions. (obj_init): gc-protect new variables eq_f and car_f, and initialize them. Initializations for equal_f and identity_f changed to use equal and identity directly, without the obsolete wrappers. * lib.h (eq_f, car_f, alist_nremove, alist_nremove1, find, set_diff): Declared. * match.c (match_line): Use set_diff to determine what bindings are new, rather than ldiff and ldiff-like logic which break when the new bindings do not share structure with the old. (match_files): Likewise.
* * txr.1: Started documenting the forgotten merge directive.  Kaz Kylheku  2011-10-03  1  -0/+4
|
* Implemented new last clause for collect and coll.  Kaz Kylheku  2011-10-03  1  -0/+21
| | | | | | | | | | | | | | | | | | | | Bugfix in cases inside coll: was not collecting bindings. Bugfix for until inside coll: was not seeing bindings from main clause. * lib.c (ldiff): New function. * lib.h (ldiff): Declared. * match.c (match_line): Implemented last clause. Fixed cases handling by moving misplaced termination check. (match_files): Implemented last clause. * parser.y (until_last): New nonterminal symbol. (collect_clause): Refactored syntax to support until and last. (elem): Likewise. * txr.1: Updated.
* * parser.y (rep_elem): Bugfix: forgotten o_elems_transform on  (tag: txr-038)  Kaz Kylheku  2011-10-02  1  -0/+6
| | | | | syntax tree of o_elems constituent, leading to problems with consecutive variables in a @(rep).
* * match.c (match_line): Handle trailer_s directive.  Kaz Kylheku  2011-10-02  1  -0/+7
| (match_files): Remove check against trailer_s not having trailing material. If it doesn't, it's a vertical directive processed here, otherwise leave it alone so match_line processes it.
* Compiles as C++ again.  Kaz Kylheku  2011-10-02  1  -0/+13
| * lib.h (cons_set): New macro. * match.c (match_line, match_files): In collect clause handlers, move variable declarations above goto, and initialize with cons_set, instead of declaring and initializing with cons_bind. This eliminates the stupid C++ error that goto skips a variable initialization (which happens even when it can be trivially proven that the variable has no next use at the goto site!)
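The pattern described here, hoisting declarations above the goto and assigning afterwards, can be sketched as follows. The two macros are hypothetical analogues of cons_bind and cons_set that operate on a two-element int array instead of a cons cell.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical analogue of cons_bind: declares and initializes two
 * variables. A goto that jumps over this into the variables' scope is an
 * error in C++. Shown for contrast only; it is not used below. */
#define SKETCH_PAIR_BIND(a, b, p) int a = (p)[0]; int b = (p)[1]

/* Hypothetical analogue of cons_set: assigns to variables that were
 * declared earlier, so no initialization can be skipped. */
#define SKETCH_PAIR_SET(a, b, p) do { (a) = (p)[0]; (b) = (p)[1]; } while (0)

static int sketch_sum_pair(const int *pair)
{
  int first, second;            /* declarations hoisted above the goto */

  if (pair == NULL)
    goto fail;                  /* jumps over assignments only */

  SKETCH_PAIR_SET(first, second, pair);
  return first + second;

fail:
  return -1;
}

static void sketch_goto_demo(void)
{
  int pair[2] = { 2, 3 };

  assert(sketch_sum_pair(pair) == 5);
  assert(sketch_sum_pair(NULL) == -1);
}
```

Written this way, the function is accepted by both C and C++ compilers, since C++ only objects to a jump that bypasses an initialization.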
* Spelling.  Kaz Kylheku  2011-10-01  1  -1/+1
|
* Version 038  Kaz Kylheku  2011-10-01  1  -0/+33
| New eof directive. Fixes in skip directive to work very well with eof. Consecutive variable matching semantics improved; concept of double variable match introduced for unbound variable followed by regex variable. Directives collect and coll have keyword arguments for more control over their behavior. Parallel directives (all, some, none, ...) are available in horizontal mode. New choose directive for selecting one of numerous alternatives. GC bugfix in new filtering code. The code has an issue compiling with GNU C++ instead of C, which is something that is supported by this project. Not a release-blocking issue. Not easy to fix without restructuring some code. * txr.c (version): Bumped. * txr.1: Bumped version and set date. * configure (txr_ver): Bumped.
* Maintaining C++ compiling (except for two issues that will  Kaz Kylheku  2011-10-01  1  -0/+17
| | | | | | | | | | | | | | | | need another commit). * filter.c: Include "gc.h" for prototype of protect. (struct filter_pair): Use const wchar_t *, so we can assign literals. (html_hex_continue): Ditto. * lib.c (and): Function renamed to andf, since and is a C++ operator. * lib.h (and): Declaration renamed. * match.c (match_files): Use of and updated to andf.
* HACKING: Clarified that --vg-debug is also needed to turn on  Kaz Kylheku  2011-10-01  1  -0/+5
| | | | the Valgrind support at run-time, in addition to building it in.
* New test case, covering some filtering from HTML/XML.  Kaz Kylheku  2011-10-01  1  -0/+10
| | | | | | | | * Makefile: Defined TXR_ARGS for new test case. * tests/008/students.expected: New file. * tests/008/students.txr: New file. * tests/008/students.xml: New file.
* * filter.c (filters, filter_init): Serious gc bug fixed: neglected to  Kaz Kylheku  2011-10-01  1  -0/+6
| | | | | inform the garbage collector about the filters global variable. Ouch!
* New test case under tests/008.  Kaz Kylheku  2011-10-01  1  -0/+14
| | | | | | | | | | | | * Makefile: Made previous TXR_ARGS for 008 specific to tokenizing test case, and introduced separate TXR_ARGS for this test case. * tests/008/configfile: New file. * tests/008/configfile.expected: New file. * tests/008/configfile.txr: New file.
* Deleted reference to accidentally added file.  Kaz Kylheku  2011-10-01  1  -1/+0
|
* Tokenizing test case, exercising for @(coll :gap 0)  Kaz Kylheku  2011-10-01  1  -0/+11
| | | | | | | | | | and horizontal @(choose :shortest ...). * Makefile: Defined TXR_ARGS for tests/008 directory. * tests/008/data: New file. * tests/008/tokenize.expected: New file. * tests/008/tokenize.txr: New file.
* New test case, covering exception handling across nested  Kaz Kylheku  2011-10-01  1  -0/+11
| | | | | | | | | | function invocations. * Makefile (TEST): Test targets marked as .PHONY, because they are. * tests/007/except-1.expected: New file. * tests/007/except-1.out: New file. * tests/007/except-1.txr: New file.
* * parser.y (all_clause, some_clause, none_clause, maybe_clause,  Kaz Kylheku  2011-10-01  1  -0/+6
| | | | | cases_clause, choose_clause, elem): Regression bug fix: bad list calls in parser, lacking nao terminator.
* Merge commit '4afe959'  Kaz Kylheku  2011-10-01  1  -0/+9
|\ | | | | | | | | | | | | Conflicts: ChangeLog Lost commit.
| * Regression bug fix: longest match variables broken by  Kaz Kylheku  2011-10-01  1  -0/+9
| | 2011-09-28 commit which introduced the double var match. * match.c (match_line): Handle case where modifier is t. * parser.y (var_op): Produce modifier as (t) rather than t.
* | * txr.1: Documented choose and horizontal mode for parallel  Kaz Kylheku  2011-10-01  1  -0/+5
|/ | | | constructs.
* New directive: choose.  Kaz Kylheku  2011-10-01  1  -0/+24
| | | | | | | | | | | | | | | | | | | | | | * match.c (choose_s, longest_k, shortest_k): New variables. (match_line, match_files): Introduced choose directive. (match_init): Initialize new variables. * match.h (choose_s): Declared. * parser.l (yybadtoken): Handle CHOOSE. (CHOOSE): Clause added for returning this token. * parser.y: Added #include "match.h". (CHOOSE): New token symbol. (choose_clause): New nonterminal symbol. (clause): choose_clause added. (all_clause, some_clause, none_clause, maybe_clause, cases_clause): Abstract syntax tree tweaked. (choose_clause): New syntax. (elem): Abstract syntax trees tweaked for many clauses. New CHOOSE clauses. (out_clause): New error case for choose_clause.