| Commit message (Collapse) | Author | Age | Files | Lines |
| |
can now be used. Within nested forms,
Lisp-compatible ; comments are supported.
* parser.l: Support @# and ; comments.
* txr.1: Documentation updated.
* txr.vim: Updated.
| |
Bignums, based on Michael Fromberger's MPI library, are integrated
into the input syntax, stream output, equality testing, the garbage
collector, and hashing.
The plus operation handles transitions between fixnums and bignums.
Other operations are still fixnum only.
* Makefile (CFLAGS): Add mpi directory to include file search.
(OBJS): Include new arith.o module and all of MPI_OBJS.
(MPI_OBJS, MPI_OBJS_BASE): New variables.
* configure (mpi_version, have_quilt, have_patch): New variables.
Script detects whether patch and quilt are available. Unpacks
mpi library, applies patches. Detects 128 bit integer type.
Records more information in config.h about the sizes of types.
* dep.mk: Updated.
* depend.txr: Make work with paths that have directory components.
* eval.c (eval_init): Rename of nump to fixnump.
* gc.c (finalize, mark_obj): Handle BGNUM case.
* hash.c (hash_c_str): Changed to return unsigned long
instead of long.
(equal_hash): Handle BGNUM case.
(eql_hash): Handle bignums with equal-hash, but other
objects as eq.
* lib.c (num_s): Variable renamed to fixnum_s.
(bignum_s): New symbol variable.
(code2type): Follow rename of num_s. Handle BGNUM case.
(typeof): Follow rename of num_s.
(eql): Handle bignums using equal, and other types using eq.
(equal): Handle BGNUM case.
(chk_calloc): New function.
(c_num): Wording change in error message: is not a fixnum.
(nump): Renamed to fixnump.
(bignump): New function.
(plus): Function removed, reimplemented in arith.c.
(int_str): Handle integers which are too large for wcstol
using bignum conversion. Base 0 is no longer passed to
wcstol but converted to 10 because the special semantics
for 0 would be inconsistent for bignums.
(obj_init): Follow rename of num_s. Initialize bignum_s.
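The fixnum-to-bignum transition described for plus can be pictured with a minimal sketch. It assumes a fixnum range narrower than long because some bits are reserved for a type tag; FIXNUM_MAX, FIXNUM_MIN and fixnum_add are hypothetical names for illustration, not the identifiers used in arith.c.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical fixnum range: two bits of a long are assumed to be
   reserved for a type tag. */
#define FIXNUM_MAX (LONG_MAX / 4)
#define FIXNUM_MIN (-FIXNUM_MAX - 1)

/* Returns 1 and stores a + b in *sum if the result is still a fixnum.
   Returns 0 if the caller would have to promote to a bignum. */
int fixnum_add(long a, long b, long *sum)
{
    long s;

    assert(a >= FIXNUM_MIN && a <= FIXNUM_MAX);
    assert(b >= FIXNUM_MIN && b <= FIXNUM_MAX);

    /* Cannot overflow long: both operands lie in the narrower range. */
    s = a + b;
    if (s < FIXNUM_MIN || s > FIXNUM_MAX)
        return 0;
    *sum = s;
    return 1;
}
```

Because the operands are range-checked first, the plain addition is safe and the overflow test reduces to a range comparison on the result.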
| |
IDENT token. This allows for character literals like #\$.
| |
warning.
(eval_init): New functions registered: typeof and vector functions,
as well as length_list.
* lib.c (length): Function renamed to length_list, because it is
list specific.
(length_vec, size_vec, vector_list): New functions.
(length): New function, generic over lists, vectors and strings.
* lib.h (length_list, length_vec, size_vec, vector_list): Declared.
* match.c (h_var, h_fun, robust_length, v_deffilter, v_fun): Use
length_list instead of length.
* parser.l: Introduced # token.
* parser.y (vector): New nonterminal.
(expr): vector is a kind of expr.
(chrlist): Bugfix: single-character syntax was not working;
for instance #\x to denote the character x.
(lit_char_helper): Use length_list instead of length.
* stream.c (string_in_get_line): Bugfix: this was using
the wrong length function: length was being applied to a string.
The genericity of length makes that correct now, but changing
to length_str anyway.
* txr.1: Blank sections created for functions. Vector syntax
documented.
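A length function that is generic over lists, vectors and strings amounts to dispatching on a type tag. The sketch below assumes a simplified object model; struct obj and enum kind are hypothetical stand-ins, not the val type from lib.h.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical, simplified object model; a NULL pointer plays nil. */
enum kind { K_CONS, K_VEC, K_STR };

struct obj {
    enum kind kind;
    union {
        struct { struct obj *car, *cdr; } cons;
        struct { size_t len; } vec;
        const wchar_t *str;
    } u;
};

/* List-specific length: count cons cells up to the terminator. */
size_t length_list(const struct obj *o)
{
    size_t n = 0;
    for (; o != NULL && o->kind == K_CONS; o = o->u.cons.cdr)
        n++;
    return n;
}

/* Generic length: choose the type-specific routine from the tag. */
size_t length(const struct obj *o)
{
    if (o == NULL)
        return 0;
    switch (o->kind) {
    case K_CONS: return length_list(o);
    case K_VEC:  return o->u.vec.len;
    case K_STR:  return wcslen(o->u.str);
    }
    assert(0 && "length: not a sequence");
    return 0;
}
```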
| |
* parser.h (ln_to_forms_hash): Declaration removed.
* parser.l (ln_to_forms_hash): Variable removed.
(parse_init): Initialization and protection of ln_to_forms_hash
removed.
* parser.y (rl): Update of ln_to_forms_hash removed.
* txr.1:
| |
Lisp interpreter added.
* gc.c (finalize, mark_obj): Handle ENV objects.
* hash.c (struct hash): Order of arguments of the acons_new_l_fun
function pointer changed.
(equal_hash): Handle ENV.
(make_hash, gethash_l): Use cobj_handle for
type safety. Follow change in acons_new_l.
(gethash, gethash_f, remhash, hash_count,
hash_get_userdata, hash_set_userdata, hash_next): Use cobj_handle.
(gethash_n): New function.
* hash.h (gethash_n): Declared.
* lib.c (env_s): New symbol variable.
(code2type, equal): Handle ENV. (plusv, minusv, mul, mulv, trunc, mod,
gtv, ltv, gev, lev, maxv, minv, int_str): New functions.
(rehome_sym): New static function.
(func_f0, func_f1, func_f2, func_f3, func_f4, func_n0, func_n1,
func_n2, func_n3, func_n4): Initialize new fields of struct func.
(func_f0v, func_f1v, func_f2v, func_f3v, func_f4v,
func_n0v, func_n1v, func_n2v, func_n3v, func_n4v,
func_interp): New functions.
(apply): Function removed: sanely re-implemented in new eval.c file.
(funcall, funcall1, funcall2, funcall3, funcall4): Handle
variadic and interpreted functions.
(acons, acons_new, acons_new_l, aconsq_new, aconsq_new_l): Reordered
arguments for compatibility with Common Lisp acons.
(obj_init): Special hack to prepare hash_s symbol, which is
needed for type checking inside the hash table functions invoked
by make_package, at a time when the symbol is not yet interned.
Initialize new env_s variable.
(obj_print, obj_pprint): Handle ENV. Fix confusing rendering of
function type.
(init): Call new function eval_init.
* lib.h (enum type): New enumeration member ENV.
(struct func): functype member changed to bitfield.
New bitfield members minparam and variadic.
New members in f union: f0v, f1v, f2v, f3v,
f4v, n0v, n1v, n2v, n3v, n4v.
(struct env): New type.
(union obj): New member e of type struct env.
(env_s): Variable declared.
(plusv, minusv, mul, mulv, trunc, mod, gtv, ltv, gev, lev, maxv, minv,
| |
* match.c (resolve_k): New keyword symbol variable.
(h_parallel, v_parallel): Implement :resolve keyword in @(some)
directive.
(syms_init): New symbol variable initialized.
* parser.l: Allow (some) to have argument material.
* parser.y (some_clause, elem): SOME syntax adjusted.
* txr.1: Documented new :resolve keyword in @(some).
| |
Lisp. The difference is that splice is spelled ,* because @
already means something, and that there is only one quote operator.
None of this does anything; it is only syntax.
* lib.c (quote_s, qquote_s, unquote_s, splice_s): New variables.
(obj_init): New variables initialized.
* lib.h (quote_s, qquote_s, unquote_s, splice_s): Declared.
* parser.l: Added recognition rules.
* parser.y (SPLICE): New symbolic token.
(list): Added new syntax for quote and splicing.
| |
need the single quote in the Lisp way for suppressing evaluation,
eventually.
I'm going with a Scheme-compatible syntax for character literals.
It has a richer repertoire of standard character names than Common
Lisp, and has a \x convention for coding characters in hex.
* lib.c (obj_print): Print characters in a Scheme-like way.
* parser.h (end_of_char): New function declared.
* parser.l (grammar): Implement rules for #\ syntax,
involving new HASH_BACKSLASH token.
(end_of_regex): Enhancement: added check that end_of_regex is
called in correct state, like the one in end_of_char.
(end_of_char): New function.
* parser.y (repeat_rep_helper, o_elems_transform, define_transform,
lit_char_helper): Functions changed to static.
(rl): Function moved down, past the grammar section.
(HASH_BACKSLASH): New terminal symbol.
(chrlit): Grammar redesigned.
(char_from_name): New function.
* txr.1: Character syntax documented.
| |
tree representation of the TXR query.
* match.c (debuglf, sem_error, file_err, eval_form): Line number argument replaced
with the form to which the situation pertains. Location information is
pulled from the hash table entry associated with the form.
(dest_set, dest_bind, eval_form, vars_to_bindings): Context argument
renamed since it isn't a line number.
(struct match_line_ctx): spec_lineno member removed.
(ml_all, ml_bindings_specline): lineno parameter removed.
(LOG_MISMATCH, LOG_MATCH, h_var, h_skip, h_coll, h_parallel,
match_line): Pass elem to debuglf as context, instead of line number.
(h_trailer, h_eol): define elem for LOG_MISMATCH and LOG_MATCH macros.
(h_fun): Pass elem variable to debuglf instead of line number.
Body stored as a simple cons cell once again (no line number).
(do_output_line): Line number parameter removed. Pass specline to
sem_error instead of line number.
(do_output): Adjusted for one less parameter in do_output_line.
(mf_from_ml): Pass one less parameter to ml_all. Conversion of
specline to spec is just a wrapping into a nested list,
with no line number.
(spec_bind): Linenumber variable parameter removed from macro.
Definition simplified.
(v_skip): Pass specline to debuglf instead of spec_linenum,
which is no longer computed.
(v_trailer): Use new definition of specline. Pass first_spec
to sem_error instead of spec_linenum.
Computation of ff_specline no longer has to skip line number.
(v_freeform, v_block, v_accept_fail, v_next, v_parallel, v_gather,
v_collect, v_merge, v_bind, hv_trampoline, v_cat, v_output,
v_try, v_defex, v_throw, v_deffilter, v_filter, match_funcall): Use new
definition of specline. Pass first_spec to sem_error instead of
spec_linenum. (v_forget_local): Specline computed differently since
there is no linenumber to skip.
(h_define): Back to simplified representation of function with
no extra cell for line number.
(v_define, v_fun): Pass first_spec to sem_error instead of
spec_linenum. Back to simplified representation of function with no
extra cell for line number.
(match_files): first_spec_item computed differently.
Pass first_spec to sem_error instead of spec_linenum.
* parser.h (source_loc): Declared.
* parser.l (source_loc): New function.
* parser.y (grammar): Removed line numbers from abstract syntax
tree. A few more places needed the annotation of forms with location
info, and a couple of cases of the need to propagate the info was
| |
outside of the code, in hash tables.
* filter.c (make_trie, trie_add): Update to three-argument
make_hash.
* hash.c (struct hash): New members, hash_fun, assoc_fun and
acons_new_l_fun.
(ll_hash): Renamed to equal_hash.
(eql_hash): New static function.
(cobj_hash_op): Follows ll_hash rename.
(hash_grow): Use new function indirection to call hashing function.
(make_hash): New argument to specify type of hashing. Initialize new
members of struct hash.
(gethash_l, gethash, remhash): Use function indirection for hashing and
chain search and update.
(pushhash): New function.
* hash.h (make_hash): Declaration updated with new parameter.
(pushhash): Declared.
* lib.c (eql_f): New global variable.
(eql, assq, aconsq_new, aconsq_new_l): New functions.
(make_package): Updated to new three-argument make_hash.
(obj_init): gc-protect and initialize new variable eql_f.
* lib.h (eql, assq, aconsq_new, aconsq_new_l): Declared.
* match.c (dir_tables_init): Updated to three-argument make_hash.
* parser.h (form_to_ln_hash, ln_to_forms_hash): Global variables
declared.
* parser.l (form_to_ln_hash, ln_to_forms_hash): New global variables.
(grammar): Set yylval.lineno for tokens that are classified to
that type in parser.y.
(parse_init): Initialize and gc-protect new global variables.
* parser.y (rl): New static helper function.
(%union): New member, lineno.
(ALL, SOME, NONE, MAYBE, CASES, CHOOSE, GATHER,
AND, OR, END, COLLECT, UNTIL, COLL, OUTPUT, REPEAT,
REP, SINGLE, FIRST, LAST, EMPTY, DEFINE,
TRY, CATCH, FINALLY, ERRTOK, '('): Reclassified as lineno type.
In the grammar, these keywords can thus provide a stable line number
from the lexer.
(grammar): Numerous rules updated to add constructs to the
line number hash tables via the rl helper.
* dep.mk: Updated.
* Makefile (depend): Use the installed, stable txr in the
system path to update dependencies rather than locally built ./txr, to
prevent the problem that txr is broken because of out-of-date
dependencies, and thus cannot regenerate dependencies.
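The function indirection added to struct hash can be sketched as a table that carries its own hashing and comparison functions, so that one lookup routine serves both identity-based and content-based keys. Everything below (the fixed-size open-addressing table, string keys, and the names identity_hash, content_hash, same_fun, hash_put, hash_get) is a hypothetical illustration, not the layout or API of hash.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NSLOTS 8

typedef unsigned long (*hash_fun_t)(const char *key);
typedef int (*same_fun_t)(const char *a, const char *b);

struct hash {
    hash_fun_t hash_fun;   /* chosen when the table is created */
    same_fun_t same_fun;
    const char *keys[NSLOTS];
    int vals[NSLOTS];
};

/* Identity-based: hash and compare the pointer itself. */
unsigned long identity_hash(const char *k) { return (unsigned long) ((uintptr_t) k >> 2); }
int identity_same(const char *a, const char *b) { return a == b; }

/* Content-based: hash and compare the characters. */
unsigned long content_hash(const char *k)
{
    unsigned long h = 5381;
    while (*k)
        h = h * 33 + (unsigned char) *k++;
    return h;
}
int content_same(const char *a, const char *b) { return strcmp(a, b) == 0; }

void hash_init(struct hash *h, hash_fun_t hf, same_fun_t sf)
{
    int i;
    h->hash_fun = hf;
    h->same_fun = sf;
    for (i = 0; i < NSLOTS; i++)
        h->keys[i] = NULL;
}

/* Linear probing through the table's own functions.
   Returns the matching or first empty slot, or -1 if the table is full. */
int hash_slot(const struct hash *h, const char *key)
{
    unsigned long start;
    int i;

    assert(h != NULL && key != NULL);
    start = h->hash_fun(key) % NSLOTS;
    for (i = 0; i < NSLOTS; i++) {
        int s = (int) ((start + i) % NSLOTS);
        if (h->keys[s] == NULL || h->same_fun(h->keys[s], key))
            return s;
    }
    return -1;
}

int hash_put(struct hash *h, const char *key, int val)
{
    int s = hash_slot(h, key);
    if (s < 0)
        return 0;
    h->keys[s] = key;
    h->vals[s] = val;
    return 1;
}

int hash_get(const struct hash *h, const char *key, int *val)
{
    int s = hash_slot(h, key);
    if (s < 0 || h->keys[s] == NULL)
        return 0;
    *val = h->vals[s];
    return 1;
}
```

With two distinct buffers holding the same text, the content-based table finds the entry through either buffer, while the identity-based table finds it only through the pointer that was stored.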
| |
variables moved from parser.l.
(opt_lisp_bindings): New variable.
(dump_bindings): Dump Lisp syntax bindings
on standard output if opt_lisp_bindings is set.
(v_cat): Do not complain about trailing material;
this is not compatible with horizontal cat.
* parser.l (opt_nobindings, opt_arraydims): Moved
to match.c.
* txr.c (txr_main): New options, --lisp-bindings
and the equivalent -l.
* txr.h: opt_lisp_bindings declared.
| |
* match.c (gather_s): New keyword variable.
(v_gather): New function.
(syms_init): gather_s initialized.
(dir_tables_init): v_gather entered into table.
* match.h (gather_s): Declared.
* parser.l: GATHER token scanning added.
* parser.y: GATHER token added. gather_clause nonterminal added.
* txr.1: New directive documented.
* txr.vim: gather keyword introduced.
| |
* parser.h (parse_init): Declared.
* parser.l (parse_init): New function.
* txr.c (main): Call parse_init.
(txr_main): No need to gc-protect yyin_stream since parse_init does it.
| |
* parser.l (prepared_error_message): New static variable.
(yyerror): Emit and clear prepared error message.
(yyerrprepf): New static function.
(yybadtoken): Function moved into parser.y.
(grammar): For irrecoverable lexical errors, stash error message
with yyerrprepf and return the special error token ERRTOK to generate a
syntax error. I could find no other interface to the parser to make it
cleanly exit.
* parser.y (ERRTOK): New terminal symbol, does not appear anywhere
in the grammar.
(spec): Bail after 8 errors, recover to nearest newline, and
use yyerrok to clear error situation.
(YYEOF): Provided by Bison, conditionally defined for other yacc-s.
(yybadtoken): Function moved from parser.l. Checks for the next
token being YYEMPTY or YYEOF, and also handles ERRTOK.
* stream.c (vformat_to_string): New function.
(format): If stream is nil, format to string and return it.
* stream.h (vformat_to_string): Declared.
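The "format to a string when the stream is nil" behavior can be approximated in plain C. This sketch uses the C library's printf-style formatting rather than TXR's own formatter, with NULL standing in for nil; vformat_to_cstring and format_to are hypothetical names.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly allocated string, which the caller frees.
   Returns NULL on failure. */
char *vformat_to_cstring(const char *fmt, va_list ap)
{
    va_list ap2;
    int n;
    char *buf;

    assert(fmt != NULL);
    va_copy(ap2, ap);
    n = vsnprintf(NULL, 0, fmt, ap2);   /* first pass: measure only */
    va_end(ap2);
    if (n < 0)
        return NULL;
    buf = malloc((size_t) n + 1);
    if (buf == NULL)
        return NULL;
    vsnprintf(buf, (size_t) n + 1, fmt, ap);
    return buf;
}

/* If out is NULL, return the formatted text as a string;
   otherwise write it to out and return NULL. */
char *format_to(FILE *out, const char *fmt, ...)
{
    va_list ap;
    char *result = NULL;

    va_start(ap, fmt);
    if (out == NULL)
        result = vformat_to_cstring(fmt, ap);
    else
        vfprintf(out, fmt, ap);
    va_end(ap);
    return result;
}
```

A lexer could use such a routine to stash an error message now and emit it later, which is the role the message above gives to yyerrprepf and yyerror.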
| |
differences between expected and actual test output.
* parser.l (yybadtoken): Handle new terminal symbol, SPACE.
New rule for producing SPACE token out of an extent of
tabs and spaces.
* parser.y (SPACE): New terminal symbol.
(o_var): New nonterminal. I noticed that the var rule was
being used for output elements, and the var rule refers to
elem rather than o_elem. A new o_var rule is a simplified
duplicate of var.
(elem): Handle SPACE token. Transform to regex if it is
a single space, otherwise to literal text.
(o_elem): Handle SPACE token in output.
* tests/001/query-2.txr: This query depends on matching
single spaces and so needs to use escapes.
* tests/001/query-4.txr, tests/001/query-4.expected: New test
case, based on query-2.txr. It produces the same output,
but is simpler thanks to the new semantics of space.
* txr.1: Documented.
| |
TODO: there should be some type safety with the new wli macro
so that if it is forgotten, there will be a diagnostic.
* configure (lit_align): New configuration variable
and configuration test. Generates LIT_ALIGN in config.h.
Fixed the integer-holds-pointer test for the different output
from the nm program on Cygwin. The arrays become common symbols
marked C which do not show an offset attribute, only size:
one less column.
* filter.c (to_html_table, from_html_table): wrap wide string
literals with the wli macro. This must be done from now on for
all literals and initializers of arrays that are going to be
directly converted to type tagged val-s.
* lib.h (wli): New macro.
(auto_str, static_str, litptr, lit_noex): Handle wide literals on
platforms where they are aligned to only two bytes, such that we don't
have two bits in the pointer. We can still add our 11 bit type tag, but
then when recovering the pointer to the data, we may have
to fix up the pointer.
* parser.l: Another portability issue here. Flex generates a scanner
which has #include <unistd.h> in the middle, after the source file's
own #includes which can introduce macros. On Cygwin, there is some
hygiene problem whereby our "noreturn" macro causes the <unistd.h>
header to generate bad syntax and fail to compile. Stupid Cygwin
and even stupider flex! The workaround is to include <unistd.h>
at the top in the flex source.
* stream.c (string_out_put_char): This is one more place where
the string literal handling hack spreads.
* txr.c (version): Wrap string in wli.
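The alignment concern comes from storing a type tag in the low bits of a pointer. The following sketch shows only the simple case, in which the literal is at least four-byte aligned so both low bits are free; it does not reproduce the wli fix-up for two-byte alignment. Reading "11" as the binary tag value 3 is an assumption, and tag_lit, is_lit and untag_lit are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <wchar.h>

#define TAG_MASK ((uintptr_t) 3)
#define TAG_LIT  ((uintptr_t) 3)   /* assumed: binary 11 marks a literal */

/* Tag a pointer whose two low bits are known to be zero. */
uintptr_t tag_lit(const wchar_t *p)
{
    uintptr_t bits = (uintptr_t) p;
    assert((bits & TAG_MASK) == 0);   /* requires four-byte alignment */
    return bits | TAG_LIT;
}

int is_lit(uintptr_t v)
{
    return (v & TAG_MASK) == TAG_LIT;
}

/* Recover the original pointer by clearing the tag bits. */
const wchar_t *untag_lit(uintptr_t v)
{
    assert(is_lit(v));
    return (const wchar_t *) (v & ~TAG_MASK);
}
```

On a platform where wide literals are only two-byte aligned, the assertion in tag_lit can fail, which is the situation the lit_align configuration test is described as detecting.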
| |
nested lists. This is in anticipation of future features.
* lib.c (expr_s): New symbol variable.
(obj_init): expr_s initialized.
* lib.h (expr_s): Declared.
* match.c (dest_bind): Now takes linenum. Tests for the meta-syntax
denoted by the system symbols var_s and expr_s, and throws an
error.
(eval_form): Similar error checks added. Also, hack: do not add
file and line number to an exception which begins with a '('
character; just re-throw it. This suppresses duplicate line
number addition when this throw occurs across some nestings.
(match_files): Updated calls to dest_bind.
* parser.l (yybadtoken): Handle new token kinds, METAVAR and METAPAR.
(grammar): Refactoring among patterns: TOK broken into
SYM and NUM, NTOK introduced, unused NUM_END removed.
Rule for @( producing METAPAR in nested state.
* parser.y (METAVAR, METAPAR): New tokens.
(meta_expr): New nonterminal.
(expr): meta_expr and METAVAR productions handled.
| |
hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y,
regex.c, regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c,
unwind.h, utf8.c, utf8.h: Updated e-mail address.
| |
* match.c (choose_s, longest_k, shortest_k): New variables.
(match_line, match_files): Introduced choose directive.
(match_init): Initialize new variables.
* match.h (choose_s): Declared.
* parser.l (yybadtoken): Handle CHOOSE.
(CHOOSE): Clause added for returning this token.
* parser.y: Added #include "match.h".
(CHOOSE): New token symbol.
(choose_clause): New nonterminal symbol.
(clause): choose_clause added.
(all_clause, some_clause, none_clause, maybe_clause,
cases_clause): Abstract syntax tree tweaked.
(choose_clause): New syntax.
(elem): Abstract syntax trees tweaked for many clauses.
New CHOOSE clauses.
(out_clause): New error case for choose_clause.
| |
state, regexes and string literals.
* txr.1: Documented.
| |
(match_line): Keyword arguments in coll implemented.
(match_init): chars_k variable initialized.
* parser.l (COLL): Lexical syntax changed to allow for
argument material.
* parser.y (elem): Coll syntax rewritten for arguments.
* txr.1: Updated.
| |
symbol variables.
(match_lines): Keyword arguments in collect implemented.
(match_init): New function.
* match.h (match_init): Declared.
* parser.l (COLLECT): Lexical syntax changed for COLLECT to
allow for argument material.
* parser.y (%union): obj renamed to val.
(exprs_opt): New nonterminal.
(collect_clause): Rewritten for arguments.
* txr.c (main): Call to match_init introduced.
| |
backslash codes for single backslash. Output clause can be empty.
* parser.l (char_esc): Backslash handled.
Use internal_error rather than abort.
(REGCHAR, LITCHAR): Backslash added to lexical syntax.
* parser.y (output_clause): Allow empty output clause.
| |
lib.h, match.c, match.h, parser.h, parser.l, parser.y, regex.c,
regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h: Updated copyright year.
| |
This turns out to be easy to do in NFA land.
The complement of an NFA has exactly the same number
and configuration of states and transitions, except
that the states have an inverted meaning; and furthermore,
failed character transitions are routed to an extra
state (which in this implementation is permanently
allocated and shared by all regexes). The regex &
is implemented trivially using De Morgan's law.
Also, bugfix: regular expressions like A|B|C are allowed
now by the syntax, rather than constituting syntax error.
Previously, this would have been entered as (A|B)|C.
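The De Morgan construction, A & B = ~(~A | ~B), can be shown on a toy matcher tree. This is not an NFA and not the regex.c representation: the leaf predicate "input contains character ch" and the names struct node, matches and make_and are hypothetical, chosen only to keep the sketch small.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum op { OP_HAS, OP_NOT, OP_OR };

struct node {
    enum op op;
    char ch;                    /* OP_HAS: character the input must contain */
    const struct node *a, *b;   /* operands of OP_NOT (a) and OP_OR (a, b) */
};

int matches(const struct node *n, const char *s)
{
    assert(n != NULL && s != NULL);
    switch (n->op) {
    case OP_HAS: return strchr(s, n->ch) != NULL;
    case OP_NOT: return !matches(n->a, s);
    case OP_OR:  return matches(n->a, s) || matches(n->b, s);
    }
    return 0;
}

/* Build a & b as ~(~a | ~b) in caller-provided scratch nodes.
   No dedicated AND operator is needed. */
const struct node *make_and(struct node scratch[4],
                            const struct node *a, const struct node *b)
{
    scratch[0] = (struct node) { OP_NOT, 0, a, NULL };
    scratch[1] = (struct node) { OP_NOT, 0, b, NULL };
    scratch[2] = (struct node) { OP_OR, 0, &scratch[0], &scratch[1] };
    scratch[3] = (struct node) { OP_NOT, 0, &scratch[2], NULL };
    return &scratch[3];
}
```

Once complement and union exist, intersection costs nothing extra, which is why the message calls the & implementation trivial.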
| |
needed because suppressing generation of unput is requested via
the %option. In scanners generated by the legacy version of
flex, 2.5.4, still widely in use, this redundancy leads to
multiple #define YY_NO_UNPUT and a compiler warning.
| |
unused functions yyunput and yyinput, thus getting rid of
some compiler diagnostics.
| |
that as an object to vformat, resulting in #<garbage: ...>
output.
| |
can be converted to a type long and vice versa. The configure
script tries to detect the appropriate type to use. Also,
some run-time checking is performed in the streams module
to detect which conversion specifier strings to use for
printing numbers.
| |
a system package instead of being hacked with the $ prefix.
Keyword symbols are provided. In the matcher, evaluation
is tightened up. Keywords, nil and t are not bindable, and
errors are thrown if attempts are made to bind them.
Destructuring in dest_bind is strict in the number of items.
String streams are exploited to print bindings to objects
that are not strings or characters. Numerous bugfixes.
| |
we wouldn't have to declare object variables at all, so why
use an obtuse syntax to do so?)
| |
of standard conformance.
| |
abstraction instead of directly using C standard I/O,
to eliminate most uses of C formatted I/O,
and fix numerous bugs, such as variadic argument lists which
lack a terminating ``nao'' sentinel.
Bug 28033 is addressed by this patch, since streams no longer provide
printf-compatible formatting. The native formatter is extended with
some additional capabilities to take over.
The work on literal objects is expanded and they are now used
throughout the code base.
Fixed bad realloc in string output stream: reallocating by number
of wide chars rather than bytes.
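The realloc bug class mentioned last is easy to reproduce: realloc takes a size in bytes, so a capacity counted in wide characters must be scaled by sizeof (wchar_t). A minimal sketch of a growing wide-character buffer that does this correctly follows; struct wbuf and wbuf_put are hypothetical names, not the string output stream in stream.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

struct wbuf {
    wchar_t *data;
    size_t used;   /* characters stored, excluding the terminator */
    size_t cap;    /* capacity in characters, not bytes */
};

/* Append one character, keeping the buffer null terminated.
   Returns 0 on allocation failure, 1 on success. */
int wbuf_put(struct wbuf *b, wchar_t ch)
{
    assert(b != NULL);
    if (b->used + 2 > b->cap) {
        size_t ncap = b->cap ? b->cap * 2 : 8;
        /* realloc takes bytes: scale the character count by the
           element size. Passing ncap alone would under-allocate. */
        wchar_t *nd = realloc(b->data, ncap * sizeof *nd);
        if (nd == NULL)
            return 0;
        b->data = nd;
        b->cap = ncap;
    }
    b->data[b->used++] = ch;
    b->data[b->used] = L'\0';
    return 1;
}
```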
| |
semantics on the input stream to wide character input.
Also, reading a query from the command line (-c) must
read bytes from a UTF-8 encoding of the string.
We introduce a new get_byte function which can extract bytes
from streams which provide it.
| |
use wide character functions so that there is no illicit
mixing. (But the goal is to replace this usage with txr streams).
| |
Cleaned up some more issues related to extended characters.
| |
This is incomplete. There are too many dependencies on
wide character support from the C stream I/O library,
and implicit use of some encoding which may not be UTF-8.
The regex code does not handle wide characters properly.
Character type is still int in some places, rather than wchar_t.
Test suite passes though.
| |
Fix possible use of uninitialized ch.