Commit message | Author | Age | Files | Lines
...
* parser: allow funny UTF-8 in regexes and literals.Kaz Kylheku2021-04-084-7/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main idea in this commit is to change a behavior of the lexer, and take advantage of it in the parser. Currently, the lexer recognizes a {UANYN} pattern in two places. That pattern matches a UTF-8 character. The lexeme is passed to the decoder, which is expected to produce exactly one wide character. If the UTF-8 is bad (for instance, a code in the surrogate pair range U+DCxx) then the decoder will produce multiple characters. In that case, these rules return ERRTOK instead of a LITCHAR or REGCHAR. The idea is: why don't we just return those characters as a TEXT token? Then we can just incorporate that into the literal or regex. * parser.l (grammar): If a UANYN lexeme decodes to multiple characters instead of the expected one, then produce a TEXT token instead of complaining about invalid UTF-8 bytes. * parser.y (regterm): Recognize a TEXT item as a regterm, converting its string value to a compound node in the regex AST, so it will be correctly treated as a fixed pattern. (chrlit): If a hash-backslash is followed by a TEXT token, which can happen now, that is invalid; we diagnose that as invalid UTF-8. (quasi_item): Remove TEXT rule, because the litchars constituent now generates TEXT. (litchars, restlistchar): Recognize TEXT item, similarly to regterm. * tests/012/parse.tl: New file. * tests/012/parse.expected: Likewise.
* parser: fix few memory leaks in error recovery.Kaz Kylheku2021-04-081-0/+4
| | | | | | | | | * parser.y (var, o_var): In a few error productions in which we have a SYMTOK item, we should free the lexeme. This doesn't solve all leaks: any time we have a parser stack containing SYMTOK or TEXT items that belong to rules that have not yet been reduced, and the parse job is aborted due to errors, we leak those.
* parser: fix poor diagnosis of \x invalid escape.Kaz Kylheku2021-04-081-1/+12
| | | | | | | * parser.l (grammar): Because the \x pattern requires one or more digits after it, if they are not present, we simply report \x as an unrecognized escape. It's better if we diagnose it properly as a \x that is not followed by digits.
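The distinction drawn above can be sketched in C. This is only an illustration, not the flex rule in parser.l: the function and enum names are hypothetical, and it assumes that the digits following \x are hexadecimal.

```c
#include <ctype.h>

/* Hypothetical classification of the text at an escape sequence. */
enum escape_kind_sketch {
  ESC_HEX_OK,         /* \x followed by at least one hex digit */
  ESC_HEX_NO_DIGITS,  /* \x with no digit after it: gets its own diagnostic */
  ESC_OTHER           /* anything else: left to the other escape rules */
};

enum escape_kind_sketch classify_escape_sketch(const char *s)
{
  /* The short-circuit keeps us from reading past the terminating null. */
  if (s[0] != '\\' || s[1] != 'x')
    return ESC_OTHER;
  return isxdigit((unsigned char) s[2]) ? ESC_HEX_OK : ESC_HEX_NO_DIGITS;
}
```

Separating the "no digits" case from the generic case is what allows a message specific to \x instead of the catch-all "unrecognized escape".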
* build: calm restless yacc.Kaz Kylheku2021-04-081-6/+1
| | | | | | | | | | * Makefile (%.tab.c %.tab.h): Remove the trick of keeping the old y.tab.h file if it has not changed. This was once a good idea, but now that we have a proper grouped targets pattern rule which knows that y.tab.h depends on and is produced from parser.y, the trick causes y.tab.h to be perpetually out of date due to its old time stamp, and so yacc is run on every build.
* doc: bad syntax under doc function.Kaz Kylheku2021-04-081-1/+1
| | | | * txr.1: Fix formatting.
* Version 256txr-256Kaz Kylheku2021-04-076-965/+1023
| | | | | | | | | | * RELNOTES: Updated. * configure, txr.1: Bumped version and date. * share/txr/stdlib/ver.tl: Bumped. * txr.vim, tl.vim: Regenerated.
* doc: support doc function on android.Kaz Kylheku2021-04-071-2/+2
| | | | | * share/txr/stdlib/doc-lookup.tl (open-url): Define for android, which has xdg-open in the termux environment.
* utf8: fix backtracking bugs in buffer decoder.Kaz Kylheku2021-04-072-3/+12
| | | | | | | | | | | | | | | | | | | | * utf8.c (utf8_from_buffer): Fix incorrect backtracking logic for handling bad UTF-8 bytes. Firstly, we are not backtracking to the correct byte. Because src is incremented at the top of the loop, the backtrack pointer must be set to src - 1 to point to the possibly bad byte. Secondly, when we backtrack, we are neglecting to rewind nbytes! Thus after backtracking, we will not scan the entire input. Let's avoid using nbytes, and guard the loop based on whether we hit the end of the buffer; then we don't have any nbytes state to backtrack. * tests/017/ffi-misc.tl: New test case converting a three-byte UTF-8 encoding of U+DC01: an invalid character in the surrogate range. We test that the buffer decoder turns this into three characters, exactly like the stream decoder. Another test case for invalid bytes following a valid sequence start.
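A minimal C sketch of the loop shape described above follows. It is not the code of utf8_from_buffer: the function name is hypothetical, and it assumes that each undecodable byte is emitted as the code point 0xDC00 | byte, which is consistent with the U+DCxx range mentioned in this log but is not spelled out in this commit. The two points it illustrates are that the backtrack pointer is the lead byte (the position before the increment) and that the loop is guarded by the end of the buffer, so there is no byte counter to rewind.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: decodes src[0..len) into dst, which must have room
   for len code points. Returns the number of code points written. */
size_t decode_utf8_sketch(uint32_t *dst, const unsigned char *src, size_t len)
{
  const unsigned char *end;
  size_t n = 0;

  assert(dst != NULL && src != NULL);
  end = src + len;

  while (src < end) {                    /* guard on buffer end, no counter */
    const unsigned char *start = src;    /* backtrack point: the lead byte */
    unsigned ch = *src++;
    uint32_t wc = 0, min = 0;
    int more = -1;                       /* -1 marks an invalid lead byte */

    if (ch < 0x80) {
      dst[n++] = ch;
      continue;
    }

    if (ch >= 0xC2 && ch <= 0xDF) {
      wc = ch & 0x1F; more = 1; min = 0x80;
    } else if (ch >= 0xE0 && ch <= 0xEF) {
      wc = ch & 0x0F; more = 2; min = 0x800;
    } else if (ch >= 0xF0 && ch <= 0xF4) {
      wc = ch & 0x07; more = 3; min = 0x10000;
    }

    /* Consume continuation bytes, never reading past the end. */
    while (more > 0 && src < end && (*src & 0xC0) == 0x80) {
      wc = (wc << 6) | (uint32_t) (*src++ & 0x3F);
      more--;
    }

    if (more != 0 || wc < min || wc > 0x10FFFF ||
        (wc >= 0xD800 && wc <= 0xDFFF)) {
      /* Bad sequence: rewind to just past the lead byte and emit that
         byte alone (assumed mapping); the rest is rescanned. */
      src = start + 1;
      dst[n++] = 0xDC00u | *start;
      continue;
    }

    dst[n++] = wc;
  }
  return n;
}
```

Under these assumptions, the three bytes ED B0 81 (the encoding of U+DC01) come out as three code points, one per byte, which is the behavior the new test case checks for.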
* awk: bugfix: string rs must not compile as regex.Kaz Kylheku2021-04-071-5/+5
| | | | | | | | * share/txr/stdlib/awk.tl (awk-state loop): When rs contains a string, do not pass it directly to regex-compile, because that function calls regex-parse when the argument is a string. Wrap it in a (compound ...) tree node to get it to be treated as a sequence of characters to match.
* gc: fix astonishing bug in weak hash processing.Kaz Kylheku2021-04-061-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a flaw that has been in the code since the initial implementation in 2009. Weak hash tables are only partially marked during the initial garbage collection marking phase. They are put into a global list, which is then walked again to do the weak processing: to expire items which are not reachable, and then finish walking the table objects. Problem is, the code assumes that this late processing will not discover more hash tables and put them into that global list. This creates a problem when weak hash tables contain weak hash tables, such as in the important and very common case when a global variable (binding stored in a weak hash table) contains a weak hash table! These hash tables discovered during weak hash table processing are partially marked, and left that way. The result is that their table vectors get prematurely scavenged by the garbage collector, and then fall victim to use-after-free crashing. Note: do_iters doesn't have this bug. Though the reachable_iters list resembles reachable_weak_hashes, the key difference is that do_iters does not do any marking, and so will not discover any more reachable objects. All it does is update some counts in the hashes to which the still-reachable iterators point. * hash.c (do_weak_tables): Clear the reachable_weak_hashes list on entry into the function, taking a local copy of its head. After walking the list, check the global variable again; if it has become non-null, it means more weak tables were discovered and added to the list. In that case, make a recursive call (susceptible to tail call treatment) to process the list again.
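The fix pattern in do_weak_tables can be sketched in C as follows. All names here are hypothetical stand-ins, not the structures in hash.c, and the sketch uses a loop where the commit uses a tail-recursive call; the effect is the same. The essential steps are: detach the global list into a local copy, clear the global, walk the copy, and repeat if the walk queued more tables.

```c
#include <stddef.h>

/* Hypothetical stand-in for a weak hash table object. */
struct weak_table_sketch {
  struct weak_table_sketch *next;   /* link in the pending list */
  struct weak_table_sketch *inner;  /* a weak table reachable from this one */
  int queued;                       /* already placed on the pending list */
  int processed;                    /* weak processing finished */
};

/* Plays the role of the global reachable_weak_hashes list. */
static struct weak_table_sketch *pending_weak_tables;

/* Called whenever marking reaches a weak table. */
void discover_weak_table_sketch(struct weak_table_sketch *t)
{
  if (t != NULL && !t->queued) {
    t->queued = 1;
    t->next = pending_weak_tables;
    pending_weak_tables = t;
  }
}

void process_weak_tables_sketch(void)
{
  for (;;) {
    /* Take a local copy of the head and clear the global list. */
    struct weak_table_sketch *t = pending_weak_tables;
    if (t == NULL)
      break;
    pending_weak_tables = NULL;

    while (t != NULL) {
      struct weak_table_sketch *next = t->next;
      t->processed = 1;
      /* Finishing this table may reach another weak table, which lands
         on the (now empty) global list and is picked up next round. */
      discover_weak_table_sketch(t->inner);
      t = next;
    }
  }
}
```

Without the re-check, a table queued during the walk would stay partially marked, which is the situation the commit message describes.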
* qref: bugfix: handle a.(b).?c correctly.Kaz Kylheku2021-04-051-1/+1
| | | | | * share/txr/stdlib/struct.tl (qref): Do not assume that (b) is the name of a slot to be looked up. Use qref to handle it.
* struct: fix lack of hygiene in null-safe qref.Kaz Kylheku2021-04-051-1/+3
| | | | | | | | | The expression a.?b is not being treated hygienically; a is evaluated twice. This is only if the null-safe object is the leftmost; a.b.?c is hygienic. * share/txr/stdlib/struct.tl (qref): Add the necessary gensym use to fix the broken case.
* doc: document null-safe method call.Kaz Kylheku2021-04-051-4/+26
| | | | | * txr.1: The notation obj.?(fun ...) exists, but is not documented. Let's fix that.
* compiler: remove optional param from lookup-var.Kaz Kylheku2021-04-051-5/+3
| | | | | | * share/txr/stdlib/compiler.tl (struct env): The mark-used optional parameter of lookup-var is not used anywhere, and so always nil. Let's remove it.
* INSTALL: revise outdated text, add cross-compiling advice.Kaz Kylheku2021-04-041-5/+40
| | | | | | | * INSTALL: Mention the parallel debug and optimized build capability of txr: no need to have two separate directories for that. New section on handling the .tl files in cross-compilation, when the txr executable isn't native.
* doc: remove superfluous words.Kaz Kylheku2021-04-041-1/+1
| | | | | * txr.1: under "File-Wide Insertion of Gensyms", remove superfluous verb phrase from sentence.
* doc: vice versa formatting.Kaz Kylheku2021-04-041-1/+1
| | | | | * txr.1: Under "Treatment of Literals", fix lack of close double quote in italicization of vice versa.
* doc: clarify definition of top-level form.Kaz Kylheku2021-04-041-3/+6
| | | | | | | * txr.1: In the definition of what is a top-level form to the compiler, replace poor wording about macro-expansion in rule 6, and add a rule which makes it clear that the rules are recursive.
* doc: note about environment handling in compile.Kaz Kylheku2021-04-041-1/+11
| | | | | | * txr.1: Add notes about environment handling when an interpreted function is compiled, and how hlet/hlet* can be used to obtain sharing.
* doc: fix missing item periods.Kaz Kylheku2021-04-041-20/+20
| | | | * txr.1: All missing item number periods added.
* doc: double word in awk intro.Kaz Kylheku2021-04-041-1/+1
| | | | * txr.1: Fix "implement implement".
* awk: relax restriction on :name.Kaz Kylheku2021-04-042-10/+9
| | | | | | | | * share/txr/stdlib/awk.tl (sys:awk-expander): Do not impose stricter restrictions on :name than the block mechanism itself. * txr.1: Documentation updated.
* doc: block names need not be symbols.Kaz Kylheku2021-04-041-1/+7
| | | | | * txr.1: The block implementation doesn't care whether blocks are symbols; anything comparable with eq may be used.
* func-optparam-count: bugfix.Kaz Kylheku2021-04-031-1/+1
| | | | | | | * lib.c (get_param_counts): If there are no optional parameters, then the oa variable stays negative; we must turn that into a zero, otherwise we return the bogus value -1 as the number of optional arguments.
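The kind of bug described above can be illustrated with a small C sketch. It is not the code of get_param_counts: the parameter list is modeled as an array of strings, the name is hypothetical, and a lone ":" is assumed to be the marker that introduces optional parameters. The counter starts at -1 to mean "no marker seen", so it has to be clamped before being reported.

```c
#include <string.h>

/* Hypothetical sketch: returns the number of optional parameters in a
   parameter list in which ":" separates required from optional ones. */
int optional_param_count_sketch(const char *const *params, int n)
{
  int oa = -1;  /* -1 means the ":" marker has not been seen */

  for (int i = 0; i < n; i++) {
    if (oa < 0) {
      if (strcmp(params[i], ":") == 0)
        oa = 0;           /* marker found: start counting optionals */
    } else {
      oa++;
    }
  }

  /* Without this clamp, a list with no marker would report -1. */
  return oa < 0 ? 0 : oa;
}
```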
* lib: new function for documentation lookup.Kaz Kylheku2021-04-035-1/+2142
| | | | | | | | | | | | | | | | * genman.txr: dump contents of symhash into a doc-syms.tl library file, as a defvarl form. * lisplib.c (doc_instantiate, doc_set_entries): New static functions. (lisplib_init): Register autoload for doc-lookup module to symbols doc and *doc-url*. * share/txr/stdlib/doc-lookup.tl: New file. * share/txr/stdlib/doc-syms.tl: Likewise. * txr.1: Documented.
* doc: dialect note capitalization.Kaz Kylheku2021-03-311-8/+8
| | | | * txr.1: Consistently capitalize Dialect Note
* doc: PP fixes.Kaz Kylheku2021-03-311-4/+0
| | | | | * txr.1: Remove two unnecessary .PP directives and a blank line before one.
* doc: formatting of notes under circle, erase notation.Kaz Kylheku2021-03-311-3/+7
| | | | | | * txr.1: Don't use TP* for notes and dialect notes because it doesn't fit these paragraphs that don't have an indented margin.
* doc: bad indentation under if directive.Kaz Kylheku2021-03-311-1/+1
| | | | * txr.1: Add .PP to deindent after example.
* doc: fix wording under --lispKaz Kylheku2021-03-311-2/+2
| | | | | * txr.1: Fix grammar problem and wording for --lisp and --compiled.
* doc: split up -l or --lisp-bindingsKaz Kylheku2021-03-311-1/+2
| | | | | * txr.1: Give the two -l and --lisp-bindings synonyms in the same way as other synonyms, as two separate .IP items.
* doc: style items better, without grid style.Kaz Kylheku2021-03-311-6/+11
| | | | | | * genman.txr: Use an alternative solution for dl.items elements which places short items to the left of their defining text, while allowing long items to overhang.
* doc: blank lines after IP sections.Kaz Kylheku2021-03-302-21/+10
| | | | | | | | * checkman.txr (check-ip): New pattern function for checking for IP, coIP and meIP macros followed by blank line. This causes a formatting issue in HTML. * txr.1: Fix numerous instances of problem caught by check-ip.
* doc: missing RS/RE.Kaz Kylheku2021-03-301-0/+2
| | | | * txr.1: add .RS/.RE pair in Quote and Quasiquote.
* doc: add grid styling to itemized lists.Kaz Kylheku2021-03-301-1/+14
| | | | | * genman.txr: add CSS rules targeting <dl class="items">, which are now supported in man2html.
* doc: incorrect synopsis of push.Kaz Kylheku2021-03-301-4/+5
| | | | | | * txr.1: Under the summary of place-mutating operations, rewrite the description of push which falsely claims that the pushed item is returned.
* compiler: incorrect self-check in spy framework.Kaz Kylheku2021-03-301-2/+2
| | | | | | * share/txr/stdlib/compiler.tl (compiler (pop-closure-spy, pop-access-spy)): The stack underflow check must be done by checking top, not the incoming spy argument.
* doc: copy and paste of :wrap under window-mapKaz Kylheku2021-03-301-1/+1
| | | | * txr.1: Fix about :reflect wrongly referring to :wrap.
* doc: fix under stream indentationKaz Kylheku2021-03-301-1/+1
| | | | * txr.1: indent-foff misspelled as intent-foff.
* doc: numerous grammar fixes.Paul A. Patience2021-03-281-21/+25
| | | | | * txr.1: Fix grammar, punctuation, formatting, and cases of misspellings landing on dictionary words.
* expander: fun: misleading diagnostic.Kaz Kylheku2021-03-281-1/+1
| | | | | * eval.c (do_expand): argument of fun is not in "operator position"; fixed wording.
* doc: fix space before period.Kaz Kylheku2021-03-281-2/+2
| | | | | * txr.1: Fix two occurrences of \*(TL being separated from a period by a space in the ARGUMENTS AND OPTIONS section.
* compiler: cache param-info objects.Kaz Kylheku2021-03-272-13/+16
| | | | | | | | | | | | | | | * share/txr/stdlib/compiler.tl (%param-info%): New global variable. (compiler comp-fun-form): Use get-param-info function to get param-info object. (get-param-info): Retrieve object from cache, using the function as the key. If not found, create the entry. (compiler-emit-warning): Use get-param-info. * share/txr/stdlib/param.tl (struct param-info): Remove symbol slot, replacing it with the function. (param-info :postinit): No need to do symbol-function lookup; the function is given.
* compiler: regressions in source loc propagationKaz Kylheku2021-03-271-25/+27
| | | | | * share/txr/stdlib/compiler.tl (reduce-lisp, reduce-constant): Propagate source location to rewritten forms.
* compile/eval: more standard formatting for diags.Kaz Kylheku2021-03-275-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch eliminates parentheses from the error messages, as well as a leading ./ being added to relative paths. The word "warning: " is moved into the error message, so that it does not appear before the location. Example, when doing (compile-file "path/to/foo.tl"). Before patch: warning: (./path/to/foo.tl:37): unbound function foo After: path/to/foo.tl:37: warning: unbound function foo Now when I compile out of Vim, it nicely jumps to errors in Lisp code. * eval.c (eval_exception): Drop parentheses from error location, add colon. (eval_warn): Prepend "warning: " to format string. (eval_defr_warn): Drop parentheses from location, and prepend "warning: " to format string. * parser.c (repl-warning): Drop "warning:" prefix. * share/txr/stdlib/compiler.tl (open-compile-streams): Do not do parent substitution for relative paths if the parent path is the empty string "", to avoid inserting ./ onto relative paths in that case. * share/txr/stdlib/error.tl (sys:loc): Drop parentheses and space from location. (compile-error): Separate location with colon and space. (compile-warning, compile-defr-warning): Likewise and add "warning: " prefix. * unwind.c (uw_rthrow): Drop "warning: " prefix. (uw_warningf): Add "warning: " prefix. (uw_dump_deferred_warnings): Drop "warning: " prefix.
* compiler: bugfix: bad expand-quasi-mods call.Kaz Kylheku2021-03-271-1/+1
| | | | | | | | | * share/txr/stdlib/compiler.tl (expand-quasi-args): Here, expand-quasi-mods is being called with the wrong number of arguments. This was likely intended to be a recursive call to expand-quasi-args. Let's convert it to that. Removing this case also works, but it is nicer not to generate the sys:fmt-simple call.
* compiler: check number of arguments.Kaz Kylheku2021-03-272-35/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We implement rudimentary compile-time checking between function calls and function definitions. * share/txr/stdlib/compiler.tl (dstruct frag): We add one more optional BOA parameter, corresponding to a new slot. This is used when compiling a lambda. A lambda fragment is annotated with the parameter parser object which gives information about its arguments. (struct fbinding): New slot, pars. When processing a sys:fbind or sys:lbind form, we decorate the lexical function bindings with the parameter object pulled from the lambda fragment that is compiled for each function binding. (*unchecked-calls*): New special variable. This is used for checking, at the end of the compilation unit, the arguments of calls to functions that were not defined at the time of the call. (compiler comp-fbind): When processing the lambda expressions, propagate the parameter object from the compiled lambda fragment to the function binding. (compiler comp-fun-form): On entry, look up the function being called and if it is lexical or has a global definition, check the arguments. If it has no definition, push information into the *unchecked-calls* list to do the check later, if possible. Also, there is a behavior change here now: optimizations are now applied here only to functions that don't have a lexical binding. Thus if the application lexically redefines a standard function, and calls it, we won't try to optimize it. (param-check): New function. * share/txr/stdlib/param.tl (param-info): New struct. This presents information about a global function in a similar way to param-parser, using some of the same fields. With this object we can check the call to a lexical function or global function in a uniform way, using the same code.
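The core of an argument-count check of this kind can be sketched in C. The real check is written in TXR Lisp (param-check in compiler.tl) and its details are not given in this log; the struct fields and names below are assumptions chosen for illustration, standing in for whatever param-info and param-parser actually expose.

```c
/* Hypothetical summary of a function's parameter list. */
struct param_info_sketch {
  int required;  /* number of required parameters */
  int optional;  /* number of optional parameters */
  int variadic;  /* nonzero if trailing arguments are accepted */
};

/* Returns -1 for too few arguments, 1 for too many, 0 if the call fits. */
int check_arg_count_sketch(const struct param_info_sketch *pi, int nargs)
{
  if (nargs < pi->required)
    return -1;
  if (!pi->variadic && nargs > pi->required + pi->optional)
    return 1;
  return 0;
}
```

Because the check consumes only this summary, the same routine can serve lexical and global functions alike, and calls to functions that are not yet defined can be recorded and checked once the definition's summary becomes available.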
* compiler: fix: careless constant folding of call.Kaz Kylheku2021-03-271-1/+4
| | | | | | | | | | | * share/txr/stdlib/compiler.tl (compiler comp-apply-call): The conditions for constant-folding a call expression are too weak. The first argument could be a quoted symbol, which is a constant expression, and so we end up wrongly evaluating an expression like (call 'print '3) at compile time. We can constant-fold if the first expression evaluates to a symbol, which names a constant-foldable function, or else if it evaluates to something which is not a bindable symbol.
* Version 255txr-255Kaz Kylheku2021-03-266-971/+1010
| | | | | | | | | | * RELNOTES: Updated. * configure, txr.1: Bumped version and date. * share/txr/stdlib/ver.tl: Bumped. * txr.vim, tl.vim: Regenerated.
* doc: improve wording under copy-hash.Kaz Kylheku2021-03-261-5/+6
| | | | | * txr.1: Relationship between make-similar-hash and copy-hash is expressed more accurately.