summaryrefslogtreecommitdiffstats
path: root/stdlib/optimize.tl
Commit message (Collapse)AuthorAgeFilesLines
* Copyright year bump 2025.Kaz Kylheku2025-01-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, autoload.c, autoload.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c, gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/comp-opts.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/csort.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/expander-let.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/glob.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/load-args.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright bumped to 2025.
* tests: suppress warnings in seq.tl.Kaz Kylheku2024-03-081-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When tests/012/compile.tl compiles tests/012/seq.tl, there are now some compiler warnings due to constant expressions that throw. We introduce a new compiler option to suppress them, and then use it. * stdlib/comp-opts.tl: New file. The definitions related to compiler options are moved here out of compile.tl, so that optimize.tl can use them. * stdlib/compiler.tl (compile-opts, %warning-syms%, when-opt, *compile-opts*, opt-controlled-diag): Moved to comp-opts.tl. New constant-throws option added to compile-opts and %warning-syms%. (safe-constantp): Make the constant expression throws diagnostic conditional on the new option. * stdlib/optimize.tl: Load comp-opts file. (basic-blocks do-peephole-block): Make diagnostic about throwing situation subject to constant-throws option. * tests/012/seq.tl: Turn off constant-throws warning option before the ref tests that work with ranges. Fix: one of the expressions calls refs with the wrong number of arguments, which was unintentional. * txr.1: Document new diagnostic option.
* Copyright year bump 2024.Kaz Kylheku2024-01-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, autoload.c, autoload.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c, gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/csort.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/expander-let.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/glob.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/load-args.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2024.
* compiler: optimizer must watch for throwing constant exprsKaz Kylheku2023-12-201-10/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have these issues, which are regressions: 1> (compile-toplevel '(/ 1 0)) ** expr-1:1: warning: sys:b/: constant expression (sys:b/ 1 0) throws ** /: division by zero ** during evaluation at expr-1:1 of form (sys:b/ 1 0) 1> (compile-toplevel '(let ((a 1) (b 0)) (/ a b))) ** /: division by zero ** during evaluation at expr-1:1 of form (compile-toplevel [...]) While the compiler's early pass constant folding is careful to detect constant expressions that throw, care was not taken in the optimizer's later constant folding which takes place after constant values are propagated around. After the fix: 1> (compile-toplevel '(let ((a 1) (b 0) (c t)) (if c (/ a b)))) ** expr-1:1: warning: let: function sys:b/ with arguments (1 0) throws #<sys:vm-desc: 9aceb20> 2> (compile-toplevel '(let ((a 1) (b 0) (c nil)) (if c (/ a b)))) #<sys:vm-desc: 9aef9f0> * stdlib/compiler.tl (compiler): New slot top-form. (compile-toplevel): Initialize the top-form slot of the compiler. The optimizer uses this to issue a warning now. Since the warning is based on analyzing generated code, we cannot trace it to the code more precisely than to the top-level form. * stdlib/optimize.tl (basic-blocks): New slot, warned-insns. List of instructions that have been warned about. (basic-blocks do-peephole-block): Rearrange the constant folding case so that as part of the pattern match condition, we include the fact that the function will not throw when called with those constant arguments. Only in that case do we do the optimization. We warn in the case when the function call does throw. A function rejected due to throwing could be processed through this rule multiple times, under multiple peephole passes, so for that reason we use the warned-insns list to suppress duplicate warnings.
* compiler: bug: constant folding load-time dregs.Kaz Kylheku2023-08-041-1/+2
| | | | | | | | | | | | | | | | The optimizer eliminates calls to pure library functions when all their arguments are D-registers. The call is made at compiled time and its value is inserted into the program as a constant (in a newly allocated D register). The bug is that we can't do this for a D register that is linked to a load-time value, because we don't know its value until run-time. * stdlib/optimize.tl (basic-blocks do-peephole-block): Add a constraint that none of the D registers can be a member of bb.lt-dregs, which holds the list of D registers that are used for load-time values.
* compiler: bug: disappearing basic block nojoin flag.Kaz Kylheku2023-07-311-0/+1
| | | | | | | | | | | Discovered while experimenting with new optimizations. * stdlib/optimize.tl (basic-blocks join-block): When we join the following block into the current block, we must propagate the nojoin property of the following block. The nojoin property has to do with the last instruction being xend. The joined block has that last instruction and so must be nojoin.
* compiler: bugfix: dangling rlinks after dead code eliminationKaz Kylheku2023-07-311-8/+10
| | | | | | | | | | | | | Discovered while experimenting with new optimizations. * stdlib/optimize.tl (basic-blocks :postinit): Pass t argument to new parameter of basic-blocks link-graph. (basic-blocks link-graph): New parameter indicating whether this is the first call; if false, we reset all the links. (basic-blocks elim-dead-code): This no longer has to reset the links before calling link-graph. But now calls link-graph one more time after the dead code removal so that no dead blocks appear in the graph.
* compiler: use partition-if for basic block division.Kaz Kylheku2023-07-281-5/+5
| | | | | | | | * stdlib/optimize.tl (basic-blocks :postinit): Calculate the basic block partitions more directly using partition-if, eliminating the calculation of two sequences of indices that have to be merged and then passed to the partition function.
* compiler: compress symbol tables also.Kaz Kylheku2023-07-261-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | When functions are optimized away due to constant folding, instead of replacing them with a nil, we now compact the table to close the gaps and renumber the references in the code. * stdlib/compiler.tl (compiler null-stab): Method removed. (compiler compact-dregs): Renamed to compact-dregs-and-syms. Now compacts the symbol table also. This is combined with D-reg compacting because it makes just two passes through the instruction: a pass to identify the used D registers and symbol indices, and then another pass to edit the instructions with the renamed D registers and renumbered symbol indices. (compiler optimize): Remove the call to the null-unused-data on the basic-blocks object; nulling out D regs and symbol table entries is no longer required. Fllow the rename of compact-dregs to compact-dregs-and-syms which is called the same way otherwise. * stdlib/optimize.tl (basic-blocks null-unused-data): No longer used method removed.
* compiler: compact D registers.Kaz Kylheku2023-07-251-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | We now have some constant folding in the optimizer too, not just in the front end compiler pass. This is leaving behind dead D registers that are not referenced in the code. Let's compact the D register table to close the gap. * stdlib/compiler.tl (compiler get-dreg): In this function we no longer check that we have allocated too many D registers. We let the counter blow past %lev-size%. Because this creates the fighting chance that the compaction of D regs will reduce their number to %lev-size% or less. By doing this, we allow code to be compilable that otherwise would not be: code that allocates too many D regs which are then optimized away. (compiler compact-dregs): New function. Does all the work. (compiler optimize): Compact the D regs at optimization level 5 or higher. (compile-toplevel): Check for an overflowing D reg count here, after optimization. * stdlib/optimize.tl (basic-blocks null-unused-data): Here, we no longer have to do anything with the D registers.
* compiler: code formatting.Kaz Kylheku2023-07-251-1/+1
| | | | | | | | * stdlib/compiler.tl (compiler get-dreg): Fix indentation proble. * stdlib/optimize.tl (basic-block fill-treg-compacting-map): Likewise.
* compiler: constant fold gapply like gcall.Kaz Kylheku2023-07-171-3/+6
| | | | | | | * stdlib/optimize.tl (basic-blocks do-peephole-block): Extend constant-folding case to recognize gapply as well as gcall. We just have to take care in how we apply apply arguments to the actual function to get the value.
* compiler: new apply-to-gapply optimizationKaz Kylheku2023-07-171-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's consider the DIM expression [a . b c]. Without this change we get: syms: 0: a 1: b 2: c code: 0: 98020000 getf t2 0 1: 98030001 getf t3 1 2: 98040002 getf t4 2 3: 1C020002 apply t2 t2 t3 t4 4: 00030002 5: 00000004 6: 10000002 end t2 With this change: syms: 0: a 1: b 2: c code: 0: 98030001 getf t3 1 1: 98040002 getf t4 2 2: 24020002 gapply t2 0 t3 t4 3: 00030000 4: 00000004 5: 10000002 end t2 There are 17 hits for this optimization in optimize.tl alone! * stdlib/optimize.tl (basic-blocks do-peephole-block): New pattern here. We recognize an instruction sequence which begins with a (getf treg idx) and ends in an (apply dest treg ...), without any instructions in between accessing or redefining treg. Additionally, the treg value must not be used after the apply, unless the apply redefines it. In that case, we rewrite this pattern to eliminate that initial getf instruction, and substitute (gapply dest idsx ..) for the apply.
* compiler: recognize T0 register (nil) as constant.Kaz Kylheku2023-07-151-2/+4
| | | | | | * stdlib/optimize.tl (basic-blocks do-peephole-block): The constant folding case should fire even if some of the arguments of the call aren't D registers but T0.
* compiler: move material into constfun.tlKaz Kylheku2023-07-151-0/+2
| | | | | | | | | | | | | * stdlib/compiler.tl (%effect-free-funs%, %effect-free%, %functional-funs%, %functional%): Move variables into stdlib/constfun.tl * stdlib/constfun.tl %effect-free-funs%, %effect-free%, %functional-funs%, %functional%): Moved here. * stdlib/optimize.tl: Use load-for to express dependency on constfun module; don't depend on the compiler having loaded it.
* compiler: constant folding in optimizer.Kaz Kylheku2023-07-151-1/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The compiler handles trivial constant folding over the source code, as a source to source transformation. However, there are more opportunities for constant folding after data flow optimizations of the VM code. Early constant folding will not fold, for instance, (let ((a 2) (b 3)) (* a b)) but we can reduce this to an end instruction that returns the value of a D register that holds 6. Data flow optimizations will propagate the D registers for 2 and 3 into the gcall instruction. We can then recognize that we have a gcall with nothing but D register operands, calling a constant-foldable function. We can allocate a new D register to hold the result of that calculation and just move that D register's value into the target register of the original gcall. * stdlib/compiler.tl (compiler get-dreg): When allocating a new D reg, we must invalidate the datavec slot which is calculated from the data hash. This didn't matter before, because until now, get-datavec was called after compilation, at which point no new D regs will exist. That is changing; the optimizer can allocate D regs. (compiler null-dregs, compiler null-stab): New methods. (compiler optimize): Pass self to constructor for basic-blocks. basic-blocks now references back to the compiler. At optimization level 5 or higher, constant folding can now happen, so we call the new method in the optimizer to null the unused data. This overwrites unused D registers and unused parts of the symbol vector with nil. * stdlib/optimize (basic-blocks): Boa constructor now takes a new leftmost param, the compiler. (basic-blocks do-peephole-block): New optimization case: gcall instruction invoking const-foldable function, with all arguments being dregs. (basic-blocks null-unused-data): New method.
* compiler: bugfix: wrong condition in late-peephole.Kaz Kylheku2023-05-041-1/+1
| | | | | | | | | | | * stdlib/optimize.tl (basic-blocks late-peephole): The test whether lab2 is used is bogus, and will never be true. The correct test is simply whether the block has two or more rlinks. This makes no difference in the standard library images. When the bug appears, the manifestation would be that a needed label is deleted, resulting in an exception from the assembler.
* compiler: liveness bug involving closures.Kaz Kylheku2023-05-041-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 2022-09-13 commit 6e354e1c2d5d64d18f527d52db75e344a9223d95, subject "compiler: bugfixes in dead code elimination", introduced a problem. By allowing the closure body blocks to be included in the links of the previous basic block that ends in the close instruction, it caused liveness info to flow out out of close blocks into the close instruction, which is wrong. Thus registers used inside a closure, which are entirely private, wrongly appear live outside of the closure, interfering with optimizations like eliminating dead registers. We can't simply roll back the commit because the bug it fixes will reappear. The fix is to pair the next field with a prev field, and maintain them; don't rely on the rlinks to point to the previous block. * stdlib/optimize.tl (basic-block): New slot, prev. (back-block join-block): As we delete the next block, we must update that block's next block's prev link. (basic-blocks link-graph): Build the prev links. Fix the bug in handling the close instruction: do not list the close body code among the links, only the branch target of the close. (basic-blocks do-peephole-block): In a few cases in which we set the bl.next to nil, we also set the bl.next.prev to nil, if bl.next exists. (basic-blocks elim-dead-clode): Reset the bl.prev of every block also. (basic-block check-bypass-empty): Here, we no longer depend on rlinks containing the previous block; the prev gives it to us. So we move that fixup out of the link, and also fix up the next blocks prev pointer.
* compiler: simplify live-info defined set semantics.Kaz Kylheku2023-04-181-6/+4
| | | | | | | | * stdlib/optimize.tl (basic-blocks local-liveness): Just store the mask of defined registers into each live-info. Do not propagate the defined mask from the next instruction backwards. The way the defined mask is used in calc-liveness, this makes no difference, and is simpler and faster.
* compiler: propagate t-reg rename into end insn.Kaz Kylheku2023-04-171-1/+1
| | | | | | | | * stdlib/optimize.tl (basic-blocks rename): When we stop the renaming due to an end instruction and the src being a v-reg, we can still do the rename in that end instruction itself. If the v-reg becomes invalid, that doesn't happen until after the instruction.
* compiler: rewrite t-regs through defining instruction.Kaz Kylheku2023-04-171-6/+24
| | | | | | | | | | | | | * stdlib/optimize.tl (subst-preserve): Rename list param to insn for clarity. (careful-subst-preserve): New function. This is like subst-preserve, but used only for instructions that have destination registers. It performs a rewrite such that those destination positions are avoided. (basic-blocks rename): When the instruction has src or dst as a target, don't just stop before that insn. Do the substitution in the source operands using careful-subst-preserve.
* compiler: move peephole pattern and remove condition.Kaz Kylheku2023-04-171-15/+11
| | | | | | | | | | | | | | | | | * stdlib/optimize.tl (basic-blocks do-peephole-block): Remove the local function only-locally-used-treg. This is unnecessary because the optimization is valid even if the treg is used in downstream basic blocks. It was necessary previously in the old version of this optimization in which we deleted the first instruction which sets the treg's value. We are now depending on it being identified as a dead register. Also, moving the rule to the end. The reason is that there are cases when the pattern matches, but it returns insns. That causes the rewrite macro to march down to the next instruction, skipping other patterns. This could be bad, unless the pattern is the last one tried before the @else fallback.
* compiler: allow v reg source in t-reg optimizationKaz Kylheku2023-04-161-13/+15
| | | | | | | | | | | | This change is now possible due to the previous bugfix. * stdlib/optimize.tl (basic-blocks rename): If the source register is a v-reg, do not allow the propagation past an end instruction. This is a precaution because the end instruction could be the end of the frame in which the v-register is valid; we don't want to propagate it outside of that frame.
* compiler: bugfix: wrong propagation into close insn.Kaz Kylheku2023-04-161-1/+3
| | | | | | | | | * stdlib/optimize.tl (basic-blocks rename): When we encounter a close instruction, we must leave it alone. The registers named in the argument area of the instruction do not belong to the current instruction stream or basic block; they belong to the function body.
* compiler: tighten cases in liveness calculationKaz Kylheku2023-04-151-8/+8
| | | | | | | | * stdlib/optimize.tl (basic-blocks local-liveness): Handle all instructions explicitly with no catch-all behavior. Make a copy of the live-info even for instructions that have no source or destination operands, so that they don't mistakenly marked as having defs or refs.
* compiler: keep track of multiple defs in live-info.Kaz Kylheku2023-04-101-17/+22
| | | | | | | | | | | * stdlib/optimize.tl (live-info): Slot def replaced by def0 and def1. (basic-blocks local-liveness): The local function def becomes defs: it can take two defs. These become def0 and def1. In the catch instruction case, we use both arguments, capture the resulting live-info and use it to call refs. (basic-blocks rename): Check whether either def0 or def1 is the source or destination.
* compiler: streamline live-info object creation.Kaz Kylheku2023-04-101-9/+8
| | | | | | | | | | | | | | * stdlib/optimize.tl (basic-blocks local-liveness): When processing a pure def, we don't copy the live-info unconditionally, which is waseteful since if the destination register is a t-reg, we will invoke (new live-info) to make yet another live info. Instead, let's destructively mutate the incoming live info from the instruction below, and return a copy that is made before that is done. In the def-ref case, the local copy is entirely superfluous, because in all cases we return a new object. We also eliminate redundant (set [bb.li-hash insn] li) evaluations.
* compiler: bug in liveness calculation over catch insnKaz Kylheku2023-04-101-1/+4
| | | | | | | | | * stdlib/optimize.tl (basic-blocks local-liveness): The exception symbol and argument registers in the catch instruction are clobbers, not references. We must treat them as defs. Unfortunately, the instruction has two clobbers but live-info has only one def slot, which should be fixed.
* compiler: improve t-reg copy elimination.Kaz Kylheku2023-04-101-7/+18
| | | | | | | | | | | | | * optimize.tl (rename): Instead of a mapping operation, we perform the substitution only until we hit an instruction that defines either the src or dst register. (basic-blocks do-peephole-block): Drop the conditions for doing the rename: that neither register can be defined somewhere in the rest of the block. This restriction is too limiting. We have to be careful now; we cannot delete the first instruction, and must only set the recalc flag and add to the rescan list if the substitution did something, to avoid looping.
* compiler: buggy t-reg move peephole case.Kaz Kylheku2023-04-101-3/+1
| | | | | | | | | | | * stdlib/optimize.tl (basic-blocks do-peephole-block): In the unnecessary copying t-reg case, let's just stay away from doing it if the source operand is a v-reg. It breaks under the recent "eval order of variables" commit, indicating that the conditions that it uses for replacing a v-reg with the t-reg are not correct. The most likely reason is that the v-reg can be assigned, but this doesn't show up in the liveness info which tracks only t-regs.
* compiler: small fix in optimizer.Kaz Kylheku2023-04-081-2/+2
| | | | | | * stdlib/optimizer.tl (basic-blocks do-peephole-block): Use pushnew instead of push in one peephole case, so the block isn't pushed onto the tryjoin and rescan lists twice.
* compiler: iterate on level 4-5 optimizations.Kaz Kylheku2023-04-071-0/+3
| | | | | | | | | | | | | * stdlib/optimize.tl (basic-blocks num-blocks): New method. * stdlib/compiler.tl (compiler optimize): At optimization level 6, instead of performing one extra pass of jump threading, dead-code elimintation and peephole optimizations, keep iterating on these until the number of basic blocks stays the same. * txr.1: Documented.
* compiler: optimization improvementsKaz Kylheku2023-04-071-7/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * stdlib/optimize.tl (basic-blocks peephole-block): Drop the code argument, and operate on bl.insns, which is stored back. Perform the renames in the rename list after the peephole pass. (basic-blocks rename): New method. (basic-blocks do-peephole-block): Implementation of peephole-block, under a new name. The local function called rename is removed; calls to it go to the new rename method. (basic-blocks peephole): Simplify code around calls to peephole-block; we no longer have to pass bl.insns to it, capture the return value and store it back into bl.insns. * stdlib/compiler.tl (*opt-level*): Initial value changes from 6 to 7. (compiler optimize): At optimization level 6, we now do another jump threading pass, and peephole, like at levels 4 and 5. The peephole optimizations at level 5 make it possible to coalesce some basic blocks in some cases, and that opens up the possibility for more reductions. The previously level 6 optimizations are moved to level 7. * txr.1: Updated documentation of optimization levels, and default value of *opt-level*. * stdlib/doc-syms.tl: Updated.
* compiler: small local refactoring in optimizer.Kaz Kylheku2023-04-061-8/+4
| | | | | | | * stdlib/optimize.tl (basic-blocks peephole-block): Move local rename function into main labels block, so other optimizations will be able to use it. Remove an unused argument, and change the recursion to a mapcar, since that's what it's doing.
* lib: switch from use function to ignore functionKaz Kylheku2023-03-211-1/+1
| | | | | | | | | | | * stdlib/compiler.tl (compiler (comp-atom, comp-dwim), safe-const-reduce, igno-notfound): Use ignore rather than use for marking unused variable. * stdlib/copy-file.tl (copy-files, copy-path-rec, remove-path-rec, chmod-rec, chown-rec): Likewise. * stdlib/optimize.tl (basic-block print): Likewise.
* compiler: unused warnings in optimizer.Kaz Kylheku2023-03-211-62/+62
| | | | | | | | | | | | | | | | | | * stdlib/optimizer.tl (basic-block print): Suppress warning for pretty-p parameter using the use function. (basic-blocks (local-liveness, calc-liveness, thread-jumps-block, peephole-block, late-peephole, fill-treg-compacting-map), (basic-block apply-treg-compacting-map), dedup-labels): Fix unused variables in pattern, mostly by replacing them by @nil. (basic-blocks check-bypass-empty): Method moved, turned into (basic-block check-bypass-empty), losing the unused basic-blocks parameter. (basic-blocks elim-next-jump): Likewise moved into basic-block class. (basic-blocks elim-dead-code): Calls to check-bypass-empty and elim-next-jump adjusted.
* compiler: bug: unmatchable pattern in optimizer.Kaz Kylheku2023-03-211-1/+1
| | | | | | | | | | | This was uncovered by unused variable warnings. * stdlib/optimize.tl (basic-blocks peephole-block): The pattern trying to detect wasteful register moves is incorrect; the reg1 term is intended to be the @reg1 variable. It is matched literally and so will not match because the symbol reg1 does not literally occur in the VM code.
* Copyright year bump 2023.Kaz Kylheku2023-01-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, autoload.c, autoload.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c, gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2023.
* compiler: bug: bad basic-block merge across end insn.Kaz Kylheku2022-09-151-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The bad situation reproduced as a miscompilation of some prof forms at *opt-level* 5 or above. The basic idea is that there is a situation like this prof t2 ... profiled code here producing value in t8 mov t2 t8 end t2 end t2 The code block produces a value in t8, which is copied into t2, and executes the end instruction. This instruction does not fall through to the next one but passes control back to the prof instruction. The prof instruction then stores the result value, which came from t2, back into the t2 register and resumes the program at the end t2. The first bad thing that happens is that the end instructions get merged together into one basic block. The optimizer then treats them without regard for the prof instruction, as if they were a linear sequence. It looks like the register move mov t2 t8 is wasteful and so it eliminates it, rewriting the end instruction to: end t8 end t8 Of course, the second instruction is now wrong because prof is still producing the result in t2. To fix this without changing the instruction set, I'm introducing another pseudo-op that represents end, called xend. This is similar to jend, except that jend is regarded as an unconditional branch whereas xend isn't. The special thing about xend is that a basic block in which it occcurs is marked as non-joinable. It will not be joined with the following basic block. * stdlib/asm.tl (xend): New alias opcode for end. * stdlib/compiler.tl (comp-prof): Use xend to end prof fragment, rather than plain end. * stdlib/optimize.tl (basic-block): New slot, nojoin. If true, block cannot be joined with next one. (basic-blocks jump-ops): Add xend to list of jump ops, so that a basic block will terminate on xend. (basic-blocks link-graph): Set the nojoin flag on a basic block which contains (and thus ends with) xend. (basic-blocks local-liveness): Add xend to the case in def-ref that handles end. (basic-blocks (peephole, join-blocks)): Refuse to join blocks marked nojoin. * tests/019/comp-bugs.tl: New file with miscompiled test case that was returning 42 instead of (42 0 0 0) as a result of the wrong register's value being returned.
* compiler: bugfixes in dead code eliminationKaz Kylheku2022-09-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | * stdlib/optimize (basic-blocks ling-graph): I'm reverting an old design decision here. The decision is this: the basic block of a close instruction points to the first basic block of the closure as its next block, but that next block does not point back: it doesn't list the close instruction's basic block among the rlinks. The idea was that the close instruction doesn't jump to that block, and so it shouldn't be linked to it. However, the next link was set purely so that the graph is connected. Unfortunately, the inconsistency in the graph structure which this causes is a problem in the elim-dead-code method. A situation arises when that first basic block after the close is removed. Because pit has an empty rlinks list, the block remains listed as the next block of the close block, even though it is removed from the master list of blocks. (basic-blocks check-bypass-empty): Fix one forgotten detail in this function: the block being deleted must be removed from the rlinks list of the next block.
* optimizer: fix live set being unexpectedly nil.Kaz Kylheku2022-06-091-2/+2
| | | | | | | | | | | | | | | | | | | | The following test case throws an exception: (compile-toplevel '(when-match @(or @b @c) nil nil)) the reason is that in the optimizer, the local-liveness method resets the bl.live set of live registers to nil. The expectation is that it will be recalculated. However, what happens is that * stdlib/optimize.tl (basic-block): Initialize live slot to zero. (basic-blocks local-liveness): Reset bl.live to 0, rather than nil. The problem is that sometimes the block in question is not reached in the graph traversal (down in the same function) which is supposed to assign it a live value. This happens in the above test case, and it's not due to any bug: the block is not reached by the forward traversal because the block has become unreachable.
* optimizer: remove root slot from basic-block.Kaz Kylheku2022-06-091-6/+3
| | | | | | | | | | | | | | The root slot is just supposed to be (car bl.list): the first basic block. Unfortunately, it's not maintained when bl.list is edited, which could cause bugs. This change makes no difference in the stdlib compiled files, other than of course the changes in optimize.tlo from this code change itself; so that is evidence suggesting this change is least not making anything worse. * stdlib/optimize.tl (basic-blocks): Remove root slot. (basic-blocks (link-graph, calc-liveness, elim-dead-code): Refer to (car bl.list) instead of bl.root.
* compiler: few more cases of ifq/ifql removal.Kaz Kylheku2022-01-181-1/+10
| | | | | | | | | | * optimize.tl (basic-blocks peephole-block): Check for the reversed arguments case of (ifq (d x) (t 0)), and also match ifq. Add a case for two different d registers being compared by ifq or ifql which are not both implicated as load-time regs; that also converts to an unconditional jmp to the else label. Add a case for a register being compared with itself with ifq or ifql, which disappears.
* compiler: two optimizations, motivated by optional params.Kaz Kylheku2022-01-141-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The motivating situation is this: (lambda (: (opt :)) opt) When the default value of an optional parameter is : then the net effect is that there is no optional substituion. The optional argument is signaled by the : symbol, and that same symbol then replaces the value. This is not optimized well: data: 0: : 1: t syms: code: 0: 8C000009 close t2 0 3 9 1 0 nil t2 1: 00000002 2: 00000001 3: 00000003 4: 00000002 5: 3C000008 ifq t2 d0 8 6: 00020400 7: 2C020400 movsr t2 d0 8: 10000002 end t2 9: 10000002 end t2 instruction count: 5 entry point: 4 The instruction sequence 5: 3C000008 ifq t2 d0 8 6: 00020400 7: 2C020400 movsr t2 d0 8: serves no purpose; it's like: (if (eq x y) (set x y)) With this commit it looks like: data: 0: : 1: t syms: code: 0: 8C000006 close t2 0 3 6 1 0 nil t2 1: 00000002 2: 00000001 3: 00000003 4: 00000002 5: 10000002 end t2 6: 10000002 end t2 instruction count: 3 entry point: 4 * stdlib/optimize.tl (basic-blocks peephole-block): Here, we add an optimization for the useless assignment pattern. If an "ifq tx dy label" instruction falls through to a "mov tx dy", then we remove that move instruction from the next block. But only if that next block has nothing else jumping to it! If there are other jumps there, they could be relying on that "mov tx dy", so it cannot be removed. (basic-blocks elim-next-jump): The above optimization may leave us with a useless ifq instruction, which jumps to the same destination whether the comparison is true or not. In elim-next-jmp, we took care only of jmp instructions which uselessly jump to the next block in instruction order. We fix this to also eliminate if and ifq instructions whose destination label is the next block; they are equivalent to an unconditional jump.
* Copyright year bump 2022.Kaz Kylheku2022-01-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | *LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2022.
* New functions: subq, subql, subqual and subst.Kaz Kylheku2021-12-221-5/+0
| | | | | | | | | | | | | | | * eval.c (eval_init): Register new intrinsics. * lib.c, lib.h (subq, subql, subqual, subst): New functions. * tests/012/seq.tl: New test cases. * stdlib/optimize.tl (subst): Function removed. The new subst drop-in replaces this one. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
* compiler: small end/jend issue in late-peephole.Kaz Kylheku2021-12-111-1/+1
| | | | | | | | | * stdlib/optimize.tl (basic-blocks late-peephole): In one pattern, an instruction that is recognized as (jend ...) is inadvertently rewritten to (end ...). Since this is the last optimization stage, currently, and end and jend are synonyms for the same opcode, it doesn't matter. But it could turn into a bug; let's fix it.
* compiler: tweak in basic block debug print.Kaz Kylheku2021-12-111-1/+1
| | | | | | * stdlib/optimize (basic-block print): Print the label of the next block, rather than the block itself. This reduces the verbosity during debugging.
* compiler: register-compacting optimization.Kaz Kylheku2021-12-101-0/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This work addresses the following issues, which cause compiled functions to use more stack space than necessary. Firstly, the compiler doesn't allocate registers tightly. Every closure's registers can start at t2, but this isn't done. Secondly, data flow optimizations eliminate registers, leaving gaps in the register allocation. Code ends up strange: you may see register t63 used in a function where the next highest register below that is t39: evidence that a large number of temporary variables got mapped to registers and eliminated. In this change, an optimization is introduced, active at *opt-level* 6 which compacts the t registers for every closure: every closure's registers are renumbered starting from t2. Then, each closure' generating close instruction is also updated to indicate the accurate number of registers ensuring no space is wasted on the stack when the closure is prepared for execution. * stdlib/compiler.tl (compiler optimize): At optimization level 6, insert a call to basic-blocks compact-tregs just before the instruction are pulled out and put through the late peephole pass. * stdlib/optimize.tl (basic-block): New slot, closer. This is pronounced "clozer", as in one who closes. For any basic block that is the head of a closure (the entry point ito the closure code), this slot is set to point to the previous block: the one which ends in the close instruction which creates this closure: the closer. This is important because the close instruction can use t registers for arguments, and those registers belong to the closure. Those argument registers must be included in the renaming. (basic-blocks): New slots closures and cl-hash. The former lists the closure head basic blocks; all the basic blocks which are the entry blocks of a closure. The cl-hash associates each head block with a list of all the blocks (including the head block). (basic-blocks identify-closures): New method. This scans the list of blocks, identifying the closure heads, and associating them with their closers, to establish the bb.closure list. Then for each closure head in the list, the graph is searched to find all the blocks of a closure, and these lists are put into the bb.cl-hash. (basic-block fill-treg-compacting-map): This method scans a basic block, ferreting out all of it t registers, and adds renaming entries for them into a hash that is passed in. If the block is the head of a closure, then the close instruction of the block's closer is also scanned for t registers: but only the arguments, not the destination register. (basic-block apply-treg-compacting-map): This method renames the t register of a block using the renaming map. It follows the registers in the same way as fill-treg-compacting-map, and consquently also goes into the close instruction to rename the argument registers, as necessary. When tweaking the close instruction, it also updates the instruction's number-of-tregs field with the newly calculated number, which comes from the map size. (basic-blocks compact-tregs): This method ties it together, using the above methods to identify the basic blocks belonging to closures, build register renaming maps for them, and hen apply the renaming.
* compiler: new late-peephole case.Kaz Kylheku2021-11-291-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is related to the pattern in the previous commit. When we have a situation like this: lab1 mov tn nil lab2 ifq tn nil lab4 lab3 gcall tn ... We know that if lab1 is entered, then lab2 will necessarily fall through: the lab4 branch is not taken because tn is nil. But then, tn is clobbered immediately in lab3 by the gcall tn. In other words, the value stored into tn by lab1 is never used. Therefore, we can remove the "mov tn nil" instruction and move the l1 label. lab2 ifq tn nil lab4 lab1 lab3 gcall tn ... There are 74 hits for this pattern in stdlib. * stdlib/optimize.tl (basic-blocks late-peephole): Implement the above pattern.