| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h,
args.c, args.h, arith.c, arith.h, autoload.c, autoload.h,
buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h,
chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure,
debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c,
filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c,
gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S,
lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c,
linenoise/linenoise.h, match.c, match.h, parser.c, parser.h,
parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h,
regex.c, regex.h, signal.c, signal.h, socket.c, socket.h,
stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl,
stdlib/build.tl, stdlib/cadr.tl, stdlib/comp-opts.tl,
stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl,
stdlib/copy-file.tl, stdlib/csort.tl, stdlib/debugger.tl,
stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl,
stdlib/error.tl, stdlib/except.tl, stdlib/expander-let.tl,
stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl,
stdlib/glob.tl, stdlib/hash.tl, stdlib/ifa.tl,
stdlib/keyparams.tl, stdlib/load-args.tl, stdlib/match.tl,
stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl,
stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl,
stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl,
stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl,
stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl,
stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl,
stdlib/vm-param.tl, stdlib/with-resources.tl,
stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h,
struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h,
syslog.c, syslog.h, termios.c, termios.h, time.c, time.h,
tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr,
y.tab.c.shipped: Copyright bumped to 2025.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When tests/012/compile.tl compiles tests/012/seq.tl, there
are now some compiler warnings due to constant expressions
that throw. We introduce a new compiler option to suppress
them, and then use it.
* stdlib/comp-opts.tl: New file. The definitions related
to compiler options are moved here out of compile.tl,
so that optimize.tl can use them.
* stdlib/compiler.tl (compile-opts, %warning-syms%,
when-opt, *compile-opts*, opt-controlled-diag): Moved to
comp-opts.tl. New constant-throws option added to
compile-opts and %warning-syms%.
(safe-constantp): Make the constant expression throws
diagnostic conditional on the new option.
* stdlib/optimize.tl: Load comp-opts file.
(basic-blocks do-peephole-block): Make diagnostic
about throwing situation subject to constant-throws
option.
* tests/012/seq.tl: Turn off constant-throws warning
option before the ref tests that work with ranges.
Fix: one of the expressions calls refs with the
wrong number of arguments, which was unintentional.
* txr.1: Document new diagnostic option.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h,
args.c, args.h, arith.c, arith.h, autoload.c, autoload.h,
buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h,
chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure,
debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c,
filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c,
gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c,
lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c,
match.h, parser.c, parser.h, parser.l, parser.y, psquare.h,
rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c,
socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl,
stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl,
stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl,
stdlib/csort.tl, stdlib/debugger.tl, stdlib/defset.tl,
stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl,
stdlib/except.tl, stdlib/expander-let.tl, stdlib/ffi.tl,
stdlib/getopts.tl, stdlib/getput.tl, stdlib/glob.tl,
stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl,
stdlib/load-args.tl, stdlib/match.tl, stdlib/op.tl,
stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl,
stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl,
stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl,
stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl,
stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl,
stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl,
stdlib/with-resources.tl, stdlib/with-stream.tl,
stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h,
strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h,
termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1,
txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h,
vmop.h, win/cleansvg.txr, y.tab.c.shipped:
Copyright year bumped to 2024.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have these issues, which are regressions:
1> (compile-toplevel '(/ 1 0))
** expr-1:1: warning: sys:b/: constant expression (sys:b/ 1 0) throws
** /: division by zero
** during evaluation at expr-1:1 of form (sys:b/ 1 0)
1> (compile-toplevel '(let ((a 1) (b 0)) (/ a b)))
** /: division by zero
** during evaluation at expr-1:1 of form (compile-toplevel [...])
While the compiler's early pass constant folding is careful
to detect constant expressions that throw, care was not taken
in the optimizer's later constant folding which takes place
after constant values are propagated around.
After the fix:
1> (compile-toplevel '(let ((a 1) (b 0) (c t)) (if c (/ a b))))
** expr-1:1: warning: let: function sys:b/ with arguments (1 0) throws
#<sys:vm-desc: 9aceb20>
2> (compile-toplevel '(let ((a 1) (b 0) (c nil)) (if c (/ a b))))
#<sys:vm-desc: 9aef9f0>
* stdlib/compiler.tl (compiler): New slot top-form.
(compile-toplevel): Initialize the top-form slot of the
compiler. The optimizer uses this to issue a warning now.
Since the warning is based on analyzing generated code, we
cannot trace it to the code more precisely than to the top-level
form.
* stdlib/optimize.tl (basic-blocks): New slot, warned-insns.
List of instructions that have been warned about.
(basic-blocks do-peephole-block): Rearrange the constant folding
case so that as part of the pattern match condition, we include
the fact that the function will not throw when called with those
constant arguments. Only in that case do we do the optimization.
We warn in the case when the function call does throw.
A function rejected due to throwing could be processed through
this rule multiple times, under multiple peephole passes, so
for that reason we use the warned-insns list to suppress duplicate
warnings.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The optimizer eliminates calls to pure library functions when
all their arguments are D-registers. The call is made at
compiled time and its value is inserted into the program
as a constant (in a newly allocated D register).
The bug is that we can't do this for a D register that
is linked to a load-time value, because we don't know its
value until run-time.
* stdlib/optimize.tl (basic-blocks do-peephole-block): Add
a constraint that none of the D registers can be a member
of bb.lt-dregs, which holds the list of D registers that
are used for load-time values.
|
|
|
|
|
|
|
|
|
|
|
| |
Discovered while experimenting with new optimizations.
* stdlib/optimize.tl (basic-blocks join-block): When we
join the following block into the current block, we must
propagate the nojoin property of the following block.
The nojoin property has to do with the last instruction
being xend. The joined block has that last instruction
and so must be nojoin.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Discovered while experimenting with new optimizations.
* stdlib/optimize.tl (basic-blocks :postinit): Pass t argument
to new parameter of basic-blocks link-graph.
(basic-blocks link-graph): New parameter indicating whether
this is the first call; if false, we reset all the links.
(basic-blocks elim-dead-code): This no longer has to reset
the links before calling link-graph. But now calls link-graph
one more time after the dead code removal so that no dead
blocks appear in the graph.
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks :postinit): Calculate the
basic block partitions more directly using partition-if,
eliminating the calculation of two sequences of indices
that have to be merged and then passed to the partition
function.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When functions are optimized away due to constant folding,
instead of replacing them with a nil, we now compact the
table to close the gaps and renumber the references in the
code.
* stdlib/compiler.tl (compiler null-stab): Method removed.
(compiler compact-dregs): Renamed to compact-dregs-and-syms.
Now compacts the symbol table also. This is combined with
D-reg compacting because it makes just two passes through
the instruction: a pass to identify the used D registers
and symbol indices, and then another pass to edit the
instructions with the renamed D registers and renumbered
symbol indices.
(compiler optimize): Remove the call to the null-unused-data
on the basic-blocks object; nulling out D regs and symbol
table entries is no longer required. Fllow the rename of
compact-dregs to compact-dregs-and-syms which is called
the same way otherwise.
* stdlib/optimize.tl (basic-blocks null-unused-data):
No longer used method removed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We now have some constant folding in the optimizer too, not
just in the front end compiler pass. This is leaving behind
dead D registers that are not referenced in the code.
Let's compact the D register table to close the gap.
* stdlib/compiler.tl (compiler get-dreg): In this function
we no longer check that we have allocated too many D
registers. We let the counter blow past %lev-size%.
Because this creates the fighting chance that the compaction
of D regs will reduce their number to %lev-size% or less.
By doing this, we allow code to be compilable that otherwise
would not be: code that allocates too many D regs which
are then optimized away.
(compiler compact-dregs): New function. Does all the work.
(compiler optimize): Compact the D regs at optimization
level 5 or higher.
(compile-toplevel): Check for an overflowing D reg count
here, after optimization.
* stdlib/optimize.tl (basic-blocks null-unused-data):
Here, we no longer have to do anything with the D registers.
|
|
|
|
|
|
|
|
| |
* stdlib/compiler.tl (compiler get-dreg): Fix
indentation proble.
* stdlib/optimize.tl (basic-block fill-treg-compacting-map):
Likewise.
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks do-peephole-block): Extend
constant-folding case to recognize gapply as well as gcall.
We just have to take care in how we apply apply arguments
to the actual function to get the value.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Let's consider the DIM expression [a . b c]. Without
this change we get:
syms:
0: a
1: b
2: c
code:
0: 98020000 getf t2 0
1: 98030001 getf t3 1
2: 98040002 getf t4 2
3: 1C020002 apply t2 t2 t3 t4
4: 00030002
5: 00000004
6: 10000002 end t2
With this change:
syms:
0: a
1: b
2: c
code:
0: 98030001 getf t3 1
1: 98040002 getf t4 2
2: 24020002 gapply t2 0 t3 t4
3: 00030000
4: 00000004
5: 10000002 end t2
There are 17 hits for this optimization in optimize.tl
alone!
* stdlib/optimize.tl (basic-blocks do-peephole-block):
New pattern here. We recognize an instruction sequence
which begins with a (getf treg idx) and ends in
an (apply dest treg ...), without any instructions in
between accessing or redefining treg. Additionally,
the treg value must not be used after the apply,
unless the apply redefines it. In that case, we
rewrite this pattern to eliminate that initial getf
instruction, and substitute (gapply dest idsx ..)
for the apply.
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks do-peephole-block): The
constant folding case should fire even if some of the
arguments of the call aren't D registers but T0.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* stdlib/compiler.tl (%effect-free-funs%, %effect-free%,
%functional-funs%, %functional%): Move variables
into stdlib/constfun.tl
* stdlib/constfun.tl %effect-free-funs%, %effect-free%,
%functional-funs%, %functional%): Moved here.
* stdlib/optimize.tl: Use load-for to express dependency
on constfun module; don't depend on the compiler having
loaded it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The compiler handles trivial constant folding over the
source code, as a source to source transformation.
However, there are more opportunities for constant folding
after data flow optimizations of the VM code.
Early constant folding will not fold, for instance,
(let ((a 2) (b 3)) (* a b))
but we can reduce this to an end instruction that returns
the value of a D register that holds 6. Data flow optimizations
will propagate the D registers for 2 and 3 into the gcall
instruction. We can then recognize that we have a gcall with
nothing but D register operands, calling a constant-foldable
function. We can allocate a new D register to hold the result
of that calculation and just move that D register's value
into the target register of the original gcall.
* stdlib/compiler.tl (compiler get-dreg): When allocating
a new D reg, we must invalidate the datavec slot which is
calculated from the data hash. This didn't matter before,
because until now, get-datavec was called after compilation,
at which point no new D regs will exist. That is changing;
the optimizer can allocate D regs.
(compiler null-dregs, compiler null-stab): New methods.
(compiler optimize): Pass self to constructor for basic-blocks.
basic-blocks now references back to the compiler.
At optimization level 5 or higher, constant folding can
now happen, so we call the new method in the optimizer to
null the unused data. This overwrites unused D registers
and unused parts of the symbol vector with nil.
* stdlib/optimize (basic-blocks): Boa constructor now takes
a new leftmost param, the compiler.
(basic-blocks do-peephole-block): New optimization case:
gcall instruction invoking const-foldable function, with
all arguments being dregs.
(basic-blocks null-unused-data): New method.
|
|
|
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks late-peephole):
The test whether lab2 is used is bogus, and will
never be true. The correct test is simply whether
the block has two or more rlinks. This makes no
difference in the standard library images. When
the bug appears, the manifestation would be that
a needed label is deleted, resulting in an exception
from the assembler.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
2022-09-13 commit 6e354e1c2d5d64d18f527d52db75e344a9223d95,
subject "compiler: bugfixes in dead code elimination",
introduced a problem. By allowing the closure body blocks to
be included in the links of the previous basic block that ends
in the close instruction, it caused liveness info to flow out
out of close blocks into the close instruction, which is
wrong. Thus registers used inside a closure, which are
entirely private, wrongly appear live outside of the closure,
interfering with optimizations like eliminating dead
registers.
We can't simply roll back the commit because the bug it
fixes will reappear. The fix is to pair the next field
with a prev field, and maintain them; don't rely on
the rlinks to point to the previous block.
* stdlib/optimize.tl (basic-block): New slot, prev.
(back-block join-block): As we delete the next block,
we must update that block's next block's prev link.
(basic-blocks link-graph): Build the prev links.
Fix the bug in handling the close instruction:
do not list the close body code among the links,
only the branch target of the close.
(basic-blocks do-peephole-block): In a few cases in
which we set the bl.next to nil, we also set the
bl.next.prev to nil, if bl.next exists.
(basic-blocks elim-dead-clode): Reset the bl.prev
of every block also.
(basic-block check-bypass-empty): Here, we no longer
depend on rlinks containing the previous block;
the prev gives it to us. So we move that fixup out
of the link, and also fix up the next blocks prev
pointer.
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks local-liveness): Just
store the mask of defined registers into each live-info.
Do not propagate the defined mask from the next instruction
backwards. The way the defined mask is used in calc-liveness,
this makes no difference, and is simpler and faster.
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks rename): When we stop
the renaming due to an end instruction and the src
being a v-reg, we can still do the rename in that end
instruction itself. If the v-reg becomes invalid, that
doesn't happen until after the instruction.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (subst-preserve): Rename list param
to insn for clarity.
(careful-subst-preserve): New function. This is like
subst-preserve, but used only for instructions that
have destination registers. It performs a rewrite
such that those destination positions are avoided.
(basic-blocks rename): When the instruction has src
or dst as a target, don't just stop before that
insn. Do the substitution in the source operands using
careful-subst-preserve.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks do-peephole-block):
Remove the local function only-locally-used-treg.
This is unnecessary because the optimization is valid
even if the treg is used in downstream basic blocks.
It was necessary previously in the old version of
this optimization in which we deleted the first
instruction which sets the treg's value. We are now
depending on it being identified as a dead register.
Also, moving the rule to the end. The reason is
that there are cases when the pattern matches, but
it returns insns. That causes the rewrite macro to
march down to the next instruction, skipping other
patterns. This could be bad, unless the pattern is the
last one tried before the @else fallback.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change is now possible due to the previous bugfix.
* stdlib/optimize.tl (basic-blocks rename): If
the source register is a v-reg, do not allow
the propagation past an end instruction. This
is a precaution because the end instruction
could be the end of the frame in which the
v-register is valid; we don't want to propagate
it outside of that frame.
|
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks rename): When
we encounter a close instruction, we must leave
it alone. The registers named in the argument area
of the instruction do not belong to the current
instruction stream or basic block; they belong to
the function body.
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks local-liveness): Handle all
instructions explicitly with no catch-all behavior. Make a
copy of the live-info even for instructions that have no
source or destination operands, so that they don't mistakenly
marked as having defs or refs.
|
|
|
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (live-info): Slot def replaced by def0
and def1.
(basic-blocks local-liveness): The local function def becomes
defs: it can take two defs. These become def0 and def1. In the
catch instruction case, we use both arguments, capture the
resulting live-info and use it to call refs.
(basic-blocks rename): Check whether either def0 or def1 is
the source or destination.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks local-liveness): When
processing a pure def, we don't copy the live-info
unconditionally, which is waseteful since if the destination
register is a t-reg, we will invoke (new live-info) to
make yet another live info. Instead, let's destructively
mutate the incoming live info from the instruction below,
and return a copy that is made before that is done.
In the def-ref case, the local copy is entirely superfluous,
because in all cases we return a new object.
We also eliminate redundant (set [bb.li-hash insn] li)
evaluations.
|
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks local-liveness):
The exception symbol and argument registers in the
catch instruction are clobbers, not references.
We must treat them as defs. Unfortunately, the
instruction has two clobbers but live-info has
only one def slot, which should be fixed.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* optimize.tl (rename): Instead of a mapping operation,
we perform the substitution only until we hit an
instruction that defines either the src or dst register.
(basic-blocks do-peephole-block): Drop the conditions
for doing the rename: that neither register can be
defined somewhere in the rest of the block. This
restriction is too limiting. We have to be careful now;
we cannot delete the first instruction, and must only
set the recalc flag and add to the rescan list if the
substitution did something, to avoid looping.
|
|
|
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks do-peephole-block): In the
unnecessary copying t-reg case, let's just stay away from
doing it if the source operand is a v-reg. It breaks under the
recent "eval order of variables" commit, indicating that the
conditions that it uses for replacing a v-reg with the t-reg
are not correct. The most likely reason is that the v-reg
can be assigned, but this doesn't show up in the liveness
info which tracks only t-regs.
|
|
|
|
|
|
| |
* stdlib/optimizer.tl (basic-blocks do-peephole-block): Use
pushnew instead of push in one peephole case, so the block
isn't pushed onto the tryjoin and rescan lists twice.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks num-blocks): New
method.
* stdlib/compiler.tl (compiler optimize): At optimization
level 6, instead of performing one extra pass of
jump threading, dead-code elimintation and peephole
optimizations, keep iterating on these until the number
of basic blocks stays the same.
* txr.1: Documented.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks peephole-block): Drop the
code argument, and operate on bl.insns, which is stored
back. Perform the renames in the rename list after the
peephole pass.
(basic-blocks rename): New method.
(basic-blocks do-peephole-block): Implementation of
peephole-block, under a new name. The local function called
rename is removed; calls to it go to the new rename method.
(basic-blocks peephole): Simplify code around calls to
peephole-block; we no longer have to pass bl.insns to it,
capture the return value and store it back into bl.insns.
* stdlib/compiler.tl (*opt-level*): Initial
value changes from 6 to 7.
(compiler optimize): At optimization level 6,
we now do another jump threading pass, and
peephole, like at levels 4 and 5. The peephole
optimizations at level 5 make it possible
to coalesce some basic blocks in some cases,
and that opens up the possibility for more
reductions. The previously level 6 optimizations
are moved to level 7.
* txr.1: Updated documentation of optimization levels,
and default value of *opt-level*.
* stdlib/doc-syms.tl: Updated.
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks peephole-block): Move local
rename function into main labels block, so other optimizations
will be able to use it. Remove an unused argument, and change
the recursion to a mapcar, since that's what it's doing.
|
|
|
|
|
|
|
|
|
|
|
| |
* stdlib/compiler.tl (compiler (comp-atom, comp-dwim),
safe-const-reduce, igno-notfound): Use ignore
rather than use for marking unused variable.
* stdlib/copy-file.tl (copy-files, copy-path-rec,
remove-path-rec, chmod-rec, chown-rec): Likewise.
* stdlib/optimize.tl (basic-block print): Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* stdlib/optimizer.tl (basic-block print): Suppress
warning for pretty-p parameter using the use function.
(basic-blocks (local-liveness, calc-liveness,
thread-jumps-block, peephole-block, late-peephole,
fill-treg-compacting-map), (basic-block
apply-treg-compacting-map), dedup-labels): Fix
unused variables in pattern, mostly by replacing
them by @nil.
(basic-blocks check-bypass-empty): Method moved,
turned into (basic-block check-bypass-empty),
losing the unused basic-blocks parameter.
(basic-blocks elim-next-jump): Likewise moved
into basic-block class.
(basic-blocks elim-dead-code): Calls to check-bypass-empty
and elim-next-jump adjusted.
|
|
|
|
|
|
|
|
|
|
|
| |
This was uncovered by unused variable warnings.
* stdlib/optimize.tl (basic-blocks peephole-block):
The pattern trying to detect wasteful register moves
is incorrect; the reg1 term is intended to be the
@reg1 variable. It is matched literally and so will
not match because the symbol reg1 does not literally
occur in the VM code.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h,
args.c, args.h, arith.c, arith.h, autoload.c, autoload.h,
buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h,
chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure,
debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c,
filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c,
gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S,
lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c,
linenoise/linenoise.h, match.c, match.h, parser.c, parser.h,
parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h,
regex.c, regex.h, signal.c, signal.h, socket.c, socket.h,
stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl,
stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl,
stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl,
stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl,
stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl,
stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl,
stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl,
stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl,
stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl,
stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl,
stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl,
stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl,
stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl,
stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl,
stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h,
struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h,
syslog.c, syslog.h, termios.c, termios.h, time.c, time.h,
tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr,
y.tab.c.shipped: Copyright year bumped to 2023.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bad situation reproduced as a miscompilation of some prof
forms at *opt-level* 5 or above.
The basic idea is that there is a situation like this
prof t2
... profiled code here producing value in t8
mov t2 t8
end t2
end t2
The code block produces a value in t8, which is copied into
t2, and executes the end instruction. This instruction does not
fall through to the next one but passes control back to the
prof instruction. The prof instruction then stores the result
value, which came from t2, back into the t2 register and
resumes the program at the end t2.
The first bad thing that happens is that the end instructions
get merged together into one basic block. The optimizer then
treats them without regard for the prof instruction, as if
they were a linear sequence. It looks like the register move
mov t2 t8
is wasteful and so it eliminates it, rewriting the end instruction
to:
end t8
end t8
Of course, the second instruction is now wrong because prof is
still producing the result in t2.
To fix this without changing the instruction set, I'm introducing
another pseudo-op that represents end, called xend. This is
similar to jend, except that jend is regarded as an unconditional
branch whereas xend isn't. The special thing about xend is
that a basic block in which it occcurs is marked as non-joinable.
It will not be joined with the following basic block.
* stdlib/asm.tl (xend): New alias opcode for end.
* stdlib/compiler.tl (comp-prof): Use xend to end prof fragment,
rather than plain end.
* stdlib/optimize.tl (basic-block): New slot, nojoin.
If true, block cannot be joined with next one.
(basic-blocks jump-ops): Add xend to list of jump ops,
so that a basic block will terminate on xend.
(basic-blocks link-graph): Set the nojoin flag on a
basic block which contains (and thus ends with) xend.
(basic-blocks local-liveness): Add xend to the case
in def-ref that handles end.
(basic-blocks (peephole, join-blocks)): Refuse to join
blocks marked nojoin.
* tests/019/comp-bugs.tl: New file with miscompiled
test case that was returning 42 instead of (42 0 0 0)
as a result of the wrong register's value being returned.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* stdlib/optimize (basic-blocks ling-graph): I'm reverting
an old design decision here. The decision is this:
the basic block of a close instruction points to the
first basic block of the closure as its next block,
but that next block does not point back: it doesn't list
the close instruction's basic block among the rlinks.
The idea was that the close instruction doesn't jump
to that block, and so it shouldn't be linked to it.
However, the next link was set purely so that the graph
is connected. Unfortunately, the inconsistency in the
graph structure which this causes is a problem in the
elim-dead-code method. A situation arises when that
first basic block after the close is removed. Because
pit has an empty rlinks list, the block remains listed
as the next block of the close block, even though it is
removed from the master list of blocks.
(basic-blocks check-bypass-empty): Fix one forgotten detail
in this function: the block being deleted must be removed
from the rlinks list of the next block.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following test case throws an exception:
(compile-toplevel '(when-match @(or
@b @c)
nil nil))
the reason is that in the optimizer, the local-liveness method
resets the bl.live set of live registers to nil. The expectation
is that it will be recalculated. However, what happens is that
* stdlib/optimize.tl (basic-block): Initialize live slot to zero.
(basic-blocks local-liveness): Reset bl.live to 0, rather than nil.
The problem is that sometimes the block in question is not reached
in the graph traversal (down in the same function) which is
supposed to assign it a live value. This happens in the above
test case, and it's not due to any bug: the block is not reached
by the forward traversal because the block has become unreachable.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The root slot is just supposed to be (car bl.list): the
first basic block. Unfortunately, it's not maintained
when bl.list is edited, which could cause bugs.
This change makes no difference in the stdlib compiled files,
other than of course the changes in optimize.tlo from this
code change itself; so that is evidence suggesting
this change is least not making anything worse.
* stdlib/optimize.tl (basic-blocks): Remove root slot.
(basic-blocks (link-graph, calc-liveness, elim-dead-code):
Refer to (car bl.list) instead of bl.root.
|
|
|
|
|
|
|
|
|
|
| |
* optimize.tl (basic-blocks peephole-block): Check for the
reversed arguments case of (ifq (d x) (t 0)), and
also match ifq. Add a case for two different d registers
being compared by ifq or ifql which are not both implicated as
load-time regs; that also converts to an unconditional jmp
to the else label. Add a case for a register being compared
with itself with ifq or ifql, which disappears.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The motivating situation is this:
(lambda (: (opt :)) opt)
When the default value of an optional parameter is :
then the net effect is that there is no optional substituion.
The optional argument is signaled by the : symbol, and that
same symbol then replaces the value.
This is not optimized well:
data:
0: :
1: t
syms:
code:
0: 8C000009 close t2 0 3 9 1 0 nil t2
1: 00000002
2: 00000001
3: 00000003
4: 00000002
5: 3C000008 ifq t2 d0 8
6: 00020400
7: 2C020400 movsr t2 d0
8: 10000002 end t2
9: 10000002 end t2
instruction count:
5
entry point:
4
The instruction sequence
5: 3C000008 ifq t2 d0 8
6: 00020400
7: 2C020400 movsr t2 d0
8:
serves no purpose; it's like:
(if (eq x y)
(set x y))
With this commit it looks like:
data:
0: :
1: t
syms:
code:
0: 8C000006 close t2 0 3 6 1 0 nil t2
1: 00000002
2: 00000001
3: 00000003
4: 00000002
5: 10000002 end t2
6: 10000002 end t2
instruction count:
3
entry point:
4
* stdlib/optimize.tl (basic-blocks peephole-block): Here, we
add an optimization for the useless assignment pattern.
If an "ifq tx dy label" instruction falls through to a
"mov tx dy", then we remove that move instruction from
the next block. But only if that next block has nothing
else jumping to it! If there are other jumps there, they could
be relying on that "mov tx dy", so it cannot be removed.
(basic-blocks elim-next-jump): The above optimization may
leave us with a useless ifq instruction, which jumps to
the same destination whether the comparison is true or not.
In elim-next-jmp, we took care only of jmp instructions which
uselessly jump to the next block in instruction order.
We fix this to also eliminate if and ifq instructions whose
destination label is the next block; they are equivalent to
an unconditional jump.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
*LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h,
args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c,
cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h,
combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h,
ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h,
glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S,
lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c,
linenoise/linenoise.h, lisplib.c, lisplib.h, match.c, match.h,
parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h,
rand.c, rand.h, regex.c, regex.h, signal.c, signal.h,
socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl,
stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl,
stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl,
stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl,
stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl,
stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl,
stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl,
stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl,
stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl,
stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl,
stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl,
stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl,
stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl,
stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl,
stdlib/with-resources.tl, stdlib/with-stream.tl,
stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h,
strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h,
termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1,
txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h,
vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year
bumped to 2022.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* eval.c (eval_init): Register new intrinsics.
* lib.c, lib.h (subq, subql, subqual, subst): New functions.
* tests/012/seq.tl: New test cases.
* stdlib/optimize.tl (subst): Function removed. The new subst
drop-in replaces this one.
* txr.1: Documented.
* stdlib/doc-syms.tl: Updated.
|
|
|
|
|
|
|
|
|
| |
* stdlib/optimize.tl (basic-blocks late-peephole): In one
pattern, an instruction that is recognized as (jend ...)
is inadvertently rewritten to (end ...). Since this is the
last optimization stage, currently, and end and jend are
synonyms for the same opcode, it doesn't matter.
But it could turn into a bug; let's fix it.
|
|
|
|
|
|
| |
* stdlib/optimize (basic-block print): Print the label of the
next block, rather than the block itself. This reduces the
verbosity during debugging.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This work addresses the following issues, which cause compiled
functions to use more stack space than necessary.
Firstly, the compiler doesn't allocate registers tightly.
Every closure's registers can start at t2, but this isn't
done. Secondly, data flow optimizations eliminate registers,
leaving gaps in the register allocation. Code ends up strange:
you may see register t63 used in a function where the next
highest register below that is t39: evidence that a large
number of temporary variables got mapped to registers and
eliminated.
In this change, an optimization is introduced, active at
*opt-level* 6 which compacts the t registers for every
closure: every closure's registers are renumbered starting
from t2. Then, each closure' generating close instruction
is also updated to indicate the accurate number of registers
ensuring no space is wasted on the stack when the closure
is prepared for execution.
* stdlib/compiler.tl (compiler optimize): At optimization
level 6, insert a call to basic-blocks compact-tregs
just before the instruction are pulled out and put through
the late peephole pass.
* stdlib/optimize.tl (basic-block): New slot, closer.
This is pronounced "clozer", as in one who closes.
For any basic block that is the head of a closure (the entry
point ito the closure code), this slot is set to point
to the previous block: the one which ends in the close
instruction which creates this closure: the closer. This is
important because the close instruction can use t registers
for arguments, and those registers belong to the closure.
Those argument registers must be included in the renaming.
(basic-blocks): New slots closures and cl-hash. The former
lists the closure head basic blocks; all the basic blocks
which are the entry blocks of a closure. The cl-hash
associates each head block with a list of all the blocks
(including the head block).
(basic-blocks identify-closures): New method. This scans the
list of blocks, identifying the closure heads, and associating
them with their closers, to establish the bb.closure list.
Then for each closure head in the list, the graph is searched
to find all the blocks of a closure, and these lists are put
into the bb.cl-hash.
(basic-block fill-treg-compacting-map): This method scans
a basic block, ferreting out all of it t registers, and
adds renaming entries for them into a hash that is passed in.
If the block is the head of a closure, then the close
instruction of the block's closer is also scanned for t
registers: but only the arguments, not the destination
register.
(basic-block apply-treg-compacting-map): This method
renames the t register of a block using the renaming map.
It follows the registers in the same way as
fill-treg-compacting-map, and consquently also goes into
the close instruction to rename the argument registers, as
necessary. When tweaking the close instruction, it also
updates the instruction's number-of-tregs field with the
newly calculated number, which comes from the map size.
(basic-blocks compact-tregs): This method ties it together,
using the above methods to identify the basic blocks belonging
to closures, build register renaming maps for them, and hen
apply the renaming.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is related to the pattern in the previous commit. When we have a
situation like this:
lab1
mov tn nil
lab2
ifq tn nil lab4
lab3
gcall tn ...
We know that if lab1 is entered, then lab2 will necessarily
fall through: the lab4 branch is not taken because tn is nil.
But then, tn is clobbered immediately in lab3 by the gcall tn.
In other words, the value stored into tn by lab1 is never used.
Therefore, we can remove the "mov tn nil" instruction and
move the l1 label.
lab2
ifq tn nil lab4
lab1
lab3
gcall tn ...
There are 74 hits for this pattern in stdlib.
* stdlib/optimize.tl (basic-blocks late-peephole): Implement the
above pattern.
|