diff options
author | Kaz Kylheku <kaz@kylheku.com> | 2021-03-02 22:18:13 -0800 |
---|---|---|
committer | Kaz Kylheku <kaz@kylheku.com> | 2021-03-02 22:18:13 -0800 |
commit | 25887472e9d60d256a0420de93f8ca97209c4367 (patch) | |
tree | 5c4060b29afa6fd0898ef896950c597cb8e1f58b | |
parent | 62e18de0a5c3572890bc73ea834b72b6851e3de1 (diff) | |
download | txr-25887472e9d60d256a0420de93f8ca97209c4367.tar.gz txr-25887472e9d60d256a0420de93f8ca97209c4367.tar.bz2 txr-25887472e9d60d256a0420de93f8ca97209c4367.zip |
compiler: optimize annoying instruction sequence.
There is a specific instruction sequence that often occurs
when pattern matching code is compiled. It is not easy
to attack by the existing optimization framework. Once the
code is divided into basic blocks, it spans multiple blocks.
It's not a good fit for either jump threading or
intra-basic-block peephhole reductions.
It is exemplified by the following disassembly snippet, where
the pattern is shown in the box:
5: 3C00000C ifq v00000 d1 12 --------+
6: 08000401 |
7: 20020010 gcall t16 0 v00001 d2 |
8: 08010000 |
9: 00000402 |
+---------------------------+ |
|10: 2C060403 movsr t6 d3 | |
|11: 3400000D jmp 13 | |
|12: 2C060000 movsr t6 nil | <--------+
|13: 3C000028 ifq t6 nil 40 |
+---------------------------+
14: 00060000
15: 3C000019 ifq v00001 d1 25
The d3 register holds t. Of course, it's always a different d
register in the various occurrences of this pattern.
The register t6 also varies, but it's the same register in
all three places in the pattern.
The purpose of this "kernel" of four instructions is:
- if it is entered from the top, set the the t6 register to t
and jump to 40.
- if it is entered at instruction 12, set the t6 register to
nil, and fall through to the next block.
This can be instead expressed by to three instructions:
+---------------------------+ |
|(mov t6 d3) | |
|(jmp 40) | |
|(mov t6 nil) | <--------+
+---------------------------+
This revised code then more conductive to further
optimizations in ways that the original is not. In some
cases, like the Ackermann test case, both moves into t6 will
disappear entirely.
* share/txr/stdlib/optimize.tl (basic-blocks :postinit): Pass
instructions through early-peephole.
(early-peephole): New function, where we can look for and
rewrite instruction patterns over the entire block of code,
before any division into basic blocks. Here, we look for the
four-instruction kernel and rewrite it.
-rw-r--r-- | share/txr/stdlib/optimize.tl | 19 |
1 files changed, 18 insertions, 1 deletions
diff --git a/share/txr/stdlib/optimize.tl b/share/txr/stdlib/optimize.tl index 73aa6dc6..99faad8a 100644 --- a/share/txr/stdlib/optimize.tl +++ b/share/txr/stdlib/optimize.tl @@ -51,7 +51,7 @@ uwprot catch block jend)) (:postinit (bb) - (let* ((insns (dedup-labels (cons bb.start bb.insns))) + (let* ((insns (early-peephole (dedup-labels (cons bb.start bb.insns)))) (cuts (merge [where symbolp insns] [where [andf consp (op memq (car @1) bb.jump-ops)] @@ -474,3 +474,20 @@ (cons label0 rest)) (@else tail)) insns) + +(defun early-peephole (code) + (rewrite-case insns code + (((mov (t @t1) (d @d1)) + (jmp @lab2) + @(symbolp @lab1) + (mov (t @t1) (t 0)) + @lab2 + (ifq (t @t1) (t 0) @lab3) + . @rest) + ^((mov (t ,t1) (d ,d1)) + (jmp ,lab3) + ,lab1 + (mov (t ,t1) (t 0)) + ,lab2 + ,*rest)) + (@else else))) |