author | Kaz Kylheku <kaz@kylheku.com> | 2018-11-05 07:14:26 -0800 |
---|---|---|
committer | Kaz Kylheku <kaz@kylheku.com> | 2018-11-05 07:14:26 -0800 |
commit | 8bfe276d821794a5f77ddabdc5187291676bdde4 (patch) | |
tree | 6f4a7381a59eb92c317329aff93bd30857d39f05 | |
parent | 3fd22aab6266602ef225b3a407f8838281f0914d (diff) | |
compiler: bugfix: handle defpackage and such properly.

The problem is that the file compiler is emitting one big form
that contains all of the compiled top-level forms. That form is
read in its entirety before any of it is evaluated, so this
doesn't work when it contains symbols that are in a package
which is defined by one of those forms; the compiled file will
not load due to qualified symbols referencing a nonexistent
package.

The solution is to break up that big form when it contains
forms that manipulate the package system in ways that
could affect how subsequent forms are read.

* lib.c (delete_package): Use a non-destructive deletion on
the *package-alist*, because we are going to be referring
to this variable in the compiler to detect whether the list
of packages has changed.
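
Why the kind of deletion matters can be modeled outside of TXR. The sketch below assumes a hypothetical singly linked list in C; `struct node`, `push`, `remove1_destructive` and `remove1_copying` are illustrative names and are not the functions in lib.c. It shows that a pointer-identity check on the list head only notices a removal reliably when the removal builds a new list.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal association list; not TXR's cons cells. */
struct node {
    const char *name;
    struct node *next;
};

/* Allocates a new node in front of `next`. */
static struct node *push(const char *name, struct node *next)
{
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->name = name;
    n->next = next;
    return n;
}

/* Destructive removal: unlinks the first match in place. The returned
   head is the same pointer as before unless the head itself matched. */
static struct node *remove1_destructive(struct node *list, const char *name)
{
    struct node **link = &list;
    while (*link != NULL) {
        if (strcmp((*link)->name, name) == 0) {
            struct node *dead = *link;
            *link = dead->next;
            free(dead);
            break;
        }
        link = &(*link)->next;
    }
    return list;
}

/* Non-destructive removal: copies the nodes before the first match and
   shares the tail after it. The original list is left untouched. If
   nothing matches, the copies are discarded and the original returned. */
static struct node *remove1_copying(struct node *list, const char *name)
{
    struct node *head = NULL, **tail = &head;
    for (struct node *it = list; it != NULL; it = it->next) {
        if (strcmp(it->name, name) == 0) {
            *tail = it->next;   /* share everything after the match */
            return head;
        }
        *tail = push(it->name, NULL);
        tail = &(*tail)->next;
    }
    while (head != NULL) {      /* no match: free the copied prefix */
        struct node *n = head;
        head = head->next;
        free(n);
    }
    return list;
}
```

With a three-entry list, removing the middle entry destructively hands back the same head pointer, so a check of the form `before != after` reports no change; the copying version returns a different head whenever an entry was actually removed.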
* share/txr/stdlib/compiler.tl (%package-manip%): New global
variable. This is a list of functions that manipulate the
package system in suspicious ways.
(user:compile-file): When compiling a form which is a call to
any of the suspicious functions, add a :fence symbol into
the compiled form list. Also do this if the evaluation of the
compiled form modifies the *package-alist* variable.
When emitting the list of forms into the output file, remove
the :fence symbols and break it up into multiple lists
along these fence boundaries.
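
The fence mechanism can be pictured as splitting a flat sequence on a marker element. The following is a minimal sketch in C, assuming the compiled forms are stood in for by strings and the marker is the literal string ":fence". `struct group` and `split_on_fence` are hypothetical names, and skipping empty runs is a choice made for this sketch, not necessarily what `split*` does in compiler.tl.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One contiguous run of forms between two fences. */
struct group {
    size_t start;   /* index of the first form in the run */
    size_t len;     /* number of forms in the run */
};

/* Splits items[0..n) at every ":fence" marker and drops the markers.
   Stores at most `max` non-empty runs in `out` and returns how many
   runs were stored. Each run would be written out as its own list. */
static size_t split_on_fence(const char *const *items, size_t n,
                             struct group *out, size_t max)
{
    size_t count = 0, start = 0;

    assert(items != NULL || n == 0);
    assert(out != NULL || max == 0);

    for (size_t i = 0; i <= n; i++) {
        /* The end of the input closes the last run, like a fence. */
        if (i == n || strcmp(items[i], ":fence") == 0) {
            if (i > start && count < max) {
                out[count].start = start;
                out[count].len = i - start;
                count++;
            }
            start = i + 1;
        }
    }
    return count;
}
```

For the input `a b :fence c d`, the function reports two runs, `a b` and `c d`. Emitting each run as a separate object means the forms of the first run have been loaded before the second run is read.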
* txr.1: Documented the degenerate situation that can arise.
-rw-r--r-- | lib.c | 2 |
-rw-r--r-- | share/txr/stdlib/compiler.tl | 25 |
-rw-r--r-- | txr.1 | 84 |
3 files changed, 104 insertions, 7 deletions
@@ -5226,7 +5226,7 @@ val delete_package(val package_in) val package = get_package(lit("delete-package"), package_in, nil); val iter; loc cpll = cur_package_alist_loc; - set(cpll, alist_nremove1(deref(cpll), package->pk.name)); + set(cpll, alist_remove1(deref(cpll), package->pk.name)); for (iter = deref(cpll); iter; iter = cdr(iter)) unuse_package(package, cdar(iter)); return nil; diff --git a/share/txr/stdlib/compiler.tl b/share/txr/stdlib/compiler.tl index 90eb2d41..51b0becd 100644 --- a/share/txr/stdlib/compiler.tl +++ b/share/txr/stdlib/compiler.tl @@ -1576,6 +1576,12 @@ (defvarl %tlo-ver% ^(4 0 ,%big-endian%)) +(defvarl %package-manip% '(make-package delete-package + use-package unuse-package + set-package-fallback-list + intern unintern rehome-sym + use-sym unuse-sym)) + (defun open-compile-streams (in-path out-path) (let* ((rsuff (r$ %file-suff-rx% in-path)) (suff (if rsuff [in-path rsuff])) @@ -1649,19 +1655,26 @@ (t (when (and (or *eval* *emit*) (not (constantp form))) (let* ((vm-desc (compile-toplevel form)) - (flat-vd (list-from-vm-desc vm-desc))) + (flat-vd (list-from-vm-desc vm-desc)) + (fence (member (car form) %package-manip%))) (when *eval* - (sys:vm-execute-toplevel vm-desc)) + (let ((pa *package-alist*)) + (sys:vm-execute-toplevel vm-desc) + (when (or (neq pa *package-alist*)) + (set fence t)))) (when *emit* - out.(add flat-vd)))))))))) + out.(add flat-vd) + (when fence + out.(add :fence))))))))))) (prinl %tlo-ver% out-stream) (unwind-protect (whilet ((obj (read in-stream *stderr* err-ret)) ((neq obj err-ret))) (compile-form obj)) - (let ((*print-circle* t) - (*package* (sys:make-anon-package))) - (prinl out.(get) out-stream) + (let* ((*print-circle* t) + (*package* (sys:make-anon-package)) + (out-forms (split* out.(get) (op where (op eq :fence))))) + [mapdo (op prinl @1 out-stream) out-forms] (delete-package *package*))) (let ((parser (sys:get-parser in-stream))) @@ -62889,6 +62889,8 @@ operators can be used to deliberately produce code which 
behaves differently when compiled and interpreted. In addition, unwanted differences in behavior can also occur. The situations are summarized below. +.coNP Differences due to @ load-time + Forms evaluated by .code load-time are treated differently by the compiler. When a top-level form is compiled, @@ -62900,6 +62902,8 @@ The interpreter doesn't perform this factoring; it evaluates a .code load-time form when it encounters it for the first time. +.coNP Treatment of unbound variables + Unbound variables are treated differently by the compiler. A reference to an unbound variable is treated as a global lexical access. This means that if a variable access is compiled first and then a @@ -62913,6 +62917,8 @@ The compiler treats a variable as dynamic if a .code defvar has been processed which marked that variable as special. +.coNP Unbound symbols in @ dwim + Arguments of a .code dwim form (or the equivalent bracket notation) which are unbound @@ -62950,6 +62956,84 @@ treated as a function. If it has both bindings, it is treated as a variable. The difference is that this is resolved at compile time for compiled code, and at evaluation time for interpreted code. +.coNP File-wide insertion of gensyms + +The following degenerate situation occurs, illustrated by example. Suppose the +following definitions are given: + +.cblk + (defvarl %gensym%) + + (defmacro define-secret-fun ((. args) . body) + (set %gensym% (gensym)) + ^(defun ,%gensym% (,*args) ,*body)) + + (defmacro call-secret-fun (. args) + ^(,%gensym% ,*args)) +.cble + +The idea is to be able to define a function whose name is an uninterned +symbol and then call it. An example module might use these definitions as +follows: + +.cblk + (define-secret-fun (a) (put-line `a is @a`)) + + (call-secret-fun 42) +.cble + +The effect is that the second top-level form calls the function, which +prints 42 to standard out. This works both interpreted and compiled with +.codn compile-file . 
+Each of these two macro calls generates a top-level form into which +the same gensym is inserted. This works under file compilation due to a +deliberate strategy in the layout of compiled files, which allows such +uses. Namely, the file compiler combines multiple top-level forms are combined +into a single object, which is read at once, and which uses the circle +notation to unify gensym references. + +However, suppose the following change is introduced: + +.cblk + (define-secret-fun (a) (put-line `a is @a`)) + + (defpackage foo) ;; newly inserted form + + (call-secret-fun 42) +.cble + +This still works interpreted, and appears to compiles. However, when the +compiled file is loaded, the compiled version of the +.code call-secret-fun +form fails with an error complaining that the +.code #:g0039 +(or other gensym name) function is not defined. + +This is because for this modified source file, the file compiler is not +able to combine the compiled forms into a single object. It would not +be correct to do so in the presence of the +.code defpackage +form, because the evaluation of that form affects the subsequent interpretation +of symbols. After the package definition is executed, it is possible for +a subsequent top-level form to refer to a symbol in the +.code foo +package such as +.code foo:bar +to occur, which would be erroneous if the package didn't exist. + +The file compiler therefore arranges for the compiled forms after the +.code defpackage +to be emitted into a separate object. But that division in the output file +consequently prevents the occurrences of the gensym to resolve to the same +symbol object. + +In other words, the strategy for allowing global gensym use is in conflict +with support for forms which have a necessary read-time effect such as +.codn defpackage . + +The solution is to rearrange the file to unravel the interference, or +to use interned symbols instead of gensyms. + .SS* Compilation Library .coNP Function @ compile-toplevel |