author: Kaz Kylheku <kaz@kylheku.com> 2018-11-05 07:14:26 -0800
committer: Kaz Kylheku <kaz@kylheku.com> 2018-11-05 07:14:26 -0800
commit: 8bfe276d821794a5f77ddabdc5187291676bdde4
tree: 6f4a7381a59eb92c317329aff93bd30857d39f05
parent: 3fd22aab6266602ef225b3a407f8838281f0914d
compiler: bugfix: handle defpackage and such properly.

The problem is that the file compiler is emitting one big form that contains all of the compiled top-level forms. For obvious reasons, this doesn't work when that form contains symbols that are in a package which is defined by one of those forms; the compiled file will not load due to qualified symbols referencing a nonexistent package.

The solution is to break up that big form when it contains forms that manipulate the package system in ways that possibly affect the read time of subsequent forms.

* lib.c (delete_package): Use a non-destructive deletion on the *package-alist*, because we are going to be referring to this variable in the compiler to detect whether the list of packages has changed.

* share/txr/stdlib/compiler.tl (%package-manip%): New global variable. This is a list of functions that manipulate the package system in suspicious ways.
(user:compile-file): When compiling a form which is a call to any of the suspicious functions, add a :fence symbol into the compiled form list. Also do this if the evaluation of the compiled form modifies the *package-alist* variable. When emitting the list of forms into the output file, remove the :fence symbols and break it up into multiple lists along these fence boundaries.

* txr.1: Documented the degenerate situation that can arise.

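
Why the lib.c change matters can be shown with a small model. The compiler hunk below compares the old and new value of *package-alist* with neq, an identity test. The following Python sketch is only an analogy, not TXR code: Python lists stand in for the alist, and the helper names remove_destructive and remove_fresh are hypothetical. It assumes that an in-place removal can leave the caller holding the same object, which an identity test cannot tell apart from "no change".

```python
def remove_destructive(alist, name):
    """Remove the first entry named `name` in place; the list object is reused."""
    for i, (key, _) in enumerate(alist):
        if key == name:
            del alist[i]
            break
    return alist


def remove_fresh(alist, name):
    """Return a new list without the first entry named `name`; input is untouched."""
    out, removed = [], False
    for key, value in alist:
        if not removed and key == name:
            removed = True
            continue
        out.append((key, value))
    return out


# In-place removal: the identity test sees the same object as before.
packages = [("usr", 1), ("foo", 2)]
before = packages
print(remove_destructive(packages, "foo") is before)

# Fresh copy: the identity test sees a different object, so the change is noticed.
packages = [("usr", 1), ("foo", 2)]
before = packages
print(remove_fresh(packages, "foo") is before)
```

In this model only the second variant lets a cheap identity comparison serve as a change detector, which is the property the commit message asks of delete_package.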
-rw-r--r-- lib.c 2
-rw-r--r-- share/txr/stdlib/compiler.tl 25
-rw-r--r-- txr.1 84
3 files changed, 104 insertions, 7 deletions
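
The fence mechanism described in the commit message can be sketched outside TXR as follows. This is a minimal Python model, not the compiler's code: the string ":fence" stands in for the keyword symbol, split_at_fences is a hypothetical name, and the handling of empty groups is an assumption of the sketch that may differ from what split* does in TXR Lisp.

```python
FENCE = ":fence"  # stand-in for the :fence keyword symbol


def split_at_fences(forms):
    """Split a flat list of compiled forms into groups, dropping the fence markers."""
    groups = [[]]
    for form in forms:
        if form == FENCE:
            groups.append([])  # a fence closes the current group and opens a new one
        else:
            groups[-1].append(form)
    return groups


# A defpackage-like form is followed by a fence, so later forms land in a new group.
compiled = ["define-secret-fun", "defpackage-foo", FENCE, "call-secret-fun"]
for group in split_at_fences(compiled):
    print(group)  # each group models one object written to the output file
```

Each group is written out as its own object, so a reader processes (and a loader can execute) the package-manipulating group before it encounters symbols in the group that follows.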
diff --git a/lib.c b/lib.c
index f86576b9..761646ff 100644
--- a/lib.c
+++ b/lib.c
@@ -5226,7 +5226,7 @@ val delete_package(val package_in)
val package = get_package(lit("delete-package"), package_in, nil);
val iter;
loc cpll = cur_package_alist_loc;
- set(cpll, alist_nremove1(deref(cpll), package->pk.name));
+ set(cpll, alist_remove1(deref(cpll), package->pk.name));
for (iter = deref(cpll); iter; iter = cdr(iter))
unuse_package(package, cdar(iter));
return nil;
diff --git a/share/txr/stdlib/compiler.tl b/share/txr/stdlib/compiler.tl
index 90eb2d41..51b0becd 100644
--- a/share/txr/stdlib/compiler.tl
+++ b/share/txr/stdlib/compiler.tl
@@ -1576,6 +1576,12 @@
(defvarl %tlo-ver% ^(4 0 ,%big-endian%))
+(defvarl %package-manip% '(make-package delete-package
+ use-package unuse-package
+ set-package-fallback-list
+ intern unintern rehome-sym
+ use-sym unuse-sym))
+
(defun open-compile-streams (in-path out-path)
(let* ((rsuff (r$ %file-suff-rx% in-path))
(suff (if rsuff [in-path rsuff]))
@@ -1649,19 +1655,26 @@
(t (when (and (or *eval* *emit*)
(not (constantp form)))
(let* ((vm-desc (compile-toplevel form))
- (flat-vd (list-from-vm-desc vm-desc)))
+ (flat-vd (list-from-vm-desc vm-desc))
+ (fence (member (car form) %package-manip%)))
(when *eval*
- (sys:vm-execute-toplevel vm-desc))
+ (let ((pa *package-alist*))
+ (sys:vm-execute-toplevel vm-desc)
+ (when (or (neq pa *package-alist*))
+ (set fence t))))
(when *emit*
- out.(add flat-vd))))))))))
+ out.(add flat-vd)
+ (when fence
+ out.(add :fence)))))))))))
(prinl %tlo-ver% out-stream)
(unwind-protect
(whilet ((obj (read in-stream *stderr* err-ret))
((neq obj err-ret)))
(compile-form obj))
- (let ((*print-circle* t)
- (*package* (sys:make-anon-package)))
- (prinl out.(get) out-stream)
+ (let* ((*print-circle* t)
+ (*package* (sys:make-anon-package))
+ (out-forms (split* out.(get) (op where (op eq :fence)))))
+ [mapdo (op prinl @1 out-stream) out-forms]
(delete-package *package*)))
(let ((parser (sys:get-parser in-stream)))
diff --git a/txr.1 b/txr.1
index b441664b..524b6b7e 100644
--- a/txr.1
+++ b/txr.1
@@ -62889,6 +62889,8 @@ operators can be used to deliberately produce code which behaves differently
when compiled and interpreted. In addition, unwanted differences in behavior
can also occur. The situations are summarized below.
+.coNP Differences due to @ load-time
+
Forms evaluated by
.code load-time
are treated differently by the compiler. When a top-level form is compiled,
@@ -62900,6 +62902,8 @@ The interpreter doesn't perform this factoring; it evaluates a
.code load-time
form when it encounters it for the first time.
+.coNP Treatment of unbound variables
+
Unbound variables are treated differently by the compiler. A reference
to an unbound variable is treated as a global lexical access. This means that
if a variable access is compiled first and then a
@@ -62913,6 +62917,8 @@ The compiler treats a variable as dynamic if a
.code defvar
has been processed which marked that variable as special.
+.coNP Unbound symbols in @ dwim
+
Arguments of a
.code dwim
form (or the equivalent bracket notation) which are unbound
@@ -62950,6 +62956,84 @@ treated as a function. If it has both bindings, it is treated as a variable.
The difference is that this is resolved at compile time for compiled code,
and at evaluation time for interpreted code.
+.coNP File-wide insertion of gensyms
+
+The following degenerate situation occurs, illustrated by example. Suppose the
+following definitions are given:
+
+.cblk
+ (defvarl %gensym%)
+
+ (defmacro define-secret-fun ((. args) . body)
+ (set %gensym% (gensym))
+ ^(defun ,%gensym% (,*args) ,*body))
+
+ (defmacro call-secret-fun (. args)
+ ^(,%gensym% ,*args))
+.cble
+
+The idea is to be able to define a function whose name is an uninterned
+symbol and then call it. An example module might use these definitions as
+follows:
+
+.cblk
+ (define-secret-fun (a) (put-line `a is @a`))
+
+ (call-secret-fun 42)
+.cble
+
+The effect is that the second top-level form calls the function, which
+prints "a is 42" to standard output. This works both interpreted and compiled with
+.codn compile-file .
+Each of these two macro calls generates a top-level form into which
+the same gensym is inserted. This works under file compilation due to a
+deliberate strategy in the layout of compiled files, which allows such
+uses. Namely, the file compiler combines multiple top-level forms
+into a single object, which is read at once, and which uses the circle
+notation to unify gensym references.
+
+However, suppose the following change is introduced:
+
+.cblk
+ (define-secret-fun (a) (put-line `a is @a`))
+
+ (defpackage foo) ;; newly inserted form
+
+ (call-secret-fun 42)
+.cble
+
+This still works interpreted, and appears to compile. However, when the
+compiled file is loaded, the compiled version of the
+.code call-secret-fun
+form fails with an error complaining that the
+.code #:g0039
+(or other gensym name) function is not defined.
+
+This is because for this modified source file, the file compiler is not
+able to combine the compiled forms into a single object. It would not
+be correct to do so in the presence of the
+.code defpackage
+form, because the evaluation of that form affects the subsequent interpretation
+of symbols. After the package definition is executed, it is possible for
+a subsequent top-level form to refer to a symbol in the
+.code foo
+package such as
+.code foo:bar
+which would be erroneous if the package didn't exist.
+
+The file compiler therefore arranges for the compiled forms after the
+.code defpackage
+to be emitted into a separate object. But that division in the output file
+consequently prevents the occurrences of the gensym from resolving to the same
+symbol object.
+
+In other words, the strategy for allowing global gensym use is in conflict
+with support for forms which have a necessary read-time effect such as
+.codn defpackage .
+
+The solution is to rearrange the file to unravel the interference, or
+to use interned symbols instead of gensyms.
+
.SS* Compilation Library
.coNP Function @ compile-toplevel
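
The gensym conflict documented in the txr.1 hunk above can be modelled in a few lines. This Python sketch is an analogy under stated assumptions, not TXR behavior verified by execution: the class Uninterned and the function read_object are hypothetical, and read_object stands in for reading one printed object in which the circle notation makes repeated references share a single symbol.

```python
class Uninterned:
    """Stand-in for an uninterned symbol: equal names do not imply the same object."""

    def __init__(self, name):
        self.name = name


def read_object(names):
    """Model reading ONE printed object: repeated labels resolve to one symbol."""
    seen = {}
    result = []
    for n in names:
        if n not in seen:
            seen[n] = Uninterned(n)
        result.append(seen[n])
    return result


# One combined object: the definition and the call share the same symbol.
defn, call = read_object(["g0039", "g0039"])
print(defn is call)

# Two objects separated by a fence: each read creates its own symbol.
(defn,) = read_object(["g0039"])
(call,) = read_object(["g0039"])
print(defn is call)
```

In the second case the call refers to a symbol that never had a function bound to it, which corresponds to the "function is not defined" error described in the documentation text.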