author Kaz Kylheku <kaz@kylheku.com> 2015-07-30 08:58:39 -0700
committer Kaz Kylheku <kaz@kylheku.com> 2015-07-30 08:58:39 -0700
commit a82c88f98b5a176d62e0055ec9f475179acc313f (patch)
tree 3dc08c7f59b7d9cef690fb9b5a85b583f3a7fdab /HACKING
parent d9cda87d916b668c6f5de7fbf6cf983cb6c737e2 (diff)
Correction to COBJ initialization pattern.
In fact, the previously documented process is not correct and still leaves a corruption problem under generational GC (which has been the default for some time).
* HACKING: Document flaw in the initialization pattern previously thought to be correct, and show fix.
* hash.c (copy_hash): Fix instance of incorrect pattern.
* regex.c (regex_compile): Likewise.
Diffstat (limited to 'HACKING')
-rw-r--r-- HACKING 111
1 files changed, 69 insertions, 42 deletions
diff --git a/HACKING b/HACKING
index 078c6471..5fb49f14 100644
--- a/HACKING
+++ b/HACKING
@@ -5,44 +5,44 @@ CONTENTS:
SECTION LINE
-0. Overview 47
-
-1. Coding Practice 54
-1.2 Program File Structure 77
-1.3 Style 91
-1.3 Error Handling 153
-1.4 I/O 166
-1.5 Type Safety 176
-1.6 Regression 218
-
-2. Dynamic Types 227
-2.1 Two Kinds of Values 234
-2.1 Pointer Bitfield 245
-2.2 Heap Objects 268
-2.3 The COBJ type 288
-2.4 Strings 305
-2.4.1 Encapsulated C Strings 320
-2.4.2 Representation Hacks for 2-byte wchar_t 364
-2.4.3 Representation hacks for 4-byte wchar_t that is 2-byte aligned 422
-
-3. Garbage Collection 432
-3.1 Root Pointers 450
-3.2 GC-safe Code 473
-3.2.1 Rule One: Full Initialization 499
-3.2.2 Rule Two: Make it Reachable 528
-3.3 Weak Reference Support 663
-3.4 Finalization 706
-3.5 Generational GC 730
-3.5.2 Representation of Generations 739
-3.5.3 Basic Algorithm 775
-3.5.4 Handling Backpointers 810
-3.5.5 Generational GC and Finalization 888
-
-4. Debugging 917
-4.2. Debugging the Yacc-generated Parser 1048
-4.3. Debugging GC Issues 1061
-4.4 Object Breakpoint 1084
-4.5 Valgrind: Your Friend 1103
+0. Overview 48
+
+1. Coding Practice 55
+1.2 Program File Structure 78
+1.3 Style 92
+1.3 Error Handling 154
+1.4 I/O 167
+1.5 Type Safety 177
+1.6 Regression 219
+
+2. Dynamic Types 228
+2.1 Two Kinds of Values 235
+2.1 Pointer Bitfield 246
+2.2 Heap Objects 269
+2.3 The COBJ type 289
+2.4 Strings 306
+2.4.1 Encapsulated C Strings 321
+2.4.2 Representation Hacks for 2-byte wchar_t 365
+2.4.3 Representation hacks for 4-byte wchar_t that is 2-byte aligned 423
+
+3. Garbage Collection 433
+3.1 Root Pointers 451
+3.2 GC-safe Code 474
+3.2.1 Rule One: Full Initialization 500
+3.2.2 Rule Two: Make it Reachable 529
+3.3 Weak Reference Support 691
+3.4 Finalization 734
+3.5 Generational GC 758
+3.5.2 Representation of Generations 767
+3.5.3 Basic Algorithm 803
+3.5.4 Handling Backpointers 838
+3.5.5 Generational GC and Finalization 916
+
+4. Debugging 945
+4.2. Debugging the Yacc-generated Parser 1076
+4.3. Debugging GC Issues 1089
+4.4 Object Breakpoint 1112
+4.5 Valgrind: Your Friend 1131
0. Overview
@@ -587,7 +587,7 @@ destroys c. Essentially, the t->value structure member is the sink for the
data flow which carries the cons cell: The data flow emanates from the call
cons(foo, bar), and terminates in t->value.
-There are several right ways to fix this:
+Here is yet one more incorrect way to fix this:
{
val co;
@@ -603,9 +603,31 @@ COBJ. This makes the structure visible to the garbage collector (through the co
variable, which is live at the point where the cobj function is called, due to
having a next use in the return statement!) Now we can safely stash a newly
allocated cons cell into that structure, allowing that structure to hold the
-one and only reference to that object.
+one and only reference to that object. The issue which renders the above
+incorrect is with *how* we stash that cons into the object.
-Another approach, which avoids two-step initialization of the structure:
+The above breaks specifically because of generational
+garbage collection. The issue is that the t->value = cons(foo, bar)
+uses a plain C assignment. The problem is that the cons(foo, bar)
+call can trigger a garbage collection, which can promote the co object
+into the mature generation. Yet, the cons itself is a baby object. And
+consequently, the assignment now mutates a mature object to point to a baby
+object: the forbidden direction.
+
+If the above code structure is used, the assignment must use the
+set macro:
+
+ {
+ val co;
+ some_struct_type *t = (some_struct_type *) chk_malloc(sizeof *t);
+ t->value = nil;
+ co = cobj((mem_t *) t, some_type_symbol, &some_type_ops);
+ set(mkloc(t->value, co), cons(foo, bar));
+ return co;
+ }
+
+This is cumbersome. Another approach, which avoids two-step initialization of
+the structure, and the cumbersome set:
{
val c = cons(foo, bar);
@@ -621,11 +643,16 @@ cobj call because it has a next use: its value is used in the subsequent
assignment to t->value. We don't initialize the structure because even if
the cobj function triggers gc, the gc cannot yet see that structure and
so there is no danger. After cobj returns, the first thing we do is
-initialize the structure (obeying the first rule of gc-safe code).
+initialize the structure (obeying the first rule of gc-safe code).
Just after cobj returns, the structure is uninitialized and visible to the
garbage collector, but there is nothing that will trigger gc prior to
the initialization.
+The generational issue goes away because if the call to cobj triggers
+garbage collection, it will mean that the cons is a mature object.
+There is no problem with the assignment because it mutates a baby object
+to point to a mature object.
+
Note that this premature collection problem also affects functions which simply
take an existing object and put it into a structure, where it is not obvious
that an object may have been allocated which is not visible to gc,