CSE 341, Sp '06: GC FAQ

University of Washington Computer Science & Engineering

CSE 341, Sp '06: The Garbage Collection FAQ


I was wondering if you could explain the following things to me just once more. I had a hard time absorbing everything in lecture and the slides are fairly abstract:

What are "Forwarding Pointers"?
What's a "Cheney Queue"?
On slide 16, what do you mean by "find objects in memory without following pointers"?
Deutsch-Schorr-Waite algorithm---what's up with that?
What are "free-lists"?
What is "locality"?
How exactly do indirection tables and write barriers work?

Q. What are "Forwarding Pointers"?
A: Used in semi-space and related collectors. When you copy/move an object, everything that points to the old copy needs to be updated to point to the new copy. If the old copy had back-pointers to everything that pointed to it, that would be easy, but objects don't carry them. Instead, install in the old object a "forwarding pointer" (as in a forwarding address for U.S. Mail) to the new copy; as/after new copies are created, scan them for pointers into old-space, and replace each such pointer with a copy of the forwarding pointer found in the old object it points to. Forwarding pointers are no longer needed after the end of a stop & copy phase. (See also Cheney queue.)
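Here is a minimal sketch of the idea in Python, not how any real collector is coded. The `Obj` class, its `forward` slot, and the `copy_or_forward` helper are all made up for illustration; a real collector would overwrite part of the old object's memory rather than keep a separate slot.

```python
class Obj:
    """A toy heap object: pointer fields plus a slot for a forwarding pointer."""
    def __init__(self, name, fields=None):
        self.name = name
        self.fields = fields if fields is not None else []  # pointers to other Obj
        self.forward = None  # set once this object has been copied

def copy_or_forward(old):
    """Return the new copy of `old`, making the copy only on the first visit."""
    if old.forward is None:
        # First visit: copy the object and leave a forwarding pointer behind.
        old.forward = Obj(old.name, list(old.fields))
    return old.forward

shared = Obj("shared")
a = Obj("a", [shared])
b = Obj("b", [shared])

# Copy a and b, then scan the new copies and redirect their old-space pointers.
new_a, new_b = copy_or_forward(a), copy_or_forward(b)
for new in (new_a, new_b):
    new.fields = [copy_or_forward(f) for f in new.fields]
```

Both `new_a` and `new_b` end up pointing at the same single copy of `shared`, because the second visit finds the forwarding pointer instead of copying again.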
Q. What's a "Cheney Queue"?
A: No, not the Vice President's skinny braided pony tail. In a semi-space collector, you copy every "live" (non-garbage) object from from-space to to-space, allocating (and implicitly compacting) memory in to-space as you go. You need to find all the live objects (and update their pointers as you go). How to do that? Abstractly, you've got a "root set" of pointers, and objects with pointers to other objects, i.e., a directed graph (not necessarily connected - the unreachable parts are the garbage). Those of you who have had 326 will recall that "breadth first search" is one way to search a graph. What do you need to do BFS? A queue of not-yet-explored nodes. But we're in the middle of garbage collecting, so we don't have extra space to throw around for this hypothetical queue. Cheney to the rescue! Keep 2 pointers into to-space: LAST, the end of the most recently allocated new object (i.e., where the next copy will go), and FIRST, the first (least recently allocated) new object that is "unprocessed". Initially they're both at the front of to-space. First, copy everything pointed to from the root set, advancing LAST. Then, while FIRST < LAST, "process" the object FIRST points to, namely, examine all its pointers, copying any uncopied objects into to-space (advancing LAST), and then advance FIRST past it. How do you know if an object is uncopied? Look for the absence of a forwarding pointer (above). So the key point is that part of the data you need to move to to-space anyway constitutes the BFS queue needed to locate all the live objects, which you also need to do regardless - 2 birds, 1 shotgun, er, stone. Thank you, Cheney.
Exercise: think how you go about updating all the old pointers to become new pointers while you do this, withOUT requiring a second pass over to-space (and making use of forwarding pointers).
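If you want to check your answer, here is one possible sketch in Python (spoiler warning). It is a toy model under loud assumptions: the heap is a Python list, a "pointer" is just an index into it, every field is a pointer or `None`, and the names `Forwarded` and `cheney_collect` are invented for this example.

```python
class Forwarded:
    """Left in from-space in place of an object that has already been copied."""
    def __init__(self, new_index):
        self.new_index = new_index

def cheney_collect(from_space, roots):
    """Copy everything reachable from `roots`; return (to_space, new_roots).

    from_space: list of objects; each object is a list of fields, and each
                field is None or the index of another from-space object.
    Note: from_space is overwritten with forwarding pointers as we go.
    """
    to_space = []  # LAST is implicitly len(to_space)

    def copy(old):
        # Copy on first visit; afterwards just follow the forwarding pointer.
        if old is None:
            return None
        obj = from_space[old]
        if not isinstance(obj, Forwarded):
            from_space[old] = Forwarded(len(to_space))  # new address = LAST
            to_space.append(list(obj))  # the copy's fields still hold OLD pointers
        return from_space[old].new_index

    new_roots = [copy(r) for r in roots]

    first = 0                         # FIRST: oldest unprocessed copy
    while first < len(to_space):      # i.e., while FIRST < LAST
        obj = to_space[first]
        for i, old_ptr in enumerate(obj):
            obj[i] = copy(old_ptr)    # copy the target if needed, fix the pointer
        first += 1
    return to_space, new_roots

# Objects 0 and 2 are garbage; object 3 points to itself and to object 4.
from_space = [[None], [3], [0], [3, 4], [None]]
to_space, new_roots = cheney_collect(from_space, roots=[1])
```

Notice that the pointer update happens in the same loop that drains the queue: each field is rewritten at the moment its target is copied or found already forwarded, so no second pass over to-space is needed.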
Q. On slide 16, what do you mean by "find objects in memory without following pointers"?
How else would you find objects referenced by pointers?

A: Related to mark-and-sweep. Suppose your heap is memory locations 1 million to 10 million. You've just done the traversal from the root set marking everything reachable; this process ONLY follows pointers. Now how do you find the garbage? By definition, the garbage CANNOT be reached by following pointers from the root set. Well, look at the object at location one million. Is it marked? If so, not garbage; if not, garbage. In either case, figure out how long it is, say 42 bytes. Then look at the object in location 1 million and 42. Is it marked? How long is it? etc. Continue to location 10 million. Key point is that (unless absolutely every object is exactly 42 bytes long) you have to figure out what kind/size of object you are looking at in order to advance to the next one, withOUT having a nice, typed pointer to it. This part of GC probably canNOT be coded in a strongly typed language; you really need to be able to look at the bits, add 42 to a pointer, etc. to walk through memory this way. Crawling through the globals and the stack to find the root set pointers entails some similar difficulties.
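To make the "add 42 to a pointer" walk concrete, here is a small Python simulation. The header layout (a 4-byte total size followed by a 1-byte mark flag) is purely an assumption for this sketch, as are the names `HEADER`, `make_heap`, and `sweep`; real collectors pick their own layouts.

```python
import struct

# Hypothetical object header: total size in bytes (4 bytes), then a mark flag (1 byte).
HEADER = struct.Struct("<IB")

def make_heap(objects):
    """Lay out (payload_length, marked) pairs back to back in one bytearray."""
    heap = bytearray()
    for payload_len, marked in objects:
        heap += HEADER.pack(HEADER.size + payload_len, marked)
        heap += bytes(payload_len)
    return heap

def sweep(heap):
    """Walk the heap by address arithmetic alone; list (address, size) of unmarked objects."""
    garbage = []
    addr = 0
    while addr < len(heap):
        size, marked = HEADER.unpack_from(heap, addr)
        if not marked:
            garbage.append((addr, size))
        addr += size  # the size field is the only way to find the next object
    return garbage

# Four objects of total size 42, 13, 21, and 8 bytes; the 2nd and 4th are unmarked.
heap = make_heap([(37, 1), (8, 0), (16, 1), (3, 0)])
garbage = sweep(heap)
```

Nothing in `sweep` ever follows a pointer: it finds the garbage at addresses 42 and 76 just by reading each header and stepping forward by the recorded size.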
Q. Deutsch-Schorr-Waite algorithm---what's up with that?
Do we need to know it? If so, how does it work?

A: Basically for mark-and-sweep. You should know what it's for, but I didn't tell you (and you don't need to know) how it works. The what is basically: Again we need to traverse a graph to find reachable nodes, but at GC time we don't have much memory to spare to do this (else we wouldn't be GCing). DSW is a clever way to tweak/untweak the pointers you are traversing so as to simulate the stack/queue you would use in more ordinary graph traversal without using (much) extra space.
Q. What are "free-lists"?
A: In semi-space, the free space is one contiguous chunk of memory --- easy to keep track of. But in mark-and-sweep, after GCing at least once, the free space may be many chunks separated by live objects. How to track it? One approach is to throw these chunks onto a linked list, with, say, the first 2 words of each dedicated to record length of this chunk and a pointer to the next free chunk. This is a free list. "New" must then search the free list for a chunk big enough to satisfy the current request, & update the free list accordingly. GC probably needs to identify & merge adjacent free chunks (else fragmentation is irreversible).
Furthermore, suppose your application made very frequent use of 8 byte objects and 42 byte objects and infrequent use of other sizes. A variant of the free list approach might have a general free list as above, plus 2 special free lists dedicated to 8 and 42 byte objects respectively. The advantage is that it's slightly faster to allocate/free an 8 or 42 byte object than a general-sized object---just remove/add it to the appropriate list (no need to check/update chunk size). (But needs some thought as to whether/how/when adjacent free 8- or 42-byte chunks get merged into/carved from bigger free chunks...)
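A rough sketch of the general (single) free list in Python follows. It is a simplification: the chunks live in a sorted Python list of (start, length) pairs instead of a linked list threaded through the free memory itself, the search is first-fit, and the `FreeList` class and its method names are invented for this example.

```python
class FreeList:
    """First-fit free list; chunks are (start, length) pairs sorted by start address."""
    def __init__(self, start, length):
        self.chunks = [(start, length)]

    def allocate(self, size):
        """Return the start address of a chunk of `size` bytes, or None if nothing fits."""
        for i, (start, length) in enumerate(self.chunks):
            if length >= size:
                if length == size:
                    del self.chunks[i]                       # chunk used up entirely
                else:
                    self.chunks[i] = (start + size, length - size)  # carve off the front
                return start
        return None

    def free(self, start, size):
        """Return a chunk to the list and merge it with any adjacent free chunks."""
        self.chunks.append((start, size))
        self.chunks.sort()
        merged = [self.chunks[0]]
        for s, n in self.chunks[1:]:
            last_s, last_n = merged[-1]
            if last_s + last_n == s:
                merged[-1] = (last_s, last_n + n)            # adjacent: coalesce
            else:
                merged.append((s, n))
        self.chunks = merged

fl = FreeList(0, 100)
a = fl.allocate(8)     # address 0
b = fl.allocate(42)    # address 8
c = fl.allocate(8)     # address 50
fl.free(a, 8)
fl.free(c, 8)          # merges with the free tail that starts at 58
after_frees = list(fl.chunks)
d = fl.allocate(20)    # skips the 8-byte hole at 0, takes from the chunk at 50
```

After the two frees the list holds an 8-byte hole at address 0 and one merged 50-byte chunk at address 50; without the merge step in `free`, the freed chunk at 50 would stay a separate 8-byte fragment.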
Q. What is "locality"?
A: The tendency of a program to repeatedly access nearby ("local") memory locations. E.g., walking sequentially through an array exhibits locality; successively accessing totally random array locations does not. Why does it matter? Because hardware memory caching and virtual memory paging provide substantial performance benefits only if there's some locality in memory reference. You'll learn more about this in 378, if you haven't seen it already. One advantage of the semi-space collector is that it may enhance locality, compared to, say, mark and sweep.
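To see why the access order matters, here is a toy Python model of a cache. It is not real hardware: the direct-mapped design, the line size of 8 elements, the 16 cache lines, and the name `count_misses` are all assumptions chosen to keep the example tiny.

```python
def count_misses(addresses, line_size=8, num_lines=16):
    """Count misses in a tiny direct-mapped cache (a toy model, not real hardware)."""
    cache = [None] * num_lines  # which memory line each cache slot currently holds
    misses = 0
    for addr in addresses:
        line = addr // line_size
        slot = line % num_lines
        if cache[slot] != line:
            cache[slot] = line  # fetch the line, evicting whatever was there
            misses += 1
    return misses

n = 1024
# Walk the array front to back.
sequential = list(range(n))
# Visit the same 1024 elements column by column, as if the array were 32 x 32.
strided = [row * 32 + col for col in range(32) for row in range(32)]

seq_misses = count_misses(sequential)
strided_misses = count_misses(strided)
```

In this model the sequential walk misses once per 8-element line, while the strided walk touches exactly the same elements but misses on every single access, because each line it fetches is evicted before it is used again.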
Exercise: Suppose you're writing a semi-space collector for Scheme. When you copy a cons cell, would it be better to copy its car or its cdr next? Why? How does this interact with a Cheney queue?
Q. How exactly do indirection tables and write barriers work?
A: In generational garbage collection (basically separate semi-space collectors for old and young objects), you want to be able to collect the young space more frequently than old space. But if old objects point to young objects, and the young objects move, how do we find them? One option would be to walk through old-space & update its pointers whenever young space is collected. But this would negate much of the performance advantage of generational GC---old data is less volatile, so we hope to save time by not mucking with it as often. Option 2 is to keep track of all old->young pointers so we can update them when young moves. "Write barriers" trap writes to pointer fields in old space so we can check for & record appearance/disappearance of old->young pointers; when we GC young, we update all these pointers. Option 3 is to use an indirection table: if an old object wants to point to a young object Y, it doesn't point to it directly; instead, it points to an entry for Y in the indirection table. When we GC young, we only need to update a single pointer to each young object in the indirection table; all old objects indirect through that pointer, so they all can find Y's new location. [I emphasize old->young pointers; similar issues exist for young->old pointers when we GC old, but the issue is less severe since young is typically smaller. E.g., GCing young at the same time so all Y->O pointers are updated is not too inefficient. Another issue to consider: when do entries in the indirection table ever get removed?]
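Options 2 and 3 can be sketched side by side in Python. Everything here is hypothetical scaffolding: the `Obj` class, the `remembered` set, `write_field`, `move_young`, and `IndirectionTable` are names made up for this example, and a real write barrier is usually a few inlined machine instructions (or a page-protection trick), not a function call.

```python
class Obj:
    def __init__(self, name, generation):
        self.name = name
        self.generation = generation  # "old" or "young"
        self.fields = {}

# Option 2: the write barrier maintains a set of (old object, field) slots
# that currently hold old->young pointers.
remembered = set()

def write_field(obj, field, target):
    """Write barrier: in this sketch every pointer store goes through here."""
    obj.fields[field] = target
    if obj.generation == "old":
        if target is not None and target.generation == "young":
            remembered.add((obj, field))      # an old->young pointer appeared
        else:
            remembered.discard((obj, field))  # it was overwritten with something else

def move_young(old_copy):
    """Pretend the young collector moved one object; patch only the remembered slots."""
    new_copy = Obj(old_copy.name, "young")
    new_copy.fields = dict(old_copy.fields)
    for obj, field in remembered:
        if obj.fields[field] is old_copy:
            obj.fields[field] = new_copy
    return new_copy

# Option 3: old objects hold a handle (a table index) instead of a direct pointer.
class IndirectionTable:
    def __init__(self):
        self.entries = []
    def add(self, young_obj):
        self.entries.append(young_obj)
        return len(self.entries) - 1          # the handle
    def deref(self, handle):
        return self.entries[handle]
    def relocate(self, handle, new_copy):
        self.entries[handle] = new_copy       # one update, however many old objects share it

o1, o2, y = Obj("o1", "old"), Obj("o2", "old"), Obj("y", "young")
write_field(o1, "a", y)
write_field(o2, "b", y)
write_field(o2, "b", None)    # the old->young pointer in o2 disappears again
y2 = move_young(y)            # only o1's slot needs patching

table = IndirectionTable()
h = table.add(y2)             # o1 and o2 could both store h
y3 = Obj("y", "young")
table.relocate(h, y3)         # young moved again: fix the table, not the old objects
```

The trade-off shows up in where the work lands: the write barrier pays a little on every pointer store into old space, while the indirection table pays an extra hop on every dereference (and, as noted above, never says when an entry can be dropped).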

Hope this helps...


	Computer Science & Engineering University of Washington Box 352350 Seattle, WA 98195-2350 (206) 543-1695 voice, (206) 543-2969 FAX [comments to cse341-webmaster at cs.washington.edu]