Memory Management
Literature references and annotations by Dick Grune, dick@dickgrune.com.
Last update: Fri Sep 06 15:59:25 2024.
These references and annotations were originally intended
for personal use and are presented here only in the hope
that they may be useful to others.
There is no claim to completeness or even correctness.
Each annotation represents my understanding of the text
at the moment I wrote the annotation.
No guarantees given; comments and content criticism welcome.
Yossi Levanoni,
Erez Petrank,
An on-the-fly reference counting garbage collector for Java,
in OOPSLA '01 Conference,
ACM SIGPLAN Notices,
vol. 36,
#11,
Nov. 2001,
pp. 367-380.
Memory management through reference counting is not obviously compatible
with a multiprocessor environment, due to the high bandwidth needed to
update the reference counts.
The authors propose a number of techniques to allow reference counting
on multiprocessors.
First, like Deutsch & Bobrow (CACM, Sept 1976) they defer updating
non-local reference counts.
This requires keeping the old reference counts as well.
Second, they introduce a collection phase (not present in normal
reference counting), during which the old and new reference counts of
all threads are brought into agreement with each other, and during which
unreferenced objects are detected and freed.
To avoid stopping the world, a sliding view of the world is used, in
which internally consistent views of different threads at slightly
different times are integrated.
To allow this integration, all messages during the collection phase have
to be recorded and made available.
All the algorithms involved are sketched in the paper, but for the full
algorithms the reader is referred to a technical report.
Hans-J. Boehm,
Simple garbage-collection safety,
in 1996 ACM SIGPLAN Conf. Programming Language Design and Implementation,
ACM SIGPLAN Notices,
vol. 31,
#5,
May 1996,
pp. 89-98.
Conservative garbage collection is threatened by some compiler
optimizations which "hide" pointers by doing calculations on them.
For example, suppose the array reference p[i-1000] is the last reference
to the heap array pointer p; then the reference could conceivably be
translated as p := p - 1000; p[i], hiding the pointer p from a
conservative garbage collector.
Although this is a non-problem in practice, the author shows ways (not
at all so "simple" as the title promises) to guarantee full safety.
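A hypothetical C rendering of the example above (the function names are
mine; a real compiler would perform this rewrite inside one function):

    /* The source says p[i-1000]; p is the last live pointer to the array. */
    long get(long *p, long i) {
        return p[i - 1000];
    }
    /* Strength reduction may turn this into the equivalent of: */
    long get_optimized(long *p, long i) {
        long *q = p - 1000;   /* q no longer points into the array, so a  */
        return q[i];          /* conservative collector sees no reference */
    }                         /* to it and may reclaim it too early       */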
Nandakumar Sankaran,
A bibliography on garbage collection and related topics,
ACM SIGPLAN Notices,
vol. 29,
#9,
Sept. 1994,
pp. 149-158.
[ 281 entries ]
Hans-J. Boehm,
Space-efficient conservative garbage collection,
in ACM SIGPLAN '93 Conf. PLDI,
ACM SIGPLAN Notices,
vol. 28,
#6,
June 1993,
pp. 197-206.
Several techniques to avoid false pointer identification are described.
The main technique is to blacklist values found during a scan that are
identified as prospective false pointers, and to avoid allocating data
at those positions.
A prospective false pointer is for example a value that points into the
heap at a position where nothing has been allocated.
The conservative collector can determine this, but if it happens to
allocate data at that position, it will have created a false pointer.
A slow-down of less than 1 percent is reported.
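A minimal sketch of the blacklisting idea, assuming a fixed heap range
and page-granular blacklisting; all names (HEAP_START, is_allocated,
etc.) are my inventions, not Boehm's:

    #include <stdint.h>

    #define HEAP_START 0x10000000u
    #define HEAP_END   0x20000000u
    #define PAGE_SIZE  4096u
    #define HEAP_PAGES ((HEAP_END - HEAP_START) / PAGE_SIZE)

    static unsigned char blacklisted[HEAP_PAGES]; /* 1 = never allocate here */

    extern int is_allocated(uintptr_t addr);      /* hypothetical helper */

    /* Called during the mark phase: w looked like a pointer into the heap
       but nothing is allocated there, so it is a prospective false pointer;
       refusing to allocate at that spot keeps it from becoming a real one. */
    void note_prospective_false_pointer(uintptr_t w) {
        if (w >= HEAP_START && w < HEAP_END && !is_allocated(w))
            blacklisted[(w - HEAP_START) / PAGE_SIZE] = 1;
    }

    /* The allocator consults this before placing a new chunk on a page. */
    int page_usable(uintptr_t page_index) {
        return !blacklisted[page_index];
    }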
Reed Hastings,
Bob Joyce,
Purify -- Fast detection of memory leaks and access errors,
in Winter '92 USENIX Conference,
1992,
pp. 125-136.
Description of the tool Purify.
"Access errors are errors of commission, memory leaks are errors of
omission".
Access errors are detected as follows.
Purify maintains a two-bit state for each byte in memory; the states
are: unallocated, uninitialized and initialized.
It also instruments each memory access with code to test and/or update
the state.
Arrays are surrounded by "red" zones of bytes marked uninitialized, to
catch out-of-bounds indexing.
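A sketch of the per-byte state machine as I read the description; the
enum values and helper names are mine, not Purify's, and the real tool
distinguishes more cases:

    #define MEMORY_SIZE (1u << 20)         /* toy value */

    enum byte_state { UNALLOCATED, UNINITIALIZED, INITIALIZED };

    static unsigned char state[MEMORY_SIZE];  /* 2 bits per byte suffice */

    extern void report_error(unsigned long addr);  /* hypothetical */

    /* Instrumentation inserted before every load from address a: */
    void check_read(unsigned long a) {
        if (state[a] != INITIALIZED)   /* red zone, freed, or never written */
            report_error(a);
    }

    /* ...and before every store to address a: */
    void check_write(unsigned long a) {
        if (state[a] == UNALLOCATED)   /* writing into unallocated memory */
            report_error(a);
        else
            state[a] = INITIALIZED;    /* first write makes the byte valid */
    }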
Memory leaks are sniffed by calling a "garbage detector".
The garbage detector reports the origin (routine of allocation) of each
chunk it thinks is garbage.
It is up to the user to confirm this and take action.
There is also an excellent critique of conservative garbage collection
in Note 7, which points out, among other things, that the chance for
having a false pointer nail down a piece of garbage is larger for
larger pieces, but these are exactly the ones that we would like to
remove.
Amer Diwan,
Eliot Moss,
Richard Hudson,
Compiler support for garbage collection in a statically typed language,
in ACM SIGPLAN '92 Conf. on Prog. Lang. Design and Implementation,
ACM SIGPLAN Notices,
vol. 27,
#7,
July 1992,
pp. 273-282.
An untidy pointer is a pointer that is instrumental in accessing data,
but does not point properly to that data due to optimizations.
The obvious example is using the base address A[0] for an array
A[10:20].
Such pointers will be misinterpreted by the garbage collector.
The paper points out optimizations which can give rise to
untidy pointers, and suggests a number of remedies.
The basic solution is to extend the templates for the garbage collector
with the possibility of untidy pointers; simple expressions involving
addition and/or subtraction of constants can be specified.
This scheme has several problems, solutions for which are presented.
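A hypothetical C rendering of the A[0] example (note that forming a - 10
is formally undefined in C, which is precisely why such virtual-origin
pointers confuse a collector):

    int a[11];                /* stands in for A[10:20] */
    int *base = a - 10;       /* "untidy": the virtual origin A[0] points
                                 before the array, not into it */
    int element(int i) {      /* i ranges over 10..20 */
        return base[i];       /* fast indexing through the untidy pointer */
    }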
David L. Detlefs,
Concurrent garbage collection for C++,
in Topics in Advanced Language Implementation,
ed. by Peter Lee,
Cambridge, Mass.,
MIT Press,
1991,
pp. 101-134.
The two main questions a garbage collector has to answer are:
"Given an object what pointers does it contain?" and
"Given a pointer, where is the start and end of the object it points
to?".
Both are a problem in C++.
The first can be "solved" by conservative garbage collection.
The second is almost unsolvable due to the existence of interior pointers.
On top of this, the author wanted a compacting concurrent garbage
collector, which raises a host of new problems.
Compacting means updating pointers, which cannot be done in a
"conservative" way.
The paper first explains several solutions from the literature,
including Bartlett's "mostly copying" algorithm, which hitherto had only
been published in technical reports (but see
Bartlett, "Compacting garbage collection with ambiguous roots",
Lisp Pointers, Vol. 1, No. 6, 1988, pp. 3-12), and shows all of them
wanting.
His own garbage collector includes a new front end for the C++ code,
which constructs type descriptions for all allocatable types, calls
a different version of malloc for the "new" construct and makes various
other adaptations.
This provides complete knowledge of where the pointers are.
To find the start of an allocated segment from an interior pointer, a
bit map of memory is kept, one bit per allocatable position.
The bit is on for the start positions of all allocated segments; by
searching backwards, the start position can be found.
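A minimal sketch of the backward search, assuming one start bit per
8-byte granule; heap_base and the bit map layout are my assumptions:

    #include <stddef.h>

    #define GRANULE 8                  /* bytes per allocatable position */
    #define HEAP_SIZE (1u << 24)       /* toy value */

    extern char *heap_base;
    static unsigned char start_bit[HEAP_SIZE / GRANULE / 8];

    static int is_start(size_t g) {
        return (start_bit[g / 8] >> (g % 8)) & 1;
    }

    /* Map an interior pointer to the start of its allocated segment. */
    void *segment_base(void *interior) {
        size_t g = (size_t)((char *)interior - heap_base) / GRANULE;
        while (!is_start(g))           /* scan backwards for a set bit */
            g--;
        return heap_base + g * GRANULE;
    }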
Compacting is done by a variant of Bartlett's mostly copying collector.
Concurrency is introduced by using Baker's algorithm: copy the roots
only, and then start a second process to do the rest of the copying.
At the same time, watch all pointers the program is going to see, and if
they point to from-space, move the corresponding object right now.
Watching all pointers is implemented using the page lock primitives of
the Mach operating system.
Overlap percentages of 25 to 50 percent are achieved.
Andrew W. Appel,
Garbage collection,
in Topics in Advanced Language Implementation,
ed. by Peter Lee,
Cambridge, Mass.,
MIT Press,
1991,
pp. 89-100.
A somewhat breathless summary of garbage collection techniques (the
Schorr and Waite algorithm is explained in six lines!).
Ravi Sharma,
Mary Lou Soffa,
Parallel generational garbage collection,
ACM SIGPLAN Notices,
vol. 26,
#11,
Nov. 1991,
pp. 16-32.
Many references; Do excerpt.
Jürgen Heymann,
A comprehensive analytical model for garbage collection algorithms,
ACM SIGPLAN Notices,
vol. 26,
#8,
Aug. 1991,
pp. 50-59.
Derives closed formulas for various properties of five garbage
collection algorithms, using the entire Greek alphabet and then some.
The five algorithms are: mark & scan (M&S), two-space copying (COPY),
parallel M&S, parallel COPY, and reference count + parallel M&S.
The justification of the various properties is argued in depth.
In very broad terms, reference count + parallel M&S is "best".
Benjamin Goldberg,
Tag-free garbage collection for strongly typed programming languages,
in ACM SIGPLAN '91 Conf. Programming Language Design and Implementation,
ACM SIGPLAN Notices,
vol. 26,
#6,
June 1991,
pp. 165-176.
For every type (including the implicit types of activation records)
garbage collection routines are generated.
For monomorphically and polymorphically typed languages.
Ulrich Neumerkel,
Speicherbereinigung für Prologsysteme (Garbage collection for Prolog systems),
Thesis,
TU Inst.f. Computersprachen E185/1 Argentinierstr.8/4,
A-1040 Wien Austria,
~1990,
pp. .
(ulrich@vip.UUCP) +431 58801/4477 fax +431 5057838
E.P. Wentworth,
Pitfalls of conservative garbage collection,
Software Practice & Experience,
vol. 20,
#7,
July 1990,
pp. 719-727.
Points out that one falsely identified pointer can keep an extensive
data structure from being freed, and shows that this can happen in
practice, mainly in systems in which garbage is not created piecemeal
but in large blobs.
This happens regularly in functional languages, from which the author
draws his examples.
David R. Hanson,
Fast allocation and deallocation of memory based on object lifetimes,
Softw. Pract. Exper.,
vol. 20,
#1,
Jan. 1990,
pp. 5-12.
All dynamic data is grouped into a small number of lifetimes, e.g. phases
1 to n of a compiler.
For each lifetime, data is allocated from large chunks of memory which
are freed simultaneously only after the lifetime is over.
The algorithm is twice as fast as quick-fit and twice as slow as stack
allocation.
Includes code in ANSI C.
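The paper contains its own ANSI C code; the following is merely my
minimal sketch of the idea (error handling omitted):

    #include <stdlib.h>
    #include <stddef.h>

    struct chunk { struct chunk *next; char *avail, *limit; };
    struct arena { struct chunk *chunks; };

    void *arena_alloc(struct arena *a, size_t n) {
        n = (n + 7) & ~(size_t)7;                 /* align to 8 bytes */
        struct chunk *c = a->chunks;
        if (c == NULL || (size_t)(c->limit - c->avail) < n) {
            size_t sz = sizeof *c + (n > 4096 ? n : 4096);
            c = malloc(sz);                       /* grab a new big chunk */
            c->avail = (char *)(c + 1);
            c->limit = (char *)c + sz;
            c->next = a->chunks;
            a->chunks = c;
        }
        void *p = c->avail;                       /* bump allocation */
        c->avail += n;
        return p;
    }

    /* When the lifetime is over, everything goes at once: */
    void arena_free(struct arena *a) {
        struct chunk *c, *next;
        for (c = a->chunks; c != NULL; c = next) { next = c->next; free(c); }
        a->chunks = NULL;
    }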
R.P. Brent,
Efficient implementation of the first-fit strategy for dynamic storage allocation,
ACM Trans. Prog. Lang. Syst.,
vol. 11,
#3,
July 1989,
pp. 388-403.
Naive implementation of first-fit for malloc() uses linear search to
find the position of the first fit.
There exist several techniques to improve upon this by imposing a tree
structure over the free list, thus obtaining logarithmic cost; this is
another one, with a lower constant factor.
(Keeping separate free lists for exponentially increasing block sizes
gives (almost) constant time, though.)
Memory is filled by free and occupied chunks; a chunk consists of a
one-word header (size + free/occupied bit) followed by the user block.
On top of this a conceptual structure of a constant number of equally
sized "segments" is imposed.
Segments have no representation in memory and the only important
information about a segment is the position of the first chunk in it
and the size of the largest free chunk that starts in it;
since segments do not exist in memory, chunks can extend right
through them.
Segment addresses are kept in a priority heap, with the segment with
the smallest largest free chunk on top.
Now finding a chunk that is large enough is easy:
search the priority heap for the first segment to contain a large enough
chunk, and find that chunk by linear search from the first chunk in the
segment.
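A data-structure sketch of my reading of the scheme; the paper keeps the
segment summaries in a priority heap for logarithmic search, where this
sketch scans them linearly for brevity, and scan_chunks_from is a
hypothetical helper:

    #include <stddef.h>

    #define NSEGMENTS 1024

    struct segment_summary {
        size_t first_chunk;   /* address of the first chunk starting in it   */
        size_t max_free;      /* largest free chunk starting in this segment */
    };

    static struct segment_summary seg[NSEGMENTS];

    extern size_t scan_chunks_from(size_t chunk, size_t size); /* hypothetical */

    /* First fit: find the leftmost segment that can hold the request
       (in the paper: via the priority heap), then locate the chunk by a
       linear scan within that one segment only. */
    size_t first_fit(size_t size) {
        for (size_t s = 0; s < NSEGMENTS; s++)
            if (seg[s].max_free >= size)
                return scan_chunks_from(seg[s].first_chunk, size);
        return 0;   /* no fit */
    }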
The paper is not too reader-friendly:
1. The reason for having the segment structure is not explained.
I suppose it serves to keep the memory requirements for the
administration constant; keeping the free chunks themselves in a
priority heap would make that heap proportional in size to the number
of free chunks.
2. Both blocks and blocks + administration word are called blocks, which
is mildly confusing.
3. A number of algorithms and technical terms are introduced by
referring the reader to literature references, often exercises in
Knuth's book.
Michael Caplinger,
A memory allocator with garbage collection for C,
in Winter 1988 USENIX Conf.,
Feb. 1988,
pp. 325-330.
A conservative garbage collector for C.
Uses a 32-bit magic number to (heuristically) mark the start of a chunk
(called a segment here), to catch pointers to non-segments.
Coexists with free().
All malloc calls done by the system before the program starts set the
alien bit in the segment header.
Such segments are not garbage-collected, due to possible uncouth
operations during program start-up.
The performance of the package is "not extremely good, but acceptable".
Charles B. Weinstock,
William A. Wulf,
Quick Fit: an efficient algorithm for heap storage allocation,
ACM SIGPLAN Notices,
vol. 23,
#10,
Oct. 1988,
pp. 141-148.
A set of quick-lists is kept for blocks of popular sizes.
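A minimal sketch of the quick-list idea, assuming 8-byte size classes
and a hypothetical general allocator behind it:

    #include <stddef.h>

    #define NQUICK 16              /* exact lists for sizes 8, 16, ..., 128 */
    static void *quick[NQUICK];    /* heads of singly linked free lists */

    extern void *general_alloc(size_t n);         /* hypothetical fallback */
    extern void general_free(void *p, size_t n);

    void *quick_alloc(size_t n) {
        size_t i = (n + 7) / 8 - 1;
        if (i < NQUICK && quick[i] != NULL) {     /* pop: constant time */
            void *p = quick[i];
            quick[i] = *(void **)p;               /* next link is in the block */
            return p;
        }
        return general_alloc(n);
    }

    void quick_free(void *p, size_t n) {
        size_t i = (n + 7) / 8 - 1;
        if (i < NQUICK) {                         /* push: no coalescing */
            *(void **)p = quick[i];
            quick[i] = p;
        } else
            general_free(p, n);
    }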
Hans-Juergen Boehm,
Mark Weiser,
Garbage collection in an uncooperative environment,
Softw. Pract. Exper.,
vol. 18,
#9,
Sept. 1988,
pp. 807-820.
Introduces "conservative garbage collection": conservative garbage
collection does not guarantee that it will collect all garbage, only
that it will not collect any non-garbage.
The authors point out that in principle all garbage collectors are
conservative garbage collectors.
Conservative garbage collection needs virtually no cooperation from the
code.
This has several advantages:
1. it can work with virtually any code;
2. it costs nothing when not used;
3. it requires no extra memory;
4. the cooperation from the code cannot be wrong.
The only requirements are:
1. the code does not conceal pointers;
2. it must be easy to determine if a purported pointer points to an
allocated chunk.
The present version cannot handle pointers to inside a chunk.
Basically, conservative garbage collection is like normal non-compacting
mark-and-scan garbage collection, with the following additions:
1. the allocator keeps a list of allocated chunks;
2. anything that looks like a pointer (is in range and points to an
allocated chunk) is considered a pointer.
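A sketch of the conservative test in C; chunk_starting_at and the chunk
layout are my assumptions (note that, as said above, interior pointers
are not handled, so only exact chunk starts match):

    #include <stdint.h>
    #include <stddef.h>

    struct chunk {
        size_t size;
        int marked;
        char *data;
    };

    extern uintptr_t heap_lo, heap_hi;
    extern struct chunk *chunk_starting_at(uintptr_t w); /* hypothetical:
        non-NULL only if w is exactly the start of an allocated chunk */

    /* Treat anything in range that starts an allocated chunk as a pointer. */
    void scan_region(uintptr_t *from, uintptr_t *to) {
        for (uintptr_t *p = from; p < to; p++) {
            uintptr_t w = *p;
            if (w >= heap_lo && w < heap_hi) {           /* requirement 1 */
                struct chunk *c = chunk_starting_at(w);  /* requirement 2 */
                if (c != NULL && !c->marked) {
                    c->marked = 1;
                    scan_region((uintptr_t *)c->data,
                                (uintptr_t *)(c->data + c->size));
                }
            }
        }
    }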
A.W. Appel,
J.R. Ellis,
K. Li,
Real-time concurrent collection on stock multiprocessors,
ACM SIGPLAN Notices,
vol. 23,
#7,
July 1988,
pp. 11-20.
Using virtual memory traps to avoid tests in Baker's version of
two-space copying garbage collection.
Bernard Lang,
Francis Dupont,
Incremental incrementally compacting garbage collection,
in ACM SIGPLAN '87 Symposium on Interpreters and Interpretive Techniques,
ACM SIGPLAN Notices,
vol. 22,
#7,
July 1987,
pp. 253-263.
Starts by constructing an abstract garbage collection algorithm and then
shows that mark & scan + compaction and two-space copying are actually
different actualizations of the same algorithm.
The authors then construct a "mixed" algorithm, with a from-space, a
to-space and an ms-space (for mark & scan).
Each space may consist of several non-adjacent segments.
During each garbage collection cycle one set of from-space, to-space and
ms-space segments is treated.
In addition the boundaries between the segments are movable.
All this should result in very adaptive garbage collection.
The algorithms are only described loosely (but probably adequately).
D. Yun Yeh,
Toshinori Munakata,
Dynamic initial allocation and local reallocation procedures for multiple stacks,
Commun. ACM,
vol. 29,
#2,
Feb. 1986,
pp. 134-141.
The authors propose two improvements to the bidirectional multiple-stack
allocation algorithm of Korsh & Laison, Commun. ACM, 1983.
The first improvement concerns the initial stack allocation.
Traditionally, a static placement is chosen, before any element is
stacked.
This paper proposes to delay allocation for a stack until the first
element is stacked.
When the time arrives for a stack to be allocated, it is allocated to
the back of an unpaired stack if there is one, or in the largest gap.
An advantage is that the number of stacks need not be known in advance,
in addition to a more flexible allocation.
The second improvement concerns reallocation after a collision.
Traditionally, all stacks are reallocated.
This paper proposes to try to restrict the number of stacks pushed
around.
Note that in the bidirectional structure four stacks are involved in a
collision: the colliding stacks with their back-to-back partners.
First an attempt is made to disengage the colliding stacks by shifting
the pair that has the largest gap next to it.
If this does not yield enough elbow room (according to some criterion),
more stack pairs and gaps are involved in the shift, etc., until in the
worst case all stacks have to be moved.
Simulation results are given, suggesting that both measures together 1.
achieve a 15 to 20% reduction of the number of elements moved, 2. get
more improvement as the stack behavior is more erratic, and 3. are more
stable; all with respect to the Korsh-Laison algorithm.
D. Julian M. Davies,
Technical Correspondence on "A multiple-stack manipulation procedure",
Commun. ACM,
vol. 27,
#11,
Nov. 1984,
pp. 1158.
Points the reader to Bobrow & Wegbreit, CACM, 1973.
Mordechai Ben-Ari,
Algorithms for on-the-fly garbage collection,
ACM Trans. Prog. Lang. Syst.,
vol. 6,
#3,
July 1984,
pp. 333-344.
After having criticized Dijkstra's algorithm for being too difficult to
prove correct, the author presents a simpler but less efficient
algorithm using two colours, and proves its correctness (informally).
From this, a more efficient algorithm using three colours is derived.
Full algorithms are given in Ada; efficient forms of bit manipulation in
hardware are also included.
Thomas W. Christopher,
Reference count garbage collection,
SP&E,
vol. 14,
#6,
June 1984,
pp. 503-507.
The title of the paper is misleading: the reference count algorithm is
not doctored to do full garbage collection, it is just complemented by
a full mark and scan garbage collector in the traditional way.
The novelty of the paper lies in the way the two are integrated such
that the presence of the reference counts is used to obviate the need
to know the stack.
"How to do full garbage collection without knowing the stack" would be a
more appropriate title.
Basically, the garbage collector works in three scans (aside from doing
the normal frees on chunks that get a reference count of 0).
First it goes through all pointers in all chunks not in the free list
and decrements the reference count of the chunks they point to by one.
Now all non-zero reference counts stem from references from outside the
heap.
Next a scan finds all chunks with a non-zero reference count, and for
each such chunk C performs a depth-first visit on all chunks reachable
from it and sets their reference counts to $epsilon$ (called RCNIL in the
paper), a number that is not zero, but acts as zero when something is
added.
Now all non-garbage chunks have a reference count larger than zero.
The last scan reverses the effect of the first: it goes through all
pointers in all chunks with a non-zero reference count and increments
the reference count of the chunks they point to by one.
Now all non-referenced circular structures have got reference counts of
zero.
Some details of the scans and the implementation of the counters and the
free list are given in narrative form.
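A schematic C rendering of the three scans exactly as described above;
the chunk layout, the concrete RCNIL value and the helper names are
mine, and the paper's details differ:

    #define RCNIL (-1)  /* nonzero, but "acts as zero when something is added" */

    struct chunk {
        struct chunk *next;    /* list of all chunks not on the free list */
        long rc;               /* reference count */
        int nptrs;             /* number of pointer fields */
        struct chunk *ptr[8];  /* the pointer fields (toy layout) */
    };

    static void inc(struct chunk *c) { c->rc = (c->rc == RCNIL) ? 1 : c->rc + 1; }

    static void mark_rcnil(struct chunk *c) {   /* depth-first visit */
        if (c == NULL || c->rc == RCNIL) return;
        c->rc = RCNIL;
        for (int i = 0; i < c->nptrs; i++) mark_rcnil(c->ptr[i]);
    }

    void collect(struct chunk *all) {
        struct chunk *c; int i;
        /* Scan 1: cancel heap-internal references; what remains counts
           references from outside the heap only. */
        for (c = all; c; c = c->next)
            for (i = 0; i < c->nptrs; i++)
                if (c->ptr[i]) c->ptr[i]->rc--;
        /* Scan 2: everything reachable from an externally referenced
           chunk gets the nonzero count RCNIL. */
        for (c = all; c; c = c->next)
            if (c->rc > 0) mark_rcnil(c);
        /* Scan 3: reverse scan 1; unreachable cycles stay at zero. */
        for (c = all; c; c = c->next)
            if (c->rc != 0)
                for (i = 0; i < c->nptrs; i++)
                    if (c->ptr[i]) inc(c->ptr[i]);
        /* Chunks with rc == 0 are garbage and can now be freed. */
    }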
David M. Ungar,
Generation scavenging: A non-disruptive high performance storage reclamation algorithm,
in ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments,
ACM SIGPLAN Notices,
vol. 19,
#5,
May 1984,
pp. 157-167.
A mature, everything-but-the-kitchen-sink (and then the kitchen sink on
top) approach to memory management, as implemented for Berkeley
Smalltalk.
After rejecting four existing garbage collection techniques
(reference counting for being too expensive and for not releasing
cycles,
deferred reference counting for not releasing cycles,
mark & sweep for causing frequent pauses, and
two-space copying),
the author combines scavenging for new objects and off-line reclamation
of old objects.
There are three areas for new objects, NewSpace where the new are
allocated, PastSurvivorSpace where surviving new objects remain and a
scratch area FutureSurvivorSpace.
The scavenger moves live objects from NewSpace + PastSurvivorSpace to
FutureSurvivorSpace and then swaps the meaning of PastSurvivorSpace and
FutureSurvivorSpace.
Objects that survive several scavenges get tenure and are moved to the
old objects area, where they are subject to off-line collecting (using
good old mark & scan?).
All this no doubt requires a lot of pointer updating.
Measurements are given and comparisons with the rejected algorithms are
made.
A four-bit pointer extension in the hardware is suggested to assist in
garbage collecting.
James F. Korsh,
Gary Laison,
A multiple-stack manipulation procedure,
Commun. ACM,
vol. 26,
#11,
Nov. 1983,
pp. 921-923.
The authors propose a modification to Garwick's algorithm (BIT 4, 1964)
for keeping a number of stacks in a contiguous chunk of memory.
In Garwick's algorithm all stacks grow in the same direction, but this
algorithm implements half the stacks as growing from low to high and the
other half as growing from high to low.
The end stacks are with their backs to the ends of memory, the left one
growing right and the right one growing left.
The other stacks are in between, in back-to-back pairs.
This arrangement leaves larger gaps between the stacks, and fewer hard
boundaries, which is a good thing.
So note that "bidirectional" in this case does not mean that the stacks
can grow from both ends, as one might assume.
It is probably up to the interface to hide this set-up, although this is
not mentioned in the paper.
Simulation results are presented which show that the bidirectional
technique roughly reduces the number of shifted bytes by half.
Jacques Cohen,
Alexandru Nicolau,
Comparison of compaction algorithms for garbage collection,
ACM TOPLAS,
vol. 5,
#4,
Oct. 1983,
pp. 532-553.
The time efficiency of four garbage compacting algorithms is determined
by deriving closed formulas for it.
The algorithms were first coded in a special language and a processor
was used to determine a closed form for the time requirement; the
program Macsyma was then used to simplify the formula.
Memory requirements were derived directly from the algorithm.
The four garbage compacting algorithms are: Knuth's standard Lisp
three-pass garbage collector, a "rolling table" compacter, Morris'
compacter and Jonkers' compacter.
Each is explained briefly but sufficiently.
The formulas are then applied to specific hardware, the PDP-10.
The verdict is that Knuth's standard algorithm is fastest but requires
space for one pointer, unlike the other three, and among those that need
almost no additional memory (usually 1 or 2 bits) Jonkers' is fastest.
Morris's is 3 to 8 times slower due to pointer manipulation, and table
rolling can become non-linear.
Johannes J. Martin,
An efficient garbage compaction algorithm,
Commun. ACM,
vol. 25,
#8,
Aug. 1982,
pp. 571-581.
Considerable pointer shuffling to speed up Morris' algorithm.
David R. Barach,
David H. Taenzer,
Robert E. Wells,
A technique for finding storage allocation errors in C-language programs,
ACM SIGPLAN Notices,
vol. 17,
#5,
May 1982,
pp. 16-23.
malloc() and free() are instrumented so as to write a small (6-byte)
record of their actions to a file.
A postprocessing program analyses this file to produce a report of leaked
blocks and duplicate frees.
The report includes call stack listings for the offending actions, which
necessitates compiler-dependent code in the instrumentation, to scan the
stack.
Since the authors use fopen() to open the record file, they have to
switch off their recording mechanism during that open.
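A sketch of the instrumentation; the record layout is my guess in text
form, where the paper writes compact 6-byte binary records and also
captures the call stack, which needs compiler-dependent code:

    #include <stdio.h>
    #include <stdlib.h>

    static FILE *log_fp;   /* opened once at start-up; as noted above, the
                              fopen() call itself must not be recorded */

    void trace_init(const char *path) { log_fp = fopen(path, "w"); }

    /* Wrappers around the real allocator. */
    void *t_malloc(size_t n) {
        void *p = malloc(n);
        fprintf(log_fp, "M %p %lu\n", p, (unsigned long)n);
        return p;
    }

    void t_free(void *p) {
        fprintf(log_fp, "F %p\n", p);
        free(p);
    }
    /* The postprocessor pairs M and F records: an unmatched M is a leak,
       an F with no live matching M is a duplicate (or wild) free. */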
David Spector,
Minimal overhead garbage collection of complex list structure,
ACM SIGPLAN Notices,
vol. 17,
#3,
March 1982,
pp. 80-82.
The pointer obtained when an object is created is marked the "defining
reference" by setting one bit in it; whenever the pointer is copied, the
defining-reference bit in the copy is cleared.
Whenever a pointer is discarded, its defining-reference bit is checked
and if it is set, the object pointed at is discarded after recursively
discarding all the pointers in it.
This scheme only works if the defining reference is the last to be
discarded; it is the programmer's job to see to this, so it's not really
garbage collection.
Jacques Cohen,
Garbage collection of linked data structures,
Computing Surveys,
vol. 13,
#3,
1981,
pp. 341-367.
Survey of many algorithms known at that time, with short descriptions.
D.G. Bobrow,
Managing reentrant structures using reference counts,
ACM TOPLAS,
vol. 2,
#3,
July 1980,
pp. 269-273.
Although in principle reentrant structures cannot be handled with
reference counts, large subclasses of them can, with some assistance from
the programmer.
The basic idea is to define special-purpose groups of nodes and to keep
one reference count for the entire group, which counts the number of
references pointing into the group from outside of it.
The programmer defines what constitutes a group, and the system
maintains the counter.
When the counter reaches zero, the entire group can be deallocated.
Four examples of such groups are discussed, covering doubly-linked lists
and trees, circular lists, etc.
David S. Wise,
Morris' garbage compaction algorithm restores reference counts,
ACM TOPLAS,
vol. 1,
#1,
July 1979,
pp. 115-120.
[[ So concise as to be incomprehensible. ]]
H.B.M. Jonkers,
A fast garbage compaction algorithm,
Inform. Process. Lett.,
vol. 9,
#1,
July 1979,
pp. 26-30.
The algorithm consists of two scans, both from left to right.
The first scan does not move blocks.
When arriving at a block B, the invariant holds that pointers to B
from blocks left of B have been threaded into a thread starting at
B; these can now be updated to point to the final location of B.
Next, pointers from B are threaded to their destinations.
Note that all pointer locations in B still contain genuine pointers,
not thread pointers.
The second scan moves the blocks.
When arriving at a block B, the invariant holds that all blocks to the
left of B have been moved to their final positions and contain updated
pointers only, and pointers to B from B and blocks to the right of it
have been threaded into a thread starting at B; these can now be
updated to point to the final location of B.
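The two threading primitives, in a hedged C sketch of my own (one word
per cell, and a spare low tag bit to distinguish thread links from
ordinary header words; all names are my reconstruction):

    #include <stdint.h>

    typedef uintptr_t word;

    #define THREAD_TAG 1u
    #define MAKE_THREAD(p)  ((word)(p) | THREAD_TAG)
    #define IS_THREAD(w)    ((w) & THREAD_TAG)
    #define THREAD_PTR(w)   ((word *)((w) & ~(word)THREAD_TAG))

    /* Thread one reference: the field at address ref points to some block;
       splice the field into the chain rooted at that block's header word. */
    void thread(word *ref) {
        word *block = (word *)*ref;
        *ref = *block;                /* field now holds the old header */
        *block = MAKE_THREAD(ref);    /* header now heads the chain */
    }

    /* The block will move to new_addr: walk the chain from its header,
       point every threaded field at new_addr, and restore the header. */
    void update(word *block, word *new_addr) {
        word w = *block;
        while (IS_THREAD(w)) {
            word *ref = THREAD_PTR(w);
            w = *ref;                 /* next link in the chain */
            *ref = (word)new_addr;    /* the field gets its final value */
        }
        *block = w;                   /* original header word restored */
    }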
The very clear paper contains an informal description of the algorithm,
informal code, formal code in Algol 68 and two page-size pictures, one
showing the threading symbolically (abstracting from the exact
mechanism) and one showing the actual threading.
F. Lockwood Morris,
A time- and space-efficient garbage compaction algorithm,
Commun. ACM,
vol. 21,
#8,
Aug. 1978,
pp. 662-665.
Vivid description of a compaction algorithm that requires only one bit
per pointer.
To simplify updating the pointers after (or during) the move phase, for
each chunk X the pointers to it are chained in a chain starting at
X and terminated by the contents of the first word in X, which was
pushed out by the new pointer to the start of the chain; one bit is
needed to signal the end of the chain.
Then the chunks can be moved and for each chunk its address, now known,
can be patched back into the pointer locations accessible through the
chain.
Chaining the pointers in the first place is the hard part, and the
author shows that it can be done in two sweeps, one in the direction of
the compaction and a second against it.
The latter can then be combined with the move phase and the update
phase, to result in an efficient algorithm.
Full code is supplied, with an informal proof of correctness.
Henry G. Baker Jr.,
List processing in real time on a serial computer,
Commun. ACM,
vol. 21,
#4,
April 1978,
pp. 280-294.
[ concurrent garbage collection for a LISP system ]
D.S. Wise,
D.P. Friedman,
The one-bit reference count,
BIT,
vol. 17,
#3,
Sept. 1977,
pp. 351-359.
The one-bit reference count of a chunk has two values:
0 = unknown / overflow,
1 = there is exactly one reference to this chunk.
Chunks start with a reference count of 1.
Adding a reference sets the reference count to overflow, deleting the
reference frees the chunk.
Unreachable chunks with a reference count = overflow have to wait until
a real garbage collector comes along.
(Seems extendible to reference counts of any size - DG.)
The rest of the paper (the larger part) describes a caching mechanism
for nodes whose reference count has only temporarily overflowed.
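A minimal sketch of the one-bit scheme described above (the encoding and
the allocator names are mine):

    struct node { unsigned unique : 1;  /* 1 = exactly one reference */
                  /* ... payload ... */ };

    extern struct node *raw_alloc(void);    /* hypothetical */
    extern void raw_free(struct node *);

    struct node *new_node(void) {
        struct node *n = raw_alloc();
        n->unique = 1;                 /* chunks start with one reference */
        return n;
    }

    void add_reference(struct node *n) {
        n->unique = 0;                 /* count overflows to "unknown";
                                          only a full GC can reclaim n now */
    }

    void delete_reference(struct node *n) {
        if (n->unique)
            raw_free(n);               /* the only reference disappeared */
        /* if not unique: leave it for the real garbage collector */
    }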
Jeffrey M. Barth,
Shifting garbage collection overhead to compile time,
Commun. ACM,
vol. 20,
#7,
July 1977,
pp. 513-518.
Extensive data flow analysis to catch situations in which allocation is
more or less immediately followed by becoming inaccessible ("cancelling
transactions").
Also detects situations in which bunch allocations can be done.
David S. Wise,
Dan C. Watson,
Tuning Garwick's algorithm for repacking sequential storage,
BIT,
vol. 16,
1976,
pp. 442-450.
Garwick's algorithm (BIT 4, 1964) concerns the maintenance of a sequence
of n stacks or extensible arrays in a consecutive memory segment: when
one stack bumps into the next as a result of a PUSH or EXTEND operation,
some (or in Garwick's algorithm all) stacks have to be reshifted to
"well-spaced" positions.
To compute the new positions, Garwick's algorithm remembers the old
sizes of the stacks, so it can compute their respective growths.
Now a fraction a of the available memory is distributed as growth
space equally over the n stacks, and a fraction $1-a$ is distributed
in proportion to the stacks' latest growths.
Garwick recommends a to be set to about 0.1.
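In formula form (my reconstruction of the description above), with F
the memory left over after the stacks' current contents, n the number of
stacks and $g_i$ the latest growth of stack i, stack i receives the gap

    $gap_i = a (F/n) + (1-a) F (g_i / (g_1 + ... + g_n))$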
The authors propose to adjust a dynamically, as follows.
When reallocation takes place, they compute the a that would have
avoided this reallocation, had that been the value of a the last time,
and set a to that value.
They test this heuristic by simulation under three scripts:
1. Stacks are pushed on at random.
Result: the value of a soon reaches 0.8 and then fluctuates around
that value; prediction has no value here.
2. One stack is chosen and extended until reallocation takes place, and
then a new stack is chosen at random, etc.
Result: the value of a slowly climbs to 1.0; again prediction has no
value.
3. Random stacks are pushed on in proportion to their sequence number,
i.e. stack number i grows about i times as fast as stack 1.
Result: the value of a stabilizes at about 0.15, even if it is started
at 0.9; prediction is effective here.
So it works.
L.Peter Deutsch,
Daniel G. Bobrow,
An efficient, incremental, automatic garbage collector,
Commun. ACM,
vol. 19,
#9,
Sept. 1976,
pp. 522-526.
The authors consider a two-level memory: a cheaply accessible region
called the "variables" and a more expensive region called the heap.
Whether these regions are indeed on different hardware (main memory and
backing store) or just the working set versus pages out on disk is
immaterial.
Reference counts are kept for all garbage-collectible data, but with
four considerable differences: 1. they count references from the heap
only; 2. they are kept up to date only for those in the variables; 3.
they are not kept in the data itself but rather in hash tables; 4.
reference counts of exactly one (which are the most frequent) are not
recorded.
This greatly increases efficiency (no more updating of counters in
secondary memory), but no longer allows freeing data as soon as its
reference count drops to zero (it may still be referenced from the
variables).
To remedy this, two (hash) tables are kept: the ZCT (zero count) table,
which holds the addresses of data not pointed to by data on the heap,
and an MRT (multi-reference table) which holds the addresses and
reference counts of data pointed to by more than one reference from the
heap; since most (90 to 98%) of the accessible data carry a reference
count of 1, these tables can be small.
When creating a data item, its address is entered into the ZCT table.
Storing pointers to it in variables carries no cost.
When storing a pointer into heap data, 1. if the pointer is in the ZCT
table, delete it from that table (reference count goes from 0 to 1); 2.
if it is in the MRT table, increment its reference count there; 3. if it
is in neither, enter it into the MRT with a count of 2.
Similar obvious rules apply to the deletion of a pointer situated in the
heap.
Again, pointers situated in the variables are not tracked.
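The store and delete rules above, as a hedged C sketch of mine (the
table API is hypothetical):

    /* Hypothetical hash-table API for the ZCT and MRT. */
    extern int  zct_contains(void *p);
    extern void zct_insert(void *p);
    extern void zct_delete(void *p);
    extern int  mrt_contains(void *p);
    extern void mrt_insert(void *p, int count);
    extern void mrt_delete(void *p);
    extern void mrt_increment(void *p);
    extern int  mrt_decrement(void *p);   /* returns the new count */

    /* A pointer to target is stored into a heap cell. */
    void heap_store(void *target) {
        if (zct_contains(target))
            zct_delete(target);       /* count goes 0 -> 1; 1 is implicit */
        else if (mrt_contains(target))
            mrt_increment(target);    /* 2 -> 3, 3 -> 4, ... */
        else
            mrt_insert(target, 2);    /* implicit 1 becomes explicit 2 */
    }

    /* A pointer to target is deleted from a heap cell (the mirror image). */
    void heap_delete(void *target) {
        if (mrt_contains(target)) {
            if (mrt_decrement(target) == 1)
                mrt_delete(target);   /* back to the implicit count of 1 */
        } else
            zct_insert(target);       /* 1 -> 0: candidate for reclamation */
    }
    /* Pointers stored in or removed from the variables are not tracked. */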
When a memory allocation request fails or the ZCT table overflows,
reclaiming takes place, as follows.
The variables are scanned, and any ZCT entry holding a pointer
accessible from the variables is marked "used by variables but not by
the heap"; now all data pointed to by unmarked entries in the ZCT table
can be freed, and the corresponding entries in the ZCT table can be
cleared; finally the marks are removed.
Freeing the heap data can again affect the ZCT and the MRT, so, if
needed or desired, reclaiming may be repeated.
Reference-counting garbage collectors require a full garbage collector
to reclaim cyclic structures.
The paper describes a two-space copying garbage collector based on the
ZCT and MRT tables, and a generational one.
Furthermore it suggests the use of a separate processor to concurrently
handle the ZCT and MRT table handling and memory reclamation.
Most descriptions in the paper are rather sketchy, and require
considerable interpretation.
D.G. Bobrow,
A note on hash linking,
Commun. ACM,
vol. 18,
#7,
July 1975,
pp. 413-415.
Normally a pointer leads to a memory location and gives access to the
contents of that location and of possibly surrounding memory locations.
The technique described here allows additional data to be associated
with arbitrary pointers and memory locations, at a price.
The basic idea is very simple: put the additional information in a hash
table with the pointer as the key, and look it up.
The problem is of course to know that additional information is
associated with a pointer and should be looked up.
The answer is: mark the location under the pointer as "hash-linked".
If that cannot be done, it gets more expensive, but the scheme can still
be useful (see below); but for the moment we assume it can.
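A minimal sketch, assuming a spare bit in each word to serve as the
hash-linked mark; the side-table API is hypothetical:

    #include <stdint.h>

    #define HASH_LINKED ((uint32_t)1 << 31)   /* assumed spare bit */

    extern void *table_lookup(uint32_t *key);             /* hypothetical */
    extern void  table_insert(uint32_t *key, void *data);

    /* Attach extra data to an arbitrary word: mark the word, and file the
       data in a hash table keyed on the word's address. */
    void hash_link(uint32_t *loc, void *extra) {
        *loc |= HASH_LINKED;
        table_insert(loc, extra);
    }

    /* Access costs one bit test in the common case; only marked words
       pay for the hash lookup. */
    void *extra_of(uint32_t *loc) {
        return (*loc & HASH_LINKED) ? table_lookup(loc) : 0;
    }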
This scheme has funny applications.
First of all, it can be applied to itself, as follows.
When a new pointer is entered into the hash table and there is a
collision, the hash table entry found is marked hash-linked, and the new
entry is entered as the additional information into a second hash table
with the pointer as the key.
This means that there will never be a collision during look-up in the
first hash table, which in turn means that this table does not have to
store the keys; this is a considerable memory saver.
The paper continues by giving a number of applications.
Some examples:
1. Large arrays of mostly small numbers can be allocated with 1 or 2
bytes to a number; the occasional larger value is stored in the hash
table, with its location marked by maxint.
2. Occasional text insertions or deletions in very long strings can be
represented by storing the modifications in the hash table and replacing
their positions by marker bytes.
3. A recursive visit to all nodes of a possibly cyclic graph (called a
"reentrant structure" here) in which there is no room in the nodes to
store marking bits can be performed by storing the pointer to each node
in a hash table as soon as it is met, and checking the hash table before
following a pointer.
(This seems to me to be a different algorithm: here we do not
know in advance that the address is there -- its presence in or absence
from the hash table is the marking bit.
DG.)
All these techniques are useful when memory is more important than CPU
cycles.
C.J. Cheney,
A non-recursive list compacting algorithm,
Commun. ACM,
vol. 13,
#11,
Nov. 1970,
pp. 677-678.
[ this is about two-space copying garbage collection ]
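A minimal sketch of the two-space copying (Cheney) scan, under an
assumed object layout with a forwarding slot and an explicit
pointer-field count (sizes assumed suitably aligned):

    #include <stddef.h>
    #include <string.h>

    struct obj {
        struct obj *forward;    /* non-NULL once copied to to-space */
        size_t size;            /* total size of the object in bytes */
        size_t nptrs;           /* number of pointer fields below */
        struct obj *ptr[];      /* the pointer fields */
    };

    static char *scan, *next;   /* the two fingers into to-space */

    static struct obj *copy(struct obj *o) {
        if (o == NULL) return NULL;
        if (o->forward) return o->forward;  /* already moved */
        struct obj *n = (struct obj *)next;
        memcpy(n, o, o->size);
        next += o->size;
        n->forward = NULL;
        o->forward = n;         /* leave a forwarding address behind */
        return n;
    }

    /* Copy the roots, then advance scan over the copies breadth-first,
       copying whatever they point at; no recursion, no mark stack. */
    void collect(struct obj **root, size_t nroots, char *tospace) {
        scan = next = tospace;
        for (size_t i = 0; i < nroots; i++)
            root[i] = copy(root[i]);
        while (scan < next) {
            struct obj *o = (struct obj *)scan;
            for (size_t i = 0; i < o->nptrs; i++)
                o->ptr[i] = copy(o->ptr[i]);
            scan += o->size;
        }
    }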
Jan V. Garwick,
Data storage in compilers,
BIT,
vol. 4,
1964,
pp. 137-140.
Describes a procedure in Algol 60 to emulate various dynamically sized
arrays in one fixed array.
The arrays are named by numbers, so the store procedure is defined as
store(val): at (i): in array (a); int val, i, a;.
George E. Collins,
A method for overlapping and erasure of lists,
Commun. ACM,
vol. 3,
#12,
Dec. 1960,
pp. 655-657.
Newell, Shaw and Simon [1957] have shown how to make lists share common
data (creating dags in our terminology), and the author points out that
this causes difficulties when a list is freed ("erased").
The problem is solved by introducing reference counts, unknown in 1960,
but with a number of built-in optimizations.
An implementation is described for the IBM 704, 709 and 7090.
These machines have 36-bit words and 15-bit addresses.
Each word accessible by a pointer (a "location word") contains a 21-bit
descriptor and a 15-bit address of the next link.
The descriptor contains 2 type bits and a 15-bit segment (plus 4 unused
bits).
For shared data, these 15 bits contain a pointer to a reference-counted
object; for non-shared data that will not fit in 15 bits, the 15 bits
contain a pointer to the data, and for non-shared data that fits in 15
bits, the 15 bits contain the data.
When non-shared data gets shared, it is endowed with a reference count;
when the reference count drops to 1, the data format reverts to
non-shared.
So the reference count is always greater than 1.