Due 4/20/99
For this problem we will examine the problem of placing structure fields into cache lines. We call this the field placement problem. A structure is composed of a number of fields of different sizes in bytes. For example, a structure foo might consist of 8 fields of sizes 2, 2, 2, 3, 4, 4, 7, 7. A cache line is a segment of memory that is brought into the cache at one time during a cache miss. For example, a cache line might be 16 bytes. Each field must fit within one cache line. When a structure instance is allocated from memory it is advantageous to allocate it the fewest number of cache lines. In our example, the fields can be allocated as 2, 7, 7 on one cache line and 2, 2, 3, 4, 4 on a second cache line. A good algorithm for the field placement problem would be general where it would work whenever the field sizes are any integers f1, f2, ... , fn and the cache line size is any integer b.
1. State the problem of finding the
minimum number of cache lines for structure as a decision problem rather
than an optimization problem.
2. Show that this decision problem
is NP-complete. You may use any of the known NP-complete problems
in your argument.
In reality each field in a structure has an alignment associated it. For example, let a cache line be indexed 0 to c-1. Four byte integers must begin only at positions 0, 4, 8, ..., that is, they have 4 byte alignment. So now our problem changes where each field is given by a pair (f, a) where f is its length and a is its alignment. A field (f,a) is of length f and must begin in at a position in a cache line which is a multiple of a. Extending the above example, assume we have the following fields (2,1), (2,1), (2,2), (3,2), (4,4), (4,4), (7,4), (7,4). The first two fields can begin in any position, the next two in positions that are a multiples of 2, and the last four in positions that are multiples of 4. For this example, the best structure placement requires 3 cache lines. The two (7,4)'s must be on two separate cache lines because the only way they can fit in one cache line is leaving two gaps of 1 (see below). The remaining fields have lengths that add up to 17 requiring two more cache lines.
| x | x | x | x | x | x | x | | x | x | x | x | x | x | x | |
You will have to convince yourself that once the (7,4)'s are on separate cache lines then a third cache line will be needed. Again, assume we have the general problem of fields (f1, a1), (f2, a2),..., (fn, an) and a cache line size b.
3. Show that the problem of determining
if the fields (with alignments) can be placed in a single cache line is
NP-complete. (Hint: 3-partition might be very helpful)
4. (Extra Credit) Assuming that
alignments are always powers of two, field lengths are always positive
multiples of their alignments, and the cache line size is also a power
of two show that there is a polynomial time algorithm for placing the fields
in a single cache line. A kind of greedy algorithm might do the trick.