Design decisions affecting processor cache speeds and sizes From Wikipedia, the free encyclopedia
Cache placement policies are policies that determine where a particular memory block can be placed when it goes into a CPU cache. A block of memory cannot necessarily be placed at an arbitrary location in the cache; it may be restricted to a particular cache line or a set of cache lines[1] by the cache's placement policy.[2][3]
There are three different policies available for placement of a memory block in the cache: direct-mapped, fully associative, and set-associative. Originally this space of cache organizations was described using the term "congruence mapping".[4]
In a direct-mapped cache structure, the cache is organized into multiple sets[1] with a single cache line per set. Each memory block can occupy only one cache line, which is determined by the block's address. The cache can be framed as an n × 1 column matrix.[5]
Consider a main memory of 16 kilobytes, which is organized as 4-byte blocks, and a direct-mapped cache of 256 bytes with a block size of 4 bytes. Because the main memory is 16kB, we need a minimum of 14 bits to uniquely represent a memory address.
Since each cache block is of size 4 bytes, the total number of sets in the cache is 256/4, which equals 64 sets.
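The incoming address to the cache is divided into bits for the offset, index and tag. With 4-byte blocks, the offset takes the 2 least significant bits; with 64 sets, the index takes the next 6 bits; the tag takes the remaining 6 bits of the 14-bit address.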
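The address split for this example can be sketched in a few lines of Python. This is only an illustration: the constant names and the function `split_direct_mapped` are hypothetical, and the sketch assumes the cache and block sizes are powers of two.

```python
# Parameters of the worked example (16 kB memory, 256-byte cache, 4-byte blocks).
ADDRESS_BITS = 14
BLOCK_BYTES = 4
CACHE_BYTES = 256

NUM_SETS = CACHE_BYTES // BLOCK_BYTES                # 64 sets, one line each
OFFSET_BITS = BLOCK_BYTES.bit_length() - 1           # log2(4)  = 2
INDEX_BITS = NUM_SETS.bit_length() - 1               # log2(64) = 6
TAG_BITS = ADDRESS_BITS - INDEX_BITS - OFFSET_BITS   # 14 - 6 - 2 = 6


def split_direct_mapped(address):
    """Return (tag, index, offset) for a 14-bit byte address."""
    offset = address & (BLOCK_BYTES - 1)
    index = (address >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


if __name__ == "__main__":
    for addr in (0x0000, 0x0004, 0x00FF, 0x0100):
        tag, index, offset = split_direct_mapped(addr)
        print(f"{addr:#06x}: tag={tag:06b} index={index:06b} offset={offset:02b}")
```

The index alone selects the cache line, so two addresses with the same index but different tags (for example 0x0000 and 0x0100) compete for the same line.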
Below are memory addresses and an explanation of which cache line they map to:
0x0000 (tag – 0b00_0000, index – 0b00_0000, offset – 0b00) corresponds to block 0 of the memory and maps to set 0 of the cache.
0x0004 (tag – 0b00_0000, index – 0b00_0001, offset – 0b00) corresponds to block 1 of the memory and maps to set 1 of the cache.
0x00FF (tag – 0b00_0000, index – 0b11_1111, offset – 0b11) corresponds to block 63 of the memory and maps to set 63 of the cache.
0x0100 (tag – 0b00_0001, index – 0b00_0000, offset – 0b00) corresponds to block 64 of the memory and maps to set 0 of the cache.
In a fully associative cache, the cache is organized into a single cache set with multiple cache lines. A memory block can occupy any of the cache lines. The cache organization can be framed as a 1 × m row matrix.[5]
Consider a main memory of 16 kilobytes, which is organized as 4-byte blocks, and a fully associative cache of 256 bytes with a block size of 4 bytes. Because the main memory is 16 kB, we need a minimum of 14 bits to uniquely represent a memory address.
The cache has a single set, and because each cache block is 4 bytes, that set contains 256/4 = 64 cache lines.
The incoming address to the cache is divided into bits for the offset and tag only; no index is needed because there is just one set. The offset takes the 2 least significant bits and the tag takes the remaining 12 bits.
Since any block of memory can be mapped to any cache line, the memory block can occupy one of the cache lines based on the replacement policy.
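As a rough illustration, a fully associative lookup can be modelled as a search over all stored tags. The text does not name a replacement policy, so the sketch below assumes least-recently-used (LRU) eviction; the class name `FullyAssociativeCache` and its parameters are hypothetical.

```python
from collections import OrderedDict


class FullyAssociativeCache:
    """Toy model: one set, `num_lines` lines, LRU replacement (an assumption)."""

    def __init__(self, num_lines=64, block_bytes=4):
        self.num_lines = num_lines
        self.offset_bits = block_bytes.bit_length() - 1  # 2 bits for 4-byte blocks
        self.lines = OrderedDict()  # tag -> True, least recently used first

    def access(self, address):
        """Return True on a hit; on a miss, load the block and return False."""
        tag = address >> self.offset_bits  # every bit above the offset is tag
        if tag in self.lines:
            self.lines.move_to_end(tag)    # mark as most recently used
            return True
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[tag] = True
        return False
```

Because there is no index, 0x0000 and 0x0100 never conflict with each other here; a block is evicted only when every line is occupied.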
Set-associative cache is a trade-off between direct-mapped cache and fully associative cache.
A set-associative cache can be imagined as an n × m matrix. The cache is divided into ‘n’ sets and each set contains ‘m’ cache lines. A memory block is first mapped onto a set and then placed into any cache line of the set.
The range of caches from direct-mapped to fully associative is a continuum of levels of set associativity. (A direct-mapped cache is one-way set-associative and a fully associative cache with m cache lines is m-way set-associative.)
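The continuum can be made concrete by computing the geometry of a 256-byte cache with 4-byte blocks at different associativities. The helper `cache_geometry` below is a hypothetical sketch and assumes all sizes are powers of two.

```python
def cache_geometry(cache_bytes, block_bytes, ways, address_bits):
    """Return set count and address field widths for a given associativity."""
    num_lines = cache_bytes // block_bytes
    num_sets = num_lines // ways
    offset_bits = block_bytes.bit_length() - 1   # log2(block size)
    index_bits = num_sets.bit_length() - 1       # log2(number of sets)
    tag_bits = address_bits - index_bits - offset_bits
    return {
        "sets": num_sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
    }


if __name__ == "__main__":
    # 1-way is direct-mapped; 64-way (all lines in one set) is fully associative.
    for ways in (1, 2, 4, 64):
        print(ways, cache_geometry(256, 4, ways, 14))
```

Each doubling of associativity halves the number of sets, moving one bit from the index to the tag.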
Many processor caches in today's designs are either direct-mapped, two-way set-associative, or four-way set-associative.[5]
Consider a main memory of 16 kilobytes, which is organized as 4-byte blocks, and a 2-way set-associative cache of 256 bytes with a block size of 4 bytes. Because the main memory is 16kB, we need a minimum of 14 bits to uniquely represent a memory address.
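Since each cache block is 4 bytes and the cache is 2-way set-associative, the total number of sets in the cache is 256/(4 × 2), which equals 32 sets.
The incoming address to the cache is divided into bits for the offset, index and tag. The offset takes the 2 least significant bits, the index takes the next 5 bits (32 sets), and the tag takes the remaining 7 bits of the 14-bit address.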
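A minimal model of this 2-way cache is sketched below. The replacement policy within a set is assumed to be LRU, which the text does not specify, and the class name `SetAssociativeCache` is hypothetical.

```python
class SetAssociativeCache:
    """Toy model of a set-associative cache with per-set LRU (an assumption)."""

    def __init__(self, cache_bytes=256, block_bytes=4, ways=2):
        self.ways = ways
        self.num_sets = cache_bytes // (block_bytes * ways)   # 32 for the example
        self.offset_bits = block_bytes.bit_length() - 1       # 2
        self.index_bits = self.num_sets.bit_length() - 1      # 5
        # Each set holds up to `ways` tags, least recently used first.
        self.sets = [[] for _ in range(self.num_sets)]

    def split(self, address):
        """Return (tag, index, offset) for a byte address."""
        offset = address & ((1 << self.offset_bits) - 1)
        index = (address >> self.offset_bits) & (self.num_sets - 1)
        tag = address >> (self.offset_bits + self.index_bits)
        return tag, index, offset

    def access(self, address):
        """Return True on a hit; on a miss, load the block and return False."""
        tag, index, _ = self.split(address)
        lines = self.sets[index]
        if tag in lines:
            lines.remove(tag)
            lines.append(tag)      # now the most recently used line in the set
            return True
        if len(lines) == self.ways:
            lines.pop(0)           # evict the least recently used line
        lines.append(tag)
        return False
```

In this model, 0x0000 and 0x0100 both map to set 0 but can reside there together, whereas in the direct-mapped organization they would evict each other.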
Below are memory addresses and an explanation of which set each one maps to:
0x0000 (tag – 0b000_0000, index – 0b0_0000, offset – 0b00) corresponds to block 0 of the memory and maps to set 0 of the cache. The block occupies a cache line in set 0, determined by the replacement policy for the cache.
0x0004 (tag – 0b000_0000, index – 0b0_0001, offset – 0b00) corresponds to block 1 of the memory and maps to set 1 of the cache. The block occupies a cache line in set 1, determined by the replacement policy for the cache.
0x00FF (tag – 0b000_0001, index – 0b1_1111, offset – 0b11) corresponds to block 63 of the memory and maps to set 31 of the cache. The block occupies a cache line in set 31, determined by the replacement policy for the cache.
0x0100 (tag – 0b000_0010, index – 0b0_0000, offset – 0b00) corresponds to block 64 of the memory and maps to set 0 of the cache. The block occupies a cache line in set 0, determined by the replacement policy for the cache.
Other schemes have been suggested, such as the skewed cache,[8] where the index for way 0 is direct, as above, but the index for way 1 is formed with a hash function. A good hash function has the property that addresses which conflict with the direct mapping tend not to conflict when mapped with the hash function, and so it is less likely that a program will suffer from an unexpectedly large number of conflict misses due to a pathological access pattern. The downside is extra latency from computing the hash function.[9] Additionally, when it comes time to load a new line and evict an old line, it may be difficult to determine which existing line was least recently used, because the new line conflicts with data at different indexes in each way; LRU tracking for non-skewed caches is usually done on a per-set basis. Nevertheless, skewed-associative caches have major advantages over conventional set-associative ones.[10]
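The idea of giving each way its own index function can be sketched as follows. The hash used for way 1 (XOR of the index with the low tag bits) is purely an illustrative assumption and is not the function proposed in the cited work; the names `way0_index` and `way1_index` are hypothetical.

```python
OFFSET_BITS = 2            # 4-byte blocks
INDEX_BITS = 5             # 32 entries per way, as in the 2-way example
NUM_SETS = 1 << INDEX_BITS


def way0_index(address):
    """Direct index: the index bits of the address, unchanged."""
    return (address >> OFFSET_BITS) & (NUM_SETS - 1)


def way1_index(address):
    """Hashed index (illustrative): XOR the index with the low tag bits."""
    block = address >> OFFSET_BITS
    index = block & (NUM_SETS - 1)
    tag = block >> INDEX_BITS
    return index ^ (tag & (NUM_SETS - 1))


if __name__ == "__main__":
    for addr in (0x0000, 0x0100, 0x0200):
        print(f"{addr:#06x}: way0 -> {way0_index(addr)}, way1 -> {way1_index(addr)}")
```

With this toy hash, 0x0000 and 0x0100 collide at index 0 in way 0 but land at different indexes in way 1, which is the property the paragraph above describes.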
A true set-associative cache tests all the possible ways simultaneously, using something like a content-addressable memory. A pseudo-associative cache tests each possible way one at a time. A hash-rehash cache and a column-associative cache are examples of a pseudo-associative cache.
In the common case of finding a hit in the first way tested, a pseudo-associative cache is as fast as a direct-mapped cache, but it has a much lower conflict miss rate than a direct-mapped cache, closer to the miss rate of a fully associative cache.[9]
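Sequential probing can be sketched with a toy model that reports how many probes each access needed. The rehash function (flipping the top index bit) and the swap-on-second-hit behaviour are assumptions made for illustration, and the class name `PseudoAssociativeCache` is hypothetical.

```python
class PseudoAssociativeCache:
    """Toy model: a direct-mapped array probed at two locations in turn."""

    def __init__(self, num_lines=64, block_bytes=4):
        self.num_lines = num_lines
        self.offset_bits = block_bytes.bit_length() - 1
        # Each entry stores a full block number (or None), so no tag aliasing.
        self.lines = [None] * num_lines

    def access(self, address):
        """Return (hit, probes) for a byte address."""
        block = address >> self.offset_bits
        first = block % self.num_lines
        second = first ^ (self.num_lines >> 1)   # assumed rehash: flip top index bit
        if self.lines[first] == block:
            return True, 1                       # as fast as a direct-mapped hit
        if self.lines[second] == block:
            # Swap so the next access to this block hits on the first probe.
            self.lines[first], self.lines[second] = (
                self.lines[second], self.lines[first])
            return True, 2
        # Miss: demote the current occupant to the alternate slot, load the new block.
        self.lines[second] = self.lines[first]
        self.lines[first] = block
        return False, 2
```

In this model, 0x0000 and 0x0100 share the same first-probe location, yet both stay resident; the cost is a second probe when the requested block sits in the alternate slot.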