LFU Cache Python Implementation with collections.Counter()

Om Goswami
3 min readJan 20, 2021

Hey everyone, this is article #1, and in this piece, I’m going to be talking about the least frequently used cache algorithm (LFU). This algorithm can be used to manage memory. The cache has a finite amount of slots available for storing data; when all the slots are occupied, the algorithm will clear the block of data that was least frequently accessed and replace it with a new entry. If there are multiple blocks of data that have been accessed the same number of times, the algorithm evicts the block that was accessed less recently. Let’s pause for an example (from binarysearch.com):

arguments = [[3], [1, 10], [1], [2, 20], [3, 30], [4, 40], [3], [2]]
The first argument sets the maximum capacity of the LFUCache
Each pair of numbers is a key-value pair: we enter this pair into the cache
Each individual number prompts a `get` method: the cache returns the value associated with the given key
Output:
[None, None, 10, None, None, None, 30, -1]
Explanation:
We create an LFUCache of capacity 3 (first element in arguments)
Set key of 1 to value 10. Size of cache is now 1
We get 1 which has value of 10
Set key of 2 to value 20. Size of cache is now 2
Set key of 3 to value 30. Size of cache is now 3
Set key of 4 to value 40. Size exceeds capacity, so now we evict the least frequently used key. Since both 2 and 3 are tied for being used the least, we evict the least recently used key which is 2.
We get 3 which has value of 30
We get 2 which has been evicted so we return -1

There are a number of different ways to achieve this, but for me, using Python’s Counter subclass was the most intuitive. Counter stores certain elements as keys and the frequency with which these elements occur as values, which is a perfect fit for this case because we need to track how often each element is accessed so that we can evict the least-accessed element when the cache is full. Furthermore, the Counter subclass allows us to use methods like most_common — these methods will allow us to easily figure out which element needs to be removed.

“A Counter is a dict subclass for counting hashable objects. It is a collection where elements are stored as dictionary keys and their counts are stored as dictionary values.”- Python documentation

So, in the implementation, each key will be mapped to a list of three values: the number of times the key has been accessed, how recently the key has been accessed, and the actual value itself. Since we are using most_common, we don’t want to increase the value each time it is accessed; rather, we will decrease the values each time (Counter allows us to store non-positive values) so that most_common returns the least frequently accessed pair and not the most frequently accessed one. Below is a sample of my code for this implementation: it is simple, with just an __init__, get, and set method in the LFUCache class.

Thanks for reading! Please feel free to send feedback, advice, or corrections my way, I would very much appreciate your input.

P.S. the data structure that most_common is based on is called a heap, and it’s extremely handy for use cases like this one.

--

--

Om Goswami

Hey there! I’m a senior at Evergreen Valley High School, hoping to study CS in college. I also love to play football and basketball (go Warriors!)