Waylon Walker

Posted on • Originally published at waylonwalker.com

How I make cache-keys from python objects

When I need a consistent key for a Python object I often reach for
hashlib.md5. It works for me and the use cases I have.

diskcache

Yesterday we talked about setting up a persistent cache with python diskcache. To make this really work, we need a good way to make consistent cache keys from some sort of python object.




How I setup a sqlite cache in python



hash

does not work

My first thought was to just use the built-in hash on the files, which would give me a unique key for each. This works, and gives you a consistent key, but only within a single Python process. If you start a new interpreter you will get different keys, because Python randomizes string hashes per process unless PYTHONHASHSEED is set.

waylonwalker.com on main [$✘!?] via v5.1.5 v3.8.0 (waylonwalker.com)
❯ ipython

waylonwalker ↪main v3.8.0 ipython
❯ hash("waylonwalker")
-3862245013515310359

waylonwalker ↪main v3.8.0 ipython
❯ hash("waylonwalker")
-3862245013515310359

waylonwalker ↪main v3.8.0 ipython
❯ exit

waylonwalker.com on main [$✘!?] via v5.1.5 v3.8.0 (waylonwalker.com)
❯ ipython

waylonwalker ↪main v3.8.0 ipython
❯ hash("waylonwalker")
-83673051278873734


Here is a snapshot of my terminal showing that you get the same hash within one session, but it changes when you restart ipython.
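To reproduce this without restarting by hand, the sketch below runs hash() in fresh interpreters. It assumes the per-process variation comes from Python's string hash randomization, which the PYTHONHASHSEED environment variable controls. The helper name hash_in_fresh_interpreter is made up for illustration.

```python
import os
import subprocess
import sys


def hash_in_fresh_interpreter(text: str, seed: str) -> int:
    """Run hash(text) in a new Python process with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


# Two different seeds stand in for two separate interpreter sessions.
first = hash_in_fresh_interpreter("waylonwalker", "1")
second = hash_in_fresh_interpreter("waylonwalker", "2")
# Reusing a seed reproduces the same hash, which is why one session is self-consistent.
again = hash_in_fresh_interpreter("waylonwalker", "1")

print(first == second)
print(first == again)
```

Pinning PYTHONHASHSEED would make hash() repeatable, but every process that touches the cache would then need the same setting, which is why a content hash is the easier route.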

hashlib.md5

Here are a couple of quick ipython sessions showing that the md5 hash is consistent across multiple sessions.

waylonwalker.com on main [$✘!?] via v5.1.5 v3.8.0 (waylonwalker.com) on (us-east-1)
❯ ipython

waylonwalker ↪main v3.8.0 ipython
❯ hashlib.md5("waylonwalker")
[PYFLYBY] import hashlib
╭───────────── Traceback (most recent call last) ─────────────╮
│ <ipython-input-1-1537c4473c74>:1 in <module>                │
╰─────────────────────────────────────────────────────────────╯
TypeError: Unicode-objects must be encoded before hashing

waylonwalker ↪main v3.8.0 ipython
❯ hashlib.md5("waylonwalker".encode("utf-8"))
<md5 HASH object @ 0x7fe4ba6832d0>

waylonwalker ↪main v3.8.0 ipython
❯ hashlib.md5("waylonwalker".encode("utf-8")).hexdigest()
'1c7c1073ca096ffdb324471770911fe2'

waylonwalker ↪main v3.8.0 ipython
❯ hashlib.md5("waylonwalker".encode("utf-8")).hexdigest()
'1c7c1073ca096ffdb324471770911fe2'

waylonwalker ↪main v3.8.0 ipython
❯ hashlib.md5("waylonwalker".encode("utf-8")).hexdigest()
'1c7c1073ca096ffdb324471770911fe2'

waylonwalker ↪main v3.8.0 ipython
❯ exit

waylonwalker.com on main [$✘!?] via v5.1.5 v3.8.0 (waylonwalker.com) on (us-east-1) took 47s
❯ ipython

waylonwalker ↪main v3.8.0 ipython
❯ hashlib.md5("waylonwalker".encode("utf-8")).hexdigest()
[PYFLYBY] import hashlib
'1c7c1073ca096ffdb324471770911fe2'
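The same steps fit in a small helper. This is a minimal sketch; the name stable_key is hypothetical and not from the original post. It encodes the string to bytes first, since md5 rejects str input as the TypeError in the session shows.

```python
import hashlib


def stable_key(text: str) -> str:
    """Return the md5 hex digest of text, encoded as UTF-8 bytes."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


key = stable_key("waylonwalker")
print(len(key))                           # md5 hex digests are 32 characters long
print(key == stable_key("waylonwalker"))  # same input, same digest
```

The digest depends only on the bytes passed in, so it does not change between interpreter sessions.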



key for diskcache

Since it is consistent, we can use it as a cache key for diskcache operations. I set up a little function that lets me pass a bunch of different things in to build the key. As long as each object's __str__ method exists and returns the data you want to key on, this will work.

import hashlib

def make_hash(self, *keys: str) -> str:
    # Written as a method, hence `self`; only the values collected in *keys are hashed.
    str_keys = [str(key) for key in keys]
    return hashlib.md5("".join(str_keys).encode("utf-8")).hexdigest()
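One thing to watch with joining on an empty string is that different inputs can collapse into the same text, for example ("ab", "c") and ("a", "bc"). The sketch below shows one possible variation, not the approach from this post. The names make_key and plain_join_key are hypothetical, and the "\x1f" separator is an assumption that the parts never contain that character.

```python
import hashlib


def plain_join_key(*parts: object) -> str:
    """Join the str() of each part with an empty string, then md5 the result."""
    joined = "".join(str(part) for part in parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def make_key(*parts: object) -> str:
    """Same idea, but join with a separator so part boundaries are preserved."""
    # Assumption: "\x1f" (ASCII unit separator) does not occur inside any part.
    joined = "\x1f".join(str(part) for part in parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


print(plain_join_key("ab", "c") == plain_join_key("a", "bc"))  # True, both hash "abc"
print(make_key("ab", "c") == make_key("a", "bc"))              # False, boundaries differ
```

Whether this matters depends on the keys you pass; if they are things like file paths plus a fixed label, the plain join may be perfectly fine.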



understanding python *args and **kwargs



If *args is confusing, I have a full article on *args and **kwargs.

See it in action

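Here you can see it in action across two sessions. Every argument collected in *keys becomes part of the key. Note that because the signature keeps self, calling it as a plain function binds the first argument (1 here) to self, so only the remaining arguments are hashed.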

waylonwalker ↪main v3.8.0 ipython
❯ def make_hash(self, *keys: str) -> str:
...:     str_keys = [str(key) for key in keys]
...:     return hashlib.md5("".join(str_keys).encode("utf-8")).hexdigest()
...:

waylonwalker ↪main v3.8.0 ipython
❯ make_hash(1, "one", "1", 1.0)
'73901d019df012a1cdab826ce301217d'

waylonwalker ↪main v3.8.0 ipython
❯ exit

waylonwalker.com on main [$✘!?] via v5.1.5 v3.8.0 (waylonwalker.com) on (us-east-1) took 19m19s
❯

waylonwalker.com on main [$✘!?] via v5.1.5 v3.8.0 (waylonwalker.com) on (us-east-1)
❯ ipython

waylonwalker ↪main v3.8.0 ipython
❯ def make_hash(self, *keys: str) -> str:
...:     str_keys = [str(key) for key in keys]
...:     return hashlib.md5("".join(str_keys).encode("utf-8")).hexdigest()
[PYFLYBY] import hashlib

waylonwalker ↪main v3.8.0 ipython
❯ make_hash(1, "one", "1", 1.0)
'73901d019df012a1cdab826ce301217d'
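To tie this back to caching, here is a rough sketch of how such a key might be used. It uses a plain-function version of make_hash (without self) and a dict standing in for the cache, so it runs without extra installs. The helper cached_call and the example function slow_square are hypothetical. A diskcache.Cache supports the same `key in cache` and `cache[key]` style of access, so it should be able to take the dict's place.

```python
import hashlib
from typing import Any, Callable, MutableMapping


def make_hash(*keys: object) -> str:
    """Plain-function version: md5 of the str() of every key, joined together."""
    str_keys = [str(key) for key in keys]
    return hashlib.md5("".join(str_keys).encode("utf-8")).hexdigest()


def cached_call(cache: MutableMapping, func: Callable[..., Any], *args: Any) -> Any:
    """Return func(*args), reusing a stored result when the key is already cached."""
    # The function name is part of the key so different functions do not collide.
    key = make_hash(func.__name__, *args)
    if key in cache:
        return cache[key]
    result = func(*args)
    cache[key] = result
    return result


calls = []  # records each real invocation of slow_square


def slow_square(n: int) -> int:
    calls.append(n)
    return n * n


cache: dict = {}  # stand-in for a persistent cache
print(cached_call(cache, slow_square, 4))  # 16, computed
print(cached_call(cache, slow_square, 4))  # 16, served from the cache
print(len(calls))                          # 1, the function only ran once
```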