cache_utils

class catlas.cache_utils.SqliteSingleThreadDict(filename=None, tablename='unnamed', journal_mode='PERSIST', encode=<function dumps>, decode=<built-in function loads>)

A dictionary backed by a SQLite database. It behaves like a dictionary, but its contents live in a file, and a connection to that file is opened whenever an item is accessed or set.

Raises

KeyError – an accessed key didn’t exist in the dictionary.

close()

Close the read-only connection.
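The connection-per-access behavior described for SqliteSingleThreadDict can be illustrated with the standard library alone. The following is a minimal sketch, not the catlas implementation: the class name `TinySqliteDict`, its table layout, and the use of `pickle` for encoding are all assumptions made for illustration.

```python
import os
import pickle
import sqlite3
import tempfile


class TinySqliteDict:
    """Hypothetical stand-in: a dict-like store that opens a short-lived
    SQLite connection for every read or write."""

    def __init__(self, filename, tablename="unnamed"):
        self.filename = filename
        self.tablename = tablename
        self._run(
            f'CREATE TABLE IF NOT EXISTS "{tablename}" '
            "(key TEXT PRIMARY KEY, value BLOB)"
        )

    def _run(self, sql, params=()):
        # Open a connection, run one statement, then close it again.
        conn = sqlite3.connect(self.filename)
        try:
            with conn:  # commits on success, rolls back on error
                return conn.execute(sql, params).fetchall()
        finally:
            conn.close()

    def __getitem__(self, key):
        rows = self._run(
            f'SELECT value FROM "{self.tablename}" WHERE key = ?', (key,)
        )
        if not rows:
            raise KeyError(key)  # mirrors plain dict behavior
        return pickle.loads(rows[0][0])

    def __setitem__(self, key, value):
        self._run(
            f'REPLACE INTO "{self.tablename}" (key, value) VALUES (?, ?)',
            (key, pickle.dumps(value)),
        )


path = os.path.join(tempfile.mkdtemp(), "cache.sqlite")
store = TinySqliteDict(path)
store["energy"] = {"value": -1.5}
print(store["energy"])
```

Opening a fresh connection for each operation avoids sharing a single connection object across threads or processes, at the cost of some overhead per access.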

catlas.cache_utils.better_build_func_identifier(func)

Build a roughly unique identifier for the cached function.

Parameters

func (Callable) – a function to cache

Returns

a tuple of components identifying a function based on its code and location in the import hierarchy

Return type

tuple[str]
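An identifier built from a function's code and its place in the import hierarchy might look like the sketch below. This is an illustration under assumptions, not the body of better_build_func_identifier: the helper name `sketch_func_identifier` and the choice of components (module, qualified name, source text) are hypothetical.

```python
import inspect


def sketch_func_identifier(func):
    """Hypothetical helper: return (module, qualified name, source code)."""
    try:
        source = inspect.getsource(func)
    except (OSError, TypeError):
        # Source is unavailable for builtins or interactively defined code.
        source = ""
    return (func.__module__, func.__qualname__, source)


def add(a, b):
    return a + b


identifier = sketch_func_identifier(add)
print(identifier[:2])
```

Including the source text means that editing the function body changes the identifier, so results cached for an older version of the code are not reused by mistake.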

catlas.cache_utils.check_cache(cached_func)

Check whether a cached function is safe to call without overriding its cache (adapted from https://github.com/joblib/joblib/blob/7742f5882273889f7aaf1d483a8a1c72a97d57e3/joblib/memory.py#L672).

Parameters

cached_func (Callable) – function to check the cache for

Returns

True if cached function is safe to call, else False

Return type

bool

catlas.cache_utils.get_cached_func_location(func)

Find the location inside your <cache>/joblib/ folder where a cached function is stored. This is necessary because each function will have multiple subcaches for its codebase.

Parameters

func (Callable) – a function that has been cached

Returns

the path where the input function is stored

Return type

str

catlas.cache_utils.hash_func(func)

Hash the function id, its file location, and the function code.

Parameters

func (Callable) – a function to cache

Returns

a hash uniquely identifying the function

Return type

str
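Hashing a function's identity, file location, and code can be sketched with `hashlib`. The helper below is hypothetical and makes its own choices (SHA-256, and compiled bytecode as a stand-in for "the function code"); hash_func may combine these pieces differently.

```python
import hashlib


def sketch_hash_func(func):
    """Hypothetical helper: hash the name, file location, and bytecode."""
    code = func.__code__
    parts = (
        func.__module__,
        func.__qualname__,
        code.co_filename,     # file in which the function was defined
        code.co_code.hex(),   # bytecode only; constants are not included
    )
    digest = hashlib.sha256()
    for part in parts:
        digest.update(part.encode("utf-8"))
        digest.update(b"\0")  # separator so adjacent parts cannot blur together
    return digest.hexdigest()


def add_one(x):
    return x + 1


def double(x):
    return x * 2


print(sketch_hash_func(add_one) == sketch_hash_func(add_one))
print(sketch_hash_func(add_one) == sketch_hash_func(double))
```

Because bytecode omits constants, two functions that differ only in a literal value would collide in this sketch; hashing the source text instead avoids that.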

catlas.cache_utils.naive_func_identifier(func)

Build a simple identifier based on the function name.

Parameters

func (Callable) – a function to cache

Returns

a string identifying the input function based on its location in the import hierarchy

Return type

str

catlas.cache_utils.sqlitedict_memoize(folder, func, ignore=(), coerce_mmap=False, shard_digits=2)

Cache functions in a way that splits cached functions between many folders.

Parameters
  • folder (str) – file location where the cache should be created

  • func (Callable) – function to cache

  • ignore (tuple[str], optional) – names of arguments that are ignored when determining whether to start a new cache. Defaults to ().

  • coerce_mmap (bool, optional) – if True, don’t distinguish between numpy ndarrays and numpy memmaps. Defaults to False.

  • shard_digits (int, optional) – number of hexadecimal digits used for sharding; 16^shard_digits different folders are generated to store functions in. Defaults to 2.

Returns

cached function

Return type

Callable
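The relationship between shard_digits and the number of folders can be illustrated as follows. This is a sketch of one plausible sharding scheme, not the logic inside sqlitedict_memoize: the helper `shard_folder` and the idea of taking a prefix of a SHA-256 digest are assumptions.

```python
import hashlib
import os


def shard_folder(root, key, shard_digits=2):
    """Hypothetical helper: pick a subfolder from the first hex digits of a hash."""
    digest = hashlib.sha256(repr(key).encode("utf-8")).hexdigest()
    # Each hex digit has 16 possible values, so a prefix of `shard_digits`
    # digits selects one of 16 ** shard_digits folders.
    return os.path.join(root, digest[:shard_digits])


folders = {shard_folder("cache", i) for i in range(5000)}
print(len(folders) <= 16 ** 2)
```

With the default of 2 digits there are at most 256 folders, which keeps any single directory or database file from growing as large as it would with a single shared cache location.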

catlas.cache_utils.token(config) → str

Generate a unique config identifier. Taken from Stack Overflow 45674572.

Parameters

config (dict) – a catlas input config

Returns

A hex token identifying the config.

Return type

str
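One common way to derive a hex token from a config dictionary is to serialize it deterministically and hash the result. The sketch below shows that general pattern only; the helper name `config_token`, the JSON serialization, and SHA-256 are assumptions and may differ from what token does.

```python
import hashlib
import json


def config_token(config):
    """Hypothetical helper: hash a deterministic JSON dump of the config."""
    # sort_keys makes the dump independent of key insertion order;
    # default=str is a fallback for values JSON cannot encode natively.
    payload = json.dumps(config, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


a = config_token({"adsorbate": "CO", "steps": 3})
b = config_token({"steps": 3, "adsorbate": "CO"})
print(a == b)
```

Sorting the keys before hashing means two configs with the same contents produce the same token regardless of the order in which their entries were written.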