pudl.workspace.resource_cache module

Implementations of datastore resource caches.

class pudl.workspace.resource_cache.AbstractCache(read_only: bool = False)

Bases: abc.ABC

Defines the interface for the generic resource caching layer.

abstract add(resource: pudl.workspace.resource_cache.PudlResourceKey, content: bytes) → None

Adds a resource to the cache and sets its content.

abstract contains(resource: pudl.workspace.resource_cache.PudlResourceKey) → bool

Returns True if the resource is present in the cache.

abstract delete(resource: pudl.workspace.resource_cache.PudlResourceKey) → None

Removes the resource from cache.

abstract get(resource: pudl.workspace.resource_cache.PudlResourceKey) → bytes

Retrieves the content of the given resource, or raises KeyError if it is not cached.

is_read_only() → bool

Returns True if the cache is read-only and should not be modified.
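The interface above can be illustrated with a small in-memory implementation. This is only a sketch: ResourceKey and CacheInterface are hypothetical stand-ins that mirror the documented signatures so the example runs without PUDL installed, and silently skipping writes on a read-only cache is an assumption, not documented behavior.

```python
from abc import ABC, abstractmethod
from typing import Dict, NamedTuple


class ResourceKey(NamedTuple):
    """Hypothetical stand-in for PudlResourceKey (same three fields)."""

    dataset: str
    doi: str
    name: str


class CacheInterface(ABC):
    """Hypothetical stand-in mirroring the AbstractCache methods."""

    def __init__(self, read_only: bool = False):
        self._read_only = read_only

    @abstractmethod
    def add(self, resource: ResourceKey, content: bytes) -> None: ...

    @abstractmethod
    def get(self, resource: ResourceKey) -> bytes: ...

    @abstractmethod
    def delete(self, resource: ResourceKey) -> None: ...

    @abstractmethod
    def contains(self, resource: ResourceKey) -> bool: ...

    def is_read_only(self) -> bool:
        return self._read_only


class InMemoryCache(CacheInterface):
    """Dictionary-backed cache, useful for tests."""

    def __init__(self, read_only: bool = False):
        super().__init__(read_only=read_only)
        self._store: Dict[ResourceKey, bytes] = {}

    def add(self, resource: ResourceKey, content: bytes) -> None:
        # Assumption: writes to a read-only cache are ignored.
        if not self.is_read_only():
            self._store[resource] = content

    def get(self, resource: ResourceKey) -> bytes:
        # A missing key raises KeyError, matching the documented contract.
        return self._store[resource]

    def delete(self, resource: ResourceKey) -> None:
        if not self.is_read_only():
            self._store.pop(resource, None)

    def contains(self, resource: ResourceKey) -> bool:
        return resource in self._store


# The DOI and file name below are placeholders, not real archive identifiers.
key = ResourceKey("example", "10.0000/placeholder", "data.zip")
cache = InMemoryCache()
cache.add(key, b"payload")
```

Because every backend exposes the same four operations, calling code can swap a local, remote, or in-memory cache without changes.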

class pudl.workspace.resource_cache.GoogleCloudStorageCache(gcs_path: str, **kwargs: Any)

Bases: pudl.workspace.resource_cache.AbstractCache

Implements a file cache backed by a Google Cloud Storage bucket.

add(resource: pudl.workspace.resource_cache.PudlResourceKey, value: bytes)

Adds the resource to the cache with the given value, or updates it if it is already present.

contains(resource: pudl.workspace.resource_cache.PudlResourceKey) → bool

Returns True if resource is present in the cache.

delete(resource: pudl.workspace.resource_cache.PudlResourceKey)

Deletes resource from the cache.

get(resource: pudl.workspace.resource_cache.PudlResourceKey) → bytes

Retrieves value associated with given resource.

class pudl.workspace.resource_cache.LayeredCache(*caches: List[pudl.workspace.resource_cache.AbstractCache], **kwargs: Any)

Bases: pudl.workspace.resource_cache.AbstractCache

Implements a multi-layered system of caches.

Layers are consulted in order, so fast local caches can sit in front of more remote or more expensive caches, which are accessed only when the content is missing from the closer layers.

Only the closest layer is written to (add, delete); all remaining layers are only read from (get).
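The fall-back behavior described above can be sketched as follows. This is not the PUDL implementation: DictLayer and LayeredSketch are hypothetical names, keys are plain strings for brevity, and treating "closest layer" as the first layer that is not read-only is an interpretation of the method descriptions in this section.

```python
from typing import Dict, List


class DictLayer:
    """Hypothetical in-memory layer standing in for any cache backend."""

    def __init__(self, read_only: bool = False):
        self.read_only = read_only
        self.store: Dict[str, bytes] = {}


class LayeredSketch:
    """Looks resources up layer by layer; writes go to one layer only."""

    def __init__(self, *layers: DictLayer):
        self._layers: List[DictLayer] = list(layers)

    def add_cache_layer(self, layer: DictLayer) -> None:
        # Appended last, so it has the lowest priority.
        self._layers.append(layer)

    def num_layers(self) -> int:
        return len(self._layers)

    def _closest_writable(self):
        # Assumption: the write target is the first non-read-only layer.
        for layer in self._layers:
            if not layer.read_only:
                return layer
        return None

    def get(self, key: str) -> bytes:
        # The first layer that holds the key wins.
        for layer in self._layers:
            if key in layer.store:
                return layer.store[key]
        raise KeyError(key)

    def contains(self, key: str) -> bool:
        return any(key in layer.store for layer in self._layers)

    def add(self, key: str, value: bytes) -> None:
        target = self._closest_writable()
        if target is not None:
            target.store[key] = value

    def is_optimally_cached(self, key: str) -> bool:
        target = self._closest_writable()
        return target is not None and key in target.store


local = DictLayer()
remote = DictLayer(read_only=True)
remote.store["a"] = b"remote bytes"  # pretend the remote already has it

layered = LayeredSketch(local)
layered.add_cache_layer(remote)

content = layered.get("a")            # served by the remote layer
if not layered.is_optimally_cached("a"):
    layered.add("a", content)         # promote into the local layer
```

The promotion step at the end is the typical reason to ask whether a resource is optimally cached: after one remote read, later reads are served locally.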

add(resource: pudl.workspace.resource_cache.PudlResourceKey, value)

Adds the resource to the cache with the given value, replacing any existing entry.

add_cache_layer(cache: pudl.workspace.resource_cache.AbstractCache)

Adds a caching layer. Its priority is below that of all existing layers.

contains(resource: pudl.workspace.resource_cache.PudlResourceKey) → bool

Returns True if resource is present in the cache.

delete(resource: pudl.workspace.resource_cache.PudlResourceKey)

Removes the resource from the cache, unless the cache is in read_only mode.

get(resource: pudl.workspace.resource_cache.PudlResourceKey) → bytes

Returns the content of the given resource.

is_optimally_cached(resource: pudl.workspace.resource_cache.PudlResourceKey) → bool

Returns True if the resource is contained in the closest write-enabled layer.

num_layers()

Returns the number of caching layers in this LayeredCache.

class pudl.workspace.resource_cache.LocalFileCache(cache_root_dir: pathlib.Path, **kwargs: Any)

Bases: pudl.workspace.resource_cache.AbstractCache

Simple key-value store mapping PudlResourceKey objects to bytes content.

add(resource: pudl.workspace.resource_cache.PudlResourceKey, content: bytes)

Adds the resource to the cache with the given content, or updates it if it is already present.

contains(resource: pudl.workspace.resource_cache.PudlResourceKey) → bool

Returns True if resource is present in the cache.

delete(resource: pudl.workspace.resource_cache.PudlResourceKey)

Deletes resource from the cache.

get(resource: pudl.workspace.resource_cache.PudlResourceKey) → bytes

Retrieves value associated with a given resource.
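A file-backed key-value store of this shape can be sketched with pathlib alone. Everything here is illustrative: FileCacheSketch and ResourceKey are hypothetical stand-ins, the directory layout produced by get_local_path is an assumption, and mapping a missing file to KeyError follows the AbstractCache contract rather than anything stated for LocalFileCache itself.

```python
import tempfile
from pathlib import Path
from typing import NamedTuple


class ResourceKey(NamedTuple):
    """Hypothetical stand-in for PudlResourceKey."""

    dataset: str
    doi: str
    name: str

    def get_local_path(self) -> Path:
        # Assumed layout: <dataset>/<doi with slashes replaced>/<name>.
        return Path(self.dataset) / self.doi.replace("/", "-") / self.name


class FileCacheSketch:
    """Stores each resource as one file under cache_root_dir."""

    def __init__(self, cache_root_dir: Path):
        self.cache_root_dir = Path(cache_root_dir)

    def _path(self, resource: ResourceKey) -> Path:
        return self.cache_root_dir / resource.get_local_path()

    def add(self, resource: ResourceKey, content: bytes) -> None:
        path = self._path(resource)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(content)  # overwrites an existing file

    def get(self, resource: ResourceKey) -> bytes:
        try:
            return self._path(resource).read_bytes()
        except FileNotFoundError as err:
            raise KeyError(resource) from err

    def delete(self, resource: ResourceKey) -> None:
        self._path(resource).unlink(missing_ok=True)  # Python 3.8+

    def contains(self, resource: ResourceKey) -> bool:
        return self._path(resource).exists()


# Placeholder identifiers, not a real archive.
key = ResourceKey("example", "10.0000/placeholder", "data.zip")
with tempfile.TemporaryDirectory() as tmp:
    cache = FileCacheSketch(Path(tmp))
    cache.add(key, b"payload")
    round_trip = cache.get(key)
```

Keeping the path relative inside the key and joining it to a root directory in the cache lets the same key address a resource in any backend.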

class pudl.workspace.resource_cache.PudlResourceKey(dataset: str, doi: str, name: str)

Bases: tuple

Uniquely identifies a specific resource.

dataset: str

Alias for field number 0

doi: str

Alias for field number 1

get_local_path() → pathlib.Path

Returns (relative) path that should be used when caching this resource.

name: str

Alias for field number 2
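The "Alias for field number N" entries reflect that the key is a named tuple: each field can be read by name or by position. A minimal sketch using a hypothetical stand-in class with the same three fields (the values are placeholders):

```python
from typing import NamedTuple


class ResourceKey(NamedTuple):
    """Hypothetical stand-in with the same fields as PudlResourceKey."""

    dataset: str
    doi: str
    name: str


key = ResourceKey(dataset="example", doi="10.0000/placeholder", name="data.zip")

# Named access and positional access refer to the same values.
same_field = key.dataset == key[0] and key.doi == key[1] and key.name == key[2]

# Tuples unpack in field order.
dataset, doi, name = key

# Named tuples are hashable and compare by value, so they work as dict keys.
index = {key: b"payload"}
lookup = index[ResourceKey("example", "10.0000/placeholder", "data.zip")]
```

Value-based equality is what allows two independently constructed keys to address the same cached resource.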