pudl.workspace.resource_cache¶
Implementations of datastore resource caches.
Attributes¶
Classes¶
PudlResourceKey | Uniquely identifies a specific resource.
AbstractCache | Defines the interface for the generic resource caching layer.
LocalFileCache | Simple key-value store mapping PudlResourceKeys to ByteIO contents.
GoogleCloudStorageCache | Implements a file cache backed by a Google Cloud Storage bucket.
LayeredCache | Implements a multi-layered system of caches.
Functions¶
extend_gcp_retry_predicate(predicate, *exception_types) | Extend a GCS predicate function with additional exception_types.
Module Contents¶
- pudl.workspace.resource_cache.extend_gcp_retry_predicate(predicate, *exception_types)[source]¶
Extend a GCS predicate function with additional exception_types.
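The idea of extending a retry predicate can be illustrated without any GCS dependency. The following is a minimal sketch, not the module's implementation: it assumes a predicate is any callable that takes an exception and returns a bool, and the names `extend_predicate`, `base_predicate` and `should_retry` are hypothetical.

```python
from typing import Callable

# Assumption: a "predicate" is a callable deciding whether an exception is retryable.
Predicate = Callable[[BaseException], bool]


def extend_predicate(
    predicate: Predicate, *exception_types: type[BaseException]
) -> Predicate:
    """Return a predicate that also accepts the given exception types."""

    def extended(exc: BaseException) -> bool:
        # Keep the original decision, and additionally accept the new types.
        return predicate(exc) or isinstance(exc, exception_types)

    return extended


def base_predicate(exc: BaseException) -> bool:
    # Hypothetical base rule: only connection problems are retryable.
    return isinstance(exc, ConnectionError)


should_retry = extend_predicate(base_predicate, TimeoutError, KeyError)
```

Here `should_retry` accepts everything `base_predicate` accepted, plus `TimeoutError` and `KeyError`, while unrelated exceptions such as `ValueError` are still rejected.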
- class pudl.workspace.resource_cache.PudlResourceKey[source]¶
Bases: NamedTuple
Uniquely identifies a specific resource.
- get_local_path() pathlib.Path [source]¶
Returns (relative) path that should be used when caching this resource.
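A NamedTuple key with a relative-path helper might look like the sketch below. The fields of PudlResourceKey are not listed on this page, so the class name `ExampleResourceKey` and its fields `dataset`, `version` and `name` are assumptions made purely for illustration, as is the directory layout.

```python
from pathlib import Path
from typing import NamedTuple


class ExampleResourceKey(NamedTuple):
    """Hypothetical key; the field names are illustrative assumptions."""

    dataset: str
    version: str
    name: str

    def get_local_path(self) -> Path:
        # Build a relative path so that any cache root can be prepended.
        return Path(self.dataset) / self.version / self.name


key = ExampleResourceKey("example-dataset", "v1", "data.zip")
relative_path = key.get_local_path()
```

Because a NamedTuple is immutable and hashable, such a key can be used directly as a dictionary key, and the relative path can be joined onto a local directory or a bucket prefix.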
- class pudl.workspace.resource_cache.AbstractCache(read_only: bool = False)[source]¶
Bases: abc.ABC
Defines the interface for the generic resource caching layer.
- abstract get(resource: PudlResourceKey) bytes [source]¶
Retrieves the content of the given resource, or raises KeyError.
- abstract add(resource: PudlResourceKey, content: bytes) None [source]¶
Adds resource to the cache and sets the content.
- abstract delete(resource: PudlResourceKey) None [source]¶
Removes the resource from cache.
- abstract contains(resource: PudlResourceKey) bool [source]¶
Returns True if the resource is present in the cache.
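To make the four-method contract concrete, here is a self-contained sketch of an abstract cache and an in-memory subclass. `SketchCache` and `InMemoryCache` are hypothetical names, plain strings stand in for resource keys, and the choice to silently ignore writes in read-only mode is an assumption rather than documented behavior.

```python
from abc import ABC, abstractmethod


class SketchCache(ABC):
    """Hypothetical stand-in for the get/add/delete/contains interface."""

    def __init__(self, read_only: bool = False):
        self._read_only = read_only

    @abstractmethod
    def get(self, resource) -> bytes: ...

    @abstractmethod
    def add(self, resource, content: bytes) -> None: ...

    @abstractmethod
    def delete(self, resource) -> None: ...

    @abstractmethod
    def contains(self, resource) -> bool: ...


class InMemoryCache(SketchCache):
    """Dictionary-backed cache used only for illustration."""

    def __init__(self, read_only: bool = False):
        super().__init__(read_only=read_only)
        self._store: dict = {}

    def get(self, resource) -> bytes:
        return self._store[resource]  # dict lookup raises KeyError on a miss

    def add(self, resource, content: bytes) -> None:
        if not self._read_only:  # assumption: read-only caches ignore writes
            self._store[resource] = content

    def delete(self, resource) -> None:
        if not self._read_only:
            self._store.pop(resource, None)

    def contains(self, resource) -> bool:
        return resource in self._store


cache = InMemoryCache()
cache.add("example-key", b"payload")
frozen = InMemoryCache(read_only=True)
frozen.add("example-key", b"payload")  # ignored in this sketch
```

Any subclass that leaves one of the four abstract methods undefined cannot be instantiated, which is what makes the ABC useful as a contract for the concrete caches.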
- class pudl.workspace.resource_cache.LocalFileCache(cache_root_dir: pathlib.Path, **kwargs: Any)[source]¶
Bases: AbstractCache
Simple key-value store mapping PudlResourceKeys to ByteIO contents.
- _resource_path(resource: PudlResourceKey) pathlib.Path [source]¶
- get(resource: PudlResourceKey) bytes [source]¶
Retrieves value associated with a given resource.
- add(resource: PudlResourceKey, content: bytes)[source]¶
Adds (or updates) resource to the cache with given value.
- delete(resource: PudlResourceKey)[source]¶
Deletes resource from the cache.
- contains(resource: PudlResourceKey) bool [source]¶
Returns True if resource is present in the cache.
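A file-backed cache of this shape can be sketched with pathlib alone. This is an illustration under stated assumptions, not the LocalFileCache source: `FileCacheSketch` is a hypothetical class, it takes a relative path directly instead of a PudlResourceKey, and translating a missing file into KeyError is an assumption chosen to match the abstract get() contract.

```python
import tempfile
from pathlib import Path


class FileCacheSketch:
    """Hypothetical cache storing each resource as a file under a root directory."""

    def __init__(self, cache_root_dir: Path):
        self.cache_root_dir = Path(cache_root_dir)

    def _resource_path(self, relative_path: Path) -> Path:
        return self.cache_root_dir / relative_path

    def get(self, relative_path: Path) -> bytes:
        try:
            return self._resource_path(relative_path).read_bytes()
        except FileNotFoundError as err:
            # Assumption: report a cache miss as KeyError.
            raise KeyError(str(relative_path)) from err

    def add(self, relative_path: Path, content: bytes) -> None:
        path = self._resource_path(relative_path)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(content)  # overwrites any existing content

    def delete(self, relative_path: Path) -> None:
        self._resource_path(relative_path).unlink(missing_ok=True)

    def contains(self, relative_path: Path) -> bool:
        return self._resource_path(relative_path).exists()


with tempfile.TemporaryDirectory() as tmp:
    cache = FileCacheSketch(Path(tmp))
    key_path = Path("example-dataset") / "v1" / "data.zip"
    cache.add(key_path, b"payload")
    round_trip = cache.get(key_path)
    cache.delete(key_path)
    still_there = cache.contains(key_path)
```

Creating parent directories on add() means callers never need to prepare the directory tree that the relative path implies.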
- class pudl.workspace.resource_cache.GoogleCloudStorageCache(gcs_path: str, **kwargs: Any)[source]¶
Bases: AbstractCache
Implements a file cache backed by a Google Cloud Storage bucket.
- _blob(resource: PudlResourceKey) google.cloud.storage.blob.Blob [source]¶
Retrieve Blob object associated with given resource.
- get(resource: PudlResourceKey) bytes [source]¶
Retrieves value associated with given resource.
- add(resource: PudlResourceKey, value: bytes)[source]¶
Adds (or updates) resource to the cache with given value.
- delete(resource: PudlResourceKey)[source]¶
Deletes resource from the cache.
- contains(resource: PudlResourceKey) bool [source]¶
Returns True if resource is present in the cache.
- class pudl.workspace.resource_cache.LayeredCache(*caches: list[AbstractCache], **kwargs: Any)[source]¶
Bases: AbstractCache
Implements a multi-layered system of caches.
This allows building a multi-layered system of caches. The idea is that faster local caches are consulted first, with fall-back to more remote or expensive caches that can be accessed when content is missing.
Only the closest layer is written to (add, delete), while all remaining layers are only read from (get).
- _caches: list[AbstractCache][source]¶
- add_cache_layer(cache: AbstractCache)[source]¶
Adds a caching layer.
The new layer has lower priority than all existing layers.
- get(resource: PudlResourceKey) bytes [source]¶
Returns content of a given resource.
- add(resource: PudlResourceKey, value)[source]¶
Adds (or replaces) resource into the cache with given value.
- delete(resource: PudlResourceKey)[source]¶
Removes the resource from the cache unless the cache is in read_only mode.
- contains(resource: PudlResourceKey) bool [source]¶
Returns True if resource is present in the cache.
- is_optimally_cached(resource: PudlResourceKey) bool [source]¶
Return True if resource is contained in the closest write-enabled layer.
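The layering behavior described for LayeredCache can be sketched as follows. This is a simplified illustration, not the class's source: `DictLayer` and `LayeredSketch` are hypothetical, plain strings stand in for resource keys, and the first layer is assumed to be the closest write-enabled one.

```python
class DictLayer:
    """Hypothetical in-memory layer exposing get/add/delete/contains."""

    def __init__(self, initial=None):
        self._store = dict(initial or {})

    def get(self, key):
        return self._store[key]

    def add(self, key, value):
        self._store[key] = value

    def delete(self, key):
        self._store.pop(key, None)

    def contains(self, key):
        return key in self._store


class LayeredSketch:
    """Reads fall through the layers in order; writes go to the first layer only."""

    def __init__(self, *layers):
        self._layers = list(layers)

    def add_cache_layer(self, layer):
        self._layers.append(layer)  # appended last, so lowest priority

    def get(self, key):
        for layer in self._layers:
            if layer.contains(key):
                return layer.get(key)
        raise KeyError(key)

    def add(self, key, value):
        if self._layers:
            self._layers[0].add(key, value)

    def delete(self, key):
        if self._layers:
            self._layers[0].delete(key)

    def contains(self, key):
        return any(layer.contains(key) for layer in self._layers)

    def is_optimally_cached(self, key):
        return bool(self._layers) and self._layers[0].contains(key)


local = DictLayer()
remote = DictLayer({"example-key": b"payload"})
cache = LayeredSketch(local)
cache.add_cache_layer(remote)

before = cache.is_optimally_cached("example-key")
cache.add("example-key", cache.get("example-key"))  # copy into the closest layer
after = cache.is_optimally_cached("example-key")
```

A typical pattern is the one in the last lines: on a hit in a remote layer, the content is copied into the closest layer so that later reads are served locally.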