pudl.workspace.setup

Tools for setting up and managing PUDL workspaces.

Module Contents

Classes

MissingPath

Validates potential path that doesn't exist.

PudlPaths

These settings provide access to various PUDL directories.

Functions

init([clobber])

Set up a new PUDL working environment based on the user settings.

deploy(→ None)

Deploy all files from a package_data directory into a workspace.

Attributes

pudl.workspace.setup.logger

class pudl.workspace.setup.MissingPath

Bases: pathlib.Path

Validates potential path that doesn’t exist.

classmethod __get_validators__() → Any

Validates that path doesn’t exist and is path-like.

classmethod validate(value: pathlib.Path) → pathlib.Path

Validates that path doesn’t exist.
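The check described for validate can be illustrated without pydantic. The following is a minimal sketch only: validate_missing is a hypothetical stand-in, and raising ValueError is an assumption, since this page does not state which exception the real validator raises.

```python
import tempfile
from pathlib import Path


def validate_missing(value: Path) -> Path:
    """Return the path unchanged if nothing exists there, otherwise raise.

    Hypothetical stand-in for MissingPath.validate; ValueError is an assumption.
    """
    path = Path(value)
    if path.exists():
        raise ValueError(f"path already exists: {path}")
    return path


with tempfile.TemporaryDirectory() as tmp:
    missing = Path(tmp) / "not_created_yet"
    # Accepted: nothing exists at this location yet.
    print(validate_missing(missing) == missing)
    try:
        # Rejected: the temporary directory itself exists.
        validate_missing(Path(tmp))
    except ValueError as err:
        print(f"rejected: {err}")
```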

pudl.workspace.setup.PotentialDirectoryPath

class pudl.workspace.setup.PudlPaths

Bases: pydantic.BaseSettings

These settings provide access to various PUDL directories.

It is configured primarily via the PUDL_INPUT and PUDL_OUTPUT environment variables; the other relevant paths are derived from these two.

class Config

Pydantic config; reads settings from a .env file.

env_file = '.env'

property input_dir: pathlib.Path

Path to PUDL input directory.

property output_dir: pathlib.Path

Path to PUDL output directory.

property settings_dir: pathlib.Path

Path to directory containing settings files.

property data_dir: pathlib.Path

Path to PUDL data directory.

property pudl_db: pathlib.Path

Returns the URL of the locally stored PUDL SQLite database.

pudl_input: PotentialDirectoryPath

pudl_output: PotentialDirectoryPath

sqlite_db(name: str) → str

Returns the URL of the locally stored PUDL SQLite database with the given name.

The name is expected to be the database name without the .sqlite suffix, for example pudl or ferc1.
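To show how a bare name such as ferc1 could map to a database URL, here is a minimal sketch. It assumes the file lives directly in the output directory and that the URL uses the SQLAlchemy-style sqlite:/// prefix; neither detail is confirmed on this page, and sqlite_url is a hypothetical helper, not part of PUDL.

```python
from pathlib import PurePath, PurePosixPath


def sqlite_url(output_dir: PurePath, name: str) -> str:
    """Build a SQLAlchemy-style URL for <output_dir>/<name>.sqlite.

    Hypothetical stand-in for PudlPaths.sqlite_db; the URL scheme and the
    file location are assumptions, not taken from the library.
    """
    db_path = output_dir / f"{name}.sqlite"
    return f"sqlite:///{db_path}"


# PurePosixPath keeps the example independent of the host operating system.
print(sqlite_url(PurePosixPath("/data/pudl_output"), "ferc1"))
# sqlite:////data/pudl_output/ferc1.sqlite
```

The fourth slash comes from the absolute path itself: the prefix contributes three and the path begins with one.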

output_file(filename: str) → pathlib.Path

Path to file in PUDL output directory.

static set_path_overrides(input_dir: str | None = None, output_dir: str | None = None) → None

Set PUDL_INPUT and/or PUDL_OUTPUT env variables.

Parameters:
  • input_dir – If set, overrides the PUDL_INPUT environment variable.

  • output_dir – If set, overrides the PUDL_OUTPUT environment variable.
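The override behaviour can be sketched with the standard library alone. In this illustration apply_path_overrides is a hypothetical stand-in for set_path_overrides, and treating "set" as "not None" is an assumption; the example directory is likewise made up.

```python
import os
from typing import Optional


def apply_path_overrides(
    input_dir: Optional[str] = None, output_dir: Optional[str] = None
) -> None:
    """Write the given directories into PUDL_INPUT and/or PUDL_OUTPUT.

    Hypothetical stand-in for PudlPaths.set_path_overrides. An argument left
    as None leaves the corresponding environment variable untouched.
    """
    if input_dir is not None:
        os.environ["PUDL_INPUT"] = input_dir
    if output_dir is not None:
        os.environ["PUDL_OUTPUT"] = output_dir


# Override only the output location; PUDL_INPUT keeps whatever value it had.
apply_path_overrides(output_dir="/tmp/pudl_output_example")
print(os.environ["PUDL_OUTPUT"])
```

Because the change is made through environment variables, it only affects the current process and any child processes started afterwards.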

pudl.workspace.setup.init(clobber=False)

Set up a new PUDL working environment based on the user settings.

Parameters:

clobber (bool) – If True, replace existing files. If False (the default), do not replace them.

Returns:

None

pudl.workspace.setup.deploy(pkg_path: str, deploy_dir: pathlib.Path, ignore_files: list[str], clobber: bool = False) → None

Deploy all files from a package_data directory into a workspace.

Parameters:
  • pkg_path – Dotted module path to the subpackage inside of package_data containing the resources to be deployed.

  • deploy_dir – Directory on the filesystem to which the files within pkg_path should be deployed.

  • ignore_files – List of filenames (strings) that may be present in the pkg_path subpackage, but that should be ignored.

  • clobber – If True, replace existing copies of the files that are being deployed from pkg_path to deploy_dir. If False (the default), do not replace existing files.

Returns:

None
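The interaction between ignore_files and clobber can be sketched as follows. This is not the library's implementation: deploy_sketch is a hypothetical helper that copies from an ordinary directory rather than resolving a dotted package path, and creating deploy_dir when it is missing is an assumption.

```python
import shutil
import tempfile
from pathlib import Path


def deploy_sketch(
    src_dir: Path, deploy_dir: Path, ignore_files: list[str], clobber: bool = False
) -> None:
    """Copy files from src_dir into deploy_dir, honouring ignore_files and clobber.

    Hypothetical stand-in for deploy(); src_dir is a plain filesystem directory.
    """
    deploy_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(src_dir.iterdir()):
        if not src.is_file() or src.name in ignore_files:
            continue  # skip subdirectories and explicitly ignored names
        dest = deploy_dir / src.name
        if dest.exists() and not clobber:
            continue  # keep the existing copy unless clobber is requested
        shutil.copy(src, dest)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "package_data"
    dst = Path(tmp) / "workspace"
    src.mkdir()
    dst.mkdir()
    (src / "settings.yml").write_text("new")
    (src / "__init__.py").write_text("")
    (dst / "settings.yml").write_text("old")

    deploy_sketch(src, dst, ignore_files=["__init__.py"])
    kept = (dst / "settings.yml").read_text()  # existing file left untouched
    deploy_sketch(src, dst, ignore_files=["__init__.py"], clobber=True)
    replaced = (dst / "settings.yml").read_text()  # existing file overwritten
    skipped = not (dst / "__init__.py").exists()  # ignored file never copied

print(kept, replaced, skipped)
```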