pudl.workspace.setup
Tools for setting up and managing PUDL workspaces.
Module Contents
Functions
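| set_defaults(pudl_in, pudl_out, clobber=False) | Set default user input and output locations in $HOME/.pudl.yml. |
| get_defaults() | Read paths to default PUDL input/output dirs from user's $HOME/.pudl.yml. |
| derive_paths(pudl_in, pudl_out) | Derive PUDL paths based on given input and output paths. |
| init(pudl_in, pudl_out, clobber=False) | Set up a new PUDL working environment based on the user settings. |
| deploy(pkg_path, deploy_dir, ignore_files, clobber=False) | Deploy all files from a package_data directory into a workspace. |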
Attributes
- pudl.workspace.setup.set_defaults(pudl_in, pudl_out, clobber=False)
Set default user input and output locations in $HOME/.pudl.yml.
Create a user settings file for future reference that defines the default PUDL input and output directories. If this file already exists, behavior depends on the clobber parameter, which is False by default. If it is True, the existing file is replaced. If False, the existing file is not changed.
- Parameters
pudl_in (os.PathLike) – Path to be used as the default input directory for PUDL. This is where pudl.workspace.datastore will look to find the data directory, full of data from public agencies.
pudl_out (os.PathLike) – Path to the default output directory for PUDL, where results of data processing will be organized.
clobber (bool) – If True and a user settings file exists, overwrite it. If False, do not alter the existing file. Defaults to False.
- Returns
None
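The clobber behavior described above can be sketched with the standard library alone. This is a minimal illustration, not the PUDL implementation: `write_settings_sketch` is a hypothetical helper, the settings file is written to a temporary directory rather than $HOME, and the two flat keys are emitted by hand instead of through a YAML library.

```python
from pathlib import Path
import tempfile


def write_settings_sketch(settings_file, pudl_in, pudl_out, clobber=False):
    """Write a two-key settings file, honoring the clobber flag.

    Hypothetical stand-in for illustration; not the PUDL implementation.
    Returns True if the file was written, False if it was left alone.
    """
    settings_file = Path(settings_file)
    if settings_file.exists() and not clobber:
        # Existing file and clobber=False: leave it unchanged.
        return False
    # Two flat keys are simple enough to emit as YAML by hand.
    settings_file.write_text(f"pudl_in: {pudl_in}\npudl_out: {pudl_out}\n")
    return True


with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / ".pudl.yml"
    first = write_settings_sketch(cfg, "/data/in", "/data/out")
    second = write_settings_sketch(cfg, "/other/in", "/other/out")
    kept = cfg.read_text()
    third = write_settings_sketch(cfg, "/other/in", "/other/out", clobber=True)
    replaced = cfg.read_text()
```

The second call leaves the first file's contents in place; only the third call, with clobber=True, replaces them.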
- pudl.workspace.setup.get_defaults()
Read paths to default PUDL input/output dirs from user’s $HOME/.pudl.yml.
- Parameters
None
- Returns
The contents of the user’s PUDL settings file, with keys pudl_in and pudl_out defining their default PUDL workspace. If the $HOME/.pudl.yml file does not exist, these paths are set to None.
- Return type
dict
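The fallback to None when no settings file exists might look like the sketch below. It is an assumption-laden illustration: `read_settings_sketch` is a hypothetical helper, and it parses only flat "key: value" lines rather than full YAML, so that the example runs without third-party packages.

```python
from pathlib import Path
import tempfile


def read_settings_sketch(settings_file):
    """Return {'pudl_in': ..., 'pudl_out': ...}, with None for missing values.

    Hypothetical helper; parses only flat "key: value" lines, not full YAML.
    """
    settings = {"pudl_in": None, "pudl_out": None}
    settings_file = Path(settings_file)
    if not settings_file.exists():
        # No settings file: both paths stay None.
        return settings
    for line in settings_file.read_text().splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in settings:
            settings[key.strip()] = value.strip() or None
    return settings


with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / ".pudl.yml"
    missing = read_settings_sketch(cfg)
    cfg.write_text("pudl_in: /data/in\npudl_out: /data/out\n")
    present = read_settings_sketch(cfg)
```

Callers can then check for None to decide whether the user still needs to configure a workspace.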
- pudl.workspace.setup.derive_paths(pudl_in, pudl_out)
Derive PUDL paths based on given input and output paths.
If no configuration file path is provided, attempt to read in the user configuration from a file called .pudl.yml in the user’s HOME directory. Presently the only values we expect are pudl_in and pudl_out, the directories that store files that PUDL depends on or that depend on PUDL.
- Parameters
pudl_in (os.PathLike) – Path to the directory containing the PUDL input files, most notably the data directory, which houses the raw data downloaded from public agencies by the pudl.workspace.datastore tools. pudl_in may be the same directory as pudl_out.
pudl_out (os.PathLike) – Path to the directory where PUDL should write the outputs it generates. These will be organized into directories according to the output format (sqlite, parquet, etc.).
- Returns
A dictionary containing common PUDL settings, derived from those read out of the YAML file. Mostly paths for inputs and outputs.
- Return type
dict
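As a rough sketch of what deriving paths from the two root directories could involve, the example below builds a dictionary with a data directory under pudl_in and one output directory per format under pudl_out. The function name `derive_paths_sketch`, the dictionary key names, and the choice of exactly two formats are assumptions for illustration; the actual keys returned by derive_paths are not listed in this page.

```python
from pathlib import Path


def derive_paths_sketch(pudl_in, pudl_out):
    """Build a dict of workspace paths from the two root directories.

    The key names and the set of output formats are assumptions made for
    illustration only.
    """
    pudl_in = Path(pudl_in).expanduser().resolve()
    pudl_out = Path(pudl_out).expanduser().resolve()
    paths = {
        "pudl_in": str(pudl_in),
        "pudl_out": str(pudl_out),
        # Raw agency data lives in a "data" directory under the input root.
        "data_dir": str(pudl_in / "data"),
    }
    # One output directory per format under the output root.
    for fmt in ("sqlite", "parquet"):
        paths[f"{fmt}_dir"] = str(pudl_out / fmt)
    return paths


# The input and output roots may be the same directory.
paths = derive_paths_sketch("workspace", "workspace")
```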
- pudl.workspace.setup.init(pudl_in, pudl_out, clobber=False)
Set up a new PUDL working environment based on the user settings.
- Parameters
pudl_in (os.PathLike) – Path to the directory containing the PUDL input files, most notably the data directory, which houses the raw data downloaded from public agencies by the pudl.workspace.datastore tools. pudl_in may be the same directory as pudl_out.
pudl_out (os.PathLike) – Path to the directory where PUDL should write the outputs it generates. These will be organized into directories according to the output format (sqlite, parquet, etc.).
clobber (bool) – If True, replace existing files. If False (the default), do not replace existing files.
- Returns
None
- pudl.workspace.setup.deploy(pkg_path, deploy_dir, ignore_files, clobber=False)
Deploy all files from a package_data directory into a workspace.
- Parameters
pkg_path (str) – Dotted module path to the subpackage inside of package_data containing the resources to be deployed.
deploy_dir (os.PathLike) – Directory on the filesystem to which the files within pkg_path should be deployed.
ignore_files (iterable) – List of filenames (strings) that may be present in the pkg_path subpackage, but that should be ignored.
clobber (bool) – If True, replace existing copies of the files that are being deployed from pkg_path to deploy_dir. If False, do not replace existing files.
- Returns
None
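The interaction between ignore_files and clobber can be illustrated with a small stand-alone sketch. It is not the PUDL implementation: `deploy_sketch` is a hypothetical helper, and it copies from a plain directory on disk instead of resolving a dotted pkg_path inside package_data, so that the example stays self-contained.

```python
from pathlib import Path
import shutil
import tempfile


def deploy_sketch(src_dir, deploy_dir, ignore_files=(), clobber=False):
    """Copy files from src_dir into deploy_dir, honoring ignore and clobber.

    Hypothetical helper: src_dir is a plain directory standing in for the
    package_data subpackage. Returns the sorted names of files copied.
    """
    src_dir, deploy_dir = Path(src_dir), Path(deploy_dir)
    deploy_dir.mkdir(parents=True, exist_ok=True)
    ignored = set(ignore_files)
    copied = []
    for src in sorted(src_dir.iterdir()):
        if not src.is_file() or src.name in ignored:
            continue
        dest = deploy_dir / src.name
        if dest.exists() and not clobber:
            # Keep the copy already present in the workspace.
            continue
        shutil.copy(src, dest)
        copied.append(src.name)
    return copied


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "pkg"
    dst = Path(tmp) / "workspace"
    src.mkdir()
    (src / "settings.yml").write_text("new")
    (src / "__init__.py").write_text("")
    first = deploy_sketch(src, dst, ignore_files=["__init__.py"])
    (dst / "settings.yml").write_text("edited by user")
    second = deploy_sketch(src, dst, ignore_files=["__init__.py"])
    kept = (dst / "settings.yml").read_text()
    third = deploy_sketch(src, dst, ignore_files=["__init__.py"], clobber=True)
    replaced = (dst / "settings.yml").read_text()
```

With clobber left at False, a file the user has edited in the workspace survives a repeated deployment; passing clobber=True restores the packaged copy.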