pudl.workspace.setup_cli

Set up a well-organized PUDL data management workspace.

This script creates a well-defined directory structure for use by the PUDL package, and copies several example settings files and Jupyter notebooks into it to get you started. If the command is run without any arguments, it will create this workspace in your current directory.

It's also possible to specify different input and output directories, which is useful if you want to use a single PUDL data store (which may contain many GB of data) to support several different workspaces. See the --pudl_in and --pudl_out options.

By default the script will not overwrite existing files. If you want it to replace existing files, use the --clobber option.
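The overwrite behaviour described above can be illustrated with a minimal sketch. The helper name copy_file, the file names, and the temporary paths are hypothetical and chosen only for illustration; this is not PUDL's actual implementation, just one way a "skip unless clobber" rule could look.

```python
import shutil
import tempfile
from pathlib import Path


def copy_file(src: Path, dst: Path, clobber: bool = False) -> bool:
    """Copy src to dst and report whether a copy happened (hypothetical helper)."""
    if dst.exists() and not clobber:
        # Default behaviour: leave the existing file untouched.
        return False
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    return True


with tempfile.TemporaryDirectory() as tmp:
    # Made-up example file standing in for a settings file.
    src = Path(tmp) / "example_settings.yml"
    dst = Path(tmp) / "workspace" / "settings" / "example_settings.yml"
    src.write_text("new")
    dst.parent.mkdir(parents=True)
    dst.write_text("old")

    skipped = copy_file(src, dst)  # existing file is kept
    kept_text = dst.read_text()
    replaced = copy_file(src, dst, clobber=True)  # existing file is replaced
    new_text = dst.read_text()
```

Returning a boolean rather than raising on an existing file keeps a second run of a setup step harmless, which matches the "will not overwrite" default described in the text.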

The directory structure set up for PUDL looks like this:

PUDL_DIR
└── settings

PUDL_INPUT
├── censusdp1tract
├── eia860
├── eia860m
├── eia861
├── eia923
…
├── epacems
├── ferc1
├── ferc714
└── tmp

PUDL_OUTPUT
├── ferc1_dbf.sqlite
├── ferc1_xbrl.sqlite
…
├── pudl.sqlite
└── hourly_emissions_cems.parquet

Initially, the directories in the data store will be empty. The pudl_datastore or pudl_etl commands will download data from public sources and organize it there, grouped by source.
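As a rough illustration of the empty skeleton described above, the following sketch creates the named directories with pathlib. The helper make_workspace and the "input"/"output" folder names are assumptions made for this example, not part of the PUDL API, and only the input directories explicitly named in the listing are created; the elided entries are left out.

```python
import tempfile
from pathlib import Path

# Input subdirectories named in the listing; elided entries are not filled in.
INPUT_SOURCES = [
    "censusdp1tract", "eia860", "eia860m", "eia861", "eia923",
    "epacems", "ferc1", "ferc714", "tmp",
]


def make_workspace(pudl_dir: Path, pudl_in: Path, pudl_out: Path) -> list[Path]:
    """Create the empty directory skeleton (hypothetical helper, not PUDL's API)."""
    dirs = [pudl_dir / "settings", pudl_out]
    dirs += [pudl_in / name for name in INPUT_SOURCES]
    for d in dirs:
        # exist_ok=True makes the call safe to repeat on an existing workspace.
        d.mkdir(parents=True, exist_ok=True)
    return dirs


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    created = make_workspace(root, root / "input", root / "output")
    all_exist = all(d.is_dir() for d in created)
    n_created = len(created)
```

The output files shown under PUDL_OUTPUT (the SQLite databases and the Parquet file) are not created here; per the text, the directories start out empty and are populated later by the pudl_datastore or pudl_etl commands.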

Module Contents

Functions

initialize_parser()

Parse command line arguments for the pudl_setup script.

main()

Set up a new default PUDL workspace.

Attributes

pudl.workspace.setup_cli.logger

pudl.workspace.setup_cli.initialize_parser()

Parse command line arguments for the pudl_setup script.
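For orientation, a parser for the options named in the module description might look roughly like the sketch below. The function name build_parser, the defaults, and the argument types are assumptions made for illustration; the actual initialize_parser() may define its arguments differently.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for initialize_parser(); not PUDL's actual code."""
    parser = argparse.ArgumentParser(prog="pudl_setup")
    # Defaulting to the current directory mirrors the "no arguments" behaviour
    # described in the module docstring; the exact defaults are an assumption.
    parser.add_argument("--pudl_in", type=Path, default=Path.cwd())
    parser.add_argument("--pudl_out", type=Path, default=Path.cwd())
    parser.add_argument("--clobber", action="store_true", default=False)
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--pudl_in", "/data/pudl", "--clobber"])
```

Passing an explicit list to parse_args keeps the sketch testable; a real entry point would normally let argparse read sys.argv instead.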

pudl.workspace.setup_cli.main()

Set up a new default PUDL workspace.