pudl.etl_pkg module

Module coordinating the PUDL ETL pipeline, generating data packages.

pudl.etl_pkg.etl_pkg(pkg_settings, pudl_settings)

Extracts, transforms and loads CSVs.

Parameters
  • pkg_settings (dict) – a dictionary of inputs for a datapackage.

  • pudl_settings (dict) – a dictionary of settings, most of which are paths to various resources and outputs.

Returns

dictionary with datapackages (keys) and lists of tables (values)

Return type

dict
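
As an illustration of the return shape described above, the following minimal sketch walks a dictionary that maps data package names to lists of table names. The package names, table names, and the summarize_tables helper are hypothetical and are not part of pudl; the sketch does not call etl_pkg itself.

```python
# Hypothetical stand-in for a value returned by etl_pkg():
# keys are data package names, values are lists of table names.
# All names here are invented for illustration.
example_result = {
    "example-pkg-a": ["plants", "generators"],
    "example-pkg-b": ["utilities"],
}


def summarize_tables(result):
    """Return one 'package: N table(s)' line per data package, sorted by name."""
    return [
        f"{pkg}: {len(tables)} table(s)"
        for pkg, tables in sorted(result.items())
    ]


for line in summarize_tables(example_result):
    print(line)
```

Sorting by package name keeps the summary stable regardless of the order in which packages were generated.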

pudl.etl_pkg.validate_input(settings_init)

Read and validate the inputs from a settings file.

Parameters

settings_init (iterable) – a list of data package parameters, with each element of the list being a dictionary specifying the data to be packaged.

Returns

validated list of inputs

Return type

iterable
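
The kind of check described above can be sketched as follows. This is not pudl's implementation: validate_input_sketch and the required keys ("name", "datasets") are assumptions chosen for illustration, since the actual keys a data package dictionary must contain are not documented here.

```python
def validate_input_sketch(settings_init, required_keys=("name", "datasets")):
    """Check that every element is a dict containing the required keys.

    Hypothetical stand-in for illustration only; the required keys
    are assumptions, not pudl's actual schema.
    """
    validated = []
    for i, pkg in enumerate(settings_init):
        if not isinstance(pkg, dict):
            raise TypeError(
                f"element {i} is {type(pkg).__name__}, expected dict"
            )
        missing = [key for key in required_keys if key not in pkg]
        if missing:
            raise ValueError(f"element {i} is missing keys: {missing}")
        # Copy each dict so the caller's input is not mutated later.
        validated.append(dict(pkg))
    return validated


# Usage with a made-up settings list.
settings = [{"name": "example-pkg", "datasets": []}]
checked = validate_input_sketch(settings)
```

Raising on the first invalid element, rather than silently skipping it, keeps a malformed settings file from producing a partial set of data packages.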