pudl.transform.eia module

Routines specific to cleaning up EIA data.

This module normalizes EIA datasets and compiles additional connections between EIA entities. Currently this covers two tasks: harvesting, and generating a more complete set of boiler generator associations. Harvesting normalizes the EIA tables by consolidating duplicated fields and records into entity tables and annual entity tables. The boiler generator association (BGA) step starts from the associations reported in EIA 860 and expands on them using several methods within the _boiler_generator_assn function.
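The harvesting idea described above can be illustrated with a small, self-contained sketch. It is not PUDL's implementation: the helper name harvest_entity, the table names, and the rule of keeping the most frequently reported value are all assumptions chosen for illustration, and the real _harvesting() routine may resolve conflicting values differently.

```python
from collections import Counter, defaultdict


def harvest_entity(tables, id_col, static_cols):
    """Consolidate columns repeated across tables into one entity table.

    Hypothetical helper: for each entity ID, keep the most frequently
    reported value of each static column (an assumed rule).
    """
    # observed[entity_id][column] counts every value reported for that column
    observed = defaultdict(lambda: defaultdict(Counter))
    for rows in tables.values():
        for row in rows:
            entity_id = row.get(id_col)
            if entity_id is None:
                continue  # this record does not describe the entity
            for col in static_cols:
                value = row.get(col)
                if value is not None:
                    observed[entity_id][col][value] += 1

    entity_table = {}
    for entity_id, cols in observed.items():
        entity_table[entity_id] = {
            col: cols[col].most_common(1)[0][0] if cols[col] else None
            for col in static_cols
        }
    return entity_table


# Made-up input tables; the names and values are illustrative only.
tables = {
    "plants_eia860": [
        {"plant_id_eia": 1, "plant_name": "Alpha", "state": "CO"},
        {"plant_id_eia": 2, "plant_name": "Beta", "state": "TX"},
    ],
    "generation_eia923": [
        {"plant_id_eia": 1, "plant_name": "Alpha", "net_generation_mwh": 10.0},
        {"plant_id_eia": 1, "plant_name": "Alpa", "net_generation_mwh": 12.0},
    ],
}

plants = harvest_entity(tables, "plant_id_eia", ["plant_name", "state"])
```

In this sketch the misspelled "Alpa" record is outvoted by the two "Alpha" records, so each plant ends up with a single name and state in the entity table, and the source tables no longer need to carry those columns.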

pudl.transform.eia.transform(eia_transformed_dfs, eia923_years=(2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017), eia860_years=(2011, 2012, 2013, 2014, 2015, 2016, 2017), debug=False)

Creates DataFrames for EIA Entity tables and modifies EIA tables.

This function coordinates two main actions: generating the entity tables via _harvesting() and generating the boiler generator associations via _boiler_generator_assn().

Tables that are no longer needed once entity harvesting is finished are also removed.

Parameters
  • eia_transformed_dfs (dict) – a dictionary of table names (keys) and transformed dataframes (values).

  • eia923_years (list) – a list of years for EIA 923

  • eia860_years (list) – a list of years for EIA 860

  • debug (bool) – if True, informational columns will be added to boiler_generator_assn

Returns

  • entities_dfs (dict) – a dictionary of table names (keys) and dataframes (values) for the entity tables.

  • eia_transformed_dfs (dict) – a dictionary of table names (keys) and dataframes (values) for the rest of the EIA tables.

Return type

tuple of (dict, dict)