pudl.transform.eia module

Code for transforming EIA data that pertains to more than one EIA Form.

This module helps normalize EIA datasets and infers additional connections between EIA entities (e.g. utilities, plants, units, generators…). This includes:

  • compiling a master list of plant, utility, boiler, and generator IDs that appear in any of the EIA 860 or 923 tables.

  • inferring more complete boiler-generator associations.

  • differentiating between static and time-varying attributes associated with the EIA entities, storing the static fields in the entity table and the variable fields in an annual table.
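As a rough illustration of the first and third items, the sketch below collects IDs across several tables and then splits one table into static and annual parts. It is not the module's actual harvesting code: the helper names `compile_ids` and `split_static_annual`, the toy tables, and the column names are all assumptions made for this example, and "static" is taken to mean "at most one distinct non-null value per entity".

```python
import pandas as pd


def compile_ids(dfs, id_col):
    """Collect the unique, sorted values of id_col from every table that has it."""
    found = [df[id_col] for df in dfs.values() if id_col in df.columns]
    ids = pd.concat(found).dropna().drop_duplicates().sort_values()
    return ids.reset_index(drop=True)


def split_static_annual(df, id_col, date_col):
    """Split df into a static entity table and a time-varying annual table."""
    attr_cols = [c for c in df.columns if c not in (id_col, date_col)]
    # Count distinct non-null values of each attribute within each entity.
    n_values = df.groupby(id_col)[attr_cols].nunique()
    # Assumption: a column is static if no entity ever has more than one value.
    static_cols = [c for c in attr_cols if (n_values[c] <= 1).all()]
    annual_cols = [c for c in attr_cols if c not in static_cols]
    entity = df.groupby(id_col, as_index=False)[static_cols].first()
    annual = df[[id_col, date_col] + annual_cols].reset_index(drop=True)
    return entity, annual


# Hypothetical toy tables; real EIA tables have many more columns.
plants_860 = pd.DataFrame({
    "plant_id_eia": [1, 1, 2, 2],
    "report_year": [2017, 2018, 2017, 2018],
    "plant_name": ["Alpha", "Alpha", "Beta", "Beta"],
    "capacity_mw": [100.0, 120.0, 50.0, 50.0],
})
gen_923 = pd.DataFrame({
    "plant_id_eia": [2, 3],
    "net_generation_mwh": [10.0, 20.0],
})

plant_ids = compile_ids(
    {"plants_860": plants_860, "gen_923": gen_923}, "plant_id_eia"
)
entity, annual = split_static_annual(plants_860, "plant_id_eia", "report_year")
```

Here `plant_name` never changes within a plant, so it lands in `entity`, while `capacity_mw` changes for plant 1 and therefore stays in `annual`. Real data would likely need a tolerance for occasional reporting inconsistencies rather than this strict test.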

The boiler generator association (bga) inference takes the associations provided by the EIA 860 and expands on them using several methods, which can be found in pudl.transform.eia._boiler_generator_assn().
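The text above does not spell out the expansion methods, so the following is only one plausible way to think about the problem, not a description of what _boiler_generator_assn() does. It treats each reported boiler-generator pair as an edge and groups everything that is connected, directly or indirectly, into one unit. The function name `group_units` and the IDs are hypothetical.

```python
def group_units(associations):
    """Group boilers and generators that are linked by any chain of associations."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for boiler_id, generator_id in associations:
        # Tag IDs by kind so a boiler and a generator with the same ID stay distinct.
        root_b = find(("boiler", boiler_id))
        root_g = find(("generator", generator_id))
        if root_b != root_g:
            parent[root_b] = root_g

    units = {}
    for node in parent:
        units.setdefault(find(node), set()).add(node)
    # Sort the groups so the result is deterministic.
    return sorted(units.values(), key=lambda members: sorted(members))


# Hypothetical reported associations: (boiler_id, generator_id).
reported = [("B1", "G1"), ("B2", "G1"), ("B3", "G2")]
units = group_units(reported)
```

In this toy input, B1 and B2 both feed G1 and so end up in the same group, while B3 and G2 form a second group.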

pudl.transform.eia.transform(eia_transformed_dfs, eia923_years=(2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018), eia860_years=(2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018), debug=False)

Creates DataFrames for EIA Entity tables and modifies EIA tables.

This function coordinates two main actions: generating the entity tables via _harvesting() and generating the boiler generator associations via _boiler_generator_assn().

There is also some removal of tables that are no longer needed after the entity harvesting is finished.

Parameters
  • eia_transformed_dfs (dict) – a dictionary of table names (keys) and transformed dataframes (values).

  • eia923_years (list) – a list of years for EIA 923; must be continuous and include only working years.

  • eia860_years (list) – a list of years for EIA 860; must be continuous and include only working years.

  • debug (bool) – if True, informational columns will be added to boiler_generator_assn.

Returns

two dictionaries having table names as keys and dataframes as values: one for the entity tables and one for the transformed EIA dataframes.

Return type

tuple
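Because eia923_years and eia860_years must be continuous, a caller may want to check a year selection before passing it in. The helper below is a minimal sketch of such a check; `check_continuous_years` is a hypothetical name and is not part of pudl. It does not test whether the years are "working" years, since that list is not given here.

```python
def check_continuous_years(years):
    """Return the years as a sorted tuple, or raise if there is a gap or repeat."""
    ordered = sorted(years)
    if not ordered:
        raise ValueError("At least one year is required.")
    expected = list(range(ordered[0], ordered[-1] + 1))
    if ordered != expected:
        raise ValueError(f"Years are not continuous: {ordered}")
    return tuple(ordered)


# The default range from the signature above passes the check.
years_ok = check_continuous_years(range(2009, 2019))

# A selection with a missing year is rejected.
try:
    check_continuous_years([2009, 2010, 2012])
    gap_rejected = False
except ValueError:
    gap_rejected = True
```

The validated tuple could then be supplied as eia923_years or eia860_years when calling transform().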