pudl.transform.eia923 module

Routines specific to cleaning up EIA Form 923 data.

pudl.transform.eia923.boiler_fuel(eia923_dfs, eia923_transformed_dfs)

Transforms the boiler_fuel_eia923 table.

Parameters
  • eia923_dfs (dict) – Each entry in this dictionary of DataFrame objects corresponds to a page from the EIA923 form, as reported in the Excel spreadsheets that EIA distributes.

  • eia923_transformed_dfs (dict) – A dictionary of DataFrame objects in which pages from the EIA923 form (keys) correspond to normalized DataFrames of values from that page (values).

Returns

eia923_transformed_dfs, a dictionary of DataFrame objects in which pages from EIA923 form (keys) correspond to normalized DataFrames of values from that page (values).

Return type

dict

pudl.transform.eia923.coalmine(eia923_dfs, eia923_transformed_dfs)

Transforms the coalmine_eia923 table.

Parameters
  • eia923_dfs (dict) – Each entry in this dictionary of DataFrame objects corresponds to a page from the EIA923 form, as reported in the Excel spreadsheets that EIA distributes.

  • eia923_transformed_dfs (dict) – A dictionary of DataFrame objects in which pages from the EIA923 form (keys) correspond to normalized DataFrames of values from that page (values).

Returns

eia923_transformed_dfs, a dictionary of DataFrame objects in which pages from EIA923 form (keys) correspond to normalized DataFrames of values from that page (values).

Return type

dict

pudl.transform.eia923.fuel_receipts_costs(eia923_dfs, eia923_transformed_dfs)

Transforms the fuel_receipts_costs_eia923 table.

Fuel cost is reported in cents per MMBtu. This function converts it to dollars per MMBtu.

Parameters
  • eia923_dfs (dict) – Each entry in this dictionary of DataFrame objects corresponds to a page from the EIA923 form, as reported in the Excel spreadsheets that EIA distributes.

  • eia923_transformed_dfs (dict) – A dictionary of DataFrame objects in which pages from the EIA923 form (keys) correspond to normalized DataFrames of values from that page (values).

Returns

eia923_transformed_dfs, a dictionary of DataFrame objects in which pages from EIA923 form (keys) correspond to normalized DataFrames of values from that page (values).

Return type

dict
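
The cents-to-dollars conversion described for fuel_receipts_costs can be sketched with pandas as follows. This is a minimal illustration, not PUDL's actual code: the column names fuel_cost_cents_per_mmbtu and fuel_cost_per_mmbtu are assumptions chosen for the example, and the real table may name its columns differently.

```python
import pandas as pd

# Toy fuel receipts data. Column names are hypothetical, chosen for illustration.
frc = pd.DataFrame(
    {
        "plant_id_eia": [3, 3, 7],
        "fuel_cost_cents_per_mmbtu": [250.0, 312.5, None],
    }
)

# There are 100 cents per dollar; missing costs remain missing (NaN).
frc["fuel_cost_per_mmbtu"] = frc["fuel_cost_cents_per_mmbtu"] / 100.0

# Drop the original column so that only one unit is carried forward.
frc = frc.drop(columns=["fuel_cost_cents_per_mmbtu"])
```

Keeping a single cost column after the conversion avoids ambiguity about which unit downstream code should read.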

pudl.transform.eia923.generation(eia923_dfs, eia923_transformed_dfs)

Transforms the generation_eia923 table.

Parameters
  • eia923_dfs (dict) – Each entry in this dictionary of DataFrame objects corresponds to a page from the EIA923 form, as reported in the Excel spreadsheets that EIA distributes.

  • eia923_transformed_dfs (dict) – A dictionary of DataFrame objects in which pages from the EIA923 form (keys) correspond to normalized DataFrames of values from that page (values).

Returns

eia923_transformed_dfs, a dictionary of DataFrame objects in which pages from EIA923 form (keys) correspond to normalized DataFrames of values from that page (values).

Return type

dict

pudl.transform.eia923.generation_fuel(eia923_dfs, eia923_transformed_dfs)

Transforms the generation_fuel_eia923 table.

Parameters
  • eia923_dfs (dict) – Each entry in this dictionary of DataFrame objects corresponds to a page from the EIA923 form, as reported in the Excel spreadsheets that EIA distributes.

  • eia923_transformed_dfs (dict) – A dictionary of DataFrame objects in which pages from the EIA923 form (keys) correspond to normalized DataFrames of values from that page (values).

Returns

eia923_transformed_dfs, a dictionary of DataFrame objects in which pages from EIA923 form (keys) correspond to normalized DataFrames of values from that page (values).

Return type

dict

pudl.transform.eia923.plants(eia923_dfs, eia923_transformed_dfs)

Transforms the plants_eia923 table.

Much of the static plant information is reported repeatedly and scattered across several different pages of EIA 923. The DataFrame that this function uses is assembled from those pages and, for uniformity, passed in via the same dictionary of DataFrames that all the other ingest functions use.

Parameters
  • eia923_dfs (dict) – Each entry in this dictionary of pandas.DataFrame objects corresponds to a page from the EIA 923 form, as reported in the Excel spreadsheets that EIA distributes.

  • eia923_transformed_dfs (dict) – A dictionary of DataFrame objects in which pages from the EIA923 form (keys) correspond to normalized DataFrames of values from that page (values).

Returns

eia923_transformed_dfs, a dictionary of DataFrame objects in which pages from EIA923 form (keys) correspond to normalized DataFrames of values from that page (values).

Return type

dict
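
One way a plants DataFrame could be assembled from repeated, scattered plant information is to stack the plant columns from each page and drop duplicate rows. The sketch below only illustrates that idea. The page names, column names, and values are hypothetical, and the way PUDL actually assembles the frame may differ.

```python
import pandas as pd

# Hypothetical raw pages; each repeats some static plant information.
eia923_dfs = {
    "generation_fuel": pd.DataFrame(
        {
            "plant_id_eia": [1, 1, 2],
            "plant_state": ["CO", "CO", "TX"],
            "net_generation_mwh": [10.0, 12.0, 7.0],
        }
    ),
    "boiler_fuel": pd.DataFrame(
        {
            "plant_id_eia": [2, 3],
            "plant_state": ["TX", "WA"],
            "fuel_consumed_units": [5.0, 9.0],
        }
    ),
}

# Static plant attributes shared by both pages (assumed names).
plant_cols = ["plant_id_eia", "plant_state"]

# Stack the plant columns from every page, then keep one row per distinct plant record.
plants = (
    pd.concat([df[plant_cols] for df in eia923_dfs.values()], ignore_index=True)
    .drop_duplicates()
    .sort_values("plant_id_eia")
    .reset_index(drop=True)
)
```

In this toy data each plant has consistent attributes across pages, so dropping exact duplicates leaves one row per plant. Real data with conflicting values for the same plant would need an explicit reconciliation rule.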

pudl.transform.eia923.transform(eia923_raw_dfs, eia923_tables=('generation_fuel_eia923', 'boiler_fuel_eia923', 'generation_eia923', 'coalmine_eia923', 'fuel_receipts_costs_eia923'))

Transforms all the EIA 923 tables.

Parameters
  • eia923_raw_dfs (dict) – A dictionary with tab names as keys and DataFrames as values, generated by pudl.extract.eia923.extract().

  • eia923_tables (tuple) – A tuple containing the names of the EIA923 tables that can be pulled into PUDL.

Returns

A dictionary with table names as keys and pandas.DataFrame objects as values, where the contents of the DataFrames correspond to cleaned and normalized PUDL database tables, ready for loading.

Return type

dict
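
Because every per-table function in this module shares the signature (eia923_dfs, eia923_transformed_dfs) and returns the updated dictionary, a top-level function like transform() could plausibly dispatch on table name. The sketch below shows that pattern with stand-in functions. It is not PUDL's implementation: TRANSFORM_FUNCS, transform_sketch, the page names, and the cleaning steps are all hypothetical.

```python
import pandas as pd


def generation_fuel(eia923_dfs, eia923_transformed_dfs):
    """Stand-in per-table transform: drop fully empty rows (illustrative only)."""
    df = eia923_dfs["generation_fuel"].dropna(how="all")
    eia923_transformed_dfs["generation_fuel_eia923"] = df
    return eia923_transformed_dfs


def boiler_fuel(eia923_dfs, eia923_transformed_dfs):
    """Stand-in per-table transform: drop fully empty rows (illustrative only)."""
    df = eia923_dfs["boiler_fuel"].dropna(how="all")
    eia923_transformed_dfs["boiler_fuel_eia923"] = df
    return eia923_transformed_dfs


# Hypothetical lookup from table name to the function that builds it.
TRANSFORM_FUNCS = {
    "generation_fuel_eia923": generation_fuel,
    "boiler_fuel_eia923": boiler_fuel,
}


def transform_sketch(eia923_raw_dfs, eia923_tables=tuple(TRANSFORM_FUNCS)):
    """Run the requested per-table transforms, accumulating results in one dict."""
    transformed = {}
    for table in eia923_tables:
        if table not in TRANSFORM_FUNCS:
            raise ValueError(f"Unknown EIA 923 table: {table}")
        # Each function adds its table to the dictionary and returns it.
        transformed = TRANSFORM_FUNCS[table](eia923_raw_dfs, transformed)
    return transformed


# Toy raw pages keyed by tab name (hypothetical names and columns).
raw = {
    "generation_fuel": pd.DataFrame(
        {"plant_id_eia": [1, None], "net_generation_mwh": [10.0, None]}
    ),
    "boiler_fuel": pd.DataFrame(
        {"plant_id_eia": [2], "fuel_consumed_units": [5.0]}
    ),
}

all_tables = transform_sketch(raw)
only_generation_fuel = transform_sketch(raw, ("generation_fuel_eia923",))
```

Passing the accumulating dictionary through each call mirrors the shared signature documented above and lets callers request only a subset of tables.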