pudl.transform.eia860
Module to perform data cleaning functions on EIA860 data tables.
Module Contents
Functions
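ownership(eia860_dfs, eia860_transformed_dfs) | Pull and transform the ownership table.
generators(eia860_dfs, eia860_transformed_dfs) | Pull and transform the generators table.
plants(eia860_dfs, eia860_transformed_dfs) | Pull and transform the plants table.
boiler_generator_assn(eia860_dfs, eia860_transformed_dfs) | Pull and transform the boiler generator association table.
utilities(eia860_dfs, eia860_transformed_dfs) | Pull and transform the utilities table.
transform(eia860_raw_dfs, eia860_tables) | Transform EIA 860 DataFrames.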
Attributes
- pudl.transform.eia860.ownership(eia860_dfs, eia860_transformed_dfs)
Pull and transform the ownership table.
Transformations include:
Replace . values with NA.
Convert pre-2012 ownership percentages to proportions to match post-2012 reporting.
- Parameters
eia860_dfs (dict) – Each entry in this dictionary of DataFrame objects corresponds to a page from the EIA860 form, as reported in the Excel spreadsheets they distribute.
eia860_transformed_dfs (dict) – A dictionary of DataFrame objects in which pages from EIA860 form (keys) correspond to normalized DataFrames of values from that page (values).
- Returns
eia860_transformed_dfs, a dictionary of DataFrame objects in which pages from EIA860 form (keys) correspond to normalized DataFrames of values from that page (values).
- Return type
dict
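The two ownership transformations listed above can be sketched with pandas as follows. This is a minimal illustration, not the PUDL implementation: the helper name clean_ownership and the column names report_year and fraction_owned are assumptions made for the example, as is the exact 2012 cutoff logic.

```python
import pandas as pd


def clean_ownership(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper, for illustration only."""
    out = df.copy()
    col = out["fraction_owned"]
    # Treat a lone "." as missing, then convert the strings to floats.
    out["fraction_owned"] = pd.to_numeric(col.mask(col == "."))
    # Assumption: report years before 2012 hold percentages (0-100),
    # later years already hold proportions (0-1).
    pre_2012 = out["report_year"] < 2012
    out.loc[pre_2012, "fraction_owned"] = (
        out.loc[pre_2012, "fraction_owned"] / 100
    )
    return out


raw = pd.DataFrame(
    {
        "report_year": [2010, 2010, 2013],
        "fraction_owned": ["50", ".", "0.25"],
    }
)
cleaned = clean_ownership(raw)
print(cleaned)
```

After the conversion, every year expresses ownership on the same 0 to 1 scale, so shares can be compared across the reporting change.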
- pudl.transform.eia860.generators(eia860_dfs, eia860_transformed_dfs)
Pull and transform the generators table.
There are three tabs that the generator records come from (proposed, existing, retired). Pre-2009, the existing and retired data are lumped together under a single generator file with one tab. We pull each tab into one dataframe and include an operational_status to indicate which tab the record came from. We use operational_status to parse the pre-2009 files as well.
Transformations include:
Replace . values with NA.
Update operational_status_code to reflect plant status as either proposed, existing or retired.
Drop values with NA for plant and generator id.
Replace 0 values with NA where appropriate.
Convert Y/N/X values to boolean True/False.
Convert U/Unknown values to NA.
Map full spelling onto code values.
Create a fuel_type_code_pudl field that organizes fuel types into clean, distinguishable categories.
- Parameters
eia860_dfs (dict) – Each entry in this dictionary of DataFrame objects corresponds to a page from the EIA860 form, as reported in the Excel spreadsheets they distribute.
eia860_transformed_dfs (dict) – A dictionary of DataFrame objects in which pages from EIA860 form (keys) correspond to a normalized DataFrame of values from that page (values).
- Returns
eia860_transformed_dfs, a dictionary of DataFrame objects in which pages from EIA860 form (keys) correspond to normalized DataFrames of values from that page (values).
- Return type
dict
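As a rough sketch of the tab handling described for generators, the snippet below stacks three per-tab DataFrames, records the originating tab in operational_status, drops records missing a plant or generator id, and turns U/Unknown into NA. The column names plant_id_eia, generator_id, and ownership_code and the sample data are assumptions for illustration; the actual PUDL code may differ.

```python
import pandas as pd

# Hypothetical per-tab inputs; real EIA 860 pages have many more columns.
tabs = {
    "proposed": pd.DataFrame(
        {"plant_id_eia": [1], "generator_id": ["A"], "ownership_code": ["S"]}
    ),
    "existing": pd.DataFrame(
        {
            "plant_id_eia": [2, None],
            "generator_id": ["B", "C"],
            "ownership_code": ["U", "J"],
        }
    ),
    "retired": pd.DataFrame(
        {"plant_id_eia": [3], "generator_id": [None], "ownership_code": ["W"]}
    ),
}

# Stack the tabs, tagging each record with the tab it came from.
gens = pd.concat(
    [df.assign(operational_status=status) for status, df in tabs.items()],
    ignore_index=True,
)

# Drop records that lack either identifier.
gens = gens.dropna(subset=["plant_id_eia", "generator_id"])

# Convert U/Unknown markers to missing values.
code = gens["ownership_code"]
gens["ownership_code"] = code.mask(code.isin(["U", "Unknown"]))
print(gens)
```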
- pudl.transform.eia860.plants(eia860_dfs, eia860_transformed_dfs)
Pull and transform the plants table.
Much of the static plant information is reported repeatedly and scattered across several different pages of EIA 923. The DataFrame this function uses is assembled from those pages and passed in via the same dictionary of DataFrames that all the other ingest functions use, for uniformity.
Transformations include:
Replace . values with NA.
Homogenize spelling of county names.
Convert Y/N/X values to boolean True/False.
- Parameters
eia860_dfs (dict) – Each entry in this dictionary of DataFrame objects corresponds to a page from the EIA860 form, as reported in the Excel spreadsheets they distribute.
eia860_transformed_dfs (dict) – A dictionary of DataFrame objects in which pages from EIA860 form (keys) correspond to normalized DataFrames of values from that page (values).
- Returns
eia860_transformed_dfs, a dictionary of DataFrame objects in which pages from EIA860 form (keys) correspond to normalized DataFrames of values from that page (values).
- Return type
dict
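The Y/N/X conversion mentioned above could look roughly like the sketch below. The mapping of X to False is an assumption made for this example (the documentation does not say how X is interpreted), and the helper yn_to_bool and the column name example_flag are hypothetical.

```python
import pandas as pd

# Assumption: "X" is treated as False here; PUDL's choice may differ.
YNX_TO_BOOL = {"Y": True, "N": False, "X": False}


def yn_to_bool(col: pd.Series) -> pd.Series:
    """Hypothetical helper: map Y/N/X codes to a nullable boolean column."""
    # Values missing from the mapping (including NA) become <NA>.
    return col.map(YNX_TO_BOOL).astype("boolean")


plants_df = pd.DataFrame({"example_flag": ["Y", "N", "X", None]})
plants_df["example_flag"] = yn_to_bool(plants_df["example_flag"])
print(plants_df)
```

Using the nullable "boolean" dtype keeps missing values distinct from False, which a plain bool column cannot do.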
- pudl.transform.eia860.boiler_generator_assn(eia860_dfs, eia860_transformed_dfs)
Pull and transform the boiler generator association table.
Transformations include:
Drop non-data rows with EIA notes.
Drop duplicate rows.
- Parameters
eia860_dfs (dict) – Each entry in this dictionary of DataFrame objects corresponds to a page from the EIA860 form, as reported in the Excel spreadsheets they distribute.
eia860_transformed_dfs (dict) – A dictionary of DataFrame objects in which pages from EIA860 form (keys) correspond to normalized DataFrames of values from that page (values).
- Returns
eia860_transformed_dfs, a dictionary of DataFrame objects in which pages from EIA860 form (keys) correspond to normalized DataFrames of values from that page (values).
- Return type
dict
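A minimal sketch of the two steps listed for the boiler generator association table follows. It assumes, purely for illustration, that note rows can be recognized by a non-numeric plant id; the helper clean_bga and the column names are hypothetical, and PUDL may identify note rows differently.

```python
import pandas as pd


def clean_bga(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper, for illustration only."""
    # Assumption: rows holding EIA notes have no numeric plant id.
    plant_id = pd.to_numeric(df["plant_id_eia"], errors="coerce")
    keep = plant_id.notna()
    out = df.loc[keep].copy()
    out["plant_id_eia"] = plant_id.loc[keep].astype(int)
    # Remove exact duplicate associations.
    return out.drop_duplicates().reset_index(drop=True)


raw_bga = pd.DataFrame(
    {
        "plant_id_eia": ["3", "3", "NOTE: data are preliminary", "7"],
        "boiler_id": ["1", "1", None, "A"],
        "generator_id": ["1", "1", None, "B"],
    }
)
bga = clean_bga(raw_bga)
print(bga)
```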
- pudl.transform.eia860.utilities(eia860_dfs, eia860_transformed_dfs)
Pull and transform the utilities table.
Transformations include:
Replace . values with NA.
Fix typos in state abbreviations, convert to uppercase.
Drop address_3 field (all NA).
Combine phone number columns into one field and set values that don’t mimic real US phone numbers to NA.
Convert Y/N/X values to boolean True/False.
Map full spelling onto code values.
- Parameters
eia860_dfs (dict) – Each entry in this dictionary of DataFrame objects corresponds to a page from the EIA860 form, as reported in the Excel spreadsheets they distribute.
eia860_transformed_dfs (dict) – A dictionary of DataFrame objects in which pages from EIA860 form (keys) correspond to normalized DataFrames of values from that page (values).
- Returns
eia860_transformed_dfs, a dictionary of DataFrame objects in which pages from EIA860 form (keys) correspond to normalized DataFrames of values from that page (values).
- Return type
dict
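The state and phone number clean-up described for utilities might be sketched as below. The column names (state, phone_area, phone_prefix, phone_line), the helper combine_phone, and the simple 10-digit validity pattern are all assumptions for this example; the checks PUDL actually applies are not documented here.

```python
import pandas as pd


def combine_phone(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: join three phone columns into one field."""
    digits = (
        df["phone_area"].astype(str)
        + df["phone_prefix"].astype(str)
        + df["phone_line"].astype(str)
    ).str.replace(r"\D", "", regex=True)
    # Assumed validity rule: exactly 10 digits, not starting with 0 or 1.
    valid = digits.str.fullmatch(r"[2-9]\d{9}")
    formatted = digits.str[:3] + "-" + digits.str[3:6] + "-" + digits.str[6:]
    # Anything that fails the rule becomes NA.
    return formatted.where(valid)


utils_df = pd.DataFrame(
    {
        "state": ["co", " Tx", "NY"],
        "phone_area": ["303", "000", "212"],
        "phone_prefix": ["555", "000", "555"],
        "phone_line": ["0100", "0000", "01"],
    }
)
# Normalize state abbreviations: trim whitespace and uppercase.
utils_df["state"] = utils_df["state"].str.strip().str.upper()
utils_df["phone_number"] = combine_phone(utils_df)
print(utils_df[["state", "phone_number"]])
```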
- pudl.transform.eia860.transform(eia860_raw_dfs, eia860_tables=PUDL_TABLES['eia860'])
Transform EIA 860 DataFrames.
- Parameters
- Returns
A dictionary of DataFrame objects in which pages from EIA860 form (keys) correspond to normalized DataFrames of values from that page (values).
- Return type
dict
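Because every per-table function on this page accepts the raw and transformed dictionaries and returns the transformed dictionary, a top-level transform can be written as a simple dispatch loop. The sketch below shows that pattern with stand-in functions. It is not the PUDL source: the names transform_sketch, _plants, _utilities, TRANSFORM_FUNCS, and the table and page keys are hypothetical.

```python
import pandas as pd


def _plants(raw_dfs: dict, transformed_dfs: dict) -> dict:
    # Stand-in for a real per-table transform.
    transformed_dfs["plants_eia860"] = raw_dfs["plant"].drop_duplicates()
    return transformed_dfs


def _utilities(raw_dfs: dict, transformed_dfs: dict) -> dict:
    # Stand-in for a real per-table transform.
    transformed_dfs["utilities_eia860"] = raw_dfs["utility"].assign(
        state=lambda d: d["state"].str.upper()
    )
    return transformed_dfs


# Hypothetical registry mapping table names to transform functions.
TRANSFORM_FUNCS = {
    "plants_eia860": _plants,
    "utilities_eia860": _utilities,
}


def transform_sketch(raw_dfs: dict, tables=tuple(TRANSFORM_FUNCS)) -> dict:
    """Run only the transforms for the requested tables."""
    transformed = {}
    for table, func in TRANSFORM_FUNCS.items():
        if table in tables:
            transformed = func(raw_dfs, transformed)
    return transformed


raw_pages = {
    "plant": pd.DataFrame({"plant_id_eia": [1, 1, 2]}),
    "utility": pd.DataFrame({"state": ["co", "tx"]}),
}
result = transform_sketch(raw_pages)
only_plants = transform_sketch(raw_pages, tables=("plants_eia860",))
print(sorted(result))
```

Passing a shorter tables tuple limits the work to the tables a caller needs, which mirrors the role the eia860_tables argument appears to play in the signature above.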