pudl.transform.ferc714 module

Transformation of the FERC Form 714 data.

pudl.transform.ferc714.BAD_RESPONDENTS = [319, 99991, 99992, 99993, 99994, 99995]

Fake respondent IDs for database test entities.

pudl.transform.ferc714.EIA_CODE_FIXES = {125: 2775, 134: 5416, 203: 12341, 257: 59504, 292: 20382, 295: 40229, 301: 14725, 302: 14725, 303: 14725, 304: 14725, 305: 14725, 306: 14725, 307: 14379, 309: 12427, 315: 56090, 323: 58790, 324: 58791, 329: 39347}

Overrides for FERC 714 respondent IDs that have wrong or missing EIA codes.

pudl.transform.ferc714.OFFSET_CODES = {'AKDT': Timedelta('-1 days +15:00:00'), 'AKST': Timedelta('-1 days +15:00:00'), 'CDT': Timedelta('-1 days +18:00:00'), 'CST': Timedelta('-1 days +18:00:00'), 'EDT': Timedelta('-1 days +19:00:00'), 'EST': Timedelta('-1 days +19:00:00'), 'HST': Timedelta('-1 days +14:00:00'), 'MDT': Timedelta('-1 days +17:00:00'), 'MST': Timedelta('-1 days +17:00:00'), 'PDT': Timedelta('-1 days +16:00:00'), 'PST': Timedelta('-1 days +16:00:00')}

A mapping of timezone offset codes to Timedelta offsets from UTC.

Note that the FERC 714 instructions state that all hourly demand is to be reported in STANDARD time for whatever timezone is being used. Even though many respondents use daylight savings / standard time abbreviations, a large majority do appear to conform to using a single UTC offset throughout the year. There are 6 instances in which the timezone associated with reporting changed from one year to the next, and these result in duplicate records, which are dropped.
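The negative Timedelta values above can be hard to read at a glance: Timedelta('-1 days +19:00:00') is simply minus five hours. The following is a minimal sketch, not the module's actual code, that restates a small subset of the mapping with the standard library and converts a reported local timestamp to UTC. The helper name local_to_utc and the convention that an offset equals local time minus UTC are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical subset of OFFSET_CODES, rewritten with the standard library.
# timedelta(hours=-5) is the same duration that pandas displays as
# Timedelta('-1 days +19:00:00').
offset_codes = {
    "EST": timedelta(hours=-5),
    "EDT": timedelta(hours=-5),  # same as EST: demand is reported in standard time
    "PST": timedelta(hours=-8),
}


def local_to_utc(local_time, offset_code):
    """Convert a naive reported timestamp to naive UTC.

    Assumes the offset is defined as (local time - UTC), so UTC is
    recovered by subtracting the offset from the local time.
    """
    return local_time - offset_codes[offset_code]


utc_time = local_to_utc(datetime(2019, 7, 1, 0, 0), "EDT")
print(utc_time)             # 2019-07-01 05:00:00
print(offset_codes["EST"])  # -1 day, 19:00:00
```

Because the daylight and standard codes share one offset, a July timestamp labeled EDT is shifted by five hours, not four.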

pudl.transform.ferc714.TZ_CODES = {'AKDT': 'America/Anchorage', 'AKST': 'America/Anchorage', 'CDT': 'America/Chicago', 'CST': 'America/Chicago', 'EDT': 'America/New_York', 'EST': 'America/New_York', 'HST': 'Pacific/Honolulu', 'MDT': 'America/Denver', 'MST': 'America/Denver', 'PDT': 'America/Los_Angeles', 'PST': 'America/Los_Angeles'}

Mapping between standardized time offset codes and canonical timezones.
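As a quick illustration of the shape of this mapping, the sketch below copies the dictionary shown above and groups the offset codes by timezone. The variable names are hypothetical and chosen only for this example.

```python
# Copy of the TZ_CODES mapping shown above.
tz_codes = {
    "AKDT": "America/Anchorage", "AKST": "America/Anchorage",
    "CDT": "America/Chicago", "CST": "America/Chicago",
    "EDT": "America/New_York", "EST": "America/New_York",
    "HST": "Pacific/Honolulu",
    "MDT": "America/Denver", "MST": "America/Denver",
    "PDT": "America/Los_Angeles", "PST": "America/Los_Angeles",
}

# Group the offset codes by the canonical timezone they map to.
codes_by_tz = {}
for code, tz in tz_codes.items():
    codes_by_tz.setdefault(tz, []).append(code)

for tz, codes in sorted(codes_by_tz.items()):
    print(tz, codes)
```

Each daylight/standard pair collapses to a single canonical timezone, and Hawaii has only a standard-time code.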

pudl.transform.ferc714.adjacency_ba(tfr_dfs)

A stub transform function.

pudl.transform.ferc714.demand_forecast_pa(tfr_dfs)

A stub transform function.

pudl.transform.ferc714.demand_hourly_pa(tfr_dfs)

Transform the hourly demand time series by Planning Area.

Transformations include:

  • Clean UTC offset codes.

  • Replace UTC offset codes with UTC offset and timezone.

  • Drop 25th hour rows.

  • Set records with 0 UTC code to 0 demand.

  • Drop duplicate rows.

  • Flip negative signs for reported demand.

Parameters

tfr_dfs (dict) – A dictionary of (partially) transformed dataframes, to be cleaned up.

Returns

The input dictionary of dataframes, but with a finished pa_demand_hourly_ferc714 dataframe.

Return type

dict
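Several of the steps listed above can be sketched on a toy dataframe. This is an illustrative approximation, not the function's actual implementation: the column names (local_time, utc_offset_code, demand_mwh, and so on) and the sample values are assumptions, only a subset of the offset codes is used, and the 25th-hour and zero-code steps are omitted.

```python
import pandas as pd

# Hypothetical subsets of OFFSET_CODES and TZ_CODES.
offset_codes = {"EST": pd.Timedelta(hours=-5), "EDT": pd.Timedelta(hours=-5)}
tz_codes = {"EST": "America/New_York", "EDT": "America/New_York"}

# Toy input with messy codes, one duplicate record, and one negative value.
df = pd.DataFrame({
    "respondent_id": [101, 101, 101, 101],
    "local_time": pd.to_datetime([
        "2019-01-01 00:00", "2019-01-01 01:00",
        "2019-01-01 01:00", "2019-01-01 02:00",
    ]),
    "utc_offset_code": ["est", "EST ", "EST", "EDT"],
    "demand_mwh": [500.0, 510.0, 510.0, -520.0],
})

# Clean UTC offset codes: strip whitespace and normalize case.
df["utc_offset_code"] = df["utc_offset_code"].str.strip().str.upper()

# Replace UTC offset codes with a UTC offset and a timezone.
df["utc_offset"] = df["utc_offset_code"].map(offset_codes)
df["timezone"] = df["utc_offset_code"].map(tz_codes)
df["utc_datetime"] = df["local_time"] - df["utc_offset"]

# Drop duplicate rows for the same respondent and UTC hour.
df = df.drop_duplicates(subset=["respondent_id", "utc_datetime"])

# Flip negative signs for reported demand.
df["demand_mwh"] = df["demand_mwh"].abs()

print(df[["respondent_id", "utc_datetime", "timezone", "demand_mwh"]])
```

Deduplicating on the UTC timestamp rather than the reported local time is a design choice of this sketch: it catches records that only become identical once the offset codes have been cleaned.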

pudl.transform.ferc714.demand_monthly_ba(tfr_dfs)

A stub transform function.

pudl.transform.ferc714.description_pa(tfr_dfs)

A stub transform function.

pudl.transform.ferc714.gen_plants_ba(tfr_dfs)

A stub transform function.

pudl.transform.ferc714.id_certification(tfr_dfs)

A stub transform function.

pudl.transform.ferc714.interchange_ba(tfr_dfs)

A stub transform function.

pudl.transform.ferc714.lambda_description(tfr_dfs)

A stub transform function.

pudl.transform.ferc714.lambda_hourly_ba(tfr_dfs)

A stub transform function.

pudl.transform.ferc714.net_energy_load_ba(tfr_dfs)

A stub transform function.

pudl.transform.ferc714.respondent_id(tfr_dfs)

Transform the FERC 714 respondent IDs, names, and EIA utility IDs.

This consists primarily of dropping test respondents and manually assigning EIA utility IDs to a few FERC Form 714 respondents (including PacifiCorp) that report planning area demand but whose corresponding EIA utility IDs are, for some reason, not provided by FERC.

Parameters

tfr_dfs (dict) – A dictionary of (partially) transformed dataframes, to be cleaned up.

Returns

The input dictionary of dataframes, but with a finished respondent_id_ferc714 dataframe.

Return type

dict
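The two operations described above, dropping test respondents and overriding EIA codes, might look roughly like the following. This is a sketch under stated assumptions rather than the function's real body: the column names, respondent names, and initial eia_code values are hypothetical, and only a subset of BAD_RESPONDENTS and EIA_CODE_FIXES is used.

```python
import pandas as pd

# Subsets of the module-level constants documented above.
bad_respondents = [319, 99991]
eia_code_fixes = {125: 2775, 134: 5416}

# Toy respondent table; column names and values are illustrative only.
df = pd.DataFrame({
    "respondent_id_ferc714": [101, 125, 134, 319, 99991],
    "respondent_name_ferc714": [
        "Respondent A", "Respondent B", "Respondent C", "Test 1", "Test 2",
    ],
    "eia_code": [1000, 0, 9999, 0, 0],
})

# Drop the test respondents.
df = df[~df["respondent_id_ferc714"].isin(bad_respondents)].copy()

# Override wrong or missing EIA codes; keep the reported code elsewhere.
fixes = df["respondent_id_ferc714"].map(eia_code_fixes)
df["eia_code"] = fixes.fillna(df["eia_code"]).astype(int)

print(df)
```

Mapping the fixes first and then filling the gaps with the reported values leaves respondents without an override untouched.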

pudl.transform.ferc714.transform(raw_dfs, tables=('respondent_id_ferc714', 'id_certification_ferc714', 'gen_plants_ba_ferc714', 'demand_monthly_ba_ferc714', 'net_energy_load_ba_ferc714', 'adjacency_ba_ferc714', 'interchange_ba_ferc714', 'lambda_hourly_ba_ferc714', 'lambda_description_ferc714', 'description_pa_ferc714', 'demand_forecast_pa_ferc714', 'demand_hourly_pa_ferc714'))

Transform the raw FERC 714 dataframes into datapackage-ready outputs.

Parameters
  • raw_dfs (dict) – A dictionary of raw pandas.DataFrame objects, as read out of the original FERC 714 CSV files. Generated by the pudl.extract.ferc714.extract() function.

  • tables (iterable) – The set of PUDL tables within FERC 714 that we should process. Typically set to all of them, which is the default.

Returns

A dictionary of pandas.DataFrame objects that are ready to be output in a data package / database table.

Return type

dict
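Since every per-table function takes and returns the same dictionary, one plausible way to organize such a top-level function is a table-name-to-function dispatch. The sketch below is an assumption about the general pattern, not a copy of the module's code: the stand-in functions, the TFR_FUNCS name, and the use of strings in place of dataframes are all hypothetical.

```python
# Hypothetical stand-ins for two of the per-table transform functions.
# Strings take the place of pandas.DataFrame objects to keep this short.
def respondent_id(tfr_dfs):
    tfr_dfs["respondent_id_ferc714"] = "cleaned respondents"
    return tfr_dfs


def demand_hourly_pa(tfr_dfs):
    tfr_dfs["demand_hourly_pa_ferc714"] = "cleaned hourly demand"
    return tfr_dfs


# Map each table name to the function that finishes that table.
TFR_FUNCS = {
    "respondent_id_ferc714": respondent_id,
    "demand_hourly_pa_ferc714": demand_hourly_pa,
}


def transform_sketch(raw_dfs, tables=tuple(TFR_FUNCS)):
    """Run the transform function for each requested table in turn."""
    tfr_dfs = dict(raw_dfs)  # shallow copy so the input dict is not mutated
    for table in tables:
        if table not in TFR_FUNCS:
            raise ValueError(f"Unrecognized FERC 714 table: {table}")
        tfr_dfs = TFR_FUNCS[table](tfr_dfs)
    return tfr_dfs


raw = {
    "respondent_id_ferc714": "raw respondents",
    "demand_hourly_pa_ferc714": "raw hourly demand",
}
out = transform_sketch(raw, tables=("respondent_id_ferc714",))
print(out)
```

Requesting a subset of tables leaves the remaining entries in their raw state, which is why the default covers every table.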