pudl.extract.epaipm module
Retrieve data from EPA’s Integrated Planning Model (IPM) v6.
Unlike most PUDL data sources, IPM is not an annual time series. This module assumes that only v6 will be used as input, so it handles a limited, fixed set of files.
This module was written by @gschivley.
class pudl.extract.epaipm.EpaIpmDatastore(datastore: pudl.workspace.datastore.Datastore)
Bases: object
Helper for extracting EpaIpm dataframes from Datastore.
SETTINGS = (
    TableSettings(
        table_name='transmission_single_epaipm',
        file='table_3-21_annual_transmission_capabilities_of_u.s._model_regions_in_epa_platform_v6_-_2021.xlsx',
        excel_settings={'skiprows': 3, 'usecols': 'B:F', 'index_col': [0, 1]}),
    TableSettings(
        table_name='transmission_joint_epaipm',
        file='table_3-5_transmission_joint_ipm.csv',
        excel_settings={}),
    TableSettings(
        table_name='load_curves_epaipm',
        file='table_2-2_load_duration_curves_used_in_epa_platform_v6.xlsx',
        excel_settings={'skiprows': 3, 'usecols': 'B:AB'}),
    TableSettings(
        table_name='plant_region_map_epaipm_active',
        file='needs_v6_november_2018_reference_case_0.xlsx',
        excel_settings={'sheet_name': 'NEEDS v6_Active', 'usecols': 'C,I'}),
    TableSettings(
        table_name='plant_region_map_epaipm_retired',
        file='needs_v6_november_2018_reference_case_0.xlsx',
        excel_settings={'sheet_name': 'NEEDS v6_Retired_Through2021', 'usecols': 'C,I'}),
)
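The usecols values in these settings ('B:F', 'B:AB', 'C,I') use the Excel column-letter notation that pandas.read_excel accepts, where ranges include both end columns. As a rough illustration of which columns each spec selects, the sketch below expands a spec into zero-based column positions. The helper names column_letters_to_index and expand_usecols are hypothetical and not part of PUDL or pandas.

```python
from typing import List


def column_letters_to_index(letters: str) -> int:
    """Convert Excel column letters (e.g. 'A', 'AB') to a zero-based index."""
    index = 0
    for char in letters.strip().upper():
        # Treat the letters as a base-26 number with digits A=1 ... Z=26.
        index = index * 26 + (ord(char) - ord("A") + 1)
    return index - 1


def expand_usecols(spec: str) -> List[int]:
    """Expand a spec such as 'B:F' or 'C,I' into zero-based column indices."""
    indices: List[int] = []
    for part in spec.split(","):
        if ":" in part:
            start, end = part.split(":")
            first = column_letters_to_index(start)
            last = column_letters_to_index(end)
            indices.extend(range(first, last + 1))  # ranges include both ends
        else:
            indices.append(column_letters_to_index(part))
    return indices


print(expand_usecols("B:F"))        # [1, 2, 3, 4, 5]
print(expand_usecols("C,I"))        # [2, 8]
print(len(expand_usecols("B:AB")))  # 27
```

Under this reading, the load curve table keeps 27 columns (B through AB), while both plant region maps keep only columns C and I of their sheets.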
get_dataframe(table_name: str) → pandas.core.frame.DataFrame
Retrieve the specified file from the epaipm archive.
- Parameters
table_name – table name, from self.table_filename
pandas_args – pandas arguments for parsing the file (listed in this docstring but not present in the signature above)
- Returns
Pandas dataframe of EPA IPM data.
get_table_settings(table_name: str) → pudl.extract.epaipm.TableSettings
Returns TableSettings for a given table_name.
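A lookup of this kind could be sketched as follows. This is a minimal stand-in, not the PUDL implementation: the TableSettings class here only mirrors the documented fields, the SETTINGS tuple copies two of the documented entries, and raising ValueError for an unknown name is an assumption, since this page does not say how unknown names are handled.

```python
from typing import Any, Dict, NamedTuple, Tuple


class TableSettings(NamedTuple):
    """Stand-in mirroring the documented fields of TableSettings."""

    table_name: str
    file: str
    excel_settings: Dict[str, Any] = {}


# Two entries copied from the SETTINGS attribute documented above.
SETTINGS: Tuple[TableSettings, ...] = (
    TableSettings(
        table_name="transmission_joint_epaipm",
        file="table_3-5_transmission_joint_ipm.csv",
    ),
    TableSettings(
        table_name="load_curves_epaipm",
        file="table_2-2_load_duration_curves_used_in_epa_platform_v6.xlsx",
        excel_settings={"skiprows": 3, "usecols": "B:AB"},
    ),
)


def get_table_settings(table_name: str) -> TableSettings:
    """Return the first entry whose table_name matches (hypothetical lookup)."""
    for settings in SETTINGS:
        if settings.table_name == table_name:
            return settings
    # Assumption: the behaviour for unknown names is not documented on this page.
    raise ValueError(f"Unknown table_name: {table_name!r}")


curves = get_table_settings("load_curves_epaipm")
print(curves.file)
print(curves.excel_settings["usecols"])  # B:AB
```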
class pudl.extract.epaipm.TableSettings(table_name: str, file: str, excel_settings: Dict[str, Any] = {})
Bases: tuple
Contains information for how to access and load EpaIpm dataframes.
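The signature (typed fields, a default value, and a tuple base) is consistent with a typing.NamedTuple, although this page does not confirm how the class is defined. Assuming that form, the sketch below uses a hypothetical TableSettingsSketch class to show two consequences: instances unpack like plain tuples, and every instance created without excel_settings shares the same default dict object, so that default is best treated as read-only.

```python
from typing import Any, Dict, NamedTuple


class TableSettingsSketch(NamedTuple):
    """Hypothetical class with the same fields as the documented signature."""

    table_name: str
    file: str
    excel_settings: Dict[str, Any] = {}


first = TableSettingsSketch("a_table", "a.csv")
second = TableSettingsSketch("b_table", "b.csv")

# Instances are tuples, so they support positional unpacking.
name, filename, options = first
print(name, filename, options)  # a_table a.csv {}

# The default dict is created once and reused by every instance that omits it.
print(first.excel_settings is second.excel_settings)  # True
```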
pudl.extract.epaipm.extract(epaipm_tables: List[str], ds: pudl.workspace.datastore.Datastore) → Dict[str, pandas.core.frame.DataFrame]
Extracts data from IPM files.
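The shape of the return value, one entry per requested table name, can be sketched with a stub in place of the real datastore. Everything below is illustrative: StubIpmStore and extract_sketch are hypothetical names, and the stub returns placeholder rows rather than pandas DataFrames so that the example runs without PUDL or any data files.

```python
from typing import Any, Dict, List


class StubIpmStore:
    """Hypothetical stand-in for EpaIpmDatastore; serves placeholder tables."""

    def __init__(self, tables: Dict[str, Any]):
        self._tables = tables

    def get_dataframe(self, table_name: str) -> Any:
        # The real method returns a pandas DataFrame; this returns a placeholder.
        return self._tables[table_name]


def extract_sketch(epaipm_tables: List[str], store: StubIpmStore) -> Dict[str, Any]:
    """Load each requested table and key the result by its table name."""
    return {name: store.get_dataframe(name) for name in epaipm_tables}


store = StubIpmStore(
    {
        "load_curves_epaipm": [{"column": "placeholder"}],
        "transmission_joint_epaipm": [{"column": "placeholder"}],
    }
)
raw = extract_sketch(["load_curves_epaipm"], store)
print(sorted(raw))  # ['load_curves_epaipm']
```

Only the tables named in the first argument appear in the result, which matches the documented role of epaipm_tables as the list of tables to extract.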
- Parameters
epaipm_tables (iterable) – A tuple or list of table names to extract
ds (EpaIpmDatastore) – Initialized datastore
- Returns
dictionary of DataFrames with extracted (but not yet transformed) data from each file.
- Return type