pudl.output.pudltabl#
This module provides a class enabling tabular compilations from the PUDL DB.
Many of our potential users are comfortable using spreadsheets, not databases, so we are creating a collection of tabular outputs that contain the most useful core information from the PUDL data packages, including additional keys and human-readable names for the objects (utilities, plants, generators) described in each table.
These tabular outputs can be joined with each other using those keys, and used as a data source within Microsoft Excel, Access, RStudio, or other data analysis packages that users may be familiar with. They aren’t meant to completely replicate all the data and relationships contained within the full PUDL database, but should serve as a generally usable set of PUDL data products.
The PudlTabl class can also provide access to complex derived values, like the generator and plant level marginal cost of electricity (MCOE), which are defined in the analysis module.
In the long run, this is probably a kind of prototype for pre-packaged API outputs or data products that we might want to provide to users à la carte.
Module Contents#
Classes#
PudlTabl | A class for compiling common useful tabular outputs from the PUDL DB.
Functions#
Grab the pudl SQLite database table metadata.
Attributes#
- class pudl.output.pudltabl.PudlTabl(pudl_engine: sqlalchemy.engine.Engine, freq: Literal['AS', 'MS', None] = None, start_date: str | date | datetime | pd.Timestamp = None, end_date: str | date | datetime | pd.Timestamp = None, fill_fuel_cost: bool = False, roll_fuel_cost: bool = False, fill_net_gen: bool = False, fill_tech_desc: bool = True, unit_ids: bool = False)[source]#
A class for compiling common useful tabular outputs from the PUDL DB.
- pu_eia860(update=False)[source]#
Pull a dataframe of EIA plant-utility associations.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
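Most methods on this class accept the same update flag. The sketch below illustrates the caching behavior that the parameter description implies: a table is built once, reused on later calls, and rebuilt only when update is true. The class name CachedOutputs, the pu_table method, and the _build_table helper are hypothetical stand-ins and are not part of PUDL; the real class may store its cache differently.

```python
class CachedOutputs:
    """Hypothetical stand-in for an output class that caches its tables."""

    def __init__(self):
        self._cache = {}      # table name -> cached result
        self.build_count = 0  # how many times the expensive build ran

    def _build_table(self):
        # Placeholder for an expensive database query.
        self.build_count += 1
        return [{"plant_id_eia": 1, "utility_id_eia": 10}]

    def pu_table(self, update=False):
        # Rebuild when asked to, or when nothing has been cached yet.
        if update or "pu_table" not in self._cache:
            self._cache["pu_table"] = self._build_table()
        return self._cache["pu_table"]


outputs = CachedOutputs()
first = outputs.pu_table()
second = outputs.pu_table()            # served from the cache
third = outputs.pu_table(update=True)  # forces a rebuild
```

In this sketch the second call returns the very same object as the first, while the third call returns a freshly built one.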
- pu_ferc1(update=False)[source]#
Pull a dataframe of FERC plant-utility associations.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- advanced_metering_infrastructure_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- balancing_authority_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- balancing_authority_assn_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- demand_response_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- demand_response_water_heater_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- demand_side_management_sales_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- demand_side_management_ee_dr_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- demand_side_management_misc_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- distributed_generation_tech_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- distributed_generation_fuel_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- distributed_generation_misc_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- distribution_systems_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- dynamic_pricing_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- energy_efficiency_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- green_pricing_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- mergers_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- net_metering_customer_fuel_class_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- net_metering_misc_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- non_net_metering_customer_fuel_class_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- non_net_metering_misc_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- operational_data_revenue_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- operational_data_misc_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- reliability_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- sales_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- service_territory_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- utility_assn_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- utility_data_nerc_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- utility_data_rto_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- utility_data_misc_eia861() pandas.DataFrame [source]#
An interim EIA 861 output function.
- respondent_id_ferc714() pandas.DataFrame [source]#
An interim FERC 714 output function.
- demand_hourly_pa_ferc714() pandas.DataFrame [source]#
An interim FERC 714 output function.
- utils_eia860(update=False)[source]#
Pull a dataframe describing utilities reported in EIA 860.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- bga_eia860(update=False)[source]#
Pull a dataframe of boiler-generator associations from EIA 860.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- plants_eia860(update=False)[source]#
Pull a dataframe of plant level info reported in EIA 860.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- gens_eia860(update=False)[source]#
Pull a dataframe describing generators, as reported in EIA 860.
If you want to fill the technology_description field, recreate the pudl_out object with the parameter fill_tech_desc = True.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- boil_eia860(update=False)[source]#
Pull a dataframe of boiler level info reported in EIA 860.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- own_eia860(update=False)[source]#
Pull a dataframe of generator level ownership data from EIA 860.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- gf_eia923(update: bool = False) pandas.DataFrame [source]#
Pull combined nuclear and non-nuclear generation fuel data.
- Parameters:
update – If True, re-calculate the output dataframe, even if a cached version exists.
- Returns:
A denormalized table for interactive use.
- gf_nonuclear_eia923(update: bool = False) pandas.DataFrame [source]#
Pull non-nuclear EIA 923 generation and fuel consumption data.
- Parameters:
update – If True, re-calculate the output dataframe, even if a cached version exists.
- Returns:
A denormalized table for interactive use.
- gf_nuclear_eia923(update: bool = False) pandas.DataFrame [source]#
Pull EIA 923 generation and fuel consumption data for nuclear units.
- Parameters:
update – If True, re-calculate the output dataframe, even if a cached version exists.
- Returns:
A denormalized table for interactive use.
- frc_eia923(update=False)[source]#
Pull EIA 923 fuel receipts and costs data.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- bf_eia923(update=False)[source]#
Pull EIA 923 boiler fuel consumption data.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- gen_eia923(update=False)[source]#
Pull EIA 923 net generation data by generator.
Net generation is reported in two separate tables in EIA 923: generation_eia923 and generation_fuel_eia923. While the generation_fuel_eia923 table is more complete (the generation_eia923 table includes only ~55% of the reported MWhs), the generation_eia923 table is more granular (it is reported at the generator level).
This method either grabs the generation_eia923 table that is reported by generator, or allocates net generation from the generation_fuel_eia923 table to the generator level.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
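To make the idea of allocating a coarser total down to the generator level concrete, here is a deliberately simplified sketch. It splits a plant-level net generation total across generators in proportion to their capacity. This proportional rule is an assumption chosen for illustration only and is not PUDL's actual allocation method, which is not described here; the function name and generator IDs are hypothetical.

```python
def allocate_net_generation(plant_net_generation_mwh, generator_capacities_mw):
    """Split a plant-level total across generators in proportion to capacity.

    Illustrative only: the real allocation logic may use different inputs.
    """
    total_capacity = sum(generator_capacities_mw.values())
    return {
        gen_id: plant_net_generation_mwh * capacity / total_capacity
        for gen_id, capacity in generator_capacities_mw.items()
    }


# A plant reporting 1,000 MWh with a 30 MW and a 70 MW generator.
allocated = allocate_net_generation(1_000.0, {"GEN1": 30.0, "GEN2": 70.0})
```

Whatever rule is used, the allocated values should sum back to the plant-level total, which is a useful sanity check on any allocation.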
- gen_original_eia923(update=False)[source]#
Pull the original EIA 923 net generation data by generator.
- gen_fuel_by_generator_energy_source_eia923(update=False)[source]#
Net generation and fuel data allocated to generator/energy_source_code.
Net generation and fuel data originally reported in the gen fuel table
- gen_fuel_by_generator_eia923(update=False)[source]#
Net generation from gen fuel table allocated to generators.
- gen_fuel_by_generator_energy_source_owner_eia923(update=False)[source]#
Generation and fuel consumption by generator/energy_source_code/owner.
- plants_steam_ferc1(update=False)[source]#
Pull the FERC Form 1 steam plants data.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- fuel_ferc1(update=False)[source]#
Pull the FERC Form 1 steam plants fuel consumption data.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- fbp_ferc1(update=False)[source]#
Summarize FERC Form 1 fuel usage by plant.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- plants_small_ferc1(update=False)[source]#
Pull the FERC Form 1 Small Plants Table.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- plants_hydro_ferc1(update=False)[source]#
Pull the FERC Form 1 Hydro Plants Table.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- plants_pumped_storage_ferc1(update=False)[source]#
Pull the FERC Form 1 Pumped Storage Table.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- purchased_power_ferc1(update=False)[source]#
Pull the FERC Form 1 Purchased Power Table.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- plant_in_service_ferc1(update=False)[source]#
Pull the FERC Form 1 Plant in Service Table.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- plants_all_ferc1(update=False)[source]#
Pull the FERC Form 1 all plants table.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- hr_by_gen(update=False)[source]#
Calculate and return generator level heat rates (mmBTU/MWh).
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- hr_by_unit(update=False)[source]#
Calculate and return generation unit level heat rates.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
- fuel_cost(update=False)[source]#
Calculate and return generator level fuel costs per MWh.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
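The heat rate and fuel cost outputs are closely related. The sketch below uses the conventional definitions: heat rate is fuel energy consumed per unit of electricity generated (mmBTU/MWh), and fuel cost per MWh is the heat rate multiplied by the fuel price per mmBTU. The function and argument names are hypothetical, and whether PUDL computes these quantities in exactly this way is an assumption.

```python
def heat_rate_mmbtu_per_mwh(fuel_consumed_mmbtu, net_generation_mwh):
    """Fuel energy burned per unit of electricity produced."""
    return fuel_consumed_mmbtu / net_generation_mwh


def fuel_cost_per_mwh(heat_rate, fuel_cost_per_mmbtu):
    """Fuel cost of generating one MWh at the given heat rate."""
    return heat_rate * fuel_cost_per_mmbtu


# A generator that burned 10,000 mmBTU to produce 1,000 MWh,
# with fuel priced at $2.50 per mmBTU.
hr = heat_rate_mmbtu_per_mwh(fuel_consumed_mmbtu=10_000.0, net_generation_mwh=1_000.0)
cost = fuel_cost_per_mwh(hr, fuel_cost_per_mmbtu=2.5)
```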
- capacity_factor(update=False, min_cap_fact=None, max_cap_fact=None)[source]#
Calculate and return generator level capacity factors.
- Parameters:
update (bool) – If true, re-calculate the output dataframe, even if a cached version exists.
- Returns:
a denormalized table for interactive use.
- Return type:
pandas.DataFrame
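As a reminder of what a capacity factor measures, the sketch below applies the usual definition: actual net generation divided by the generation that would result from running at full capacity for the whole period. The function name and arguments are hypothetical, and the exact columns and time handling used by PUDL are assumptions.

```python
HOURS_PER_YEAR = 8760  # non-leap year


def capacity_factor(net_generation_mwh, capacity_mw, hours):
    """Share of the maximum possible output that was actually generated."""
    return net_generation_mwh / (capacity_mw * hours)


# A 100 MW generator that produced 438,000 MWh over one year.
cf = capacity_factor(
    net_generation_mwh=438_000.0, capacity_mw=100.0, hours=HOURS_PER_YEAR
)
```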
- mcoe(update: bool = False, min_heat_rate: float = 5.5, min_fuel_cost_per_mwh: float = 0.0, min_cap_fact: float = 0.0, max_cap_fact: float = 1.5, all_gens: bool = True, gens_cols: Any = None)[source]#
Calculate and return generator level MCOE based on EIA data.
Eventually this calculation will include non-fuel operating expenses as reported in FERC Form 1, but for now only the fuel costs reported to EIA are included. They are attributed based on the unit-level heat rates and fuel costs.
- Parameters:
update – If true, re-calculate the output dataframe, even if a cached version exists.
min_heat_rate – lowest plausible heat rate, in mmBTU/MWh. Any MCOE records with lower heat rates are presumed to be invalid, and are discarded before returning.
min_cap_fact – minimum generator capacity factor. Generator records with a lower capacity factor will be filtered out before returning. This allows the user to exclude generators that aren’t being used enough to have valid data.
min_fuel_cost_per_mwh – minimum fuel cost on a per MWh basis that is required for a generator record to be considered valid. For some reason there are now a large number of $0 fuel cost records, which previously would have been NaN.
max_cap_fact – maximum generator capacity factor. Generator records with a higher capacity factor will be filtered out before returning. This allows the user to exclude generators with implausibly high capacity factors.
all_gens – Controls whether the output contains records for all generators in the generators_eia860 table, or only those generators with associated MCOE data. True by default.
gens_cols – equal to the string “all”, None, or a list of names of column attributes to include from the generators_eia860 table in addition to the list of defined DEFAULT_GENS_COLS in the MCOE analysis module. If “all”, all columns from the generators table will be included. By default, the DEFAULT_GENS_COLS defined in the MCOE analysis module will be merged into the final MCOE output.
- Returns:
a compilation of generator attributes, including fuel costs per MWh.
- Return type:
pandas.DataFrame
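The validity thresholds described above can be pictured as a simple filter. The sketch below drops records whose heat rate falls below min_heat_rate or whose capacity factor lies outside the range from min_cap_fact to max_cap_fact, using the default values from the signature. The record layout, the field names, the filter_mcoe helper, and the treatment of values exactly on a boundary are all assumptions; the real method may handle out-of-range values differently.

```python
def filter_mcoe(records, min_heat_rate=5.5, min_cap_fact=0.0, max_cap_fact=1.5):
    """Keep records whose heat rate and capacity factor look plausible."""
    return [
        rec
        for rec in records
        if rec["heat_rate_mmbtu_mwh"] >= min_heat_rate
        and min_cap_fact <= rec["capacity_factor"] <= max_cap_fact
    ]


records = [
    {"generator_id": "A", "heat_rate_mmbtu_mwh": 10.2, "capacity_factor": 0.60},
    # Heat rate below the plausible minimum.
    {"generator_id": "B", "heat_rate_mmbtu_mwh": 3.1, "capacity_factor": 0.55},
    # Capacity factor above the allowed maximum.
    {"generator_id": "C", "heat_rate_mmbtu_mwh": 9.8, "capacity_factor": 1.80},
]
kept = [rec["generator_id"] for rec in filter_mcoe(records)]
```

Only generator A passes both checks in this example.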
- gens_mega_eia(update: bool = False, gens_cols: Any = None) pandas.DataFrame [source]#
Generate and return a generators table with ownership integrated.
- Parameters:
update – If True, re-calculate the output dataframe, even if a cached version exists.
gens_cols – equal to the string “all”, None, or a list of additional column attributes to include from the EIA 860 generators table in the output mega gens table. By default all columns necessary to create the plant parts EIA table are included.
- Returns:
A table of all of the generators with identifying columns and data columns, sliced by ownership, which makes “total” and “owned” records for each generator owner. The “owned” records have the generator’s data scaled to the ownership percentage (e.g. if a 100 MW generator has a 75% stake owner and a 25% stake owner, this will result in two “owned” records with 75 MW and 25 MW). The “total” records correspond to the full plant for every owner (e.g. using the same 2-owner 100 MW generator as above, each owner will have a record with 100 MW).
- Raises:
AssertionError – If the frequency of the pudl_out object is not ‘AS’
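The ownership slicing described in the return value can be reproduced with a few lines of arithmetic. The sketch below builds the "owned" and "total" records for the 100 MW generator with a 75% and a 25% owner from the example above. The function name, record layout, and owner names are hypothetical; only the scaling rule comes from the description.

```python
def ownership_records(capacity_mw, owners):
    """Build 'owned' and 'total' records for each owner of one generator.

    owners maps an owner name to its fractional stake (0.75 means 75%).
    """
    records = []
    for owner, fraction in owners.items():
        # "owned": the generator's capacity scaled to this owner's stake.
        records.append(
            {"owner": owner, "ownership": "owned", "capacity_mw": capacity_mw * fraction}
        )
        # "total": the full capacity, repeated for every owner.
        records.append(
            {"owner": owner, "ownership": "total", "capacity_mw": capacity_mw}
        )
    return records


records = ownership_records(100.0, {"utility_a": 0.75, "utility_b": 0.25})
owned = {r["owner"]: r["capacity_mw"] for r in records if r["ownership"] == "owned"}
total = {r["owner"]: r["capacity_mw"] for r in records if r["ownership"] == "total"}
```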
- plant_parts_eia(update: bool = False, update_gens_mega: bool = False, gens_cols: Any = None) pandas.DataFrame [source]#
Generate and return the master EIA plant-parts list.
- Parameters:
update – If true, re-calculate the output dataframe, even if a cached version exists.
update_gens_mega – If True, update the gigantic Gens Mega table.
gens_cols – equal to the string “all”, None, or a list of additional column attributes to include from the EIA 860 generators table in the output mega gens table. By default all columns necessary to create the EIA plant part list are included.
- ferc1_eia(update: bool = False, update_plant_parts_eia: bool = False, update_plants_all_ferc1: bool = False, update_fbp_ferc1: bool = False) pandas.DataFrame [source]#
Generate the connection between FERC1 and EIA.
- epacamd_eia() pandas.DataFrame [source]#
Read the EPACAMD-EIA Crosswalk from the PUDL DB.
- __getstate__() dict [source]#
Get current object state for serializing (pickling).
This method is run as part of pickling the object. It needs to return the object’s current state with any un-serializable objects converted to a form that can be serialized. See object.__getstate__() for further details on the expected behavior of this method.
- __setstate__(state: dict) None [source]#
Restore the object’s state from a dictionary.
This method is run when the object is restored from a pickle. Anything that was changed in pudl.output.pudltabl.PudlTabl.__getstate__() must be undone here. Another important detail is that __init__ is not run when an object is de-serialized, so any setup there that alters external state might need to happen here as well.
- Parameters:
state – the object state to restore. This is effectively the output of pudl.output.pudltabl.PudlTabl.__getstate__().
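The general pattern behind this pair of methods can be sketched with the standard library alone. In the example below, a hypothetical TableReader class holds a sqlite3 connection, which cannot be pickled. __getstate__ drops the connection from the saved state, and __setstate__ reconnects, because __init__ is not called during unpickling. This is an illustration of the technique under those assumptions, not the PudlTabl implementation, which deals with a SQLAlchemy engine.

```python
import pickle
import sqlite3


class TableReader:
    """Hypothetical class holding a database connection that cannot be pickled."""

    def __init__(self, db_path):
        self.db_path = db_path
        self.conn = sqlite3.connect(db_path)

    def __getstate__(self):
        # Copy the instance dict and drop the un-serializable connection.
        state = self.__dict__.copy()
        del state["conn"]
        return state

    def __setstate__(self, state):
        # __init__ is not called on unpickling, so reconnect here.
        self.__dict__.update(state)
        self.conn = sqlite3.connect(self.db_path)


reader = TableReader(":memory:")
restored = pickle.loads(pickle.dumps(reader))
```

The restored object carries the same db_path as the original but holds its own, newly opened connection.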