pudl.validate#

PUDL data validation functions and test case specifications.

What defines a data validation?
  • What data are we checking?
      * What table or output does it come from?
      * What selection criteria do we apply to that table or output?

  • What are we checking it against?
      * Itself (helps validate that the tests themselves are working)
      * A processed version of itself (aggregation or derived values)
      * A hard-coded external standard (e.g. heat rates, fuel heat content)

Module Contents#

Functions#

intersect_indexes(→ pandas.Index)

Calculate the intersection of a collection of pandas Indexes.

check_date_freq(→ None)

Verify an expected relationship between time frequencies of two dataframes.

no_null_rows(df[, cols, df_name, thresh])

Check for rows filled with NA values indicating bad merges.

no_null_cols(→ pandas.DataFrame)

Check that a dataframe has no all-NaN columns.

group_mean_continuity_check(→ dagster.AssetCheckResult)

Check that certain variables don't vary by too much.

check_max_rows(→ pandas.DataFrame)

Validate that a dataframe has fewer than a maximum number of rows.

check_min_rows(→ pandas.DataFrame)

Validate that a dataframe has a certain minimum number of rows.

check_unique_rows(→ pandas.DataFrame)

Test whether dataframe has unique records within a subset of columns.

weighted_quantile(→ float)

Calculate the weighted quantile of a Series or DataFrame column.

historical_distribution(→ list[float])

Calculate a historical distribution of weighted values of a column.

vs_bounds(df, data_col, weight_col[, query, title, ...])

Test a distribution against an upper bound, lower bound, or both.

vs_self(df, data_col, weight_col[, query, title, ...])

Test a distribution against its own historical range.

vs_historical(orig_df, test_df, data_col, weight_col)

Validate aggregated distributions against original data.

bounds_histogram(df, data_col, weight_col, query, ...)

Plot a weighted histogram showing acceptable bounds/actual values.

historical_histogram(orig_df, test_df, data_col, ...)

Weighted histogram comparing distribution with historical subsamples.

plot_vs_bounds(df, validation_cases)

Run through a data validation based on absolute bounds.

plot_vs_self(df, validation_cases)

Validate a bunch of distributions against themselves.

plot_vs_agg(orig_df, agg_df, validation_cases)

Validate a bunch of distributions against aggregated versions.

Attributes#

logger

core_ferc1__yearly_steam_plants_sched402_capacity

core_ferc1__yearly_steam_plants_sched402_expenses

core_ferc1__yearly_steam_plants_sched402_capacity_ratios

core_ferc1__yearly_steam_plants_sched402_connected_hours

core_ferc1__yearly_steam_plants_sched402_self

fuel_ferc1_self

fuel_ferc1_coal_mmbtu_per_unit_bounds

fuel_ferc1_oil_mmbtu_per_unit_bounds

fuel_ferc1_gas_mmbtu_per_unit_bounds

fuel_ferc1_coal_cost_per_mmbtu_bounds

fuel_ferc1_oil_cost_per_mmbtu_bounds

fuel_ferc1_gas_cost_per_mmbtu_bounds

fuel_ferc1_coal_cost_per_unit_bounds

fuel_ferc1_oil_cost_per_unit_bounds

fuel_ferc1_gas_cost_per_unit_bounds

fbp_ferc1_self

fbp_ferc1_gas_cost_per_mmbtu_bounds

fbp_ferc1_oil_cost_per_mmbtu_bounds

fbp_ferc1_coal_cost_per_mmbtu_bounds

gf_eia923_coal_heat_content

Valid coal heat content values (all coal types).

gf_eia923_gas_heat_content

Valid natural gas heat content values.

gf_eia923_oil_heat_content

Valid petroleum based fuel heat content values.

gf_eia923_agg

EIA923 Generation Fuel data validation against aggregated data.

bf_eia923_coal_heat_content

Valid coal (bituminous, sub-bituminous, and lignite) heat content values.

bf_eia923_oil_heat_content

Valid petroleum based fuel heat content values.

bf_eia923_gas_heat_content

Valid natural gas heat content values.

bf_eia923_coal_ash_content

Valid coal ash content (%).

bf_eia923_coal_sulfur_content

Valid coal sulfur content values.

bf_eia923_self

EIA923 Boiler Fuel data validation against itself.

bf_eia923_agg

EIA923 Boiler Fuel data validation against aggregated data.

frc_eia923_coal_ant_heat_content

Check for reasonable anthracite coal heat content.

frc_eia923_coal_bit_heat_content

Check for reasonable bituminous coal heat content.

frc_eia923_coal_sub_heat_content

Check for reasonable sub-bituminous coal heat content.

frc_eia923_coal_lig_heat_content

Check for reasonable lignite coal heat content.

frc_eia923_coal_cc_heat_content

Check for reasonable refined coal heat content.

frc_eia923_coal_wc_heat_content

Check for reasonable waste coal heat content.

frc_eia923_oil_dfo_heat_content

Check for reasonable diesel fuel oil heat contents.

frc_eia923_gas_sgc_heat_content

Check for reasonable coal syngas heat contents.

frc_eia923_oil_jf_heat_content

Check for reasonable jet fuel heat contents.

frc_eia923_oil_ker_heat_content

Check for reasonable kerosene heat contents.

frc_eia923_petcoke_heat_content

Check for reasonable petroleum coke heat contents.

frc_eia923_rfo_heat_content

Check for reasonable residual fuel oil heat contents.

frc_eia923_propane_heat_content

Check for reasonable propane heat contents.

frc_eia923_petcoke_syngas_heat_content

Check for reasonable petcoke syngas heat contents.

frc_eia923_waste_oil_heat_content

Check for reasonable waste oil heat contents.

frc_eia923_blast_furnace_gas_heat_content

Check for reasonable blast furnace gas heat contents.

frc_eia923_natural_gas_heat_content

Check for reasonable natural gas heat contents.

frc_eia923_other_gas_heat_content

Check for reasonable other gas heat contents.

frc_eia923_ag_byproduct_heat_content

Check for reasonable agricultural byproduct heat contents.

frc_eia923_muni_solids_heat_content

Check for reasonable municipal solid waste heat contents.

frc_eia923_biomass_solids_heat_content

Check for reasonable other biomass solids heat contents.

frc_eia923_wood_solids_heat_content

Check for reasonable wood solids heat contents.

frc_eia923_biomass_liquids_heat_content

Check for reasonable other biomass liquids heat contents.

frc_eia923_sludge_heat_content

Check for reasonable sludge waste heat contents.

frc_eia923_black_liquor_heat_content

Check for reasonable black liquor heat contents.

frc_eia923_wood_liquids_heat_content

Check for reasonable wood waste liquids heat contents.

frc_eia923_landfill_gas_heat_content

Check for reasonable landfill gas heat contents.

frc_eia923_biomass_gas_heat_content

Check for reasonable other biomass gas heat contents.

frc_eia923_coal_ash_content

Valid coal ash content (%).

frc_eia923_coal_sulfur_content

Valid coal sulfur content values.

frc_eia923_coal_mercury_content

Valid coal mercury content limits.

frc_eia923_coal_moisture_content

Valid coal moisture content, based on historical EIA 923 reporting.

frc_eia923_self

EIA923 fuel receipts & costs data validation against itself.

frc_eia923_agg

EIA923 fuel receipts & costs data validation against aggregated data.

mcoe_gas_capacity_factor

Static constraints on natural gas generator capacity factors.

mcoe_coal_capacity_factor

Static constraints on coal fired generator capacity factors.

mcoe_gas_heat_rate

Static constraints on gas fired generator heat rates.

mcoe_coal_heat_rate

Static constraints on coal fired generator heat rates.

mcoe_fuel_cost_per_mwh

Static constraints on fuel costs per MWh net generation.

mcoe_fuel_cost_per_mmbtu

Static constraints on fuel costs per mmbtu of fuel consumed.

mcoe_self_fuel_cost_per_mmbtu

mcoe_self_fuel_cost_per_mwh

mcoe_self

gens_eia860_vs_bound

gens_eia860_self

pudl.validate.logger[source]#
pudl.validate.intersect_indexes(indexes: list[pandas.Index]) → pandas.Index[source]#

Calculate the intersection of a collection of pandas Indexes.

Parameters:

indexes – a list of pandas.Index objects

Returns:

The intersection of all values found in the input indexes.
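The idea can be illustrated with a minimal sketch that folds pandas' own `Index.intersection()` across a list. The helper name `intersect_all` is hypothetical and this is not necessarily how `intersect_indexes` is implemented; it assumes a non-empty list.

```python
from functools import reduce

import pandas as pd


def intersect_all(indexes: list[pd.Index]) -> pd.Index:
    """Return the values common to every index in a non-empty list."""
    # Fold pairwise intersections across the whole list.
    return reduce(lambda left, right: left.intersection(right), indexes)


years = [
    pd.Index([2019, 2020, 2021]),
    pd.Index([2020, 2021, 2022]),
    pd.Index([2021, 2020]),
]
common = intersect_all(years)
```

Only 2020 and 2021 appear in all three inputs, so those are the only values left in `common`.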

pudl.validate.check_date_freq(df1: pandas.DataFrame, df2: pandas.DataFrame, mult: int) → None[source]#

Verify an expected relationship between time frequencies of two dataframes.

Identify the distinct values of report_date in each input dataframe, then check that df2 has mult times as many distinct report_date values as df1, counting only the years that appear in both dataframes. This is primarily aimed at comparing annual and monthly dataframes, but should also work with, e.g., annual (df1) and quarterly (df2) frequency data using mult=4.

Note that the function assumes a dataframe with sub-annual frequency covers the entire year it is part of. If a partial year of monthly data in one dataframe overlaps with annual data in another, you will probably get unexpected behavior.

We use this method rather than attempting to infer a frequency from the observed values because often we have only a single year of data, and you need at least 3 values in a DatetimeIndex to infer the frequency.

Parameters:
  • df1 – A dataframe with a column named report_date which contains dates.

  • df2 – A dataframe with a column named report_date which contains dates.

  • mult – A multiplicative factor indicating the expected ratio of the number of distinct date values in df2 to the number in df1. E.g. if df1 is annual and df2 is monthly, mult should be 12.

Returns:

None

Raises:
  • AssertionError – if the number of distinct report_date values in df2 is not mult times the number of distinct report_date values in df1.

  • ValueError – if either df1 or df2 does not have a column named report_date
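The counting logic described above can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: the helper `date_freq_matches` is hypothetical, and it returns a boolean instead of raising `AssertionError`.

```python
import pandas as pd


def date_freq_matches(df1: pd.DataFrame, df2: pd.DataFrame, mult: int) -> bool:
    """Return True if df2 has mult times as many distinct dates as df1.

    Hypothetical helper: only years present in both frames are counted.
    """
    for df in (df1, df2):
        if "report_date" not in df.columns:
            raise ValueError("Both dataframes need a report_date column.")
    dates1 = pd.to_datetime(df1["report_date"]).drop_duplicates()
    dates2 = pd.to_datetime(df2["report_date"]).drop_duplicates()
    # Restrict the comparison to years that appear in both inputs.
    shared_years = list(set(dates1.dt.year) & set(dates2.dt.year))
    n1 = int(dates1.dt.year.isin(shared_years).sum())
    n2 = int(dates2.dt.year.isin(shared_years).sum())
    return n2 == mult * n1


annual = pd.DataFrame({"report_date": pd.to_datetime(["2020-01-01", "2021-01-01"])})
monthly = pd.DataFrame(
    {"report_date": pd.date_range("2021-01-01", periods=12, freq="MS")}
)

matches_monthly = date_freq_matches(annual, monthly, mult=12)
matches_quarterly = date_freq_matches(annual, monthly, mult=4)
```

Here only 2021 is shared, so one annual date is compared against twelve monthly dates: the check holds for `mult=12` and not for `mult=4`.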

pudl.validate.no_null_rows(df, cols='all', df_name='', thresh=0.9)[source]#

Check for rows filled with NA values indicating bad merges.

Count the NA values in each row across the columns specified by cols. If NA values make up more than thresh of those columns, the row is considered null and the check fails.

Parameters:
  • df (pandas.DataFrame) – DataFrame to check for null rows.

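  • cols (iterable or "all") – The labels of columns to check for all-null values. If “all”, check all columns.

  • df_name (str) – Name of the dataframe, to aid in debugging/logging.

  • thresh (float) – The maximum fraction of NA values tolerated in a row across the checked columns.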

Returns:

The input DataFrame, for use with DataFrame.pipe().

Return type:

pandas.DataFrame

Raises:
  • ValueError – If the fraction of NA values in any row is greater than thresh.
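A minimal sketch of the row-level test described above, assuming the NA fraction is computed over the selected columns only. The helper `find_null_rows` is hypothetical; it returns the offending rows instead of raising `ValueError`.

```python
import numpy as np
import pandas as pd


def find_null_rows(df: pd.DataFrame, cols="all", thresh: float = 0.9) -> pd.DataFrame:
    """Return rows whose NA fraction across cols exceeds thresh."""
    check_cols = list(df.columns) if isinstance(cols, str) and cols == "all" else list(cols)
    # Fraction of the checked columns that are NA, per row.
    na_fraction = df[check_cols].isna().sum(axis=1) / len(check_cols)
    return df[na_fraction > thresh]


merged = pd.DataFrame(
    {
        "plant_id": [1, 2, 3],
        "capacity_mw": [10.0, np.nan, 5.0],
        "fuel": ["coal", None, "gas"],
    }
)

# Row 1 is entirely NA in the two merged-in columns.
bad_subset = find_null_rows(merged, cols=["capacity_mw", "fuel"])
# Across all three columns the same row is only 2/3 NA, below thresh=0.9.
bad_all = find_null_rows(merged)
```

The example shows why the choice of cols matters: a key column that is always populated can mask a failed merge when every column is included.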

pudl.validate.no_null_cols(df: pandas.DataFrame, cols: str = 'all', df_name: str = '') → pandas.DataFrame[source]#

Check that a dataframe has no all-NaN columns.

Occasionally, when concatenating or merging dataframes, a mislabeled column results in a fully NaN column, which should probably never happen. This function is a quick check for that case.

Parameters:
  • df (pandas.DataFrame) – DataFrame to check for null columns.

  • cols (iterable or "all") – The labels of columns to check for all-null values. If “all”, check all columns.

  • df_name (str) – Name of the dataframe, to aid in debugging/logging.

Returns:

The same DataFrame as was passed in, for use in DataFrame.pipe().

Return type:

pandas.DataFrame

Raises:

ValueError – If any completely NaN / Null valued columns are found.
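Because the check returns its input, it can sit in a method chain. The sketch below mimics that pattern with a hypothetical stand-in, `require_no_null_cols`, so it runs without PUDL installed; the real function's error message and internals may differ.

```python
import numpy as np
import pandas as pd


def require_no_null_cols(df: pd.DataFrame, cols="all", df_name: str = "") -> pd.DataFrame:
    """Raise ValueError if any checked column is entirely NA; else return df."""
    check_cols = list(df.columns) if isinstance(cols, str) and cols == "all" else list(cols)
    null_cols = [col for col in check_cols if df[col].isna().all()]
    if null_cols:
        raise ValueError(f"Null columns found in {df_name}: {null_cols}")
    return df


good = pd.DataFrame({"plant_id": [1, 2], "capacity_mw": [10.0, 5.0]})
# The check passes the frame through unchanged, so it chains with pipe().
checked = good.pipe(require_no_null_cols, df_name="good")

# A column that is NA everywhere, as a mislabeled merge might produce.
bad = good.assign(net_generation_mwh=np.nan)
try:
    bad.pipe(require_no_null_cols, df_name="bad")
    raised = False
except ValueError:
    raised = True
```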

pudl.validate.group_mean_continuity_check(df: pandas.DataFrame, thresholds: dict[str, float], groupby_col: str, n_outliers_allowed: int = 0) → dagster.AssetCheckResult[source]#

Check that certain variables don’t vary by too much.

Groups and sorts the data by groupby_col, then takes the mean across each group. Useful for saying something like “the average water usage of cooling systems didn’t jump by 10x from 2012-2013.”

Parameters:
  • df – the df with the actual data

  • thresholds – a mapping from column names to the ratio by which those columns are allowed to fluctuate from one group to the next.

  • groupby_col – the column by which we will group the data.

  • n_outliers_allowed – how many data points are allowed to be above the threshold.
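The grouping logic can be sketched roughly as below. Two assumptions are made that the text above does not confirm: fluctuation is measured as the absolute relative change between consecutive group means, and the result is a plain tuple rather than a `dagster.AssetCheckResult`. The helper name is hypothetical.

```python
import pandas as pd


def group_means_are_continuous(
    df: pd.DataFrame,
    thresholds: dict[str, float],
    groupby_col: str,
    n_outliers_allowed: int = 0,
) -> tuple[bool, dict[str, int]]:
    """Count group-to-group jumps in the mean that exceed each threshold."""
    means = df.groupby(groupby_col)[list(thresholds)].mean().sort_index()
    previous = means.shift(1)
    # Assumed metric: absolute relative change from one group to the next.
    rel_change = (means - previous).abs() / previous.abs()
    outliers = {
        col: int((rel_change[col] > limit).sum()) for col, limit in thresholds.items()
    }
    passed = all(count <= n_outliers_allowed for count in outliers.values())
    return passed, outliers


cooling = pd.DataFrame(
    {
        "report_year": [2012, 2012, 2013, 2013, 2014, 2014],
        "water_use": [100.0, 120.0, 110.0, 130.0, 1000.0, 1400.0],
    }
)

strict_ok, strict_counts = group_means_are_continuous(
    cooling, {"water_use": 1.0}, "report_year"
)
lenient_ok, _ = group_means_are_continuous(
    cooling, {"water_use": 1.0}, "report_year", n_outliers_allowed=1
)
```

The yearly means are 110, 120 and 1200, so the 2013 to 2014 step is a tenfold jump and counts as one outlier against a threshold of 1.0.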

pudl.validate.check_max_rows(df: pandas.DataFrame, expected_rows: int | float = np.inf, margin: float = 0.05, df_name: str = '') → pandas.DataFrame[source]#

Validate that a dataframe has fewer than a maximum number of rows.

pudl.validate.check_min_rows(df: pandas.DataFrame, expected_rows: int | float = 0, margin: float = 0.05, df_name: str = '') → pandas.DataFrame[source]#

Validate that a dataframe has a certain minimum number of rows.

pudl.validate.check_unique_rows(df: pandas.DataFrame, subset: list[str] | None = None, df_name: str = '') → pandas.DataFrame[source]#

Test whether dataframe has unique records within a subset of columns.

Parameters:
  • df – DataFrame to check for duplicate records.

  • subset – Columns to consider in checking for dupes.

  • df_name – Name of the dataframe, to aid in debugging/logging.

Returns:

The same DataFrame as was passed in, for use in DataFrame.pipe().

Raises:

ValueError – If there are duplicate records in the subset of selected columns.

pudl.validate.weighted_quantile(data: pandas.Series, weights: pandas.Series, quantile: float) → float[source]#

Calculate the weighted quantile of a Series or DataFrame column.

This function takes two columns from a pandas.DataFrame. One (data) contains an observed value, such as heat content per unit of fuel. The other (weights) contains a quantity, such as the amount of fuel delivered, which is used to scale the importance of each observed value in the overall distribution. The function then calculates the value that the scaled distribution has at a given quantile.

Parameters:
  • data – A series containing numeric data.

  • weights – Weights to use in scaling the data. Must have the same length as data.

  • quantile – A number between 0 and 1, representing the quantile at which we want to find the value of the weighted data.

Returns:

The value in the weighted data corresponding to the given quantile. If there are no values in the data, return numpy.nan.
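One common way to define a weighted quantile is to sort the data, place each value at the midpoint of its share of the cumulative weight, and interpolate. The sketch below follows that definition; it is an assumption about the method, not a copy of the library code, and it expects non-negative weights. The name `weighted_quantile_sketch` is hypothetical.

```python
import numpy as np
import pandas as pd


def weighted_quantile_sketch(data, weights, quantile: float) -> float:
    """Interpolated weighted quantile; returns NaN if no usable data."""
    if not 0 <= quantile <= 1:
        raise ValueError("quantile must be between 0 and 1.")
    if len(data) != len(weights):
        raise ValueError("data and weights must have the same length.")
    df = (
        pd.DataFrame(
            {
                "data": np.asarray(data, dtype=float),
                "weights": np.asarray(weights, dtype=float),
            }
        )
        .replace([np.inf, -np.inf], np.nan)
        .dropna()
        .sort_values("data")
    )
    if df.empty:
        return float("nan")
    cum_weight = df["weights"].cumsum()
    # Midpoint of each record's slice of the total weight, scaled to [0, 1].
    positions = (cum_weight - 0.5 * df["weights"]) / df["weights"].sum()
    return float(np.interp(quantile, positions, df["data"]))


heat_content = pd.Series([20.0, 24.0, 26.0])  # e.g. mmbtu per unit of fuel
fuel_delivered = pd.Series([100.0, 100.0, 800.0])  # e.g. units delivered

weighted_median = weighted_quantile_sketch(heat_content, fuel_delivered, 0.5)
equal_median = weighted_quantile_sketch(heat_content, pd.Series([1.0, 1.0, 1.0]), 0.5)
empty_result = weighted_quantile_sketch(pd.Series(dtype=float), pd.Series(dtype=float), 0.5)
```

With equal weights the median is the middle value, 24.0. Weighting by fuel delivered pulls the median toward 26.0, because that delivery accounts for most of the fuel.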

pudl.validate.historical_distribution(df: pandas.DataFrame, data_col: str, weight_col: str, quantile: float) → list[float][source]#

Calculate a historical distribution of weighted values of a column.

In order to know what a “reasonable” value of a particular column is in the pudl data, we can use this function to see what the value in that column has been, at a given quantile, in each of the years of data we have on hand. This population of values can then be used to set boundaries on acceptable data distributions in the aggregated and processed data.

Parameters:
  • df (pandas.DataFrame) – a dataframe containing historical data, with a column named either report_date or report_year.

  • data_col (str) – Label of the column containing the data of interest.

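  • weight_col (str) – Label of the column containing the weights to be used in scaling the data.

  • quantile (float) – A number between 0 and 1, representing the quantile at which to find the value of the weighted data.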

Returns:

The weighted quantiles of data, for each of the years found in the historical data of df.

Return type:

list
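The per-year calculation can be sketched as below, under two assumptions: each calendar year is treated as one sample, and the weighted quantile uses the interpolated midpoint definition. Both helper names are hypothetical, and the library may define its yearly samples differently.

```python
import numpy as np
import pandas as pd


def _weighted_quantile(data, weights, quantile: float) -> float:
    """Interpolated weighted quantile; assumes positive weights and no NaNs."""
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(data)
    data, weights = data[order], weights[order]
    positions = (np.cumsum(weights) - 0.5 * weights) / weights.sum()
    return float(np.interp(quantile, positions, data))


def yearly_weighted_quantiles(
    df: pd.DataFrame, data_col: str, weight_col: str, quantile: float
) -> list[float]:
    """Return one weighted quantile per year found in df, in year order."""
    if "report_year" in df.columns:
        years = df["report_year"]
    else:
        years = pd.to_datetime(df["report_date"]).dt.year
    clean = df.assign(_year=years).dropna(subset=[data_col, weight_col])
    return [
        _weighted_quantile(group[data_col], group[weight_col], quantile)
        for _, group in clean.groupby("_year")
    ]


deliveries = pd.DataFrame(
    {
        "report_year": [2020, 2020, 2021, 2021],
        "heat_content": [20.0, 22.0, 24.0, 26.0],
        "fuel_delivered": [1.0, 1.0, 1.0, 3.0],
    }
)
medians = yearly_weighted_quantiles(deliveries, "heat_content", "fuel_delivered", 0.5)
```

The result has one entry per year. The spread of those entries is what gives a sense of the historical range for that quantile.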

pudl.validate.vs_bounds(df, data_col, weight_col, query='', title='', low_q=False, low_bound=False, hi_q=False, hi_bound=False)[source]#

Test a distribution against an upper bound, lower bound, or both.
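A bounds test of this kind can be pictured as comparing weighted quantiles of the selected records with fixed limits. The sketch below is an assumption-laden illustration: the helper names are hypothetical, failures are returned as a list rather than raised, and unset bounds are represented by `None` rather than the `False` defaults in the signature above.

```python
import numpy as np
import pandas as pd


def _weighted_quantile(data, weights, quantile: float) -> float:
    """Interpolated weighted quantile; assumes positive weights and no NaNs."""
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(data)
    data, weights = data[order], weights[order]
    positions = (np.cumsum(weights) - 0.5 * weights) / weights.sum()
    return float(np.interp(quantile, positions, data))


def bound_failures(df, data_col, weight_col, query="",
                   low_q=None, low_bound=None, hi_q=None, hi_bound=None):
    """Return a message for each bound that the weighted distribution violates."""
    if query:
        df = df.query(query)  # select the records under test
    failures = []
    if low_q is not None and low_bound is not None:
        low_value = _weighted_quantile(df[data_col], df[weight_col], low_q)
        if low_value < low_bound:
            failures.append(f"{low_q:.0%} quantile {low_value:.2f} < {low_bound}")
    if hi_q is not None and hi_bound is not None:
        hi_value = _weighted_quantile(df[data_col], df[weight_col], hi_q)
        if hi_value > hi_bound:
            failures.append(f"{hi_q:.0%} quantile {hi_value:.2f} > {hi_bound}")
    return failures


deliveries = pd.DataFrame(
    {
        "fuel": ["coal", "coal", "coal", "gas"],
        "heat_content": [20.0, 24.0, 26.0, 1.03],
        "fuel_delivered": [100.0, 100.0, 800.0, 500.0],
    }
)
common = dict(data_col="heat_content", weight_col="fuel_delivered", query="fuel == 'coal'")

passing = bound_failures(deliveries, low_q=0.5, low_bound=22.0, hi_q=0.5, hi_bound=27.0, **common)
failing = bound_failures(deliveries, low_q=0.5, low_bound=22.0, hi_q=0.5, hi_bound=25.0, **common)
```

The weighted median of the coal records is about 25.6, so it sits inside the first pair of bounds and above the tighter upper bound of 25.0.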

pudl.validate.vs_self(df, data_col, weight_col, query='', title='', low_q=0.05, mid_q=0.5, hi_q=0.95)[source]#

Test a distribution against its own historical range.

This is a special case of the pudl.validate.vs_historical() function, in which both the orig_df and test_df are the same. Mostly it helps ensure that the test itself is valid for the given distribution.

pudl.validate.vs_historical(orig_df, test_df, data_col, weight_col, query='', low_q=0.05, mid_q=0.5, hi_q=0.95, title='')[source]#

Validate aggregated distributions against original data.

pudl.validate.bounds_histogram(df, data_col, weight_col, query, low_q, hi_q, low_bound, hi_bound, title='')[source]#

Plot a weighted histogram showing acceptable bounds/actual values.

pudl.validate.historical_histogram(orig_df, test_df, data_col, weight_col, query='', low_q=0.05, mid_q=0.5, hi_q=0.95, low_bound=None, hi_bound=None, title='')[source]#

Weighted histogram comparing distribution with historical subsamples.

pudl.validate.plot_vs_bounds(df, validation_cases)[source]#

Run through a data validation based on absolute bounds.

pudl.validate.plot_vs_self(df, validation_cases)[source]#

Validate a bunch of distributions against themselves.

pudl.validate.plot_vs_agg(orig_df, agg_df, validation_cases)[source]#

Validate a bunch of distributions against aggregated versions.

pudl.validate.core_ferc1__yearly_steam_plants_sched402_capacity[source]#
pudl.validate.core_ferc1__yearly_steam_plants_sched402_expenses[source]#
pudl.validate.core_ferc1__yearly_steam_plants_sched402_capacity_ratios[source]#
pudl.validate.core_ferc1__yearly_steam_plants_sched402_connected_hours[source]#
pudl.validate.core_ferc1__yearly_steam_plants_sched402_self[source]#
pudl.validate.fuel_ferc1_self[source]#
pudl.validate.fuel_ferc1_coal_mmbtu_per_unit_bounds[source]#
pudl.validate.fuel_ferc1_oil_mmbtu_per_unit_bounds[source]#
pudl.validate.fuel_ferc1_gas_mmbtu_per_unit_bounds[source]#
pudl.validate.fuel_ferc1_coal_cost_per_mmbtu_bounds[source]#
pudl.validate.fuel_ferc1_oil_cost_per_mmbtu_bounds[source]#
pudl.validate.fuel_ferc1_gas_cost_per_mmbtu_bounds[source]#
pudl.validate.fuel_ferc1_coal_cost_per_unit_bounds[source]#
pudl.validate.fuel_ferc1_oil_cost_per_unit_bounds[source]#
pudl.validate.fuel_ferc1_gas_cost_per_unit_bounds[source]#
pudl.validate.fbp_ferc1_self[source]#
pudl.validate.fbp_ferc1_gas_cost_per_mmbtu_bounds[source]#
pudl.validate.fbp_ferc1_oil_cost_per_mmbtu_bounds[source]#
pudl.validate.fbp_ferc1_coal_cost_per_mmbtu_bounds[source]#
pudl.validate.gf_eia923_coal_heat_content[source]#

Valid coal heat content values (all coal types).

The Generation Fuel table does not break different coal types out separately, so we can only test the validity of the entire suite of coal records.

pudl.validate.gf_eia923_gas_heat_content[source]#

Valid natural gas heat content values.

Focuses on natural gas proper. Lower bound excludes other types of gaseous fuels intentionally.

pudl.validate.gf_eia923_oil_heat_content[source]#

Valid petroleum based fuel heat content values.

Based on historically reported values in EIA 923 Fuel Receipts and Costs.

pudl.validate.gf_eia923_agg[source]#

EIA923 Generation Fuel data validation against aggregated data.

pudl.validate.bf_eia923_coal_heat_content[source]#

Valid coal (bituminous, sub-bituminous, and lignite) heat content values.

pudl.validate.bf_eia923_oil_heat_content[source]#

Valid petroleum based fuel heat content values.

Based on historically reported values in EIA 923 Fuel Receipts and Costs.

pudl.validate.bf_eia923_gas_heat_content[source]#

Valid natural gas heat content values.

Based on historically reported values in EIA 923 Fuel Receipts and Costs. May fail because of a population of bad data around 0.1 mmbtu/unit. This appears to be an off-by-10x error, possibly due to a reporting error in the units used.

pudl.validate.bf_eia923_coal_ash_content[source]#

Valid coal ash content (%).

Based on historical reporting in EIA 923.

pudl.validate.bf_eia923_coal_sulfur_content[source]#

Valid coal sulfur content values.

Based on historically reported values in EIA 923 Fuel Receipts and Costs.

pudl.validate.bf_eia923_self[source]#

EIA923 Boiler Fuel data validation against itself.

pudl.validate.bf_eia923_agg[source]#

EIA923 Boiler Fuel data validation against aggregated data.

pudl.validate.frc_eia923_coal_ant_heat_content[source]#

Check for reasonable anthracite coal heat content.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_coal_bit_heat_content[source]#

Check for reasonable bituminous coal heat content.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_coal_sub_heat_content[source]#

Check for reasonable sub-bituminous coal heat content.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_coal_lig_heat_content[source]#

Check for reasonable lignite coal heat content.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_coal_cc_heat_content[source]#

Check for reasonable refined coal heat content.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_coal_wc_heat_content[source]#

Check for reasonable waste coal heat content.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_oil_dfo_heat_content[source]#

Check for reasonable diesel fuel oil heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_gas_sgc_heat_content[source]#

Check for reasonable coal syngas heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_oil_jf_heat_content[source]#

Check for reasonable jet fuel heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_oil_ker_heat_content[source]#

Check for reasonable kerosene heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_petcoke_heat_content[source]#

Check for reasonable petroleum coke heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_rfo_heat_content[source]#

Check for reasonable residual fuel oil heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_propane_heat_content[source]#

Check for reasonable propane heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_petcoke_syngas_heat_content[source]#

Check for reasonable petcoke syngas heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_waste_oil_heat_content[source]#

Check for reasonable waste oil heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_blast_furnace_gas_heat_content[source]#

Check for reasonable blast furnace gas heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_natural_gas_heat_content[source]#

Check for reasonable natural gas heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_other_gas_heat_content[source]#

Check for reasonable other gas heat contents.

Based on values given in the EIA 923 instructions, but with the lower bound set by the expected lower bound of heat content for blast furnace gas, since the data included “other” gases with values lower than the expected 0.32: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_ag_byproduct_heat_content[source]#

Check for reasonable agricultural byproduct heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_muni_solids_heat_content[source]#

Check for reasonable municipal solid waste heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_biomass_solids_heat_content[source]#

Check for reasonable other biomass solids heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_wood_solids_heat_content[source]#

Check for reasonable wood solids heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_biomass_liquids_heat_content[source]#

Check for reasonable other biomass liquids heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_sludge_heat_content[source]#

Check for reasonable sludge waste heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_black_liquor_heat_content[source]#

Check for reasonable black liquor heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_wood_liquids_heat_content[source]#

Check for reasonable wood waste liquids heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_landfill_gas_heat_content[source]#

Check for reasonable landfill gas heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_biomass_gas_heat_content[source]#

Check for reasonable other biomass gas heat contents.

Based on values given in the EIA 923 instructions: https://www.eia.gov/survey/form/eia_923/instructions.pdf

pudl.validate.frc_eia923_coal_ash_content[source]#

Valid coal ash content (%).

Based on historical reporting in EIA 923.

pudl.validate.frc_eia923_coal_sulfur_content[source]#

Valid coal sulfur content values.

Based on historically reported values in EIA 923 Fuel Receipts and Costs.

pudl.validate.frc_eia923_coal_mercury_content[source]#

Valid coal mercury content limits.

Based on USGS FS095-01 https://pubs.usgs.gov/fs/fs095-01/fs095-01.html

The upper tail may fail because of a population of extremely high mercury content coal (9.0 ppm), which is likely a reporting error.

pudl.validate.frc_eia923_coal_moisture_content[source]#

Valid coal moisture content, based on historical EIA 923 reporting.

pudl.validate.frc_eia923_self[source]#

EIA923 fuel receipts & costs data validation against itself.

pudl.validate.frc_eia923_agg[source]#

EIA923 fuel receipts & costs data validation against aggregated data.

pudl.validate.mcoe_gas_capacity_factor[source]#

Static constraints on natural gas generator capacity factors.

pudl.validate.mcoe_coal_capacity_factor[source]#

Static constraints on coal fired generator capacity factors.

pudl.validate.mcoe_gas_heat_rate[source]#

Static constraints on gas fired generator heat rates.

pudl.validate.mcoe_coal_heat_rate[source]#

Static constraints on coal fired generator heat rates.

pudl.validate.mcoe_fuel_cost_per_mwh[source]#

Static constraints on fuel costs per MWh net generation.

pudl.validate.mcoe_fuel_cost_per_mmbtu[source]#

Static constraints on fuel costs per mmbtu of fuel consumed.

pudl.validate.mcoe_self_fuel_cost_per_mmbtu[source]#
pudl.validate.mcoe_self_fuel_cost_per_mwh[source]#
pudl.validate.mcoe_self[source]#
pudl.validate.gens_eia860_vs_bound[source]#
pudl.validate.gens_eia860_self[source]#