pudl.extract.eia923

Retrieves data from EIA Form 923 spreadsheets for analysis.

This module pulls data from EIA’s published Excel spreadsheets.

This code is intended for analyzing EIA Form 923 data. Currently, only years 2009-2016 work, as they share nearly identical file formatting.

Module Contents

Classes

Extractor

Extractor for EIA Form 923.

Attributes

logger

pudl.extract.eia923.logger[source]
class pudl.extract.eia923.Extractor(*args, **kwargs)[source]

Bases: pudl.extract.excel.GenericExtractor

Extractor for EIA Form 923.

process_raw(self, df, page, **partition)[source]

Drops reserved columns.
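
As a rough illustration of what dropping reserved columns can look like, the sketch below filters out any column whose name starts with `reserved`. The helper `drop_reserved_columns`, the prefix match, and the column names are assumptions made for illustration, not the module's implementation. A list of plain dictionaries stands in for a DataFrame so the example runs without pandas.

```python
def drop_reserved_columns(rows, prefix="reserved"):
    """Return copies of rows without keys that start with `prefix`.

    Hypothetical helper: the real method works on a pandas DataFrame.
    """
    return [
        {name: value for name, value in row.items() if not name.startswith(prefix)}
        for row in rows
    ]


# Made-up column names, loosely modeled on a spreadsheet page.
raw_rows = [
    {"plant_id_eia": 3, "reserved_1": None, "net_generation_mwh": 120.5},
    {"plant_id_eia": 7, "reserved_1": None, "net_generation_mwh": 98.0},
]
cleaned_rows = drop_reserved_columns(raw_rows)
print(cleaned_rows[0])
```

The input rows are copied rather than modified, so the raw data stays available for inspection.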

extract(self, settings: pudl.settings.Eia923Settings = Eia923Settings())[source]

Extracts dataframes.

Returns a dict whose keys are page names and whose values are DataFrames containing data across the given years.

Parameters

settings – Object containing validated settings relevant to EIA 923. Contains the tables and years to be loaded into PUDL.
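
The return shape described for `extract` can be sketched as follows. This is a minimal stand-in under stated assumptions: `extract_pages` and `fake_load_page` are hypothetical, the page names are examples, and lists of dictionaries replace DataFrames. It does not call PUDL.

```python
def extract_pages(page_names, years, load_page):
    """Build {page name: rows from every requested year}.

    `load_page(page, year)` is a hypothetical callable that returns the
    rows for one page of one year's spreadsheet.
    """
    result = {}
    for page in page_names:
        combined = []
        for year in years:
            combined.extend(load_page(page, year))
        result[page] = combined
    return result


def fake_load_page(page, year):
    # Stand-in for reading a spreadsheet: one made-up row per page and year.
    return [{"page": page, "report_year": year}]


pages = extract_pages(["generation_fuel", "stocks"], [2015, 2016], fake_load_page)
print(sorted(pages))
print(len(pages["stocks"]))
```

Each value holds the rows from all requested years for one page, which mirrors the "data across given years" wording.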

static process_renamed(df, page, **partition)[source]

Cleans up the unnamed_0 column in the stocks page and drops rows with an invalid plant_id_eia.
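
A sketch of these two steps, with the assumptions marked. The replacement column name `census_division_and_state`, the IDs in `PLACEHOLDER_PLANT_IDS`, and the helper `clean_renamed_rows` are illustrative guesses: this page does not say what the column is renamed to or which plant IDs count as invalid. Plain dictionaries again stand in for DataFrame rows.

```python
# Assumed placeholder IDs; the documentation does not list the invalid values.
PLACEHOLDER_PLANT_IDS = {99999, 999999}


def clean_renamed_rows(rows, page):
    """Hypothetical stand-in for the cleanup applied after column renaming."""
    cleaned = []
    for row in rows:
        if row.get("plant_id_eia") in PLACEHOLDER_PLANT_IDS:
            continue  # skip rows whose plant ID is an assumed placeholder
        row = dict(row)  # copy so the caller's data is left untouched
        if page == "stocks" and "unnamed_0" in row:
            # Assumed new name for the unlabeled spreadsheet column.
            row["census_division_and_state"] = row.pop("unnamed_0")
        cleaned.append(row)
    return cleaned


stocks = clean_renamed_rows([{"unnamed_0": "New England", "coal": 10}], "stocks")
plants = clean_renamed_rows(
    [{"plant_id_eia": 3}, {"plant_id_eia": 99999}], "generation_fuel"
)
print(stocks)
print(plants)
```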

static process_final_page(df, page)[source]

Removes reserved columns from the final dataframe.

static get_dtypes(page, **partition)[source]

Returns dtypes for plant id columns and county FIPS column.