pudl.extract.epacems module
Retrieve data from EPA CEMS hourly zipped CSVs.
This module pulls data from EPA’s published CSV files.
- pudl.extract.epacems.extract(epacems_years, states, data_dir)
Coordinate the extraction of EPA CEMS hourly DataFrames.
- Parameters
- Yields
dict – a dictionary with a single EPA CEMS tabular data resource name as the key, of the form “hourly_emissions_epacems_YEAR_STATE”, where YEAR is a 4-digit number and STATE is a lowercase 2-letter code for a US state. The value is a pandas.DataFrame containing all the raw EPA CEMS hourly emissions data for the indicated state and year.
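The key format described above can be illustrated with a short sketch. The helper names below (resource_name, parse_resource_name, collect) are hypothetical and are not part of PUDL; they only show how a key of the form “hourly_emissions_epacems_YEAR_STATE” might be built and parsed, and how a caller might merge the single-key dictionaries that a generator like extract yields.

```python
from typing import Any, Dict, Iterable, Tuple

PREFIX = "hourly_emissions_epacems_"


def resource_name(year: int, state: str) -> str:
    """Build a key of the documented form (hypothetical helper, not PUDL's)."""
    return f"{PREFIX}{year:04d}_{state.lower()}"


def parse_resource_name(name: str) -> Tuple[int, str]:
    """Split a key back into its (year, state) parts."""
    if not name.startswith(PREFIX):
        raise ValueError(f"unexpected resource name: {name!r}")
    year, state = name[len(PREFIX):].split("_")
    return int(year), state


def collect(partials: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge an iterable of single-key dicts into one dict."""
    merged: Dict[str, Any] = {}
    for partial in partials:
        merged.update(partial)
    return merged


# Stand-in for the yielded dicts; the values would be DataFrames in practice.
yielded = ({resource_name(2018, st): f"data for {st}"} for st in ("ID", "CO"))
tables = collect(yielded)
```

Merging into a single dictionary holds every table in memory at once, so for many states and years it may be preferable to process each yielded dictionary as it arrives.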
- pudl.extract.epacems.read_cems_csv(filename)
Read a CEMS CSV file, compressed or not, into a pandas.DataFrame.
Note that some columns are not read. See pudl.constants.epacems_columns_to_ignore. Data types for the columns are specified in pudl.constants.epacems_csv_dtypes and names of the output columns are set by pudl.constants.epacems_rename_dict.
- Parameters
filename (str) – The name of the file to be read
- Returns
A DataFrame containing the contents of the CSV file.
- Return type
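The read_cems_csv behaviour described above (skip some columns, apply fixed dtypes, rename the output columns) might look roughly like the following with pandas. This is a sketch, not PUDL's implementation: the three constants are hypothetical stand-ins for the pudl.constants objects, and the column names are placeholders, since the real ones are not listed here.

```python
import io

import pandas as pd

# Hypothetical stand-ins for epacems_columns_to_ignore, epacems_csv_dtypes
# and epacems_rename_dict; the real contents are not shown in this document.
COLUMNS_TO_IGNORE = {"FACILITY_NAME"}
CSV_DTYPES = {"STATE": str, "PLANT_CODE": int, "LOAD_MW": float}
RENAME_DICT = {"STATE": "state", "PLANT_CODE": "plant_id", "LOAD_MW": "load_mw"}


def read_csv_sketch(source):
    """Read a CSV from a path or file-like object into a DataFrame."""
    df = pd.read_csv(
        source,
        # A callable lets us drop columns by name without listing the rest.
        usecols=lambda col: col not in COLUMNS_TO_IGNORE,
        dtype=CSV_DTYPES,
    )
    return df.rename(columns=RENAME_DICT)


# In-memory sample so the sketch runs without any files on disk.
sample = io.StringIO(
    "STATE,FACILITY_NAME,PLANT_CODE,LOAD_MW\n"
    "ID,Example Plant,1,12.5\n"
)
df = read_csv_sketch(sample)
```

When given a file path, pandas.read_csv infers compression from the extension by default (for example .zip or .gz), which is one way a single call can handle both compressed and uncompressed files.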