pudl.extract.vcerare

Extract VCE Resource Adequacy Renewable Energy (RARE) Power Dataset.

This dataset has thousands of columns, so we don’t manually specify column renames on import; these columns are instead pivoted into a single column in the transform step. We adapt the standard extraction infrastructure to simply read in the data.

Each annual zip folder contains a folder with three files: Wind_Power_140m_Offshore_county.csv, Wind_Power_100m_Onshore_county.csv, and Fixed_SolarPV_Lat_UPV_county.csv.

The drive also contains one more CSV file: vce_county_lat_long_fips_table.csv. This gets read in when the fips partition is set to True.
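To illustrate why no per-column renames are specified at extraction, the sketch below reshapes a small wide table into long format with pandas. The toy column names (hour_of_year, county_a, county_b) and the output column names are assumptions made for this example; they are not the actual VCE RARE schema, and this is not the PUDL transform code.

```python
import pandas as pd

# Toy stand-in for a wide raw table: one column per county (names assumed).
wide = pd.DataFrame(
    {
        "hour_of_year": [1, 2],
        "county_a": [0.10, 0.20],
        "county_b": [0.30, 0.40],
    }
)

# Pivot the per-county columns into a single column of county labels,
# so that only one column ever needs a fixed name.
long = wide.melt(
    id_vars=["hour_of_year"],
    var_name="county",
    value_name="capacity_factor",
)
print(long)
```

With thousands of county columns, reshaping like this means the extraction step can leave the raw headers untouched.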

Attributes

Classes

VCERareMetadata

Special metadata class for VCE RARE Power Dataset.

Extractor

Extractor for VCE RARE Power Dataset.

Functions

raw_vcerare_asset_factory(→ dagster.AssetsDefinition)

An asset factory for VCE RARE Power Dataset.

raw_vcerare__lat_lon_fips(→ pandas.DataFrame)

Extract lat/lon to FIPS and county mapping CSV.

Module Contents

pudl.extract.vcerare.logger
pudl.extract.vcerare.VCERARE_PAGES = ['offshore_wind_power_140m', 'onshore_wind_power_100m', 'fixed_solar_pv_lat_upv']
class pudl.extract.vcerare.VCERareMetadata(*args, **kwargs)

Bases: pudl.extract.extractor.GenericMetadata

Special metadata class for VCE RARE Power Dataset.

_file_name
_load_column_maps(column_map_pkg) → dict

There are no column maps to load, so return an empty dictionary.

get_all_pages() → list[str]

Hard code the page names, which usually are pulled from column rename spreadsheets.

get_file_name(page, **partition)

Return the file name for the given page and partition.

class pudl.extract.vcerare.Extractor(*args, **kwargs)

Bases: pudl.extract.csv.CsvExtractor

Extractor for VCE RARE Power Dataset.

METADATA

Instance of metadata object to use with this extractor.

get_column_map(page, **partition)

Return an empty dictionary, because we don’t rename the columns in these files.

source_filename(page: str, **partition: pudl.extract.extractor.PartitionSelection) → str

Produce the CSV file name as it will appear in the archive.

Inside the zipfile, the files are nested in an additional folder named for the year, so we prefix the source filename with a folder based on the yearly partition.

Parameters:
  • page – PUDL name for the dataset contents, e.g. “boiler_generator_assn” or “coal_stocks”

  • partition – partition to load. Examples: {‘year’: 2009} or {‘year_month’: ‘2020-08’}

Returns:

string name of the CSV file
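The folder prefix described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the PAGE_TO_FILE mapping is inferred from the file names in the module description, the helper name sketch_source_filename is hypothetical, and the exact folder naming inside the archive (a bare year followed by a slash) is an assumption.

```python
# Hypothetical page-to-file mapping, inferred from the file names listed in
# the module description; the real lookup is handled by VCERareMetadata.
PAGE_TO_FILE = {
    "offshore_wind_power_140m": "Wind_Power_140m_Offshore_county.csv",
    "onshore_wind_power_100m": "Wind_Power_100m_Onshore_county.csv",
    "fixed_solar_pv_lat_upv": "Fixed_SolarPV_Lat_UPV_county.csv",
}


def sketch_source_filename(page: str, **partition) -> str:
    """Prefix the CSV name with a folder named after the year partition."""
    # Assumes the nested folder is named exactly after the year.
    return f"{partition['year']}/{PAGE_TO_FILE[page]}"


# The year used here is purely illustrative.
print(sketch_source_filename("onshore_wind_power_100m", year=2020))
```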

load_source(page: str, **partition: pudl.extract.extractor.PartitionSelection) → pandas.DataFrame

Produce the dataframe object for the given partition.

Parameters:
  • page – PUDL name for the dataset contents, e.g. “boiler_generator_assn” or “data”

  • partition – partition to load. Examples: {‘year’: 2009} or {‘year_month’: ‘2020-08’}

Returns:

pd.DataFrame instance containing CSV data

process_raw(df: pandas.DataFrame, page: str, **partition: pudl.extract.extractor.PartitionSelection) → pandas.DataFrame

Append report year to df to distinguish data from other years.
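A minimal sketch of this step, assuming the year is stored in a column named report_year and taken from the year partition. Both the column name and the helper name sketch_process_raw are assumptions for illustration.

```python
import pandas as pd


def sketch_process_raw(df: pd.DataFrame, page: str, **partition) -> pd.DataFrame:
    """Return a copy of df tagged with the year it was extracted for."""
    # "report_year" is an assumed column name for this illustration.
    return df.assign(report_year=partition["year"])


# Toy input; the county column name and the year are illustrative only.
raw = pd.DataFrame({"county_a": [0.1, 0.2]})
tagged = sketch_process_raw(raw, "fixed_solar_pv_lat_upv", year=2020)
print(tagged)
```

Using assign returns a new dataframe, so the raw input is left unmodified.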

validate(df: pandas.DataFrame, page: str, **partition: pudl.extract.extractor.PartitionSelection) → pandas.DataFrame

Skip this step, as we aren’t renaming any columns.

combine(dfs: list[pandas.DataFrame], page: str) → pandas.DataFrame

Concatenate dataframes into one and take any special steps for processing the final page.
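The concatenation can be sketched with pandas.concat. The helper name sketch_combine, the toy column names, and the choice to reset the index with ignore_index=True are assumptions for illustration; the actual method may handle the index or apply additional page-specific steps differently.

```python
import pandas as pd


def sketch_combine(dfs: list[pd.DataFrame], page: str) -> pd.DataFrame:
    """Stack per-partition dataframes for one page into a single dataframe."""
    # ignore_index=True gives the result a fresh 0..n-1 index (an assumption).
    return pd.concat(dfs, ignore_index=True)


# Toy per-year dataframes; column names and years are illustrative only.
df_2019 = pd.DataFrame({"county_a": [0.1], "report_year": [2019]})
df_2020 = pd.DataFrame({"county_a": [0.2], "report_year": [2020]})
combined = sketch_combine([df_2019, df_2020], "fixed_solar_pv_lat_upv")
print(combined)
```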

pudl.extract.vcerare.raw_vcerare__all_dfs
pudl.extract.vcerare.raw_vcerare_asset_factory(part: str) → dagster.AssetsDefinition

An asset factory for VCE RARE Power Dataset.

pudl.extract.vcerare.raw_vcerare_assets
pudl.extract.vcerare.raw_vcerare__lat_lon_fips(context) → pandas.DataFrame

Extract lat/lon to FIPS and county mapping CSV.

This dataframe is static, so it has a distinct partition from the other datasets and its extraction is controlled by a boolean in the ETL run.