pudl.extract.vcerare
Extract VCE Resource Adequacy Renewable Energy (RARE) Power Dataset.
This dataset has thousands of columns, so we don’t manually specify a rename on import; the column names are instead pivoted into a single column in the transform step. We adapt the standard extraction infrastructure to simply read in the data.
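To illustrate the kind of pivot described above, here is a minimal sketch using `pandas.DataFrame.melt`. The column names (`hour_of_year`, `county_a`, `county_b`, `capacity_factor`) are hypothetical placeholders and not taken from the dataset; the actual transform step may work differently.

```python
import pandas as pd

# Hypothetical wide frame: one row per hour, one column per county.
wide_df = pd.DataFrame(
    {
        "hour_of_year": [1, 2],
        "county_a": [0.10, 0.20],
        "county_b": [0.30, 0.40],
    }
)

# Pivot the county columns into one label column plus one value column.
long_df = wide_df.melt(
    id_vars=["hour_of_year"],
    var_name="county",
    value_name="capacity_factor",
)
```

With thousands of county columns, this reshaping avoids maintaining a rename map for every column at extraction time.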
Each annual zip folder contains a folder with three files:
- Wind_Power_140m_Offshore_county.csv
- Wind_Power_100m_Onshore_county.csv
- Fixed_SolarPV_Lat_UPV_county.csv
The drive also contains one more CSV file: vce_county_lat_long_fips_table.csv. This gets read in when the fips partition is set to True.
Attributes
VCERARE_PAGES |
Classes
VCERareMetadata | Special metadata class for VCE RARE Power Dataset.
Extractor | Extractor for VCE RARE Power Dataset.
Functions
raw_vcerare_asset_factory(part) | An asset factory for VCE RARE Power Dataset.
raw_vcerare__lat_lon_fips(context) | Extract lat/lon to FIPS and county mapping CSV.
Module Contents
- pudl.extract.vcerare.VCERARE_PAGES = ['offshore_wind_power_140m', 'onshore_wind_power_100m', 'fixed_solar_pv_lat_upv']
- class pudl.extract.vcerare.VCERareMetadata(*args, **kwargs)
Bases: pudl.extract.extractor.GenericMetadata
Special metadata class for VCE RARE Power Dataset.
- _load_column_maps(column_map_pkg) → dict
There are no column maps to load, so return an empty dictionary.
- class pudl.extract.vcerare.Extractor(*args, **kwargs)
Bases: pudl.extract.csv.CsvExtractor
Extractor for VCE RARE Power Dataset.
- source_filename(page: str, **partition: pudl.extract.extractor.PartitionSelection) → str
Produce the CSV file name as it will appear in the archive.
Inside the zipfile, the files are nested in an additional folder named after the year, so we prefix the source filename with a folder based on the yearly partition.
- Parameters:
page – PUDL name for the dataset contents, e.g. “boiler_generator_assn” or “coal_stocks”
partition – partition to load. Examples: {‘year’: 2009} or {‘year_month’: ‘2020-08’}
- Returns:
string name of the CSV file
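The year-folder prefix described above can be sketched as follows. This is an illustration only: the function name `nested_source_filename` and the `FILE_NAMES` mapping are hypothetical, with the page-to-file pairing inferred from the file names listed in the module description rather than taken from the actual metadata.

```python
# Assumed page-to-file mapping, inferred from the file names in the module
# description. The real extractor looks file names up through its metadata.
FILE_NAMES = {
    "offshore_wind_power_140m": "Wind_Power_140m_Offshore_county.csv",
    "onshore_wind_power_100m": "Wind_Power_100m_Onshore_county.csv",
    "fixed_solar_pv_lat_upv": "Fixed_SolarPV_Lat_UPV_county.csv",
}


def nested_source_filename(page: str, **partition) -> str:
    """Prefix the CSV name with the year folder it sits in inside the zipfile."""
    return f"{partition['year']}/{FILE_NAMES[page]}"


example_path = nested_source_filename("onshore_wind_power_100m", year=2020)
```

Here `example_path` is the archive member path for the hypothetical 2020 partition, combining the year folder and the CSV name.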
- load_source(page: str, **partition: pudl.extract.extractor.PartitionSelection) → pandas.DataFrame
Produce the dataframe object for the given partition.
- Parameters:
page – PUDL name for the dataset contents, e.g. “boiler_generator_assn” or “data”
partition – partition to load. Examples: {‘year’: 2009} or {‘year_month’: ‘2020-08’}
- Returns:
pd.DataFrame instance containing CSV data
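As a rough illustration of reading one nested CSV out of a zip archive into a dataframe, the sketch below uses the standard-library `zipfile` module with `pandas.read_csv`. The helper `read_csv_from_zip`, the member name `2020/example.csv`, and its contents are all hypothetical; this is not the extractor's actual implementation.

```python
import io
import zipfile

import pandas as pd


def read_csv_from_zip(archive: zipfile.ZipFile, member: str) -> pd.DataFrame:
    """Read one CSV member of an open zip archive into a dataframe."""
    with archive.open(member) as csv_file:
        return pd.read_csv(csv_file)


# Build a tiny in-memory archive with made-up contents for demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("2020/example.csv", "hour,county_a\n1,0.1\n2,0.2\n")

# Re-open the archive for reading and load the nested CSV.
with zipfile.ZipFile(buffer) as zf:
    demo_df = read_csv_from_zip(zf, "2020/example.csv")
```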
- process_raw(df: pandas.DataFrame, page: str, **partition: pudl.extract.extractor.PartitionSelection) → pandas.DataFrame
Append report year to df to distinguish data from other years.
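A minimal sketch of tagging a dataframe with its report year is shown below. The function name `append_report_year` and the column name `report_year` are assumptions made for illustration; the input frame is made up.

```python
import pandas as pd


def append_report_year(df: pd.DataFrame, **partition) -> pd.DataFrame:
    """Return a copy of df with the partition's year added as a column."""
    # "report_year" is an assumed column name for this sketch.
    return df.assign(report_year=partition["year"])


raw_df = pd.DataFrame({"county_a": [0.1, 0.2]})
tagged_df = append_report_year(raw_df, year=2020)
```

Because `assign` returns a new dataframe, `raw_df` is left unchanged and `tagged_df` carries the extra column.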
- validate(df: pandas.DataFrame, page: str, **partition: pudl.extract.extractor.PartitionSelection) → pandas.DataFrame
Skip this step, as we aren’t renaming any columns.
- combine(dfs: list[pandas.DataFrame], page: str) → pandas.DataFrame
Concatenate the dataframes into one and take any special steps needed to process the final page.
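The concatenation step might look like the sketch below, which stacks per-year frames with `pandas.concat`. The function name `concat_years`, the column names, and the sample data are hypothetical, and any page-specific processing the real method performs is not shown.

```python
import pandas as pd


def concat_years(dfs: list[pd.DataFrame]) -> pd.DataFrame:
    """Stack per-partition dataframes into one with a fresh integer index."""
    return pd.concat(dfs, ignore_index=True)


# Made-up per-year frames, each already tagged with an assumed report_year.
df_2019 = pd.DataFrame({"county_a": [0.1], "report_year": [2019]})
df_2020 = pd.DataFrame({"county_a": [0.2], "report_year": [2020]})
combined_df = concat_years([df_2019, df_2020])
```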
- pudl.extract.vcerare.raw_vcerare_asset_factory(part: str) → dagster.AssetsDefinition
An asset factory for VCE RARE Power Dataset.
- pudl.extract.vcerare.raw_vcerare__lat_lon_fips(context) → pandas.DataFrame
Extract lat/lon to FIPS and county mapping CSV.
This dataframe is static, so it has a distinct partition from the other datasets and its extraction is controlled by a boolean in the ETL run.