pudl.settings
#
Module for validating pudl etl settings.
Module Contents#
Classes#
XbrlFormNumber | Contains full list of supported FERC XBRL forms. |
FrozenBaseModel | BaseModel with global configuration. |
GenericDatasetSettings | An abstract pydantic model for generic datasets. |
Ferc1Settings | An immutable pydantic model to validate Ferc1Settings. |
Ferc714Settings | An immutable pydantic model to validate Ferc714Settings. |
EpaCemsSettings | An immutable pydantic model to validate EPA CEMS settings. |
PhmsaGasSettings | An immutable pydantic model to validate PHMSA settings. |
Eia923Settings | An immutable pydantic model to validate EIA 923 settings. |
Eia861Settings | An immutable pydantic model to validate EIA 861 settings. |
Eia860Settings | An immutable pydantic model to validate EIA 860 settings. |
Eia860mSettings | An immutable pydantic model to validate EIA 860m settings. |
GlueSettings | An immutable pydantic model to validate Glue settings. |
EiaSettings | An immutable pydantic model to validate EIA datasets settings. |
DatasetsSettings | An immutable pydantic model to validate PUDL Dataset settings. |
Ferc1DbfToSqliteSettings | An immutable Pydantic model to validate FERC 1 to SQLite settings. |
FercGenericXbrlToSqliteSettings | An immutable pydantic model to validate Ferc1 to SQLite settings. |
Ferc1XbrlToSqliteSettings | An immutable pydantic model to validate Ferc1 to SQLite settings. |
Ferc2XbrlToSqliteSettings | An immutable pydantic model to validate FERC Form 2 XBRL to SQLite settings. |
Ferc2DbfToSqliteSettings | An immutable Pydantic model to validate FERC 2 to SQLite settings. |
Ferc6DbfToSqliteSettings | An immutable Pydantic model to validate FERC 6 to SQLite settings. |
Ferc6XbrlToSqliteSettings | An immutable pydantic model to validate FERC Form 6 XBRL to SQLite settings. |
Ferc60DbfToSqliteSettings | An immutable Pydantic model to validate FERC 60 to SQLite settings. |
Ferc60XbrlToSqliteSettings | An immutable pydantic model to validate FERC Form 60 XBRL to SQLite settings. |
Ferc714XbrlToSqliteSettings | An immutable pydantic model to validate FERC Form 714 XBRL to SQLite settings. |
FercToSqliteSettings | An immutable pydantic model to validate FERC XBRL to SQLite settings. |
EtlSettings | Main settings validation class. |
Functions#
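_convert_settings_to_dagster_config(settings_dict) | Recursively convert a dictionary of dataset settings to dagster config in place. |
create_dagster_config(settings) | Create a dictionary of dagster config out of a GenericDatasetSettings. |
| Create a DOI URL out of a Zenodo DOI. |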
Attributes#
- class pudl.settings.XbrlFormNumber(*args, **kwds)[source]#
Bases:
enum.Enum
Contains full list of supported FERC XBRL forms.
- class pudl.settings.FrozenBaseModel(/, **data: Any)[source]#
Bases:
pydantic.BaseModel
BaseModel with global configuration.
- class pudl.settings.GenericDatasetSettings(/, **data: Any)[source]#
Bases:
FrozenBaseModel
An abstract pydantic model for generic datasets.
Each dataset must specify working partitions. A dataset can have an arbitrary number of partitions.
- Parameters:
disabled – if true, skip processing this dataset.
- property partitions: list[None | dict[str, str]][source]#
Return list of dictionaries representing individual partitions.
Convert a list of partitions into a list of dictionaries of partitions. This is intended to be used to store partitions in a format that is easy to use with pd.json_normalize.
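As a rough illustration of the shape this property returns, the sketch below expands a mapping of partition names to values into one dictionary per partition. The helper name `expand_partitions` and the example keys are hypothetical and not part of pudl.settings; the real property may build its list differently.

```python
from itertools import product
from typing import Any


def expand_partitions(parts: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Expand {"key": [values...]} into one dictionary per partition.

    Hypothetical helper for illustration only; not part of pudl.settings.
    """
    if not parts:
        return []
    keys = list(parts)
    # One dictionary per combination of partition values.
    return [dict(zip(keys, combo)) for combo in product(*(parts[k] for k in keys))]


records = expand_partitions({"year": [2020, 2021]})
print(records)  # [{'year': 2020}, {'year': 2021}]
```

A list of flat dictionaries like `records` is the kind of input that pd.json_normalize turns into one row per dictionary.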
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.Ferc1Settings(/, **data: Any)[source]#
Bases:
GenericDatasetSettings
An immutable pydantic model to validate Ferc1Settings.
- Parameters:
data_source – DataSource metadata object
years – list of years to validate.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.Ferc714Settings(/, **data: Any)[source]#
Bases:
GenericDatasetSettings
An immutable pydantic model to validate Ferc714Settings.
- Parameters:
data_source – DataSource metadata object
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.EpaCemsSettings(/, **data: Any)[source]#
Bases:
GenericDatasetSettings
An immutable pydantic model to validate EPA CEMS settings.
- Parameters:
data_source – DataSource metadata object
year_quarters – list of year_quarters to validate.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.PhmsaGasSettings(/, **data: Any)[source]#
Bases:
GenericDatasetSettings
An immutable pydantic model to validate PHMSA settings.
- Parameters:
data_source – DataSource metadata object
years – list of zipped data start years to validate.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.Eia923Settings(/, **data: Any)[source]#
Bases:
GenericDatasetSettings
An immutable pydantic model to validate EIA 923 settings.
- Parameters:
data_source – DataSource metadata object
years – list of years to validate.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.Eia861Settings(/, **data: Any)[source]#
Bases:
GenericDatasetSettings
An immutable pydantic model to validate EIA 861 settings.
- Parameters:
data_source – DataSource metadata object
years – list of years to validate.
transform_functions – list of transform functions to be applied to eia861
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.Eia860Settings(/, **data: Any)[source]#
Bases:
GenericDatasetSettings
An immutable pydantic model to validate EIA 860 settings.
This model also checks 860m settings.
- Parameters:
data_source – DataSource metadata object
years – list of years to validate.
eia860m – whether or not to incorporate an EIA-860m month.
eia860m_year_month (ClassVar[str]) – The 860m year-month to incorporate.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- classmethod check_eia860m_year_month(eia860m: bool) bool [source]#
Check that the 860m year is exactly one year after the most recent working 860 year.
- Parameters:
eia860m – True if 860m is requested.
- Returns:
True if 860m is requested.
- Return type:
bool
- Raises:
ValueError – the 860m date is within 860 working years.
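The check described above can be sketched as a stand-alone function. This is a simplified assumption about the logic, not the validator itself: the name `check_860m_year`, the explicit `year_month` and `working_years` arguments, and the "YYYY-MM" string format are all hypothetical.

```python
def check_860m_year(eia860m: bool, year_month: str, working_years: list[int]) -> bool:
    """Return eia860m unchanged if the 860m year follows the latest 860 year.

    Hypothetical stand-alone version; assumes year_month looks like "YYYY-MM".
    """
    if eia860m:
        year_860m = int(year_month.split("-")[0])
        expected = max(working_years) + 1
        if year_860m != expected:
            raise ValueError(
                f"EIA-860m year {year_860m} should be {expected}, one year "
                "after the most recent working EIA-860 year."
            )
    return eia860m


print(check_860m_year(True, "2023-06", [2021, 2022]))  # True
```

When `eia860m` is False the date is not inspected at all, which matches the idea that the check only matters if 860m data is requested.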
- class pudl.settings.Eia860mSettings(/, **data: Any)[source]#
Bases:
GenericDatasetSettings
An immutable pydantic model to validate EIA 860m settings.
- Parameters:
data_source – DataSource metadata object
year_months (ClassVar[str]) – The 860m year to date.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.GlueSettings(/, **data: Any)[source]#
Bases:
FrozenBaseModel
An immutable pydantic model to validate Glue settings.
- Parameters:
eia – Include eia in glue settings.
ferc1 – Include ferc1 in glue settings.
- class pudl.settings.EiaSettings(/, **data: Any)[source]#
Bases:
FrozenBaseModel
An immutable pydantic model to validate EIA datasets settings.
- Parameters:
eia860 – Immutable pydantic model to validate eia860 settings.
eia861 – Immutable pydantic model to validate eia861 settings.
eia923 – Immutable pydantic model to validate eia923 settings.
- eia860: Eia860Settings | None[source]#
- eia860m: Eia860mSettings | None[source]#
- eia861: Eia861Settings | None[source]#
- eia923: Eia923Settings | None[source]#
- classmethod default_load_all(data: dict[str, Any]) dict[str, Any] [source]#
If no datasets are specified default to all.
- class pudl.settings.DatasetsSettings(/, **data: Any)[source]#
Bases:
FrozenBaseModel
An immutable pydantic model to validate PUDL Dataset settings.
- Parameters:
ferc1 – Immutable pydantic model to validate ferc1 settings.
eia – Immutable pydantic model to validate eia(860, 923) settings.
glue – Immutable pydantic model to validate glue settings.
epacems – Immutable pydantic model to validate epacems settings.
- eia: EiaSettings | None[source]#
- epacems: EpaCemsSettings | None[source]#
- ferc1: Ferc1Settings | None[source]#
- ferc714: Ferc714Settings | None[source]#
- glue: GlueSettings | None[source]#
- phmsagas: PhmsaGasSettings | None[source]#
- classmethod default_load_all(data: dict[str, Any]) dict[str, Any] [source]#
If no datasets are specified default to all.
- Parameters:
data – dataset settings inputs.
- Returns:
Validated dataset settings inputs.
- classmethod add_glue_settings(data: dict[str, Any]) dict[str, Any] [source]#
Add glue settings if ferc1 and eia data are both requested.
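The two validators above can be pictured as plain functions over the raw input dictionary. The sketch below is an assumption about their behaviour, written without pydantic: the `DATASET_KEYS` tuple, the use of an empty dict to mean "default settings", and the boolean glue flags are illustrative choices, not the library's implementation.

```python
from typing import Any

# Keys mirror the DatasetsSettings attributes; glue is derived in a second
# step (an assumption made for this sketch).
DATASET_KEYS = ("eia", "epacems", "ferc1", "ferc714", "phmsagas")


def default_load_all(data: dict[str, Any]) -> dict[str, Any]:
    """If no dataset is requested, request all of them with default settings."""
    if any(data.get(key) is not None for key in DATASET_KEYS):
        return data
    # An empty dict stands in for "use this dataset's defaults".
    return {**data, **{key: {} for key in DATASET_KEYS}}


def add_glue_settings(data: dict[str, Any]) -> dict[str, Any]:
    """Add glue settings only when both ferc1 and eia are requested."""
    if data.get("ferc1") is not None and data.get("eia") is not None:
        return {**data, "glue": {"ferc1": True, "eia": True}}
    return data


settings = add_glue_settings(default_load_all({}))
print(sorted(settings))
```

Running the defaults step first means an empty input ends up with every dataset plus glue, while an input that names only ferc1 is left without glue.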
- make_datasources_table(ds: pudl.workspace.datastore.Datastore) pandas.DataFrame [source]#
Compile a table of dataset information.
There are three places we can look for information about a dataset:
* the datastore (for DOIs, working partitions, etc.)
* the ETL settings (for partitions that are used in the ETL)
* the DataSource info (which is stored within the ETL settings)
The ETL settings and the datastore have different levels of nesting, and therefore different names for datasets. The nesting happens particularly with the EIA data. There are three EIA datasets right now: eia923, eia860 and eia860m. eia860m is a monthly update of a few tables in the larger eia860 dataset.
- Parameters:
ds – An initialized PUDL Datastore from which the DOIs for each raw input dataset can be obtained.
- Returns:
A dataframe describing the partitions and DOIs of each of the datasets in this settings object.
- class pudl.settings.Ferc1DbfToSqliteSettings(/, **data: Any)[source]#
Bases:
GenericDatasetSettings
An immutable Pydantic model to validate FERC 1 to SQLite settings.
- Parameters:
years – List of years to validate.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.FercGenericXbrlToSqliteSettings(_case_sensitive: bool | None = None, _env_prefix: str | None = None, _env_file: pydantic_settings.sources.DotenvType | None = ENV_FILE_SENTINEL, _env_file_encoding: str | None = None, _env_ignore_empty: bool | None = None, _env_nested_delimiter: str | None = None, _env_parse_none_str: str | None = None, _secrets_dir: str | pathlib.Path | None = None, **values: Any)[source]#
Bases:
pydantic_settings.BaseSettings
An immutable pydantic model to validate Ferc1 to SQLite settings.
- Parameters:
taxonomy – URL of XBRL taxonomy used to create structure of SQLite DB.
years – list of years to validate.
disabled – if True, skip processing this dataset.
- class pudl.settings.Ferc1XbrlToSqliteSettings(_case_sensitive: bool | None = None, _env_prefix: str | None = None, _env_file: pydantic_settings.sources.DotenvType | None = ENV_FILE_SENTINEL, _env_file_encoding: str | None = None, _env_ignore_empty: bool | None = None, _env_nested_delimiter: str | None = None, _env_parse_none_str: str | None = None, _secrets_dir: str | pathlib.Path | None = None, **values: Any)[source]#
Bases:
FercGenericXbrlToSqliteSettings
An immutable pydantic model to validate Ferc1 to SQLite settings.
- Parameters:
taxonomy – URL of XBRL taxonomy used to create structure of SQLite DB.
years – list of years to validate.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.Ferc2XbrlToSqliteSettings(_case_sensitive: bool | None = None, _env_prefix: str | None = None, _env_file: pydantic_settings.sources.DotenvType | None = ENV_FILE_SENTINEL, _env_file_encoding: str | None = None, _env_ignore_empty: bool | None = None, _env_nested_delimiter: str | None = None, _env_parse_none_str: str | None = None, _secrets_dir: str | pathlib.Path | None = None, **values: Any)[source]#
Bases:
FercGenericXbrlToSqliteSettings
An immutable pydantic model to validate FERC Form 2 XBRL to SQLite settings.
- Parameters:
years – List of years to validate.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.Ferc2DbfToSqliteSettings(/, **data: Any)[source]#
Bases:
GenericDatasetSettings
An immutable Pydantic model to validate FERC 2 to SQLite settings.
- Parameters:
years – List of years to validate.
disabled – if True, skip processing this dataset.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.Ferc6DbfToSqliteSettings(/, **data: Any)[source]#
Bases:
GenericDatasetSettings
An immutable Pydantic model to validate FERC 6 to SQLite settings.
- Parameters:
years – List of years to validate.
disabled – if True, skip processing this dataset.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.Ferc6XbrlToSqliteSettings(_case_sensitive: bool | None = None, _env_prefix: str | None = None, _env_file: pydantic_settings.sources.DotenvType | None = ENV_FILE_SENTINEL, _env_file_encoding: str | None = None, _env_ignore_empty: bool | None = None, _env_nested_delimiter: str | None = None, _env_parse_none_str: str | None = None, _secrets_dir: str | pathlib.Path | None = None, **values: Any)[source]#
Bases:
FercGenericXbrlToSqliteSettings
An immutable pydantic model to validate FERC Form 6 XBRL to SQLite settings.
- Parameters:
years – List of years to validate.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.Ferc60DbfToSqliteSettings(/, **data: Any)[source]#
Bases:
GenericDatasetSettings
An immutable Pydantic model to validate FERC 60 to SQLite settings.
- Parameters:
years – List of years to validate.
disabled – if True, skip processing this dataset.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.Ferc60XbrlToSqliteSettings(_case_sensitive: bool | None = None, _env_prefix: str | None = None, _env_file: pydantic_settings.sources.DotenvType | None = ENV_FILE_SENTINEL, _env_file_encoding: str | None = None, _env_ignore_empty: bool | None = None, _env_nested_delimiter: str | None = None, _env_parse_none_str: str | None = None, _secrets_dir: str | pathlib.Path | None = None, **values: Any)[source]#
Bases:
FercGenericXbrlToSqliteSettings
An immutable pydantic model to validate FERC Form 60 XBRL to SQLite settings.
- Parameters:
years – List of years to validate.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.Ferc714XbrlToSqliteSettings(_case_sensitive: bool | None = None, _env_prefix: str | None = None, _env_file: pydantic_settings.sources.DotenvType | None = ENV_FILE_SENTINEL, _env_file_encoding: str | None = None, _env_ignore_empty: bool | None = None, _env_nested_delimiter: str | None = None, _env_parse_none_str: str | None = None, _secrets_dir: str | pathlib.Path | None = None, **values: Any)[source]#
Bases:
FercGenericXbrlToSqliteSettings
An immutable pydantic model to validate FERC Form 714 XBRL to SQLite settings.
- Parameters:
years – List of years to validate.
- data_source: ClassVar[pudl.metadata.classes.DataSource][source]#
- class pudl.settings.FercToSqliteSettings(_case_sensitive: bool | None = None, _env_prefix: str | None = None, _env_file: pydantic_settings.sources.DotenvType | None = ENV_FILE_SENTINEL, _env_file_encoding: str | None = None, _env_ignore_empty: bool | None = None, _env_nested_delimiter: str | None = None, _env_parse_none_str: str | None = None, _secrets_dir: str | pathlib.Path | None = None, **values: Any)[source]#
Bases:
pydantic_settings.BaseSettings
An immutable pydantic model to validate FERC XBRL to SQLite settings.
- Parameters:
ferc1_dbf_to_sqlite_settings – Settings for converting FERC 1 DBF data to SQLite.
ferc1_xbrl_to_sqlite_settings – Settings for converting FERC 1 XBRL data to SQLite.
other_xbrl_forms – List of non-FERC1 forms to convert from XBRL to SQLite.
- ferc1_dbf_to_sqlite_settings: Ferc1DbfToSqliteSettings | None[source]#
- ferc1_xbrl_to_sqlite_settings: Ferc1XbrlToSqliteSettings | None[source]#
- ferc2_dbf_to_sqlite_settings: Ferc2DbfToSqliteSettings | None[source]#
- ferc2_xbrl_to_sqlite_settings: Ferc2XbrlToSqliteSettings | None[source]#
- ferc6_dbf_to_sqlite_settings: Ferc6DbfToSqliteSettings | None[source]#
- ferc6_xbrl_to_sqlite_settings: Ferc6XbrlToSqliteSettings | None[source]#
- ferc60_dbf_to_sqlite_settings: Ferc60DbfToSqliteSettings | None[source]#
- ferc60_xbrl_to_sqlite_settings: Ferc60XbrlToSqliteSettings | None[source]#
- ferc714_xbrl_to_sqlite_settings: Ferc714XbrlToSqliteSettings | None[source]#
- classmethod default_load_all(data: dict[str, Any]) dict[str, Any] [source]#
If no datasets are specified default to all.
- get_xbrl_dataset_settings(form_number: XbrlFormNumber) FercGenericXbrlToSqliteSettings [source]#
Return the FERC XBRL to SQLite settings for the requested form.
- Parameters:
form_number – Get settings by FERC form number.
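One way to picture a lookup by form number is to build the attribute name from the enum value. The sketch below uses stand-in classes: `FormNumber`, `XbrlSettings` and `ToSqliteSettings` are hypothetical, and the enum member names and values are assumptions. Only the `ferc…_xbrl_to_sqlite_settings` attribute naming comes from the listing above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FormNumber(Enum):
    """Stand-in for XbrlFormNumber; member names and values are assumptions."""

    FORM1 = 1
    FORM714 = 714


@dataclass(frozen=True)
class XbrlSettings:
    """Stand-in for one per-form XBRL settings object."""

    years: tuple = ()


@dataclass(frozen=True)
class ToSqliteSettings:
    """Stand-in with one optional attribute per form."""

    ferc1_xbrl_to_sqlite_settings: Optional[XbrlSettings] = None
    ferc714_xbrl_to_sqlite_settings: Optional[XbrlSettings] = None

    def get_xbrl_dataset_settings(self, form_number: FormNumber) -> Optional[XbrlSettings]:
        # Build the attribute name from the form number, e.g. 714 -> "ferc714_...".
        return getattr(self, f"ferc{form_number.value}_xbrl_to_sqlite_settings")


settings = ToSqliteSettings(ferc714_xbrl_to_sqlite_settings=XbrlSettings(years=(2021,)))
print(settings.get_xbrl_dataset_settings(FormNumber.FORM714))
```

Keying the lookup on an enum rather than a bare integer keeps callers from asking for a form that has no corresponding settings attribute.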
- class pudl.settings.EtlSettings(_case_sensitive: bool | None = None, _env_prefix: str | None = None, _env_file: pydantic_settings.sources.DotenvType | None = ENV_FILE_SENTINEL, _env_file_encoding: str | None = None, _env_ignore_empty: bool | None = None, _env_nested_delimiter: str | None = None, _env_parse_none_str: str | None = None, _secrets_dir: str | pathlib.Path | None = None, **values: Any)[source]#
Bases:
pydantic_settings.BaseSettings
Main settings validation class.
- ferc_to_sqlite_settings: FercToSqliteSettings | None[source]#
- datasets: DatasetsSettings | None[source]#
- classmethod from_yaml(path: str) EtlSettings [source]#
Create an EtlSettings instance from a YAML file path.
- Parameters:
path – path to a yaml file; this could be remote.
- Returns:
An ETL settings object.
- pudl.settings._convert_settings_to_dagster_config(settings_dict: dict[str, Any]) None [source]#
Recursively convert a dictionary of dataset settings to dagster config in place.
For each partition parameter in a GenericDatasetSettings subclass, create a corresponding DagsterField. By default, GenericDatasetSettings subclasses include all working partitions if the partition value is None. Get the value type so dagster can do some basic type checking in the UI.
- Parameters:
settings_dict – dictionary of datasources and their parameters.
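The recursive, in-place conversion described above can be sketched without dagster installed. `StandInField` is a hypothetical substitute for dagster.Field that only keeps a type and a default, and `convert_in_place` is an illustrative name; neither is the module's actual code.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class StandInField:
    """Minimal stand-in for dagster.Field; the real class has more options."""

    config: type
    default_value: Any


def convert_in_place(settings: dict[str, Any]) -> None:
    """Replace every leaf value with a field, descending into nested dicts."""
    for key, value in settings.items():
        if isinstance(value, dict):
            convert_in_place(value)
        else:
            # Record the value's type so a UI could type-check overrides.
            settings[key] = StandInField(type(value), default_value=value)


config = {"eia": {"eia860": {"years": [2020, 2021]}}, "disabled": False}
convert_in_place(config)
print(config["eia"]["eia860"]["years"])
```

Because the function mutates its argument and returns None, the nested layout of datasets is preserved and only the leaf values change.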
- pudl.settings.create_dagster_config(settings: GenericDatasetSettings) dict[str, dagster.Field] [source]#
Create a dictionary of dagster config out of a GenericDatasetSettings.
- Parameters:
settings – A dataset settings object, subclassed from GenericDatasetSettings.
- Returns:
A dictionary of DagsterField objects.