pudl.metadata.fields
Field metadata.
Attributes
FIELD_METADATA | Field attributes by PUDL identifier (field.name).
FIELD_METADATA_BY_GROUP | Field attributes by resource group (resource.group) and PUDL identifier.
Functions
get_pudl_dtypes | Compile a dictionary of field dtypes, applying group overrides.
apply_pudl_dtypes | Apply dtypes to those columns in a dataframe that have PUDL types defined.
Module Contents
- pudl.metadata.fields.FIELD_METADATA: dict[str, dict[str, Any]]
Field attributes by PUDL identifier (field.name).
Keys are in alphabetical order.
- pudl.metadata.fields.FIELD_METADATA_BY_GROUP: dict[str, dict[str, Any]]
Field attributes by resource group (resource.group) and PUDL identifier.
If a field exists in more than one data group (e.g. both eia and ferc1) and has distinct metadata in those groups, this is the place to specify the override. Only those elements which should be overridden need to be specified.
- pudl.metadata.fields.get_pudl_dtypes(group: str | None = None, field_meta: dict[str, Any] | None = FIELD_METADATA, field_meta_by_group: dict[str, Any] | None = FIELD_METADATA_BY_GROUP, dtype_map: dict[str, Any] | None = FIELD_DTYPES_PANDAS) -> dict[str, Any]
Compile a dictionary of field dtypes, applying group overrides.
- Parameters:
group – The data group (e.g. ferc1, eia) to use for overriding the default field types. If None, no overrides are applied and the default types are used.
field_meta – Field metadata dictionary in which each field defines at least a “type”.
field_meta_by_group – Field metadata type overrides to apply based on the data group that the field is part of, if any.
dtype_map – Mapping from canonical PUDL data types to some other set of data types. Uses pandas data types by default.
- Returns:
A mapping of PUDL field names to their associated data types.
- pudl.metadata.fields.apply_pudl_dtypes(df: pandas.DataFrame, group: str | None = None, field_meta: dict[str, Any] | None = FIELD_METADATA, field_meta_by_group: dict[str, Any] | None = FIELD_METADATA_BY_GROUP, strict: bool = False) -> pandas.DataFrame
Apply dtypes to those columns in a dataframe that have PUDL types defined.
Note that ad-hoc column dtypes can be defined and merged with the default PUDL field metadata before it is passed in as field_meta, if you have module-specific column types you need to apply alongside the standard PUDL field types.
- Parameters:
df – The dataframe to apply types to. Not all columns need to have types defined in the PUDL metadata unless you pass strict=True.
group – The data group to use for overrides, if any. E.g. “eia”, “ferc1”.
field_meta – A dictionary of field metadata, where each key is a field name and the values are dictionaries which must have a “type” element. By default this is pudl.metadata.fields.FIELD_METADATA.
field_meta_by_group – A dictionary of field metadata to use as overrides, based on the value of group, if any. By default it uses the overrides defined in pudl.metadata.fields.FIELD_METADATA_BY_GROUP.
strict – Whether every column in df must have a corresponding field defined in the field metadata.
- Returns:
The input dataframe, but with standard PUDL types applied.
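The note about merging ad-hoc column dtypes into the default metadata can be sketched in plain Python. This is an illustration under stated assumptions rather than PUDL code: the toy metadata, the field names, and the helper `columns_without_metadata` are hypothetical, and a list of column names stands in for a dataframe's columns.

```python
from typing import Any

# Hypothetical stand-in for the default PUDL field metadata.
TOY_FIELD_META: dict[str, dict[str, Any]] = {
    "example_id": {"type": "integer"},
}
# Hypothetical module-specific fields; each entry has the required "type".
AD_HOC_FIELDS: dict[str, dict[str, Any]] = {
    "module_specific_flag": {"type": "boolean"},
}

# Merge the ad-hoc fields with the defaults. A dictionary like this is what
# would be passed in as the field_meta argument.
merged_field_meta = {**TOY_FIELD_META, **AD_HOC_FIELDS}


def columns_without_metadata(
    columns: list[str], field_meta: dict[str, dict[str, Any]]
) -> list[str]:
    """Return the column names that have no entry in field_meta, sorted."""
    return sorted(set(columns) - set(field_meta))


# Column names standing in for df.columns.
columns = ["example_id", "module_specific_flag", "scratch_column"]
missing_before = columns_without_metadata(columns, TOY_FIELD_META)
missing_after = columns_without_metadata(columns, merged_field_meta)
```

Here `missing_after` still contains `scratch_column`, which is the kind of column the strict parameter is concerned with: with the default `strict=False`, columns without defined types are permitted.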