
Spatial operations for demand allocation.


check_gdf(→ None)

Check that GeoDataFrame contains (Multi)Polygon geometries with non-zero area.


polygonize(→ shapely.geometry.Polygon | shapely.geometry.MultiPolygon)

Convert geometry to (Multi)Polygon.

explode(→ geopandas.GeoDataFrame)

Explode MultiPolygon to multiple Polygon geometries.

self_union(→ geopandas.GeoDataFrame)

Calculate the geometric union of a feature layer with itself.

dissolve(→ geopandas.GeoDataFrame)

Dissolve layer by aggregating features based on common attributes.

overlay(→ geopandas.GeoDataFrame)

Overlay multiple layers incrementally.

get_data_columns(→ list)

Return list of columns, ignoring geometry.

Module Contents

pudl.analysis.spatial.check_gdf(gdf: geopandas.GeoDataFrame) → None

Check that GeoDataFrame contains (Multi)Polygon geometries with non-zero area.


Parameters:

gdf – GeoDataFrame to check.

Raises:

  • TypeError – Object is not a GeoDataFrame.

  • AttributeError – GeoDataFrame has no geometry.

  • TypeError – Geometry is not a GeoSeries.

  • ValueError – Geometry contains null geometries.

  • ValueError – Geometry contains non-(Multi)Polygon geometries.

  • ValueError – Geometry contains (Multi)Polygon geometries with zero area.

  • ValueError – MultiPolygon contains Polygon geometries with zero area.
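Several of the checks listed above map directly onto GeoPandas attributes. The following is a minimal sketch of a subset of them, assuming GeoPandas and Shapely are installed. The name `check_gdf_sketch` is hypothetical, and the real function may order or word its checks differently.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon, box


def check_gdf_sketch(gdf):
    """Hypothetical re-creation of some documented checks (not pudl's code)."""
    if not isinstance(gdf, gpd.GeoDataFrame):
        raise TypeError("Object is not a GeoDataFrame")
    geometry = gdf.geometry
    # Null geometries must be rejected before inspecting types or areas.
    if geometry.isna().any():
        raise ValueError("Geometry contains null geometries")
    if not geometry.geom_type.isin(["Polygon", "MultiPolygon"]).all():
        raise ValueError("Geometry contains non-(Multi)Polygon geometries")
    if (geometry.area == 0).any():
        raise ValueError("Geometry contains (Multi)Polygon geometries with zero area")


# A unit square passes silently.
check_gdf_sketch(gpd.GeoDataFrame(geometry=[box(0, 0, 1, 1)]))
```

A layer containing a `Point`, or a degenerate polygon whose vertices are collinear, would raise `ValueError` in this sketch.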

pudl.analysis.spatial.polygonize(geom: shapely.geometry.base.BaseGeometry) → shapely.geometry.Polygon | shapely.geometry.MultiPolygon

Convert geometry to (Multi)Polygon.


Parameters:

geom – Geometry to convert to (Multi)Polygon.

Returns:

Geometry converted to (Multi)Polygon, with all zero-area components removed.

Raises:

ValueError – Geometry has zero area.
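One way to picture this conversion is as a filter that keeps only the polygonal parts of a geometry. The sketch below assumes Shapely is installed; `polygonize_sketch` and `_flatten` are hypothetical names, and the actual implementation may differ.

```python
from shapely.geometry import (
    GeometryCollection,
    LineString,
    MultiPolygon,
    Point,
    Polygon,
)


def _flatten(geom):
    """Yield the single-part geometries nested inside `geom`."""
    if hasattr(geom, "geoms"):
        for part in geom.geoms:
            yield from _flatten(part)
    else:
        yield geom


def polygonize_sketch(geom):
    """Keep only Polygon parts with non-zero area (not pudl's code)."""
    polygons = [
        part
        for part in _flatten(geom)
        if isinstance(part, Polygon) and part.area > 0
    ]
    if not polygons:
        raise ValueError("Geometry has zero area")
    if len(polygons) == 1:
        return polygons[0]
    return MultiPolygon(polygons)


# A collection mixing a square with zero-area parts (a line and a point).
mixed = GeometryCollection(
    [
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        LineString([(0, 0), (5, 5)]),
        Point(9, 9),
    ]
)
result = polygonize_sketch(mixed)  # only the square survives
```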

pudl.analysis.spatial.explode(gdf: geopandas.GeoDataFrame, ratios: collections.abc.Iterable[str] = None) → geopandas.GeoDataFrame

Explode MultiPolygon to multiple Polygon geometries.

Parameters:

  • gdf – GeoDataFrame with non-zero-area (Multi)Polygon geometries.

  • ratios – Names of columns to rescale by the area fraction of the Polygon relative to the MultiPolygon. If provided, MultiPolygon cannot self-intersect. By default, the original value is used unchanged.

Raises:

ValueError – Geometry contains self-intersecting MultiPolygon.

Returns:

GeoDataFrame with each Polygon as a separate row in the GeoDataFrame. The index is the number of the source row in the input GeoDataFrame.
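The rescaling described for `ratios` is plain arithmetic: each Polygon receives the original value multiplied by its share of the MultiPolygon's area. The pure-Python sketch below illustrates that arithmetic only. The helpers `shoelace_area` and `rescale_by_area` are hypothetical, polygons are represented as lists of vertices rather than Shapely objects, and the parts are assumed not to overlap, matching the documented restriction.

```python
def shoelace_area(ring):
    """Area of a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def rescale_by_area(parts, value):
    """Split `value` across parts in proportion to each part's area."""
    areas = [shoelace_area(part) for part in parts]
    total = sum(areas)
    return [value * area / total for area in areas]


# A two-part "MultiPolygon": a 1x1 square and a 3x1 rectangle.
parts = [
    [(0, 0), (1, 0), (1, 1), (0, 1)],
    [(5, 0), (8, 0), (8, 1), (5, 1)],
]
shares = rescale_by_area(parts, value=100.0)
print(shares)  # [25.0, 75.0]
```

Columns not named in `ratios` would instead carry the original value, 100.0, on both rows.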

pudl.analysis.spatial.self_union(gdf: geopandas.GeoDataFrame, ratios: collections.abc.Iterable[str] = None) → geopandas.GeoDataFrame

Calculate the geometric union of a feature layer with itself.

Areas of overlap are split into two or more geometrically-identical features: one for each of the original overlapping features. Each split feature contains the attributes of the original feature.

Parameters:

  • gdf – GeoDataFrame with non-zero-area Polygon geometries.

  • ratios – Names of columns to rescale by the area fraction of the split feature relative to the original. By default, the original value is used unchanged.

Returns:

GeoDataFrame representing the union of the input features with themselves. Its index contains tuples of the indices of the original overlapping features.

Raises:

NotImplementedError – MultiPolygon geometries are not yet supported.
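To make the splitting concrete, consider two overlapping rectangles. The pure-Python sketch below works only on axis-aligned rectangles given as `(xmin, ymin, xmax, ymax)` and tracks areas rather than geometries. The helper names and the tuple labels are hypothetical stand-ins for illustration, not the function's actual output format.

```python
def rect_area(rect):
    """Area of an axis-aligned rectangle (zero if it is empty)."""
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def rect_intersection(a, b):
    """Bounds of the overlap between two axis-aligned rectangles."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


# Feature name -> (bounds, value of a column listed in `ratios`).
features = {"A": ((0, 0, 2, 2), 100.0), "B": ((1, 0, 3, 2), 60.0)}
overlap_area = rect_area(rect_intersection(features["A"][0], features["B"][0]))

pieces = []
for name, (rect, value) in features.items():
    area = rect_area(rect)
    # The part covered by this feature alone.
    pieces.append(((name,), name, value * (area - overlap_area) / area))
    # The overlap is emitted once per source feature, with that feature's value.
    pieces.append((("A", "B"), name, value * overlap_area / area))

for overlapping, source, value in pieces:
    print(overlapping, source, value)
```

Each rectangle has area 4 and the overlap has area 2, so every piece carries half of its source feature's value: 50.0 for pieces from A and 30.0 for pieces from B.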

pudl.analysis.spatial.dissolve(gdf: geopandas.GeoDataFrame, by: collections.abc.Iterable[str], func: collections.abc.Callable | str | list | dict, how: Literal['union', 'first'] | collections.abc.Callable[[geopandas.GeoSeries], shapely.geometry.base.BaseGeometry] = 'union') → geopandas.GeoDataFrame

Dissolve layer by aggregating features based on common attributes.

Parameters:

  • gdf – GeoDataFrame with non-empty (Multi)Polygon geometries.

  • by – Names of columns to group features by.

  • func – Aggregation function for data columns (see pd.DataFrame.groupby()).

  • how – Aggregation function for geometry column. Either ‘union’ (gpd.GeoSeries.unary_union), ‘first’ (first geometry in group), or a function aggregating multiple geometries into one.

Returns:

GeoDataFrame with dissolved geometry and data columns, and grouping columns set as the index.
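For orientation, GeoPandas ships its own `GeoDataFrame.dissolve` method, which behaves analogously for the union case. The sketch below uses that built-in method, not this module's function, whose handling of `func` and `how` may differ. It assumes GeoPandas and Shapely are installed, and the column names are made up for illustration.

```python
import geopandas as gpd
from shapely.geometry import box

# Three unit squares; the two "CO" squares share an edge.
gdf = gpd.GeoDataFrame(
    {"state": ["CO", "CO", "WY"], "demand": [1.0, 2.0, 4.0]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(5, 0, 6, 1)],
)

# Group by `state`, sum the data column, and union the geometries.
dissolved = gdf.dissolve(by="state", aggfunc="sum")
print(dissolved["demand"])
print(dissolved.geometry.area)
```

The grouping column becomes the index of the result, as in the return value described above.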

pudl.analysis.spatial.overlay(*gdfs: geopandas.GeoDataFrame, how: Literal['intersection', 'union', 'identity', 'symmetric_difference', 'difference'] = 'intersection', ratios: collections.abc.Iterable[str] = None) → geopandas.GeoDataFrame

Overlay multiple layers incrementally.

When a feature from one layer overlaps the feature of another layer, the area of overlap is split into two geometrically-identical features: one for each of the original overlapping features. Each split feature contains the attributes of the original feature.

TODO: To identify the source of output features, the user can ensure that each layer contains a column to index by. Alternatively, tuples of indices of the overlapping feature from each layer (null if none) could be returned as the index.

Parameters:

  • gdfs – GeoDataFrames with non-empty (Multi)Polygon geometries assumed to contain no self-overlaps (see self_union()). Names of (non-geometry) columns cannot be used more than once. Any index columns are ignored.

  • how – Spatial overlay method (see gpd.overlay()).

  • ratios – Names of columns to rescale by the area fraction of the split feature relative to the original. By default, the original value is used unchanged.

Raises:

ValueError – Duplicate column names in layers.

Returns:

GeoDataFrame with the geometries and attributes resulting from the overlay.
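An incremental overlay can be pictured as folding `gpd.overlay()` over the layers from left to right. The sketch below assumes GeoPandas and Shapely are installed and omits the `ratios` rescaling entirely. The name `overlay_sketch` and the example column names are hypothetical, and the real function may be structured differently.

```python
from functools import reduce

import geopandas as gpd
from shapely.geometry import box


def overlay_sketch(*gdfs, how="intersection"):
    """Overlay layers pairwise, left to right (not pudl's code)."""
    # Reject data column names shared between layers, ignoring geometry.
    seen = set()
    for gdf in gdfs:
        data_columns = [c for c in gdf.columns if c != gdf.geometry.name]
        duplicates = seen.intersection(data_columns)
        if duplicates:
            raise ValueError(f"Duplicate column names in layers: {sorted(duplicates)}")
        seen.update(data_columns)
    # Fold the layers: overlay(overlay(first, second), third), and so on.
    return reduce(lambda left, right: gpd.overlay(left, right, how=how), gdfs)


counties = gpd.GeoDataFrame(
    {"county": ["a", "b"]}, geometry=[box(0, 0, 2, 2), box(2, 0, 4, 2)]
)
utilities = gpd.GeoDataFrame({"utility": ["x"]}, geometry=[box(1, 0, 3, 2)])

# Each county overlaps the utility territory in a 1x2 strip.
result = overlay_sketch(counties, utilities)
print(result[["county", "utility"]])
```

Because each output row carries the attributes of both source features, giving every layer a uniquely named identifier column (here `county` and `utility`) is enough to trace a row back to its sources.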

pudl.analysis.spatial.get_data_columns(df: pandas.DataFrame) → list

Return list of columns, ignoring geometry.