pudl.output.export module
Routines for exporting data from PUDL for use elsewhere.
Function names should be indicative of the format of the thing that’s being exported (e.g. CSV, Excel spreadsheets, parquet files, HDF5).
pudl.output.export.annotated_xlsx(df, notes_dict, tags_dict, first_cols, sheet_name, xlsx_writer)
Outputs an annotated spreadsheet workbook based on compiled dataframes.
Creates an annotation tab and header rows for the EIA 860, EIA 923, and FERC 1 fields in a dataframe. The output is written through a pandas.ExcelWriter object, which must be created and saved outside the function, so that multiple sheets and their annotations can be compiled into the same Excel file.
- Parameters
df (pandas.DataFrame) – The dataframe for which annotations are being created
notes_dict (dict) – dictionary with column names as keys and long annotations as values
tags_dict (dict) – dictionary of dictionaries; the outer keys are tag categories, and each inner dictionary maps column names to that column's tag within the category
first_cols (list) – ordered list of columns that should come first in the output file
sheet_name (str) – name of the data sheet in the output spreadsheet
xlsx_writer (pandas.ExcelWriter) – ExcelWriter object used to accumulate multiple tabs; it must be created outside the function, before the first call, e.g. "xlsx_writer = pd.ExcelWriter('outfile.xlsx')"
- Returns
xlsx_writer, the ExcelWriter object holding the accumulated sheets. Its save method must be called outside the function, after the final use of the function, to write the Excel file: "xlsx_writer.save()"
- Return type
pandas.ExcelWriter
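The argument shapes and the writer lifecycle described above can be sketched as follows. This is a minimal illustration, not taken from the PUDL source: the column names, annotation text, and tag values are made up, and order_columns and export_sheets are hypothetical helpers. order_columns only shows one plausible reading of first_cols, not how annotated_xlsx actually reorders columns. Note also that ExcelWriter.save() was removed in pandas 2.0 in favor of close().

```python
# Hypothetical column names, for illustration only.
columns = ["net_generation_mwh", "plant_id_eia", "report_date"]

# notes_dict: column name -> long annotation (wording is made up).
notes_dict = {
    "plant_id_eia": "Identifier assigned to the plant.",
    "report_date": "Date the record was reported for.",
    "net_generation_mwh": "Net electricity generation in MWh.",
}

# tags_dict: tag category -> {column name -> tag within that category}.
tags_dict = {
    "source": {
        "plant_id_eia": "EIA 860",
        "report_date": "EIA 923",
        "net_generation_mwh": "EIA 923",
    },
    "kind": {
        "plant_id_eia": "identifier",
        "report_date": "date",
        "net_generation_mwh": "measurement",
    },
}

# first_cols: columns that should lead the output, in this order.
first_cols = ["report_date", "plant_id_eia"]


def order_columns(columns, first_cols):
    """Hypothetical helper: first_cols first, then the remaining columns."""
    lead = [c for c in first_cols if c in columns]
    rest = [c for c in columns if c not in lead]
    return lead + rest


ordered = order_columns(columns, first_cols)


def export_sheets(frames, path):
    """Hypothetical wrapper showing the create / accumulate / save pattern.

    frames maps sheet names to pandas DataFrames. Imports are local so the
    rest of this sketch runs without pandas or PUDL installed.
    """
    import pandas as pd
    from pudl.output.export import annotated_xlsx

    # The writer is created once, outside annotated_xlsx.
    xlsx_writer = pd.ExcelWriter(path)
    for sheet_name, df in frames.items():
        # Each call adds one annotated sheet and returns the same writer.
        xlsx_writer = annotated_xlsx(
            df, notes_dict, tags_dict, first_cols, sheet_name, xlsx_writer
        )
    # Saved once, after the final call. With pandas >= 2.0, where save()
    # no longer exists, use xlsx_writer.close() instead.
    xlsx_writer.save()
    return path
```

Creating and saving the writer outside the function is what lets several dataframes share one workbook: each call only appends a sheet, and nothing is written to disk until the caller saves.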