dapper.met.exporter

Meteorological data export pipelines.

Classes

Exporter(adapter, src_path, *, domain[, ...])

Source-agnostic meteorological exporter.

class dapper.met.exporter.Exporter(adapter, src_path, *, domain, out_dir=None, calendar='noleap', dtime_resolution_hrs=1, dtime_units='days', dformat='BYPASS', append_attrs=None, chunks=None, include_vars=None, exclude_vars=None)

Bases: object

Source-agnostic meteorological exporter.

This class orchestrates a two-pass pipeline that ingests time-sharded CSVs for many sites/cells, preprocesses them via a pluggable adapter, and writes ELM-ready NetCDF outputs in two layouts:

  1. "cellset" – one NetCDF per variable with dims ('DTIME','lat','lon') (global packing; sparse lat/lon axes are OK).

  2. "sites" – one directory per site; each directory contains one NetCDF per variable with dims ('n','DTIME') where n=1 (per-site packing).

Exporter is source-agnostic: all dataset-specific logic (file discovery, unit conversions, renaming to ELM short names, etc.) lives in an adapter that implements the BaseAdapter interface (e.g., an ERA5Adapter). The exporter handles staging (CSV → per-site parquet), global DTIME axis creation, packing scans, chunking, and NetCDF I/O.
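To make the "global DTIME axis creation" step more concrete, the following is a minimal sketch of how a numeric time coordinate could be built under a 365-day ("noleap") calendar. It is not the exporter's actual implementation: the helper names noleap_dtime and build_axis, and the choice of the first timestamp as the reference date, are assumptions made for illustration only.

```python
from datetime import datetime, timedelta

# Days elapsed before each month starts in a 365-day (noleap) year.
_DAYS_BEFORE_MONTH = (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334)


def noleap_dtime(ts, ref, units="days"):
    """Numeric offset of `ts` from `ref`, counting every year as 365 days."""
    def to_hours(t):
        day_of_year = _DAYS_BEFORE_MONTH[t.month - 1] + (t.day - 1)
        return (t.year * 365 + day_of_year) * 24 + t.hour + t.minute / 60

    hours = to_hours(ts) - to_hours(ref)
    return hours / 24 if units == "days" else hours


def build_axis(start, end, resolution_hrs=1, units="days"):
    """Offsets for [start, end) at a fixed step, with Feb 29 dropped."""
    stamps = []
    t = start
    while t < end:
        if not (t.month == 2 and t.day == 29):  # noleap: skip leap days
            stamps.append(t)
        t += timedelta(hours=resolution_hrs)
    return [noleap_dtime(t, start, units) for t in stamps]


# 2020 is a leap year; the axis stays evenly spaced across the removed Feb 29.
axis = build_axis(datetime(2020, 2, 28), datetime(2020, 3, 2), resolution_hrs=12)
print(axis)  # [0.0, 0.5, 1.0, 1.5]
```

Because leap days are removed before offsets are computed, Mar 1 follows Feb 28 directly on the numeric axis, which is what a 365-day calendar expects.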

Parameters:
  • adapter (BaseAdapter) – Implements: discover_files, normalize_locations, preprocess_shard, required_vars, and pack_params.

  • csv_directory (str or pathlib.Path) – Directory containing time-sharded CSV files for all sites/cells.

  • out_dir (str or pathlib.Path) – Destination directory for NetCDF outputs and temporary parquet shards.

  • df_loc (pandas.DataFrame) – Locations table with at least columns ["gid","lat","lon"]; optional "zone". The adapter’s normalize_locations validates columns, adds "lon_0-360", fills/validates "zone", and sorts for stable site order.

  • id_col (str, optional) – Kept for backward compatibility (unused when "gid" is assumed).

  • calendar ({"noleap","standard"}, default "noleap") – Calendar for the numeric DTIME coordinate; Feb 29 timestamps are filtered out when "noleap" is selected.

  • dtime_resolution_hrs (int, default 1) – Target time resolution in hours for the DTIME axis.

  • dtime_units ({"days","hours"}, default "days") – Units of the numeric DTIME coordinate (e.g., "days since YYYY-MM-DD HH:MM:SS").

  • domain (Domain)

  • dformat ({"BYPASS","DATM_MODE"}, default "BYPASS") – Target ELM format selector passed through to the adapter.

  • append_attrs (dict, optional) – Extra global NetCDF attributes to include in every file. The exporter also adds: export_mode ("cellset" or "sites") and pack_scope ("global" or "per-site").

  • chunks (tuple[int, …], optional) – Explicit NetCDF chunk sizes.

  • include_vars / exclude_vars (Iterable[str], optional) – Allow-/block-lists of ELM short names applied after preprocess. Meta columns {"gid","time","LATIXY","LONGXY","zone"} are always kept.
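The include_vars / exclude_vars behaviour described above can be pictured with a small stand-alone filter. This is a sketch, not the exporter's code: the function name select_columns is hypothetical, the variable names in the example are only illustrative, and the rule that exclude_vars wins when a name appears in both lists is an assumption the source does not state.

```python
# Meta columns that the documentation says are always kept.
META_COLUMNS = {"gid", "time", "LATIXY", "LONGXY", "zone"}


def select_columns(columns, include_vars=None, exclude_vars=None):
    """Return the columns that survive the allow-/block-lists, in order."""
    include = set(include_vars) if include_vars is not None else None
    exclude = set(exclude_vars or ())
    kept = []
    for col in columns:
        if col in META_COLUMNS:
            kept.append(col)  # meta columns bypass both lists
        elif (include is None or col in include) and col not in exclude:
            kept.append(col)
    return kept


cols = ["gid", "time", "LATIXY", "LONGXY", "zone", "TBOT", "PSRF", "WIND"]

# Allow-list: only TBOT and WIND survive, plus the meta columns.
print(select_columns(cols, include_vars=["TBOT", "WIND"]))

# Block-list: PSRF is dropped; listing "gid" has no effect because it is meta.
print(select_columns(cols, exclude_vars=["PSRF", "gid"]))
```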

Side Effects

  • Creates a temporary directory of per-site parquet shards under out_dir.

  • Writes NetCDF files to out_dir in the chosen layout.

  • Writes a zone_mappings.txt file either at the root (cellset) or inside each site directory (sites).

Notes

  • Packing: global packing for cellset; per-site packing for sites.

  • Required columns: CSV shards and df_loc both use "gid"; CSVs include the adapter’s date/time column (renamed to "time" during preprocess).

  • Combined (lat/lon) layout: does not enforce regular grids; axes are the unique sorted lat/lon from df_loc (sparse OK).
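As an illustration of the sparse-axis note, the sketch below builds lat/lon axes from the unique sorted coordinates of a small locations table and places each site in its cell. The function name sparse_grid, the tuple-based input, and the use of None for empty cells are assumptions for this example; the exporter's own fill handling is not described in the source.

```python
def sparse_grid(locations, fill=None):
    """Build sorted unique lat/lon axes and a 2-D grid of site ids.

    `locations` is an iterable of (gid, lat, lon) tuples. Cells with no
    site keep the `fill` value, so the grid need not be regular or dense.
    """
    lats = sorted({lat for _, lat, _ in locations})
    lons = sorted({lon for _, _, lon in locations})
    lat_idx = {value: i for i, value in enumerate(lats)}
    lon_idx = {value: i for i, value in enumerate(lons)}

    grid = [[fill] * len(lons) for _ in lats]
    for gid, lat, lon in locations:
        grid[lat_idx[lat]][lon_idx[lon]] = gid
    return lats, lons, grid


# Three scattered sites give a 3 x 3 grid with six empty cells.
locs = [("a", 68.6, 210.4), ("b", 64.8, 212.3), ("c", 71.3, 203.4)]
lats, lons, grid = sparse_grid(locs)
print(lats)  # [64.8, 68.6, 71.3]
print(lons)  # [203.4, 210.4, 212.3]
for row in grid:
    print(row)
```

With N scattered sites the grid can hold up to N × N cells while only N are populated, which is the trade-off of accepting sparse axes instead of enforcing a regular grid.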

run(*, pack_scope=None, filename=None, overwrite=False)

Run the MET export for this exporter’s Domain.

The output layout is derived from Domain.mode:
  • sites: writes <run_dir>/<gid>/MET/{prefix_}{var}.nc and a per-site zone_mappings.txt (always zone=01, id=1).

  • cellset: writes <run_dir>/MET/{prefix_}{var}.nc and a single zone_mappings.txt covering all locations (zones taken from df_loc, default 1).
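The two path templates above can be sketched with pathlib. The helper met_path is hypothetical and only mirrors the documented layout; the run directory, site id, and variable name used in the example are placeholders.

```python
from pathlib import Path


def met_path(run_dir, var, mode, gid=None, filename=None):
    """Build the NetCDF path for one variable following the documented layout."""
    # An optional prefix turns "<var>.nc" into "<filename>_<var>.nc".
    name = f"{filename}_{var}.nc" if filename else f"{var}.nc"
    run_dir = Path(run_dir)
    if mode == "sites":
        if gid is None:
            raise ValueError("sites mode needs a gid")
        return run_dir / str(gid) / "MET" / name
    if mode == "cellset":
        return run_dir / "MET" / name
    raise ValueError(f"unknown mode: {mode!r}")


print(met_path("run", "TBOT", "sites", gid="site42").as_posix())
# run/site42/MET/TBOT.nc
print(met_path("run", "TBOT", "cellset", filename="era5").as_posix())
# run/MET/era5_TBOT.nc
```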

Parameters:
  • pack_scope – Optional packing strategy override. Defaults to per-site for sites and global for cellset outputs.

  • filename (str | None) – Optional filename prefix for output NetCDF files. If provided, each variable is written to {filename}_{var}.nc.

  • overwrite (bool) – If True, clears existing MET outputs before writing.
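The practical difference between the two pack_scope settings can be sketched as follows. The source does not spell out the packing scheme, so this example assumes the conventional NetCDF scale_factor/add_offset encoding into 16-bit integers; the helper scale_offset and the sample values are hypothetical.

```python
def scale_offset(vmin, vmax, nbits=16):
    """scale_factor and add_offset mapping [vmin, vmax] onto signed integers.

    One integer code is left free so it can serve as a fill value.
    """
    span = 2 ** nbits - 2
    scale = (vmax - vmin) / span if vmax > vmin else 1.0
    offset = (vmax + vmin) / 2
    return scale, offset


# Hypothetical value ranges for one variable at two sites.
site_values = {"a": [250.0, 260.0], "b": [290.0, 310.0]}

# Global scope: one (scale, offset) pair computed over all sites together.
all_values = [v for values in site_values.values() for v in values]
global_params = scale_offset(min(all_values), max(all_values))

# Per-site scope: each site gets parameters from its own range.
per_site_params = {
    gid: scale_offset(min(values), max(values))
    for gid, values in site_values.items()
}

print(global_params)
print(per_site_params)
```

Under this assumption, per-site parameters give a smaller scale_factor (finer resolution) for each site, while global parameters let every cell in a shared file use the same encoding.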

Return type:

None