dapper.ERA5Adapter
- class dapper.ERA5Adapter
Bases: BaseAdapter
ERA5-Land → ELM adapter.
This adapter implements the BaseAdapter interface for ERA5-Land hourly data. It handles source-specific details (file discovery, unit conversions, humidity diagnostics, renaming to ELM short names, and nonnegativity enforcement) so that the upstream Exporter can remain source-agnostic.
Responsibilities
discover_files: Find CSV shards in a directory and infer the overall (start_year, end_year) using their date coverage.
normalize_locations: Validate and normalize the locations table (adds lon_0-360, ensures/creates zone, stable sorting).
id_column_for_csv: Declare the identifier column name in the input CSVs. For ERA5 we require gid.
preprocess_shard: Convert one merged shard (CSV rows joined to locations) into canonical ELM columns. Steps include:
time filtering and optional “noleap” removal of Feb 29
ERA5→ELM unit conversions (e.g., J/hr/m² → W/m², m/hr → mm/s)
optional humidity computation (RH/Q) if temperature, dewpoint, and surface pressure are available
renaming raw ERA5 fields to ELM short names via a mapping
clipping canonical nonnegative variables
returning only required columns in a deterministic order
required_vars: Report the canonical ELM variable names required for the requested output format.
pack_params: Provide robust (add_offset, scale_factor) for a canonical ELM variable, given optional data to tune ranges.
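The normalize_locations responsibility above can be illustrated with a minimal sketch. This is not dapper's implementation: the helper name `normalize_rows` and the use of plain dictionaries instead of a DataFrame are assumptions made for illustration, and zone creation is left out because the source does not describe how zones are assigned.

```python
def normalize_rows(rows):
    """Add a 'lon_0-360' field to each location and sort by (lat, lon).

    Hypothetical helper: dapper works on a locations table, not a list of dicts.
    """
    out = []
    for row in rows:
        rec = dict(row)  # copy so the caller's data is left untouched
        rec["lon_0-360"] = rec["lon"] % 360.0  # maps [-180, 180) onto [0, 360)
        out.append(rec)
    # sorted() is stable, so rows with equal (lat, lon) keep their input order
    return sorted(out, key=lambda r: (r["lat"], r["lon"]))


locations = [
    {"gid": "b", "lat": 40.0, "lon": -105.25},
    {"gid": "a", "lat": 35.5, "lon": 10.0},
]
for rec in normalize_rows(locations):
    print(rec["gid"], rec["lat"], rec["lon"], rec["lon_0-360"])
```

Python's `%` operator returns a nonnegative result for a positive divisor, so western longitudes wrap without a separate branch.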
Notes
Humidity computation is performed only when temperature_2m, dewpoint_temperature_2m, and surface_pressure are present.
Precipitation conversion uses m/hr → mm/s via division by 3.6.
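The two notes above can be sketched as follows. The unit conversions follow directly from the factors stated in this page (3600 seconds per hour, and 1000 mm per m). The humidity formulas are an assumption: the source does not say which approximation dapper uses, so this sketch substitutes the Magnus approximation for saturation vapor pressure and assumes temperatures in kelvin and pressure in pascals. All function names are hypothetical.

```python
import math

SECONDS_PER_HOUR = 3600.0
HUMIDITY_INPUTS = ("temperature_2m", "dewpoint_temperature_2m", "surface_pressure")


def j_per_hr_m2_to_w_m2(energy):
    """Hourly accumulated energy (J/hr/m²) to mean flux (W/m²)."""
    return energy / SECONDS_PER_HOUR


def m_per_hr_to_mm_per_s(depth):
    """Hourly depth (m/hr) to rate (mm/s): 1000 mm / 3600 s = 1 / 3.6."""
    return depth / 3.6


def saturation_vapor_pressure_pa(temp_k):
    """Magnus approximation over water (assumed; not taken from dapper)."""
    t_c = temp_k - 273.15
    return 611.2 * math.exp(17.62 * t_c / (243.12 + t_c))


def humidity(row):
    """Return (RH in percent, specific humidity in kg/kg), or None if inputs are missing."""
    if not all(name in row for name in HUMIDITY_INPUTS):
        return None  # mirrors the note: compute only when all three fields exist
    e = saturation_vapor_pressure_pa(row["dewpoint_temperature_2m"])
    e_sat = saturation_vapor_pressure_pa(row["temperature_2m"])
    rh = 100.0 * e / e_sat
    q = 0.622 * e / (row["surface_pressure"] - 0.378 * e)
    return rh, q


print(j_per_hr_m2_to_w_m2(7200.0))
print(m_per_hr_to_mm_per_s(0.0036))
print(humidity({"temperature_2m": 290.0, "dewpoint_temperature_2m": 285.0,
                "surface_pressure": 101325.0}))
```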
- __init__()
Methods
__init__()
discover_files(csv_directory, calendar): Discover ERA5 CSV shards in a directory and infer the inclusive year range.
id_column_for_csv(df_csv, id_col): Return the required identifier column name expected in ERA5 CSV shards ("gid").
normalize_locations(df_loc[, id_col]): Standardize df_loc to include ['gid','lat','lon','lon_0-360','zone'], sorted by (lat, lon).
pack_params(elm_var[, data]): Return (add_offset, scale_factor) used to pack a variable for NetCDF output.
preprocess_shard(df_merged, start_year, ...)
required_vars(dformat): Return the canonical ELM variables required for the requested output format.
Attributes
- DRIVER_TAG = 'ERA5'
- SOURCE_NAME = 'ERA5-Land hourly reanalysis'
- discover_files(csv_directory, calendar)
Discover ERA5 CSV shards in a directory and infer the inclusive year range.
- id_column_for_csv(df_csv, id_col)
Return the required identifier column name expected in ERA5 CSV shards (“gid”).
- pack_params(elm_var, data=None)
Return (add_offset, scale_factor) used to pack a variable for NetCDF output.
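For readers unfamiliar with NetCDF packing, the sketch below shows one common way to derive (add_offset, scale_factor) from a value range so that values fit in signed 16-bit integers. It is an illustration of the general convention only: how dapper chooses ranges, and what makes its parameters "robust", is not documented here, and the names `sketch_pack_params`, `pack`, and `unpack` are hypothetical.

```python
def sketch_pack_params(vmin, vmax, nbits=16):
    """Return (add_offset, scale_factor) mapping [vmin, vmax] onto signed nbits integers.

    Uses 2**nbits - 2 steps so the most negative integer stays free for a fill value.
    """
    if vmax < vmin:
        raise ValueError("vmax must be >= vmin")
    if vmax == vmin:
        return float(vmin), 1.0  # degenerate range: any nonzero scale works
    scale_factor = (vmax - vmin) / (2 ** nbits - 2)
    add_offset = (vmax + vmin) / 2.0
    return add_offset, scale_factor


def pack(value, add_offset, scale_factor):
    """Physical value to packed integer."""
    return round((value - add_offset) / scale_factor)


def unpack(packed, add_offset, scale_factor):
    """Packed integer back to physical value (the NetCDF reader convention)."""
    return packed * scale_factor + add_offset


offset, scale = sketch_pack_params(0.0, 65534.0)
packed = pack(100.0, offset, scale)
print(offset, scale, packed, unpack(packed, offset, scale))
```

Centering the offset on the midpoint of the range uses the signed integer range symmetrically, which keeps the quantization error at no more than half of scale_factor across the whole range.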
- preprocess_shard(df_merged, start_year, end_year, calendar, dformat)
Filter time & handle no-leap
Apply ERA5 → ELM unit conversions
Compute humidities (if columns available)
Rename columns to canonical ELM names using RAW_TO_ELM
Clip canonical nonnegative variables
Return only the canonical vars required by elm_required_vars(dformat), plus LONGXY/LATIXY/time/gid/zone (coords/meta).
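The time filtering, no-leap handling, and clipping steps listed above can be sketched with the standard library alone. This is a simplified illustration, not the method's actual code: rows are plain dictionaries rather than a DataFrame, the helper name `filter_and_clip` is hypothetical, and the variable names in `NONNEGATIVE_VARS` are assumed examples, since the source does not list which canonical variables are clipped.

```python
from datetime import datetime

# Assumed examples of canonical variables that must not be negative.
NONNEGATIVE_VARS = ("PRECTmms", "FSDS")


def filter_and_clip(rows, start_year, end_year, calendar):
    """Keep rows inside [start_year, end_year], drop Feb 29 for 'noleap', clip negatives."""
    kept = []
    for row in rows:
        t = row["time"]
        if not (start_year <= t.year <= end_year):
            continue  # outside the requested inclusive year range
        if calendar == "noleap" and (t.month, t.day) == (2, 29):
            continue  # a no-leap calendar has no Feb 29
        rec = dict(row)  # copy so the input rows are not modified
        for name in NONNEGATIVE_VARS:
            if name in rec:
                rec[name] = max(rec[name], 0.0)
        kept.append(rec)
    return kept


rows = [
    {"time": datetime(2019, 12, 31, 23), "PRECTmms": 0.1},
    {"time": datetime(2020, 2, 28, 12), "PRECTmms": -1e-9},
    {"time": datetime(2020, 2, 29, 12), "PRECTmms": 0.2},
]
for rec in filter_and_clip(rows, 2020, 2020, "noleap"):
    print(rec["time"].isoformat(), rec["PRECTmms"])
```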