dapper.ERA5Adapter

class dapper.ERA5Adapter

Bases: BaseAdapter

ERA5-Land → ELM adapter.

This adapter implements the BaseAdapter interface for ERA5-Land hourly data. It handles the source-specific details (file discovery, unit conversions, humidity diagnostics, renaming to ELM short names, and nonnegativity enforcement) so that the upstream Exporter can remain source-agnostic.

Responsibilities

discover_files: Find CSV shards in a directory and infer the overall (start_year, end_year) using their date coverage.
normalize_locations: Validate and normalize the locations table (adds lon_0-360, ensures/creates zone, stable sorting).
id_column_for_csv: Declare the identifier column name in the input CSVs. For ERA5 we require gid.
preprocess_shard: Convert one merged shard (CSV rows joined to locations) into canonical ELM columns. Steps include:
1. time filtering and optional “noleap” removal of Feb 29
2. ERA5→ELM unit conversions (e.g., J/hr/m² → W/m², m/hr → mm/s)
3. optional humidity computation (RH/Q) if temperature, dewpoint, and surface pressure are available
4. renaming raw ERA5 fields to ELM short names via a mapping
5. clipping canonical nonnegative variables
6. returning only required columns in a deterministic order
required_vars: Report the canonical ELM variable names required for the requested output format.
pack_params: Provide robust (add_offset, scale_factor) for a canonical ELM variable, given optional data to tune ranges.
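The lon_0-360 and sorting behavior listed for normalize_locations can be sketched as follows. This is a minimal illustration, not dapper's implementation: the function name normalize_locations_sketch, the list-of-dicts input, and the default zone value of 1 are all assumptions made for the example.

```python
def normalize_locations_sketch(rows):
    """Hypothetical stand-in for normalize_locations.

    `rows` is a list of dicts with 'gid', 'lat', and 'lon' keys
    (an assumed input shape; the real method takes a table, df_loc).
    """
    out = []
    for row in rows:
        rec = dict(row)
        # Map longitudes from [-180, 180) onto [0, 360).
        rec["lon_0-360"] = rec["lon"] % 360.0
        # Assumption: a missing zone is filled with a default of 1.
        rec.setdefault("zone", 1)
        out.append(rec)
    # sorted() is stable, so rows with equal (lat, lon) keep input order.
    return sorted(out, key=lambda r: (r["lat"], r["lon"]))


locations = [
    {"gid": "b", "lat": 10.0, "lon": -90.0},
    {"gid": "a", "lat": 5.0, "lon": 30.0},
]
normalized = normalize_locations_sketch(locations)
```

Python's % operator returns a nonnegative result for a positive modulus, so negative longitudes need no special case.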

Notes

Humidity computation is performed only when temperature_2m, dewpoint_temperature_2m, and surface_pressure are present.
Precipitation is converted from m/hr to mm/s by dividing by 3.6 (multiply by 1000 mm/m, then divide by 3600 s/hr).
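The unit conversions and humidity diagnostics described above can be illustrated with a short sketch. The function names are hypothetical, and the humidity formulas (a Magnus-type saturation vapor pressure approximation and the standard specific-humidity relation) are assumptions: this page does not state which formulas the adapter actually uses.

```python
import math

SECONDS_PER_HOUR = 3600.0


def flux_j_per_hr_m2_to_w_m2(value):
    # Energy accumulated over one hour (J/m²) to mean power (W/m²).
    return value / SECONDS_PER_HOUR


def precip_m_per_hr_to_mm_per_s(value):
    # 1 m/hr = 1000 mm / 3600 s, which is a division by 3.6.
    return value / 3.6


def vapor_pressure_pa(temp_k):
    # Magnus-type approximation over liquid water (assumed formula).
    temp_c = temp_k - 273.15
    return 611.2 * math.exp(17.67 * temp_c / (temp_c + 243.5))


def relative_humidity_pct(temp_k, dewpoint_k):
    # RH is the ratio of actual to saturation vapor pressure.
    return 100.0 * vapor_pressure_pa(dewpoint_k) / vapor_pressure_pa(temp_k)


def specific_humidity(dewpoint_k, surface_pressure_pa):
    # q = 0.622 e / (p - 0.378 e), with e evaluated at the dewpoint.
    e = vapor_pressure_pa(dewpoint_k)
    return 0.622 * e / (surface_pressure_pa - 0.378 * e)
```

All three inputs (2 m temperature, 2 m dewpoint, and surface pressure) are needed for the pair of diagnostics, which is consistent with the adapter skipping the humidity step when any of them is missing.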

__init__()

Methods

`__init__`()
`discover_files`(csv_directory, calendar)	Discover ERA5 CSV shards in a directory and infer the inclusive year range.
`id_column_for_csv`(df_csv, id_col)	Return the required identifier column name expected in ERA5 CSV shards ("gid").
`normalize_locations`(df_loc[, id_col])	Standardize df_loc to include ['gid','lat','lon','lon_0-360','zone'], sorted by (lat, lon).
`pack_params`(elm_var[, data])	Return (add_offset, scale_factor) used to pack a variable for NetCDF output.
`preprocess_shard`(df_merged, start_year, ...)
`required_vars`(dformat)	Return the canonical ELM variables required for the requested output format.

Attributes

`DRIVER_TAG`
`SOURCE_NAME`

DRIVER_TAG = 'ERA5'

SOURCE_NAME = 'ERA5-Land hourly reanalysis'

discover_files(csv_directory, calendar): Discover ERA5 CSV shards in a directory and infer the inclusive year range.

id_column_for_csv(df_csv, id_col): Return the required identifier column name expected in ERA5 CSV shards (“gid”).

pack_params(elm_var, data=None): Return (add_offset, scale_factor) used to pack a variable for NetCDF output.
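For readers unfamiliar with NetCDF packing, the sketch below shows one common way to derive (add_offset, scale_factor) from a data range so that values fit a signed 16-bit integer. It is an assumption-laden illustration, not the adapter's actual logic: the helper names, the 16-bit target, and the choice to reserve one integer code for a fill value are all hypothetical.

```python
def pack_params_sketch(values, n_bits=16):
    """Return (add_offset, scale_factor) covering the range of `values`."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Degenerate range: any nonzero scale works.
        return float(vmin), 1.0
    # Use 2**n_bits - 2 steps so one integer code stays free for a fill value.
    scale_factor = (vmax - vmin) / (2 ** n_bits - 2)
    # Centering the offset maps the range symmetrically around zero.
    add_offset = (vmax + vmin) / 2.0
    return add_offset, scale_factor


def pack(value, add_offset, scale_factor):
    # Inverse of the NetCDF unpacking rule.
    return round((value - add_offset) / scale_factor)


def unpack(packed, add_offset, scale_factor):
    # NetCDF convention: unpacked = packed * scale_factor + add_offset.
    return packed * scale_factor + add_offset


add_offset, scale_factor = pack_params_sketch([0.0, 100.0])
restored = unpack(pack(37.3, add_offset, scale_factor), add_offset, scale_factor)
```

The round-trip error is bounded by half of scale_factor, so a narrower data range gives finer precision. That is presumably why the real method accepts optional data to tune the range.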

preprocess_shard(df_merged, start_year, end_year, calendar, dformat)

1. Filter rows to the requested time range and handle the no-leap calendar (drop Feb 29).
2. Apply ERA5 → ELM unit conversions.
3. Compute humidities (if the required columns are available).
4. Rename columns to canonical ELM names using RAW_TO_ELM.
5. Clip canonical nonnegative variables.
6. Return only the canonical variables required by elm_required_vars(dformat), plus LONGXY/LATIXY/time/gid/zone (coordinates and metadata).
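The time filtering, Feb 29 removal, clipping, and column selection steps can be sketched in plain Python. This is a simplified illustration under stated assumptions: the real method operates on a merged table (df_merged), whereas this hypothetical preprocess_rows_sketch takes a list of dicts, and the column name PRECTmms is used only as an illustrative nonnegative variable.

```python
from datetime import datetime


def preprocess_rows_sketch(rows, start_year, end_year, calendar,
                           nonneg_cols, keep_cols):
    """Filter, clip, and select columns from dict rows with a 'time' key."""
    out = []
    for row in rows:
        t = row["time"]
        # Keep only rows inside the inclusive year range.
        if not (start_year <= t.year <= end_year):
            continue
        # A no-leap calendar has no Feb 29, so drop those rows.
        if calendar == "noleap" and t.month == 2 and t.day == 29:
            continue
        rec = dict(row)
        # Clip variables that must be nonnegative.
        for col in nonneg_cols:
            if col in rec:
                rec[col] = max(rec[col], 0.0)
        # Emit only the requested columns, in the requested order.
        out.append({col: rec[col] for col in keep_cols})
    return out


rows = [
    {"time": datetime(2019, 12, 31, 23), "gid": "a", "PRECTmms": 0.1},
    {"time": datetime(2020, 2, 29, 0), "gid": "a", "PRECTmms": 0.2},
    {"time": datetime(2020, 3, 1, 0), "gid": "a", "PRECTmms": -0.5, "extra": 1},
]
cleaned = preprocess_rows_sketch(
    rows, 2020, 2020, "noleap", ["PRECTmms"], ["time", "gid", "PRECTmms"]
)
```

Building each output row from keep_cols makes the column order deterministic regardless of how the input rows were assembled.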

required_vars(dformat): Return the canonical ELM variables required for the requested output format.