dapper.surf.validate

dapper module: surf.validate.

Classes

CheckResult(check, severity, passed, detail)

One validation result row.

SurfaceValidator(*[, expected_sizes, ...])

Validator for ELM/CLM point surface NetCDF files (1×1 spatial cell).

class dapper.surf.validate.CheckResult(check, severity, passed, detail, var=None)[source]

Bases: object

One validation result row.

Parameters:
  • check (str)

  • severity (str)

  • passed (bool)

  • detail (str)

  • var (str | None)

check: str
detail: str
passed: bool
severity: str
var: Optional[str] = None
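
The fields above can be pictured as one row of a report. The sketch below does not import dapper: `CheckResultSketch` is a hypothetical stand-in that only mirrors the documented field names and types, and the check labels and detail strings are made-up illustrations. Whether the real `CheckResult` is a dataclass is an assumption.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class CheckResultSketch:
    """Hypothetical stand-in mirroring the documented CheckResult fields."""
    check: str
    severity: str
    passed: bool
    detail: str
    var: Optional[str] = None  # variable name, when the check is per-variable


# Illustrative rows; the labels and details are invented for this example.
rows = [
    CheckResultSketch("V-001 dims", "ERROR", True, "lsmlat=1, lsmlon=1"),
    CheckResultSketch("V-010 units.present", "WARN", False,
                      "missing 'units' attribute", var="SLOPE"),
]

# Keep only the rows that failed, regardless of severity.
failed = [asdict(r) for r in rows if not r.passed]
```

Each row converts to a plain dict, so a list of rows maps naturally onto the tabular report that `validate` returns.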
class dapper.surf.validate.SurfaceValidator(*, expected_sizes=None, lat_candidates=('lsmlat', 'lat', 'latitude', 'y'), lon_candidates=('lsmlon', 'lon', 'longitude', 'x'), enforce_known_vars_only=False, require_point_dims=True, skip_soft_checks=False)[source]

Bases: object

Validator for ELM/CLM point surface NetCDF files (1×1 spatial cell).

Primary (format/layout) checks

V-001 dims: lat/lon-like dims exist and both have length == 1 (ERROR)
V-002 dims.sizes: expected sizes for common dims (WARN), e.g. time=12, natpft=17, lsmpft=17, nlevsoi=10, nlevslp=11, numurbl=3, numrad=2, nlevurb=5
V-003 schema.required: required variables per SCHEMA are present (ERROR)
V-004 schema.choose_one_of: at least one var present in each group (ERROR)
V-005 schema.conditional: if driver present → dependent vars present (WARN)
V-006 registry.known_vars: vars not in REGISTRY flagged (INFO, or ERROR if enforce_known_vars_only)
V-007 dims.order: for spatial vars, spatial dims are the last two (…, lat, lon) (WARN)
V-008 dims.match_registry: non-spatial dim order matches REGISTRY (WARN)
V-009 dtype.match_registry: integer vs float matches REGISTRY (WARN)
V-010 units.present: 'units' attribute exists (WARN)
V-011 units.match_registry: exact match when REGISTRY.units is not '' or 'varies' (WARN)
V-012 fillvalue.sane: floats have _FillValue (NaN ok), ints do not rely on NaN (WARN)
V-013 coordinates.present: LATIXY/LONGXY present (WARN); optional INFO: lat/lon coord ≈ LATIXY/LONGXY
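
As an illustration of the V-001 rule, here is a minimal sketch that works on a plain mapping of dimension names to sizes. It is not the library's implementation: `find_dim` and `check_point_dims` are hypothetical helpers, and exact-name matching against the default `lat_candidates` / `lon_candidates` tuples is an assumption about what "lat/lon-like" means.

```python
from typing import Mapping, Optional, Sequence, Tuple

# Defaults copied from the documented SurfaceValidator signature.
LAT_CANDIDATES = ("lsmlat", "lat", "latitude", "y")
LON_CANDIDATES = ("lsmlon", "lon", "longitude", "x")


def find_dim(dim_sizes: Mapping[str, int],
             candidates: Sequence[str]) -> Optional[str]:
    """Return the first candidate name present in dim_sizes, else None."""
    for name in candidates:
        if name in dim_sizes:
            return name
    return None


def check_point_dims(dim_sizes: Mapping[str, int]) -> Tuple[bool, str]:
    """Hypothetical V-001-style check: both spatial dims exist with length 1."""
    lat = find_dim(dim_sizes, LAT_CANDIDATES)
    lon = find_dim(dim_sizes, LON_CANDIDATES)
    if lat is None or lon is None:
        return False, "no lat/lon-like dimension found"
    passed = dim_sizes[lat] == 1 and dim_sizes[lon] == 1
    return passed, f"{lat}={dim_sizes[lat]}, {lon}={dim_sizes[lon]}"


ok, detail = check_point_dims({"lsmlat": 1, "lsmlon": 1, "time": 12})
```

The returned pair corresponds loosely to the `passed` and `detail` fields of a result row.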

Soft (non-blocking) checks

V-101 ranges.percent: PCT_* (and PCT_NATVEG) ∈ [0,100] (ERROR)
V-102 ranges.unit: LANDFRAC_PFT, SKY_VIEW ∈ [0,1] (ERROR)
V-103 ranges.nonneg: SLOPE, (ST)DEV_ELEV, AREA, TOPO ≥ 0 (ERROR)
V-104 time.length: any var with 'time' dim → len(time)==12 (ERROR)
V-105 consistency.pftsum: sum(PCT_NAT_PFT) ≈ PCT_NATVEG (WARN)
V-106 conditional.urban: if max(PCT_URBAN)>0 → URBAN_REGION_ID present (WARN)
V-107 conditional.glacier: if max(PCT_GLACIER)>0 → GLC_MEC & PCT_GLC_MEC present (WARN)
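
The range and consistency rules (V-101, V-105) can be sketched on plain Python sequences. This is an illustration under stated assumptions, not the validator's code: the helper names are hypothetical, and the absolute tolerance used for "≈" is an arbitrary choice, since the tolerance the library applies is not documented here.

```python
import math
from typing import Sequence


def in_percent_range(values: Sequence[float]) -> bool:
    """V-101-style check: every value lies in the closed interval [0, 100]."""
    return all(0.0 <= v <= 100.0 for v in values)


def pft_sum_matches(pct_nat_pft: Sequence[float],
                    pct_natveg: float,
                    abs_tol: float = 1e-3) -> bool:
    """V-105-style check: sum(PCT_NAT_PFT) is approximately PCT_NATVEG.

    abs_tol is an assumed tolerance, not a documented default.
    """
    total = math.fsum(pct_nat_pft)  # fsum limits floating-point drift
    return math.isclose(total, pct_natveg, abs_tol=abs_tol)


# Made-up example values for a single grid cell.
consistent = pft_sum_matches([60.0, 40.0], 100.0)
inconsistent = pft_sum_matches([60.0, 30.0], 100.0)
```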

Usage

>>> v = SurfaceValidator()
>>> report = v.validate(r"X:\path\surfdata_1x1pt.nc")
>>> report.query("severity=='ERROR' and passed==False")
validate(nc_path)[source]

Run validation on a surface NetCDF and return a structured report.

Return type:

DataFrame

Parameters:

nc_path (str)

Parameters:
  • expected_sizes (Dict[str, int] | None)

  • lat_candidates (Tuple[str, ...])

  • lon_candidates (Tuple[str, ...])

  • enforce_known_vars_only (bool)

  • require_point_dims (bool)

  • skip_soft_checks (bool)