
PROVENANCE — Sub-daily rainfall depth ratios (by climatic region)

Methodology of record. This document is the engineering provenance for the sub-daily depth-ratio table that feeds Cascade Hydro's Tier-3 (daily-only) sub-daily estimation path, replacing the previously unsourced fallback ratio table.

The quantitative per-region tables (sample sizes, record spans, medians, CIs) are produced by running scripts/derive_subdaily_ratios.py; that run regenerates this file with the numbers filled in. The sections below define the method, sources, assumptions, and limitations that hold regardless of the run.

Engineer-of-record note. This supports the engineer's sign-off; it does not replace it. Values flagged planning-level are explicitly not for final design without local data.


1. What this is and what it replaces

When only daily (24-h) annual-maximum rainfall is available, sub-daily design depths are estimated by multiplying the 24-h depth by a depth ratio for each shorter duration:

\[ \text{depth}(D) = \text{depth}(1440) \times \text{ratio}(D), \qquad \text{ratio}(D) = \frac{\text{depth}(D)}{\text{depth}(1440)} \]

so \(\text{ratio}(1440) = 1.00\) by construction, and every shorter duration is expected to be \(< 1.00\) (§6 flags and excludes any station ratio where it is not). Target durations (minutes): 5, 10, 15, 30, 60, 120, 360, 720, 1440.
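
As a minimal sketch of the scaling above: the function name `subdaily_depths` and the ratio values are hypothetical placeholders chosen for illustration, not derived ratios and not part of Cascade Hydro.

```python
TARGET_DURATIONS_MIN = (5, 10, 15, 30, 60, 120, 360, 720, 1440)


def subdaily_depths(depth_24h_mm, ratios):
    """Scale a 24-h design depth to each target duration: depth(D) = depth(1440) * ratio(D)."""
    if abs(ratios[1440] - 1.0) > 1e-9:
        raise ValueError("ratio(1440) must be 1.00 by construction")
    return {d: depth_24h_mm * ratios[d] for d in TARGET_DURATIONS_MIN}


# Placeholder ratios for illustration only; NOT values from the derivation.
example_ratios = {5: 0.10, 10: 0.15, 15: 0.18, 30: 0.25, 60: 0.32,
                  120: 0.42, 360: 0.65, 720: 0.82, 1440: 1.00}
depths = subdaily_depths(80.0, example_ratios)
```

With these placeholder inputs, `depths[60]` is 80.0 × 0.32 mm and `depths[1440]` returns the 24-h depth unchanged.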

The prior _FALLBACK_RATIOS had no documented source. This methodology derives the ratios from a citable ECCC product with a full per-station audit trail.


2. Aggregation unit — climatic region, not province

Sub-hourly:daily ratios are set by which storm type produces the short-duration annual maxima. Convective regimes (Prairies, continental interior) raise the 5–30 min ratios; synoptic/orographic regimes (Pacific Maritime, Atlantic Maritime) flatten them. That gradient follows ecological/climatic regions, not political boundaries, so the aggregation unit is the terrestrial ecozone (15 units; National Ecological Framework). Ecoprovinces (~53) are available as a finer option but thin the per-unit sample, especially in the North.

Caveat (stated). Ecozones are a climatic–vegetation stratification, not a purpose-built rainfall-regime regionalization. They are official, citable, and broadly track the convective/synoptic gradient, but are not optimised for rainfall scaling. A data-driven cross-check (cluster stations by their ratio vectors; compare to the assigned region) is recommended before treating any single region value as definitive.

A province-keyed convenience table is also produced for back-compatibility. It blends regimes within multi-zone provinces (e.g. BC spans Pacific Maritime, Montane/Boreal Cordillera, Taiga Plains) and is therefore planning-level by construction. Prefer the region table via a lat/lon lookup (see §8).


3. Primary data source

ECCC Engineering Climate Datasets — Short-Duration Rainfall IDF Files, default v3.40 (released December 2025; the version is a script parameter, --idf-version). Host: https://collaboration.cmc.ec.gc.ca/cmc/climate/Engineer_Climate/IDF/, distributed as a single nested .zip per version (<version>.zip → per-province .zip archives → per-station .txt files). Each file's "Table 1: Annual Maximum (mm)" lists the annual-maximum depth for exactly the nine target durations, already QC'd by ECCC from the high-resolution (tipping-bucket, 5-min) gauge records.
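
The nested layout described above (version .zip, then per-province .zip archives, then per-station .txt files) can be walked in memory with the standard library. This is only a sketch: `iter_station_files` is a hypothetical helper name, and the text encoding is an assumption, not something stated by ECCC.

```python
import io
import zipfile


def iter_station_files(outer_zip_bytes):
    """Yield (province_archive, station_file, text) from a nested IDF archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(outer_zip_bytes)) as outer:
        for inner_name in outer.namelist():
            if not inner_name.lower().endswith(".zip"):
                continue  # skip anything that is not a per-province archive
            with zipfile.ZipFile(io.BytesIO(outer.read(inner_name))) as inner:
                for member in inner.namelist():
                    if member.lower().endswith(".txt"):
                        # latin-1 is an assumed encoding; it never fails to decode.
                        text = inner.read(member).decode("latin-1")
                        yield inner_name, member, text
```

Parsing "Table 1" out of each yielded text is left to the derivation script; the sketch only covers the archive traversal.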

Why this source rather than re-deriving from raw sub-daily series:

  1. ECCC's public bulk archive (climate.weather.gc.ca) only resolves to 1 hour, so 5/10/15/30-min annual maxima are not obtainable from it. Table 1 is the official sub-hourly annual-maximum series.
  2. Using ECCC's published, QC'd annual-maximum series (AMS) makes the result traceable to a citable ECCC product and avoids re-implementing ECCC's quality control.

This folds two candidate approaches — empirical sub-daily and published-IDF fallback — into one defensible path that covers all nine durations.

Optional cross-check: the ≥1 h ratios can additionally be checked against annual maxima rolled up from the bulk hourly archive; this cannot touch the sub-hourly durations and is not part of the primary derivation.


4. Region boundaries

National Ecological Framework for Canada, v2.2 (Agriculture and Agri-Food Canada) — Terrestrial Ecozones of Canada (15 units) or Terrestrial Ecoprovinces of Canada (53 units), distributed as GeoJSON via open.canada.ca / agriculture.canada.ca. Stations are assigned to a region by point-in-polygon (ray casting, holes handled). Assignment method is recorded per station:

  • in_polygon — inside an official region polygon.
  • nearest_centroid — outside all polygons (typically coastal/offshore); assigned to the nearest region centroid and flagged for audit.
  • unassigned — lat/lon could not be parsed; excluded from region aggregation, listed in the CSV.
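
The three assignment outcomes can be sketched as follows. The ray-casting test and hole handling follow the description above; the centroid (mean of exterior-ring vertices) and the planar distance in degrees are assumptions of this sketch, and all function names are hypothetical rather than taken from the script.

```python
def _in_ring(lon, lat, ring):
    """Even-odd ray casting against one closed ring of (lon, lat) vertices."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Count edges crossed by a ray cast from the point toward +lon.
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def in_polygon(lon, lat, rings):
    """rings[0] is the exterior ring; rings[1:] are holes (GeoJSON Polygon layout)."""
    if not rings or not _in_ring(lon, lat, rings[0]):
        return False
    return not any(_in_ring(lon, lat, hole) for hole in rings[1:])


def _centroid(polygons):
    """Crude centroid: mean of exterior-ring vertices (an assumption of this sketch)."""
    pts = []
    for rings in polygons:
        ext = rings[0]
        pts.extend(ext[:-1] if len(ext) > 1 and ext[0] == ext[-1] else ext)
    return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))


def assign_region(lon, lat, regions):
    """regions maps name -> list of polygons; returns (region, method)."""
    if lon is None or lat is None:
        return None, "unassigned"
    for name, polygons in regions.items():
        if any(in_polygon(lon, lat, rings) for rings in polygons):
            return name, "in_polygon"

    # Planar squared distance in degrees; a real run may prefer geodesic distance.
    def dist2(name):
        cx, cy = _centroid(regions[name])
        return (cx - lon) ** 2 + (cy - lat) ** 2

    return min(regions, key=dist2), "nearest_centroid"
```

Returning the method alongside the region keeps the per-station audit trail intact, so `nearest_centroid` cases can be filtered for spot checks.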

If no boundary file is supplied, the script does not guess ecozones; it falls back to province-level aggregation with a printed warning (never an invented region).


5. Method (per station)

For each duration D, fit a Gumbel (EV-I) distribution to the station's annual maxima by method of moments (ECCC's own IDF practice; matches Cascade Hydro frequency.py):

\[ \beta = \frac{s\sqrt{6}}{\pi}, \qquad \mu = \bar{x} - \gamma\beta, \qquad \gamma = 0.5772156649 \]
\[ \text{depth}(D, T) = \mu + \beta\left[-\ln\left(-\ln\left(1 - \tfrac{1}{T}\right)\right)\right] \]

Primary ratio = design-depth ratio at return period \(T\) (default \(T = 10\) yr):

\[ \text{ratio}_T(D) = \frac{\text{depth}(D, T)}{\text{depth}(1440, T)} \]
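
The fit and the design-depth ratio can be sketched with the standard library. The function names are hypothetical, and using the sample (n − 1) standard deviation for \(s\) is an assumption; the source does not state which estimator frequency.py uses.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649


def gumbel_mom(annual_maxima):
    """Method-of-moments Gumbel fit; returns (mu, beta)."""
    s = statistics.stdev(annual_maxima)  # sample stdev (n - 1): an assumption
    beta = s * math.sqrt(6.0) / math.pi
    mu = statistics.fmean(annual_maxima) - EULER_GAMMA * beta
    return mu, beta


def design_depth(mu, beta, T):
    """Depth at return period T (years) from the Gumbel quantile function."""
    return mu + beta * (-math.log(-math.log(1.0 - 1.0 / T)))


def ratio_T(am_D, am_1440, T=10.0):
    """Design-depth ratio depth(D, T) / depth(1440, T) for one station."""
    depth_D = design_depth(*gumbel_mom(am_D), T)
    depth_24h = design_depth(*gumbel_mom(am_1440), T)
    return depth_D / depth_24h
```

Calling `ratio_T` with T = 2 and T = 100 on the same series gives the return-period sensitivity that is stored alongside the primary T = 10 value.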

Also computed and stored:

  • RP sensitivity at T = 2 and 100 yr (depth ratios are mildly RP-dependent; the spread is reported so the engineer can judge it against the inter-station spread within a region);
  • a distribution-free cross-check: \(\text{mean}(AM(D)) / \text{mean}(AM(1440))\);
  • the median of per-year paired ratios (diagnostic; noisier).
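
The two distribution-free diagnostics can be sketched as below. Series are assumed to be dicts keyed by year; the function names are hypothetical. Taking each mean over that series' own valid years (rather than only the shared years) is an assumption of this sketch.

```python
import statistics


def mean_ratio(am_D, am_1440):
    """mean(AM(D)) / mean(AM(1440)), each mean over that series' own years."""
    return statistics.fmean(am_D.values()) / statistics.fmean(am_1440.values())


def paired_median_ratio(am_D, am_1440):
    """Median of per-year ratios over the years present in both series."""
    years = sorted(set(am_D) & set(am_1440))
    return statistics.median([am_D[y] / am_1440[y] for y in years])
```

The paired version only uses years where both durations have a valid maximum, which is one reason it is noisier on short records.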


6. Inclusion / QC

  • ≥ 10 valid annual maxima per duration (ECCC's own minimum record length for IDF development). The 24-h series must also meet this minimum, because it is the denominator of every ratio.
  • -99.9 sentinel → missing; non-physical values (≤ 0) dropped.
  • A computed ratio > 1.0 for D < 1440 is flagged (data problem) and excluded from that station's contribution.
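
The screening rules above can be sketched as three small checks. The names `clean_series`, `eligible`, and `screen_ratio` are hypothetical; the thresholds are the ones listed in this section.

```python
MISSING_SENTINEL = -99.9  # missing-value code named in the QC rules
MIN_YEARS = 10            # minimum valid annual maxima per duration


def clean_series(values):
    """Drop the missing sentinel and non-physical (<= 0) annual maxima."""
    return [v for v in values if v != MISSING_SENTINEL and v > 0.0]


def eligible(am_by_duration, duration):
    """True when both the duration and the 24-h anchor have enough valid years."""
    anchor = clean_series(am_by_duration.get(1440, []))
    series = clean_series(am_by_duration.get(duration, []))
    return len(anchor) >= MIN_YEARS and len(series) >= MIN_YEARS


def screen_ratio(ratio, duration):
    """Return (ratio_or_None, flagged); ratios > 1.0 below 24 h are flagged and excluded."""
    if duration < 1440 and ratio > 1.0:
        return None, True
    return ratio, False
```

Keeping the flag separate from the value lets the per-station CSV record why a contribution was dropped.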

7. Aggregation & uncertainty (per region, per duration)

  • Region ratio = median of the per-station ratios (robust to outlier stations). The record-length-weighted mean is also reported.
  • Uncertainty: n_stations, inter-station stdev, IQR, and a 90% normal-approx CI about the mean (\(\text{mean} \pm 1.645 \cdot \text{stdev}/\sqrt{n}\)).
  • National average = median of the pooled per-station ratios across all regions.
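
A sketch of the per-region, per-duration summary follows. The function name and output keys are hypothetical, the sample (n − 1) standard deviation is assumed, and the quartile method is whatever `statistics.quantiles` uses by default, which may differ from the script's choice.

```python
import math
import statistics

Z_90 = 1.645  # two-sided 90% normal quantile


def summarize_region(ratios, record_years):
    """Summarise per-station ratios for one region and one duration."""
    n = len(ratios)
    if n == 0:
        return None  # no usable station data: leave empty, do not invent
    mean = statistics.fmean(ratios)
    weighted = sum(r * w for r, w in zip(ratios, record_years)) / sum(record_years)
    out = {"n_stations": n, "median": statistics.median(ratios),
           "weighted_mean": weighted, "mean": mean,
           "stdev": None, "iqr": None, "ci90": None}
    if n >= 2:  # spread statistics need at least two stations
        s = statistics.stdev(ratios)
        q1, _, q3 = statistics.quantiles(ratios, n=4)
        half = Z_90 * s / math.sqrt(n)
        out.update(stdev=s, iqr=q3 - q1, ci90=(mean - half, mean + half))
    return out
```

The national average would then be `statistics.median` applied to the pooled per-station ratios from all regions, not a median of the region medians.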

Confidence flags

| Flag | Meaning |
|---|---|
| documented | region-duration backed by ≥ 5 stations |
| planning-level | 1–4 stations; screening/planning use, not final design without local data |
| (blank / None) | no usable station data; value left empty, not invented. Borrow from the nearest comparable region only as an explicit, documented engineering decision |
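
The flag rule reduces to a threshold on the station count. A minimal sketch, with a hypothetical function name:

```python
def confidence_flag(n_stations):
    """Map the number of contributing stations to a confidence flag."""
    if n_stations >= 5:
        return "documented"
    if n_stations >= 1:
        return "planning-level"
    return None  # no usable station data: value stays empty
```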

8. Integration into Cascade Hydro

scripts/derive_subdaily_ratios.py writes ratios.py exposing:

  • RATIOS_BY_REGION: dict[str, dict[int, float]] (primary product).
  • NATIONAL_AVERAGE: dict[int, float].
  • PROVINCE_CONVENIENCE: dict[str, dict[int, float]] — back-compat, planning-level.
  • PROVINCE_TO_REGION: dict[str, dict[str, int]] — crosswalk (station counts).
  • CONFIDENCE: dict[str, str], SOURCE: str, POPULATED: bool.

Recommended wiring. Move the Tier-3 lookup to lat/lon → ecozone → RATIOS_BY_REGION (ECCC records already carry coordinates). When a daily-only record has no coordinates, fall back to PROVINCE_CONVENIENCE[province] (marked planning-level), then NATIONAL_AVERAGE. Keep the existing Provenance machinery: while POPULATED is False, retain the current "source required" state; once populated, set the citation from SOURCE and surface the per-region CONFIDENCE flag in the IDF report so planning-level regions are visible.
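
The fallback order in the recommended wiring can be sketched as below. `lookup_ratios` and the `region_of` callable (lat/lon to ecozone name) are hypothetical and not part of the generated ratios.py; the tables are passed in as arguments so the sketch stays self-contained. Falling through to the province table when a region cannot be resolved from coordinates is an assumption of this sketch.

```python
def lookup_ratios(lat, lon, province, *, ratios_by_region,
                  province_convenience, national_average, region_of):
    """Return (ratios, source_tag) following region -> province -> national order."""
    if lat is not None and lon is not None:
        region = region_of(lat, lon)  # hypothetical lat/lon -> ecozone lookup
        if region in ratios_by_region:
            return ratios_by_region[region], "region:" + region
    if province in province_convenience:
        # Province table blends regimes, so it is planning-level by construction.
        return province_convenience[province], "province:" + province + " (planning-level)"
    return national_average, "national"
```

Returning a source tag with the ratios makes it straightforward to surface in the IDF report which tier of the fallback chain supplied the values.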


9. Limitations

  • Ecozone is a proxy for rainfall regime (§2 caveat); validate against a data-driven clustering before relying on a single region value.
  • Northern/Arctic ecozones have few or no IDF stations; expect planning-level or empty there. Any neighbour-borrowing is an explicit, documented engineering choice — not produced automatically.
  • Header lat/lon parsing in the ECCC .txt is best-effort; stations that fail it are reported and excluded from region aggregation (they still appear in the CSV).
  • nearest_centroid assignments (coastal) should be spot-checked.
  • The IDF AMS (default v3.40) reflect ECCC's record through the dataset vintage; they do not include any climate-change uplift (handled separately in the tool).

10. Citations

  • Environment and Climate Change Canada. Engineering Climate Datasets — Short-Duration Rainfall Intensity-Duration-Frequency Data, v3.40 (2025); version configurable. climate.weather.gc.ca/prods_servs/engineering_e.html · collaboration.cmc.ec.gc.ca/cmc/climate/Engineer_Climate/IDF/ (open)
  • Agriculture and Agri-Food Canada. National Ecological Framework for Canada, v2.2 — Terrestrial Ecozones / Ecoprovinces (GeoJSON). open.canada.ca (open)
  • Ecological Stratification Working Group (1995/1996). A National Ecological Framework for Canada. AAFC (Research Branch) & Environment Canada (SOE Directorate), Ottawa/Hull. Cat. No. A42-65/1996E. (report; verify edition)
  • World Meteorological Organization. Guide to Hydrological Practices, WMO-No. 168 — depth-duration relationship context. (verify edition/section)
  • Hershfield, D.M. (1961). Rainfall Frequency Atlas of the United States (Tech. Paper 40), U.S. Weather Bureau — classic short-duration ratio context (US; cross-check only, not a Canadian source).
  • (verify) NOAA Atlas 14 (Bonnin et al.) — US sub-hourly ratio cross-check reference; not used as a Canadian source.

Verification status: the two ECCC/AAFC open datasets and their access URLs were confirmed by web search. The WMO-No. 168, Hershfield, and NOAA Atlas 14 entries are cross-check context cited from standard knowledge and carry a "verify" marker; obtain the primary documents before relying on them for anything beyond context.