Validation: published IDF_CC (UWO) worked examples vs. Cascade Hydro
This page is the readable baseline for a second external-validation suite
(tests/test_validation_idfcc_london.py). It complements the
ECCC real-data validation: where that page checks the
Gumbel/Method-of-Moments core against Environment and Climate Change Canada's
published numbers, this page cross-checks both distributions the
industry-standard IDF_CC tool (University of Western Ontario) publishes
worked numbers for, on a single shared annual-maximum series:
- Gumbel (EV1) by Method of Moments, and
- GEV by L-moments (Hosking & Wallis 1997).
The GEV / L-moments column is the reason this page exists. It is the only
published, reproducible ground truth available for Cascade Hydro's closed-form
GEV estimator (idf_analyzer.analysis.frequency._gev_lmoments_params), the same
estimator the IDF_CC tool uses.
Every number in the result tables below is produced by the shipped analysis code
on the committed fixture. To regenerate them, run
pytest tests/test_validation_idfcc_london.py -v.
Scope: this page validates the statistical core against a second authority
This page checks parameter estimation and return levels for Gumbel and GEV against a published worked example. It is the analogue of VALIDATION.md Layer 1. The full daily-gauge pipeline (screening, curve fitting, sub-daily ratios) is validated end to end on the ECCC station and is not repeated here.
| Field | Value |
|---|---|
| Reference station | London, Ontario (ECCC), IDF_CC manual Appendix B |
| Source | Schardong, Gaur, Simonovic & Sandink (2025), Computerized Tool for the Development of IDF Curves Under a Changing Climate — Technical Manual v.8.0, Water Resources Research Report No. 115, UWO. ISBN (online) 978-0-7714-3160-9 |
| Methods (stated) | Gumbel / Method of Moments; GEV / L-moments (manual §3.1.3) |
| Test file | tests/test_validation_idfcc_london.py |
| Ground-truth fixture | tests/data/idfcc_london_ontario_appendixB.txt (manual transcribed verbatim) |
This fixture lives under tests/ and is not packaged into the installed
application. It exists only to gate releases.
1. Why this is a clean benchmark
The IDF_CC manual prints, for one London annual-maximum series, three things that make a direct comparison possible:
- The annual-maximum series (AMS) it fitted, per duration (Appendix B).
- The resulting Gumbel return-period depths (manual p. 26).
- The resulting GEV return-period depths (manual p. 28).
The manual states its estimators as Gumbel / Method of Moments and GEV /
L-moments, which are exactly the two closed-form estimators Cascade Hydro uses.
The input data and estimators are therefore identical to the manual's, so any
disagreement comes only from table rounding (2 dp) and the shared Hosking-Wallis
shape polynomial. The tables are read directly from the committed fixture by
parse_annual_maxima(), parse_published_gumbel() and parse_published_gev();
they are never hand-typed into the test.
2. Methodology
2.1 Gumbel (EV1) by Method of Moments
Identical to the ECCC page: scale \(\beta = s\sqrt{6}/\pi\), location
\(\mu = \bar{x} - \gamma\beta\), return level \(x_T = \mu + \beta\,y_T\) with
\(y_T = -\ln[-\ln(1-1/T)]\). See VALIDATION.md §3.1
for the worked 24 h derivation. This is
fit_distribution(series, dist_name="gumbel_r", method="MoM").
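The three formulas above fit in a few lines of standard-library Python. This is a minimal sketch, not the shipped `fit_distribution` code: the function names `gumbel_mom` and `gumbel_return_level` are hypothetical, the series is made up, and the use of the sample (n-1) standard deviation is an assumption.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gumbel_mom(series):
    """Return (mu, beta) for a Gumbel (EV1) fit by Method of Moments."""
    mean = statistics.fmean(series)
    std = statistics.stdev(series)  # assumption: sample (n-1) standard deviation
    beta = std * math.sqrt(6.0) / math.pi   # scale
    mu = mean - EULER_GAMMA * beta          # location
    return mu, beta

def gumbel_return_level(mu, beta, T):
    """Depth with return period T years: x_T = mu + beta * y_T."""
    y_T = -math.log(-math.log(1.0 - 1.0 / T))  # Gumbel reduced variate
    return mu + beta * y_T

# Hypothetical annual maxima in mm, not taken from the manual.
ams = [31.2, 45.0, 38.7, 52.3, 60.1, 41.8, 36.4, 48.9]
mu, beta = gumbel_mom(ams)
levels = {T: gumbel_return_level(mu, beta, T) for T in (2, 5, 10, 25, 50, 100)}
for T, depth in levels.items():
    print(f"{T:>3}-yr: {depth:.2f} mm")
```

By construction the fitted distribution's mean, \(\mu + \gamma\beta\), equals the sample mean, which is a quick sanity check on any Method-of-Moments implementation.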
2.2 GEV by L-moments (Hosking & Wallis 1997)
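The annual maxima give unbiased sample probability-weighted moments \(b_0, b_1, b_2\), then L-moments \(\ell_1, \ell_2, \ell_3\) and the L-skewness \(t_3 = \ell_3/\ell_2\). The GEV parameters follow in closed form (manual Eqs. 15-17):
\[k = 7.8590\,c + 2.9554\,c^2, \qquad c = \frac{2}{3 + t_3} - \frac{\ln 2}{\ln 3}\]
\[\alpha = \frac{\ell_2\,k}{(1 - 2^{-k})\,\Gamma(1+k)}\]
\[\mu = \ell_1 - \frac{\alpha\,[1 - \Gamma(1+k)]}{k}\]
with shape \(k\), scale \(\alpha\), location \(\mu\). The \(T\)-year depth is the GEV quantile
\[x_T = \mu + \frac{\alpha}{k}\left\{1 - \left[-\ln\left(1 - \tfrac{1}{T}\right)\right]^{k}\right\}\]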
Sign convention. Cascade Hydro evaluates the GEV through scipy's
genextreme, whose shape parameter \(c_{\text{scipy}}\) equals Hosking's \(k\)
directly (both are the negative of the Coles \(\xi\)):
\(F(x) = \exp\{-[1 - c_{\text{scipy}}(x-\mu)/\alpha]^{1/c_{\text{scipy}}}\}\). So
\(k\) is passed through unchanged as \((c, \text{loc}, \text{scale}) = (k, \mu, \alpha)\);
\(k \to 0\) is the Gumbel limit. This is
fit_distribution(series, dist_name="genextreme", method="L-moments"), and
_gev_lmoments_params implements the four equations above.
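The L-moments route from a series to \((k, \mu, \alpha)\) can be sketched as follows. This is an illustration, not the body of `_gev_lmoments_params`: the function names are hypothetical, and the coefficients 7.8590 and 2.9554 are the standard Hosking shape approximation, assumed here to match `GEV_LMOM_SHAPE_POLY`. The demo feeds in the rounded \(\ell_1\), \(\ell_2\) and \(k\) quoted in the 5 min worked example of this section.

```python
import math

def sample_lmoments(series):
    """Return (l1, l2, t3) from unbiased probability-weighted moments."""
    x = sorted(series)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l1 = b0
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    return l1, l2, l3 / l2

def gev_shape(t3):
    """Hosking's polynomial approximation for the shape k."""
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    return 7.8590 * c + 2.9554 * c * c

def gev_scale_location(l1, l2, k):
    """Return (alpha, mu) for a given shape k (k == 0 is not handled here)."""
    g = math.gamma(1.0 + k)
    alpha = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    mu = l1 - alpha * (1.0 - g) / k
    return alpha, mu

# Rounded 5 min values quoted in this section: l1, l2 and the fitted k.
alpha, mu = gev_scale_location(9.677, 1.740, -0.112)
print(f"alpha = {alpha:.3f}, mu = {mu:.3f}")
```

Because the inputs are rounded to three decimals, the result should agree with the quoted \((\mu, \alpha)\) only to about that precision. The sketch leaves out the \(k \to 0\) Gumbel limit, which a production estimator has to handle separately.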
Worked example, 5 min column (manual Example 3.2). Fitting the printed 5 min
AMS reproduces the manual's L-moments to rounding (\(\ell_1 = 9.677\) vs the
manual's \(9.681\), \(\ell_2 = 1.740\) vs \(1.739\)) and yields GEV parameters
\((k, \mu, \alpha) = (-0.112,\ 8.108,\ 2.238)\) against the manual's
\((-0.106,\ 8.120,\ 2.253)\). The resulting 5 min / 100 yr depth is 21.6 mm
against the manual's 21.47 mm (0.5%). test_gev_lmoments_matches_worked_example_params
pins these parameters.
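The 100 yr figures in this worked example can be re-derived from the quoted parameters alone. The sketch below writes the quantile and CDF in the Hosking form given under the sign convention; `gev_quantile` and `gev_cdf` are hypothetical helper names, and the parameters are the rounded values printed above, so the last digit may differ from the shipped code.

```python
import math

def gev_quantile(k, mu, alpha, T):
    """T-year depth for a GEV with Hosking shape k (k != 0)."""
    return mu + (alpha / k) * (1.0 - (-math.log(1.0 - 1.0 / T)) ** k)

def gev_cdf(x, k, mu, alpha):
    """Non-exceedance probability, same sign convention as the quantile."""
    return math.exp(-((1.0 - k * (x - mu) / alpha) ** (1.0 / k)))

tool = (-0.112, 8.108, 2.238)    # (k, mu, alpha) from Cascade Hydro, rounded
manual = (-0.106, 8.120, 2.253)  # (k, mu, alpha) printed in the manual

x_tool = gev_quantile(*tool, T=100)
x_manual = gev_quantile(*manual, T=100)
rel = abs(x_tool - x_manual) / x_manual

print(f"tool   5 min / 100 yr: {x_tool:.2f} mm")
print(f"manual 5 min / 100 yr: {x_manual:.2f} mm")
print(f"relative difference:   {rel:.2%}")
# Round trip: the CDF at the 100 yr depth should return 1 - 1/100.
print(f"F(x_tool) = {gev_cdf(x_tool, *tool):.4f}")
```

Assuming scipy is installed, `scipy.stats.genextreme.ppf(1 - 1/T, c=k, loc=mu, scale=alpha)` should return the same depths, since the shape is passed through unchanged.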
2.3 A note on the manual's 24 h moments
The manual's Example 3.1 prints the 24 h column as mean 53.67 mm, std 17.46 mm. The 24 h AMS as printed in the manual's own Appendix B (transcribed verbatim in the fixture) gives mean 53.66 mm — consistent — but std 17.21 mm, a 1.4% gap. This is a localized inconsistency within the manual, not a transcription error here:
- The mean is consistent, so only the spread differs; a single shifted AMS value cannot explain it while holding the mean, so it is most likely one or two tail values in the printed 24 h column (or the printed std) differing from the series actually fitted.
- For the 5 min column the printed AMS instead reproduces the manual's own worked L-moments to rounding (above), so the discrepancy does not affect the other eight durations.
The published 24 h depths track the manual's printed std (\(\approx 17.46\) mm), so
reproducing them from the printed AMS lands at ~0.7% rather than at rounding. The
test asserts the moments the printed AMS actually yields (53.66 / 17.21) and flags
this in test_parsed_moments_match_manual_printed_stats.
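The size of this effect can be estimated from the two moment pairs alone. The sketch below is an illustration under stated assumptions: it pushes each (mean, std) pair quoted in this section through the Gumbel / Method-of-Moments formulas of §2.1 and compares the 100 yr depths. It uses rounded moments rather than the published depth table, and `gumbel_depth` is a hypothetical helper name.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gumbel_depth(mean, std, T):
    """Gumbel / Method-of-Moments return level from a mean and std."""
    beta = std * math.sqrt(6.0) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu + beta * (-math.log(-math.log(1.0 - 1.0 / T)))

from_ams = gumbel_depth(53.66, 17.21, 100)     # moments of the printed AMS
from_manual = gumbel_depth(53.67, 17.46, 100)  # moments printed in Example 3.1
gap = abs(from_ams - from_manual) / from_manual

print(f"{from_ams:.2f} mm vs {from_manual:.2f} mm, gap {gap:.2%}")
```

The gap comes out at roughly 0.7%, in line with the 24 h / 100-yr Gumbel cell in §3.1, which supports reading the residual as a consequence of the std mismatch rather than of the estimator.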
2.4 Error metric
Relative error against the manual's published value, \(\varepsilon = \lvert x_{\text{tool}} - x_{\text{manual}}\rvert / x_{\text{manual}}\), over all 54 cells (9 durations × 6 return periods) per distribution.
3. Results
The manual's own AMS goes through fit_distribution (Gumbel/MoM, then
GEV/L-moments), and the result is compared cell by cell to the manual's published
depth tables. Pass bar: 1.5% (_TOLERANCE = 0.015).
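The cell-by-cell comparison amounts to the following check. This is a simplified sketch rather than the test file itself: `relative_error` and `cells_over_tolerance` are hypothetical names, `TOLERANCE` is assumed to mirror `_TOLERANCE`, and the second cell is invented purely to show a failure.

```python
TOLERANCE = 0.015  # assumed to mirror _TOLERANCE in the test file (1.5%)

def relative_error(tool, manual):
    """Relative error of a computed depth against the published depth."""
    return abs(tool - manual) / manual

def cells_over_tolerance(tool, manual, tol=TOLERANCE):
    """Return {cell: error} for every cell whose error exceeds tol."""
    errors = {cell: relative_error(tool[cell], manual[cell]) for cell in manual}
    return {cell: err for cell, err in errors.items() if err > tol}

# Depths in mm keyed by (duration, return period). The first cell uses the
# 5 min / 100 yr GEV figures quoted in section 2.2; the second cell is
# hypothetical and exists only to trip the tolerance.
manual = {("5 min", 100): 21.47, ("hypothetical", 2): 10.00}
tool = {("5 min", 100): 21.60, ("hypothetical", 2): 10.20}

failures = cells_over_tolerance(tool, manual)
print(failures)  # only the hypothetical cell (2% error) exceeds 1.5%
```

Reporting the failing cells by key, rather than a single pass/fail flag, makes it easier to tell a one-cell transcription problem from a systematic estimator change.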
3.1 Gumbel (EV1): relative error per cell (%)
| Duration | 2-yr | 5-yr | 10-yr | 25-yr | 50-yr | 100-yr |
|---|---|---|---|---|---|---|
| 5 min | 0.04 | 0.03 | 0.12 | 0.17 | 0.17 | 0.21 |
| 10 min | 0.77 | 0.60 | 0.53 | 0.48 | 0.44 | 0.42 |
| 15 min | 0.74 | 0.65 | 0.57 | 0.53 | 0.51 | 0.48 |
| 30 min | 0.07 | 0.23 | 0.31 | 0.38 | 0.40 | 0.44 |
| 1 h | 0.43 | 0.41 | 0.39 | 0.38 | 0.37 | 0.37 |
| 2 h | 0.32 | 0.17 | 0.12 | 0.06 | 0.02 | 0.00 |
| 6 h | 0.15 | 0.09 | 0.19 | 0.29 | 0.33 | 0.38 |
| 12 h | 0.41 | 0.43 | 0.45 | 0.47 | 0.48 | 0.48 |
| 24 h | 0.07 | 0.28 | 0.42 | 0.57 | 0.65 | 0.72 |
Worst-case 0.77% (10 min, 2-yr), mean 0.36%.
3.2 GEV by L-moments: relative error per cell (%)
| Duration | 2-yr | 5-yr | 10-yr | 25-yr | 50-yr | 100-yr |
|---|---|---|---|---|---|---|
| 5 min | 0.16 | 0.14 | 0.02 | 0.17 | 0.33 | 0.51 |
| 10 min | 1.01 | 1.09 | 0.91 | 0.50 | 0.10 | 0.36 |
| 15 min | 0.95 | 0.97 | 0.86 | 0.57 | 0.30 | 0.05 |
| 30 min | 0.14 | 0.00 | 0.18 | 0.49 | 0.78 | 1.08 |
| 1 h | 0.42 | 0.66 | 0.73 | 0.77 | 0.74 | 0.68 |
| 2 h | 0.37 | 0.36 | 0.30 | 0.21 | 0.11 | 0.00 |
| 6 h | 0.26 | 0.08 | 0.09 | 0.32 | 0.50 | 0.71 |
| 12 h | 0.31 | 0.32 | 0.42 | 0.59 | 0.72 | 0.89 |
| 24 h | 0.20 | 0.52 | 0.49 | 0.33 | 0.15 | 0.07 |
Worst-case 1.09% (10 min, 5-yr), mean 0.44%. As with Gumbel and with the ECCC station, the largest errors sit at the short durations where depths are smallest (10-15 min, 12-21 mm) and the manual's 0.01 mm rounding is largest in relative terms. Both distributions reproduce the manual to better than the 1.5% bar across all 108 cells.
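The summary lines under both tables can be re-derived from the tabulated percentages. The sketch below copies the two tables from this page and recomputes the worst case and the mean. Because the cells are already rounded to 0.01%, the means match the quoted figures only after rounding to two decimals.

```python
from statistics import fmean

DURATIONS = ["5 min", "10 min", "15 min", "30 min", "1 h", "2 h", "6 h", "12 h", "24 h"]
PERIODS = [2, 5, 10, 25, 50, 100]
PASS_BAR = 1.5  # percent

# Relative errors (%) copied row by row from sections 3.1 and 3.2.
GUMBEL = [
    [0.04, 0.03, 0.12, 0.17, 0.17, 0.21], [0.77, 0.60, 0.53, 0.48, 0.44, 0.42],
    [0.74, 0.65, 0.57, 0.53, 0.51, 0.48], [0.07, 0.23, 0.31, 0.38, 0.40, 0.44],
    [0.43, 0.41, 0.39, 0.38, 0.37, 0.37], [0.32, 0.17, 0.12, 0.06, 0.02, 0.00],
    [0.15, 0.09, 0.19, 0.29, 0.33, 0.38], [0.41, 0.43, 0.45, 0.47, 0.48, 0.48],
    [0.07, 0.28, 0.42, 0.57, 0.65, 0.72],
]
GEV = [
    [0.16, 0.14, 0.02, 0.17, 0.33, 0.51], [1.01, 1.09, 0.91, 0.50, 0.10, 0.36],
    [0.95, 0.97, 0.86, 0.57, 0.30, 0.05], [0.14, 0.00, 0.18, 0.49, 0.78, 1.08],
    [0.42, 0.66, 0.73, 0.77, 0.74, 0.68], [0.37, 0.36, 0.30, 0.21, 0.11, 0.00],
    [0.26, 0.08, 0.09, 0.32, 0.50, 0.71], [0.31, 0.32, 0.42, 0.59, 0.72, 0.89],
    [0.20, 0.52, 0.49, 0.33, 0.15, 0.07],
]

def summarize(table):
    """Return (worst error, duration, return period, mean error, cell count)."""
    cells = [(err, DURATIONS[i], PERIODS[j])
             for i, row in enumerate(table) for j, err in enumerate(row)]
    worst, duration, period = max(cells)
    return worst, duration, period, fmean(c[0] for c in cells), len(cells)

for name, table in (("Gumbel", GUMBEL), ("GEV", GEV)):
    worst, duration, period, mean, n = summarize(table)
    print(f"{name}: worst {worst:.2f}% ({duration}, {period}-yr), "
          f"mean {mean:.2f}%, {n} cells, pass={worst < PASS_BAR}")
```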
4. How to use this for drift detection
- Run the suite: pytest tests/test_validation_idfcc_london.py -v (offline, deterministic, part of the release gate).
- If it fails, compare the reported numbers to §3 here:
  - A Gumbel regression means the Method-of-Moments or return-level core changed (it should also fail the ECCC suite).
  - A GEV regression means the L-moments estimator (_gev_lmoments_params) changed. Treat it as serious; it should match the IDF_CC worked example to ~1%.
- If the change is intentional, update the tolerance in the test and the baseline tables here in the same commit, with a note on why.
5. Provenance & reproducibility
- The ground-truth file transcribes the IDF_CC manual verbatim (tests/data/idfcc_london_ontario_appendixB.txt): Appendix B (AMS), the Gumbel table (p. 26) and the GEV table (p. 28). The test parses all three from it; the numbers are never hand-typed into code.
- Both fitting paths call the shipped analysis code (idf_analyzer.analysis.frequency.fit_distribution), the same code the GUI uses, so this validates the product rather than a parallel reimplementation.
- The Hosking & Wallis L-moment GEV estimator lives in idf_analyzer.analysis.frequency._gev_lmoments_params and is documented in the in-app Methods & References tab under L-moments (Hosking), with its shape polynomial rendered live from the code constant GEV_LMOM_SHAPE_POLY.