Validation: published IDF_CC (UWO) worked examples vs. Cascade Hydro

This page is the readable baseline for a second external-validation suite (`tests/test_validation_idfcc_london.py`). It complements the ECCC real-data validation: that page checks the Gumbel/Method-of-Moments core against Environment and Climate Change Canada's published numbers, while this page cross-checks both distributions for which the industry-standard IDF_CC tool (University of Western Ontario) publishes worked numbers, fitted to a single shared annual-maximum series:

- Gumbel (EV1) by Method of Moments, and
- GEV by L-moments (Hosking & Wallis 1997).

The GEV / L-moments column is the reason this page exists. It is the only published, reproducible ground truth available for Cascade Hydro's closed-form GEV estimator (idf_analyzer.analysis.frequency._gev_lmoments_params), the same estimator the IDF_CC tool uses.

Every number in the result tables below is produced by the shipped analysis code on the committed fixture. To regenerate them, run pytest tests/test_validation_idfcc_london.py -v.

Scope: this page validates the statistical core against a second authority

This page checks parameter estimation and return levels for Gumbel and GEV against a published worked example. It is the analogue of VALIDATION.md Layer 1. The full daily-gauge pipeline (screening, curve fitting, sub-daily ratios) is validated end to end on the ECCC station and is not repeated here.

| Field | Value |
| --- | --- |
| Reference station | London, Ontario (ECCC), IDF_CC manual Appendix B |
| Source | Schardong, Gaur, Simonovic & Sandink (2025), Computerized Tool for the Development of IDF Curves Under a Changing Climate: Technical Manual v.8.0, Water Resources Research Report No. 115, UWO. ISBN (online) 978-0-7714-3160-9 |
| Methods (stated) | Gumbel / Method of Moments; GEV / L-moments (manual §3.1.3) |
| Test file | `tests/test_validation_idfcc_london.py` |
| Ground-truth fixture | `tests/data/idfcc_london_ontario_appendixB.txt` (transcribed verbatim from the manual) |

This fixture lives under tests/ and is not packaged into the installed application. It exists only to gate releases.

1. Why this is a clean benchmark

The IDF_CC manual prints, for one London annual-maximum series, three things that make a direct comparison possible:

- The annual-maximum series (AMS) it fitted, per duration (Appendix B).
- The resulting Gumbel return-period depths (manual p. 26).
- The resulting GEV return-period depths (manual p. 28).

The manual states its estimators as Gumbel / Method of Moments and GEV / L-moments, which are exactly the two closed-form estimators Cascade Hydro uses. The input data and estimators are therefore nominally identical to the manual's, so disagreement should be limited to table rounding (2 dp), the shared Hosking-Wallis shape polynomial, and any difference between the printed AMS and the series the manual actually fitted (see §2.3). The tables are read directly from the committed fixture by `parse_annual_maxima()`, `parse_published_gumbel()` and `parse_published_gev()`; they are never hand-typed into the test.

2. Methodology

2.1 Gumbel (EV1) by Method of Moments

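Identical to the ECCC page: scale \(\beta = s\sqrt{6}/\pi\), location \(\mu = \bar{x} - \gamma\beta\), return level \(x_T = \mu + \beta\,y_T\) with \(y_T = -\ln[-\ln(1-1/T)]\), where \(\bar{x}\) and \(s\) are the mean and standard deviation of the annual-maximum series and \(\gamma \approx 0.5772\) is the Euler-Mascheroni constant. See VALIDATION.md §3.1 for the worked 24 h derivation. In code, this is `fit_distribution(series, dist_name="gumbel_r", method="MoM")`.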
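The Method-of-Moments formulas above reduce to a few lines. The following is a minimal sketch, not the shipped `fit_distribution` code: the function names `gumbel_mom_params` and `gumbel_return_level` are hypothetical, the standard deviation is assumed to use the n - 1 denominator, and the short series is made up purely to exercise the formulas.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant (gamma in the text)


def gumbel_mom_params(series):
    """Return (mu, beta) for a Gumbel (EV1) fit by Method of Moments."""
    mean = statistics.fmean(series)
    std = statistics.stdev(series)  # assumption: n - 1 denominator
    beta = std * math.sqrt(6) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def gumbel_return_level(mu, beta, T):
    """Depth with return period T years: x_T = mu + beta * y_T."""
    y_T = -math.log(-math.log(1.0 - 1.0 / T))  # Gumbel reduced variate
    return mu + beta * y_T


# Illustrative (made-up) annual maxima in mm, NOT taken from the manual.
ams = [41.2, 55.0, 38.7, 62.3, 47.9, 71.4, 50.1, 44.6]
mu, beta = gumbel_mom_params(ams)
levels = {T: gumbel_return_level(mu, beta, T) for T in (2, 5, 10, 25, 50, 100)}
for T, depth in levels.items():
    print(f"{T:>3}-yr: {depth:.2f} mm")
```

By construction the fitted parameters reproduce the sample moments: \(\mu + \gamma\beta\) returns the sample mean and \(\beta\pi/\sqrt{6}\) returns the sample standard deviation.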

2.2 GEV by L-moments (Hosking & Wallis 1997)

The annual maxima give unbiased sample probability-weighted moments \(b_0, b_1, b_2\), then L-moments \(\ell_1, \ell_2, \ell_3\) and the L-skewness \(t_3 = \ell_3/\ell_2\). The GEV parameters follow in closed form (manual Eqs. 15-17):

\[ c = \frac{2}{3 + t_3} - \frac{\ln 2}{\ln 3}, \qquad k = 7.8590\,c + 2.9554\,c^2, \]

\[ \alpha = \frac{\ell_2\,k}{(1 - 2^{-k})\,\Gamma(1+k)}, \qquad \mu = \ell_1 - \alpha\,\frac{1 - \Gamma(1+k)}{k}, \]

with shape \(k\), scale \(\alpha\), location \(\mu\). The \(T\)-year depth is the GEV quantile

\[ x_T = \mu + \frac{\alpha}{k}\!\left[\,1 - \left(-\ln(1-\tfrac{1}{T})\right)^{k}\right]. \]
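The chain from annual maxima to a \(T\)-year depth can be sketched in plain Python. This is an illustrative reimplementation under stated assumptions, not the shipped `_gev_lmoments_params`: the function names are hypothetical, the unbiased probability-weighted moments use the standard rank-weighted form, the input series is made up, and the Gumbel limit \(k = 0\) is deliberately not handled.

```python
import math


def sample_lmoments(series):
    """Return (l1, l2, l3) from unbiased probability-weighted moments."""
    x = sorted(series)
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 values for l3")
    # j is the 0-based rank, i.e. (rank - 1) in the usual 1-based notation.
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * x[j] for j in range(n)) / (n * (n - 1) * (n - 2))
    return b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0


def gev_lmoments_params(l1, l2, l3):
    """Return (k, mu, alpha) from the closed-form equations above."""
    t3 = l3 / l2
    c = 2.0 / (3.0 + t3) - math.log(2) / math.log(3)
    k = 7.8590 * c + 2.9554 * c * c  # Hosking shape polynomial
    g = math.gamma(1.0 + k)
    alpha = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    mu = l1 - alpha * (1.0 - g) / k  # not defined at k == 0 (Gumbel limit)
    return k, mu, alpha


def gev_return_level(k, mu, alpha, T):
    """GEV quantile for return period T years (Hosking sign convention)."""
    y = -math.log(1.0 - 1.0 / T)
    return mu + alpha / k * (1.0 - y ** k)


# Illustrative (made-up) annual maxima in mm, NOT taken from the manual.
ams = [7.1, 8.4, 9.0, 9.9, 10.2, 11.5, 12.8, 15.3, 8.8, 9.4]
k, mu, alpha = gev_lmoments_params(*sample_lmoments(ams))
print(k, mu, alpha, gev_return_level(k, mu, alpha, 100))
```

A useful self-check is the round trip: substituting the fitted \((k, \mu, \alpha)\) back into \(\ell_1 = \mu + \alpha[1 - \Gamma(1+k)]/k\) and \(\ell_2 = \alpha(1 - 2^{-k})\Gamma(1+k)/k\) should return the sample values exactly, while \(t_3\) is recovered only to the accuracy of the shape polynomial.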

Sign convention. Cascade Hydro evaluates the GEV through scipy's `genextreme`, whose shape parameter \(c_{\text{scipy}}\) equals Hosking's \(k\) directly (both are the negative of the Coles \(\xi\)): \(F(x) = \exp\{-[1 - c_{\text{scipy}}(x-\mu)/\alpha]^{1/c_{\text{scipy}}}\}\). Note that \(c_{\text{scipy}}\) is unrelated to the intermediate \(c\) in the shape polynomial above. So \(k\) is passed through unchanged as \((c_{\text{scipy}}, \text{loc}, \text{scale}) = (k, \mu, \alpha)\); \(k \to 0\) is the Gumbel limit. In code, this is `fit_distribution(series, dist_name="genextreme", method="L-moments")`, and `_gev_lmoments_params` implements the four parameter equations above (for \(c\), \(k\), \(\alpha\) and \(\mu\)).
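The sign convention can be checked numerically without scipy. The sketch below is an illustration only: the helper names are hypothetical and the parameters are the rounded 5 min values quoted on this page. It confirms that the CDF written in Hosking's \(k\) inverts to the quantile formula of §2.2, and that the same quantile follows from the Coles form with \(\xi = -k\).

```python
import math


def gev_cdf(x, k, mu, alpha):
    """CDF in Hosking's convention; valid where 1 - k*(x - mu)/alpha > 0."""
    return math.exp(-((1.0 - k * (x - mu) / alpha) ** (1.0 / k)))


def gev_quantile(p, k, mu, alpha):
    """Quantile in Hosking's convention (the x_T formula with p = 1 - 1/T)."""
    return mu + alpha / k * (1.0 - (-math.log(p)) ** k)


def gev_quantile_coles(p, xi, mu, sigma):
    """Quantile in the Coles convention, where xi = -k."""
    return mu + sigma / xi * ((-math.log(p)) ** (-xi) - 1.0)


k, mu, alpha = -0.112, 8.108, 2.238  # rounded 5 min parameters from this page
cdf_errors, coles_errors = [], []
for T in (2, 5, 10, 25, 50, 100):
    p = 1.0 - 1.0 / T
    x = gev_quantile(p, k, mu, alpha)
    cdf_errors.append(abs(gev_cdf(x, k, mu, alpha) - p))
    coles_errors.append(abs(x - gev_quantile_coles(p, -k, mu, alpha)))

print(max(cdf_errors), max(coles_errors))  # both at floating-point noise level
```

With scipy available, the same quantile would be expected from `scipy.stats.genextreme.ppf(p, k, loc=mu, scale=alpha)`, since the page states that \(c_{\text{scipy}} = k\).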

Worked example, 5 min column (manual Example 3.2). Fitting the printed 5 min AMS reproduces the manual's L-moments to rounding (\(\ell_1 = 9.677\) vs the manual's \(9.681\), \(\ell_2 = 1.740\) vs \(1.739\)) and yields GEV parameters \((k, \mu, \alpha) = (-0.112,\ 8.108,\ 2.238)\) against the manual's \((-0.106,\ 8.120,\ 2.253)\). The resulting 5 min / 100 yr depth is 21.6 mm against the manual's 21.47 mm (0.5%). test_gev_lmoments_matches_worked_example_params pins these parameters.
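The 100 yr figure in the worked example can be re-derived from the rounded parameters alone. This is a quick arithmetic sketch, with a hypothetical helper name; because the parameters are rounded to three decimals, it reproduces the quoted depths and error only approximately.

```python
import math


def gev_return_level(k, mu, alpha, T):
    """GEV quantile for return period T years (Hosking sign convention)."""
    y = -math.log(1.0 - 1.0 / T)
    return mu + alpha / k * (1.0 - y ** k)


PAGE_PARAMS = (-0.112, 8.108, 2.238)    # (k, mu, alpha) fitted to the printed AMS
MANUAL_PARAMS = (-0.106, 8.120, 2.253)  # (k, mu, alpha) printed in the manual
MANUAL_DEPTH_100YR = 21.47              # mm, manual's published 5 min / 100 yr depth

x_page = gev_return_level(*PAGE_PARAMS, T=100)
x_manual = gev_return_level(*MANUAL_PARAMS, T=100)
rel_err = abs(x_page - MANUAL_DEPTH_100YR) / MANUAL_DEPTH_100YR

print(f"from page parameters:   {x_page:.2f} mm")
print(f"from manual parameters: {x_manual:.2f} mm")
print(f"relative error vs published depth: {100 * rel_err:.2f}%")
```

The manual's own rounded parameters land on its published depth to within about 0.01 mm, which suggests the quantile formula and sign convention are being read the same way as in the manual.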

2.3 A note on the manual's 24 h moments

The manual's Example 3.1 prints the 24 h column as mean 53.67 mm, std 17.46 mm. The 24 h AMS as printed in the manual's own Appendix B (transcribed verbatim in the fixture) gives a mean of 53.66 mm, which is consistent, but a std of 17.21 mm, a 1.4% gap. This is a localized inconsistency within the manual, not a transcription error here:

- The mean is consistent, so only the spread differs. A single altered AMS value large enough to move the std by 1.4% would also move the mean, so the likelier explanation is that a couple of tail values in the printed 24 h column (or the printed std itself) differ from the series actually fitted.
- For the 5 min column the printed AMS instead reproduces the manual's own worked L-moments to rounding (above), so the discrepancy appears confined to the 24 h column rather than affecting the other eight durations.

The published 24 h depths track the manual's std of \(\approx 17.46\) mm, so reproducing them from the printed AMS lands at ~0.7% rather than at rounding. The test asserts the moments the printed AMS actually yields (53.66 / 17.21) and flags this in `test_parsed_moments_match_manual_printed_stats`.
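The size of this effect can be estimated from the quoted moments alone. The sketch below is a back-of-envelope check, not part of the test suite: it pushes both pairs of moments through the Gumbel Method-of-Moments formulas of §2.1 and compares the resulting 100 yr depths. The helper name `gumbel_depth` is hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329
Y_100 = -math.log(-math.log(1.0 - 1.0 / 100))  # Gumbel reduced variate, T = 100


def gumbel_depth(mean, std, y_T):
    """Gumbel / Method-of-Moments return level from sample moments."""
    beta = std * math.sqrt(6) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu + beta * y_T


# Moments quoted on this page for the 24 h column (mm).
std_gap = (17.46 - 17.21) / 17.46                   # manual std vs printed-AMS std
x_printed_ams = gumbel_depth(53.66, 17.21, Y_100)   # what the printed AMS yields
x_manual_stats = gumbel_depth(53.67, 17.46, Y_100)  # what the manual's stats imply
depth_gap = abs(x_printed_ams - x_manual_stats) / x_manual_stats

print(f"std gap:          {100 * std_gap:.2f}%")
print(f"100 yr depth gap: {100 * depth_gap:.2f}%")
```

The std gap comes out near 1.4% and the implied 100 yr depth gap near 0.7%, in line with the figures quoted above.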

2.4 Error metric

Relative error against the manual's published value, \(\varepsilon = \lvert x_{\text{tool}} - x_{\text{manual}}\rvert / x_{\text{manual}}\), over all 54 cells (9 durations × 6 return periods) per distribution.
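As a minimal sketch of this metric, the snippet below computes \(\varepsilon\) per cell and reduces a grid to its worst and mean values. The function names and the small depth grid are hypothetical placeholders, not values from the manual or from the test file.

```python
def relative_error(x_tool, x_manual):
    """epsilon = |x_tool - x_manual| / x_manual for one table cell."""
    return abs(x_tool - x_manual) / x_manual


def summarize(cells):
    """Return (worst_key, worst_error, mean_error) for {key: (tool, manual)}."""
    errors = {key: relative_error(tool, manual) for key, (tool, manual) in cells.items()}
    worst_key = max(errors, key=errors.get)
    mean_error = sum(errors.values()) / len(errors)
    return worst_key, errors[worst_key], mean_error


# Placeholder depths in mm, keyed by (duration, return period); illustrative only.
cells = {
    ("5 min", 2): (10.10, 10.00),
    ("5 min", 100): (20.10, 20.00),
    ("24 h", 100): (99.50, 100.00),
}
worst_key, worst, mean = summarize(cells)
print(worst_key, f"{100 * worst:.2f}%", f"{100 * mean:.2f}%")
```

Normalizing by the manual's value rather than the tool's keeps the reference fixed, so the same cell yields the same denominator across code changes.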

3. Results

The manual's own AMS goes through fit_distribution (Gumbel/MoM, then GEV/L-moments), and the result is compared cell by cell to the manual's published depth tables. Pass bar: 1.5% (_TOLERANCE = 0.015).

3.1 Gumbel (EV1): relative error per cell (%)

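| Duration | 2-yr | 5-yr | 10-yr | 25-yr | 50-yr | 100-yr |
| --- | --- | --- | --- | --- | --- | --- |
| 5 min | 0.04 | 0.03 | 0.12 | 0.17 | 0.17 | 0.21 |
| 10 min | 0.77 | 0.60 | 0.53 | 0.48 | 0.44 | 0.42 |
| 15 min | 0.74 | 0.65 | 0.57 | 0.53 | 0.51 | 0.48 |
| 30 min | 0.07 | 0.23 | 0.31 | 0.38 | 0.40 | 0.44 |
| 1 h | 0.43 | 0.41 | 0.39 | 0.38 | 0.37 | 0.37 |
| 2 h | 0.32 | 0.17 | 0.12 | 0.06 | 0.02 | 0.00 |
| 6 h | 0.15 | 0.09 | 0.19 | 0.29 | 0.33 | 0.38 |
| 12 h | 0.41 | 0.43 | 0.45 | 0.47 | 0.48 | 0.48 |
| 24 h | 0.07 | 0.28 | 0.42 | 0.57 | 0.65 | 0.72 |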

Worst-case 0.77% (10 min, 2-yr), mean 0.36%.

3.2 GEV by L-moments: relative error per cell (%)

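| Duration | 2-yr | 5-yr | 10-yr | 25-yr | 50-yr | 100-yr |
| --- | --- | --- | --- | --- | --- | --- |
| 5 min | 0.16 | 0.14 | 0.02 | 0.17 | 0.33 | 0.51 |
| 10 min | 1.01 | 1.09 | 0.91 | 0.50 | 0.10 | 0.36 |
| 15 min | 0.95 | 0.97 | 0.86 | 0.57 | 0.30 | 0.05 |
| 30 min | 0.14 | 0.00 | 0.18 | 0.49 | 0.78 | 1.08 |
| 1 h | 0.42 | 0.66 | 0.73 | 0.77 | 0.74 | 0.68 |
| 2 h | 0.37 | 0.36 | 0.30 | 0.21 | 0.11 | 0.00 |
| 6 h | 0.26 | 0.08 | 0.09 | 0.32 | 0.50 | 0.71 |
| 12 h | 0.31 | 0.32 | 0.42 | 0.59 | 0.72 | 0.89 |
| 24 h | 0.20 | 0.52 | 0.49 | 0.33 | 0.15 | 0.07 |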

Worst-case 1.09% (10 min, 5-yr), mean 0.44%. As with Gumbel and with the ECCC station, the largest errors sit at the short durations where depths are smallest (10-15 min, 12-21 mm) and the manual's 0.01 mm rounding is largest in relative terms. Both distributions reproduce the manual to better than the 1.5% bar across all 108 cells.
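The summary figures can be recomputed from the two tables as printed. The sketch below copies the percentages from §3.1 and §3.2 and checks the worst case, the mean and the 1.5% pass bar; the `summarize` helper is hypothetical and operates on the rounded table values, not on the test's unrounded errors.

```python
RETURN_PERIODS = ("2-yr", "5-yr", "10-yr", "25-yr", "50-yr", "100-yr")
TOLERANCE_PCT = 1.5  # pass bar, in percent

GUMBEL = {
    "5 min": (0.04, 0.03, 0.12, 0.17, 0.17, 0.21),
    "10 min": (0.77, 0.60, 0.53, 0.48, 0.44, 0.42),
    "15 min": (0.74, 0.65, 0.57, 0.53, 0.51, 0.48),
    "30 min": (0.07, 0.23, 0.31, 0.38, 0.40, 0.44),
    "1 h": (0.43, 0.41, 0.39, 0.38, 0.37, 0.37),
    "2 h": (0.32, 0.17, 0.12, 0.06, 0.02, 0.00),
    "6 h": (0.15, 0.09, 0.19, 0.29, 0.33, 0.38),
    "12 h": (0.41, 0.43, 0.45, 0.47, 0.48, 0.48),
    "24 h": (0.07, 0.28, 0.42, 0.57, 0.65, 0.72),
}
GEV = {
    "5 min": (0.16, 0.14, 0.02, 0.17, 0.33, 0.51),
    "10 min": (1.01, 1.09, 0.91, 0.50, 0.10, 0.36),
    "15 min": (0.95, 0.97, 0.86, 0.57, 0.30, 0.05),
    "30 min": (0.14, 0.00, 0.18, 0.49, 0.78, 1.08),
    "1 h": (0.42, 0.66, 0.73, 0.77, 0.74, 0.68),
    "2 h": (0.37, 0.36, 0.30, 0.21, 0.11, 0.00),
    "6 h": (0.26, 0.08, 0.09, 0.32, 0.50, 0.71),
    "12 h": (0.31, 0.32, 0.42, 0.59, 0.72, 0.89),
    "24 h": (0.20, 0.52, 0.49, 0.33, 0.15, 0.07),
}


def summarize(table):
    """Return (worst_duration, worst_period, worst_pct, mean_pct, n_cells)."""
    cells = [
        (err, duration, period)
        for duration, row in table.items()
        for period, err in zip(RETURN_PERIODS, row)
    ]
    worst_err, worst_duration, worst_period = max(cells)
    mean_err = sum(err for err, _, _ in cells) / len(cells)
    return worst_duration, worst_period, worst_err, mean_err, len(cells)


for name, table in (("Gumbel", GUMBEL), ("GEV", GEV)):
    duration, period, worst, mean, n = summarize(table)
    passed = worst < TOLERANCE_PCT
    print(f"{name}: worst {worst:.2f}% ({duration}, {period}), "
          f"mean {mean:.2f}%, {n} cells, pass={passed}")
```

Run as written, this reproduces the stated worst cases (0.77% and 1.09%) and means (0.36% and 0.44%) over 54 cells per distribution.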

4. How to use this for drift detection

- Run the suite: `pytest tests/test_validation_idfcc_london.py -v` (offline, deterministic, part of the release gate).
- If it fails, compare the reported numbers to §3 here:
  - A Gumbel regression means the Method-of-Moments or return-level core changed (it should also fail the ECCC suite).
  - A GEV regression means the L-moments estimator (`_gev_lmoments_params`) changed. Treat it as serious; it should match the IDF_CC worked example to ~1%.
- If the change is intentional, update the tolerance in the test and the baseline tables here in the same commit, with a note on why.

5. Provenance & reproducibility

- The ground-truth file transcribes the IDF_CC manual verbatim (`tests/data/idfcc_london_ontario_appendixB.txt`): Appendix B (AMS), the Gumbel table (p. 26) and the GEV table (p. 28). The test parses all three from it; the numbers are never hand-typed into code.
- Both fitting paths call the shipped analysis code (`idf_analyzer.analysis.frequency.fit_distribution`), the same code the GUI uses, so this validates the product rather than a parallel reimplementation.
- The Hosking & Wallis L-moment GEV estimator lives in `idf_analyzer.analysis.frequency._gev_lmoments_params` and is documented in the in-app Methods & References tab under L-moments (Hosking), with its shape polynomial rendered live from the code constant `GEV_LMOM_SHAPE_POLY`.