Flood Frequency Analysis
Estimate design flood magnitudes directly from observed streamflow records using Flood Frequency Analysis (FFA). This guide covers annual maximum and partial-duration series, the seven supported probability distributions, parameter estimation (method of moments, L-moments, MLE, PWM), goodness-of-fit testing, confidence intervals via bootstrap, outlier detection using Grubbs-Beck, and weighted regional skew following USGS Bulletin 17C.
Introduction
Flood frequency analysis (FFA) is a cornerstone of hydrological engineering. It provides a systematic framework for estimating the magnitude of flood events associated with specific return periods (recurrence intervals). These estimates are essential for the design of hydraulic structures such as bridges, culverts, dam spillways, stormwater systems, and floodplain management infrastructure.
The fundamental approach involves fitting theoretical probability distributions to a series of observed extreme streamflow values. Once a distribution is fitted, its inverse cumulative distribution function (quantile function) can be used to estimate flood magnitudes beyond the observed record, enabling engineers to derive design flows for return periods of 50, 100, or even 10 000 years.
AMS vs PDS
Two sampling strategies are commonly used to extract extremes from a continuous streamflow record:
- Annual maximum series (AMS): the single largest peak flow from each water year is retained, giving one value per year. AMS is the standard input to flood frequency procedures worldwide (Bulletin 17C, UK Flood Estimation Handbook, Australian Rainfall and Runoff). Its strength is statistical simplicity — the values are approximately independent and identically distributed.
- Partial-duration series (PDS) / peaks-over-threshold (POT): all independent peaks above a chosen threshold are retained, regardless of year. PDS uses more of the information in the record, which can improve estimates at short return periods, but requires careful threshold selection and an independence criterion between peaks.
For return periods beyond about 10 years, AMS and PDS estimates converge. For shorter return periods (1 – 5 years), PDS tends to give slightly higher quantiles because more events per year contribute to the sample.
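The two sampling strategies can be illustrated with a minimal Python sketch (the function names are illustrative, not part of the tool):

```python
from datetime import date, timedelta

def annual_maxima(dates, flows):
    """One value per calendar year: the largest daily flow (AMS)."""
    ams = {}
    for d, q in zip(dates, flows):
        ams[d.year] = max(ams.get(d.year, float("-inf")), q)
    return ams

def peaks_over_threshold(dates, flows, threshold, min_gap_days=7):
    """All independent peaks above `threshold` (PDS/POT). Peaks closer
    together than `min_gap_days` are merged, keeping the larger value."""
    peaks = []
    for d, q in zip(dates, flows):
        if q <= threshold:
            continue
        if peaks and (d - peaks[-1][0]).days < min_gap_days:
            if q > peaks[-1][1]:
                peaks[-1] = (d, q)  # same flood event: keep the higher peak
        else:
            peaks.append((d, q))
    return peaks
```

A real POT extraction would use the water year rather than the calendar year and a hydrologically motivated independence criterion (e.g. inter-event recession), but the structure is the same.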
AEP and ARI terminology
Return period is equivalent to the reciprocal of annual exceedance probability (AEP) for large floods:
A 1:100 year flood has an AEP of 1% (0.01) and an average recurrence interval (ARI) of about 100 years. Terminology varies by country — AEP is preferred in Australian and modern American practice because it avoids the common misinterpretation that a 1:100 year flood occurs exactly once every 100 years, when in fact it has a 1% chance of being exceeded in any year (and roughly a 63% chance of being exceeded at least once in 100 years).
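The "roughly 63%" figure follows from the binomial complement, assuming independent years:

```python
def exceed_at_least_once(T, n):
    """Probability that a 1-in-T-year flood is exceeded at least once in n years."""
    aep = 1.0 / T                      # annual exceedance probability
    return 1.0 - (1.0 - aep) ** n      # complement of "never exceeded in n years"
```

For T = 100 and n = 100 this gives about 0.634, i.e. a structure designed for the 1:100 year flood has close to a two-in-three chance of seeing it exceeded over a 100-year life.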
Probability distributions
The choice of distribution is critical to the accuracy of flood frequency estimates. Different distributions make different assumptions about the shape of the flood frequency curve, particularly in the upper tail where design floods are estimated. The tool supports seven commonly used distributions.
Gumbel (Extreme Value Type I)
The Gumbel distribution is the simplest extreme value distribution, with two parameters (location μ and scale β). It assumes a fixed coefficient of skewness of approximately 1.1396, which makes it suitable for regions where flood data exhibit moderate positive skewness. Gumbel is widely used in European and many international standards.
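Because the Gumbel distribution has a closed-form quantile function, design flows follow directly once location and scale are known. A sketch (parameter values below are purely illustrative):

```python
import math

def gumbel_quantile(T, loc, scale):
    """Gumbel flow for return period T: x = loc - scale * ln(-ln(1 - 1/T))."""
    p = 1.0 - 1.0 / T                  # annual non-exceedance probability
    return loc - scale * math.log(-math.log(p))
```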
Log-Normal (2-parameter)
The Log-Normal distribution assumes that the natural logarithms of the data follow a normal distribution. It is characterised by two parameters (mean and standard deviation of the log-transformed data) and produces a positively skewed distribution in the original space. It is a good default choice for many hydrological variables when the coefficient of skewness in log-space is near zero.
Log-Pearson Type III
The Log-Pearson III (LP3) distribution is the standard distribution recommended by the United States Water Resources Council and documented in USGS Bulletin 17C. It extends the Log-Normal by adding a skewness parameter, providing greater flexibility in fitting the upper tail of flood frequency curves. LP3 fits a Pearson III distribution to the base-10 logarithms of the data.
log10(Q_T) = M + K_T · S
Where M is the mean of the log-flows, S is the standard deviation of the log-flows, and K_T is the frequency factor for return period T and skew coefficient G.
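In practice the frequency factor is evaluated numerically; a common closed-form route is the Wilson-Hilferty-based series approximation (often attributed to Kite), sketched below. This is an approximation for illustration, not necessarily the tool's internal method:

```python
from statistics import NormalDist

def lp3_frequency_factor(T, G):
    """Approximate Pearson III frequency factor for return period T, skew G."""
    z = NormalDist().inv_cdf(1.0 - 1.0 / T)   # standard normal quantile
    k = G / 6.0
    return (z + (z**2 - 1) * k + (z**3 - 6 * z) * k**2 / 3.0
            - (z**2 - 1) * k**3 + z * k**4 + k**5 / 3.0)

def lp3_quantile(T, mean_log10, std_log10, G):
    """Back-transform from log10 space: Q_T = 10 ** (mean + K_T * std)."""
    return 10 ** (mean_log10 + lp3_frequency_factor(T, G) * std_log10)
```

With zero skew the factor reduces to the standard normal quantile and LP3 collapses to the two-parameter Log-Normal.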
Generalised Extreme Value (GEV)
The GEV distribution unifies all three extreme value types (Gumbel, Fréchet, and Weibull) through a shape parameter ξ. When ξ = 0, it reduces to the Gumbel. Positive values of ξ produce heavier upper tails (Fréchet-type), while negative values produce bounded upper tails (Weibull-type). GEV is widely recommended by the World Meteorological Organization and is the default in many national guidelines, including UK FEH Volume 3.
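A sketch of the GEV quantile function in the sign convention used here (positive shape = heavy tail):

```python
import math

def gev_quantile(T, loc, scale, shape):
    """GEV flow for return period T (shape > 0: Frechet; shape < 0: Weibull)."""
    p = 1.0 - 1.0 / T          # non-exceedance probability
    y = -math.log(p)           # Gumbel reduced variate
    if abs(shape) < 1e-9:      # shape -> 0 recovers the Gumbel distribution
        return loc - scale * math.log(y)
    return loc + (scale / shape) * (y ** (-shape) - 1.0)
```

Note that sign conventions differ between references (Hosking's κ is the negative of the shape used here), so check conventions before comparing fitted values across software.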
Pearson Type III
The Pearson III distribution is a three-parameter gamma distribution that can represent a wide range of skewness values. It is fitted to the untransformed data (unlike LP3, which works in log-space). P3 is commonly used in Australian flood frequency analysis (Australian Rainfall and Runoff, 2019).
Generalised Logistic
The Generalised Logistic distribution is recommended for flood frequency analysis in the United Kingdom (Flood Estimation Handbook, Volume 3). It has three parameters and provides flexibility in fitting both the body and tails of the distribution. Its L-moment ratios differ from the GEV, making it a useful alternative for comparison on the same dataset.
Generalised Pareto
The Generalised Pareto distribution is commonly used for peaks-over-threshold (POT) analysis, but can also be applied to annual maxima. It is a two-parameter distribution (scale σ and shape ξ) with a location threshold u. It is particularly useful for modelling the tail behaviour of extreme events when a natural threshold can be identified.
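A sketch of the Generalised Pareto quantile function for exceedances above the threshold:

```python
import math

def gpd_quantile(p, threshold, scale, shape):
    """GPD value with non-exceedance probability p above the threshold."""
    if abs(shape) < 1e-9:      # shape -> 0 reduces to the exponential distribution
        return threshold - scale * math.log(1.0 - p)
    return threshold + (scale / shape) * ((1.0 - p) ** (-shape) - 1.0)
```

In a POT setting, p is the non-exceedance probability of an individual peak; converting to an annual return period additionally requires the average number of peaks per year.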
Parameter estimation
Once a distribution is selected, its parameters must be estimated from the observed data. Four estimation methods are supported, each with distinct advantages.
Method of moments (MOM)
Equates theoretical distribution moments (mean, variance, skewness) to sample moments. Simple and intuitive, but can be sensitive to outliers and may produce biased estimates for small samples. MOM is required for Bulletin 17C compliance with LP3.
L-moments (LMOM)
Based on linear combinations of order statistics. L-moments are more robust to outliers than conventional moments and provide nearly unbiased estimates even for small samples. They are strongly recommended by Hosking & Wallis (1997) and are the foundation of regional flood frequency analysis.
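Sample L-moments are computed from probability-weighted moments of the ordered data. A minimal sketch for the first three:

```python
def sample_l_moments(data):
    """Unbiased sample L-moments: mean (l1), L-scale (l2), L-skewness (t3)."""
    x = sorted(data)                    # order statistics, ascending
    n = len(x)
    # probability-weighted moments b0, b1, b2 (0-based index i has rank i + 1)
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    return l1, l2, l3 / l2
```

Distribution parameters then follow by matching these sample values to each distribution's theoretical L-moments; closed-form relations are tabulated in Hosking & Wallis (1997).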
Maximum likelihood estimation (MLE)
Finds the parameters that maximise the probability of observing the given data. MLE is asymptotically efficient (optimal for large samples) but may not converge for small samples or certain distribution-data combinations. MLE-fitted parameters are required for the AIC and BIC information criteria used in model selection.
Probability weighted moments (PWM)
Closely related to L-moments, PWM uses expectations of order statistics weighted by probability. PWM estimators are available in closed form for most distributions and share the robustness properties of L-moments.
| Method | Strength | Weakness |
|---|---|---|
| Method of moments | Simple; Bulletin 17C compliant | Outlier-sensitive; biased for small n |
| L-moments | Robust; nearly unbiased for small n | Less efficient than MLE for very long records |
| MLE | Asymptotically efficient; basis for AIC/BIC | May fail to converge; outlier-sensitive |
| PWM | Closed-form; robust | Limited distribution support historically |
Goodness-of-fit testing
Goodness-of-fit (GoF) tests measure how well a fitted distribution matches the observed data. The tool applies multiple complementary tests and ranking criteria to help identify the best-fit distribution.
Kolmogorov-Smirnov (KS) test
Measures the maximum absolute difference between the empirical cumulative distribution function (ECDF) and the fitted theoretical CDF. It gives equal weight to all parts of the distribution. Less powerful than Anderson-Darling for detecting tail discrepancies, but widely used and easy to interpret.
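The KS statistic itself is simple to compute from the ordered sample; a sketch in which the fitted CDF is passed in as a function:

```python
def ks_statistic(data, cdf):
    """Maximum absolute gap between the sample ECDF and a fitted CDF."""
    x = sorted(data)
    n = len(x)
    d = 0.0
    for i, xi in enumerate(x, start=1):
        f = cdf(xi)
        # ECDF jumps from (i-1)/n to i/n at xi; check the gap on both sides
        d = max(d, i / n - f, f - (i - 1) / n)
    return d
```

Because the distribution parameters are estimated from the same data, textbook KS critical values are optimistic; read the statistic comparatively across candidate distributions rather than as a strict hypothesis test.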
Anderson-Darling (AD) test
Measures the discrepancy between the ECDF and the theoretical CDF, with greater weight given to the tails. This makes it particularly suitable for flood frequency analysis, where accurate tail estimation is critical. Lower AD statistics indicate a better fit.
Chi-square test
Groups data into bins and compares observed vs expected frequencies. Results depend on the choice of binning, which can limit its reliability for small samples. Included for completeness and regulatory reporting.
AIC / BIC information criteria
The Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) balance goodness-of-fit against model complexity by penalising additional parameters. Lower values indicate a better trade-off. These require MLE-fitted parameters.
AIC = 2k - 2 ln(L̂)    BIC = k ln(n) - 2 ln(L̂)
Where k is the number of free parameters, n is the sample size, and L̂ is the maximised likelihood.
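Both criteria are one-liners once the maximised log-likelihood is known:

```python
import math

def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 ln(L-hat)."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L-hat)."""
    return k * math.log(n) - 2 * log_lik
```

BIC penalises extra parameters more heavily than AIC once n ≥ 8, so on short records it tends to favour two-parameter distributions such as Gumbel and Log-Normal.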
Plotting positions
Empirical exceedance probabilities are assigned to ranked observations using a plotting position formula, which determines where each observation appears on a frequency plot. The tool supports the Weibull, Cunnane, and Gringorten formulas. For an observation of rank m (m = 1 for the largest) in a record of length n:
- Weibull: p = m / (n + 1)
- Cunnane: p = (m - 0.4) / (n + 0.2)
- Gringorten: p = (m - 0.44) / (n + 0.12)
The Weibull formula is unbiased for the mean of the exceedance probability; Cunnane is closer to unbiased for the quantile itself and is recommended for visual comparison with fitted frequency curves.
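All three formulas are special cases of the general form p = (m - a) / (n + 1 - 2a), with a = 0 (Weibull), 0.40 (Cunnane), and 0.44 (Gringorten):

```python
def plotting_positions(n, formula="cunnane"):
    """Exceedance probabilities for ranks m = 1 (largest) to n."""
    a = {"weibull": 0.0, "cunnane": 0.40, "gringorten": 0.44}[formula]
    return [(m - a) / (n + 1 - 2 * a) for m in range(1, n + 1)]
```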
Confidence intervals
Quantile estimates are subject to sampling uncertainty. The tool computes confidence intervals using bootstrap resampling: the observed data are resampled with replacement many times (default: 500 iterations), the distribution is re-fitted to each resampled dataset, and quantiles are estimated. The resulting distribution of quantile estimates provides percentile-based confidence bounds.
For example, at a 95% confidence level, the lower bound is the 2.5th percentile and the upper bound is the 97.5th percentile of the bootstrapped quantile estimates. Wider confidence intervals indicate greater uncertainty, which typically increases for longer return periods and shorter data records.
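The bootstrap loop is conceptually simple. A sketch using a Gumbel method-of-moments fit for brevity (the tool applies the same loop to whichever distribution and estimator you select):

```python
import math
import random
import statistics

def fit_gumbel_mom(data):
    """Method-of-moments Gumbel fit: scale = s * sqrt(6) / pi."""
    mean, s = statistics.fmean(data), statistics.stdev(data)
    scale = s * math.sqrt(6) / math.pi
    return mean - 0.5772 * scale, scale   # (location, scale)

def bootstrap_ci(data, T, n_boot=500, level=0.95, seed=42):
    """Percentile bootstrap confidence interval for the T-year quantile."""
    rng = random.Random(seed)
    p = 1.0 - 1.0 / T
    q = []
    for _ in range(n_boot):
        sample = rng.choices(data, k=len(data))   # resample with replacement
        loc, scale = fit_gumbel_mom(sample)
        q.append(loc - scale * math.log(-math.log(p)))
    q.sort()
    alpha = (1.0 - level) / 2.0
    return q[int(alpha * n_boot)], q[int((1.0 - alpha) * n_boot) - 1]
```

Strictly, resampling the observed data is a non-parametric bootstrap; a parametric variant would instead simulate synthetic records from the fitted distribution. Both are common in FFA software.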
Outlier detection
The tool implements the Grubbs-Beck test for detecting low and high outliers, following the methodology described in USGS Bulletin 17C. Outliers are data points that depart significantly from the trend of the remaining data when plotted on log-probability paper.
High outliers may represent rare catastrophic events that genuinely belong in the record. Retaining them can inflate quantile estimates, but removing them may underestimate risk.
Low outliers — termed Potentially Influential Low Floods (PILFs) in Bulletin 17C — can distort the fitted distribution, particularly the skewness estimate. The Bulletin 17C approach identifies PILFs using the Multiple Grubbs-Beck test and applies a conditional probability adjustment in which the low flows are treated as censored data below a threshold.
Data quality and homogeneity
Before fitting any distribution, the annual maximum series should satisfy the statistical assumptions of independence, identical distribution, and stationarity. The following tests are recommended:
- Mann-Kendall trend test — a non-parametric test for a monotonic trend in the series. A significant trend indicates non-stationarity, often linked to land-use change, reservoir regulation, or climate change.
- Pettitt change-point test — a non-parametric test for an abrupt shift in the mean. Useful for detecting dam construction, land-use step changes, or changes in measurement method.
- Wald-Wolfowitz runs test — tests whether the sequence of values above and below the median is random (an independence test).
- Spearman rank correlation — tests for a monotonic trend and is complementary to Mann-Kendall.
If a significant trend or change-point is detected, the series is not stationary and standard FFA is not strictly valid. Options include shortening the record to a stationary sub-period, detrending the data, or applying a non-stationary framework in which the distribution parameters are functions of time or a climate covariate.
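These screening tests are lightweight to run. As an illustration, a minimal Mann-Kendall sketch (no correction for tied values):

```python
import math

def mann_kendall(series):
    """Mann-Kendall trend test: S statistic and normal-approximation Z.
    |Z| > 1.96 indicates a significant monotonic trend at the 5% level."""
    n = len(series)
    # S counts concordant minus discordant pairs over all i < j
    s = sum((series[j] > series[i]) - (series[j] < series[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var)
    elif s < 0:
        z = (s + 1) / math.sqrt(var)
    else:
        z = 0.0
    return s, z
```

A full implementation also corrects the variance for ties; the sketch above is for intuition only.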
Regional skew
Sample skewness estimates from short records are highly uncertain. Bulletin 17C recommends using a weighted average of the station (at-site) skew and a regional skew estimate to reduce this uncertainty:
G_w = (MSE_R · G_S + MSE_S · G_R) / (MSE_S + MSE_R)
Where G_S is the station skew, G_R is the regional skew, and MSE_S, MSE_R are their respective mean square errors. The tool allows you to enable regional skew weighting and specify the regional skew coefficient and its MSE. In the United States, regional skew maps are published in USGS Scientific Investigations Reports; for South Africa, a regional skew surface can be derived from the national DWS network via L-moment regionalisation.
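The weighting is a two-line computation; a sketch with illustrative skew and MSE values:

```python
def weighted_skew(g_station, mse_station, g_regional, mse_regional):
    """MSE-weighted skew: each estimate is weighted by the other's mean
    square error, so the more reliable estimate dominates the result."""
    return ((mse_regional * g_station + mse_station * g_regional)
            / (mse_station + mse_regional))
```

With equal MSEs the result is the simple average of the two skews; as the record lengthens, the station-skew MSE shrinks and the weighted value moves toward the at-site estimate.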
Worked example
Consider the following annual maximum flows (m³/s) recorded at a gauging station over 20 years:
45.2, 67.8, 123.4, 89.1, 56.3, 201.5, 78.9, 95.6, 110.3, 54.7, 142.8, 63.4, 87.2, 175.3, 92.1, 58.9, 134.6, 71.5, 105.8, 82.4
Step 1 — Data entry. Paste the values into the Data Input panel. The parser accepts comma-separated, space-separated, or one-value-per-line formats.
Step 2 — Quality control. Run the Mann-Kendall and Pettitt tests to confirm stationarity. Apply the Grubbs-Beck test to flag any potential low or high outliers and review them against historical information.
Step 3 — Configure analysis. Select the distributions to fit (e.g. all seven), choose L-Moments as the estimation method, and set the return periods of interest (typically 1:2, 1:5, 1:10, 1:20, 1:50, 1:100, 1:200).
Step 4 — Run analysis. The tool computes sample statistics (mean = 96.8 m³/s, n = 20), fits all selected distributions, and ranks them by composite goodness-of-fit.
Step 5 — Review results. Examine the frequency curves to verify that the fitted distributions align with the plotting positions (Cunnane recommended). Check the GoF table for the best-fit ranking. Review bootstrap confidence intervals to assess uncertainty — a 95% CI of [230, 380] m³/s around a point estimate of 285 m³/s at 1:100 years indicates substantial extrapolation uncertainty.
Step 6 — Extract design values. Read the quantile table for your design return period. If different distributions produce substantially different 1:100 year estimates, report the range and use engineering judgement.
Limitations
FFA is a powerful technique but rests on strong statistical assumptions that are rarely fully satisfied in practice:
- Stationarity — the process generating floods is assumed not to change over time. Climate change, land-use change, and reservoir operations all violate this.
- Independence — successive annual maxima are assumed statistically independent. Persistent multi-year wet or dry phases (ENSO, PDO) can induce autocorrelation.
- Sample size — reliable estimates at 1:100 years require at least 25 – 30 years of record; 1:1000 year extrapolations from 50-year records are speculative.
- Rating-curve uncertainty — observed peaks are derived from stage via a rating curve that is often extrapolated beyond the measured range for the largest floods.
Finding gauge data
Use the Stream Gauge Finder to locate DWS (or USGS) stations near your project site, download the annual peak flow series directly, and feed it into this tool. Complement with Daily Rainfall Data when catchment-averaged precipitation is needed as a covariate for non-stationary analysis.
References
- England, J.F., Cohn, T.A., Faber, B.A., et al. (2019). Guidelines for Determining Flood Flow Frequency — Bulletin 17C. USGS Techniques and Methods, Book 4, Chapter B5.
- Hosking, J.R.M. & Wallis, J.R. (1997). Regional Frequency Analysis: An Approach Based on L-Moments. Cambridge University Press.
- Stedinger, J.R., Vogel, R.M. & Foufoula-Georgiou, E. (1993). Frequency analysis of extreme events. Chapter 18 in Handbook of Hydrology (D.R. Maidment, ed.), McGraw-Hill.
- Institute of Hydrology. (1999). Flood Estimation Handbook, Volume 3: Statistical procedures for flood frequency estimation. Centre for Ecology & Hydrology, Wallingford, UK.
- Ball, J., Babister, M., Nathan, R., et al. (2019). Australian Rainfall and Runoff: A Guide to Flood Estimation. Commonwealth of Australia (Geoscience Australia).
- Grubbs, F.E. & Beck, G. (1972). Extension of sample sizes and percentage points for significance tests of outlying observations. Technometrics, 14(4), 847 – 854.