Peak-Over Threshold

Estimating the Tail - PoT

Fat tails are difficult to estimate almost by definition. Because tail events are so infrequent and can be so dramatic, it is effectively impossible to know which distribution governs the tails. Even if we did know the distribution, the small number of data points renders the usual statistical methods useless for estimating its parameters. This is why Taleb discourages projecting or forecasting fat-tailed processes and instead suggests focusing on heuristics and approximate approaches to assess risk/reward.

Still, without some kind of model, it’s difficult to even make approximate assessments, so we will forge ahead.

Coles (2001, Ch. 4) outlines what appears to be the most widely accepted method for estimating fat-tail parameters: Threshold Modeling. Compared with more common statistical methods, the approach involves a fair amount of visual interpretation and subjective judgment. Threshold Modeling pairs naturally with the generalized Pareto, which is the distribution fitted to the excesses over a high threshold.

The process is outlined as follows:

  • The goal is to find some threshold of X above which a fat tail applies. This threshold plays the role of the location parameter of the generalized Pareto.

  • For reasons detailed in Coles, and provided the shape parameter is below 1 so that the mean exists, the conditional expectation of the excess of X over a threshold (the mean excess) is linear in the threshold:

\[\begin{split}E(X - \theta \mid X > \theta) = \frac{\sigma_{\theta_0} + \epsilon\theta}{1 - \epsilon} \\\text{where: } \theta = \text{ the threshold value} \\\theta_0 = \text{ a baseline threshold above which the generalized Pareto holds} \\\epsilon = \text{ the shape parameter} = 1 / \alpha \\\sigma_{\theta_0} = \text{ the scale parameter at } \theta_0\end{split}\]
  • Because of this linear relationship, we can inspect a Mean Residual Life (MRL) plot, which charts the mean excess against a range of thresholds, for linearity. A threshold above which the plot is approximately linear is evidence of a transition into a generalized Pareto distribution.

  • We then fit the shape and scale parameters to the threshold excesses over a range of thresholds. The favored threshold is the greatest one at which the estimates still show low variability and tight confidence bands.

  • With the above, we can select a single threshold value and visually inspect the fit of the CDF, the Return Value / Period, the QQ Plot, and the PP Plot. The closer the data sit to the regression line and within the confidence bands in the extreme tails, the better the threshold estimate.
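
The mean-excess step of the process above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names mean_excess and mrl_curve are hypothetical, and the data are synthetic draws from a Pareto with tail index alpha = 3, chosen because its theoretical mean excess over a threshold u is u / (alpha - 1), a straight line in u.

```python
import random
from statistics import fmean


def mean_excess(data, threshold):
    """Return (mean of x - threshold over all x > threshold, number of such x)."""
    excesses = [x - threshold for x in data if x > threshold]
    if not excesses:
        raise ValueError("no observations exceed the threshold")
    return fmean(excesses), len(excesses)


def mrl_curve(data, thresholds):
    """Return (threshold, mean excess, count) rows: the raw values behind an MRL plot."""
    rows = []
    for u in thresholds:
        m, n = mean_excess(data, u)
        rows.append((u, m, n))
    return rows


# Synthetic data (an assumption for illustration, not real observations):
# Pareto with minimum 1 and tail index alpha, so the shape parameter is 1 / alpha.
rng = random.Random(0)
alpha = 3.0
sample = [rng.paretovariate(alpha) for _ in range(50_000)]

for u, m, n in mrl_curve(sample, [1.0, 1.5, 2.0, 2.5, 3.0]):
    theory = u / (alpha - 1.0)
    print(f"threshold={u:.1f}  mean excess={m:.3f}  theory={theory:.3f}  exceedances={n}")
```

Plotting the second column against the first gives the MRL plot. The exceedance count is returned alongside each mean because it shrinks quickly as the threshold rises, which is why the right-hand end of an MRL plot is noisy and has to be read with caution.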

See the excellent package thresholdmodeling for more details.
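
For readers who want to see the parameter-fitting step in isolation, here is a minimal sketch that does not use or assume anything about the thresholdmodeling API. It relies on scipy.stats.genpareto instead and fits the shape and scale by maximum likelihood at several candidate thresholds. The helper name fit_excesses, the choice of quantiles, and the synthetic data (a generalized Pareto with shape 0.5 and scale 1) are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import genpareto


def fit_excesses(data, threshold):
    """Fit a generalized Pareto to the excesses over `threshold` by maximum likelihood.

    The location is pinned at 0 because the threshold has already been subtracted.
    Returns (shape, scale, number of excesses).
    """
    data = np.asarray(data, dtype=float)
    excesses = data[data > threshold] - threshold
    shape, _loc, scale = genpareto.fit(excesses, floc=0)
    return shape, scale, excesses.size


# Synthetic data (an assumption for illustration, not real observations).
rng = np.random.default_rng(0)
sample = genpareto.rvs(0.5, loc=0, scale=1.0, size=20_000, random_state=rng)

for q in (0.80, 0.90, 0.95, 0.98):
    u = np.quantile(sample, q)
    shape, scale, n = fit_excesses(sample, u)
    # Coles works with a modified scale, scale - shape * threshold, which should
    # stay roughly constant once the threshold is high enough.
    modified_scale = scale - shape * u
    print(f"q={q:.2f}  threshold={u:.2f}  shape={shape:.3f}  "
          f"modified scale={modified_scale:.3f}  excesses={n}")
```

Reading the output the way the process above suggests means looking for the range of thresholds over which the shape and modified scale stay roughly flat, bearing in mind that the estimates get noisier as the number of excesses falls. Confidence bands are omitted here; they would need a separate step such as a bootstrap or a likelihood-based interval.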