Thursday, 28 May 2009

Follow me on Twitter!

I've been quite busy lately and will be even busier in the foreseeable future (and I wouldn't like it any other way)! As a result, I will only post larger discussions here. However, you can follow me on Twitter (click here). I update it more or less daily. As always, everyone willing to discuss statistics is welcome to contact me!

Wednesday, 25 March 2009

A brief discussion of the Extreme Value Theory

I’ve been reading about the Extreme Value Theory (EVT) lately. At first sight it seems like a decent solution for problems related to the distributions of financial variables. Here is a short video introducing it:


The theory attempts to deal with the problem of fat tails / excess kurtosis (see my previous post), which is present in virtually all financial data regardless of the sampling period. Some scholars suggest using EVT-adjusted distributions for Value-at-Risk (VaR) models. I do agree that EVT is a more intelligent approach to the known problems with VaR models. However, it also requires estimating additional parameters and, as we will see, that is a terribly complex task. By the way, I’m in no way suggesting that VaR shouldn’t be adjusted due to these complexities. What I’m probably saying is that EVT is not practical and VaR should be thrown out completely.

The basic idea behind the theory is to separate a distribution into two tiers: the tails and the rest (i.e. the bulk of the distribution). Each tier is described with a different distribution. The tails are often described with some power-law distribution (for example, Pareto) while the rest could approach something similar to a normal distribution. Describing the distribution of the tails is the more challenging and, in fact, the far more important task. The first problem arises when we need to decide where to draw the line between the tails and the bulk of the distribution. It could be 2.5% in each tail and 95% in between, but these numbers are arbitrary. Ideally, we shouldn’t attribute too much of the distribution to the tails because that could lead to incorrect inferences (after all, the tails should represent rare events). On the other hand, putting less weight in the tails leaves fewer observations and makes it harder to describe them accurately. It’s an obvious trade-off. Also, deciding whether to use the same weight for the tails regardless of the variable or to use different weights for each variable (and if so, based on what) is a tough question. I’m not sure that there is an empirical answer to this, perhaps only a philosophical one.
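To make the trade-off a bit more concrete, here is a minimal sketch of my own (not something taken from the EVT literature verbatim). It assumes the Hill estimator as the way to fit a power-law tail, and it assumes toy data drawn from a Student-t distribution with 3 degrees of freedom, whose true tail exponent is 3. The function names are hypothetical. The only point is that the estimated exponent moves around depending on how much of the sample we decide to call "the tail".

```python
import math
import random

def hill_tail_exponent(data, tail_fraction):
    """Hill estimate of the tail exponent using the largest observations."""
    ordered = sorted(data, reverse=True)
    k = max(2, int(len(ordered) * tail_fraction))  # number of tail points
    threshold = ordered[k]                         # cutoff between tail and bulk
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess

def student_t_abs(df, rng):
    """Absolute value of a Student-t draw; its true tail exponent equals df."""
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return abs(rng.gauss(0.0, 1.0)) / math.sqrt(chi2 / df)

rng = random.Random(0)
sample = [student_t_abs(3, rng) for _ in range(5000)]  # assumed toy data

# The same sample, four different (arbitrary) definitions of "the tail".
for fraction in (0.25, 0.10, 0.05, 0.01):
    estimate = hill_tail_exponent(sample, fraction)
    print(f"tail fraction {fraction:.2f}: estimated exponent {estimate:.2f}")
```

A large tail fraction pulls observations from the bulk into the fit, while a small one leaves only a handful of points to estimate from, so neither end of the range is obviously right.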

Now let’s talk a bit about the distributions that could be used to describe the tails. Generally, a prudent approach would be to assume a power-law distribution. I do understand that this is terribly bad news. As far as I know, there is no research that has disproved that tails are subject to a power-law distribution. I’m not trying to say that the tails are necessarily distributed that way; all I'm saying is that it's safe to assume they are. By the way, the reason this is bad news is that it invalidates a great deal of financial tools. Also, accurately estimating the tail exponent simply isn't possible. Often even small errors in estimation can lead to very significant differences. If you are prudent enough to assume a power-law distribution in the tails, the next prudent decision you could take is to stay away from characterizing the tails. There's a very good example related to modelling tails that I once heard. Imagine an ice cube melting. You can predict the shape and size of the puddle once it melts. However, if you see only the puddle, you can't really determine what the ice cube looked like. The same is valid for modelling tails based on limited information, i.e. by observing a sample and not the true distribution... we can only guess what the true distribution is like. It's true when it comes to the entire distribution and it's even more true when it comes to the tails due to the small sample sizes.
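Here is a small illustration of why errors in the tail exponent matter so much. It is only a sketch under assumptions of mine: a pure Pareto tail with minimum value 1, for which P(X > x) = x^(-alpha) and the variance is finite only when alpha is greater than 2. The function names and the chosen exponents are hypothetical.

```python
def pareto_tail_probability(x, alpha, x_min=1.0):
    """P(X > x) for a Pareto distribution with exponent alpha."""
    return (x_min / x) ** alpha if x >= x_min else 1.0

def pareto_variance(alpha, x_min=1.0):
    """Variance of a Pareto distribution; infinite when alpha <= 2."""
    if alpha <= 2.0:
        return float("inf")
    return x_min ** 2 * alpha / ((alpha - 1.0) ** 2 * (alpha - 2.0))

# Exponents that are hard to tell apart from a small tail sample.
for alpha in (2.1, 2.5, 3.0, 3.5):
    p = pareto_tail_probability(20.0, alpha)
    v = pareto_variance(alpha)
    print(f"alpha={alpha}: P(X > 20) = {p:.2e}, variance = {v:.2f}")
```

Under these assumptions, moving the exponent by 0.5 changes the probability of an event 20 times the minimum by a factor of 20^0.5, roughly 4.5, and the variance explodes as the exponent approaches 2.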

I think there are three lessons to be learned here. First, if your position in the financial markets is dependent on accurate characterization of the tails, you are in trouble (probably quite relevant to derivatives traders...). I seriously doubt that such accuracy is possible to achieve in practice. It's probably only a matter of time until the exposure to tails bankrupts you. The second lesson relates to building a portfolio, more specifically, to what happens if you have a portfolio of many assets with finite variance and you add one asset that has infinite variance (i.e. with tails that are subject to a power-law distribution). I don't think I can express the argument better than this gentleman:

http://www.fooledbyrandomness.com/alphalecture.mp3

The last lesson could probably be put this way - a better model with more variables to estimate is worse than a simpler model with fewer variables. EVT might be great, but there's no way it can characterize the tails without even the smallest errors. Also, I feel it necessary to reiterate the conclusion from my previous post - no model is better than a bad model. Perhaps we shouldn't try to operate in terms of holistic models but rather use time-tested tricks and techniques.

Tuesday, 10 March 2009

Tutorials

I found some good statistics tutorials on YouTube. In case you feel like revising the properties of a lognormal distribution for an exam or getting a basic idea of some financial concepts (hedging, Monte Carlo simulations, VaR, Extreme Value Theory, etc.), you should definitely see these videos (currently there are about 189 of them). Each video is 7-9 minutes long and painless enough for the less quantitative among you.

Monday, 9 March 2009

A Few Words About Excess Kurtosis

As someone once famously said, a ship is built to withstand a storm, not the good weather. In the context of the current crisis this seems like a very prudent, however rarely implemented, approach. Preparing for a "storm" implies knowing something about the tails of the distribution. In a normal distribution tails are irrelevant, but reality offers little proof to accept this as a reasonable assumption. Tail events occur far more often than standard models predict (a classic illustration of this is described in "When Genius Failed" by R. Lowenstein). In fact, most models can't make any predictions about what happens in the tails. The Value-at-Risk models are essentially a peace-time metric. I'm sure you have read numerous articles in the financial press blaming (at least partially) the current crisis on heavy reliance on VaR. At the same time, proponents of the study of extreme events (such as N. N. Taleb) have received a lot of attention. An exhaustive analysis of financial data seems to confirm the need to refocus on the rare and extreme. Much of the evidence for this is related to a statistical metric known as kurtosis. Here are some of the findings:
  1. Excess kurtosis is present in any financial data
  2. Usually a small set of events accounts for the largest share of kurtosis in the data
  3. Kurtosis is highly volatile in time series (sometimes referred to as infinite kurtosis or unstable 4th moment).
I'd love to go into statistical details but I will spare you. What this evidence means is that markets are dominated by a handful of hard-to-predict, extremely influential events (as of recently referred to as Black Swans). Any models based on variance, standard deviation, correlation, ARCH and VaR could be considered useless (it's not outrageous to assume an infinite 2nd moment). It's a statistical nightmare. It's also quite disturbing that such important findings are largely ignored. This is probably due to the fact that no viable alternative methods exist. However, to quote N. N. Taleb, no model is better than a bad model. The current crisis is a prime example of this.
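For those who do want a little of the statistical detail, the second finding can be sketched in a few lines. This is my own illustration with made-up data, not real market returns: the hypothetical series below has mostly calm days and, by assumption, a 2% chance of a turbulent day with five times the usual volatility. Excess kurtosis is the fourth standardized moment minus 3 (so a normal distribution scores 0).

```python
import random

def excess_kurtosis(values):
    """Sample excess kurtosis: fourth standardized moment minus 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0

def top_share_of_fourth_moment(values, count):
    """Fraction of the summed fourth-power deviations due to the `count` largest."""
    mean = sum(values) / len(values)
    fourth = sorted(((v - mean) ** 4 for v in values), reverse=True)
    return sum(fourth[:count]) / sum(fourth)

rng = random.Random(42)
# Hypothetical daily returns: calm days (1% vol) with rare turbulent days (5% vol).
returns = [rng.gauss(0.0, 0.05 if rng.random() < 0.02 else 0.01)
           for _ in range(2500)]

print("excess kurtosis:", round(excess_kurtosis(returns), 2))
print("share of kurtosis from the 10 largest moves:",
      round(top_share_of_fourth_moment(returns, 10), 2))
```

Dropping or adding just a few of the largest observations changes the kurtosis figure considerably, which is one way to see why the measure is so unstable over time.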

It is questionable how much we can study the tails. Rare events add up to a small sample and even an undergraduate student knows that small samples are problematic. Parametrizing the tail distribution also is risky because estimation errors have tremendous effects. So how is this useful at all? It seems that we have no way to assess what lurks in the tails (various statistical techniques exist but they have their limitations). Even more importantly, it is only a matter of time until a system exposed to tail events will collapse. So probably the smartest thing to do is not to be exposed to tail events, or at least negative tail events that can ruin you.

Sunday, 8 March 2009

Hurst Exponent & Fractal Dimension

Recently I've been reading about alternative statistics, particularly Hurst Exponent and Fractal Dimension. Both of them essentially measure the same thing (Fractal Dimension = 2 - Hurst Exponent) and have extremely interesting and varied applications.

I guess it would be easier to start with the explanation of Fractal Dimension. We can measure how heavy something is. We can also measure how loud something is or even how well it fits a line. However, until fairly recently there was no way to measure how "rough" something is. A measure of roughness (i.e. Fractal Dimension) has been pioneered and developed by the mathematician B. B. Mandelbrot. A perfect line has a Fractal Dimension of 1. The British coastline has a Fractal Dimension of 1.25 while the Australian one (or was it the South African...) is 1.13 because it is smoother. Either way, you get the idea. A Brownian Motion process (based on normal distribution) has a Fractal Dimension of 1.5. A very rough/volatile process would approach 2.

Those of you working in the financial sector have probably already thought of the Fractal Dimension of a time series of asset prices. Fractal Dimension could be considered a measure of volatility. Indeed, there have been numerous attempts to measure the Fractal Dimension of various stocks and indices such as the S&P 500. This is where it becomes useful to mention the Hurst Exponent.

Most if not all mainstream financial theories are based on a simple assumption: Hurst = 0.5 (equivalent to Fractal Dimension = 1.5, often referred to as Brownian Motion or Brownian Noise). What this implies is that asset price changes are temporally independent. However, theoretically (and empirically, as we will later see) Hurst Exponents can vary between 0 and 1. Hurst larger than 0.5 indicates the presence of trends in the data such as growth and decline cycles. Hurst smaller than 0.5 means that the data exhibits anti-trend behaviour, i.e. the longer the trend continues, the higher the probability that it will reverse. By now there definitely is a great deal of empirical data showing that Hurst can vary greatly in the time series of asset prices. Hurst = 0.5 (i.e. the random walk assumed alongside the normal distribution) is at the very best a special case but not necessarily the rule. The implications of this are very profound. Very often data exhibits what is called long memory (autocorrelation that doesn't decay as quickly as expected). Furthermore, any models based on variance or on measures of central tendency are incorrect. The mean is quite meaningless (pun intended) because it's extremely dependent on outlier events. An entirely new set of statistical tools is necessary to deal with this. This might sound rather intimidating to some. To be fair, I have to mention that countless ARCH (autoregressive conditional heteroskedasticity) models were developed to deal with some of the problems I mentioned. However, to the best of my knowledge none of them are scale-invariant, which can be a significant problem.

After I had spent a few days exploring this topic, Hurst seemed like a neat measure for momentum/long memory in the data. In fact, I immediately thought of testing it out on real stock prices and perhaps formulating an investment strategy. Unfortunately, nothing is that easy. Estimating Hurst is a complex task (not necessarily mathematically but rather conceptually). There are numerous methods and none of them seems to be superior to the others. Some people have attempted generating artificial distributions with a pre-determined Hurst in order to check the accuracy of their methods. Needless to say, there's plenty of room to run into problems there, too. Estimation errors can have a very significant effect on investment decisions, so it's difficult to rely on the Hurst Exponent. It's an entire ocean of practical problems.
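To give an idea of what such an estimate involves, here is a rough sketch of one of the basic methods, rescaled range (R/S) analysis. It is my own simplified version and only one of the numerous methods mentioned above: split the series of price changes into windows, compute the range of cumulative deviations divided by the standard deviation in each window, and take the slope of log(R/S) against log(window size) as the Hurst estimate. The window sizes and the simulated data (independent Gaussian increments, so the true Hurst is 0.5) are assumptions for illustration.

```python
import math
import random

def rescaled_range(chunk):
    """R/S statistic of one chunk: range of cumulative deviations over std dev."""
    n = len(chunk)
    mean = sum(chunk) / n
    cumulative, running = [], 0.0
    for value in chunk:
        running += value - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return spread / std

def hurst_rs(series, window_sizes):
    """Slope of log(mean R/S) against log(window size), by least squares."""
    xs, ys = [], []
    for size in window_sizes:
        # Non-overlapping windows of the given size.
        chunks = [series[i:i + size]
                  for i in range(0, len(series) - size + 1, size)]
        mean_rs = sum(rescaled_range(c) for c in chunks) / len(chunks)
        xs.append(math.log(size))
        ys.append(math.log(mean_rs))
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return slope / sum((x - x_bar) ** 2 for x in xs)

rng = random.Random(7)
increments = [rng.gauss(0.0, 1.0) for _ in range(4096)]  # independent "returns"
h = hurst_rs(increments, [16, 32, 64, 128, 256, 512])
print(f"estimated Hurst: {h:.2f}, implied Fractal Dimension: {2 - h:.2f}")
```

Even on data built to have Hurst = 0.5, the estimate will not come out at exactly 0.5, and it shifts with the choice of window sizes. That is a small taste of the practical problems.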

Another major problem is that there's very little literature on this topic. I haven't found a comprehensive book or website that goes into great detail. Instead, most of the knowledge on this topic is scattered in small pieces (mostly incomprehensible PhD papers...). Gluing them together sometimes isn't very easy. I guess I could suggest these two links as an introduction to a basic method of estimating the Hurst Exponent:

Researching Hurst Exponent and Fractal Dimension has led me to a number of related topics, which I'd like to discuss at some point (L-stable distributions, processes generating fat tails, etc.). I'd also like to be fully aware of their less obvious implications and hopefully one day I will. I can't say that I could trust mainstream statistics with my investment decisions. Understanding where mainstream statistics fail could save my career one day at the very least.