
Generalized Central Limit Theorem

The generalized central limit theorem states that if a sum of independent, identically distributed random variables, properly centered and scaled, converges in distribution, then the limit is a stable distribution; the underlying distribution is then said to belong to the domain of attraction of that stable distribution.  Stable distributions are therefore the only distributions that arise as limits of suitably scaled and centered sums of such random variables.  The goal of this section is not to give a proof but to demonstrate the theorem by simulation, in preparation for the next section, where we will propose that over short intervals of time financial market returns are sufficiently independent and identically distributed that their sums should also converge to a stable distribution.

To start with the simplest case, we take sums of random variables from a Pareto distribution with a minimum value of 1.  We start with a one-tailed distribution and then progress to a two-tailed example.  The formulae for the Pareto density and distribution functions are below.

GCLT_1.gif

GCLT_2.gif
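For reference, and assuming the two formula images follow Mathematica's ParetoDistribution[k, α] parameterization (minimum value k, shape α), the density and distribution function can be written as below.  This is a transcription under that assumption, not a copy of the images.

```latex
f(x) = \frac{\alpha\, k^{\alpha}}{x^{\alpha + 1}}, \qquad
F(x) = 1 - \left(\frac{k}{x}\right)^{\alpha}, \qquad x \ge k
```

With k = 1 the upper tail is P(X > x) = x^(-α), and for α > 1 the mean is α/(α - 1), which equals 3 when α = 1.5.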

Here we create a sample of random variables, each the sum of 1000 ParetoDistribution[1, 1.5] random variables.  The scaling factor was selected with some knowledge of the mathematics of stable distributions, and the shift is simply the mean of the Pareto distribution multiplied by n.  In the notebook, the cell below can be run with different values of n to see how the convergence to a stable distribution progresses.  With these adjustments, as n grows larger the sample should approach a stable distribution with parameters {α, β, γ, δ} = {1.5, 1., 1, 0.}.

Shift by 3000.

Then scale by 0.00541926

Stable fit of shifted and scaled sample
{1.48616, 1., 0.895938, -0.220168}
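The shift and scale quoted above can be reproduced with a short simulation.  The following is a minimal sketch in Python rather than the notebook's Mathematica, and the function names are hypothetical.  The scale formula is an assumption: it matches the upper tail of the sum, roughly n x^(-α), to the upper tail of a stable law with β = 1 and γ = 1, and it happens to reproduce the value 0.00541926 for α = 1.5 and n = 1000.  The notebook may have derived its factor differently.

```python
import math
import random


def pareto_mean(alpha, k=1.0):
    """Mean of a Pareto distribution with minimum k and shape alpha > 1."""
    if alpha <= 1.0:
        raise ValueError("the Pareto mean is infinite for alpha <= 1")
    return alpha * k / (alpha - 1.0)


def stable_scale(alpha, n):
    """Scale that maps a sum of n Pareto(1, alpha) draws toward gamma = 1.

    Assumption (not taken from the notebook): the upper tail of the sum,
    roughly n * x**(-alpha), is matched to the upper tail of a stable law
    with beta = 1 and gamma = 1, which behaves like
    (2 * Gamma(alpha) * sin(pi * alpha / 2) / pi) * x**(-alpha).
    """
    tail_constant = (
        2.0 * math.gamma(alpha) * math.sin(math.pi * alpha / 2.0) / math.pi
    )
    return (tail_constant / n) ** (1.0 / alpha)


def normalized_sums(alpha, n, size, seed=0):
    """Return `size` centered and scaled sums of n Pareto(1, alpha) draws."""
    rng = random.Random(seed)
    shift = n * pareto_mean(alpha)   # 3000 for alpha = 1.5, n = 1000
    scale = stable_scale(alpha, n)   # about 0.00541926 for the same inputs
    # random.paretovariate(alpha) draws from a Pareto law with minimum 1.
    sums = (
        sum(rng.paretovariate(alpha) for _ in range(n)) for _ in range(size)
    )
    return [scale * (s - shift) for s in sums]
```

Calling `normalized_sums(1.5, 1000, 2000)` yields a sample comparable to the one fitted above.  Estimating the stable parameters from it requires a separate fitting routine, which this sketch does not include.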

GCLT_3.gif

A two-tailed model is set up as the difference of two ParetoDistribution[1, α] random variables.  The density histogram fit is shown, as well as the log-log distribution fit.  As the log-log plot shows, the characteristic stable tail behavior is clearly present in sums of the differences of 1000 pairs of Pareto random variables.  We use the Mathematica formula below to generate n differences.

GCLT_4.gif

This setup is for generating stable RVs in the range 1 < α < 2.  Sums of two-tailed Pareto RVs converge very slowly to the input stable parameters as α → 2.  For α < 1 the Pareto mean is infinite, so a different centering strategy would likely be better; the median is one candidate to try.

Shift by -1500.

Then scale by 0.00541926

Stable fit of shifted and scaled sample
{1.39597, -0.55467, 0.751621, 0.0282783}

GCLT_5.gif

Log-log plot: left tail in blue, right tail in red, normal distribution in green.
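One rough way to put a number on what the log-log plot shows is to fit a straight line to the empirical survival function on log-log axes; for a power-law tail the slope approximates -α.  The sketch below is an illustration in Python, not the notebook's plotting code, and `loglog_tail_slope` and `tail_fraction` are hypothetical names.  The choice of tail fraction is arbitrary, and such slope estimates can be unreliable for stable samples with α close to 2.

```python
import math
import random


def loglog_tail_slope(sample, tail_fraction=0.1):
    """Least-squares slope of log P(X > x) versus log x over the upper tail.

    Pass the negated sample to examine the left tail instead.
    """
    xs = sorted(x for x in sample if x > 0.0)
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three positive observations")
    k = max(3, int(n * tail_fraction))   # number of upper order statistics
    points = []
    for i in range(n - k, n - 1):        # skip the maximum: its survival is 0
        survival = (n - 1 - i) / n       # share of points strictly above xs[i]
        points.append((math.log(xs[i]), math.log(survival)))
    mean_x = sum(px for px, _ in points) / len(points)
    mean_y = sum(py for _, py in points) / len(points)
    sxx = sum((px - mean_x) ** 2 for px, _ in points)
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in points)
    return sxy / sxx


# Sanity check on raw Pareto draws, whose tail index is known to be 1.5.
rng = random.Random(1)
pareto_sample = [rng.paretovariate(1.5) for _ in range(20000)]
slope = loglog_tail_slope(pareto_sample)  # expected to be near -1.5
```

For a two-tailed sample, applying the function to the sample and to its negation gives separate slopes for the right and left tails.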

The above shows that it is possible to generate random variables that demonstrate stable behavior with sums of as few as a thousand random variables from a heavy-tailed distribution.  It is not hard now to imagine a market system where limit order book log price differences might have heavy-tailed distributions.  As sums of price differences accumulate over a short period of time, they should converge to a stable distribution.  Over a very short time frame the statistical description of the limit order books is not likely to change much.  But a look at real financial data suggests that the statistical parameters of the limit order books may change over several minutes.  To study this in more detail we develop a simple algorithmic continuous double auction market in the next section.

GCLT_7.gif



© Copyright 2008 mathestate    Fri 19 Dec 2008