Calibrating the Ornstein-Uhlenbeck model

by M.A. (Thijs) van den Berg


We describe two methods for calibrating the model parameters of an Ornstein-Uhlenbeck process to a given dataset.

  • The least squares regression method
  • maximum likelihood method


Contents

[edit] Introduction

The stochastic differential equation (SDE) for the Ornstein-Uhlenbeck process is given by

dS_t=\lambda(\mu-S_t)dt+\sigma dW_t \,

with

\lambda \, Mean reversion rate
\mu \, The mean
\sigma \, Volatility

[edit] An example simulation

The table and figure below show a simulated scenario for the Ornstein-Uhlenbeck process with time step \delta=0.25\,, mean reversion rate \lambda=3.0\,, long term mean \mu=1.0\, and a noise term of \sigma=0.50\,. We will use this data to explain the model calibration steps.

A scenarios of a the Ornstein-Uhlenbeck process. The scenarios start at S(0)=3 and reverting to a long term mean of 1.
A scenarios of a the Ornstein-Uhlenbeck process. The scenarios start at S(0)=3 and reverting to a long term mean of 1.
Table 1
i t S_i\, N_{0,1}\,
0 0.00 3.0000
1 0.25 1.7600 -1.0268
2 0.50 1.2693 -0.4985
3 0.75 1.1960 0.3825
4 1.00 0.9468 -0.8102
5 1.25 0.9532 -0.1206
6 1.50 0.6252 -1.9604
7 1.75 0.8604 0.2079
8 2.00 1.0984 0.9134
9 2.25 1.4310 2.1375
10 2.50 1.3019 0.5461
11 2.75 1.4005 1.4335
12 3.00 1.2686 0.4414
13 3.25 0.7147 -2.2912
14 3.50 0.9237 0.3249
15 3.75 0.7297 -1.3019
16 4.00 0.7105 -0.8995
17 4.25 0.8683 0.0281
18 4.50 0.7406 -1.0959
19 4.75 0.7314 -0.8118
20 5.00 0.6232 -1.3890

We used the following simulation equation for generating paths that are sampled with fixed time steps of \delta=0.25\,. The equation is an exact solution of the SDE.

S_{i+1} = S_i  e^{-\lambda \delta}+\mu(1-e^{-\lambda \delta}) + \sigma\sqrt{\frac{1-e^{-2\lambda \delta}}{2\lambda}} N_{0,1} \,

The random numbers used in this example are shown in the last column of the table 1.




[edit] Calibration using least squares regression

The relationship between consecutive observations S_i,S_{i+1}\, is linear with a iid normal random term \epsilon\,

S_{i+1} = a S_i + b + \epsilon \,
Least square fitting of a line to the data.
Least square fitting of a line to the data.


The relationship between the linear fit and the model parameters is given by

\begin{align} a &= e^{-\lambda\delta}\\ b &= \mu\left(1-e^{-\lambda\delta}\right)\\ sd(\epsilon) &= \sigma \sqrt{\frac{1-e^{-2\lambda\delta}}{2\lambda}} \end{align}

rewriting these equations gives

\begin{align} \lambda &= -\frac{\ln a}{\delta} \\ \mu &= \frac{b}{1-a}\\ \sigma &= sd(\epsilon) \sqrt{\frac{-2\ln a}{\delta(1 - a^2)}} \end{align}

[edit] Calculating the least squares regression

Most software tools (Excel, Matlab, R, Octave, Maple, ...) have built in functionality for least square regression. If its not available, a least square regression can easily be done by calculating the the quantities below

\begin{align} S_x &= \sum_{i=1}^{n} S_{i-1}\;\;\;\;\;\;  S_y = \sum_{i=1}^{n} S_i \\ S_{xx} &= \sum_{i=1}^{n} S_{i-1}^2\;\;\;\;\;\; S_{xy} = \sum_{i=1}^{n} S_{i-1} S_i\;\;\;\;\;\; S_{yy} = \sum_{i=1}^{n} S_i^2   \end{align}

from which we get the following parameters of the least square fit

\begin{align} a &= \frac{nS_{xy}-S_xS_y}{nS_{xx}-S_x^2}\\ b &= \frac{S_y - a S_x}{n}\\ sd(\epsilon) &= \sqrt{\frac{n S_{yy}-S_y^2 - a\left(n S_{xy}-S_x S_y\right)}{n(n-2)}} \end{align}

[edit] Example

Applying the regression to the data in table 1 we get

Param Value
Sx 22.5301
Sy 20.1534
Sxx 30.8338
Sxy 25.1973
Syy 22.2222
a 0.4574
b 0.4924
sd(\epsilon)\, 0.2073

These results allow us to recover the model parameters:

Param Value
\mu\, 0.9075
\lambda\, 3.1288
\sigma\, 0.5831




[edit] Calibration using Maximum Likelihood estimates

[edit] Conditional probability density function

The conditional probability density function is easily derived by combining the simulation equation above with the probability density function of the normal distribution function:

P(N_{0,1}=x) = \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}x^2}
Conditional probability density function -red- of S at t=1.
Conditional probability density function -red- of S at t=1.

The equation of the conditional probability density of an observation S_{i+1}\, given a previous observation S_i\, (with a \delta\, time step between them) is given by

f(S_{i+1} | S_i;\mu,\lambda,\hat{\sigma}) = \frac{1}{\sqrt{2\pi \hat{\sigma}^2}} \exp \left[ -\frac{ \left(S_i - S_{i-1} e^{-\lambda \delta}-\mu(1-e^{-\lambda \delta}) \right)^2 }{2\hat{\sigma}^2} \right]

To simplify the notation of this equation we have introduced

\hat{\sigma}^2 = \sigma^2\frac{1-e^{-2\lambda\delta}}{2\lambda}

[edit] Log-likelihood function

The log-likelihood function of a set of observation (S_0, S_1, \ldots, S_n)\, can be derived from the conditional density function

\begin{align} \mathcal{L}(\mu,\lambda,\hat{\sigma}) &= \sum_{i=1}^n \ln f(S_i | S_{i-1};\mu,\lambda,\sigma)\\ &= -\frac{n}{2}  \ln(2\pi) -n\ln(\hat{\sigma}) -\frac{1}{2\hat{\sigma}^2} \sum_{i=1}^n \left[ S_i - S_{i-1} e^{-\lambda \delta}-\mu(1-e^{-\lambda \delta}) \right]^2 \end{align}

[edit] Maximum likelihood conditions

The maximum of this log-likelihood surface can be found at the location where all the partial derivatives are zero. This leads to the following set of constraints.

Log likelihood function as function of mu.
Log likelihood function as function of mu.
Log likelihood function as function of lambda.
Log likelihood function as function of lambda.
\begin{align} \frac{\partial \mathcal{L}(\mu,\lambda,\hat{\sigma})}{\partial\mu} &= 0 = \frac{1}{\hat{\sigma}^2} \sum_{i=1}^n \left[ S_i - S_{i-1} e^{-\lambda \delta}-\mu(1-e^{-\lambda \delta}) \right]\\ &\mu = \frac{\sum_{i=1}^{n}\left[S_i - S_{i-1} e^{-\lambda \delta}\right]}{n\left(1-e^{-\lambda \delta}\right)}  \end{align}

,

\begin{align} \frac{\partial \mathcal{L}(\mu,\lambda,\hat{\sigma})}{\partial \lambda} &= 0 = -\frac{\delta e^{-\lambda \delta}}{\hat{\sigma}^2} \sum_{i=1}^n \left[ (S_i - \mu)(S_{i-1} - \mu) - e^{-\lambda \delta}\left( S_{i-1} - \mu\right)^2 \right]\\ &\lambda = -\frac{1}{\delta} \ln \frac{ \sum_{i=1}^n (S_i-\mu) (S_{i-1}-\mu) }{ \sum_{i=1}^n (S_{i-1}-\mu)^2 } \end{align}

,

\begin{align} \frac{\partial \mathcal{L}(\mu,\lambda,\hat{\sigma})}{\partial \hat{\sigma}} &= 0 =\frac{n}{\hat{\sigma}} - \frac{1}{\hat{\sigma}^3}  \sum_{i=1}^n \left[  (S_i - \mu - e^{-\lambda \delta}\left( S_{i-1} - \mu\right)  \right]^2\\ &\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n \left[  (S_i - \mu - e^{-\lambda \delta}\left( S_{i-1} - \mu\right)  \right]^2 \end{align}

[edit] Solution of the conditions

The problem with these conditions is that they solutions depend on each other. However, both \lambda\, and \mu\, are independent of \sigma\,, and knowing either \lambda\, or \mu\, will directly give the value the other. The solution of \sigma\, can be found once both \lambda\, and \mu\, are determined. To solve these equations it is thus sufficient to find either \lambda\, or \mu\,.


Finding \mu\, can be done by substituting the \lambda\, condition into the \mu\,.

First we change notation of the \mu\, and \lambda\, condition using the same notation as before, i.e

\begin{align} S_x &= \sum_{i=1}^{n} S_{i-1}\;\;\;\;\;\;  S_y = \sum_{i=1}^{n} S_i \\ S_{xx} &= \sum_{i=1}^{n} S_{i-1}^2\;\;\;\;\;\; S_{xy} = \sum_{i=1}^{n} S_{i-1} S_i\;\;\;\;\;\; S_{yy} = \sum_{i=1}^{n} S_i^2   \end{align}

which gives us:

\begin{align} \mu &= \frac{S_y-e^{-\lambda\delta}S_x}{n\left(1-e^{-\lambda\delta}\right)}\\ \lambda &= -\frac{1}{\delta}\ln \frac{S_{xy}-\mu S_x-\mu S_y + n\mu^2}{S_{xx} -2\mu S_x + n\mu^2} \end{align}


substituting \lambda\, into \mu\, gives

n\mu = \frac{S_y-\left(\frac{S_{xy}-\mu S_x-\mu S_y + n\mu^2}{S_{xx} -2\mu S_x + n\mu^2}\right)S_x}{1-\left(\frac{S_{xy}-\mu S_x-\mu S_y + n\mu^2}{S_{xx} -2\mu S_x + n\mu^2}\right)}

removing denominators

n\mu = \frac{S_y(S_{xx} -2\mu S_x + n\mu^2)-(S_{xy}-\mu S_x-\mu S_y + n\mu^2)S_x}{(S_{xx} -2\mu S_x + n\mu^2)-(S_{xy}-\mu S_x-\mu S_y + n\mu^2)}

collecting terms

n\mu = \frac{(S_y S_{xx} - S_x S_{xy}) + \mu(S_x^2 - S_x S_y) + \mu^2 n(S_y-S_x)}{(S_{xx} - S_{xy}) + \mu(S_y-S_x)}

moving all \mu\, to the left

n\mu(S_{xx}-S_{xy}) - \mu(S_x^2 - S_x S_y) = S_y S_{xx} - S_x S_{xy}

[edit] Final results: The maximum likelihood equations

mean

\begin{align} \mu =& \frac{S_y S_{xx} - S_x S_{xy}}{n(S_{xx}-S_{xy}) - (S_x^2 - S_x S_y)}\\ \end{align}

mean reversion rate

\begin{align} \lambda =& -\frac{1}{\delta}\ln \frac{S_{xy}-\mu S_x-\mu S_y + n\mu^2}{S_{xx} -2\mu S_x + n\mu^2}\\ \end{align}

variance

\begin{align} \hat{\sigma}^2 =& \frac{1}{n} \left[ S_{yy} - 2\alpha S_{xy} + \alpha^2 S_{xx} -2 \mu (1-\alpha)  (S_y - \alpha S_x) +n \mu^2 (1-\alpha)^2 \right]\\ \sigma^2 =& \hat{\sigma}^2 \frac{2\lambda}{1-\alpha^2}\\ \end{align}

with \alpha = e^{-\lambda\delta}\,

[edit] Example

Calculating the sums based on table 1 we get

Param Value
Sx 22.5301
Sy 20.1534
Sxx 30.8338
Sxy 25.1973
Syy 22.2222

These results allow us to recover the model parameters:

Param Value
\mu\, 0.9075
\lambda\, 3.1288
\sigma\, 0.5532

[edit] Matlab Code

[edit] Least Squares Calibration

function [mu,sigma,lambda] = OU_Calibrate_LS(S,delta)
  n = length(S)-1;

  Sx  = sum( S(1:end-1) );
  Sy  = sum( S(2:end) );
  Sxx = sum( S(1:end-1).^2 );
  Sxy = sum( S(1:end-1).*S(2:end) );
  Syy = sum( S(2:end).^2 );

  a  = ( n*Sxy - Sx*Sy ) / ( n*Sxx -Sx^2 );
  b  = ( Sy - a*Sx ) / n;
  sd = sqrt( (n*Syy - Sy^2 - a*(n*Sxy - Sx*Sy) )/n/(n-2) );

  lambda = -log(a)/delta;
  mu     = b/(1-a);
  sigma  =  sd * sqrt( -2*log(a)/delta/(1-a^2) );
end

[edit] Maximum Likelyhood Calibration

function [mu,sigma,lambda] = OU_Calibrate_ML(S,delta)
  n = length(S)-1;

  Sx  = sum( S(1:end-1) );
  Sy  = sum( S(2:end) );
  Sxx = sum( S(1:end-1).^2 );
  Sxy = sum( S(1:end-1).*S(2:end) );
  Syy = sum( S(2:end).^2 );

  mu  = (Sy*Sxx - Sx*Sxy) / ( n*(Sxx - Sxy) - (Sx^2 - Sx*Sy) );
  lambda = -log( (Sxy - mu*Sx - mu*Sy + n*mu^2) / (Sxx -2*mu*Sx + n*mu^2) ) / delta;
  a = exp(-lambda*delta);
  sigmah2 = (Syy - 2*a*Sxy + a^2*Sxx - 2*mu*(1-a)*(Sy - a*Sx) + n*mu^2*(1-a)^2)/n;
  sigma = sqrt(sigmah2*2*lambda/(1-a^2));
end

[edit] Example Usage

S = [ 3.0000 1.7600 1.2693 1.1960 0.9468 0.9532 0.6252 ...
      0.8604 1.0984 1.4310 1.3019 1.4005 1.2686 0.7147 ...
      0.9237 0.7297 0.7105 0.8683 0.7406 0.7314 0.6232 ];

delta = 0.25;

[mu1, sigma1, lambda1] = OU_Calibrate_LS(S,delta)
[mu2, sigma2, lambda2] = OU_Calibrate_ML(S,delta)

gives

mu1     = 0.90748788828331
sigma1  = 0.58307607458526
lambda1 = 3.12873217812387

mu2     = 0.90748788828331
sigma2  = 0.55315453345189
lambda2 = 3.12873217812386

[edit] Add a comment

Name (required):

Website:

Comment:


[edit] sitmo said ...

The two methods give the exact same result s except for the variance.

..so which ones do you prefer & why?

--Admin 15:10, 10 July 2007 (CEST)

[edit] Anthony said ...

Am I missing something here. When I put the values into the equation for the sd(e) in the first part I get a value of ~ 1.44. Now forgive me because I may well be doing something dumb but I'd just like to confirm that :)

--Anthony 15:14, 24 July 2007 (CEST)

[edit] sitmo said ...

I get this

parameters: S0=3, mu=1, lambda=3, dt=0.25, sigma=0.5, W=-1.0267

Excel formula: =3*EXP(-3*0.25) + 1*(1-EXP(-3*0.25)) + 0.5*SQRT((1-EXP(-2*3*0.25))/(2*3))*-1.0267

= 1.760014

--sitmo 13:19, 26 July 2007 (CEST)

[edit] Anthony said ...

Sorry, my mistake I wasn't making myself clear. I was talking about the calculation of the standard deviation of epsilon {sd(e)}, not of the discretisation of the sde.

The formula being sqrt((n*Syy - Sy*Sy - a)/n-2)

Which with values of n = 20, Syy = 22.222, Sy = 20.1534, a = 0.4574 gives sqrt(2.101504) = ~1.44

--Anthony 14:51, 2 August 2007 (CEST)

[edit] sitmo said ...

Hi Anthony, Ah! That's indeed a bug. I also got the wrong value 1.44. The page has been updated now. The problem was the sde(e) equation. The new outcome is 0.2073, and that will lead to a sigma of 0.5831.

Some more info about the updated sde(e) equation:

sde(e)=sqrt(the sum square residuals/(n-2))

Thanks for the feedback!

--Admin 13:17, 3 August 2007 (CEST)

[edit] Anthony said ...

Hi,

One other observation, the initial value for this process is very high. The long term standard deviation of the process is ~0.2 (sigma/sqrt(2*lambda)). However the initial value is 3, i.e. 2 about the mean reversion level. This is the equivalent of about 10 standard deviations away from the mean reversion level which is quite an unlikely value for an ornstein- uhlenbeck process.

The overall result of this is that the estimation method above gets a much better estimate of the lambda than it would normally, as can be seen in the regression graph. It may be more realistic to start the process closer to the long term mean. However as I'm coming to realise this gives a much weaker estimate of the mean reversion level!

I'd appreciate you thoughts on how robust you think this estimation procedure is to varying parameter values.

--Anthony 10:25, 6 August 2007 (CEST)

[edit] sitmo said ...

Hi Anthony, That a nice observation... I did the high initial value to illustrate the mean reversion effect. I'll have to give your questions some more thoughts.

My initial idea is that it should be quite easy to get confidence interval equations for the methods. That would be a nice thing to add to this article! For the least square fit & can image that those estimators are well known. I'm quite busy at the moment, but it might be very easy & quickly doable to sort this out...

I'll get back to you on this. Thanks for the idea, it's a good & valuable thing to add.

--Admin 22:46, 7 August 2007 (CEST)

[edit] Mike said ...

Hi Thijs,

I am interested in deriving the solution, at the bottom of page 2 of the document titled - "Calibrating the Ornstein-Uhlenbeck Model"(pdf document), from first principles.

Do you know of any literature on this.

Mike


--Mike 21:04, 21 August 2007 (CEST)

[edit] sitmo said ...

Hi Mike, You need a lot a background for that. The field is called stochastic calculus, and you can look for keywords: stochastic integration, ito formula.

A simple approach with a Riemann sum might be doable. You replace dt with a small timestep e.g. t/n and write out the n-step sum. Next you let n go to infinity. You replace sigma*Wdt with normal distributed numbers with mean zero and variance t/n (or standard deviation the square root of the t/n). When you write out the sum, you can use the fact that the sum of (weighted, uncorrelated) normal distributed variables is als normal distributed (with a variance the sum of the variances). Finding the solution of the sum for n->infinity should give your result.

--Admin 22:51, 21 August 2007 (CEST)

[edit] Paul said ...

If one wanted to simulate a risk neutral path of a Ornstein-Uhlenbeck model - how would the solution to the SDE be defined.

--Paul 16:08, 24 August 2007 (CEST)

[edit] sitmo said ...

Paul, You want to add a premium for risk, right?

A good choice for a risk premium model would be to make is dependent on the variance at future times (the sigma*squareroot(..) term before N(0,1) in eq 2). Because of the mean reversion, the variance will converge to upper bound. This implies that the risk will stop increasing as a function of maturity. At some point the amount of risk in a 10 year or 20 year price is the same.

If you approach it this way, you just add a term of -10% of the variance (your risk premium)

--sitmo 12:53, 25 August 2007 (CEST)

[edit] Nishil said ...

for sigma square hat MLE expression for the mean reverting process, I think it should be divided by 'n'.

--Nishil 18:14, 7 September 2007 (CEST)

[edit] Nishil said ...

for sigma square hat MLE expression for the mean reverting process, I think it should be divided by 'n'.

--Nishil 18:14, 7 September 2007 (CEST)

[edit] sitmo said ...

Nishil, There is a 1/n in front of the sum. As far as I can see, the equation is right. Can you explain a bit more?

--sitmo 10:05, 12 September 2007 (CEST)

[edit] wolf said ...

I agree with nishil...in the section "final results..." the formula for sigma hat squared should be divided by n. compare this to its corresponding formula in the section "max likelihood conditions"

--wolf 13:28, 1 October 2007 (CEST)

[edit] Sitmo said ...

Thanks, both!

--Admin 22:33, 2 November 2007 (CET)

[edit] lorenzo said ...

hye. i have a doubt. what happens if in the regression (first method of calibration) a<0? this is totally posiible and in that case you can't obtain "a" because the exponential is never<0.

--lorenzo 22:58, 30 January 2008 (CET)

[edit] walter said ...

What happens to the model when trying to calibrate it to real world data e.g. Commodity prices? I tried it and the calibrated expected value (mu) is way off the long term average for the series. How come? And do you later use the calbrated absolut values for vola in the smulation?

--walter 21:41, 6 February 2008 (CET)

[edit] walter said ...

For the later simulation it would make sense if you have used Logarithms of real absolute values. But it is not mention in the paper I think.

--walter 21:53, 6 February 2008 (CET)

[edit] walter said ...

Is the sigma derived from the calibration an annualized factor or ist it based on delta (here being 0.25) so having to multiply it by Sqrt(4) to get an annualized factor?

--walter 22:56, 6 February 2008 (CET)

[edit] peter said ...

Hi Walter. Did you come to any conclusion about the calibrated expected value (mu) been way off the long term average for the series when looking at commodity prices?

--peter 11:02, 12 February 2008 (CET)

[edit] walter said ...

hi peter I "solved" the problem by taking mu as given and deducing the other two parameters of the equation PS: I used Logarithms of values in the time series and not the values themselves

--walter 18:29, 12 February 2008 (CET)

[edit] Momo said ...

Hi Walter hi Admin

I've also tried to use this calibration for commodity data and got some counterintuitive results. However, I agree that for commodities one should use log-prices instead of the prices itself.

My observation was the following: If the estimated parameter "a" is greater than 1 --> lambda is negative and if the starting point S(0) is above the mean reverting level "mu", then the mean of S(t) (or log of it) tends to increase (!) over time. This means that in the long run I expect to get simulated values which diverge even more (--> distance increases) from the long term average "mu". That's not what one would expect, right? Any comments on that? Or maybe I'm missing some point....


--Momo 10:10, 21 February 2008 (CET)

[edit] David Brown said ...

Honestly, I'd suggest being really careful calibrating this to something that is potentially non-stationary. The majority of commodity prices and log commodity prices are non stationary, having an explosive unit root at least.

--David Brown 05:08, 9 March 2008 (CET)

[edit] Admin said ...

Momo,

If you simulate OU paths, then the probability distribution at some fixed future point in time will be normal distributed with a known mean and variance. You can find those equation in the other OU article (in the section "Simulating the Ornstein Uhlenbeck Proces" ... ...Alternatively, is value at S(t) is normal distributed with:)

Now when you you have a normal distributed variable x=N(mean,var) then exp(x) will have a mean of exp(mean+0.5*var). Both the mean and variance change in time. The mean of your scenarios start at your initial value and move towards mu. The variance starts at 0 and grows to some long term fixed value.

So what you can do is solve the folowing relation by finding a value OU_mean (given your target mean)

targetmean = exp(OU_mean + 0.5*OU_var)


--Admin 15:43, 12 March 2008 (CET)

[edit] Phileas said ...

I am totally new in the field. Just 2 few questions

Some parctitioners use the differential equation non the solution to simulate the path. Is that normal?

Can follow if S used to estimate the parmeters are real observations or the outcome of the simulations: Calibration should mean choosing a path that fits real data. Am I wrong?


--Phileas 09:54, 25 April 2008 (CEST)

[edit] David said ...

Can anybody suggest something else if the estimate of the slope is negative? any alternate approach to mean reversion process?

--David 17:07, 7 May 2008 (CEST)

[edit] Giggs said ...

Hi all, if I want to calibrate this model to returns and not to prices, it is very likely that one uses negative values. In my case, I even get a negative value in the LN function for lambda (FOC). Since LN is only defined for positive values, I cannot calculate lambda. Does anyone have an idea how to solve this problem?

--Giggs 14:50, 14 May 2008 (CEST)