Estimation of stochastic differential equation

A.C. Guidoum and K. Boukhetala

2016-10-08

The fitsde() function

The Sim.DiffProc package implements pseudo-maximum likelihood via the fitsde() function. The interface and the output of the fitsde function are made as similar as possible to those of the standard mle function in the stats4 package of the basic R system. The main arguments to fitsde consist:

The functions of type S3 method (as similar of the standard mle function in the stats4 package of the basic R system for the class fitsde are the following:

The following we explain how to use this function to estimate a SDE with different approximation methods.

Euler method

Consider a process solution of the general stochastic differential equation: \begin{equation}\label{eq02} dX_{t}= f(X_{t},\underline{\theta}) dt + g(X_{t},\underline{\theta}) dW_{t}, \quad \quad t \geq 0 \, , X_{0} = x_{0}, \end{equation} The Euler scheme produces the discretization (\(\Delta t \rightarrow 0\)): \begin{equation*} X_{t+\Delta t} - X_{t} = f(X_{t},\theta) \Delta t+ g(X_{t},\theta) (W_{t+\Delta t} -W_{t}), \end{equation*} The increments \(X_{t+\Delta t} - X_{t}\) are then independent Gaussian random variables with mean: \(\text{E}_{x} = f(X_{t},\theta)\Delta t\), and variance: \(\text{V}_{x} = g^{2}(X_{t},\theta) \Delta t\). Therefore the transition density of the process can be written as: \begin{equation*} p_{\theta}(t,y|x)=\frac{1}{\sqrt{2\pi t g^{2}(x,\theta)}} \exp\left(-\frac{\left(y-x-f(x,\theta)t\right)^2}{2tg^{2}(x,\theta)}\right) \end{equation*} Then the log-likelihood is: \begin{equation}\label{eq08} h_{n}(\theta|X_{1},X_{2},\dots,X_{n})=-\frac{1}{2}\left(\sum_{i=1}^{n} \frac{(X_{i}-X_{i-1}-f(X_{i-1},\theta)\Delta)^2}{\sigma^2 \Delta t} + n \log(2\pi \sigma^2 \Delta t)\right) \end{equation} As an example, we consider the Chan-Karolyi-Longstaff-Sanders (CKLS) model: \begin{equation}\label{eq09} dX_{t} = (\theta_{1}+\theta_{2} X_{t}) dt + \theta_{3} X_{t}^{\theta_{4}} dW_{t},\qquad X_{0}=2 \end{equation}

with \(\theta_{1}=1\), \(\theta_{2}=2\), \(\theta_{3}=0.5\) and \(\theta_{4}=0.3\). Before calling fitsde, we generate sampled data \(X_{t_{i}}\), with \(\Delta t =10^{-4}\), as following:

f <- expression( (1+2*x) ) ; g <- expression( 0.5*x^0.3 )
sim    <- snssde1d(drift=f,diffusion=g,x0=2,N=10^4,Dt=10^-4)
mydata <- sim$X

we set the initial values for the optimizer as \(\theta_{1}=\theta_{2}=\theta_{3}=\theta_{4}=1\), and we specify the coefficients drift and diffusion as expressions. you can use the upper and lower limits of the search region used by the optimizer (here using the default method "L-BFGS-B"), and specifying the method to use with pmle="euler".

fx <- expression( theta[1]+theta[2]*x ) ## drift coefficient of model
gx <- expression( theta[3]*x^theta[4] ) ## diffusion coefficient of model 
fitmod <- fitsde(data = mydata, drift = fx, diffusion = gx, start = list(theta1=1, theta2=1,
                theta3=1,theta4=1),pmle="euler")
fitmod
## 
## Call:
## fitsde(data = mydata, drift = fx, diffusion = gx, start = list(theta1 = 1, 
##     theta2 = 1, theta3 = 1, theta4 = 1), pmle = "euler")
## 
## Coefficients:
##    theta1    theta2    theta3    theta4 
## 1.1287275 2.1913699 0.4899050 0.3095782

The estimated coefficients are extracted from the output object fitmod as follows:

coef(fitmod)
##    theta1    theta2    theta3    theta4 
## 1.1287275 2.1913699 0.4899050 0.3095782

We can use the summary function to produce result summaries of output object:

summary(fitmod)
## Pseudo maximum likelihood estimation
## 
## Method:  Euler
## Call:
## fitsde(data = mydata, drift = fx, diffusion = gx, start = list(theta1 = 1, 
##     theta2 = 1, theta3 = 1, theta4 = 1), pmle = "euler")
## 
## Coefficients:
##         Estimate Std. Error
## theta1 1.1287275 1.44133407
## theta2 2.1913699 0.20984636
## theta3 0.4899050 0.01019174
## theta4 0.3095782 0.01098450
## 
## -2 log L: -66967.97

vcov for variance-covariance matrice, and extract log-likelihood by logLik:

vcov(fitmod)
##               theta1        theta2        theta3        theta4
## theta1  2.077444e+00 -2.490563e-01  1.394691e-04 -3.899667e-05
## theta2 -2.490563e-01  4.403550e-02  5.538228e-06 -2.093221e-05
## theta3  1.394691e-04  5.538228e-06  1.038715e-04 -1.052829e-04
## theta4 -3.899667e-05 -2.093221e-05 -1.052829e-04  1.206593e-04
logLik(fitmod)
## [1] 33483.99
AIC(fitmod)
## [1] -66959.97
BIC(fitmod)
## [1] -66949.55

Computes confidence intervals for one or more parameters in a fitted SDE:

confint(fitmod, level=0.95)
##             2.5 %    97.5 %
## theta1 -1.6962354 3.9536904
## theta2  1.7800786 2.6026612
## theta3  0.4699295 0.5098804
## theta4  0.2880490 0.3311074

Ozaki method

The second approach we present is the Ozaki method, and it works for homogeneous stochastic differential equations. Consider the stochastic differential equation: \begin{equation}\label{eq10} dX_{t}= f(X_{t},\underline{\theta}) dt + \sigma dW_{t}, \quad \quad t \geq 0 \, , X_{0} = x_{0}, \end{equation} where \(\sigma\) is supposed to be constant. We just recall that the transition density for the Ozaki method is Gaussian, we have that: \(X_{t+\Delta t}|X_{t} = x \sim \mathcal{N}(\text{E}_{x},\text{V}_{x})\), where: \begin{align}\label{eq11} \text{E}_{x} &= x + \frac{f(x)}{\partial_{x} f(x)} \left( e^{\partial_{x} f(x) \Delta t} - 1 \right), \quad\text{and}\quad \text{V}_{x} &= \sigma^{2} \frac{e^{2K_{x} \Delta t} -1}{2K_{x}}, \end{align} with: \begin{equation*} K_{x} = \frac{1}{\Delta t} \log \left(1+\frac{f(x)}{x\partial_{x}f(x)}\left(e^{\partial_{x}f(x) \Delta t}-1\right) \right) \end{equation*}

It is always possible to transform process \(X_t\) with a constant diffusion coefficient using the Lamperti transform.

Now we consider the Vasicek model, with \(f(x,\theta) = \theta_{1} (\theta_{2}- x)\) and constant volatility \(g(x,\theta) = \theta_{3}\), \begin{equation}\label{eq12} dX_{t} = \theta_{1} (\theta_{2}- X_{t}) dt + \theta_{3} dW_{t},\qquad X_{0}=5 \end{equation}

with \(\theta_{1}=3\), \(\theta_{2}=2\) and \(\theta_{3}=0.5\), we generate sampled data \(X_{t_{i}}\), with \(\Delta t =10^{-2}\), as following:

f <- expression( 3*(2-x) ) ; g <- expression( 0.5 )
sim <- snssde1d(drift=f,diffusion=g,x0=5,Dt=0.01)
HWV <- sim$X

we set the initial values for the optimizer as \(\theta_{1}=\theta_{2}=\theta_{3}=1\), and we specify the coefficients drift and diffusion as expressions. Specifying the method to use with pmle="ozaki", which can easily be implemented in R as follows:

fx <- expression( theta[1]*(theta[2]- x) ) ## drift coefficient of model 
gx <- expression( theta[3] )           ## diffusion coefficient of model 
fitmod <- fitsde(data=HWV,drift=fx,diffusion=gx,start = list(theta1=1,theta2=1,
                  theta3=1),pmle="ozaki")
summary(fitmod)
## Pseudo maximum likelihood estimation
## 
## Method:  Ozaki
## Call:
## fitsde(data = HWV, drift = fx, diffusion = gx, start = list(theta1 = 1, 
##     theta2 = 1, theta3 = 1), pmle = "ozaki")
## 
## Coefficients:
##         Estimate Std. Error
## theta1 3.6410502 0.42461009
## theta2 2.0412504 0.04643124
## theta3 0.5127161 0.01146631
## 
## -2 log L: -3104.02

If you want to have confidence intervals \(\theta_{1}\) and \(\theta_{2}\) parameters only, using the command confint as following:

confint(fitmod,parm=c("theta1","theta2"),level=0.95)
##           2.5 %   97.5 %
## theta1 2.808830 4.473271
## theta2 1.950247 2.132254

Shoji-Ozaki method

An extension of the method to Ozaki the more general case where the drift is allowed to depend on the time variable \(t\), and also the diffusion coefficient can be varied is the method Shoji and Ozaki. Consider the stochastic differential equation: \begin{equation}\label{eq13} dX_{t}= f(t,X_{t},\underline{\theta}) dt + g(X_{t},\underline{\theta}) dW_{t}, \quad \quad t \geq 0 \, , X_{0} = x_{0}, \end{equation} the transition density for the Shoji-Ozaki method is Gaussian, we have that: \(X_{t+\Delta t}|X_{t} = x \sim \mathcal{N}\left(\mathrm{A}_{(t,x)}x,\mathrm{B}^{2}_{(t,x)}\right)\), where: \begin{align}\label{eq14} \mathrm{A}_{(t,x)} &= 1+ \frac{f(t,x)}{x\mathrm{L}_{t}} \left(e^{\mathrm{L}_{t}\Delta t }-1\right)+\frac{\mathrm{M}_{t}}{x\mathrm{L}^{2}_{t}} \left(e^{\mathrm{L}_{t} \Delta t}-1-\mathrm{L}_{t}\Delta t\right), \\ \mathrm{B}_{(t,x)} &= g(x) \sqrt{\frac{e^{2\mathrm{L}_{t} \Delta t}-1}{2\mathrm{L}_{t}}}, \end{align} with: \begin{equation*} \mathrm{L}_{t} = \partial_{x} f(t,x) \quad\text{and}\quad \mathrm{M}_{t} = \frac{g^{2}(x)}{2} \partial_{xx} f(t,x)+ \partial_{t} f(t,x). \end{equation*} As an example, we consider the following model: \begin{equation}\label{eq15} dX_{t} = a(t)X_{t} dt + \theta_{2}X_{t} dW_{t},\qquad X_{0}=10 \end{equation}

with: \(a(t) = \theta_{1}t\), and we generate sampled data \(X_{t_{i}}\), with \(\theta_{1}=-2\), \(\theta_{2}=0.2\) and time step \(\Delta t =10^{-3}\), as following:

f <- expression(-2*x*t) ; g <- expression(0.2*x)
sim <- snssde1d(drift=f,diffusion=g,N=1000,Dt=0.001,x0=10)
mydata <- sim$X

we set the initial values for the optimizer as \(\theta_{1}=\theta_{2}=1\), and we specify the method to use with pmle="shoji":

fx <- expression( theta[1]*x*t ) ## drift coefficient of model 
gx <- expression( theta[2]*x )   ## diffusion coefficient of model 
fitmod <- fitsde(data=mydata,drift=fx,diffusion=gx,start = list(theta1=1,
                 theta2=1),pmle="shoji",lower=c(-3,0),upper=c(-1,1))
summary(fitmod)
## Pseudo maximum likelihood estimation
## 
## Method:  Shoji
## Call:
## fitsde(data = mydata, drift = fx, diffusion = gx, start = list(theta1 = 1, 
##     theta2 = 1), pmle = "shoji", lower = c(-3, 0), upper = c(-1, 
##     1))
## 
## Coefficients:
##          Estimate  Std. Error
## theta1 -1.4085224 0.343630005
## theta2  0.1982903 0.004435787
## 
## -2 log L: -3417.592

Kessler method

Kessler (1997) proposed to use a higher-order Ito-Taylor expansion to approximate the mean and variance in a conditional Gaussian density. Consider the stochastic differential equation \(dX_{t}= f(X_{t},\underline{\theta}) dt + g(X_{t},\underline{\theta}) dW_{t}\) , the transition density by Kessler method is: \(X_{t+\Delta t}|X_{t} = x \sim \mathcal{N}\left(\text{E}_{x},\text{V}_{x}\right)\), where: \begin{align}\label{eq16} \text{E}_{x} &= x + f(t,x) \Delta t+\left(f(t,x)\partial_{x}f(t,x) + \frac{1}{2} g^{2}(t,x) \partial_{xx}g(t,x)\right)\frac{(\Delta t)^2}{2}, \\ \text{V}_{x} &= x^2 +(2f(t,x)x+g^{2}(t,x)) \Delta t +\bigg(2f(t,x)\left(\partial_{x}f(t,x)x+f(t,x)+g(t,x)\partial_{x}g(t,x)\right) \nonumber\\ &\quad+g^{2}(t,x)\left(\partial_{xx}f(t,x)x+2\partial_{x}f(t,x)+\partial_{x}g^{2}(t,x)+g(t,x)\partial_{xx}g(t,x)\right)\bigg)\frac{(\Delta t)^2}{2}-\text{E}^{2}_{x}. \end{align} We consider the following Hull-White (extended Vasicek) model: \begin{equation}\label{eq17} dX_{t} = a(t)(b(t)-X_{t}) dt + \sigma(t) dW_{t},\qquad X_{0}=2 \end{equation}

with: \(a(t) = \theta_{1}t\) and \(b(t)=\theta_{2}\sqrt{t}\), the volatility depends on time: \(\sigma(t)=\theta_{3}t\). We generate sampled data of \(X_t\), with \(\theta_{1}=3\), \(\theta_{2}=1\) and \(\theta_{3}=0.3\), time step \(\Delta t =10^{-3}\), as following:

f <- expression(3*t*(sqrt(t)-x)) ; g <- expression(0.3*t)
sim <- snssde1d(drift=f,diffusion=g,M=1,N=1000,x0=2,Dt=0.001)
mydata <- sim$X

we set the initial values for the optimizer as \(\theta_{1}=\theta_{2}=\theta_{3}=1\), and we specify the method to use with pmle="kessler":

fx <- expression( theta[1]*t* ( theta[2]*sqrt(t) - x ) ) ## drift coefficient of model 
gx <- expression( theta[3]*t ) ## diffusion coefficient of model 
fitmod <- fitsde(data=mydata,drift=fx,diffusion=gx,start = list(theta1=1,
                  theta2=1,theta3=1),pmle="kessler")
summary(fitmod)
## Pseudo maximum likelihood estimation
## 
## Method:  Kessler
## Call:
## fitsde(data = mydata, drift = fx, diffusion = gx, start = list(theta1 = 1, 
##     theta2 = 1, theta3 = 1), pmle = "kessler")
## 
## Coefficients:
##         Estimate  Std. Error
## theta1 2.7988138 0.268371073
## theta2 0.7726340 0.192091516
## theta3 0.3002192 0.006716556
## 
## -2 log L: -8462.47

The fitsde in practice

Estimation of attractive model

We propose the following dispersion models family (Boukhetala,1996): \begin{equation}\label{eq18} dR_t = \left( \frac{0.5 \theta^{2}_{3} R_t^{ \theta_{2} - 1} - \theta_{1}}{R_t^{\theta_{2}}} \right) dt + \theta_{3} dW_{t}, \end{equation}

where: \(2 \theta_{1}> \theta^{2}_{3}\) condition to ensure attractiveness; we generate sampled data of this model, with \(\theta_{1}=5\), \(\theta_{2}=1\) and \(\theta_{3}=0.2\), \(\Delta t =10^{-3}\), as following:

theta1 = 5; theta2 = 1; theta3 = 0.2
f <- expression( ((0.5*theta3^2 *x^(theta2-1) - theta1)/ x^theta2) )
g <- expression( theta3 )
sim <- snssde1d(drift=f,diffusion=g,M=1,N=1000,x0=3,Dt=0.001)
mydata <- sim$X

we use fitsde function to estimate the parameters of this model as follows:

fx <- expression( ((0.5*theta[3]^2 *x^(theta[2]-1) - theta[1])/ x^theta[2])  )
gx <- expression(theta[3])
fitmod <- fitsde(mydata,drift=fx,diffusion=gx, start = list(theta1=1,theta2=1,
                 theta3=1),lower=c(0,0,0),pmle="euler")
coef(fitmod)
##    theta1    theta2    theta3 
## 5.1002378 0.9952380 0.2013187

for to calculate the bias and confidence intervals of estimators it is easy, we can proceed as follows:

true <- c(theta1,theta2,theta3)   ## True parameters
bias <- true-coef(fitmod)
bias
##       theta1       theta2       theta3 
## -0.100237756  0.004761964 -0.001318711
confint(fitmod)
##            2.5 %    97.5 %
## theta1 4.7466588 5.4538167
## theta2 0.9806679 1.0098082
## theta3 0.1923328 0.2103046

Model selection via AIC

Let the following models: \begin{align*} % \nonumber to remove numbering (before each equation) dX_{t} &= \theta_{1} X_{t} dt + \theta_{2} X_{t}^{\theta_{3}} dW_{t}, &\text{(true model)}\\ dX_{t} &= (\theta_{1}+\theta_{2} X_{t}) dt + \theta_{3} X_{t}^{\theta_{4}} dW_{t},&\text{(competing model 1)}\\ dX_{t} &= (\theta_{1}+\theta_{2} X_{t}) dt + \theta_{3} \sqrt{X_{t}} dW_{t}, &\text{(competing model 2)}\\ dX_{t} &= \theta_{1} dt + \theta_{2} X_{t}^{\theta_{3}} dW_{t}, &\text{(competing model 3)} \end{align*}

We generate data from true model with parameters \(\underline{\theta}=(2,0.3,0.5)\), initial value \(X_{0}=2\) and \(\Delta t =10^{-4}\), as following:

f <- expression( 2*x )
g <- expression( 0.3*x^0.5 )
sim <- snssde1d(drift=f,diffusion=g,M=1,N=10000,x0=2,Dt=0.0001)
mydata <- sim$X

We test the performance of the AIC statistics for the four competing models, we can proceed as follows:

## True model
fx <- expression( theta[1]*x )
gx <- expression( theta[2]*x^theta[3] )
truemod <- fitsde(data=mydata,drift=fx,diffusion=gx,start = list(theta1=1,
                   theta2=1,theta3=1),pmle="euler")
## competing model 1
fx1 <- expression( theta[1]+theta[2]*x )
gx1 <- expression( theta[3]*x^theta[4] )
mod1 <- fitsde(data=mydata,drift=fx1,diffusion=gx1,start = list(theta1=1,
          theta2=1,theta3=1,theta4=1),pmle="euler")
## competing model 2
fx2 <- expression( theta[1]+theta[2]*x )
gx2 <- expression( theta[3]*sqrt(x) )
mod2 <- fitsde(data=mydata,drift=fx2,diffusion=gx2,start = list(theta1=1,
           theta2=1,theta3=1),pmle="euler")
## competing model 3
fx3 <- expression( theta[1] )
gx3 <- expression( theta[2]*x^theta[3] )
mod3 <- fitsde(data=mydata,drift=fx3,diffusion=gx3,start = list(theta1=1,
           theta2=1,theta3=1),pmle="euler")
## Computes AIC
AIC <- c(AIC(truemod),AIC(mod1),AIC(mod2),AIC(mod3))
Test <- data.frame(AIC,row.names = c("True mod","Comp mod1","Comp mod2","Comp mod3"))
Bestmod <- rownames(Test)[which.min(Test[,1])]
Bestmod
## [1] "True mod"

the estimates under the different models:

Theta1 <- c(coef(truemod)[[1]],coef(mod1)[[1]],coef(mod2)[[1]],coef(mod3)[[1]])
Theta2 <- c(coef(truemod)[[2]],coef(mod1)[[2]],coef(mod2)[[2]],coef(mod3)[[2]])
Theta3 <- c(coef(truemod)[[3]],coef(mod1)[[3]],coef(mod2)[[3]],coef(mod3)[[3]])
Theta4 <- c("",coef(mod1)[[4]],"","")
Parms  <- data.frame(Theta1,Theta2,Theta3,Theta4,row.names = c("True mod",
                      "Comp mod1","Comp mod2","Comp mod3"))
Parms
##               Theta1    Theta2    Theta3            Theta4
## True mod   1.8107966 0.3018689 0.4952373                  
## Comp mod1 -0.8867762 1.9792870 0.3018856 0.495182982648353
## Comp mod2  1.0947975 1.5990583 0.2997053                  
## Comp mod3  7.1921632 0.3013008 0.4981262

Application to real data

We make use of real data of the U.S. Interest Rates monthly form \(06/1964\) to \(12/1989\) (see Figure 1) available in package Ecdat, and we estimate the parameters \(\underline{\theta}=(\theta_{1},\theta_{2},\theta_{3},\theta_{4})\) of CKLS model. These data have been analyzed by Stefano et all (2014) with yuima package, here we confirm the results of the estimates by several approximation methods.

data(Irates)
rates <- Irates[, "r1"]
X <- window(rates, start = 1964.471, end = 1989.333)
plot(X)

we can now use all previous methods by fitsde function to estimate the parameters of CKLS model as follows:

fx <- expression( theta[1]+theta[2]*x ) ## drift coefficient of CKLS model
gx <- expression( theta[3]*x^theta[4] ) ## diffusion coefficient of CKLS model
pmle <- eval(formals(fitsde.default)$pmle)
fitres <- lapply(1:4, function(i) fitsde(X,drift=fx,diffusion=gx,pmle=pmle[i],
                  start = list(theta1=1,theta2=1,theta3=1,theta4=1)))
Coef <- data.frame(do.call("cbind",lapply(1:4,function(i) coef(fitres[[i]]))))
Info <- data.frame(do.call("rbind",lapply(1:4,function(i) logLik(fitres[[i]]))),
                   do.call("rbind",lapply(1:4,function(i) AIC(fitres[[i]]))),
                   do.call("rbind",lapply(1:4,function(i) BIC(fitres[[i]]))),
                   row.names=pmle)
names(Coef) <- c(pmle)
names(Info) <- c("logLik","AIC","BIC")
Coef
##             euler    kessler      ozaki      shoji
## theta1  2.0769516  2.1433505  2.1153154  2.1015009
## theta2 -0.2631871 -0.2743368 -0.2690547 -0.2664674
## theta3  0.1302158  0.1259800  0.1265225  0.1316708
## theta4  1.4513173  1.4691660  1.4649140  1.4513080
Info
##            logLik      AIC      BIC
## euler   -237.8786 483.7572 487.1514
## kessler -237.7845 483.5690 486.9632
## ozaki   -237.8356 483.6712 487.0654
## shoji   -237.8786 483.7572 487.1514

In Figure 2 we reports the sample mean of the solution of the CKLS model with the estimated parameters and real data, their empirical \(95\%\) confidence bands (from the \(2.5th\) to the \(97.5th\) percentile), we can proceed as follows:

f <- expression( (2.076-0.263*x) )
g <- expression( 0.130*x^1.451 )
mod <- snssde1d(drift=f,diffusion=g,x0=X[1],M=200, N=length(X),t0=1964.471,
                  T=1989.333)
mod
## Ito Sde 1D:
##  | dX(t) = (2.076 - 0.263 * X(t)) * dt + 0.13 * X(t)^1.451 * dW(t)
## Method:
##  | Euler scheme of order 0.5
## Summary:
##  | Size of process   | N  = 298.
##  | Number of simulation  | M  = 200.
##  | Initial value     | x0 = 3.317.
##  | Time of process   | t in [1964.471,1989.333].
##  | Discretization    | Dt = 0.08342953.
plot(mod,plot.type="single",type="n",ylim=c(0,30))
lines(X,col=4,lwd=2)
lines(time(mod),mean(mod),col=2,lwd=2)
lines(time(mod),bconfint(mod,level=0.95)[,1],col=5,lwd=2)
lines(time(mod),bconfint(mod,level=0.95)[,2],col=5,lwd=2)
legend("topleft",c("real data","mean path",paste("bound of", 95,"% confidence")),inset = .01,col=c(4,2,5),lwd=2,cex=0.8)