# Bayesian inference for state-space models

## Defining a prior distribution

We have already seen that module `particles.distributions` defines various `ProbDist` objects, i.e. objects that represent probability distributions. Such objects have methods to simulate random variates, compute the log-density, and so on.

This module also defines a class called `StructDist`, whose methods take structured arrays as inputs and outputs. This is what we are going to use to define prior distributions. Here is a simple example:

```
[1]:
```

```
import warnings; warnings.simplefilter('ignore')  # hide warnings
from matplotlib import pyplot as plt
import numpy as np

from particles import distributions as dists

prior_dict = {'mu': dists.Normal(scale=2.),
              'rho': dists.Uniform(a=-1., b=1.),
              'sigma': dists.Gamma()}
my_prior = dists.StructDist(prior_dict)
```

Object `my_prior` represents a distribution for \(\theta=(\mu, \rho, \sigma)\), where \(\mu\sim N(0,2^2)\), \(\rho \sim \mathcal{U}([-1,1])\), \(\sigma \sim \mathrm{Gamma}(1, 1)\), independently. We may now sample from this distribution, compute its pdf, and so on. For each of these operations, the inputs and outputs must be structured arrays, with named variables `'mu'`, `'rho'` and `'sigma'`.

```
[2]:
```

```
theta = my_prior.rvs(size=500)  # sample 500 theta-parameters
plt.style.use('ggplot')
plt.hist(theta['sigma'], 30)
plt.xlabel('sigma')
plt.figure()
z = my_prior.logpdf(theta)  # log-density of the prior at each sampled theta
plt.hist(z, 30)
plt.xlabel('log-pdf');
```
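As a quick sanity check: the output of `rvs` is a standard NumPy structured array, so its fields may be inspected directly (a minimal sketch):

```
print(theta.dtype.names)  # field names, e.g. ('mu', 'rho', 'sigma')
print(theta['mu'][:5])    # first five sampled values of mu
```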

We may want to transform `sigma` into its logarithm, so that the support of the distribution is not constrained to \(\mathbb{R}^+\):

```
[3]:
```

```
another_prior_dict = {'rho': dists.Uniform(a=-1., b=1.),
                      'log_sigma': dists.LogD(dists.Gamma())}
another_prior = dists.StructDist(another_prior_dict)
another_theta = another_prior.rvs(size=100)
plt.hist(another_theta['log_sigma'], 20)
plt.xlabel('log-sigma');
```

Now, `another_theta` contains two variables, `rho` and `log_sigma`; the latter variable is distributed according to \(Y=\log(X)\), with \(X\sim \mathrm{Gamma}(1, 1)\). (The documentation of module `distributions` has more details on transformed distributions.)
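The transformed prior exposes the same interface as any other `StructDist`; for instance, we may evaluate its log-density (a minimal check):

```
lp = another_prior.logpdf(another_theta)  # log-density of (rho, log_sigma)
print(lp.shape)  # (100,): one value per sampled theta
```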

We may also want to introduce dependencies between \(\rho\) and \(\sigma\). Consider this:

```
[4]:
```

```
from collections import OrderedDict

dep_prior_dict = OrderedDict()
dep_prior_dict['rho'] = dists.Uniform(a=0., b=1.)
dep_prior_dict['sigma'] = dists.Cond(lambda theta: dists.Gamma(b=1./theta['rho']))
dep_prior = dists.StructDist(dep_prior_dict)
dep_theta = dep_prior.rvs(size=2000)
plt.scatter(dep_theta['rho'], dep_theta['sigma'])
plt.axis([0., 1., 0., 8.])
plt.xlabel('rho')
plt.ylabel('sigma');
```

The lines above encode a **chain rule** decomposition: first we specify the marginal distribution of \(\rho\), then the distribution of \(\sigma\) given \(\rho\). Since the order matters, we use an `OrderedDict` rather than a standard dictionary (which, before Python 3.7, did not guarantee key order). `Cond` is a particular `ProbDist` class that defines a conditional distribution, based on a function that takes an argument `theta` and returns a `ProbDist` object.
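As another illustration of `Cond`, here is a hypothetical hierarchical prior (not part of the tutorial's model) where the scale of \(\mu\) depends on \(\sigma\):

```
# hypothetical chain-rule prior: sigma first, then mu given sigma
hier_dict = OrderedDict()
hier_dict['sigma'] = dists.Gamma()
hier_dict['mu'] = dists.Cond(lambda theta: dists.Normal(scale=theta['sigma']))
hier_prior = dists.StructDist(hier_dict)
hier_theta = hier_prior.rvs(size=100)
```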

All the examples above involve univariate distributions; however, the components of a `StructDist` may also be multivariate distributions.

```
[5]:
```

```
reg_prior_dict = OrderedDict()
reg_prior_dict['sigma2'] = dists.InvGamma(a=2., b=3.)
reg_prior_dict['beta'] = dists.MvNormal(cov=np.eye(20))  # 20-dimensional Gaussian
reg_prior = dists.StructDist(reg_prior_dict)
reg_theta = reg_prior.rvs(size=200)
```
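In the resulting samples, each draw of `beta` is a 20-dimensional vector; a quick shape check:

```
print(reg_theta['sigma2'].shape)  # (200,): one scalar per sample
print(reg_theta['beta'].shape)    # (200, 20): one vector per sample
```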

## Bayesian inference for state-space models

We return to the simplified stochastic volatility model introduced in the basic tutorial:

\begin{align*}
X_0 & \sim N\left(\mu, \frac{\sigma^2}{1-\rho^2}\right) \\
X_t|X_{t-1}=x_{t-1} & \sim N\left( \mu + \rho (x_{t-1}-\mu), \sigma^2\right) \\
Y_t|X_t=x_t & \sim N\left(0, e^{x_t}\right)
\end{align*}

which we implemented as follows (this time with default values for the parameters):

```
[6]:
```

```
from particles import state_space_models as ssm

class StochVol(ssm.StateSpaceModel):
    default_params = {'mu': -1., 'rho': 0.95, 'sigma': 0.2}
    def PX0(self):  # Distribution of X_0
        return dists.Normal(loc=self.mu,
                            scale=self.sigma / np.sqrt(1. - self.rho**2))
    def PX(self, t, xp):  # Distribution of X_t given X_{t-1}=xp (p=past)
        return dists.Normal(loc=self.mu + self.rho * (xp - self.mu),
                            scale=self.sigma)
    def PY(self, t, xp, x):  # Distribution of Y_t given X_t=x (and possibly X_{t-1}=xp)
        return dists.Normal(loc=0., scale=np.exp(x))
```

We mentioned in the basic tutorial that `StochVol` represents the parametric class of univariate stochastic volatility models. Indeed, `StochVol` is the object we will pass to Bayesian inference algorithms (such as PMMH or SMC\(^2\)) in order to perform inference with respect to that class of models.
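For instance, an instance of `StochVol` may be used to simulate synthetic data through the `simulate` method inherited from `StateSpaceModel` (a minimal sketch, using the default parameter values):

```
# simulate 100 states and observations from the model
my_model = StochVol()
x_sim, y_sim = my_model.simulate(100)
```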

## PMMH (Particle marginal Metropolis-Hastings)

Let’s first try PMMH. This is a Metropolis-Hastings algorithm that samples from the posterior of parameter \(\theta\) (given the data). However, since the corresponding likelihood is intractable, each iteration of PMMH runs a particle filter that approximates it.

```
[9]:
```

```
from particles import mcmc
from particles import datasets as dts  # real datasets available in the package

# real data: GBP vs USD exchange rate (1997-98)
T = 50
data = dts.GBP_vs_USD_9798().data[:T]

my_pmmh = mcmc.PMMH(ssm_cls=StochVol, prior=my_prior, data=data, Nx=200,
                    niter=1000)
my_pmmh.run();  # may take several seconds...
```

The arguments we set when instantiating class `PMMH` require little explanation; just in case:

- `Nx` is the number of particles (for the particle filter run at each iteration);
- `niter` is the number of MCMC iterations.

Upon completion, object `my_pmmh.chain` is a `ThetaParticles` object, with the following attributes:

- `my_pmmh.chain.theta` is a structured array of size \(10^3\) (the number of iterations) with keys `'mu'`, `'rho'` and `'sigma'`;
- `my_pmmh.chain.lpost` is an array of length \(10^3\), containing the (estimated) log-posterior density for each simulated \(\theta\).

Let’s plot the MCMC traces.

```
[10]:
```

```
for p in prior_dict.keys():  # loop over mu, rho, sigma
    plt.figure()
    plt.plot(my_pmmh.chain.theta[p])
    plt.xlabel('iter')
    plt.ylabel(p)
```
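Beyond trace plots, the chain may be summarized directly; e.g. posterior means (the burn-in length of 100 below is an arbitrary choice):

```
# posterior means, discarding the first 100 iterations as burn-in
burnin = 100
for p in prior_dict.keys():
    print(p, my_pmmh.chain.theta[p][burnin:].mean())
```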

You might wonder what type of Metropolis sampler is actually implemented here:

- the starting point of the chain is sampled from the prior; you may instead set it to a specific value using option `starting_point` (when instantiating `PMMH`);
- the proposal is an **adaptive** Gaussian random walk: this means that the covariance matrix of the random step is calibrated on the fly on past simulations (using vanishing adaptation). This may be disabled by setting option `adaptive=False`;
- a bootstrap filter is run to approximate the log-likelihood; you may use a different filter (e.g. a guided filter) by passing a `FeynmanKac` class to option `fk_cls`;
- you may also pass various parameters to each call to `SMC` through (dict) argument `smc_options`; e.g. `smc_options={'qmc': True}` will make each particle filter an SQMC algorithm.

Thus, by and large, quite a lot of flexibility is hidden behind this default behaviour; a customized instantiation is sketched below.
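For instance, here is a customized PMMH instantiation that combines the options above (a sketch, not run here):

```
# PMMH with a fixed (non-adaptive) random walk and SQMC inner filters
alt_pmmh = mcmc.PMMH(ssm_cls=StochVol, prior=my_prior, data=data, Nx=200,
                     niter=1000, adaptive=False,
                     smc_options={'qmc': True})
# alt_pmmh.run()
```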

## Particle Gibbs

PMMH is just a particular instance of the general family of PMCMC samplers, i.e. MCMC samplers that run some particle filter at each iteration. Another instance is Particle Gibbs (PG), where one alternately samples:

1. from the distribution of \(\theta\) given the states and the data;
2. a new state trajectory, through a CSMC (conditional SMC) step.

Since Step 1 is model- (and user-)dependent, you need to define it for the model you are considering. This is done by sub-classing `ParticleGibbs` and defining method `update_theta` as follows:

```
[11]:
```

```
class PGStochVol(mcmc.ParticleGibbs):
    def update_theta(self, theta, x):
        # sample from the distribution of theta given trajectory x and the data;
        # here only mu is updated, using the conjugate Normal prior
        new_theta = theta.copy()
        sigma, rho = 0.2, 0.95  # fixed values
        xlag = np.array(x[1:] + [0.,])
        dx = (x - rho * xlag) / (1. - rho)
        s = sigma / (1. - rho)**2
        new_theta['mu'] = self.prior.laws['mu'].posterior(dx, sigma=s).rvs()
        return new_theta
```

For simplicity, \(\rho\) and \(\sigma\) are kept constant; only \(\mu\) is updated. This means we are actually sampling from the posterior of \(\mu\) given the data, while the other two parameters are kept fixed at the values above. Let’s run our PG algorithm:

```
[12]:
```

```
pg = PGStochVol(ssm_cls=StochVol, data=data, prior=my_prior, Nx=200, niter=1000)
pg.run() # may take several seconds...
```

Now let’s plot the results:

```
[16]:
```

```
plt.plot(pg.chain.theta['mu'])
plt.xlabel('iter')
plt.ylabel(r'$\mu$')
plt.figure()
plt.hist(pg.chain.theta['mu'][20:], 50)  # discard the first 20 iterations (burn-in)
plt.xlabel(r'$\mu$');
```
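A point estimate follows directly from the chain; e.g. (with the same burn-in of 20 iterations as in the histogram above):

```
# posterior mean of mu, after discarding burn-in
print(pg.chain.theta['mu'][20:].mean())
```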

## SMC^2

Finally, we consider SMC\(^2\), an SMC algorithm that makes it possible to approximate:

- all the partial posteriors (of \(\theta\) given \(y_{0:t}\), for \(t=0, 1, \ldots, T\)), rather than only the final posterior;
- the marginal likelihoods of the data.

SMC\(^2\) is a two-level SMC sampler:

- it simulates many \(\theta\)-values from the prior, and updates their weights recursively, according to the likelihood of each new datapoint;
- however, since these likelihood factors are intractable, a particle filter is run for each \(\theta\) to approximate them; hence a number \(N_x\) of \(x\)-particles are generated for, and attached to, each \(\theta\).

The class `SMC2` is defined inside module `smc_samplers`. It is run in the same way as the other SMC algorithms.

```
[17]:
```

```
import particles
from particles import smc_samplers as ssp
fk_smc2 = ssp.SMC2(ssm_cls=StochVol, data=data, prior=my_prior,init_Nx=50,
ar_to_increase_Nx=0.1)
alg_smc2 = particles.SMC(fk=fk_smc2, N=500)
alg_smc2.run()
```
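Since SMC\(^2\) also approximates the marginal likelihoods of the data, the estimated quantities \(\log p(y_{0:t})\) may be recovered from the standard `summaries` attribute of the `SMC` object (a sketch, assuming the default collectors are enabled):

```
# estimated log marginal likelihoods, log p(y_{0:t})
plt.plot(alg_smc2.summaries.logLts)
plt.xlabel('t')
plt.ylabel(r'$\log p(y_{0:t})$');
```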

Again, a few choices are made for you by default:

- A bootstrap filter is run for each \(\theta\)-particle; this may be changed by setting option `fk_cls` when instantiating `SMC2`; e.g. `fk_cls=ssm.GuidedPF` will run a guided filter instead.
- Option `init_Nx` determines the **initial** number of \(x\)-particles; the algorithm automatically increases \(N_x\) each time the acceptance rate drops below 10% (as specified through option `ar_to_increase_Nx=0.1` above). Set this option to `0.` if you do not want \(N_x\) to increase in the course of the algorithm.
- The particle filters (in the \(x\)-dimension) are run with the default options of class `SMC`; e.g. resampling is set to `systematic`, and so on; other options may be set through option `smc_options`. A customized instantiation is sketched below.
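For instance, a variant that keeps \(N_x\) fixed and uses SQMC for the inner filters might look as follows (a sketch based on the options listed above, not run here):

```
# SMC^2 with fixed N_x and SQMC inner particle filters
fk_alt = ssp.SMC2(ssm_cls=StochVol, data=data, prior=my_prior, init_Nx=50,
                  ar_to_increase_Nx=0., smc_options={'qmc': True})
alg_alt = particles.SMC(fk=fk_alt, N=500)
# alg_alt.run()
```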

```
[19]:
```

```
plt.scatter(alg_smc2.X.theta['mu'], alg_smc2.X.theta['rho'])
plt.xlabel(r'$\mu$')
plt.ylabel(r'$\rho$');
```
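The \(\theta\)-particles are weighted, so posterior expectations should be computed with the weights `alg_smc2.W`; e.g.:

```
# weighted posterior means of the parameters
for p in ['mu', 'rho', 'sigma']:
    print(p, np.average(alg_smc2.X.theta[p], weights=alg_smc2.W))
```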