# distributions

Probability distributions as Python objects.

## Overview

This module lets users define probability distributions as Python objects.

The probability distributions defined in this module may be used:

- to define state-space models (see module state_space_models);
- to define a prior distribution, in order to perform parameter estimation (see modules smc_samplers and mcmc).

## Univariate distributions

The module defines the following classes of univariate continuous distributions:

class (with signature) | comments
---|---
Beta(a=1., b=1.) | 
Dirac(loc=0.) | Dirac mass at point `loc`
FlatNormal(loc=0.) | Normal with infinite variance (missing data)
Gamma(a=1., b=1.) | scale = 1/b
InvGamma(a=1., b=1.) | Distribution of 1/X for X ~ Gamma(a, b)
Laplace(loc=0., scale=1.) | 
Logistic(loc=0., scale=1.) | 
LogNormal(mu=0., sigma=1.) | Distribution of Y = e^X, X ~ N(mu, sigma^2)
Normal(loc=0., scale=1.) | N(loc, scale^2) distribution
Student(loc=0., scale=1., df=3) | 
TruncNormal(mu=0., sigma=1., a=0., b=1.) | N(mu, sigma^2) truncated to interval [a, b]
Uniform(a=0., b=1.) | uniform over interval [a, b]

and the following classes of univariate discrete distributions:

class (with signature) | comments
---|---
Binomial(n=1, p=0.5) | 
Categorical(p=None) | returns i with probability p[i]
DiscreteUniform(lo=0, hi=2) | uniform over lo, …, hi-1
Geometric(p=0.5) | 
Poisson(rate=1.) | Poisson with expectation `rate`

Note that all the parameters of these distributions have default values, e.g.:

```
some_norm = Normal(loc=2.4) # N(2.4, 1)
some_gam = Gamma() # Gamma(1, 1)
```

## Mixture distributions (new in version 0.4)

A (univariate) mixture distribution may be specified as follows:

```
mix = Mixture([0.5, 0.5], Normal(loc=-1), Normal(loc=1.))
```

The first argument is the vector of probabilities, the next arguments are the k component distributions.
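Conceptually, the log-density of such a mixture is a log-sum-exp of the component log-densities weighted by the mixture probabilities. Here is a plain-NumPy sketch of that computation for Normal components (an illustration of the idea, not the library's implementation):

```python
import numpy as np

def mixture_logpdf(x, weights, locs, scale=1.0):
    """Log-density of a mixture of Normals, via log-sum-exp for stability."""
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    locs = np.asarray(locs, dtype=float)
    # Component log-densities: shape (len(x), k)
    z = (x[:, None] - locs[None, :]) / scale
    comp_logpdf = -0.5 * z**2 - 0.5 * np.log(2 * np.pi) - np.log(scale)
    # log sum_j w_j f_j(x), computed stably via log-sum-exp
    a = np.log(weights)[None, :] + comp_logpdf
    m = a.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True))).ravel()

lp = mixture_logpdf([0.0], [0.5, 0.5], [-1.0, 1.0])
```

The log-sum-exp trick matters here: summing exponentiated log-densities directly would underflow for points far in the tails.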

See also `MixMissing`, which defines a mixture distribution between one component that generates the label "missing" and another component:

```
mixmiss = MixMissing(pmiss=0.1, base_dist=Normal(loc=2.))
```

This particular distribution is useful to specify a state-space model where the observation may be missing with a certain probability.

## Transformed distributions

To further enrich the list of available univariate distributions, the module
lets you define **transformed distributions**, that is, the distribution of
Y=f(X), for a certain function f, and a certain base distribution for X.

class name (and signature) | description
---|---
LinearD(base_dist, a=1., b=0.) | Y = a * X + b
LogD(base_dist) | Y = log(X)
LogitD(base_dist, a=0., b=1.) | Y = logit((X - a) / (b - a))

A quick example:

```
from particles import distributions as dists
d = dists.LogD(dists.Gamma(a=2., b=2.)) # law of Y=log(X), X~Gamma(2, 2)
```
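These transformed densities follow from the change-of-variables formula: if Y = log(X) and X has density f_X, then Y has density f_X(e^y) e^y. A self-contained NumPy sketch of that formula for the Gamma(2, 2) base distribution above (illustrative, not the library's code):

```python
import numpy as np
from math import lgamma

def gamma_logpdf(x, a=2.0, b=2.0):
    """Log-density of Gamma(a, b), where b is the rate (scale = 1/b)."""
    return a * np.log(b) - lgamma(a) + (a - 1) * np.log(x) - b * x

def logd_gamma_logpdf(y, a=2.0, b=2.0):
    """Log-density of Y = log(X), X ~ Gamma(a, b): log f_X(e^y) + y."""
    return gamma_logpdf(np.exp(y), a, b) + y

val = logd_gamma_logpdf(0.0)  # log-density of Y at y = 0, i.e. at X = 1
```

Note that Y is supported on the whole real line, even though X is positive.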

> **Note:** These transforms are often used to obtain random variables defined over the full real line. This is convenient in particular when implementing random walk Metropolis steps.

## Multivariate distributions

The module implements one multivariate distribution class, for Gaussian distributions; see `MvNormal`.

Furthermore, the module provides two ways to construct multivariate distributions from a collection of univariate distributions:

- `IndepProd`: product of independent distributions; mainly used to define state-space models.
- `StructDist`: distributions for named variables; mainly used to specify prior distributions; see modules smc_samplers and mcmc (and the corresponding tutorials).
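The named-variable mechanism relies on NumPy structured arrays: one field per variable, with the array length playing the role of the sample size. A rough sketch of the underlying idea (the field names below are illustrative, not tied to any particular model):

```python
import numpy as np

# A structured dtype with one named field per variable.
dtype = [('mu', 'f8'), ('sigma', 'f8')]
theta = np.zeros(5, dtype=dtype)  # 5 parameter "particles"

# Each field is filled independently, as if from its own distribution.
rng = np.random.default_rng(0)
theta['mu'] = rng.normal(size=5)
theta['sigma'] = rng.gamma(1.0, 1.0, size=5)
```

Each row of `theta` is then one joint draw of (mu, sigma), which is the shape of object that `StructDist` produces and consumes.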

## Under the hood

Probability distributions are represented as objects of classes that inherit from base class `ProbDist`, and implement the following methods:

- `logpdf(self, x)`: computes the log-pdf (probability density function) at point `x`;
- `rvs(self, size=None)`: simulates `size` random variates (if set to None, the number of samples is either one, if all parameters are scalar, or the common size of the parameters, see below);
- `ppf(self, u)`: computes the quantile function (or Rosenblatt transform for a multivariate distribution) at point `u`.

A quick example:

```
some_dist = dists.Normal(loc=2., scale=3.)
x = some_dist.rvs(size=30) # a (30,) ndarray containing IID N(2, 3^2) variates
z = some_dist.logpdf(x) # a (30,) ndarray containing the log-pdf at x
```

By default, the inputs and outputs of these methods are either scalars or Numpy arrays (with appropriate type and shape). In particular, passing a Numpy array to a distribution parameter makes it possible to define “array distributions”. For instance:

```
some_dist = dists.Normal(loc=np.arange(1., 11.))
x = some_dist.rvs(size=10)
```

generates 10 Gaussian-distributed variates, with respective means 1., …, 10. This is how we manage to define “Markov kernels” in state-space models; e.g. when defining the distribution of X_t given X_{t-1} in a state-space model:

```
class StochVol(ssm.StateSpaceModel):
    def PX(self, t, xp):  # distribution of X_t given X_{t-1} = xp
        return dists.Normal(loc=xp)
    # ... see module state_space_models for more details
```

Then, in practice, in e.g. the bootstrap filter, when we generate particles
X_t^n, we call method `PX`

and pass as an argument a numpy array of shape
(N,) containing the N ancestors.
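This mechanism boils down to NumPy broadcasting: when a distribution parameter is a length-N array, sampling produces one variate per entry. A library-independent sketch of the same idea with plain NumPy:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 100
xp = rng.normal(size=N)  # N "ancestor" particles X_{t-1}^n

# Kernel X_t | X_{t-1} = xp ~ N(xp, 1): the array-valued loc parameter
# broadcasts, so we get one draw per ancestor in a single call.
x = rng.normal(loc=xp, scale=1.0)
```

Each `x[n]` is thus a draw from the kernel started at `xp[n]`, which is exactly what a bootstrap filter needs at each time step.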

> **Note:** ProbDist objects are roughly similar to the frozen distributions of package `scipy.stats`. However, they are not equivalent: using such a frozen distribution when e.g. defining a state-space model will raise an error.

## Posterior distributions

A few classes also implement a `posterior` method, which returns the posterior distribution corresponding to a prior set to `self`, a model which is conjugate for the considered class, and some data. Here is a quick example:

```
from numpy import random
from particles import distributions as dists

prior = dists.InvGamma(a=.3, b=.3)
data = random.randn(20)  # 20 points generated from N(0, 1)
post = prior.posterior(data)
# prior is conjugate wrt model X_1, ..., X_n ~ N(0, theta)
print("posterior is InvGamma(%f, %f)" % (post.a, post.b))
```

Here is a list of distributions implementing posteriors:

Distribution | Corresponding model | comments
---|---|---
InvGamma | N(0, theta) | 
Gamma | N(0, 1/theta) | 
Normal | N(theta, sigma^2) | sigma fixed (passed as extra argument)
TruncNormal | same | 
MvNormal | N(theta, Sigma) | Sigma fixed (passed as extra argument)
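For intuition, the Normal case has the textbook closed-form update: with prior theta ~ N(m0, s0^2) and data x_1, ..., x_n ~ N(theta, sigma^2), the posterior is N(m1, s1^2) with 1/s1^2 = 1/s0^2 + n/sigma^2 and m1 = s1^2 (m0/s0^2 + sum(x)/sigma^2). The following NumPy sketch computes those posterior parameters directly (the standard formula, not the library's code):

```python
import numpy as np

def normal_posterior(x, m0=0.0, s0=1.0, sigma=1.0):
    """Posterior N(m1, s1^2) of theta, given x_i ~ N(theta, sigma^2) iid."""
    x = np.asarray(x, dtype=float)
    n = x.size
    prec1 = 1.0 / s0**2 + n / sigma**2           # posterior precision
    s1 = 1.0 / np.sqrt(prec1)
    m1 = (m0 / s0**2 + x.sum() / sigma**2) / prec1
    return m1, s1
```

With no data (n = 0) the update returns the prior parameters, as it should.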

## Implementing your own distributions

If you would like to create your own univariate probability distribution, the easiest way to do so is to sub-class `ProbDist`, for a continuous distribution, or `DiscreteDist`, for a discrete distribution. This will properly set class attributes `dim` (the dimension, set to one for a univariate distribution) and `dtype`, so that they play nicely with `StructDist` and so on. You will also have to define methods `rvs`, `logpdf` and `ppf` properly. You may omit `ppf` if you do not plan to use SQMC (sequential quasi-Monte Carlo).
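For illustration, here is what the required interface looks like for an exponential distribution. To keep the sketch self-contained it does not import particles or subclass the actual `ProbDist`; the `dim` and `dtype` attributes are set by hand to mimic what the base class would provide:

```python
import numpy as np

class Exponential:
    """Exponential(rate) with a ProbDist-style interface (illustrative)."""
    dim, dtype = 1, 'float64'  # what sub-classing ProbDist would set for you

    def __init__(self, rate=1.0):
        self.rate = rate

    def rvs(self, size=None):
        # size=None yields a single scalar draw, as in the library convention
        return np.random.exponential(scale=1.0 / self.rate, size=size)

    def logpdf(self, x):
        # log f(x) = log(rate) - rate * x, for x >= 0
        return np.log(self.rate) - self.rate * x

    def ppf(self, u):
        # Inverse CDF: x = -log(1 - u) / rate; needed only for SQMC
        return -np.log1p(-u) / self.rate
```

A real implementation would simply replace `class Exponential:` by `class Exponential(ProbDist):` and drop the hand-set attributes.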

## Summary of module

Class | Description
---|---
`DiscreteDist` | Base class for discrete probability distributions.
`Beta` | Beta(a, b) distribution.
`Dirac` | Dirac mass.
`FlatNormal` | Normal with infinite variance.
`Gamma` | Gamma(a, b) distribution, scale = 1/b.
`InvGamma` | Inverse Gamma(a, b) distribution.
`Laplace` | Laplace(loc, scale) distribution.
`Logistic` | Logistic(loc, scale) distribution.
`LogNormal` | Distribution of Y = e^X, with X ~ N(mu, sigma^2).
`Normal` | N(loc, scale^2) distribution.
`Student` | Student distribution.
`TruncNormal` | Normal(mu, sigma^2) truncated to [a, b] interval.
`Uniform` | Uniform([a, b]) distribution.
`Binomial` | Binomial(n, p) distribution.
`Geometric` | Geometric(p) distribution.
`Poisson` | Poisson(rate) distribution.
`LinearD` | Distribution of Y = a * X + b.
`LogD` | Distribution of Y = log(X).
`LogitD` | Distribution of Y = logit((X - a) / (b - a)).
`MvNormal` | Multivariate Normal distribution.
`IndepProd` | Product of independent univariate distributions.
`ProbDist` | Base class for probability distributions.
`StructDist` | A distribution such that inputs/outputs are structured arrays.