resampling

Resampling and related numerical algorithms.

Overview

This module implements resampling schemes, plus some basic numerical functions related to weights and weighted data (ESS, weighted mean, etc). The recommended import is:

from particles import resampling as rs

Resampling is covered in Chapter 9 of the book.

Resampling schemes

All the resampling schemes are implemented as functions with the following signature:

A = rs.scheme(W, M=None)

where:

  • W is a vector of N normalised weights (i.e. positive and summing to one).
  • M (int) is the number of resampled indices that must be generated; (optional, set to N if not provided).
  • A is a ndarray containing the M resampled indices (i.e. ints in the range 0, …, N-1).

Here the list of currently implemented resampling schemes:

Alternative ways to sample from a multinomial distribution

Function multinomial effectively samples M times from the multinomial distribution that produces output n with probability W[n]. It does so using an algorithm with complexity O(M+N), as explained in Section 9.4 of the book. However, this function is not really suited:

  1. if you want to draw only once from that distribution;
  2. If you do not know in advance how many draws you need.

The two functions below cover these scenarios:

Weights objects

Objects of class SMC, which represent the output of a particle filter, have an attribute called wgts, which is an instance of class Weights. The attributes of that object are:

  • lw: the N un-normalised log-weights
  • W: the N normalised weights (sum equals one)
  • ESS: the effective sample size (1/sum(W^2))

For instance:

pf = particles.SMC(fk=some_fk_model, N=100)
pf.run()
print(pf.wgts.ESS)  # The ESS of the final weights

The rest of this section should be of interest only to advanced users (who wish for instance to subclass SMC in order to define new particle algorithms). Basically, class Weights is used to automate and abstract away the computation of the normalised weights and their ESS. Here is a quick example:

from numpy import random

wgts = rs.Weights(lw=random.randn(10))  # we provide log-weights
print(wgts.W)  # the normalised weights have been computed automatically
print(wgts.ESS)  # and so the ESS
incr_lw = 3. * random.randn(10)  # incremental weights
new_wgts = wgts.add(incr_lw)
print(new_wgts.ESS)  # the ESS of the new weights

A technical point is that Weights objects should be considered as immutable: in particular method add returns a new Weights object. Trying to modify directly (in place) a Weights object may introduce hairy bugs. Basically, SMC and the methods of particles.smoothing.ParticleHistory do not copy such objects, so if you modify them later, then you also modify the version that has been stored at a previous iteration.

Other functions of interest

In particles, importance weights and similar quantities are always computed and stored on the log-scale, to avoid numerical overflow. This module also contains a few basic functions to deal with log-weights:

Module summary

multinomial(W, M) Multinomial resampling.
residual(W, M) Residual resampling.
stratified(W, M) Stratified resampling.
systematic(W, M) Systematic resampling.
multinomial_once(W) Sample once from a Multinomial distribution
MultinomialQueue(W[, M]) On-the-fly generator for the multinomial distribution.
Weights([lw]) A class to store N log-weights, and automatically compute normalised weights and their ESS.
inverse_cdf Inverse CDF algorithm for a finite distribution.
uniform_spacings(N) Generate ordered uniform variates in O(N) time.
ssp(W, M) Ssp resampling.
essl(lw) ESS (Effective sample size) computed from log-weights.
exp_and_normalise(lw) Exponentiate, then normalise (so that sum equals one).
log_mean_exp(v[, W]) Returns log of (weighted) mean of exp(v).
log_sum_exp(v) Log of the sum of the exp of the arguments.
wmean_and_var(W, x) Component-wise weighted mean and variance.
wmean_and_var_str_array(W, x) Weighted mean and variance of each component of a structured array.
wquantiles(W, x[, alphas]) Quantiles for weighted data.