particles.utils

Non-numerical utilities (notably for parallel computation).

Overview

This module gathers several non-numerical utilities. The only one of direct interest to the user is the multiplexer function, which we now describe briefly.

Say we have some function f, which takes only keyword arguments:

def f(x=0, y=0, z=0):
    return x + y + z**2

We wish to evaluate f repeatedly for a range of x, y and/or z values. To do so, we may use function multiplexer as follows:

results = multiplexer(f=f, x=3, y=[2, 4, 6], z=[3, 5])

which returns a list of 3 * 2 = 6 dictionaries of the form:

[ {'x':3, 'y':2, 'z':3, 'out':14},  # 14=f(3, 2, 3)
  {'x':3, 'y':2, 'z':5, 'out':30},
  {'x':3, 'y':4, 'z':3, 'out':16},
   ... ]

In other words, multiplexer computes the Cartesian product of the inputs.
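This Cartesian-product behaviour can be sketched in a few lines with itertools.product. The sketch below is an illustration only (cartesian_multiplexer is a hypothetical name; the real multiplexer also handles dict arguments, repeated runs, seeding and parallelism):

```python
import itertools

def f(x=0, y=0, z=0):
    return x + y + z**2

def cartesian_multiplexer(f, **kwargs):
    """Toy sketch of multiplexer's Cartesian-product behaviour.

    Scalar arguments are kept fixed; list arguments are expanded.
    """
    # wrap scalars into one-element lists so that every argument is iterable
    lists = {k: v if isinstance(v, list) else [v] for k, v in kwargs.items()}
    keys = list(lists)
    results = []
    for combo in itertools.product(*(lists[k] for k in keys)):
        inp = dict(zip(keys, combo))
        results.append({**inp, 'out': f(**inp)})
    return results

results = cartesian_multiplexer(f, x=3, y=[2, 4, 6], z=[3, 5])
# 3 values of y times 2 values of z -> 6 output dictionaries
```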

For each argument, you may use a dictionary instead of a list:

results = multiplexer(f=f, z={'good': 3, 'bad': 5})

In that case, the values of the dictionaries are used in the same way as above, but the output reports the corresponding keys, i.e.:

[ {'z': 'good', 'out': 9},   # f(0, 0, 3)
  {'z': 'bad', 'out': 25}    # f(0, 0, 5)
]
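The labelling behaviour amounts to evaluating f at each dictionary value while recording the corresponding key; a minimal sketch (plain loop, no Cartesian product or parallelism):

```python
def f(x=0, y=0, z=0):
    return x + y + z**2

# Each value is passed to f, but the output records the label (the key).
results = [{'z': label, 'out': f(z=value)}
           for label, value in {'good': 3, 'bad': 5}.items()]
# -> [{'z': 'good', 'out': 9}, {'z': 'bad', 'out': 25}]
```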

This is useful when f takes as arguments complex objects that you would like to replace by more legible labels; e.g. option `model` of class SMC.

multiplexer also accepts three extra keyword arguments (whose names therefore may not be used as keyword arguments of function f):

  • nprocs (default=1): if >0, number of CPU cores to use in parallel; if <=0, number of cores not to use; in particular, nprocs=0 means that all CPU cores are used.

  • nruns (default=1): evaluate f nruns times for each combination of arguments; an entry run (ranging from 0 to nruns-1) is added to the output dictionaries.

  • seeding (default: True if nruns>1, False otherwise): if True, seeds the pseudo-random number generator with a different seed before each call to function f; see below.
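The nruns option can be illustrated with a serial sketch (repeated_runs is a hypothetical name; the real multiplexer combines this with the Cartesian product above and dispatches work over nprocs cores):

```python
import random

def f(x=0, y=0, z=0):
    return x + y + z**2

def repeated_runs(f, nruns=1, seeding=False, **kwargs):
    """Toy serial sketch of the nruns/seeding behaviour."""
    results = []
    for run in range(nruns):
        if seeding:
            random.seed(run)  # a different seed before each call
        # each output dictionary carries a 'run' entry
        results.append({**kwargs, 'run': run, 'out': f(**kwargs)})
    return results

results = repeated_runs(f, nruns=3, x=1, y=2, z=2)
# three evaluations of the same combination; 'run' entries are 0, 1, 2
```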

Warning

Parallel processing relies on the joblib library, which spawns identical workers, including the state of the NumPy random generator. If your function involves random numbers: (a) set option seeding to True (otherwise all workers will return identical results); (b) make sure that function f does not rely on scipy frozen distributions, since these also freeze the state of the random generator. For instance, do not use any frozen distribution when defining your own Feynman-Kac object.
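Point (a) can be demonstrated without joblib: two calls started from the same generator state return identical "random" results, whereas distinct seeds give genuinely different replicates. A toy illustration using the standard random module:

```python
import random

def noisy_f(x=0):
    # a function that involves random numbers
    return x + random.gauss(0.0, 1.0)

# Same generator state -> identical results (the pitfall).
random.seed(42)
a = noisy_f(1)
random.seed(42)
b = noisy_f(1)
assert a == b

# Distinct seeds -> genuinely different replicates.
random.seed(0)
c = noisy_f(1)
random.seed(1)
d = noisy_f(1)
assert c != d
```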

See also

multiSMC

Functions

add_to_dict(d, obj[, key])

cartesian_args(args, listargs, dictargs)

Compute a list of inputs and outputs for a function with keyword arguments.

cartesian_lists(d)

Turn a dict of lists into a list of dicts representing the Cartesian product of the initial lists.

distinct_seeds(k)

Generate distinct seeds for random number generation.

distribute_work(f, inputs[, outputs, ...])

For each input i (a dict) in list inputs, evaluate f(**i), using multiprocessing if nprocs > 1.

multiplexer([f, nruns, nprocs, seeding, ...])

Evaluate a function for different parameters, optionally in parallel.

timer(method)

worker(qin, qout, f)

Worker for multiprocessing.

Classes

seeder(func)