API

This file describes the classes and methods available in ELFI.

Modelling API

Below is the API for creating generative models.

elfi.ElfiModel([name, observed, source_net])

A container for the inference model.

General model nodes

elfi.Constant(value, **kwargs)

A node holding a constant value.

elfi.Operation(fn, *parents, **kwargs)

A generic deterministic operation node.

elfi.RandomVariable(distribution, *params[, ...])

A node that draws values from a random distribution.

LFI nodes

elfi.Prior(distribution, *params[, size])

A parameter node of an ELFI graph.

elfi.Simulator(fn, *params, **kwargs)

A simulator node of an ELFI graph.

elfi.Summary(fn, *parents, **kwargs)

A summary node of an ELFI graph.

elfi.Discrepancy(discrepancy, *parents, **kwargs)

A discrepancy node of an ELFI graph.

elfi.Distance(distance, *summaries, **kwargs)

A convenience class for the discrepancy node.

elfi.AdaptiveDistance(*summaries, **kwargs)

Euclidean (2-norm) distance calculation with adaptive scale.

Other

elfi.new_model([name, set_default])

Create a new ElfiModel instance.

elfi.load_model(name[, prefix, set_default])

Load the pickled ElfiModel.

elfi.get_default_model()

Return the current default ElfiModel instance.

elfi.set_default_model([model])

Set the current default ElfiModel instance.

elfi.draw(G[, internal, param_names, ...])

Draw the ElfiModel.

elfi.plot_params_vs_node(node[, n_samples, ...])

Plot some realizations of parameters vs. node.

Inference API

Below is a list of inference methods included in ELFI.

elfi.Rejection(model[, discrepancy_name, ...])

Parallel ABC rejection sampler.

elfi.SMC(model[, discrepancy_name, output_names])

Sequential Monte Carlo ABC sampler.

elfi.AdaptiveDistanceSMC(model[, ...])

SMC-ABC sampler with adaptive threshold and distance function.

elfi.AdaptiveThresholdSMC(model[, ...])

ABC-SMC sampler with adaptive threshold selection.

elfi.BayesianOptimization(model[, ...])

Bayesian Optimization of an unknown target function.

elfi.BOLFI(model[, target_name, bounds, ...])

Bayesian Optimization for Likelihood-Free Inference (BOLFI).

elfi.ROMC(model[, bounds, discrepancy_name, ...])

Robust Optimisation Monte Carlo inference method.

elfi.BSL(model, n_sim_round[, ...])

Bayesian Synthetic Likelihood for parameter inference.

Result objects

OptimizationResult(x_min, **kwargs)

Base class for results from optimization.

Sample(method_name, outputs, parameter_names)

Sampling results from inference methods.

SmcSample(method_name, outputs, ...)

Container for results from SMC-ABC.

BolfiSample(method_name, chains, ...)

Container for results from BOLFI.

Post-processing

elfi.adjust_posterior(sample, model, ...[, ...])

Adjust the posterior using local regression.

LinearAdjustment(**kwargs)

Regression adjustment using a local linear model.

Diagnostics

elfi.TwoStageSelection(simulator, fn_distance)

Perform the summary-statistics selection proposed by Nunes and Balding (2010).

Acquisition methods

LCBSC(*args[, delta, additive_cost])

Lower Confidence Bound Selection Criterion.

MaxVar(model, prior[, quantile_eps])

The maximum variance acquisition method.

RandMaxVar(model, prior[, quantile_eps, ...])

The randomised maximum variance acquisition method.

ExpIntVar(model, prior[, quantile_eps, ...])

The Expected Integrated Variance (ExpIntVar) acquisition method.

UniformAcquisition(model[, prior, n_inits, ...])

Acquisition from uniform distribution.

Other

Data pools

elfi.OutputPool([outputs, name, prefix])

Store node outputs to dictionary-like stores.

elfi.ArrayPool([outputs, name, prefix])

OutputPool that uses binary .npy files as default stores.

Module functions

elfi.get_client()

Get the current ELFI client instance.

elfi.set_client([client])

Set the current ELFI client instance.

Tools

elfi.tools.vectorize(operation[, constants, ...])

Vectorize an operation.

elfi.tools.external_operation(command[, ...])

Wrap an external command as a Python callable (function).

Class documentation

Modelling API classes

class elfi.ElfiModel(name=None, observed=None, source_net=None)

A container for the inference model.

The ElfiModel is a directed acyclic graph (DAG), whose nodes represent parts of the inference task, for example the parameters to be inferred, the simulator or a summary statistic.
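The DAG idea can be illustrated without ELFI itself. The sketch below uses only the standard library (Python 3.9+); the node names are hypothetical and the dictionary is not ELFI's internal representation. It shows why a DAG gives a well-defined evaluation order: every node is visited after its parents.

```python
from graphlib import TopologicalSorter

# Hypothetical node names; each key maps to the set of its parent nodes.
parents = {
    "mu": set(),      # parameter (prior) node
    "sim": {"mu"},    # simulator depends on the parameter
    "mean": {"sim"},  # summary statistic of the simulated data
    "d": {"mean"},    # discrepancy between simulated and observed summaries
}

# A valid evaluation order visits every node after all of its parents.
order = list(TopologicalSorter(parents).static_order())
print(order)
```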

Initialize the inference model.

Parameters:
  • name (str, optional) –

  • observed (dict, optional) – Observed data with node names as keys.

  • source_net (nx.DiGraph, optional) –

  • set_current (bool, optional) – Sets this model as the current (default) ELFI model

copy()

Return a copy of the ElfiModel instance.

Return type:

ElfiModel

generate(batch_size=1, outputs=None, with_values=None, seed=None)

Generate a batch of outputs.

This method is useful for testing that the ELFI graph works.

Parameters:
  • batch_size (int, optional) –

  • outputs (list, optional) –

  • with_values (dict, optional) – You can specify values for nodes to use when generating data

  • seed (int, optional) – Defaults to global numpy seed.

get_reference(name)

Return a new reference object for a node in the model.

Parameters:

name (str) –

get_state(name)

Return the state of the node.

Parameters:

name (str) –

classmethod load(name, prefix)

Load the pickled ElfiModel.

Assumes there exists a file “name.pkl” in the current directory.

Parameters:
  • name (str) – Name of the model file to load (without the .pkl extension).

  • prefix (str) – Path to directory where the model file is located, optional.

Return type:

ElfiModel
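As a rough illustration of the file convention described for load, the sketch below builds a "name.pkl" path under an optional prefix directory and round-trips an object through pickle. The helper model_path and the plain dict standing in for a model are hypothetical; how ELFI composes the path internally is an assumption here.

```python
import os
import pickle
import tempfile


def model_path(name, prefix=None):
    """Build '<prefix>/<name>.pkl'; an empty prefix means the current directory."""
    return os.path.join(prefix or "", name + ".pkl")


# Round trip with a plain dict standing in for a model object.
with tempfile.TemporaryDirectory() as tmp:
    path = model_path("my_model", prefix=tmp)
    with open(path, "wb") as f:
        pickle.dump({"name": "my_model"}, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)
```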

property name

Return name of the model.

property observed

Return the observed data for the nodes in a dictionary.

property parameter_names

Return a list of model parameter names in alphabetical order.

remove_node(name)

Remove a node from the graph.

Parameters:

name (str) –

save(prefix=None)

Save the current model to a pickled file.

Parameters:

prefix (str, optional) – Path to the directory under which to save the model. Default is the current working directory.

update_node(name, updating_name)

Update node with updating_node in the model.

The node with name name gets the state (operation), parents and observed data (if applicable) of the updating_node. The updating node is then removed from the graph.

Parameters:
  • name (str) –

  • updating_name (str) –

class elfi.Constant(value, **kwargs)

A node holding a constant value.

Initialize a node holding a constant value.

Parameters:

value – The constant value of the node.

become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:

other_node (NodeReference) –

generate(batch_size=1, with_values=None)

Generate output from this node.

Useful for testing.

Parameters:
  • batch_size (int, optional) –

  • with_values (dict, optional) –

property parents

Get all positional parent nodes (inputs) of this node.

Returns:

parents – List of positional parents

Return type:

list

classmethod reference(name, model)

Construct a reference for an existing node in the model.

Parameters:
  • name (string) – name of the node

  • model (ElfiModel) –

Return type:

NodePointer instance

property state

Return the state dictionary of the node.

class elfi.Operation(fn, *parents, **kwargs)

A generic deterministic operation node.

Initialize a node that performs an operation.

Parameters:

fn (callable) – The operation of the node.

become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:

other_node (NodeReference) –

generate(batch_size=1, with_values=None)

Generate output from this node.

Useful for testing.

Parameters:
  • batch_size (int, optional) –

  • with_values (dict, optional) –

property parents

Get all positional parent nodes (inputs) of this node.

Returns:

parents – List of positional parents

Return type:

list

classmethod reference(name, model)

Construct a reference for an existing node in the model.

Parameters:
  • name (string) – name of the node

  • model (ElfiModel) –

Return type:

NodePointer instance

property state

Return the state dictionary of the node.

class elfi.RandomVariable(distribution, *params, size=None, **kwargs)

A node that draws values from a random distribution.

Initialize a node that represents a random variable.

Parameters:
  • distribution (str or scipy-like distribution object) –

  • params (params of the distribution) –

  • size (int, tuple or None, optional) – Output size of a single random draw.

become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:

other_node (NodeReference) –

static compile_operation(state)

Compile a callable operation that samples the associated distribution.

Parameters:

state (dict) –

property distribution

Return the distribution object.

generate(batch_size=1, with_values=None)

Generate output from this node.

Useful for testing.

Parameters:
  • batch_size (int, optional) –

  • with_values (dict, optional) –

property parents

Get all positional parent nodes (inputs) of this node.

Returns:

parents – List of positional parents

Return type:

list

classmethod reference(name, model)

Construct a reference for an existing node in the model.

Parameters:
  • name (string) – name of the node

  • model (ElfiModel) –

Return type:

NodePointer instance

property size

Return the size of the output from the distribution.

property state

Return the state dictionary of the node.

class elfi.Prior(distribution, *params, size=None, **kwargs)

A parameter node of an ELFI graph.

Initialize a Prior.

Parameters:
  • distribution (str, object) – Any distribution from scipy.stats, either as a string or an object. Objects must implement at least an rvs method with signature rvs(*parameters, size, random_state). Can also be a custom distribution object that implements at least an rvs method. Many of the algorithms also require the pdf and logpdf methods to be available.

  • size (int, tuple or None, optional) – Output size of a single random draw.

  • params – Parameters of the prior distribution

  • kwargs

Notes

The parameters of the scipy distributions (typically loc and scale) must be given as positional arguments.

Many algorithms (e.g. SMC) also require a pdf method for the distribution. In general the definition of the distribution is a subset of scipy.stats.rv_continuous.

Scipy distributions: https://docs.scipy.org/doc/scipy-0.19.0/reference/stats.html
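The requirements above (an rvs method with the stated signature, plus pdf and logpdf for many algorithms) might be met by a small custom class such as the following sketch. The class CustomUniform is hypothetical. To stay dependency-free it returns a flat Python list and accepts any generator exposing a uniform(low, high) method as random_state; the exact array shapes ELFI expects are not covered here.

```python
import math
import random


class CustomUniform:
    """Hypothetical custom distribution on the interval [low, high]."""

    @staticmethod
    def rvs(low, high, size=1, random_state=None):
        # Fall back to a fresh generator when no random_state is supplied.
        rng = random_state if random_state is not None else random.Random()
        # Accept an int or a tuple for size; the result is a flat list.
        n = size if isinstance(size, int) else math.prod(size)
        return [rng.uniform(low, high) for _ in range(n)]

    @staticmethod
    def pdf(x, low, high):
        return 1.0 / (high - low) if low <= x <= high else 0.0

    @staticmethod
    def logpdf(x, low, high):
        p = CustomUniform.pdf(x, low, high)
        return math.log(p) if p > 0 else -math.inf


draws = CustomUniform.rvs(0.0, 2.0, size=5, random_state=random.Random(0))
```

Note that the distribution parameters (low, high) are passed positionally, in line with the note on positional arguments above.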

become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:

other_node (NodeReference) –

static compile_operation(state)

Compile a callable operation that samples the associated distribution.

Parameters:

state (dict) –

property distribution

Return the distribution object.

generate(batch_size=1, with_values=None)

Generate output from this node.

Useful for testing.

Parameters:
  • batch_size (int, optional) –

  • with_values (dict, optional) –

property parents

Get all positional parent nodes (inputs) of this node.

Returns:

parents – List of positional parents

Return type:

list

classmethod reference(name, model)

Construct a reference for an existing node in the model.

Parameters:
  • name (string) – name of the node

  • model (ElfiModel) –

Return type:

NodePointer instance

property size

Return the size of the output from the distribution.

property state

Return the state dictionary of the node.

class elfi.Simulator(fn, *params, **kwargs)

A simulator node of an ELFI graph.

Simulator nodes are stochastic and may have observed data in the model.

Initialize a Simulator.

Parameters:
  • fn (callable) – Simulator function with a signature sim(*params, batch_size, random_state)

  • params – Input parameters for the simulator.

  • kwargs
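A function with the signature sim(*params, batch_size, random_state) could look like the sketch below. The function gaussian_sim and its n_obs argument are hypothetical. For simplicity it treats mu and sigma as scalars and uses the standard library's random.Random as a stand-in generator; how ELFI batches the parameter values and which generator type it passes are not stated in this section.

```python
import random


def gaussian_sim(mu, sigma, batch_size=1, random_state=None, n_obs=10):
    """Return batch_size simulated data sets of n_obs Gaussian draws each."""
    rng = random_state if random_state is not None else random.Random()
    return [
        [rng.gauss(mu, sigma) for _ in range(n_obs)]
        for _ in range(batch_size)
    ]


# One row per simulation, one column per observation.
batch = gaussian_sim(0.0, 1.0, batch_size=3, random_state=random.Random(1))
```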

become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:

other_node (NodeReference) –

generate(batch_size=1, with_values=None)

Generate output from this node.

Useful for testing.

Parameters:
  • batch_size (int, optional) –

  • with_values (dict, optional) –

property parents

Get all positional parent nodes (inputs) of this node.

Returns:

parents – List of positional parents

Return type:

list

classmethod reference(name, model)

Construct a reference for an existing node in the model.

Parameters:
  • name (string) – name of the node

  • model (ElfiModel) –

Return type:

NodePointer instance

property state

Return the state dictionary of the node.

class elfi.Summary(fn, *parents, **kwargs)

A summary node of an ELFI graph.

Summary nodes are deterministic operations associated with the observed data. If their parents hold observed data, it will be automatically transformed.

Initialize a Summary.

Parameters:
  • fn (callable) – Summary function with a signature summary(*parents)

  • parents – Input data for the summary function.

  • kwargs
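A summary function with the signature summary(*parents) reduces each simulated data set to a statistic. The following minimal sketch (the name mean_summary is hypothetical, and plain lists stand in for arrays) applies the same function to simulated and observed data, which is the sense in which observed data is transformed alongside the simulations.

```python
from statistics import fmean


def mean_summary(data):
    """Reduce each data set (one row per simulation) to its mean."""
    return [fmean(row) for row in data]


simulated = [[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]]
observed = [[2.0, 4.0]]  # a single observed data set

sim_summary = mean_summary(simulated)
obs_summary = mean_summary(observed)
```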

become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:

other_node (NodeReference) –

generate(batch_size=1, with_values=None)

Generate output from this node.

Useful for testing.

Parameters:
  • batch_size (int, optional) –

  • with_values (dict, optional) –

property parents

Get all positional parent nodes (inputs) of this node.

Returns:

parents – List of positional parents

Return type:

list

classmethod reference(name, model)

Construct a reference for an existing node in the model.

Parameters:
  • name (string) – name of the node

  • model (ElfiModel) –

Return type:

NodePointer instance

property state

Return the state dictionary of the node.

class elfi.Discrepancy(discrepancy, *parents, **kwargs)

A discrepancy node of an ELFI graph.

This class provides a convenience node for custom distance operations.

Initialize a Discrepancy.

Parameters:
  • discrepancy (callable) – Signature of the discrepancy function is of the form: discrepancy(summary_1, summary_2, …, observed), where summaries are arrays containing batch_size simulated values and observed is a tuple (observed_summary_1, observed_summary_2, …). The callable object should return a vector of discrepancies between the simulated summaries and the observed summaries.

  • *parents – Typically the summaries for the discrepancy function.

  • **kwargs
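The calling convention for the discrepancy function can be sketched as follows. The function abs_discrepancy and the sample values are hypothetical; lists stand in for arrays. Each summary argument holds one value per simulation in the batch, observed is a tuple of the observed summaries, and the return value has one discrepancy per simulation.

```python
def abs_discrepancy(summary_1, summary_2, observed):
    """Sum of absolute differences to the observed summaries, per simulation."""
    obs_1, obs_2 = observed
    return [
        abs(s1 - obs_1) + abs(s2 - obs_2)
        for s1, s2 in zip(summary_1, summary_2)
    ]


sim_means = [0.5, 1.5, 2.0]  # batch of three simulated values of summary 1
sim_vars = [1.0, 0.5, 1.5]   # batch of three simulated values of summary 2
d = abs_discrepancy(sim_means, sim_vars, observed=(1.0, 1.0))
```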

See also

elfi.Distance

creating common distance discrepancies.

become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:

other_node (NodeReference) –

generate(batch_size=1, with_values=None)

Generate output from this node.

Useful for testing.

Parameters:
  • batch_size (int, optional) –

  • with_values (dict, optional) –

property parents

Get all positional parent nodes (inputs) of this node.

Returns:

parents – List of positional parents

Return type:

list

classmethod reference(name, model)

Construct a reference for an existing node in the model.

Parameters:
  • name (string) – name of the node

  • model (ElfiModel) –

Return type:

NodePointer instance

property state

Return the state dictionary of the node.

class elfi.Distance(distance, *summaries, **kwargs)

A convenience class for the discrepancy node.

Initialize a distance node of an ELFI graph.

This class contains many common distance implementations through scipy.

Parameters:
  • distance (str, callable) –

    If string it must be a valid metric from scipy.spatial.distance.cdist.

    If a callable, the signature must be distance(X, Y), where X is an n x m array containing n simulated values (summaries) in rows and Y is a 1 x m array that contains the observed values (summaries). The callable should return a vector of distances between the simulated summaries and the observed summaries.

  • *summaries – Summary nodes of the model.

  • **kwargs – Additional parameters may be required depending on the chosen distance. See the scipy documentation. (The support is not exhaustive.) ELFI-related kwargs are passed on to elfi.Discrepancy.

Examples

>>> d = elfi.Distance('euclidean', summary1, summary2...) 
>>> d = elfi.Distance('minkowski', summary, p=1) 

Notes

Your summaries need to be scalars or vectors for this method to work. The summaries will be first stacked to a single 2D array with the simulated summaries in the rows for every simulation and the distance is taken row wise against the corresponding observed summary vector.
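The stacking described in the note can be sketched in plain Python. The helpers stack and euclidean are hypothetical and only handle scalar summaries; nested lists stand in for the 2D arrays. The callable follows the distance(X, Y) convention given above, with X holding one row per simulation and Y a single observed row.

```python
import math


def stack(*summaries):
    """Stack per-simulation scalar summaries into rows (n x m)."""
    return [list(row) for row in zip(*summaries)]


def euclidean(X, Y):
    """Row-wise Euclidean distance between X (n x m) and Y (1 x m)."""
    y = Y[0]
    return [math.dist(x, y) for x in X]


s1 = [0.0, 3.0]  # summary 1 for two simulations
s2 = [0.0, 4.0]  # summary 2 for two simulations
distances = euclidean(stack(s1, s2), [[0.0, 0.0]])
```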

Scipy distances: https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html

See also

elfi.Discrepancy

A general discrepancy node

become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:

other_node (NodeReference) –

generate(batch_size=1, with_values=None)

Generate output from this node.

Useful for testing.

Parameters:
  • batch_size (int, optional) –

  • with_values (dict, optional) –

property parents

Get all positional parent nodes (inputs) of this node.

Returns:

parents – List of positional parents

Return type:

list

classmethod reference(name, model)

Construct a reference for an existing node in the model.

Parameters:
  • name (string) – name of the node

  • model (ElfiModel) –

Return type:

NodePointer instance

property state

Return the state dictionary of the node.

class elfi.AdaptiveDistance(*summaries, **kwargs)

Euclidean (2-norm) distance calculation with adaptive scale.

Summary statistics are normalised to vary on similar scales.

References

Prangle D (2017). Adapting the ABC Distance Function. Bayesian Analysis 12(1):289-309, 2017. https://projecteuclid.org/euclid.ba/1460641065

Initialize an AdaptiveDistance.

Parameters:
  • *summaries – Summary nodes of the model.

  • **kwargs

Notes

Your summaries need to be scalars or vectors for this method to work. The summaries will be first stacked to a single 2D array with the simulated summaries in the rows for every simulation and the distances are taken row wise against the corresponding observed summary vector.
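To see why normalising the summaries matters, consider the sketch below, which divides each summary difference by a per-summary scale before taking the 2-norm. The function scaled_euclidean and the scale values are hypothetical; this is an illustration of a scaled Euclidean distance in general, not ELFI's implementation.

```python
import math


def scaled_euclidean(X, y, scales):
    """Row-wise Euclidean distance after dividing each summary by its scale."""
    return [
        math.sqrt(sum(((xi - yi) / s) ** 2 for xi, yi, s in zip(x, y, scales)))
        for x in X
    ]


# The second summary varies on a scale roughly 100 times larger than the first.
X = [[3.0, 400.0], [0.0, 0.0]]
d = scaled_euclidean(X, y=[0.0, 0.0], scales=[1.0, 100.0])
```

Without the scaling, the second summary would dominate the distance of the first row (about 400 instead of 5).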

add_data(*data)

Add summaries data to update estimated standard deviation.

Parameters:

*data – Summary nodes output data.

Notes

Standard deviation is computed with Welford’s online algorithm.
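For reference, Welford's online algorithm updates a running mean and a running sum of squared deviations one value at a time, so the standard deviation can be refreshed without storing earlier data. The class below is a generic sketch with a hypothetical name (RunningStd), not ELFI's code, and it reports the population standard deviation.

```python
import math


class RunningStd:
    """Running mean and standard deviation via Welford's online algorithm."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Population standard deviation; divide by (n - 1) for the sample version.
        return math.sqrt(self.m2 / self.n) if self.n > 0 else float("nan")


acc = RunningStd()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    acc.add(value)
```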

become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:

other_node (NodeReference) –

generate(batch_size=1, with_values=None)

Generate output from this node.

Useful for testing.

Parameters:
  • batch_size (int, optional) –

  • with_values (dict, optional) –

init_adaptation_round()

Initialise data stores to start a new adaptation round.

init_state()

Initialise adaptive distance state.

nested_distance(u, v)

Compute distance between simulated and observed summaries.

Parameters:
  • u (ndarray) – 2D array with M x (num summaries) observations

  • v (ndarray) – 2D array with 1 x (num summaries) observations

Returns:

2D array with M x (num distance functions) distances

Return type:

ndarray

property parents

Get all positional parent nodes (inputs) of this node.

Returns:

parents – List of positional parents

Return type:

list

classmethod reference(name, model)

Construct a reference for an existing node in the model.

Parameters:
  • name (string) – name of the node

  • model (ElfiModel) –

Return type:

NodePointer instance

property state

Return the state dictionary of the node.

update_distance()

Update distance based on accumulated summaries data.

Other

elfi.new_model(name=None, set_default=True)

Create a new ElfiModel instance.

In addition to making a new ElfiModel instance, this method sets the new instance as the default for new nodes.

Parameters:
  • name (str, optional) –

  • set_default (bool, optional) – Whether to set the newly created model as the current model.

elfi.load_model(name, prefix=None, set_default=True)

Load the pickled ElfiModel.

Assumes there exists a file “name.pkl” in the current directory. Also sets the loaded model as the default model for new nodes.

Parameters:
  • name (str) – Name of the model file to load (without the .pkl extension).

  • prefix (str) – Path to directory where the model file is located, optional.

  • set_default (bool, optional) – Set the loaded model as the default model. Default is True.

Return type:

ElfiModel

elfi.get_default_model()

Return the current default ElfiModel instance.

New nodes will be added to this model by default.

elfi.set_default_model(model=None)

Set the current default ElfiModel instance.

New nodes will be placed in the given model by default.

Parameters:

model (ElfiModel, optional) – If None, creates a new ElfiModel.

elfi.draw(G, internal=False, param_names=False, filename=None, format=None)

Draw the ElfiModel.

Parameters:
  • G (nx.DiGraph or ElfiModel) – Graph or model to draw

  • internal (boolean, optional) – Whether to draw internal nodes (starting with an underscore)

  • param_names (bool, optional) – Show param names on edges

  • filename (str, optional) – If given, save the dot file into the given filename.

  • format (str, optional) – format of the file

Notes

Requires the optional ‘graphviz’ library.

Returns:

A GraphViz dot representation of the model.

Return type:

dot

elfi.plot_params_vs_node(node, n_samples=100, func=None, seed=None, axes=None, **kwargs)

Plot some realizations of parameters vs. node.

Useful e.g. for exploring how a summary statistic varies with parameters. Currently only nodes with scalar output are supported, though a function func can be given to reduce node output. This allows giving the simulator as the node and applying a summarizing function without incorporating it into the ELFI graph.

If node is one of the model parameters, its histogram is plotted.

Parameters:
  • node (elfi.NodeReference) – The node to evaluate. Its output must be scalar (shape=(batch_size,1)).

  • n_samples (int, optional) – How many samples to plot.

  • func (callable, optional) – A function to apply to node output.

  • seed (int, optional) –

  • axes (one or an iterable of plt.Axes, optional) –

Returns:

axes

Return type:

np.array of plt.Axes

Inference API classes

class elfi.Rejection(model, discrepancy_name=None, output_names=None, **kwargs)

Parallel ABC rejection sampler.

For a description of the rejection sampler and a general introduction to ABC, see e.g. Lintusaari et al. 2016.

References

Lintusaari J, Gutmann M U, Dutta R, Kaski S, Corander J (2016). Fundamentals and Recent Developments in Approximate Bayesian Computation. Systematic Biology. http://dx.doi.org/10.1093/sysbio/syw077.

Initialize the Rejection sampler.

Parameters:
  • model (ElfiModel or NodeReference) –

  • discrepancy_name (str, NodeReference, optional) – Only needed if model is an ElfiModel

  • output_names (list, optional) – Additional outputs from the model to be included in the inference result, e.g. corresponding summaries to the acquired samples

  • kwargs – See ParameterInference

property batch_size

Return the current batch_size.

extract_result()

Extract the result from the current state.

Returns:

result

Return type:

Sample

property finished

Check whether the objective of n_batches has been reached.

infer(*args, vis=None, bar=True, **kwargs)

Set the objective and start the iterate loop until the inference is finished.

See the other arguments from the set_objective method.

Parameters:
  • vis (dict, optional) – Plotting options. More info in self.plot_state method

  • bar (bool, optional) – Flag to remove (False) or keep (True) the progress bar from/in output.

Returns:

result

Return type:

Sample

iterate()

Advance the inference by one iteration.

This is a way to manually progress the inference. One iteration consists of waiting and processing the result of the next batch in succession and possibly submitting new batches.

Notes

If the next batch is ready, it will be processed immediately and no new batches are submitted.

New batches are submitted only while waiting for the next one to complete. There will never be more batches submitted in parallel than the max_parallel_batches setting allows.

Return type:

None

property parameter_names

Return the parameters to be inferred.

plot_state(**options)

Plot the current state of the inference algorithm.

This feature is still experimental and only supports 1d or 2d cases.

property pool

Return the output pool of the inference.

prepare_new_batch(batch_index)

Prepare values for a new batch.

ELFI calls this method before submitting a new batch with an increasing index batch_index. This is an optional method to override. Use this if you need to do preparations; e.g. in a Bayesian optimization algorithm, the next acquisition points would be acquired here.

If you need to provide values for certain nodes, you can do so by constructing a batch dictionary and returning it. See e.g. BayesianOptimization for an example.

Parameters:

batch_index (int) – next batch_index to be submitted

Returns:

batch – Keys should match node names in the model. These values will override any default values or operations in those nodes.

Return type:

dict or None
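The return contract can be sketched as a standalone function. Everything here is hypothetical: the node name "mu", the precomputed list ACQUISITION_POINTS, and the choice to write it as a free function rather than a method override. It returns a dict keyed by node name when there is a value to inject, and None otherwise.

```python
# Hypothetical precomputed points, one per batch.
ACQUISITION_POINTS = [0.0, 0.5, 1.0]


def prepare_new_batch(batch_index):
    """Return values for node 'mu' for this batch, or None to override nothing."""
    if batch_index >= len(ACQUISITION_POINTS):
        return None
    return {"mu": [ACQUISITION_POINTS[batch_index]]}


first = prepare_new_batch(0)
```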

sample(n_samples, *args, **kwargs)

Sample from the approximate posterior.

See the other arguments from the set_objective method.

Parameters:
  • n_samples (int) – Number of samples to generate from the (approximate) posterior

  • *args

  • **kwargs

Returns:

result

Return type:

Sample

property seed

Return the seed of the inference.

set_objective(n_samples, threshold=None, quantile=None, n_sim=None)

Set objective for inference.

Parameters:
  • n_samples (int) – number of samples to generate

  • threshold (float) – Acceptance threshold

  • quantile (float) – Must lie in the open interval (0, 1). Defines the threshold as the p-quantile of all the simulations, so that n_sim = n_samples/quantile.

  • n_sim (int) – Total number of simulations. The threshold will be the n_samples-th smallest discrepancy among n_sim simulations.
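The relationship between n_samples, quantile, n_sim and the threshold can be checked with a small standalone sketch. The helper rejection_threshold and the synthetic discrepancies are hypothetical; this illustrates the arithmetic described above and does not call ELFI.

```python
import random


def rejection_threshold(discrepancies, n_samples):
    """Return the n_samples-th smallest discrepancy as the acceptance threshold."""
    return sorted(discrepancies)[n_samples - 1]


n_samples, quantile = 10, 0.1
n_sim = round(n_samples / quantile)  # total number of simulations

# Synthetic discrepancies standing in for simulator output.
rng = random.Random(0)
discrepancies = [abs(rng.gauss(0.0, 1.0)) for _ in range(n_sim)]

threshold = rejection_threshold(discrepancies, n_samples)
accepted = [d for d in discrepancies if d <= threshold]
```

With a quantile of 0.1, keeping 10 samples requires 100 simulations, and the accepted set consists of the 10 smallest discrepancies.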

update(batch, batch_index)

Update the inference state with a new batch.

Parameters:
  • batch (dict) – dict with self.outputs as keys and the corresponding outputs for the batch as values

  • batch_index (int) –

class elfi.SMC(model, discrepancy_name=None, output_names=None, **kwargs)

Sequential Monte Carlo ABC sampler.

Initialize the SMC-ABC sampler.

Parameters:
  • model (ElfiModel or NodeReference) –

  • discrepancy_name (str, NodeReference, optional) – Only needed if model is an ElfiModel

  • output_names (list, optional) – Additional outputs from the model to be included in the inference result, e.g. corresponding summaries to the acquired samples

  • kwargs – See ParameterInference

property batch_size

Return the current batch_size.

property current_population_threshold

Return the threshold for current population.

extract_result()

Extract the result from the current state.

Return type:

SmcSample

property finished

Check whether the objective of n_batches has been reached.

infer(*args, vis=None, bar=True, **kwargs)

Set the objective and start the iterate loop until the inference is finished.

See the other arguments from the set_objective method.

Parameters:
  • vis (dict, optional) – Plotting options. More info in self.plot_state method

  • bar (bool, optional) – Flag to remove (False) or keep (True) the progress bar from/in output.

Returns:

result

Return type:

Sample

iterate()

Advance the inference by one iteration.

This is a way to manually progress the inference. One iteration consists of waiting and processing the result of the next batch in succession and possibly submitting new batches.

Notes

If the next batch is ready, it will be processed immediately and no new batches are submitted.

New batches are submitted only while waiting for the next one to complete. There will never be more batches submitted in parallel than the max_parallel_batches setting allows.

Return type:

None

property parameter_names

Return the parameters to be inferred.

plot_state(**kwargs)

Plot the current state of the algorithm.

Parameters:
  • axes (matplotlib.axes.Axes (optional)) –

  • figure (matplotlib.figure.Figure (optional)) –

  • xlim – x-axis limits

  • ylim – y-axis limits

  • interactive (bool (default False)) – If true, uses IPython.display to update the cell figure

  • close – Close the figure at the end of plotting. Used at the end of interactive mode.

Return type:

None

property pool

Return the output pool of the inference.

prepare_new_batch(batch_index)

Prepare values for a new batch.

Parameters:

batch_index (int) – next batch_index to be submitted

Returns:

batch – Keys should match node names in the model. These values will override any default values or operations in those nodes.

Return type:

dict or None

sample(n_samples, *args, **kwargs)

Sample from the approximate posterior.

See the other arguments from the set_objective method.

Parameters:
  • n_samples (int) – Number of samples to generate from the (approximate) posterior

  • *args

  • **kwargs

Returns:

result

Return type:

Sample

property seed

Return the seed of the inference.

set_objective(n_samples, thresholds=None, quantiles=None)

Set objective for ABC-SMC inference.

Parameters:
  • n_samples (int) – Number of samples to generate

  • thresholds (list, optional) – List of thresholds for ABC-SMC

  • quantiles (list, optional) – List of selection quantiles used to determine sample thresholds

update(batch, batch_index)

Update the inference state with a new batch.

Parameters:
  • batch (dict) – dict with self.outputs as keys and the corresponding outputs for the batch as values

  • batch_index (int) –

class elfi.AdaptiveDistanceSMC(model, discrepancy_name=None, output_names=None, **kwargs)

SMC-ABC sampler with adaptive threshold and distance function.

Notes

Algorithm 5 in Prangle (2017)

References

Prangle D (2017). Adapting the ABC Distance Function. Bayesian Analysis 12(1):289-309, 2017. https://projecteuclid.org/euclid.ba/1460641065

Initialize the adaptive distance SMC-ABC sampler.

Parameters:
  • model (ElfiModel or NodeReference) –

  • discrepancy_name (str, NodeReference, optional) – Only needed if model is an ElfiModel

  • output_names (list, optional) – Additional outputs from the model to be included in the inference result, e.g. corresponding summaries to the acquired samples

  • kwargs – See ParameterInference

property batch_size

Return the current batch_size.

property current_population_threshold

Return the threshold for current population.

extract_result()

Extract the result from the current state.

Return type:

SmcSample

property finished

Check whether the objective of n_batches has been reached.

infer(*args, vis=None, bar=True, **kwargs)

Set the objective and start the iterate loop until the inference is finished.

See the other arguments from the set_objective method.

Parameters:
  • vis (dict, optional) – Plotting options. More info in self.plot_state method

  • bar (bool, optional) – Flag to remove (False) or keep (True) the progress bar from/in output.

Returns:

result

Return type:

Sample

iterate()

Advance the inference by one iteration.

This is a way to manually progress the inference. One iteration consists of waiting and processing the result of the next batch in succession and possibly submitting new batches.

Notes

If the next batch is ready, it will be processed immediately and no new batches are submitted.

New batches are submitted only while waiting for the next one to complete. There will never be more batches submitted in parallel than the max_parallel_batches setting allows.

Return type:

None
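The manual loop described above can be sketched with a toy stand-in. ToyInference is a hypothetical class that only mimics the finished, iterate and extract_result members documented here; with a real ELFI method the objective would be set (via set_objective) before iterating.

```python
class ToyInference:
    """Hypothetical stand-in exposing finished / iterate / extract_result."""

    def __init__(self, n_batches):
        self.n_batches = n_batches  # the objective
        self.processed = 0          # batches handled so far

    @property
    def finished(self):
        return self.processed >= self.n_batches

    def iterate(self):
        # A real method would wait for and process the next batch here.
        self.processed += 1

    def extract_result(self):
        return {"n_batches": self.processed}


method = ToyInference(n_batches=3)
while not method.finished:
    method.iterate()
result = method.extract_result()
```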

property parameter_names

Return the parameters to be inferred.

plot_state(**kwargs)

Plot the current state of the algorithm.

Parameters:
  • axes (matplotlib.axes.Axes (optional)) –

  • figure (matplotlib.figure.Figure (optional)) –

  • xlim – x-axis limits

  • ylim – y-axis limits

  • interactive (bool (default False)) – If true, uses IPython.display to update the cell figure

  • close – Close the figure at the end of plotting. Used at the end of interactive mode.

Return type:

None

property pool

Return the output pool of the inference.

prepare_new_batch(batch_index)

Prepare values for a new batch.

Parameters:

batch_index (int) – next batch_index to be submitted

Returns:

batch – Keys should match the node names in the model. These values will override any default values or operations in those nodes.

Return type:

dict or None

sample(n_samples, *args, **kwargs)

Sample from the approximate posterior.

See the other arguments from the set_objective method.

Parameters:
  • n_samples (int) – Number of samples to generate from the (approximate) posterior

  • *args

  • **kwargs

Returns:

result

Return type:

Sample

property seed

Return the seed of the inference.

set_objective(n_samples, rounds, quantile=0.5)[source]

Set objective for adaptive distance ABC-SMC inference.

Parameters:
  • n_samples (int) – Number of samples to generate

  • rounds (int) – Number of populations to sample

  • quantile (float, optional) – Selection quantile used to determine sample thresholds
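To make the role of a selection quantile concrete, here is a minimal sketch that picks a threshold so that roughly the given fraction of the current discrepancies is accepted. The function name and the order-statistic rule are assumptions; ELFI may compute the quantile differently (for example with interpolation).

```python
import math


def quantile_threshold(discrepancies, quantile=0.5):
    """Smallest value that accepts about `quantile` of the discrepancies."""
    ordered = sorted(discrepancies)
    k = max(1, math.ceil(quantile * len(ordered)))
    return ordered[k - 1]


current = [5.0, 1.0, 4.0, 2.0, 3.0, 6.0]  # hypothetical discrepancies
threshold = quantile_threshold(current, quantile=0.5)
accepted = [d for d in current if d <= threshold]
```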

update(batch, batch_index)

Update the inference state with a new batch.

Parameters:
  • batch (dict) – dict with self.outputs as keys and the corresponding outputs for the batch as values

  • batch_index (int) –

class elfi.AdaptiveThresholdSMC(model, discrepancy_name=None, output_names=None, initial_quantile=0.2, q_threshold=0.99, densratio_estimation=None, **kwargs)[source]

ABC-SMC sampler with adaptive threshold selection.

References

Simola U, Cisewski-Kehe J, Gutmann M U, Corander J (2021). Adaptive Approximate Bayesian Computation Tolerance Selection. Bayesian Analysis. https://doi.org/10.1214/20-BA1211

Initialize the adaptive threshold SMC-ABC sampler.

Parameters:
  • model (ElfiModel or NodeReference) –

  • discrepancy_name (str, NodeReference, optional) – Only needed if model is an ElfiModel

  • output_names (list, optional) – Additional outputs from the model to be included in the inference result, e.g. corresponding summaries to the acquired samples

  • initial_quantile (float, optional) – Initial selection quantile for the first round of adaptive-ABC-SMC

  • q_threshold (float, optional) – Termination criterion for adaptive-ABC-SMC

  • densratio_estimation (DensityRatioEstimation, optional) – Density ratio estimation object defining parameters for KLIEP

  • kwargs – See ParameterInference

property batch_size

Return the current batch_size.

property current_population_threshold

Return the threshold for current population.

extract_result()

Extract the result from the current state.

Return type:

SmcSample

property finished

Check whether the objective of n_batches has been reached.

infer(*args, vis=None, bar=True, **kwargs)

Set the objective and start the iterate loop until the inference is finished.

See the other arguments from the set_objective method.

Parameters:
  • vis (dict, optional) – Plotting options. More info in self.plot_state method

  • bar (bool, optional) – Flag to remove (False) or keep (True) the progress bar from/in output.

Returns:

result

Return type:

Sample

iterate()

Advance the inference by one iteration.

This is a way to manually progress the inference. One iteration consists of waiting and processing the result of the next batch in succession and possibly submitting new batches.

Notes

If the next batch is ready, it will be processed immediately and no new batches are submitted.

New batches are submitted only while waiting for the next one to complete. There will never be more batches submitted in parallel than the max_parallel_batches setting allows.

Return type:

None

property parameter_names

Return the parameters to be inferred.

plot_state(**kwargs)

Plot the current state of the algorithm.

Parameters:
  • axes (matplotlib.axes.Axes (optional)) –

  • figure (matplotlib.figure.Figure (optional)) –

  • xlim – x-axis limits

  • ylim – y-axis limits

  • interactive (bool (default False)) – If true, uses IPython.display to update the cell figure

  • close – Close the figure at the end of plotting. Used at the end of interactive mode.

Return type:

None

property pool

Return the output pool of the inference.

prepare_new_batch(batch_index)

Prepare values for a new batch.

Parameters:

batch_index (int) – next batch_index to be submitted

Returns:

batch – Keys should match the node names in the model. These values will override any default values or operations in those nodes.

Return type:

dict or None

sample(n_samples, *args, **kwargs)

Sample from the approximate posterior.

See the other arguments from the set_objective method.

Parameters:
  • n_samples (int) – Number of samples to generate from the (approximate) posterior

  • *args

  • **kwargs

Returns:

result

Return type:

Sample

property seed

Return the seed of the inference.

set_objective(n_samples, max_iter=10)[source]

Set objective for ABC-SMC inference.

Parameters:
  • n_samples (int) – Number of samples to generate

  • thresholds (list, optional) – List of thresholds for ABC-SMC

  • max_iter (int, optional) – Maximum number of iterations

update(batch, batch_index)[source]

Update the inference state with a new batch.

Parameters:
  • batch (dict) – dict with self.outputs as keys and the corresponding outputs for the batch as values

  • batch_index (int) –

class elfi.BayesianOptimization(model, target_name=None, bounds=None, initial_evidence=None, update_interval=10, target_model=None, acquisition_method=None, acq_noise_var=0, exploration_rate=10, batch_size=1, batches_per_acquisition=None, async_acq=False, **kwargs)[source]

Bayesian Optimization of an unknown target function.

Initialize Bayesian optimization.

Parameters:
  • model (ElfiModel or NodeReference) –

  • target_name (str or NodeReference) – Only needed if model is an ElfiModel

  • bounds (dict, optional) – The region where to estimate the posterior for each parameter in model.parameters: {'parameter_name': (lower, upper), ...}. Not used if a custom target_model is given.

  • initial_evidence (int, dict, optional) – Number of initial evidence or a precomputed batch dict containing parameter and discrepancy values. Default value depends on the dimensionality.

  • update_interval (int, optional) – How often to update the GP hyperparameters of the target_model

  • target_model (GPyRegression, optional) –

  • acquisition_method (Acquisition, optional) – Method of acquiring evidence points. Defaults to LCBSC.

  • acq_noise_var (float or dict, optional) – Variance(s) of the noise added in the default LCBSC acquisition method. If a dictionary, values should be float specifying the variance for each dimension.

  • exploration_rate (float, optional) – Exploration rate of the acquisition method

  • batch_size (int, optional) – Elfi batch size. Defaults to 1.

  • batches_per_acquisition (int, optional) – How many batches will be requested from the acquisition function at one go. Defaults to max_parallel_batches.

  • async_acq (bool, optional) – Allow acquisitions to be made asynchronously, i.e. do not wait for all the results from the previous acquisition before making the next. This can be more efficient with a large number of workers (e.g. in cluster environments) but forgoes the guarantee of exactly the same result given the same initial conditions (e.g. the seed). Default False.

  • **kwargs
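The shape of the bounds argument can be illustrated without ELFI. In the sketch below the parameter names t1 and t2 are hypothetical, and the uniform draws only show how a box defined by bounds can be used; this is not a statement about how ELFI generates its initial evidence.

```python
import random

# One (lower, upper) pair per parameter name (hypothetical names).
bounds = {"t1": (-2.0, 2.0), "t2": (0.0, 5.0)}


def draw_initial_points(bounds, n, seed=0):
    """Draw n points uniformly inside the box defined by bounds."""
    rng = random.Random(seed)
    return [{name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
            for _ in range(n)]


points = draw_initial_points(bounds, n=5)
```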

property acq_batch_size

Return the total number of acquisitions per iteration.

property batch_size

Return the current batch_size.

extract_result()[source]

Extract the result from the current state.

Return type:

OptimizationResult

property finished

Check whether the objective of n_batches has been reached.

infer(*args, vis=None, bar=True, **kwargs)

Set the objective and start the iterate loop until the inference is finished.

See the other arguments from the set_objective method.

Parameters:
  • vis (dict, optional) – Plotting options. More info in self.plot_state method

  • bar (bool, optional) – Flag to remove (False) or keep (True) the progress bar from/in output.

Returns:

result

Return type:

Sample

iterate()

Advance the inference by one iteration.

This is a way to manually progress the inference. One iteration consists of waiting and processing the result of the next batch in succession and possibly submitting new batches.

Notes

If the next batch is ready, it will be processed immediately and no new batches are submitted.

New batches are submitted only while waiting for the next one to complete. There will never be more batches submitted in parallel than the max_parallel_batches setting allows.

Return type:

None

property n_evidence

Return the number of acquired evidence points.

property parameter_names

Return the parameters to be inferred.

plot_discrepancy(axes=None, **kwargs)[source]

Plot acquired parameters vs. resulting discrepancy.

Parameters:

axes (plt.Axes or arraylike of plt.Axes) –

Returns:

axes

Return type:

np.array of plt.Axes

plot_gp(axes=None, resol=50, const=None, bounds=None, true_params=None, **kwargs)[source]

Plot pairwise relationships as a matrix with parameters vs. discrepancy.

Parameters:
  • axes (matplotlib.axes.Axes, optional) –

  • resol (int, optional) – Resolution of the plotted grid.

  • const (np.array, optional) – Values for parameters in plots where held constant. Defaults to minimum evidence.

  • bounds (list of tuples, optional) – List of tuples for axis boundaries.

  • true_params (dict, optional) – Dictionary containing parameter names with corresponding true parameter values.

Returns:

axes

Return type:

np.array of plt.Axes

plot_state(**options)[source]

Plot the GP surface.

This feature is still experimental and currently supports only 2D cases.

property pool

Return the output pool of the inference.

prepare_new_batch(batch_index)[source]

Prepare values for a new batch.

Parameters:

batch_index (int) – next batch_index to be submitted

Returns:

batch – Keys should match the node names in the model. These values will override any default values or operations in those nodes.

Return type:

dict or None

property seed

Return the seed of the inference.

set_objective(n_evidence=None)[source]

Set objective for inference.

You can continue BO by giving a larger n_evidence.

Parameters:

n_evidence (int) – Number of total evidence for the GP fitting. This includes any initial evidence.

update(batch, batch_index)[source]

Update the GP regression model of the target node with a new batch.

Parameters:
  • batch (dict) – dict with self.outputs as keys and the corresponding outputs for the batch as values

  • batch_index (int) –

class elfi.BOLFI(model, target_name=None, bounds=None, initial_evidence=None, update_interval=10, target_model=None, acquisition_method=None, acq_noise_var=0, exploration_rate=10, batch_size=1, batches_per_acquisition=None, async_acq=False, **kwargs)[source]

Bayesian Optimization for Likelihood-Free Inference (BOLFI).

Approximates the discrepancy function by a stochastic regression model. Discrepancy model is fit by sampling the discrepancy function at points decided by the acquisition function.

The method implements the framework introduced in Gutmann & Corander, 2016.

References

Gutmann M U, Corander J (2016). Bayesian Optimization for Likelihood-Free Inference of Simulator-Based Statistical Models. JMLR 17(125):1−47, 2016. http://jmlr.org/papers/v17/15-017.html

Initialize Bayesian optimization.

Parameters:
  • model (ElfiModel or NodeReference) –

  • target_name (str or NodeReference) – Only needed if model is an ElfiModel

  • bounds (dict, optional) – The region where to estimate the posterior for each parameter in model.parameters: {'parameter_name': (lower, upper), ...}. Not used if a custom target_model is given.

  • initial_evidence (int, dict, optional) – Number of initial evidence or a precomputed batch dict containing parameter and discrepancy values. Default value depends on the dimensionality.

  • update_interval (int, optional) – How often to update the GP hyperparameters of the target_model

  • target_model (GPyRegression, optional) –

  • acquisition_method (Acquisition, optional) – Method of acquiring evidence points. Defaults to LCBSC.

  • acq_noise_var (float or dict, optional) – Variance(s) of the noise added in the default LCBSC acquisition method. If a dictionary, values should be float specifying the variance for each dimension.

  • exploration_rate (float, optional) – Exploration rate of the acquisition method

  • batch_size (int, optional) – Elfi batch size. Defaults to 1.

  • batches_per_acquisition (int, optional) – How many batches will be requested from the acquisition function at one go. Defaults to max_parallel_batches.

  • async_acq (bool, optional) – Allow acquisitions to be made asynchronously, i.e. do not wait for all the results from the previous acquisition before making the next. This can be more efficient with a large number of workers (e.g. in cluster environments) but forgoes the guarantee of exactly the same result given the same initial conditions (e.g. the seed). Default False.

  • **kwargs

property acq_batch_size

Return the total number of acquisitions per iteration.

property batch_size

Return the current batch_size.

extract_posterior(threshold=None)[source]

Return an object representing the approximate posterior.

The approximation is based on surrogate model regression.

Parameters:

threshold (float, optional) – Discrepancy threshold for creating the posterior (given on the log scale if the discrepancy is a log discrepancy).

Returns:

posterior

Return type:

elfi.methods.posteriors.BolfiPosterior

extract_result()

Extract the result from the current state.

Return type:

OptimizationResult

property finished

Check whether the objective of n_batches has been reached.

fit(n_evidence, threshold=None, bar=True)[source]

Fit the surrogate model.

Generates a regression model for the discrepancy given the parameters.

Currently only Gaussian processes are supported as surrogate models.

Parameters:
  • n_evidence (int, required) – Number of evidence for fitting

  • threshold (float, optional) – Discrepancy threshold for creating the posterior (given on the log scale if the discrepancy is a log discrepancy).

  • bar (bool, optional) – Flag to remove (False) the progress bar from output.

infer(*args, vis=None, bar=True, **kwargs)

Set the objective and start the iterate loop until the inference is finished.

See the other arguments from the set_objective method.

Parameters:
  • vis (dict, optional) – Plotting options. More info in self.plot_state method

  • bar (bool, optional) – Flag to remove (False) or keep (True) the progress bar from/in output.

Returns:

result

Return type:

Sample

iterate()

Advance the inference by one iteration.

This is a way to manually progress the inference. One iteration consists of waiting and processing the result of the next batch in succession and possibly submitting new batches.

Notes

If the next batch is ready, it will be processed immediately and no new batches are submitted.

New batches are submitted only while waiting for the next one to complete. There will never be more batches submitted in parallel than the max_parallel_batches setting allows.

Return type:

None

property n_evidence

Return the number of acquired evidence points.

property parameter_names

Return the parameters to be inferred.

plot_discrepancy(axes=None, **kwargs)

Plot acquired parameters vs. resulting discrepancy.

Parameters:

axes (plt.Axes or arraylike of plt.Axes) –

Returns:

axes

Return type:

np.array of plt.Axes

plot_gp(axes=None, resol=50, const=None, bounds=None, true_params=None, **kwargs)

Plot pairwise relationships as a matrix with parameters vs. discrepancy.

Parameters:
  • axes (matplotlib.axes.Axes, optional) –

  • resol (int, optional) – Resolution of the plotted grid.

  • const (np.array, optional) – Values for parameters in plots where held constant. Defaults to minimum evidence.

  • bounds (list of tuples, optional) – List of tuples for axis boundaries.

  • true_params (dict, optional) – Dictionary containing parameter names with corresponding true parameter values.

Returns:

axes

Return type:

np.array of plt.Axes

plot_state(**options)

Plot the GP surface.

This feature is still experimental and currently supports only 2D cases.

property pool

Return the output pool of the inference.

prepare_new_batch(batch_index)

Prepare values for a new batch.

Parameters:

batch_index (int) – next batch_index to be submitted

Returns:

batch – Keys should match the node names in the model. These values will override any default values or operations in those nodes.

Return type:

dict or None

sample(n_samples, warmup=None, n_chains=4, threshold=None, initials=None, algorithm='nuts', sigma_proposals=None, n_evidence=None, **kwargs)[source]

Sample the posterior distribution of BOLFI.

Here the likelihood is defined through the cumulative distribution function F of the standard normal distribution:

$$L(\theta) \propto F\left(\frac{h - \mu(\theta)}{\sigma(\theta)}\right)$$

where $h$ is the threshold, and $\mu(\theta)$ and $\sigma(\theta)$ are the posterior mean and (noisy) standard deviation of the associated Gaussian process.
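The likelihood expression above can be evaluated with the standard library alone. This is a minimal sketch: the function names are hypothetical, and mu and sigma are plain numbers standing in for the Gaussian process posterior mean and standard deviation at a given theta.

```python
import math


def std_normal_cdf(x):
    """Cumulative distribution function F of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def unnormalized_likelihood(h, mu, sigma):
    """F((h - mu) / sigma), i.e. the likelihood up to a constant."""
    return std_normal_cdf((h - mu) / sigma)


# A predicted discrepancy below the threshold yields a larger value.
low_discrepancy = unnormalized_likelihood(h=1.0, mu=0.0, sigma=1.0)
high_discrepancy = unnormalized_likelihood(h=1.0, mu=2.0, sigma=1.0)
```

Parameter values whose predicted discrepancy lies well below the threshold h receive values close to 1, and values well above it receive values close to 0.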

The sampling is performed with an MCMC sampler, by default the No-U-Turn Sampler (NUTS); see the algorithm parameter.

Parameters:
  • n_samples (int) – Number of requested samples from the posterior for each chain. This includes warmup, and note that the effective sample size is usually considerably smaller.

  • warmup (int, optional) – Length of warmup sequence in MCMC sampling. Defaults to n_samples//2.

  • n_chains (int, optional) – Number of independent chains.

  • threshold (float, optional) – The threshold (bandwidth) for the posterior (given as a log value if the discrepancy is a log discrepancy).

  • initials (np.array of shape (n_chains, n_params), optional) – Initial values for the sampled parameters for each chain. Defaults to best evidence points.

  • algorithm (string, optional) – Sampling algorithm to use. Currently ‘nuts’ (default) and ‘metropolis’ are supported.

  • sigma_proposals (dict, optional) – Standard deviations for Gaussian proposals of each parameter for Metropolis Markov Chain sampler. Defaults to 1/10 of surrogate model bound lengths.

  • n_evidence (int) – If the regression model is not fitted yet, specify the amount of evidence

Return type:

BolfiSample

property seed

Return the seed of the inference.

set_objective(n_evidence=None)

Set objective for inference.

You can continue BO by giving a larger n_evidence.

Parameters:

n_evidence (int) – Number of total evidence for the GP fitting. This includes any initial evidence.

update(batch, batch_index)

Update the GP regression model of the target node with a new batch.

Parameters:
  • batch (dict) – dict with self.outputs as keys and the corresponding outputs for the batch as values

  • batch_index (int) –

class elfi.ROMC(model: ElfiModel | NodeReference, bounds: List | None = None, discrepancy_name: str | None = None, output_names: List[str] | None = None, custom_optim_class=None, parallelize: bool = False, **kwargs)[source]

Robust Optimisation Monte Carlo inference method.

Ikonomov, B., & Gutmann, M. U. (2019). Robust Optimisation Monte Carlo. http://arxiv.org/abs/1904.00670

Class constructor.

Parameters:
  • model (Model or NodeReference) – the elfi model or the output node of the graph

  • bounds (List[(start,stop), ...]) – bounds of the n-dim bounding box area containing the mass of the posterior

  • discrepancy_name (string, optional) – the name of the output node (required only if an ElfiModel is passed as model)

  • output_names (List[string]) – which node values to store during inference

  • custom_optim_class (class) – Custom OptimizationProblem class provided by the user, to extend the algorithm

  • parallelize (bool) – whether to parallelize all parts of the algorithm

  • kwargs (Dict) – other named parameters

property batch_size

Return the current batch_size.

compute_divergence(gt_posterior, bounds=None, step=0.1, distance='Jensen-Shannon')[source]

Compute divergence between ROMC posterior and ground-truth.

Parameters:
  • gt_posterior (Callable) – ground-truth posterior, must accept input in a batched fashion (np.ndarray with shape: (BS,D))

  • bounds (List[(start, stop)]) – if bounds are not passed at the ROMC constructor, they can be passed here

  • step (float) –

  • distance (str) – which distance to use. Must be one of [“Jensen-Shannon”, “KL-Divergence”]

Returns:

The computed divergence between the distributions

Return type:

float
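For intuition, a divergence between two densities can be approximated by evaluating both on a grid with spacing step and normalising the values. The sketch below does this for the Jensen-Shannon divergence in one dimension. All names are hypothetical; whether ELFI reports the divergence or its square root, and which logarithm base it uses, is not specified here.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) of two discrete distributions."""
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


step = 0.1
grid = [-3.0 + i * step for i in range(61)]  # grid over [-3, 3]
p = normalize([math.exp(-0.5 * x * x) for x in grid])
q = normalize([math.exp(-0.5 * (x - 1.0) ** 2) for x in grid])
divergence = js_divergence(p, q)
```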

compute_eps(quantile)[source]

Return the quantile distance, out of all optimal distances.

Parameters:

quantile (value in [0,1]) –

Return type:

float

compute_ess()[source]

Compute the Effective Sample Size.

Returns:

The effective sample size.

Return type:

float
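A common definition of the effective sample size for weighted samples is the squared sum of the weights divided by the sum of squared weights. The sketch below assumes that definition; it is not taken from the ROMC source, and the function name is hypothetical.

```python
def effective_sample_size(weights):
    """(sum of w)^2 / (sum of w^2) for non-negative weights."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


ess_equal = effective_sample_size([1.0, 1.0, 1.0, 1.0])
ess_degenerate = effective_sample_size([1.0, 0.0, 0.0, 0.0])
```

Equal weights give an effective sample size equal to the number of samples, while a single dominant weight gives a value close to 1.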

compute_expectation(h)[source]

Compute an expectation, based on h.

Parameters:

h (Callable) –

Return type:

float or np.array, depending on the return value of the Callable h
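An expectation over weighted samples is typically estimated as a self-normalised weighted average of h evaluated at each sample. The sketch below assumes that estimator and uses hypothetical names; h can be any callable returning a number.

```python
def weighted_expectation(h, samples, weights):
    """Self-normalised weighted average of h over the samples."""
    total = sum(weights)
    return sum(w * h(s) for s, w in zip(samples, weights)) / total


# The identity function gives the weighted mean of the samples.
mean_estimate = weighted_expectation(lambda x: x, [1.0, 3.0], [1.0, 3.0])
```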

distance_hist(savefig=False, **kwargs)[source]

Plot a histogram of the distances at the optimal point.

Parameters:
  • savefig (False or str, if str it must be the path to save the figure) –

  • kwargs (Dict with arguments to be passed to the plt.hist()) –

estimate_regions(eps_filter, use_surrogate=False, region_args=None, fit_models=True, fit_models_args=None, eps_region=None, eps_cutoff=None)[source]

Filter solutions and build the N-Dimensional bounding box around the optimal point.

Parameters:
  • eps_filter (float) – threshold for filtering the solutions

  • use_surrogate (Union[None, bool]) – whether to use the surrogate model for building the bounding box. If None, it will be set based on which optimisation scheme has been used.

  • region_args (Union[None, Dict]) – keyword-arguments that will be passed to the regionConstructor. The arguments “eps_region” and “use_surrogate” are automatically appended, if not defined explicitly.

  • fit_models (bool) – whether to fit a helping model around the optimal point

  • fit_models_args (Union[None, Dict]) – arguments passed for fitting the helping models

  • eps_region (Union[None, float]) – threshold for the bounding box limits. If None, it will be equal to eps_filter.

  • eps_cutoff (Union[None, float]) – threshold for the indicator function. If None, it will be equal to eps_filter.

eval_posterior(theta)[source]

Evaluate the normalized posterior. The operation is NOT vectorized.

Parameters:

theta (np.ndarray (BS, D)) –

Returns:

np.array

Return type:

(BS,)

eval_unnorm_posterior(theta)[source]

Evaluate the unnormalized posterior. The operation is NOT vectorized.

Parameters:

theta (np.ndarray (BS, D)) – the position to evaluate

Returns:

np.array

Return type:

(BS,)

extract_result()[source]

Extract the result from the current state.

Returns:

result

Return type:

Sample

property finished

Check whether the objective of n_batches has been reached.

fit_posterior(n1, eps_filter, use_bo=False, quantile=None, optimizer_args=None, region_args=None, fit_models=False, fit_models_args=None, seed=None, eps_region=None, eps_cutoff=None)[source]

Execute all training steps.

Parameters:
  • n1 (integer) – number of deterministic optimisation problems

  • use_bo (Boolean) – whether to use Bayesian Optimisation

  • eps_filter (Union[float, str]) – threshold for filtering solutions, or “auto” if it is defined through quantile

  • quantile (Union[None, float], optional) – quantile of optimal distances to set as eps_filter if eps_filter=“auto”

  • optimizer_args (Union[None, Dict]) – keyword-arguments that will be passed to the optimiser

  • region_args (Union[None, Dict]) – keyword-arguments that will be passed to the regionConstructor

  • seed (Union[None, int]) – seed definition for making the training process reproducible

  • eps_region – threshold for region construction

infer(*args, vis=None, bar=True, **kwargs)

Set the objective and start the iterate loop until the inference is finished.

See the other arguments from the set_objective method.

Parameters:
  • vis (dict, optional) – Plotting options. More info in self.plot_state method

  • bar (bool, optional) – Flag to remove (False) or keep (True) the progress bar from/in output.

Returns:

result

Return type:

Sample

iterate()

Advance the inference by one iteration.

This is a way to manually progress the inference. One iteration consists of waiting and processing the result of the next batch in succession and possibly submitting new batches.

Notes

If the next batch is ready, it will be processed immediately and no new batches are submitted.

New batches are submitted only while waiting for the next one to complete. There will never be more batches submitted in parallel than the max_parallel_batches setting allows.

Return type:

None

property parameter_names

Return the parameters to be inferred.

plot_state(**kwargs)

Plot the current state of the algorithm.

Parameters:
  • axes (matplotlib.axes.Axes (optional)) –

  • figure (matplotlib.figure.Figure (optional)) –

  • xlim – x-axis limits

  • ylim – y-axis limits

  • interactive (bool (default False)) – If true, uses IPython.display to update the cell figure

  • close – Close the figure at the end of plotting. Used at the end of interactive mode.

Return type:

None

property pool

Return the output pool of the inference.

prepare_new_batch(batch_index)

Prepare values for a new batch.

ELFI calls this method before submitting a new batch with an increasing index batch_index. This is an optional method to override. Use this if you need to do preparations, e.g. in a Bayesian optimization algorithm, the next acquisition points would be acquired here.

If you need to provide values for certain nodes, you can do so by constructing a batch dictionary and returning it. See e.g. BayesianOptimization for an example.

Parameters:

batch_index (int) – next batch_index to be submitted

Returns:

batch – Keys should match the node names in the model. These values will override any default values or operations in those nodes.

Return type:

dict or None
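The override behaviour of the returned batch dictionary can be pictured with a toy example. Everything here is hypothetical: the node names mu and sigma, the rule used in prepare_new_batch, and the helper resolve_node_values are illustrations, not ELFI internals.

```python
def prepare_new_batch(batch_index):
    """Return overrides for even batch indices, nothing for odd ones."""
    if batch_index % 2 == 0:
        return {"mu": float(batch_index)}
    return None


def resolve_node_values(defaults, batch):
    """Values from the batch dict take precedence over node defaults."""
    merged = dict(defaults)
    if batch is not None:
        merged.update(batch)
    return merged


defaults = {"mu": 0.0, "sigma": 1.0}
values_1 = resolve_node_values(defaults, prepare_new_batch(1))
values_2 = resolve_node_values(defaults, prepare_new_batch(2))
```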

sample(n2, seed=None)[source]

Get samples from the posterior.

Parameters:
  • n2 (int) – number of samples

  • seed (int,) – seed of the sampling procedure

property seed

Return the seed of the inference.

set_objective(*args, **kwargs)

Set the objective of the inference.

This method sets the objective of the inference (values typically stored in the self.objective dict).

Return type:

None

solve_problems(n1, use_bo=False, optimizer_args=None, seed=None)[source]

Define and solve n1 optimisation problems.

Parameters:
  • n1 (integer) – number of deterministic optimisation problems to solve

  • use_bo (Boolean, default: False) – whether to use Bayesian Optimisation. If False, gradients are used.

  • optimizer_args (Union[None, Dict], default None) – keyword-arguments that will be passed to the optimiser. The argument “seed” is automatically appended to the dict. In the current implementation, all arguments are optional.

  • seed (Union[None, int]) –

update(batch, batch_index)

Update the inference state with a new batch.

ELFI calls this method when a new batch has been computed and the state of the inference should be updated with it. It is also possible to bypass ELFI and call this directly to update the inference.

Parameters:
  • batch (dict) – dict with self.outputs as keys and the corresponding outputs for the batch as values

  • batch_index (int) –

Return type:

None

visualize_region(i, force_objective=False, savefig=False)[source]

Plot the acceptance area of the i-th optimisation problem.

Parameters:
  • i (int,) – index of the problem

  • savefig – None or path

class elfi.BSL(model, n_sim_round, feature_names=None, likelihood=None, **kwargs)[source]

Bayesian Synthetic Likelihood for parameter inference.

For a description of the default BSL see Price et al. (2018). The sampler is implemented using Metropolis-Hastings MCMC.

References

L. F. Price, C. C. Drovandi, A. Lee & D. J. Nott (2018). Bayesian Synthetic Likelihood, Journal of Computational and Graphical Statistics, 27:1, 1-11, DOI: 10.1080/10618600.2017.1302882

Initialize the BSL sampler.

Parameters:
  • model (ElfiModel) – ELFI graph used by the algorithm.

  • n_sim_round (int) – Number of simulations for 1 parametric approximation of the likelihood.

  • feature_names (str or list, optional) – Features used in synthetic likelihood estimation. Defaults to all summary statistics.

  • likelihood (callable, optional) – Synthetic likelihood estimation method. Defaults to gaussian_syn_likelihood.
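The idea of a Gaussian synthetic likelihood can be sketched for a single scalar summary: simulate n_sim_round summaries at a candidate parameter, fit a normal distribution to them, and evaluate the observed summary under that fit. The functions below are hypothetical stand-ins written for this example and are not ELFI's gaussian_syn_likelihood; the simulator is a toy Gaussian.

```python
import math
import random
import statistics


def gaussian_synthetic_loglik(simulated, observed):
    """Log density of observed under a normal fitted to the simulations."""
    mu = statistics.mean(simulated)
    var = statistics.variance(simulated)
    return -0.5 * math.log(2.0 * math.pi * var) - (observed - mu) ** 2 / (2.0 * var)


def simulate(theta, n_sim_round, rng):
    """Toy simulator: summaries are normal with mean theta and unit variance."""
    return [rng.gauss(theta, 1.0) for _ in range(n_sim_round)]


rng = random.Random(1)
observed = 2.0
ll_near = gaussian_synthetic_loglik(simulate(2.0, 200, rng), observed)
ll_far = gaussian_synthetic_loglik(simulate(6.0, 200, rng), observed)
```

A candidate parameter whose simulations resemble the observed summary receives a higher synthetic log-likelihood than one whose simulations lie far away.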

property batch_size

Return the current batch_size.

property current_params

Return parameter values explored in the current round.

BSL runs simulations with the candidate parameter values stored in method state.

extract_result()[source]

Extract the result from the current state.

Returns:

result

Return type:

BslSample

property finished

Check whether the objective of n_batches has been reached.

infer(*args, **kwargs)

Set the objective and start the iterate loop until the inference is finished.

Initialise a new data collection round if needed.

Returns:

result

Return type:

Sample

iterate()

Advance the inference by one iteration.

This is a way to manually progress the inference. One iteration consists of waiting and processing the result of the next batch in succession and possibly submitting new batches.

Notes

If the next batch is ready, it will be processed immediately and no new batches are submitted.

New batches are submitted only while waiting for the next one to complete. There will never be more batches submitted in parallel than the max_parallel_batches setting allows.

Return type:

None

property parameter_names

Return the parameters to be inferred.

plot_state(**kwargs)

Plot the current state of the algorithm.

Parameters:
  • axes (matplotlib.axes.Axes (optional)) –

  • figure (matplotlib.figure.Figure (optional)) –

  • xlim – x-axis limits

  • ylim – y-axis limits

  • interactive (bool (default False)) – If true, uses IPython.display to update the cell figure

  • close – Close the figure at the end of plotting. Used at the end of interactive mode.

Return type:

None

property pool

Return the output pool of the inference.

prepare_new_batch(batch_index)

Prepare values for a new batch.

Parameters:

batch_index (int) –

Returns:

batch

Return type:

dict

sample(n_samples, sigma_proposals, params0=None, param_names=None, burn_in=0, logit_transform_bound=None, tau=0.5, w=1, max_iter=1000, **kwargs)[source]

Sample from the posterior distribution of BSL.

The specific approximate likelihood estimated depends on the BSL class but generally uses a multivariate normal approximation.

The sampling is performed with a Metropolis MCMC sampler, and gamma parameters are sampled with a slice sampler when adjustment for model misspecification is used.

Parameters:
  • n_samples (int) – Number of requested samples from the posterior. This includes burn_in.

  • sigma_proposals (np.array of shape (k x k) - k = number of parameters) – Standard deviations for Gaussian proposals of each parameter.

  • params0 (array_like, optional) – Initial values for each sampled parameter.

  • param_names (list, optional) – Custom list of parameter names corresponding to the order of parameters in params0 and sigma_proposals.

  • burn_in (int, optional) – Length of the burn-in sequence in MCMC sampling. These samples are “thrown away”. Defaults to 0.

  • logit_transform_bound (list, optional) – Each list element contains the lower and upper bound for the logit transformation of the corresponding parameter.

  • tau (float, optional) – Scale parameter for the prior distribution used by the gamma sampler.

  • w (float, optional) – Step size used by the gamma sampler.

  • max_iter (int, optional) – Maximum number of iterations used by the gamma sampler.

Return type:

BslSample
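As a rough illustration of how n_samples, sigma_proposals and burn_in interact, the sketch below runs a one-parameter random-walk Metropolis sampler on a known log-density. It is not the ELFI implementation: log_target and metropolis_sketch are hypothetical names, the standard normal target stands in for the synthetic-likelihood posterior, and returning both the full chain and the post-burn-in part is a choice made here for illustration.

```python
import math
import random


def log_target(theta):
    # Hypothetical stand-in for a log posterior: standard normal, up to a constant.
    return -0.5 * theta * theta


def metropolis_sketch(n_samples, sigma_proposal, theta0=0.0, burn_in=0, seed=0):
    """Random-walk Metropolis; n_samples counts the burn-in draws as well."""
    rng = random.Random(seed)
    chain = [theta0]
    current_lp = log_target(theta0)
    n_accepted = 0
    for _ in range(n_samples - 1):
        # Gaussian proposal centred on the current state.
        proposal = chain[-1] + rng.gauss(0.0, sigma_proposal)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            chain.append(proposal)
            current_lp = proposal_lp
            n_accepted += 1
        else:
            chain.append(chain[-1])
    acc_rate = n_accepted / max(n_samples - 1, 1)
    # Full chain, the part kept after burn-in, and the acceptance rate.
    return chain, chain[burn_in:], acc_rate
```

With n_samples=500 and burn_in=100, the full chain has 500 entries and 400 remain after the burn-in is discarded.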

property seed

Return the seed of the inference.

set_objective(rounds)

Set objective for inference.

Parameters:

rounds (int) – Number of data collection rounds.

update(batch, batch_index)

Update the inference state with a new batch.

Parameters:
  • batch (dict) – dict with self.outputs as keys and the corresponding outputs for the batch as values

  • batch_index (int) –

Result objects

class elfi.methods.results.OptimizationResult(x_min, **kwargs)[source]

Base class for results from optimization.

Initialize result.

Parameters:
  • x_min – The optimized parameters

  • **kwargs – See ParameterInferenceResult

property is_multivariate

Check whether the result contains multivariate parameters.

class elfi.methods.results.Sample(method_name, outputs, parameter_names, discrepancy_name=None, weights=None, **kwargs)[source]

Sampling results from inference methods.

Initialize result.

Parameters:
  • method_name (string) – Name of inference method.

  • outputs (dict) – Dictionary with outputs from the nodes, e.g. samples.

  • parameter_names (list) – Names of the parameter nodes

  • discrepancy_name (string, optional) – Name of the discrepancy in outputs.

  • weights (array_like) –

  • **kwargs – Other meta information for the result

property dim

Return the number of parameters.

property discrepancies

Return the discrepancy values.

get_sample_covariance()[source]

Return covariance of samples.

property is_multivariate

Check whether the result contains multivariate parameters.

property n_samples

Return the number of samples.

plot_marginals(selector=None, bins=20, axes=None, reference_value=None, **kwargs)[source]

Plot marginal distributions for parameters.

Supports only univariate distributions.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Defaults to all.

  • bins (int, optional) – Number of bins in histograms.

  • axes (one or an iterable of plt.Axes, optional) –

Returns:

axes

Return type:

np.array of plt.Axes

plot_pairs(selector=None, bins=20, axes=None, reference_value=None, draw_upper_triagonal=False, **kwargs)[source]

Plot pairwise relationships as a matrix with marginals on the diagonal.

The y-axes of the marginal histograms are scaled. Supports only univariate distributions.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Defaults to all.

  • bins (int, optional) – Number of bins in histograms.

  • axes (one or an iterable of plt.Axes, optional) –

Returns:

axes

Return type:

np.array of plt.Axes

property sample_means

Evaluate weighted averages of sampled parameters.

Return type:

OrderedDict

property sample_means_and_95CIs

Construct OrderedDict for mean and 95% credible interval.

property sample_means_array

Evaluate weighted averages of sampled parameters.

Return type:

np.array

sample_means_summary()[source]

Print a representation of sample means.

sample_quantiles(alpha=0.5)[source]

Evaluate weighted sample quantiles of sampled parameters.
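The weighted averages and weighted quantiles mentioned for sample_means and sample_quantiles can be sketched in plain Python as follows. This is one simple convention (the quantile is the smallest value whose cumulative normalised weight reaches alpha); the helper names are hypothetical and ELFI's own routine may handle ties or interpolation differently.

```python
def weighted_mean(values, weights):
    """Average of values where each value counts in proportion to its weight."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def weighted_quantile(values, weights, alpha=0.5):
    """Smallest value whose cumulative normalised weight reaches alpha."""
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight / total
        if cumulative >= alpha:
            return value
    # Guard against rounding leaving the cumulative sum just below alpha.
    return pairs[-1][0]
```

A mean with a 95% interval, as reported by sample_means_and_95CIs, would under this convention combine weighted_mean with weighted_quantile at alpha=0.025 and alpha=0.975.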

sample_summary()[source]

Print sample mean and 95% credible interval.

property samples_array

Return the samples as an array.

The columns are in the same order as in self.parameter_names.

Return type:

list of np.arrays

save(fname=None)[source]

Save samples in csv, json or pickle file formats.

Clarification: csv saves only samples, json saves the whole object’s dictionary except outputs key and pickle saves the whole object.

Parameters:

fname (str, required) – File name to be saved. The type is inferred from extension (‘csv’, ‘json’ or ‘pkl’).
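The extension-based behaviour described above can be pictured with the standard library alone. The sketch below is an illustration under assumptions, not ELFI's save method: save_sketch, samples and meta are hypothetical names, and the exact contents ELFI writes to each format may differ.

```python
import csv
import json
import os
import pickle


def save_sketch(fname, samples, meta):
    """Write samples (dict: name -> list) and meta (dict) according to the file extension."""
    ext = os.path.splitext(fname)[1].lower().lstrip(".")
    if ext == "csv":
        # Only the samples, one column per parameter.
        names = list(samples)
        with open(fname, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(names)
            writer.writerows(zip(*(samples[name] for name in names)))
    elif ext == "json":
        # Meta information only, without the bulky outputs.
        with open(fname, "w") as f:
            json.dump(meta, f)
    elif ext == "pkl":
        # Everything, as a pickled Python object.
        with open(fname, "wb") as f:
            pickle.dump({"samples": samples, "meta": meta}, f)
    else:
        raise ValueError("Unsupported extension: " + repr(ext))
    return ext
```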

summary()[source]

Print a verbose summary of contained results.

class elfi.methods.results.SmcSample(method_name, outputs, parameter_names, populations, *args, **kwargs)[source]

Container for results from SMC-ABC.

Initialize result.

Parameters:
  • method_name (str) –

  • outputs (dict) –

  • parameter_names (list) –

  • populations (list[Sample]) – List of Sample objects

  • args

  • kwargs

property dim

Return the number of parameters.

property discrepancies

Return the discrepancy values.

get_sample_covariance()

Return covariance of samples.

property is_multivariate

Check whether the result contains multivariate parameters.

property n_populations

Return the number of populations.

property n_samples

Return the number of samples.

plot_marginals(selector=None, bins=20, axes=None, all=False, **kwargs)[source]

Plot marginal distributions for parameters for all populations.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Defaults to all.

  • bins (int, optional) – Number of bins in histograms.

  • axes (one or an iterable of plt.Axes, optional) –

  • all (bool, optional) – Plot the marginals of all populations

plot_pairs(selector=None, bins=20, axes=None, all=False, **kwargs)[source]

Plot pairwise relationships as a matrix with marginals on the diagonal.

The y-axes of the marginal histograms are scaled.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Defaults to all.

  • bins (int, optional) – Number of bins in histograms.

  • axes (one or an iterable of plt.Axes, optional) –

  • all (bool, optional) – Plot for all populations

property sample_means

Evaluate weighted averages of sampled parameters.

Return type:

OrderedDict

property sample_means_and_95CIs

Construct OrderedDict for mean and 95% credible interval.

property sample_means_array

Evaluate weighted averages of sampled parameters.

Return type:

np.array

sample_means_summary(all=False)[source]

Print a representation of sample means.

Parameters:

all (bool, optional) – Whether to print the means for all populations separately, or just the final population (default).

sample_quantiles(alpha=0.5)

Evaluate weighted sample quantiles of sampled parameters.

sample_summary()

Print sample mean and 95% credible interval.

property samples_array

Return the samples as an array.

The columns are in the same order as in self.parameter_names.

Return type:

list of np.arrays

save(fname=None)

Save samples in csv, json or pickle file formats.

Clarification: csv saves only samples, json saves the whole object’s dictionary except outputs key and pickle saves the whole object.

Parameters:

fname (str, required) – File name to be saved. The type is inferred from extension (‘csv’, ‘json’ or ‘pkl’).

summary(all=False)[source]

Print a verbose summary of contained results.

Parameters:

all (bool, optional) – Whether to print the summary for all populations separately, or just the final population (default).

class elfi.methods.results.BolfiSample(method_name, chains, parameter_names, warmup, **kwargs)[source]

Container for results from BOLFI.

Initialize result.

Parameters:
  • method_name (string) – Name of inference method.

  • chains (np.array) – Chains from sampling, warmup included. Shape: (n_chains, n_samples, n_parameters).

  • parameter_names (list : list of strings) – List of names in the outputs dict that refer to model parameters.

  • warmup (int) – Number of warmup iterations in chains.
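To make the (n_chains, n_samples, n_parameters) layout and the role of warmup concrete, the following sketch drops the first warmup iterations of every chain and concatenates what is left. It uses nested lists instead of arrays, the function name is hypothetical, and whether ELFI merges chains in this order is an assumption.

```python
def drop_warmup_and_merge(chains, warmup):
    """chains[c][i][p] is chain c, iteration i, parameter p; returns one flat list of draws."""
    merged = []
    for chain in chains:
        # Keep only the iterations after the warmup phase.
        merged.extend(chain[warmup:])
    return merged
```

Two chains of four iterations each with warmup=1 therefore leave six usable draws.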

property dim

Return the number of parameters.

property discrepancies

Return the discrepancy values.

get_sample_covariance()

Return covariance of samples.

property is_multivariate

Check whether the result contains multivariate parameters.

property n_samples

Return the number of samples.

plot_marginals(selector=None, bins=20, axes=None, reference_value=None, **kwargs)

Plot marginal distributions for parameters.

Supports only univariate distributions.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Defaults to all.

  • bins (int, optional) – Number of bins in histograms.

  • axes (one or an iterable of plt.Axes, optional) –

Returns:

axes

Return type:

np.array of plt.Axes

plot_pairs(selector=None, bins=20, axes=None, reference_value=None, draw_upper_triagonal=False, **kwargs)

Plot pairwise relationships as a matrix with marginals on the diagonal.

The y-axes of the marginal histograms are scaled. Supports only univariate distributions.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Defaults to all.

  • bins (int, optional) – Number of bins in histograms.

  • axes (one or an iterable of plt.Axes, optional) –

Returns:

axes

Return type:

np.array of plt.Axes

plot_traces(selector=None, axes=None, **kwargs)[source]

Plot MCMC traces.

property sample_means

Evaluate weighted averages of sampled parameters.

Return type:

OrderedDict

property sample_means_and_95CIs

Construct OrderedDict for mean and 95% credible interval.

property sample_means_array

Evaluate weighted averages of sampled parameters.

Return type:

np.array

sample_means_summary()

Print a representation of sample means.

sample_quantiles(alpha=0.5)

Evaluate weighted sample quantiles of sampled parameters.

sample_summary()

Print sample mean and 95% credible interval.

property samples_array

Return the samples as an array.

The columns are in the same order as in self.parameter_names.

Return type:

list of np.arrays

save(fname=None)

Save samples in csv, json or pickle file formats.

Clarification: csv saves only samples, json saves the whole object’s dictionary except outputs key and pickle saves the whole object.

Parameters:

fname (str, required) – File name to be saved. The type is inferred from extension (‘csv’, ‘json’ or ‘pkl’).

summary()

Print a verbose summary of contained results.

class elfi.methods.results.RomcSample(method_name, outputs, parameter_names, discrepancy_name, weights, **kwargs)[source]

Container for results from ROMC.

Class constructor.

Parameters:
  • method_name (string) – Name of the inference method

  • outputs (Dict) – Dict where key is the parameter name and value are the samples

  • parameter_names (List[string]) – List of the parameter names

  • discrepancy_name (string) – name of the output (=distance) node

  • weights (np.ndarray) – the weights of the samples

  • kwargs

property dim

Return the number of parameters.

property discrepancies

Return the discrepancy values.

get_sample_covariance()

Return covariance of samples.

property is_multivariate

Check whether the result contains multivariate parameters.

property n_samples

Return the number of samples.

plot_marginals(selector=None, bins=20, axes=None, reference_value=None, **kwargs)

Plot marginal distributions for parameters.

Supports only univariate distributions.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Defaults to all.

  • bins (int, optional) – Number of bins in histograms.

  • axes (one or an iterable of plt.Axes, optional) –

Returns:

axes

Return type:

np.array of plt.Axes

plot_pairs(selector=None, bins=20, axes=None, reference_value=None, draw_upper_triagonal=False, **kwargs)

Plot pairwise relationships as a matrix with marginals on the diagonal.

The y-axes of the marginal histograms are scaled. Supports only univariate distributions.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Defaults to all.

  • bins (int, optional) – Number of bins in histograms.

  • axes (one or an iterable of plt.Axes, optional) –

Returns:

axes

Return type:

np.array of plt.Axes

property sample_means

Evaluate weighted averages of sampled parameters.

Return type:

OrderedDict

property sample_means_and_95CIs

Construct OrderedDict for mean and 95% credible interval.

property sample_means_array

Evaluate weighted averages of sampled parameters.

Return type:

np.array

sample_means_summary()

Print a representation of sample means.

sample_quantiles(alpha=0.5)

Evaluate weighted sample quantiles of sampled parameters.

sample_summary()

Print sample mean and 95% credible interval.

property samples_array

Return the samples as an array.

The columns are in the same order as in self.parameter_names.

Return type:

list of np.arrays

samples_cov()[source]

Print the empirical covariance matrix.

Returns:

the covariance matrix

Return type:

np.ndarray (D,D)

save(fname=None)

Save samples in csv, json or pickle file formats.

Clarification: csv saves only samples, json saves the whole object’s dictionary except outputs key and pickle saves the whole object.

Parameters:

fname (str, required) – File name to be saved. The type is inferred from extension (‘csv’, ‘json’ or ‘pkl’).

summary()

Print a verbose summary of contained results.

class elfi.methods.results.BslSample(method_name, samples_all, parameter_names, burn_in=0, acc_rate=None, **kwargs)[source]

Container for results from BSL.

Initialize result.

Parameters:
  • method_name (string) – Name of inference method.

  • samples_all (np.ndarray) – Dictionary with all samples from the MCMC chain, burn in included.

  • parameter_names (list) – Names of the parameter nodes

  • burn_in (int) – Number of samples to discard from start of MCMC chain.

  • acc_rate (float) – The acceptance rate of proposed parameters in the MCMC chain

  • **kwargs – Other meta information for the result

compute_ess()[source]

Compute the effective sample size of the MCMC chain.

Returns:

Effective sample size for each parameter

Return type:

dict
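For readers unfamiliar with the quantity, a common textbook estimate of the effective sample size divides the chain length by one plus twice the sum of the positive-lag autocorrelations. The sketch below implements that estimate for a single parameter with a simple truncation rule; the function names, the max_lag cut-off and the truncation at the first non-positive autocorrelation are assumptions, and compute_ess may use a different estimator.

```python
def autocorrelation(chain, lag):
    """Sample autocorrelation of a chain at the given lag (chain must not be constant)."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / n
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


def effective_sample_size(chain, max_lag=200):
    """n / (1 + 2 * sum of autocorrelations), truncated at the first non-positive lag."""
    n = len(chain)
    rho_sum = 0.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        rho = autocorrelation(chain, lag)
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)
```

A strongly autocorrelated chain yields a much smaller value than an independent one of the same length, which is why the acceptance rate and trace plots are usually inspected together with this number.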

property dim

Return the number of parameters.

property discrepancies

Return the discrepancy values.

get_sample_covariance()

Return covariance of samples.

property is_multivariate

Check whether the result contains multivariate parameters.

property n_samples

Return the number of samples.

plot_marginals(selector=None, bins=20, axes=None, reference_value=None, **kwargs)

Plot marginal distributions for parameters.

Supports only univariate distributions.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Defaults to all.

  • bins (int, optional) – Number of bins in histograms.

  • axes (one or an iterable of plt.Axes, optional) –

Returns:

axes

Return type:

np.array of plt.Axes

plot_pairs(selector=None, bins=20, axes=None, reference_value=None, draw_upper_triagonal=False, **kwargs)

Plot pairwise relationships as a matrix with marginals on the diagonal.

The y-axes of the marginal histograms are scaled. Supports only univariate distributions.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Defaults to all.

  • bins (int, optional) – Number of bins in histograms.

  • axes (one or an iterable of plt.Axes, optional) –

Returns:

axes

Return type:

np.array of plt.Axes

plot_traces(selector=None, axes=None, **kwargs)[source]

Plot MCMC traces.

property sample_means

Evaluate weighted averages of sampled parameters.

Return type:

OrderedDict

property sample_means_and_95CIs

Construct OrderedDict for mean and 95% credible interval.

property sample_means_array

Evaluate weighted averages of sampled parameters.

Return type:

np.array

sample_means_summary()

Print a representation of sample means.

sample_quantiles(alpha=0.5)

Evaluate weighted sample quantiles of sampled parameters.

sample_summary()

Print sample mean and 95% credible interval.

property samples_array

Return the samples as an array.

The columns are in the same order as in self.parameter_names.

Return type:

list of np.arrays

save(fname=None)

Save samples in csv, json or pickle file formats.

Clarification: csv saves only samples, json saves the whole object’s dictionary except outputs key and pickle saves the whole object.

Parameters:

fname (str, required) – File name to be saved. The type is inferred from extension (‘csv’, ‘json’ or ‘pkl’).

summary()

Print a verbose summary of contained results.

Post-processing

elfi.adjust_posterior(sample, model, summary_names, parameter_names=None, adjustment='linear')[source]

Adjust the posterior using local regression.

Note that the summary nodes need to be explicitly included in the sample object with the output_names keyword argument when performing the inference.

Parameters:
  • sample (elfi.methods.results.Sample) – a sample object from an ABC algorithm

  • model (elfi.ElfiModel) – the inference model

  • summary_names (list[str]) – names of the summary nodes

  • parameter_names (list[str] (optional)) – names of the parameters

  • adjustment (RegressionAdjustment or string) –

    a regression adjustment object or a string specification

    Accepted values for the string specification:
    • ’linear’

Returns:

a Sample object with the adjusted posterior

Return type:

elfi.methods.results.Sample

Examples

>>> import elfi
>>> from elfi.examples import gauss
>>> from elfi.methods.post_processing import LinearAdjustment
>>> m = gauss.get_model()
>>> res = elfi.Rejection(m['d'], output_names=['ss_mean', 'ss_var'],
...                      batch_size=10).sample(500, bar=False)
>>> adj = elfi.adjust_posterior(res, m, ['ss_mean', 'ss_var'], ['mu'], LinearAdjustment())

class elfi.methods.post_processing.LinearAdjustment(**kwargs)[source]

Regression adjustment using a local linear model.

adjust()

Adjust the posterior.

Only the finite values that were used to fit the regression model will be adjusted.

Return type:

a Sample object containing the adjusted posterior

fit(sample, model, summary_names, parameter_names=None)

Fit a regression adjustment model to the posterior sample.

Non-finite values in the summary statistics and parameters will be omitted.

Parameters:
  • sample (elfi.methods.Sample) – a sample object from an ABC method

  • model (elfi.ElfiModel) – the inference model

  • summary_names (list[str]) – a list of names for the summary nodes

  • parameter_names (list[str] (optional)) – a list of parameter names
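The idea behind a linear regression adjustment can be sketched for one parameter and one summary statistic: regress the sampled parameters on the difference between simulated and observed summaries, then subtract the fitted effect of that difference. The code below is a simplified, unweighted illustration with hypothetical names; it is not the LinearAdjustment implementation, which works with several summaries and handles non-finite values.

```python
def linear_adjustment_sketch(thetas, summaries, observed_summary):
    """Least-squares fit of theta on (s - s_obs), then shift every theta to s_obs."""
    n = len(thetas)
    centred = [s - observed_summary for s in summaries]
    mean_s = sum(centred) / n
    mean_t = sum(thetas) / n
    sxx = sum((s - mean_s) ** 2 for s in centred)
    sxy = sum((s - mean_s) * (t - mean_t) for s, t in zip(centred, thetas))
    slope = sxy / sxx
    # Remove the part of theta explained by the distance of s from the observation.
    return [t - slope * s for t, s in zip(thetas, centred)]
```

If the parameter depends exactly linearly on the summary, every adjusted value collapses to the value implied by the observed summary.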

Diagnostics

class elfi.TwoStageSelection(simulator, fn_distance, list_ss=None, prepared_ss=None, max_cardinality=4, seed=0)[source]

Perform the summary-statistics selection proposed by Nunes and Balding (2010).

The user can provide a list of summary statistics as list_ss and let ELFI combine them, or provide already combined summary statistics as prepared_ss.

The rationale of the Two Stage procedure is the following:

  • First, the module computes or accepts the combinations of the candidate summary statistics.

  • In Stage 1, each summary-statistics combination is evaluated using the Minimum Entropy algorithm.

  • In Stage 2, the minimum-entropy combination is selected, and the ‘closest’ datasets are identified.

  • Further in Stage 2, for each summary-statistics combination, the mean root sum of squared errors (MRSSE) is calculated over all ‘closest datasets’, and the minimum-MRSSE combination is chosen as the one with the optimal performance.

References

[1] Nunes, M. A., & Balding, D. J. (2010). On optimal selection of summary statistics for approximate Bayesian computation. Statistical applications in genetics and molecular biology, 9(1).

[2] Blum, M. G., Nunes, M. A., Prangle, D., & Sisson, S. A. (2013). A comparative review of dimension reduction methods in approximate Bayesian computation. Statistical Science, 28(2), 189-208.

Initialise the summary-statistics selection for the Two Stage Procedure.

Parameters:
  • simulator (elfi.Node) – Node (often elfi.Simulator) for which the summary statistics will be applied. The node is the final node of a coherent ElfiModel (i.e. it has no child nodes).

  • fn_distance (str or callable function) – Distance metric, consult the elfi.Distance documentation for calling as a string.

  • list_ss (List of callable functions, optional) – List of candidate summary statistics.

  • prepared_ss (List of lists of callable functions, optional) – List of prepared combinations of candidate summary statistics. No other combinations will be evaluated.

  • max_cardinality (int, optional) – Maximum cardinality of a candidate summary-statistics combination.

  • seed (int, optional) –
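How list_ss and max_cardinality could translate into candidate combinations is sketched below with itertools. The helper name is hypothetical and the enumeration order is an assumption; the point is only that every non-empty subset up to the given size becomes one candidate.

```python
from itertools import combinations


def candidate_combinations(list_ss, max_cardinality=4):
    """All non-empty subsets of list_ss with at most max_cardinality members."""
    combos = []
    for size in range(1, min(max_cardinality, len(list_ss)) + 1):
        combos.extend(combinations(list_ss, size))
    return combos
```

Three candidate statistics with max_cardinality=2 give three singletons and three pairs, so six combinations are evaluated in Stage 1.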

run(n_sim, n_acc=None, n_closest=None, batch_size=1, k=4)[source]

Run the Two Stage Procedure for identifying relevant summary statistics.

Parameters:
  • n_sim (int) – Number of the total ABC-rejection simulations.

  • n_acc (int, optional) – Number of the accepted ABC-rejection simulations.

  • n_closest (int, optional) – Number of the ‘closest’ datasets (i.e., the closest n simulation datasets w.r.t the observations).

  • batch_size (int, optional) – Number of samples per batch.

  • k (int, optional) – Parameter for the kth-nearest-neighbour search performed in the minimum-entropy step (in Nunes & Balding, 2010 it is fixed to 4).

Returns:

Summary-statistics combination showing the optimal performance.

Return type:

array_like

Acquisition methods

class elfi.methods.bo.acquisition.LCBSC(*args, delta=None, additive_cost=None, **kwargs)[source]

Lower Confidence Bound Selection Criterion.

Srinivas et al. call this GP-LCB.

LCBSC uses the parameter delta which is here equivalent to 1/exploration_rate.

Parameter delta should be in (0, 1) for the theoretical results to hold. The theoretical upper bound for total regret in Srinivas et al. holds with a probability greater than or equal to 1 - delta, so values of delta very close to 1 or over it do not make much sense in that respect.

Delta is roughly the exploitation tendency of the acquisition function.

References

N. Srinivas, A. Krause, S. M. Kakade, and M. Seeger. Gaussian process optimization in the bandit setting: No regret and experimental design. In Proc. International Conference on Machine Learning (ICML), 2010

E. Brochu, V.M. Cora, and N. de Freitas. A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv:1012.2599, 2010.

Notes

The formula presented in Brochu (p. 15) seems to be from Srinivas et al. Theorem 2. However, instead of having t**(d/2 + 2) in beta_t, it seems that the correct form would be t**(2d + 2).

Initialize LCBSC.

Parameters:
  • delta (float, optional) – In between (0, 1). Default is 1/exploration_rate. If given, overrides the exploration_rate.

  • additive_cost (CostFunction, optional) – Cost function output is added to the base acquisition value.

acquire(n, t=None)

Return the next batch of acquisition points.

Gaussian noise ~N(0, self.noise_var) is added to the acquired points.

Parameters:
  • n (int) – Number of acquisition points to return.

  • t (int) – Current acq_batch_index (starting from 0).

Returns:

x – The shape is (n, input_dim)

Return type:

np.ndarray

property delta

Return the inverse of exploration rate.

evaluate(x, t=None)[source]

Evaluate the Lower confidence bound selection criterion.

Parameters:
  • x (np.ndarray) –

  • t (int, optional) – Current iteration (starting from 0).

Return type:

np.ndarray

evaluate_gradient(x, t=None)[source]

Evaluate the gradient of the lower confidence bound selection criterion.

Parameters:
  • x (np.ndarray) –

  • t (int, optional) – Current iteration (starting from 0).

Return type:

np.ndarray
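A lower confidence bound of this kind is usually written as the surrogate mean minus a multiple of the surrogate standard deviation. The sketch below assumes beta_t = 2 log(t**(2d + 2) * pi**2 / (3 * delta)), taking the exponent from the Notes above and treating t as 0-based; both the exact constant and the function names are assumptions rather than a statement of what LCBSC computes.

```python
import math


def beta_t(t, input_dim, delta):
    # Assumed form, following the t**(2d + 2) expression in the Notes; t starts from 0.
    t = t + 1
    return 2.0 * math.log(t ** (2 * input_dim + 2) * math.pi ** 2 / (3.0 * delta))


def lcb(mean, var, t, input_dim, delta):
    """Lower confidence bound: low values mark points with a low mean or high uncertainty."""
    return mean - math.sqrt(beta_t(t, input_dim, delta) * var)
```

A smaller delta increases beta_t and therefore the weight of the variance term, which matches the description of delta as roughly the exploitation tendency.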

class elfi.methods.bo.acquisition.MaxVar(model, prior, quantile_eps=0.01, **opts)[source]

The maximum variance acquisition method.

The next evaluation point is acquired at the maximiser of the variance of the unnormalised approximate posterior.

\[\theta_{t+1} = \arg \max \text{Var}(p(\theta) \cdot p_a(\theta)),\]

where the unnormalised likelihood \(p_a\) is defined using the CDF of normal distribution, \(\Phi\), as follows:

\[p_a(\theta) = \Phi((\epsilon - \mu_{1:t}(\theta)) / \sqrt{v_{1:t}(\theta) + \sigma^2_n}),\]

where \(\epsilon\) is the ABC threshold, \(\mu_{1:t}\) and \(v_{1:t}\) are determined by the Gaussian process, and \(\sigma^2_n\) is the noise.
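The unnormalised likelihood \(p_a\) can be evaluated for a single parameter value with nothing more than the error function. In the sketch below the Gaussian process mean and variance are passed in as plain numbers; the function names are hypothetical and no surrogate model is involved.

```python
import math


def normal_cdf(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def approx_likelihood(eps, gp_mean, gp_var, noise_var):
    """p_a(theta) = Phi((eps - mu) / sqrt(v + sigma_n^2)) for one theta."""
    return normal_cdf((eps - gp_mean) / math.sqrt(gp_var + noise_var))
```

When the predicted discrepancy equals the threshold the value is 0.5, and it grows towards 1 as the predicted discrepancy falls below the threshold.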

References

Järvenpää et al. (2019). Efficient Acquisition Rules for Model-Based Approximate Bayesian Computation. Bayesian Analysis 14(2):595-622, 2019 https://projecteuclid.org/euclid.ba/1537258134

Initialise MaxVar.

Parameters:
  • model (elfi.GPyRegression) – Gaussian process model used to calculate the unnormalised approximate likelihood.

  • prior (scipy-like distribution) – Prior distribution.

  • quantile_eps (float, optional) – Quantile of the observed discrepancies used in setting the ABC threshold.

acquire(n, t=None)[source]

Acquire a batch of acquisition points.

Parameters:
  • n (int) – Number of acquisitions.

  • t (int, optional) – Current iteration, (unused).

Returns:

Coordinates of the yielded acquisition points.

Return type:

array_like

evaluate(theta_new, t=None)[source]

Evaluate the acquisition function at the location theta_new.

Parameters:
  • theta_new (array_like) – Evaluation coordinates.

  • t (int, optional) – Current iteration, (unused).

Returns:

Variance of the approximate posterior.

Return type:

array_like

evaluate_gradient(theta_new, t=None)[source]

Evaluate the acquisition function’s gradient at the location theta_new.

Parameters:
  • theta_new (array_like) – Evaluation coordinates.

  • t (int, optional) – Current iteration, (unused).

Returns:

Gradient of the variance of the approximate posterior

Return type:

array_like

class elfi.methods.bo.acquisition.RandMaxVar(model, prior, quantile_eps=0.01, sampler='nuts', n_samples=50, warmup=None, limit_faulty_init=1000, init_from_prior=False, sigma_proposals=None, **opts)[source]

The randomised maximum variance acquisition method.

The next evaluation point is sampled from the density corresponding to the variance of the unnormalised approximate posterior (the MaxVar acquisition function).

\[\theta_{t+1} \thicksim q(\theta),\]

where \(q(\theta) \propto \text{Var}(p(\theta) \cdot p_a(\theta))\) and the unnormalised likelihood \(p_a\) is defined using the CDF of normal distribution, \(\Phi\), as follows:

\[p_a(\theta) = \Phi((\epsilon - \mu_{1:t}(\theta)) / \sqrt{v_{1:t}(\theta) + \sigma^2_n} ),\]

where \(\epsilon\) is the ABC threshold, \(\mu_{1:t}\) and \(v_{1:t}\) are determined by the Gaussian process, \(\sigma^2_n\) is the noise.

References

Järvenpää et al. (2019). Efficient Acquisition Rules for Model-Based Approximate Bayesian Computation. Bayesian Analysis 14(2):595-622, 2019 https://projecteuclid.org/euclid.ba/1537258134

Initialise RandMaxVar.

Parameters:
  • model (elfi.GPyRegression) – Gaussian process model used to calculate the unnormalised approximate likelihood.

  • prior (scipy-like distribution) – Prior distribution.

  • quantile_eps (float, optional) – Quantile of the observed discrepancies used in setting the ABC threshold.

  • sampler (string, optional) – Name of the sampler (options: metropolis, nuts).

  • n_samples (int, optional) – Length of the sampler’s chain for obtaining the acquisitions.

  • warmup (int, optional) – Number of samples discarded as warmup. Defaults to n_samples/2.

  • limit_faulty_init (int, optional) – Limit for the iterations used to obtain the sampler’s initial points.

  • init_from_prior (bool, optional) – Controls whether the sampler’s initial points are sampled from the prior or a uniform distribution within model bounds. Defaults to model bounds.

  • sigma_proposals (dict, optional) – Standard deviations for Gaussian proposals of each parameter for Metropolis Markov Chain sampler. Defaults to 1/10 of surrogate model bound lengths.

acquire(n, t=None)[source]

Acquire a batch of acquisition points.

Parameters:
  • n (int) – Number of acquisitions.

  • t (int, optional) – Current iteration, (unused).

Returns:

Coordinates of the yielded acquisition points.

Return type:

array_like

evaluate(theta_new, t=None)

Evaluate the acquisition function at the location theta_new.

Parameters:
  • theta_new (array_like) – Evaluation coordinates.

  • t (int, optional) – Current iteration, (unused).

Returns:

Variance of the approximate posterior.

Return type:

array_like

evaluate_gradient(theta_new, t=None)

Evaluate the acquisition function’s gradient at the location theta_new.

Parameters:
  • theta_new (array_like) – Evaluation coordinates.

  • t (int, optional) – Current iteration, (unused).

Returns:

Gradient of the variance of the approximate posterior

Return type:

array_like

class elfi.methods.bo.acquisition.ExpIntVar(model, prior, quantile_eps=0.01, integration='grid', d_grid=0.2, n_samples_imp=100, iter_imp=2, sampler='nuts', n_samples=2000, sigma_proposals=None, **opts)[source]

The Expected Integrated Variance (ExpIntVar) acquisition method.

Essentially, we define a loss function that measures the overall uncertainty in the unnormalised ABC posterior over the parameter space. The value of the loss function depends on the next simulation and thus the next evaluation location \(\theta^*\) is chosen to minimise the expected loss.

\[\theta_{t+1} = \arg \min_{\theta^* \in \Theta} L_{1:t}(\theta^*),\]

where \(\Theta\) is the parameter space, and \(L\) is the expected loss function approximated as follows:

\[L_{1:t}(\theta^*) \approx 2 \sum_{i=1}^s \omega^i \cdot p^2(\theta^i) \cdot w_{1:t+1}(\theta^i, \theta^*),\]

where \(\omega^i\) is an importance weight, \(p^2(\theta^i)\) is the prior squared, and \(w_{1:t+1}(\theta^i, \theta^*)\) is the expected variance of the unnormalised ABC posterior at \(\theta^i\) after running the simulation model with parameter \(\theta^*\).
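Once the importance weights, prior values and expected variances are available, the approximate loss is a weighted sum and the acquisition picks the candidate that minimises it. The sketch below shows only that arithmetic; the function names are hypothetical, and computing the expected variances from the Gaussian process is outside its scope.

```python
def expected_loss_sketch(importance_weights, prior_values, expected_variances):
    """2 * sum_i omega^i * p(theta^i)**2 * w(theta^i, theta_star) for one candidate theta_star."""
    return 2.0 * sum(
        omega * p ** 2 * w
        for omega, p, w in zip(importance_weights, prior_values, expected_variances)
    )


def pick_next_point(candidates, losses):
    # The acquisition chooses the candidate with the smallest expected loss.
    best = min(range(len(candidates)), key=lambda i: losses[i])
    return candidates[best]
```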

References

Järvenpää et al. (2019). Efficient Acquisition Rules for Model-Based Approximate Bayesian Computation. Bayesian Analysis 14(2):595-622, 2019 https://projecteuclid.org/euclid.ba/1537258134

Initialise ExpIntVar.

Parameters:
  • model (elfi.GPyRegression) – Gaussian process model used to calculate the approximate unnormalised likelihood.

  • prior (scipy-like distribution) – Prior distribution.

  • quantile_eps (float, optional) – Quantile of the observed discrepancies used in setting the discrepancy threshold.

  • integration (str, optional) –

    Integration method. Options:
    • grid (points are taken uniformly): more accurate yet computationally expensive in high dimensions;
    • importance (points are taken based on the importance weight): less accurate though applicable in high dimensions.

  • d_grid (float, optional) – Grid tightness.

  • n_samples_imp (int, optional) – Number of importance samples.

  • iter_imp (int, optional) – Gap between acquisition iterations in performing importance sampling.

  • sampler (string, optional) – Sampler for generating random numbers from the proposal distribution for IS. (Options: metropolis, nuts.)

  • n_samples (int, optional) – Chain length for the sampler that generates the random numbers from the proposal distribution for IS.

  • sigma_proposals (dict, optional) – Standard deviations for Gaussian proposals of each parameter for Metropolis Markov Chain sampler. Defaults to 1/10 of surrogate model bound lengths.

acquire(n, t)[source]

Acquire a batch of acquisition points.

Parameters:
  • n (int) – Number of acquisitions.

  • t (int) – Current iteration.

Returns:

Coordinates of the yielded acquisition points.

Return type:

array_like

evaluate(theta_new, t=None)[source]

Evaluate the acquisition function at the location theta_new.

Parameters:
  • theta_new (array_like) – Evaluation coordinates.

  • t (int, optional) – Current iteration, (unused).

Returns:

Expected loss’s term dependent on theta_new.

Return type:

array_like

evaluate_gradient(theta_new, t=None)

Evaluate the acquisition function’s gradient at the location theta_new.

Parameters:
  • theta_new (array_like) – Evaluation coordinates.

  • t (int, optional) – Current iteration, (unused).

Returns:

Gradient of the variance of the approximate posterior

Return type:

array_like

class elfi.methods.bo.acquisition.UniformAcquisition(model, prior=None, n_inits=10, max_opt_iters=1000, noise_var=None, exploration_rate=10, seed=None, constraints=None)[source]

Acquisition from uniform distribution.

Initialize AcquisitionBase.

Parameters:
  • model (an object with attributes) –

    input_dim : int

    bounds : tuple of length ‘input_dim’ of tuples (min, max)

    and methods

    evaluate(x) : function that returns model (mean, var, std)

  • prior (scipy-like distribution, optional) – By default uniform distribution within model bounds.

  • n_inits (int, optional) – Number of initialization points in internal optimization.

  • max_opt_iters (int, optional) – Max iterations to optimize when finding the next point.

  • noise_var (float or np.array, optional) – Acquisition noise variance for adding noise to the points near the optimized location. If array, must be 1d specifying the variance for different dimensions. Default: no added noise.

  • exploration_rate (float, optional) – Exploration rate of the acquisition function (if supported).

  • seed (int, optional) – Seed for getting consistent acquisition results. Used in getting random starting locations in acquisition function optimization.

  • constraints ({Constraint, dict} or List of {Constraint, dict}, optional) – Additional model constraints.

acquire(n, t=None)[source]

Return random points from uniform distribution.

Parameters:
  • n (int) – Number of acquisition points to return.

  • t (int, optional) – (unused)

Returns:

x – The shape is (n, input_dim)

Return type:

np.ndarray
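
The idea of drawing acquisition points uniformly within the model bounds can be sketched in a few lines. This is only an illustration under stated assumptions: uniform_points is a hypothetical helper, not part of ELFI; it uses the standard library instead of NumPy and returns a list of n rows rather than an np.ndarray of shape (n, input_dim).

```python
import random


def uniform_points(bounds, n, seed=None):
    """Draw n points uniformly inside per-dimension (min, max) bounds.

    Hypothetical helper for illustration only; not an ELFI function.
    """
    rng = random.Random(seed)
    # One row per acquisition point, one column per input dimension.
    return [[rng.uniform(low, high) for low, high in bounds] for _ in range(n)]


# Assumed example bounds for a two-dimensional model.
bounds = ((0.0, 1.0), (-5.0, 5.0))
points = uniform_points(bounds, n=4, seed=0)
```

Fixing the seed makes the draws reproducible, which mirrors the purpose of the seed parameter described above.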

evaluate(x, t=None)

Evaluate the acquisition function at ‘x’.

Parameters:
  • x (numpy.array) –

  • t (int) – Current iteration (starting from 0).

evaluate_gradient(x, t=None)

Evaluate the gradient of acquisition function at ‘x’.

Parameters:
  • x (numpy.array) –

  • t (int) – Current iteration (starting from 0).

Model selection

elfi.compare_models(sample_objs, model_priors=None)[source]

Find posterior probabilities for different models.

The algorithm requires elfi.Sample objects from prerun inference methods. For example, the output of elfi.Rejection.sample is valid. The proportion of each model's samples among the top discrepancies is adjusted by that model's acceptance ratio and prior probability.

The discrepancies (including summary statistics) must be comparable so that it is meaningful to sort them!

Parameters:
  • sample_objs (list of elfi.Sample) – Resulting Sample objects from prerun inference models. The objects must include a valid discrepancies attribute.

  • model_priors (array_like, optional) – Prior probability of each model. Defaults to 1 / n_models.

Returns:

Posterior probabilities for the considered models.

Return type:

np.array
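
The adjustment described above can be sketched as follows. This is one plausible reading of the description, not ELFI's actual implementation: posterior_model_probabilities is a hypothetical function that takes plain lists instead of elfi.Sample objects, and it assumes that the acceptance-ratio adjustment amounts to dividing each model's count among the top discrepancies by the number of simulations that model ran.

```python
def posterior_model_probabilities(discrepancies, n_sims, model_priors=None):
    """Estimate posterior model probabilities from accepted discrepancies.

    Hypothetical sketch: `discrepancies` holds one list of accepted
    discrepancies per model, `n_sims` the simulations each model ran.
    """
    n_models = len(discrepancies)
    if model_priors is None:
        model_priors = [1.0 / n_models] * n_models

    # Keep as many pooled samples as the smallest per-model sample size.
    n_top = min(len(d) for d in discrepancies)
    pooled = sorted((d, m) for m, ds in enumerate(discrepancies) for d in ds)
    top = pooled[:n_top]

    weights = []
    for m in range(n_models):
        count = sum(1 for _, model in top if model == m)
        # Adjust by simulation effort and by the prior model probability.
        weights.append(count / n_sims[m] * model_priors[m])

    total = sum(weights)
    return [w / total for w in weights]


probs = posterior_model_probabilities(
    discrepancies=[[0.1, 0.2, 0.3, 0.9], [0.4, 0.5, 0.6, 0.7]],
    n_sims=[100, 100],
)
```

In this toy input, three of the four smallest pooled discrepancies belong to the first model, so it receives most of the posterior mass. The sorting step is why the discrepancies must be comparable across models.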

Other

Data pools

class elfi.OutputPool(outputs=None, name=None, prefix=None)[source]

Store node outputs to dictionary-like stores.

The default store is a Python dictionary.

Notes

Saving the pool requires that all the stores are pickleable.

Arbitrary objects that support simple array indexing can be used as stores by using the elfi.store.ArrayObjectStore class.

See the elfi.store.StoreBase interface if you wish to implement your own ELFI-compatible store. Basically, any object that implements Python's dictionary API will work as a store in the pool.

Initialize OutputPool.

Depending on the algorithm, some of the stored outputs may be reused after making changes to the ElfiModel, which can speed up the inference significantly. For instance, if all the simulations are stored in Rejection sampling, one can change the summaries and distances without having to rerun the simulator.

Parameters:
  • outputs (list, dict, optional) – List of node names to store, or a dictionary of existing stores. The stores are created on demand.

  • name (str, optional) – Name of the pool. Used to open a saved pool from disk.

  • prefix (str, optional) – Path to directory under which elfi.ArrayPool will place its folder. Default is a relative path ./pools.

Returns:

instance

Return type:

OutputPool
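
To make the reuse idea concrete, here is a toy stand-in for a pool that keeps one dictionary store per node, keyed by batch index. ToyPool, expensive_simulator and the node name "sim" are hypothetical names for illustration; the class is not ELFI's OutputPool and only loosely mirrors the add_batch and get_batch methods documented below.

```python
class ToyPool:
    """Toy pool: one plain dict store per node name, keyed by batch index."""

    def __init__(self, outputs):
        self.stores = {name: {} for name in outputs}

    def add_batch(self, batch, batch_index):
        # Store only the outputs that this pool was asked to keep.
        for name, store in self.stores.items():
            if name in batch:
                store[batch_index] = batch[name]

    def get_batch(self, batch_index, output_names=None):
        names = output_names or list(self.stores)
        return {n: self.stores[n][batch_index]
                for n in names if batch_index in self.stores[n]}


simulator_calls = 0


def expensive_simulator(theta):
    """Hypothetical simulator; counts how often it is actually run."""
    global simulator_calls
    simulator_calls += 1
    return [theta + offset for offset in (0.0, 1.0, 2.0)]


pool = ToyPool(outputs=["sim"])
pool.add_batch({"sim": expensive_simulator(1.0)}, batch_index=0)

# Two different summaries computed from the stored data, no new simulation.
stored = pool.get_batch(0)["sim"]
mean_summary = sum(stored) / len(stored)
max_summary = max(stored)
```

Both summaries come from the same stored batch, so the simulator runs only once.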

add_batch(batch, batch_index)[source]

Add the outputs from the batch to their stores.

add_store(node, store=None)[source]

Add a store object for the node.

Parameters:
  • node (str) –

  • store (dict, StoreBase, optional) –

clear()[source]

Remove all data from the stores.

close()[source]

Save and close the stores that support it.

The pool will not be usable afterwards.

delete()[source]

Remove all persisted data from disk.

flush()[source]

Flush all data from the stores.

If the store does not support flushing, do nothing.

get_batch(batch_index, output_names=None)[source]

Return a batch from the stores of the pool.

Parameters:
  • batch_index (int) –

  • output_names (list) – Which outputs to include in the batch.

Returns:

batch

Return type:

dict

get_store(node)[source]

Return the store for node.

property has_context

Check if current pool has context information.

has_store(node)[source]

Check if node is in stores.

classmethod open(name, prefix=None)[source]

Open a closed or saved ArrayPool from disk.

Parameters:
  • name (str) –

  • prefix (str, optional) –

Return type:

ArrayPool

property output_names

Return a list of stored names.

property path

Return the path to the pool.

remove_batch(batch_index)[source]

Remove the batch from all stores.

remove_store(node)[source]

Remove and return a store from the pool.

Parameters:

node (str) –

Returns:

The removed store

Return type:

store

save()[source]

Save the pool to disk.

This will use pickle to store the pool under self.path.

set_context(context)[source]

Set the context of the pool.

The pool needs to know the batch_size and the seed.

Notes

Also sets the name of the pool if not set already.

Parameters:

context (elfi.ComputationContext) –

class elfi.ArrayPool(outputs=None, name=None, prefix=None)[source]

OutputPool that uses binary .npy files as default stores.

The default store medium for output data is a NumPy binary .npy file for NumPy array data. You can, however, also add other types of stores.

Notes

The default store is implemented in elfi.store.NpyStore, which uses NpyArrays as stores. NpyArray is a wrapper over a NumPy .npy binary file for array data and supports appending to the .npy file. It uses .npy format 2.0 files.

Initialize OutputPool.

Depending on the algorithm, some of the stored outputs may be reused after making changes to the ElfiModel, which can speed up the inference significantly. For instance, if all the simulations are stored in Rejection sampling, one can change the summaries and distances without having to rerun the simulator.

Parameters:
  • outputs (list, dict, optional) – List of node names to store, or a dictionary of existing stores. The stores are created on demand.

  • name (str, optional) – Name of the pool. Used to open a saved pool from disk.

  • prefix (str, optional) – Path to directory under which elfi.ArrayPool will place its folder. Default is a relative path ./pools.

Returns:

instance

Return type:

OutputPool

add_batch(batch, batch_index)

Add the outputs from the batch to their stores.

add_store(node, store=None)

Add a store object for the node.

Parameters:
  • node (str) –

  • store (dict, StoreBase, optional) –

clear()

Remove all data from the stores.

close()

Save and close the stores that support it.

The pool will not be usable afterwards.

delete()

Remove all persisted data from disk.

flush()

Flush all data from the stores.

If the store does not support flushing, do nothing.

get_batch(batch_index, output_names=None)

Return a batch from the stores of the pool.

Parameters:
  • batch_index (int) –

  • output_names (list) – Which outputs to include in the batch.

Returns:

batch

Return type:

dict

get_store(node)

Return the store for node.

property has_context

Check if current pool has context information.

has_store(node)

Check if node is in stores.

classmethod open(name, prefix=None)

Open a closed or saved ArrayPool from disk.

Parameters:
  • name (str) –

  • prefix (str, optional) –

Return type:

ArrayPool

property output_names

Return a list of stored names.

property path

Return the path to the pool.

remove_batch(batch_index)

Remove the batch from all stores.

remove_store(node)

Remove and return a store from the pool.

Parameters:

node (str) –

Returns:

The removed store

Return type:

store

save()

Save the pool to disk.

This will use pickle to store the pool under self.path.

set_context(context)

Set the context of the pool.

The pool needs to know the batch_size and the seed.

Notes

Also sets the name of the pool if not set already.

Parameters:

context (elfi.ComputationContext) –

Module functions

elfi.get_client()[source]

Get the current ELFI client instance.

elfi.set_client(client=None, **kwargs)[source]

Set the current ELFI client instance.

Parameters:

client (ClientBase or str) – Instance of a client from ClientBase, or a string from [‘native’, ‘multiprocessing’, ‘ipyparallel’]. If string, the respective constructor is called with kwargs.

Tools

tools.vectorize(operation, constants=None, dtype=None)

Vectorize an operation.

Helper for cases when you have an operation that does not support vector arguments. This tool is still experimental and may not work in all cases.

Parameters:
  • operation (callable) – Operation to vectorize.

  • constants (tuple, list, optional) – A mask for constants in inputs, e.g. (0, 2) would indicate that the first and third positional inputs are constants. The constants will be passed as they are to each operation call.

  • dtype (np.dtype, bool[False], optional) – If None, NumPy converts a list of outputs automatically. In some cases this produces undesired results. If you wish to keep the outputs as they are with no conversion, specify dtype=False. This results in a 1-d object NumPy array with the outputs as they were returned.

Notes

This is a convenience method that uses a for loop internally for the vectorization. For best performance, one should aim to implement vectorized operations (by using e.g. numpy functions that are mostly vectorized) if at all possible.
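
The for-loop approach and the constants mask can be sketched as below. This is a simplified illustration, not the ELFI implementation: loop_vectorize and scalar_simulator are hypothetical names, the batch size is inferred from the first non-constant input, and outputs are returned as a plain list with no dtype conversion.

```python
def loop_vectorize(operation, constants=()):
    """Wrap a scalar operation so that it accepts batched inputs.

    Hypothetical sketch: positions listed in `constants` are passed
    unchanged to every call, all other inputs are indexed per run.
    """
    constants = set(constants)

    def vectorized(*inputs):
        batched = [inp for i, inp in enumerate(inputs) if i not in constants]
        if not batched:
            raise ValueError("at least one input must be batched")
        batch_size = len(batched[0])

        outputs = []
        for run in range(batch_size):
            # Pick the run-th element of batched inputs, keep constants as is.
            args = [inp if i in constants else inp[run]
                    for i, inp in enumerate(inputs)]
            outputs.append(operation(*args))
        return outputs

    return vectorized


def scalar_simulator(mu, scale, shift):
    """Hypothetical simulator that only understands scalar arguments."""
    return mu * scale + shift


# The second and third positional inputs (indices 1 and 2) are constants.
vectorized = loop_vectorize(scalar_simulator, constants=(1, 2))
result = vectorized([1.0, 2.0, 3.0], 10.0, 0.5)
```

The loop calls the operation once per element of the batch, which is why a natively vectorized implementation is usually faster.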

Examples

# This form works in most cases
vectorized_simulator = elfi.tools.vectorize(simulator)

# Tell the vectorizer that the second and third arguments to the simulator are constants
vectorized_simulator = elfi.tools.vectorize(simulator, [1, 2])
elfi.Simulator(vectorized_simulator, prior, constant_1, constant_2)

# Tell the vectorizer that it should not do any conversion to the outputs
vectorized_simulator = elfi.tools.vectorize(simulator, dtype=False)

tools.external_operation(command, process_result=None, prepare_inputs=None, sep=' ', stdout=True, subprocess_kwargs=None)

Wrap an external command as a Python callable (function).

The external command can be e.g. a shell script, or an executable file.

Parameters:
  • command (str) – Command to execute. Arguments can be passed to the executable by using Python’s format strings, e.g. “myscript.sh {0} {batch_size} --seed {seed}”. The command is expected to write to stdout. Since random_state is a Python-specific object, a seed keyword argument will be available to operations that use random_state.

  • process_result (callable, np.dtype, str, optional) – Callable result handler with a signature output = callable(result, *inputs, **kwinputs). Here the result is either the stdout or subprocess.CompletedProcess depending on the stdout flag below. The inputs and kwinputs will come from ELFI. The default handler converts the stdout to a NumPy array with array = np.fromstring(stdout, sep=sep). If process_result is an np.dtype or a string, the stdout data is cast to that type with stdout = np.fromstring(stdout, sep=sep, dtype=process_result).

  • prepare_inputs (callable, optional) – Callable with a signature inputs, kwinputs = callable(*inputs, **kwinputs). The inputs will come from ELFI.

  • sep (str, optional) – Separator to use with the default process_result handler. Default is a space ‘ ‘. If you specify your own callable to process_result, this value has no effect.

  • stdout (bool, optional) – Pass the process_result handler the stdout instead of the subprocess.CompletedProcess instance. Default is True.

  • subprocess_kwargs (dict, optional) – Options for Python’s subprocess.run that is used to run the external command. Defaults are shell=True, check=True. See the subprocess documentation for more details.
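
The way the parameters above fit together can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not ELFI's implementation: wrap_command is a hypothetical helper, it parses stdout into a list of floats instead of a NumPy array, and it assumes a shell that provides echo.

```python
import subprocess


def wrap_command(command, sep=" "):
    """Turn a shell command template into a Python callable.

    Hypothetical helper: fills the template with str.format, runs it
    with subprocess.run and parses the separated numbers from stdout.
    """

    def operation(*inputs, **kwinputs):
        # Positional inputs fill {0}, {1}, ...; keywords fill named fields.
        filled = command.format(*inputs, **kwinputs)
        completed = subprocess.run(
            filled, shell=True, check=True,
            stdout=subprocess.PIPE, text=True,
        )
        tokens = completed.stdout.strip().split(sep)
        return [float(token) for token in tokens if token]

    return operation


# Assumed example command; prints "1", the first input and the seed.
op = wrap_command("echo 1 {0} {seed}")
values = op(123, seed=7)
```

Passing check=True makes a failing command raise subprocess.CalledProcessError instead of silently returning unusable output.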

Examples

>>> import elfi
>>> op = elfi.tools.external_operation('echo 1 {0}', process_result='int8')
>>>
>>> constant = elfi.Constant(123)
>>> simulator = elfi.Simulator(op, constant)
>>> simulator.generate()
array([  1, 123], dtype=int8)

Returns:

operation – ELFI compatible operation that can be used e.g. as a simulator.

Return type:

callable