API

This file describes the classes and methods available in ELFI.

Modelling API

Below is the API for creating generative models.

elfi.ElfiModel([name, observed, source_net]) A generative model for LFI

General model nodes

elfi.Constant(value, **kwargs) A node holding a constant value.
elfi.Operation(fn, *parents, **kwargs) A generic deterministic operation node.
elfi.RandomVariable(distribution, *params[, ...]) A node that draws values from a random distribution.

LFI nodes

elfi.Prior(distribution, *params[, size]) A parameter node of a generative model.
elfi.Simulator(fn, *params, **kwargs) A simulator node of a generative model.
elfi.Summary(fn, *parents, **kwargs) A summary node of a generative model.
elfi.Discrepancy(discrepancy, *parents, **kwargs) A discrepancy node of a generative model.
elfi.Distance(distance, *summaries[, p, w, ...]) A distance node of a generative model.
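
To give an idea of how these node types compose, below is a minimal sketch of a complete model (the simulator, summary and observed data are illustrative toy examples, not part of ELFI):

import elfi
import numpy as np

def toy_simulator(mu, batch_size=1, random_state=None):
    # mu arrives as a vector of length batch_size; return one row of data per batch member
    random_state = random_state or np.random
    mu = np.atleast_1d(mu)
    return random_state.normal(mu[:, None], 1, size=(batch_size, 10))

y_obs = toy_simulator(2.0)                          # stand-in for real observed data

mu = elfi.Prior('uniform', 0, 5)
sim = elfi.Simulator(toy_simulator, mu, observed=y_obs)
S1 = elfi.Summary(lambda y: np.mean(y, axis=1), sim)
d = elfi.Distance('euclidean', S1)

d.generate(batch_size=3)                            # quick check that the model runs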

Other

elfi.new_model([name, set_current]) Create a new ElfiModel instance.
elfi.get_current_model() Return the current default elfi.ElfiModel instance.
elfi.set_current_model([model]) Set the current default elfi.ElfiModel instance.
elfi.draw(G[, internal, param_names, ...]) Draw the ElfiModel.

Inference API

Below is a list of inference methods included in ELFI.

elfi.Rejection(model[, discrepancy_name, ...]) Parallel ABC rejection sampler.
elfi.SMC(model[, discrepancy_name, output_names]) Sequential Monte Carlo ABC sampler
elfi.BayesianOptimization(model[, ...]) Bayesian Optimization of an unknown target function.
elfi.BOLFI(model[, target_name, bounds, ...]) Bayesian Optimization for Likelihood-Free Inference (BOLFI).

Result objects

OptimizationResult(x_min, **kwargs) Result object for optimization; x_min holds the optimized parameters.
Sample(method_name, outputs, parameter_names) Sampling results from the methods.
SmcSample(method_name, outputs, ...) Container for results from SMC-ABC.
BolfiSample(method_name, chains, ...) Container for results from BOLFI.

Post-processing

elfi.adjust_posterior(sample, model, ...[, ...]) Adjust the posterior using local regression.
LinearAdjustment(**kwargs) Regression adjustment using a local linear model.

Other

Data pools

elfi.OutputPool([outputs, name, prefix]) Store node outputs to dictionary-like stores.
elfi.ArrayPool([outputs, name, prefix]) OutputPool that uses binary .npy files as default stores.

Module functions

elfi.get_client() Get the current ELFI client instance.
elfi.set_client([client]) Set the current ELFI client instance.

Tools

elfi.tools.vectorize(operation[, constants, ...]) Vectorizes an operation.
elfi.tools.external_operation(command[, ...]) Wrap an external command as a Python callable (function).

Class documentations

Modelling API classes

class elfi.ElfiModel(name=None, observed=None, source_net=None)[source]

A generative model for LFI

Parameters:
  • name (str, optional) –
  • observed (dict, optional) – Observed data with node names as keys.
  • source_net (nx.DiGraph, optional) –
  • set_current (bool, optional) – Sets this model as the current ELFI model
copy()[source]

Return a copy of the ElfiModel instance

Returns:
Return type:ElfiModel
generate(batch_size=1, outputs=None, with_values=None)[source]

Generates a batch of outputs using the global seed.

This method is useful for testing that the generative model works.

Parameters:
  • batch_size (int) –
  • outputs (list) –
  • with_values (dict) – You can specify values for nodes to use when generating data
get_reference(name)[source]

Returns a new reference object for a node in the model.

get_state(name)[source]

Return the state of the node.

name

Name of the model

observed

The observed data for the nodes in a dictionary.

parameter_names

A list of model parameter names in alphabetical order.

remove_node(name)[source]

Remove a node from the graph

Parameters:name (str) –
update_node(name, updating_name)[source]

Updates the node named name with the node named updating_name.

The node named name receives the state (operation), parents and observed data (if applicable) of the node named updating_name. The updating node is then removed from the graph.

Parameters:
  • name (str) –
  • updating_name (str) –
class elfi.Constant(value, **kwargs)[source]

A node holding a constant value.

become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:other_node (NodeReference) –
generate(batch_size=1, with_values=None)

Generates output from this node.

Useful for testing.

Parameters:
  • batch_size (int) –
  • with_values (dict) –
parents

Get all the positional parent nodes (inputs) of this node

Returns:parents – List of positional parents
Return type:list
reference(name, model)

Constructor for creating a reference for an existing node in the model

Parameters:
  • name (string) – name of the node
  • model (ElfiModel) –
Returns:

Return type:

NodePointer instance

state

State dictionary of the node

class elfi.Operation(fn, *parents, **kwargs)[source]

A generic deterministic operation node.

become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:other_node (NodeReference) –
generate(batch_size=1, with_values=None)

Generates output from this node.

Useful for testing.

Parameters:
  • batch_size (int) –
  • with_values (dict) –
parents

Get all the positional parent nodes (inputs) of this node

Returns:parents – List of positional parents
Return type:list
reference(name, model)

Constructor for creating a reference for an existing node in the model

Parameters:
  • name (string) – name of the node
  • model (ElfiModel) –
Returns:

Return type:

NodePointer instance

state

State dictionary of the node

class elfi.RandomVariable(distribution, *params, size=None, **kwargs)[source]

A node that draws values from a random distribution.

Parameters:
  • distribution (str or scipy-like distribution object) –
  • params (params of the distribution) –
  • size (int, tuple or None, optional) – Output size of a single random draw.
become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:other_node (NodeReference) –
distribution

Returns the distribution object.

generate(batch_size=1, with_values=None)

Generates output from this node.

Useful for testing.

Parameters:
  • batch_size (int) –
  • with_values (dict) –
parents

Get all the positional parent nodes (inputs) of this node

Returns:parents – List of positional parents
Return type:list
reference(name, model)

Constructor for creating a reference for an existing node in the model

Parameters:
  • name (string) – name of the node
  • model (ElfiModel) –
Returns:

Return type:

NodePointer instance

size

Returns the size of the output from the distribution.

state

State dictionary of the node

class elfi.Prior(distribution, *params, size=None, **kwargs)[source]

A parameter node of a generative model.

Parameters:
  • distribution (str, object) – Any distribution from scipy.stats, either as a string or an object. Objects must implement at least an rvs method with signature rvs(*parameters, size, random_state). Can also be a custom distribution object that implements at least an rvs method. Many of the algorithms also require the pdf and logpdf methods to be available.
  • size (int, tuple or None, optional) – Output size of a single random draw.
  • params – Parameters of the prior distribution
  • kwargs

Notes

The parameters of the scipy distributions (typically loc and scale) must be given as positional arguments.

Many algorithms (e.g. SMC) also require a pdf method for the distribution. In general the definition of the distribution is a subset of scipy.stats.rv_continuous.

Scipy distributions: https://docs.scipy.org/doc/scipy-0.19.0/reference/stats.html
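
As a sketch of the custom distribution option (the class and its parameterization are hypothetical; only rvs is strictly required, while pdf/logpdf are needed by many algorithms):

import numpy as np
import elfi

class CustomUniform:
    # rvs follows the required signature rvs(*parameters, size, random_state)
    def rvs(self, a, b, size=1, random_state=None):
        random_state = random_state or np.random
        return random_state.uniform(a, b, size=size)

    def pdf(self, x, a, b):
        # vectorized density of a uniform distribution on [a, b]
        return ((x >= a) & (x <= b)) / (b - a)

t = elfi.Prior(CustomUniform(), 2, 5)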

become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:other_node (NodeReference) –
distribution

Returns the distribution object.

generate(batch_size=1, with_values=None)

Generates output from this node.

Useful for testing.

Parameters:
  • batch_size (int) –
  • with_values (dict) –
parents

Get all the positional parent nodes (inputs) of this node

Returns:parents – List of positional parents
Return type:list
reference(name, model)

Constructor for creating a reference for an existing node in the model

Parameters:
  • name (string) – name of the node
  • model (ElfiModel) –
Returns:

Return type:

NodePointer instance

size

Returns the size of the output from the distribution.

state

State dictionary of the node

class elfi.Simulator(fn, *params, **kwargs)[source]

A simulator node of a generative model.

Simulator nodes are stochastic and may have observed data in the model.

Parameters:
  • fn (callable) – Simulator function with a signature sim(*params, batch_size, random_state)
  • params – Input parameters for the simulator.
  • kwargs
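
A sketch of the expected callable contract (the function and the surrounding nodes mu, sigma and y_obs are hypothetical): the parameters arrive as vectors of length batch_size, and one row of simulated data should be returned per batch member.

import numpy as np

def gauss_simulator(mu, sigma, batch_size=1, random_state=None):
    mu, sigma = np.atleast_1d(mu), np.atleast_1d(sigma)
    random_state = random_state or np.random
    return random_state.normal(mu[:, None], sigma[:, None], size=(batch_size, 30))

# mu and sigma are Prior nodes, y_obs is the observed data
sim = elfi.Simulator(gauss_simulator, mu, sigma, observed=y_obs)

A simulator that is not vectorized over the batch can alternatively be wrapped with elfi.tools.vectorize (see Tools below).
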
become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:other_node (NodeReference) –
generate(batch_size=1, with_values=None)

Generates output from this node.

Useful for testing.

Parameters:
  • batch_size (int) –
  • with_values (dict) –
parents

Get all the positional parent nodes (inputs) of this node

Returns:parents – List of positional parents
Return type:list
reference(name, model)

Constructor for creating a reference for an existing node in the model

Parameters:
  • name (string) – name of the node
  • model (ElfiModel) –
Returns:

Return type:

NodePointer instance

state

State dictionary of the node

class elfi.Summary(fn, *parents, **kwargs)[source]

A summary node of a generative model.

Summary nodes are deterministic operations associated with the observed data. If their parents hold observed data, it will be automatically transformed.

Parameters:
  • fn (callable) – Summary function with a signature summary(*parents)
  • parents – Input data for the summary function.
  • kwargs
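
For illustration, a summary callable simply maps the batched parent output to one value (or vector) per row. The autocovariance sketch below is hypothetical and assumes sim is a simulator node defined earlier:

import numpy as np

def autocov(y, lag=1):
    # y has one simulated data set per row
    return np.mean(y[:, lag:] * y[:, :-lag], axis=1)

S1 = elfi.Summary(autocov, sim)       # uses the default lag=1
S2 = elfi.Summary(autocov, sim, 2)    # extra positional argument sets lag=2
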
become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:other_node (NodeReference) –
generate(batch_size=1, with_values=None)

Generates output from this node.

Useful for testing.

Parameters:
  • batch_size (int) –
  • with_values (dict) –
parents

Get all the positional parent nodes (inputs) of this node

Returns:parents – List of positional parents
Return type:list
reference(name, model)

Constructor for creating a reference for an existing node in the model

Parameters:
  • name (string) – name of the node
  • model (ElfiModel) –
Returns:

Return type:

NodePointer instance

state

State dictionary of the node

class elfi.Discrepancy(discrepancy, *parents, **kwargs)[source]

A discrepancy node of a generative model.

This class provides a convenience node for custom distance operations.

Parameters:
  • discrepancy (callable) – Signature of the discrepancy function is of the form: discrepancy(summary_1, summary_2, ..., observed), where summaries are arrays containing batch_size simulated values and observed is a tuple (observed_summary_1, observed_summary_2, ...). The callable object should return a vector of discrepancies between the simulated summaries and the observed summaries.
  • *parents – Typically the summaries for the discrepancy function.
  • **kwargs

See also

elfi.Distance
creating common distance discrepancies.
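
An illustrative sketch of a discrepancy callable following the signature above (S1 and S2 are assumed to be summary nodes that return one scalar per simulation):

import numpy as np

def euclidean_discrepancy(*simulated, observed):
    simulated = np.column_stack(simulated)   # shape (batch_size, n_summaries)
    observed = np.column_stack(observed)     # shape (1, n_summaries)
    return np.sqrt(np.sum((simulated - observed) ** 2, axis=1))

d = elfi.Discrepancy(euclidean_discrepancy, S1, S2)
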
become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:other_node (NodeReference) –
generate(batch_size=1, with_values=None)

Generates output from this node.

Useful for testing.

Parameters:
  • batch_size (int) –
  • with_values (dict) –
parents

Get all the positional parent nodes (inputs) of this node

Returns:parents – List of positional parents
Return type:list
reference(name, model)

Constructor for creating a reference for an existing node in the model

Parameters:
  • name (string) – name of the node
  • model (ElfiModel) –
Returns:

Return type:

NodePointer instance

state

State dictionary of the node

class elfi.Distance(distance, *summaries, p=None, w=None, V=None, VI=None, **kwargs)[source]

A distance node of a generative model.

This class contains many common distance implementations through scipy.

Parameters:
  • distance (str, callable) –

    If a string, it must be a valid metric from scipy.spatial.distance.cdist.

    If a callable, the signature must be distance(X, Y), where X is an n x m array containing the n simulated values (summaries) as rows and Y is a 1 x m array containing the observed values (summaries). The callable should return a vector of distances between the simulated summaries and the observed summaries.

  • summaries – summary nodes of the model
  • p (double, optional) – The p-norm to apply. Only for the Minkowski distance (‘minkowski’), weighted and unweighted. Default: 2.
  • w (ndarray, optional) – The weight vector. Only for weighted Minkowski (‘wminkowski’). Mandatory.
  • V (ndarray, optional) – The variance vector. Only for standardized Euclidean (‘seuclidean’). Mandatory.
  • VI (ndarray, optional) – The inverse of the covariance matrix. Only for Mahalanobis. Mandatory.

Examples

>>> d = elfi.Distance('euclidean', summary1, summary2...) 
>>> d = elfi.Distance('minkowski', summary, p=1) 
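
A custom callable can also be used; an illustrative sketch operating on the stacked summary arrays described in the Notes below (summary1 and summary2 are assumed summary nodes):

import numpy as np

def l1_distance(X, Y):
    # X: (n, m) simulated summaries, Y: (1, m) observed summaries
    return np.sum(np.abs(X - Y), axis=1)

d = elfi.Distance(l1_distance, summary1, summary2)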

Notes

Your summaries need to be scalars or vectors for this method to work. The summaries are first stacked into a single 2D array with one row per simulation, and the distance is then computed row-wise against the corresponding observed summary vector.

Scipy distances: https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html

See also

elfi.Discrepancy
A general discrepancy node
become(other_node)

Make this node become the other_node.

The children of this node will be preserved.

Parameters:other_node (NodeReference) –
generate(batch_size=1, with_values=None)

Generates output from this node.

Useful for testing.

Parameters:
  • batch_size (int) –
  • with_values (dict) –
parents

Get all the positional parent nodes (inputs) of this node

Returns:parents – List of positional parents
Return type:list
reference(name, model)

Constructor for creating a reference for an existing node in the model

Parameters:
  • name (string) – name of the node
  • model (ElfiModel) –
Returns:

Return type:

NodePointer instance

state

State dictionary of the node

Other

elfi.new_model(name=None, set_current=True)

Create a new ElfiModel instance and optionally set it as the current model.

elfi.get_current_model()

Return the current default elfi.ElfiModel instance.

New nodes will be added to this model by default.

elfi.set_current_model(model=None)

Set the current default elfi.ElfiModel instance.

visualization.nx_draw(G, internal=False, param_names=False, filename=None, format=None)

Draw the ElfiModel.

Parameters:
  • G (nx.DiGraph or ElfiModel) – Graph or model to draw
  • internal (boolean, optional) – Whether to draw internal nodes (starting with an underscore)
  • param_names (bool, optional) – Show param names on edges
  • filename (str, optional) – If given, save the dot file into the given filename.
  • format (str, optional) – format of the file

Notes

Requires the optional ‘graphviz’ library.

Returns:A GraphViz dot representation of the model.
Return type:dot

Inference API classes

class elfi.Rejection(model, discrepancy_name=None, output_names=None, **kwargs)[source]

Parallel ABC rejection sampler.

For a description of the rejection sampler and a general introduction to ABC, see e.g. Lintusaari et al. 2016.

References

Lintusaari J, Gutmann M U, Dutta R, Kaski S, Corander J (2016). Fundamentals and Recent Developments in Approximate Bayesian Computation. Systematic Biology. http://dx.doi.org/10.1093/sysbio/syw077.

Parameters:
  • model (ElfiModel or NodeReference) –
  • discrepancy_name (str, NodeReference, optional) – Only needed if model is an ElfiModel
  • output_names (list) – Additional outputs from the model to be included in the inference result, e.g. corresponding summaries to the acquired samples
  • kwargs – See InferenceMethod
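
A usage sketch (assuming an ElfiModel m with a discrepancy node named 'd' and a summary node 'S1'; the values are illustrative):

rej = elfi.Rejection(m['d'], output_names=['S1'], batch_size=10000)
result = rej.sample(1000, quantile=0.01)   # or threshold=..., or n_sim=...
result.summary()
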
batch_size

Return the current batch_size.

extract_result()[source]

Extracts the result from the current state

Returns:result
Return type:Sample
infer(*args, vis=None, **kwargs)

Set the objective and start the iterate loop until the inference is finished.

See the other arguments from the set_objective method.

Returns:result
Return type:Sample
iterate()

Forward the inference one iteration.

This is a way to manually progress the inference. One iteration consists of waiting and processing the result of the next batch in succession and possibly submitting new batches.

Notes

If the next batch is ready, it will be processed immediately and no new batches are submitted.

New batches are submitted only while waiting for the next one to complete. There will never be more batches submitted in parallel than the max_parallel_batches setting allows.

Returns:
Return type:None
parameter_names

Return the parameters to be inferred.

plot_state(**options)[source]

Plot the current state of the inference algorithm.

This feature is still experimental and only supports 1d or 2d cases.

pool

Return the output pool of the inference.

prepare_new_batch(batch_index)

Prepare values for a new batch

ELFI calls this method before submitting a new batch with an increasing index batch_index. This is an optional method to override. Use this if you need to do preparations before a batch is submitted; for example, in the Bayesian optimization algorithm the next acquisition points would be acquired here.

If you need to provide values for certain nodes, you can do so by constructing a batch dictionary and returning it. See e.g. BayesianOptimization for an example.

Parameters:batch_index (int) – next batch_index to be submitted
Returns:batch – Keys should match to node names in the model. These values will override any default values or operations in those nodes.
Return type:dict or None
sample(n_samples, *args, **kwargs)

Sample from the approximate posterior

See the other arguments from the set_objective method.

Parameters:
  • n_samples (int) – Number of samples to generate from the (approximate) posterior
  • *args
  • **kwargs
Returns:

result

Return type:

Sample

seed

Return the seed of the inference.

set_objective(n_samples, threshold=None, quantile=None, n_sim=None)[source]
Parameters:
  • n_samples (int) – number of samples to generate
  • threshold (float) – Acceptance threshold
  • quantile (float) – Must be in the interval (0, 1). Defines the threshold as the given quantile of all the simulated discrepancies; n_sim = n_samples/quantile.
  • n_sim (int) – Total number of simulations. The threshold will be the n_samples smallest discrepancy among n_sim simulations.
class elfi.SMC(model, discrepancy_name=None, output_names=None, **kwargs)[source]

Sequential Monte Carlo ABC sampler
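
A usage sketch (the node name 'd' and the schedule of decreasing thresholds, one per population, are illustrative):

smc = elfi.SMC(m['d'], batch_size=10000)
result = smc.sample(1000, [0.7, 0.2, 0.05])   # threshold schedule for the populations
result.summary()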

batch_size

Return the current batch_size.

extract_result()[source]
Returns:
Return type:SmcSample
infer(*args, vis=None, **kwargs)

Set the objective and start the iterate loop until the inference is finished.

See the other arguments from the set_objective method.

Returns:result
Return type:Sample
iterate()

Forward the inference one iteration.

This is a way to manually progress the inference. One iteration consists of waiting and processing the result of the next batch in succession and possibly submitting new batches.

Notes

If the next batch is ready, it will be processed immediately and no new batches are submitted.

New batches are submitted only while waiting for the next one to complete. There will never be more batches submitted in parallel than the max_parallel_batches setting allows.

Returns:
Return type:None
parameter_names

Return the parameters to be inferred.

plot_state(**kwargs)

Plot the current state of the algorithm.

Parameters:
  • axes (matplotlib.axes.Axes (optional)) –
  • figure (matplotlib.figure.Figure (optional)) –
  • xlim – x-axis limits
  • ylim – y-axis limits
  • interactive (bool (default False)) – If true, uses IPython.display to update the cell figure
  • close – Close figure in the end of plotting. Used in the end of interactive mode.
Returns:

Return type:

None

pool

Return the output pool of the inference.

sample(n_samples, *args, **kwargs)

Sample from the approximate posterior

See the other arguments from the set_objective method.

Parameters:
  • n_samples (int) – Number of samples to generate from the (approximate) posterior
  • *args
  • **kwargs
Returns:

result

Return type:

Sample

seed

Return the seed of the inference.

class elfi.BayesianOptimization(model, target_name=None, bounds=None, initial_evidence=None, update_interval=10, target_model=None, acquisition_method=None, acq_noise_var=0, exploration_rate=10, batch_size=1, batches_per_acquisition=None, async=False, **kwargs)[source]

Bayesian Optimization of an unknown target function.

Parameters:
  • model (ElfiModel or NodeReference) –
  • target_name (str or NodeReference) – Only needed if model is an ElfiModel
  • bounds (dict) – The region in which to estimate the posterior for each parameter in model.parameters: dict(‘parameter_name’: (lower, upper), ...). Not used if a custom target_model is given.
  • initial_evidence (int, dict, optional) – Number of initial evidence points, or a precomputed batch dict containing parameter and discrepancy values. Default value depends on the dimensionality.
  • update_interval (int) – How often to update the GP hyperparameters of the target_model
  • target_model (GPyRegression, optional) –
  • acquisition_method (Acquisition, optional) – Method of acquiring evidence points. Defaults to LCBSC.
  • acq_noise_var (float or np.array, optional) – Variance(s) of the noise added in the default LCBSC acquisition method. If an array, should be 1d specifying the variance for each dimension.
  • exploration_rate (float, optional) – Exploration rate of the acquisition method
  • batch_size (int, optional) – Elfi batch size. Defaults to 1.
  • batches_per_acquisition (int, optional) – How many batches will be requested from the acquisition function at one go. Defaults to max_parallel_batches.
  • async (bool) – Allow acquisitions to be made asynchronously, i.e. do not wait for all the results from the previous acquisition before making the next. This can be more efficient with a large number of workers (e.g. in cluster environments) but forgoes the guarantee of exactly the same result with the same initial conditions (e.g. the seed). Default False.
  • **kwargs
batch_size

Return the current batch_size.

infer(*args, vis=None, **kwargs)

Set the objective and start the iterate loop until the inference is finished.

See the other arguments from the set_objective method.

Returns:result
Return type:Sample
iterate()

Forward the inference one iteration.

This is a way to manually progress the inference. One iteration consists of waiting and processing the result of the next batch in succession and possibly submitting new batches.

Notes

If the next batch is ready, it will be processed immediately and no new batches are submitted.

New batches are submitted only while waiting for the next one to complete. There will never be more batches submitted in parallel than the max_parallel_batches setting allows.

Returns:
Return type:None
parameter_names

Return the parameters to be inferred.

plot_discrepancy(axes=None, **kwargs)[source]

Plot acquired parameters vs. resulting discrepancy.

TODO: refactor

plot_state(**options)[source]

Plot the GP surface

This feature is still experimental and currently supports only 2D cases.

pool

Return the output pool of the inference.

seed

Return the seed of the inference.

set_objective(n_evidence=None)[source]

You can continue BO by giving a larger n_evidence

Parameters:n_evidence (int) – Number of total evidence for the GP fitting. This includes any initial evidence.
update(batch, batch_index)[source]

Update the GP regression model of the target node.

class elfi.BOLFI(model, target_name=None, bounds=None, initial_evidence=None, update_interval=10, target_model=None, acquisition_method=None, acq_noise_var=0, exploration_rate=10, batch_size=1, batches_per_acquisition=None, async=False, **kwargs)[source]

Bayesian Optimization for Likelihood-Free Inference (BOLFI).

Approximates the discrepancy function by a stochastic regression model. Discrepancy model is fit by sampling the discrepancy function at points decided by the acquisition function.

The method implements the framework introduced in Gutmann & Corander, 2016.

References

Gutmann M U, Corander J (2016). Bayesian Optimization for Likelihood-Free Inference of Simulator-Based Statistical Models. JMLR 17(125):1−47, 2016. http://jmlr.org/papers/v17/15-017.html

Parameters:
  • model (ElfiModel or NodeReference) –
  • target_name (str or NodeReference) – Only needed if model is an ElfiModel
  • bounds (dict) – The region in which to estimate the posterior for each parameter in model.parameters: dict(‘parameter_name’: (lower, upper), ...). Not used if a custom target_model is given.
  • initial_evidence (int, dict, optional) – Number of initial evidence points, or a precomputed batch dict containing parameter and discrepancy values. Default value depends on the dimensionality.
  • update_interval (int) – How often to update the GP hyperparameters of the target_model
  • target_model (GPyRegression, optional) –
  • acquisition_method (Acquisition, optional) – Method of acquiring evidence points. Defaults to LCBSC.
  • acq_noise_var (float or np.array, optional) – Variance(s) of the noise added in the default LCBSC acquisition method. If an array, should be 1d specifying the variance for each dimension.
  • exploration_rate (float, optional) – Exploration rate of the acquisition method
  • batch_size (int, optional) – Elfi batch size. Defaults to 1.
  • batches_per_acquisition (int, optional) – How many batches will be requested from the acquisition function at one go. Defaults to max_parallel_batches.
  • async (bool) – Allow acquisitions to be made asynchronously, i.e. do not wait for all the results from the previous acquisition before making the next. This can be more efficient with a large number of workers (e.g. in cluster environments) but forgoes the guarantee of exactly the same result with the same initial conditions (e.g. the seed). Default False.
  • **kwargs
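
A usage sketch following the parameters above (the model m, the parameter names and the evidence counts are illustrative):

bolfi = elfi.BOLFI(m['d'], batch_size=1, initial_evidence=20, update_interval=10,
                   bounds={'mu': (0, 5), 'sigma': (0, 2)}, acq_noise_var=0.1)
post = bolfi.fit(n_evidence=200)   # fit the GP surrogate to 200 evidence points
result = bolfi.sample(1000)        # NUTS sampling from the approximate posterior
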
batch_size

Return the current batch_size.

extract_posterior(threshold=None)[source]

Returns an object representing the approximate posterior based on surrogate model regression.

Parameters:threshold (float) – Discrepancy threshold for creating the posterior (log with log discrepancy).
Returns:posterior
Return type:elfi.methods.posteriors.BolfiPosterior
fit(n_evidence, threshold=None)[source]

Fit the surrogate model (e.g. Gaussian process) to generate a GP regression model for the discrepancy given the parameters.

infer(*args, vis=None, **kwargs)

Set the objective and start the iterate loop until the inference is finished.

See the other arguments from the set_objective method.

Returns:result
Return type:Sample
iterate()

Forward the inference one iteration.

This is a way to manually progress the inference. One iteration consists of waiting and processing the result of the next batch in succession and possibly submitting new batches.

Notes

If the next batch is ready, it will be processed immediately and no new batches are submitted.

New batches are submitted only while waiting for the next one to complete. There will never be more batches submitted in parallel than the max_parallel_batches setting allows.

Returns:
Return type:None
parameter_names

Return the parameters to be inferred.

plot_discrepancy(axes=None, **kwargs)

Plot acquired parameters vs. resulting discrepancy.

TODO: refactor

plot_state(**options)

Plot the GP surface

This feature is still experimental and currently supports only 2D cases.

pool

Return the output pool of the inference.

sample(n_samples, warmup=None, n_chains=4, threshold=None, initials=None, algorithm='nuts', n_evidence=None, **kwargs)[source]

Sample the posterior distribution of BOLFI, where the likelihood is defined through the cumulative density function of standard normal distribution:

L(\theta) \propto F((h - \mu(\theta)) / \sigma(\theta))

where h is the threshold, and \mu(\theta) and \sigma(\theta) are the posterior mean and (noisy) standard deviation of the associated Gaussian process.

The sampling is performed with an MCMC sampler (the No-U-Turn Sampler, NUTS).

Parameters:
  • n_samples (int) – Number of requested samples from the posterior for each chain. This includes warmup, and note that the effective sample size is usually considerably smaller.
  • warmup (int, optional) – Length of warmup sequence in MCMC sampling. Defaults to n_samples//2.
  • n_chains (int, optional) – Number of independent chains.
  • threshold (float, optional) – The threshold (bandwidth) for the posterior (give as log if using log discrepancy).
  • initials (np.array of shape (n_chains, n_params), optional) – Initial values for the sampled parameters for each chain. Defaults to best evidence points.
  • algorithm (string, optional) – Sampling algorithm to use. Currently only ‘nuts’ is supported.
  • n_evidence (int) – If the regression model is not fitted yet, specify the amount of evidence
Returns:

Return type:

np.array

seed

Return the seed of the inference.

set_objective(n_evidence=None)

You can continue BO by giving a larger n_evidence

Parameters:n_evidence (int) – Number of total evidence for the GP fitting. This includes any initial evidence.
update(batch, batch_index)

Update the GP regression model of the target node.

Result objects

class elfi.methods.results.OptimizationResult(x_min, **kwargs)[source]
Parameters:
  • x_min – The optimized parameters
  • **kwargs – See ParameterInferenceResult
class elfi.methods.results.Sample(method_name, outputs, parameter_names, discrepancy_name=None, weights=None, **kwargs)[source]

Sampling results from the methods.

Parameters:
  • discrepancy_name (string, optional) – Name of the discrepancy in outputs.
  • **kwargs – Other meta information for the result
plot_marginals(selector=None, bins=20, axes=None, **kwargs)[source]

Plot marginal distributions for parameters.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Default to all.
  • bins (int, optional) – Number of bins in histograms.
  • axes (one or an iterable of plt.Axes, optional) –
Returns:

axes

Return type:

np.array of plt.Axes

plot_pairs(selector=None, bins=20, axes=None, **kwargs)[source]

Plot pairwise relationships as a matrix with marginals on the diagonal.

The y-axes of the marginal histograms are scaled.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Default to all.
  • bins (int, optional) – Number of bins in histograms.
  • axes (one or an iterable of plt.Axes, optional) –
Returns:

axes

Return type:

np.array of plt.Axes

sample_means_summary()[source]

Print a representation of posterior means.

samples_array

Return the samples as an array with columns in the same order as in self.parameter_names.

Returns:
Return type:list of np.arrays
summary()[source]

Print a verbose summary of contained results.

class elfi.methods.results.SmcSample(method_name, outputs, parameter_names, populations, *args, **kwargs)[source]

Container for results from SMC-ABC.

Parameters:
  • method_name (str) –
  • outputs (dict) –
  • parameter_names (list) –
  • populations (list[Sample]) – List of Sample objects
  • args
  • kwargs
plot_marginals(selector=None, bins=20, axes=None, all=False, **kwargs)[source]

Plot marginal distributions for parameters for all populations.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Default to all.
  • bins (int, optional) – Number of bins in histograms.
  • axes (one or an iterable of plt.Axes, optional) –
  • all (bool, optional) – Plot the marginals of all populations
plot_pairs(selector=None, bins=20, axes=None, all=False, **kwargs)[source]

Plot pairwise relationships as a matrix with marginals on the diagonal for all populations.

The y-axes of the marginal histograms are scaled.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Default to all.
  • bins (int, optional) – Number of bins in histograms.
  • axes (one or an iterable of plt.Axes, optional) –
  • all (bool, optional) – Plot for all populations
samples_array

Return the samples as an array with columns in the same order as in self.parameter_names.

Returns:
Return type:list of np.arrays
class elfi.methods.results.BolfiSample(method_name, chains, parameter_names, warmup, **kwargs)[source]

Container for results from BOLFI.

Parameters:
  • method_name (string) – Name of inference method.
  • chains (np.array) – Chains from sampling. Shape should be (n_chains, n_samples, n_parameters) with warmup included.
  • parameter_names (list : list of strings) – List of names in the outputs dict that refer to model parameters.
  • warmup (int) – Number of warmup iterations in chains.
plot_marginals(selector=None, bins=20, axes=None, **kwargs)

Plot marginal distributions for parameters.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Default to all.
  • bins (int, optional) – Number of bins in histograms.
  • axes (one or an iterable of plt.Axes, optional) –
Returns:

axes

Return type:

np.array of plt.Axes

plot_pairs(selector=None, bins=20, axes=None, **kwargs)

Plot pairwise relationships as a matrix with marginals on the diagonal.

The y-axes of the marginal histograms are scaled.

Parameters:
  • selector (iterable of ints or strings, optional) – Indices or keys to use from samples. Default to all.
  • bins (int, optional) – Number of bins in histograms.
  • axes (one or an iterable of plt.Axes, optional) –
Returns:

axes

Return type:

np.array of plt.Axes

sample_means_summary()

Print a representation of posterior means.

samples_array

Return the samples as an array with columns in the same order as in self.parameter_names.

Returns:
Return type:list of np.arrays
summary()

Print a verbose summary of contained results.

Post-processing

elfi.adjust_posterior(sample, model, summary_names, parameter_names=None, adjustment='linear')

Adjust the posterior using local regression.

Note that the summary nodes need to be explicitly included in the sample object with the output_names keyword argument when performing the inference.

Parameters:
  • sample (elfi.methods.results.Sample) – a sample object from an ABC algorithm
  • model (elfi.ElfiModel) – the inference model
  • summary_names (list[str]) – names of the summary nodes
  • parameter_names (list[str] (optional)) – names of the parameters
  • adjustment (RegressionAdjustment or string) –

    a regression adjustment object or a string specification

    Accepted values for the string specification:
    • ‘linear’
Returns:

a Sample object with the adjusted posterior

Return type:

elfi.methods.results.Sample

Examples

>>> import elfi
>>> from elfi.examples import gauss
>>> m = gauss.get_model()
>>> res = elfi.Rejection(m['d'], output_names=['S1', 'S2']).sample(1000)
>>> adj = adjust_posterior(res, m, ['S1', 'S2'], ['mu'], LinearAdjustment())
class elfi.methods.post_processing.LinearAdjustment(**kwargs)[source]

Regression adjustment using a local linear model.

adjust()

Adjust the posterior.

Only the finite values used to fit the regression model will be adjusted.

Returns:
Return type:a Sample object containing the adjusted posterior
fit(sample, model, summary_names, parameter_names=None)

Fit a regression adjustment model to the posterior sample.

Non-finite values in the summary statistics and parameters will be omitted.

Parameters:
  • sample (elfi.methods.Sample) – a sample object from an ABC method
  • model (elfi.ElfiModel) – the inference model
  • summary_names (list[str]) – a list of names for the summary nodes
  • parameter_names (list[str] (optional)) – a list of parameter names

Other

Data pools

class elfi.OutputPool(outputs=None, name=None, prefix=None)[source]

Store node outputs to dictionary-like stores.

The default store is a Python dictionary.

Notes

Saving the store requires that all the stores are pickleable.

Arbitrary objects that support simple array indexing can be used as stores by using the elfi.store.ArrayObjectStore class.

See the elfi.store.StoreBase interfaces if you wish to implement your own ELFI compatible store. Basically any object that fulfills the Python dictionary API will work as a store in the pool.

Depending on the algorithm, some of these values may be reused after making changes to the ElfiModel, thus speeding up the inference significantly. For instance, if all the simulations are stored in Rejection sampling, one can change the summaries and distances without having to rerun the simulator.

Parameters:
  • outputs (list, dict, optional) – list of node names which to store or a dictionary with existing stores. The stores are created on demand.
  • name (str, optional) – Name of the pool. Used to open a saved pool from disk.
  • prefix (str, optional) – Path to directory under which elfi.ArrayPool will place its folder. Default is a relative path ./pools.
Returns:

instance

Return type:

OutputPool
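
A hedged sketch of storing simulations during Rejection sampling and persisting them to disk (the node names and the model m are illustrative; the pool is passed to the inference method via its keyword arguments):

pool = elfi.ArrayPool(['mu', 'sim', 'S1'], name='my_pool')
rej = elfi.Rejection(m['d'], batch_size=10000, pool=pool)
result = rej.sample(1000, n_sim=100000)
pool.save()                               # persist the stores to disk
pool = elfi.ArrayPool.open('my_pool')     # reopen the saved pool later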

add_batch(batch, batch_index)[source]

Adds the outputs from the batch to their stores.

add_store(node, store=None)[source]

Adds a store object for the node.

Parameters:
  • node (str) –
  • store (dict, StoreBase, optional) –
Returns:

Return type:

None

clear()[source]

Removes all data from the stores

close()[source]

Save and close the stores that support it.

The pool will not be usable afterwards.

delete()[source]

Remove all persisted data from disk.

flush()[source]

Flushes all data from the stores.

If the store does not support flushing, does nothing.

get_batch(batch_index, output_names=None)[source]

Returns a batch from the stores of the pool.

Parameters:
  • batch_index (int) –
  • output_names (list) – which outputs to include in the batch
Returns:

batch

Return type:

dict

classmethod open(name, prefix=None)[source]

Open a closed or saved ArrayPool from disk.

Parameters:
  • name (str) –
  • prefix (str, optional) –
Returns:

Return type:

ArrayPool

remove_batch(batch_index)[source]

Removes the batch from all the stores.

remove_store(node)[source]

Removes a store from the pool

Parameters:node (str) –
Returns:The removed store
Return type:store
save()[source]

Save the pool to disk.

This will use pickle to store the pool under self.path.

set_context(context)[source]

Sets the context of the pool.

The pool needs to know the batch_size and the seed.

Notes

Also sets the name of the pool if not set already.

Parameters:context (elfi.ComputationContext) –
Returns:
Return type:None
class elfi.ArrayPool(outputs=None, name=None, prefix=None)[source]

OutputPool that uses binary .npy files as default stores.

The default store medium for output data is a NumPy binary .npy file for array data. You can, however, also add other types of stores.

Notes

The default store is implemented in elfi.store.NpyStore that uses NpyArrays as stores. The NpyArray is a wrapper over NumPy .npy binary file for array data and supports appending the .npy file. It uses the .npy format 2.0 files.

Depending on the algorithm, some of these values may be reused after making changes to the ElfiModel, thus speeding up the inference significantly. For instance, if all the simulations are stored in Rejection sampling, one can change the summaries and distances without having to rerun the simulator.

Parameters:
  • outputs (list, dict, optional) – list of node names which to store or a dictionary with existing stores. The stores are created on demand.
  • name (str, optional) – Name of the pool. Used to open a saved pool from disk.
  • prefix (str, optional) – Path to directory under which elfi.ArrayPool will place its folder. Default is a relative path ./pools.
Returns:

instance

Return type:

OutputPool

add_batch(batch, batch_index)

Adds the outputs from the batch to their stores.

add_store(node, store=None)

Adds a store object for the node.

Parameters:
  • node (str) –
  • store (dict, StoreBase, optional) –
Returns:

Return type:

None

clear()

Removes all data from the stores

close()

Save and close the stores that support it.

The pool will not be usable afterwards.

delete()

Remove all persisted data from disk.

flush()

Flushes all data from the stores.

If the store does not support flushing, does nothing.

get_batch(batch_index, output_names=None)

Returns a batch from the stores of the pool.

Parameters:
  • batch_index (int) –
  • output_names (list) – which outputs to include in the batch
Returns:

batch

Return type:

dict

open(name, prefix=None)

Open a closed or saved ArrayPool from disk.

Parameters:
  • name (str) –
  • prefix (str, optional) –
Returns:

Return type:

ArrayPool

remove_batch(batch_index)

Removes the batch from all the stores.

remove_store(node)

Removes a store from the pool

Parameters:node (str) –
Returns:The removed store
Return type:store
save()

Save the pool to disk.

This will use pickle to store the pool under self.path.

set_context(context)

Sets the context of the pool.

The pool needs to know the batch_size and the seed.

Notes

Also sets the name of the pool if not set already.

Parameters:context (elfi.ComputationContext) –
Returns:
Return type:None

Module functions

elfi.get_client()

Get the current ELFI client instance.

elfi.set_client(client=None)

Set the current ELFI client instance.
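
For example, a sketch of switching to parallel batch execution (assuming the multiprocessing client shipped with ELFI; the string identifier is the only assumption here):

import elfi

elfi.set_client('multiprocessing')   # run batches in parallel on local cores
client = elfi.get_client()           # retrieve the active client instance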

Tools

tools.vectorize(operation, constants=None, dtype=None)

Vectorizes an operation.

Helper for cases when you have an operation that does not support vector arguments. This tool is still experimental and may not work in all cases.

Parameters:
  • operation (callable) – Operation to vectorize.
  • constants (tuple, list, optional) – A mask for constants in inputs, e.g. (0, 2) would indicate that the first and third positional inputs are constants. The constants will be passed as they are to each operation call.
  • dtype (np.dtype, bool[False], optional) – If None, numpy converts a list of outputs automatically. In some cases this produces undesired results. If you wish to keep the outputs as they are with no conversion, specify dtype=False. This results in a 1d object numpy array with the outputs as they were returned.

Notes

This is a convenience method that uses a for loop internally for the vectorization. For best performance, one should aim to implement vectorized operations (by using e.g. numpy functions that are mostly vectorized) if at all possible.

Examples

# This form works in most cases
vectorized_simulator = elfi.tools.vectorize(simulator)

# Tell that the second and third arguments to the simulator are constants
vectorized_simulator = elfi.tools.vectorize(simulator, [1, 2])
elfi.Simulator(vectorized_simulator, prior, constant_1, constant_2)

# Tell the vectorizer that it should not do any conversion to the outputs
vectorized_simulator = elfi.tools.vectorize(simulator, dtype=False)
tools.external_operation(command, process_result=None, prepare_inputs=None, sep=' ', stdout=True, subprocess_kwargs=None)

Wrap an external command as a Python callable (function).

The external command can be e.g. a shell script, or an executable file.

Parameters:
  • command (str) – Command to execute. Arguments can be passed to the executable by using Python’s format strings, e.g. “myscript.sh {0} {batch_size} --seed {seed}”. The command is expected to write to stdout. Since random_state is a Python-specific object, a seed keyword argument will be available to operations that use random_state.
  • process_result (callable, np.dtype, str, optional) – Callable result handler with a signature output = callable(result, *inputs, **kwinputs). Here the result is either the stdout or subprocess.CompletedProcess depending on the stdout flag below. The inputs and kwinputs will come from ELFI. The default handler converts the stdout to a numpy array with array = np.fromstring(stdout, sep=sep). If process_result is np.dtype or a string, then the stdout data is cast to that type with stdout = np.fromstring(stdout, sep=sep, dtype=process_result).
  • prepare_inputs (callable, optional) – Callable with a signature inputs, kwinputs = callable(*inputs, **kwinputs). The inputs will come from elfi.
  • sep (str, optional) – Separator to use with the default process_result handler. Default is a space ‘ ‘. If you specify your own callable to process_result this value has no effect.
  • stdout (bool, optional) – Pass the process_result handler the stdout instead of the subprocess.CompletedProcess instance. Default is true.
  • subprocess_kwargs (dict, optional) – Options for Python’s subprocess.run that is used to run the external command. Defaults are shell=True, check=True. See the subprocess documentation for more details.

Examples

>>> import elfi
>>> op = elfi.tools.external_operation('echo 1 {0}', process_result='int8')
>>>
>>> constant = elfi.Constant(123)
>>> simulator = elfi.Simulator(op, constant)
>>> simulator.generate()
array([  1, 123], dtype=int8)
Returns:operation – ELFI compatible operation that can be used e.g. as a simulator.
Return type:callable