matador.workflows package

The workflows module provides tools for building custom workflows that chain calculations together. Each custom workflow inherits from workflows.Workflow and consists of several workflows.WorkflowStep objects. If the creation of a Workflow is wrapped in a function with the correct signature, the whole Workflow can itself be used as a workflows.WorkflowStep.
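As a rough illustration of wrapping a workflow in a function so that it can act as a single step, the sketch below assumes that "the correct signature" means the (computer, calc_doc, seed) form documented for step functions, and that constructing a workflow also runs it. MiniWorkflow and relax_as_step are hypothetical stand-ins written for this example, not matador code; real usage would subclass matador.workflows.Workflow instead.

```python
class MiniWorkflow:
    """Hypothetical stand-in for a Workflow subclass (not matador code)."""

    def __init__(self, computer, calc_doc, seed, **workflow_kwargs):
        self.computer, self.calc_doc, self.seed = computer, calc_doc, seed
        self.steps = []
        self.success = False
        # Assumption: construction registers the steps and then runs them.
        self.preprocess()
        self.run_steps()

    def preprocess(self):
        # Register one step: a plain function of (computer, calc_doc, seed).
        self.steps.append(
            lambda computer, calc_doc, seed: calc_doc.update(relaxed=True)
        )

    def run_steps(self):
        for step in self.steps:
            step(self.computer, self.calc_doc, self.seed)
        self.success = True


def relax_as_step(computer, calc_doc, seed, **workflow_kwargs):
    """Wrapper with the step signature, so the whole workflow acts as one step."""
    workflow = MiniWorkflow(computer, calc_doc, seed, **workflow_kwargs)
    return workflow.success


# The computer argument is unused by this toy step, so None stands in for it.
calc_doc = {}
outcome = relax_as_step(None, calc_doc, "example-seed")
```

Because relax_as_step takes the same arguments as any other step function, an outer workflow could register it in the same way as a single calculation.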

class matador.workflows.Workflow(computer, calc_doc, seed, **workflow_kwargs)[source]

Bases: object

Workflow objects are bundles of calculations defined as WorkflowStep objects. Each WorkflowStep takes three arguments: the matador.compute.ComputeTask object used to run the calculations, the calculation parameters (which can be modified by each step), and the seed name. Any subclass of Workflow must implement preprocess and postprocess methods (even if they just return True).

computer

the object that will be running the computation.

Type:

matador.compute.ComputeTask

calc_doc

the interim dictionary of structural and calculation parameters.

Type:

dict

seed

the root seed for the calculation.

Type:

str

label

the name of the type of the Workflow object.

Type:

str

success

the status of the workflow. This is only set to True after the final step completes, but before postprocessing.

Type:

bool

steps

list of steps to be completed.

Type:

list of WorkflowStep

Initialise the Workflow object from a matador.compute.ComputeTask, calculation parameters and the seed name.

Parameters:
  • computer (matador.compute.ComputeTask) – the object that will be running the computation.

  • calc_doc (dict) – dictionary of structure and calculation parameters.

  • seed (str) – root seed for the calculation.

Raises:

RuntimeError – if any part of the calculation fails.

abstract preprocess()[source]

This function is run at the start of the workflow, and is responsible for adding WorkflowStep objects to the Workflow.

abstract postprocess()[source]

This function is run upon successful completion of all steps of the workflow and is implemented by the subclass to perform any postprocessing. Every subclass must define it, but it can simply return True if no postprocessing is needed. It runs before the directory is cleaned up (i.e. before moving to completed/bad_castep).
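The division of labour between preprocess and postprocess can be sketched as follows. SketchWorkflow is a hypothetical stand-in for the documented base class, written only so that the example runs on its own; it assumes that construction runs preprocess, then the steps, then postprocess, with success set in between as the attribute description states. The scf function and the stored energy value are placeholders, not real results.

```python
import abc


class SketchWorkflow(abc.ABC):
    """Hypothetical stand-in for the documented Workflow base class."""

    def __init__(self, computer, calc_doc, seed, **workflow_kwargs):
        self.computer, self.calc_doc, self.seed = computer, calc_doc, seed
        self.steps = []
        self.success = False
        self.preprocess()
        for step in self.steps:
            step(self.computer, self.calc_doc, self.seed)
        # Set after the final step, but before postprocessing.
        self.success = True
        self.postprocess()

    @abc.abstractmethod
    def preprocess(self):
        """Add steps to the workflow."""

    @abc.abstractmethod
    def postprocess(self):
        """Run after all steps have completed."""


def scf(computer, calc_doc, seed):
    """Placeholder step that records a made-up value in calc_doc."""
    calc_doc["energy"] = -1.0


class EnergyWorkflow(SketchWorkflow):
    def preprocess(self):
        self.steps.append(scf)
        return True

    def postprocess(self):
        # By this point success has already been set to True.
        self.calc_doc["postprocessed"] = self.success
        return True


doc = {}
workflow = EnergyWorkflow(None, doc, "example-seed")
```

Declaring both methods abstract means a subclass that omits either one cannot be instantiated, which matches the requirement that both be implemented even if they only return True.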

add_step(function, name, input_exts=None, output_exts=None, clean_after=False, **func_kwargs)[source]

Add a step to the workflow.

Parameters:
  • function (function) – the function to run in the step; must accept the arguments (self.computer, self.calc_doc, self.seed).

  • name (str) – the desired name for the step (human-readable).

Keyword Arguments:
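  • input_exts (list) – list of input file extensions to cache after running.

  • output_exts (list) – list of output file extensions to cache after running.

  • clean_after (bool) – whether or not to clean up after this step is called.

  • func_kwargs (dict) – any arguments to pass to function when called.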
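A minimal sketch of how extra keyword arguments might travel from add_step to the step function is given below. StepList is a hypothetical stand-in for the step-registration part of a Workflow, and it assumes that func_kwargs are forwarded as keyword arguments after (computer, calc_doc, seed). The rescale_cutoff function and the cut_off_energy key are illustrative only.

```python
def rescale_cutoff(computer, calc_doc, seed, factor=1.0):
    """Hypothetical step: scale a cutoff stored in calc_doc."""
    calc_doc["cut_off_energy"] *= factor
    return True


class StepList:
    """Hypothetical stand-in for the step handling of a Workflow."""

    def __init__(self, computer, calc_doc, seed):
        self.computer, self.calc_doc, self.seed = computer, calc_doc, seed
        self.steps = []

    def add_step(self, function, name, **func_kwargs):
        # Store the extra keyword arguments alongside the function.
        self.steps.append((function, name, func_kwargs))

    def run_steps(self):
        for function, name, func_kwargs in self.steps:
            # Assumption: func_kwargs follow the three standard arguments.
            function(self.computer, self.calc_doc, self.seed, **func_kwargs)


doc = {"cut_off_energy": 400.0}
workflow = StepList(None, doc, "example-seed")
workflow.add_step(rescale_cutoff, "rescale cutoff", factor=1.5)
workflow.run_steps()
```

Keeping the keyword arguments with the step lets one function be registered several times with different settings.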

run_steps()[source]

Loop over steps and run them.

class matador.workflows.WorkflowStep(function, name, compute_dir=None, input_exts=None, output_exts=None, **func_kwargs)[source]

Bases: object

An individual step in a Workflow, defined by a Python function and a name. The function will be called with the arguments (computer, calc_doc, seed) by the run_step method.

function

the function to call.

Type:

function

name

the human-readable name of the step.

Type:

str

compute_dir

the folder that computer will perform the calculation in.

Type:

str

func_kwargs

any extra kwargs to pass to the function.

Type:

dict

input_exts

list of input file extensions to cache after running.

Type:

list

output_exts

list of output file extensions to cache after running.

Type:

list

Construct a WorkflowStep from a function.

success = False

cache_files(seed)[source]

Wrapper for calling both _cache_inputs and _cache_outputs, without raising any errors.

run_step(computer, calc_doc, seed)[source]

Run the workflow step.

Parameters:
  • computer (matador.compute.ComputeTask) – the object that will be running the computation.

  • calc_doc (dict) – dictionary of structure and calculation parameters.

  • seed (str) – root seed for the calculation.

Raises:

RuntimeError – if any step fails.
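The failure behaviour of run_step could look roughly like the sketch below. SketchStep is a hypothetical stand-in, not the matador implementation, and it assumes that a step function signals failure by raising RuntimeError, which run_step then re-raises with the step name attached.

```python
class SketchStep:
    """Hypothetical stand-in for the documented WorkflowStep interface."""

    def __init__(self, function, name, **func_kwargs):
        self.function = function
        self.name = name
        self.func_kwargs = func_kwargs
        self.success = False

    def run_step(self, computer, calc_doc, seed):
        try:
            self.function(computer, calc_doc, seed, **self.func_kwargs)
        except RuntimeError as exc:
            # Record the failure and re-raise with the step name for context.
            self.success = False
            raise RuntimeError(f"step {self.name!r} failed for {seed}") from exc
        self.success = True


def failing_step(computer, calc_doc, seed):
    """Placeholder step that always fails."""
    raise RuntimeError("did not converge")


step = SketchStep(failing_step, "scf")
try:
    step.run_step(None, {}, "seed")
    message = None
except RuntimeError as exc:
    message = str(exc)
```

A caller that catches the RuntimeError can inspect step.success to tell which step stopped the workflow.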

Subpackages

Submodules

matador.workflows.workflows module

This module implements various workflows: ways of chaining together different calculations for high-throughput use.

class matador.workflows.workflows.Workflow(computer, calc_doc, seed, **workflow_kwargs)[source]

Bases: object

Workflow objects are bundles of calculations defined as WorkflowStep objects. Each WorkflowStep takes three arguments: the matador.compute.ComputeTask object used to run the calculations, the calculation parameters (which can be modified by each step), and the seed name. Any subclass of Workflow must implement preprocess and postprocess methods (even if they just return True).

computer

the object that will be running the computation.

Type:

matador.compute.ComputeTask

calc_doc

the interim dictionary of structural and calculation parameters.

Type:

dict

seed

the root seed for the calculation.

Type:

str

label

the name of the type of the Workflow object.

Type:

str

success

the status of the workflow. This is only set to True after the final step completes, but before postprocessing.

Type:

bool

steps

list of steps to be completed.

Type:

list of WorkflowStep

Initialise the Workflow object from a matador.compute.ComputeTask, calculation parameters and the seed name.

Parameters:
  • computer (matador.compute.ComputeTask) – the object that will be running the computation.

  • calc_doc (dict) – dictionary of structure and calculation parameters.

  • seed (str) – root seed for the calculation.

Raises:

RuntimeError – if any part of the calculation fails.

abstract preprocess()[source]

This function is run at the start of the workflow, and is responsible for adding WorkflowStep objects to the Workflow.

abstract postprocess()[source]

This function is run upon successful completion of all steps of the workflow and is implemented by the subclass to perform any postprocessing. Every subclass must define it, but it can simply return True if no postprocessing is needed. It runs before the directory is cleaned up (i.e. before moving to completed/bad_castep).

add_step(function, name, input_exts=None, output_exts=None, clean_after=False, **func_kwargs)[source]

Add a step to the workflow.

Parameters:
  • function (function) – the function to run in the step; must accept the arguments (self.computer, self.calc_doc, self.seed).

  • name (str) – the desired name for the step (human-readable).

Keyword Arguments:
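  • input_exts (list) – list of input file extensions to cache after running.

  • output_exts (list) – list of output file extensions to cache after running.

  • clean_after (bool) – whether or not to clean up after this step is called.

  • func_kwargs (dict) – any arguments to pass to function when called.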

run_steps()[source]

Loop over steps and run them.

class matador.workflows.workflows.WorkflowStep(function, name, compute_dir=None, input_exts=None, output_exts=None, **func_kwargs)[source]

Bases: object

An individual step in a Workflow, defined by a Python function and a name. The function will be called with the arguments (computer, calc_doc, seed) by the run_step method.

function

the function to call.

Type:

function

name

the human-readable name of the step.

Type:

str

compute_dir

the folder that computer will perform the calculation in.

Type:

str

func_kwargs

any extra kwargs to pass to the function.

Type:

dict

input_exts

list of input file extensions to cache after running.

Type:

list

output_exts

list of output file extensions to cache after running.

Type:

list

Construct a WorkflowStep from a function.

success = False

cache_files(seed)[source]

Wrapper for calling both _cache_inputs and _cache_outputs, without raising any errors.

run_step(computer, calc_doc, seed)[source]

Run the workflow step.

Parameters:
  • computer (matador.compute.ComputeTask) – the object that will be running the computation.

  • calc_doc (dict) – dictionary of structure and calculation parameters.

  • seed (str) – root seed for the calculation.

Raises:

RuntimeError – if any step fails.