matador.workflows package

The workflows module provides tools for building custom workflows that chain calculations together. Each custom workflow inherits from workflows.Workflow and consists of several workflows.WorkflowStep objects. If the creation of a Workflow is wrapped in a function with the signature expected of a step, i.e. one that accepts (computer, calc_doc, seed), the whole Workflow can itself be used as a workflows.WorkflowStep.
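As a rough illustration of nesting one workflow inside another, the sketch below wraps the construction of a workflow in a function that takes (computer, calc_doc, seed) and then adds that function as a step of a second workflow. The Workflow class here is a small stand-in written for this example, not matador's implementation; in particular, running everything on construction is an assumption, and record_seed, InnerWorkflow, OuterWorkflow and run_inner_workflow are hypothetical names.

```python
class Workflow:
    """Stand-in for matador.workflows.Workflow, for illustration only."""

    def __init__(self, computer, calc_doc, seed, **workflow_kwargs):
        self.computer = computer
        self.calc_doc = calc_doc
        self.seed = seed
        self.steps = []
        self.success = False
        # Assumption: constructing the object drives the whole workflow.
        self.preprocess()
        for function in self.steps:
            function(self.computer, self.calc_doc, self.seed)
        self.success = True
        self.postprocess()

    def add_step(self, function, name):
        self.steps.append(function)


def record_seed(computer, calc_doc, seed):
    """Hypothetical step: note which seed was processed."""
    calc_doc["seen"] = seed
    return True


class InnerWorkflow(Workflow):
    def preprocess(self):
        self.add_step(record_seed, "record_seed")

    def postprocess(self):
        return True


def run_inner_workflow(computer, calc_doc, seed):
    """Wrapper with the step signature, so it can be added as a step."""
    return InnerWorkflow(computer, calc_doc, seed).success


class OuterWorkflow(Workflow):
    def preprocess(self):
        self.add_step(run_inner_workflow, "inner_workflow")

    def postprocess(self):
        return True


calc_doc = {}
outer = OuterWorkflow(None, calc_doc, "example_seed")
print(outer.success, calc_doc)
```

Because the inner and outer workflows share the same calc_doc dictionary, changes made by the inner steps remain visible to any later step of the outer workflow.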

class matador.workflows.Workflow(computer, calc_doc, seed, **workflow_kwargs)[source]

Bases: object

Workflow objects are bundles of calculations defined as WorkflowStep objects. Each WorkflowStep takes three arguments: the matador.compute.ComputeTask object used to run the calculations, the calculation parameters (which can be modified by each step), and the seed name. Any subclass of Workflow must implement the preprocess and postprocess methods (even if they just return True).

computer

the object that will be running the computation.

Type

matador.compute.ComputeTask

calc_doc

the interim dictionary of structural and calculation parameters.

Type

dict

seed

the root seed for the calculation.

Type

str

label

the name of the type of the Workflow object.

Type

str

success

the status of the workflow. This is only set to True after the final step completes, but BEFORE post-processing.

Type

bool

steps

list of steps to be completed.

Type

list of WorkflowStep

Initialise the Workflow object from a matador.compute.ComputeTask, calculation parameters and the seed name.

Parameters
  • computer (matador.compute.ComputeTask) – the object that will be running the computation.

  • calc_doc (dict) – dictionary of structure and calculation parameters.

  • seed (str) – root seed for the calculation.

Raises

RuntimeError – if any part of the calculation fails.

abstract preprocess()[source]

This function is run at the start of the workflow, and is responsible for adding WorkflowStep objects to the Workflow.

abstract postprocess()[source]

This function is run upon successful completion of all steps of the workflow and can be overridden by the subclass to perform any postprocessing. The postprocessing itself is optional: the method must be defined, but it can simply return True. It runs before the directory is cleaned up (i.e. before files are moved to completed/bad_castep).
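The order in which these hooks fire can be sketched with a small stand-in class. This is not matador's code: the base class below only mimics the documented lifecycle (preprocess, then the steps, then success set to True, then postprocess), and RelaxThenReport, relax and the events list are hypothetical names used for illustration.

```python
events = []


class Workflow:
    """Stand-in base class; matador's real Workflow is more involved."""

    def __init__(self, computer, calc_doc, seed):
        self.computer, self.calc_doc, self.seed = computer, calc_doc, seed
        self.steps = []
        self.success = False
        self.preprocess()
        for function in self.steps:
            function(computer, calc_doc, seed)
        # Set after the final step completes, but before postprocess runs.
        self.success = True
        self.postprocess()

    def add_step(self, function, name):
        self.steps.append(function)


def relax(computer, calc_doc, seed):
    """Hypothetical step that marks the structure as relaxed."""
    events.append("step: relax")
    calc_doc["relaxed"] = True


class RelaxThenReport(Workflow):
    def preprocess(self):
        # Responsible for registering the steps of the workflow.
        events.append("preprocess")
        self.add_step(relax, "relax")

    def postprocess(self):
        # The success flag is already True by the time this runs.
        events.append(f"postprocess (success={self.success})")
        return True


workflow = RelaxThenReport(None, {}, "example_seed")
print(events)
```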

add_step(function, name, input_exts=None, output_exts=None, clean_after=False, **func_kwargs)[source]

Add a step to the workflow.

Parameters
  • function (Function) – the function to run in the step; must accept arguments of (self.computer, self.calc_doc, self.seed).

  • name (str) – the desired name for the step (human-readable).

Keyword Arguments
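  • input_exts (list) – list of input file extensions to cache after running.

  • output_exts (list) – list of output file extensions to cache after running.

  • clean_after (bool) – whether or not to clean up after this step is called.

  • func_kwargs (dict) – any arguments to pass to function when called.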
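One way to picture func_kwargs is as extra keyword arguments stored alongside the step and forwarded when the function is eventually called. The sketch below assumes exactly that forwarding behaviour; Step and the module-level add_step are simplified stand-ins for WorkflowStep and Workflow.add_step, and set_cutoff and the cut_off_energy key are hypothetical.

```python
class Step:
    """Stand-in for WorkflowStep: a function, a name and extra kwargs."""

    def __init__(self, function, name, **func_kwargs):
        self.function = function
        self.name = name
        self.func_kwargs = func_kwargs

    def run_step(self, computer, calc_doc, seed):
        # Assumption: the stored kwargs are forwarded as keyword arguments.
        return self.function(computer, calc_doc, seed, **self.func_kwargs)


def set_cutoff(computer, calc_doc, seed, cutoff=300):
    """Hypothetical step that edits a calculation parameter in place."""
    calc_doc["cut_off_energy"] = cutoff
    return True


steps = []


def add_step(function, name, **func_kwargs):
    """Simplified stand-in for Workflow.add_step."""
    steps.append(Step(function, name, **func_kwargs))


# The same function is reused for two steps with different keyword arguments.
add_step(set_cutoff, "coarse_cutoff", cutoff=400)
add_step(set_cutoff, "fine_cutoff", cutoff=700)

calc_doc = {}
history = []
for step in steps:
    step.run_step(None, calc_doc, "example_seed")
    history.append(calc_doc["cut_off_energy"])

print(history)
```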

run_steps()[source]

Loop over steps and run them.

class matador.workflows.WorkflowStep(function, name, compute_dir=None, input_exts=None, output_exts=None, **func_kwargs)[source]

Bases: object

An individual step in a Workflow, defined by a Python function and a name. The function is called with the arguments (computer, calc_doc, seed) by the run_step method.

function

the function to call.

Type

function

name

the human-readable name of the step.

Type

str

compute_dir

the folder that computer will perform the calculation in.

Type

str

func_kwargs

any extra kwargs to pass to the function.

Type

dict

input_exts

list of input file extensions to cache after running.

Type

list

output_exts

list of output file extensions to cache after running.

Type

list

Construct a WorkflowStep from a function.

success = False

cache_files(seed)[source]

Wrapper for calling both _cache_inputs and _cache_outputs, without throwing any errors.

run_step(computer, calc_doc, seed)[source]

Run the workflow step.

Parameters
  • computer (matador.compute.ComputeTask) – the object that will be running the computation.

  • calc_doc (dict) – dictionary of structure and calculation parameters.

  • seed (str) – root seed for the calculation.

Raises

RuntimeError – if any step fails.

Subpackages

Submodules

matador.workflows.workflows module

This module implements various workflows: ways of chaining together different calculations for high-throughput use.

class matador.workflows.workflows.Workflow(computer, calc_doc, seed, **workflow_kwargs)[source]

The documentation for this class is identical to that of matador.workflows.Workflow above.

class matador.workflows.workflows.WorkflowStep(function, name, compute_dir=None, input_exts=None, output_exts=None, **func_kwargs)[source]

The documentation for this class is identical to that of matador.workflows.WorkflowStep above.