Complete Workflows#
A workflow is a list of calls to jobs, with additional arguments. The job name should be the first element on each line. Based on the two jobs PLOT and ECL_HIST we can create a small workflow example:
PLOT WWCT:OP_1 WWCT:OP_3 PRESSURE:10,10,10
PLOT FGPT FOPT
ECL_HIST <RUNPATH_FILE> <QC_PATH>/<ERTCASE>/wwct_hist WWCT:OP_1 WWCT:OP_2
In this workflow we create plots of the nodes WWCT:OP_1, WWCT:OP_3, PRESSURE:10,10,10, FGPT and FOPT. The PLOT job created in this example is general; if we limited ourselves to ECLIPSE summary variables, we could get wildcard support. Then we invoke the ECL_HIST example job to create a histogram. See the documentation of RUNPATH_FILE and ERTCASE.
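To make the line structure concrete, here is a minimal Python sketch of how such a workflow file could be read: each non-empty line is split on whitespace, the first token is taken as the job name and the remaining tokens as its arguments, and `<KEY>` placeholders are replaced by plain string substitution. The function names `parse_workflow` and `substitute` and the substitution values are hypothetical and chosen for illustration only; this is not ERT's actual parser.

```python
def substitute(token, context):
    """Replace every <KEY> placeholder in token with its value from context."""
    for key, value in context.items():
        token = token.replace(key, value)
    return token


def parse_workflow(text, context=None):
    """Return a list of (job_name, args) tuples, one per non-empty line."""
    context = context or {}
    calls = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        job_name, *args = tokens  # the job name is the first element
        calls.append((job_name, [substitute(a, context) for a in args]))
    return calls


workflow_text = """\
PLOT WWCT:OP_1 WWCT:OP_3 PRESSURE:10,10,10
PLOT FGPT FOPT
ECL_HIST <RUNPATH_FILE> <QC_PATH>/<ERTCASE>/wwct_hist WWCT:OP_1 WWCT:OP_2
"""

# Illustrative values only (assumptions); in ERT these come from the configuration.
context = {
    "<RUNPATH_FILE>": "/tmp/runpath_list",
    "<QC_PATH>": "/tmp/qc",
    "<ERTCASE>": "default",
}

calls = parse_workflow(workflow_text, context)
for job_name, args in calls:
    print(job_name, args)
```

Splitting on whitespace keeps an argument such as PRESSURE:10,10,10 intact as a single token, which is why the comma-separated indices must not contain spaces in this sketch.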
Loading workflows#
Workflows are loaded with the configuration option LOAD_WORKFLOW:
LOAD_WORKFLOW /path/to/workflow/WFLOW1
LOAD_WORKFLOW /path/to/workflow/workflow2 WFLOW2
The LOAD_WORKFLOW keyword takes the path to a workflow file as its first argument. By default the workflow is labeled with the filename internally in ERT, but you can optionally supply a second argument, which is then used as the name of the workflow. Alternatively, you can load a workflow interactively.
Automatically run workflows#
With the keyword HOOK_WORKFLOW you can configure workflow ‘hooks’, meaning workflows which will be run automatically at certain points during ERT's execution. Currently there are five points in ERT's flow of execution where you can hook in a workflow:
Before the simulations (all forward models for a realization) start, using PRE_SIMULATION,
after all the simulations have completed, using POST_SIMULATION,
before the update step, using PRE_UPDATE,
after the update step, using POST_UPDATE, and
only before the first update, using PRE_FIRST_UPDATE.
For non-iterative algorithms, PRE_FIRST_UPDATE is equal to PRE_UPDATE.
The POST_SIMULATION hook is typically used to trigger QC workflows.
HOOK_WORKFLOW initWFLOW PRE_SIMULATION
HOOK_WORKFLOW preUpdateWFLOW PRE_UPDATE
HOOK_WORKFLOW postUpdateWFLOW POST_UPDATE
HOOK_WORKFLOW QC_WFLOW1 POST_SIMULATION
HOOK_WORKFLOW QC_WFLOW2 POST_SIMULATION
In this example the workflow initWFLOW will run after all the simulation directories have been created, just before the forward model is submitted to the queue. The workflow preUpdateWFLOW will be run before the update step and postUpdateWFLOW will be run after the update step. When all the simulations have completed the two workflows QC_WFLOW1 and QC_WFLOW2 will be run.
Note that every workflow ‘hooked in’ with HOOK_WORKFLOW must also be loaded with the LOAD_WORKFLOW keyword.
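The relationship between the two keywords can be sketched as a small consistency check in Python. The sketch collects the names registered by LOAD_WORKFLOW lines (the explicit second argument if present, otherwise the filename) and reports any HOOK_WORKFLOW name that was never loaded. The helper names `loaded_workflow_names` and `unloaded_hooks` are hypothetical, and the whitespace-based parsing is an assumption for illustration; ERT performs its own validation of the configuration file.

```python
import os


def loaded_workflow_names(config_lines):
    """Names registered by LOAD_WORKFLOW: explicit name if given, else the filename."""
    names = set()
    for line in config_lines:
        tokens = line.split()
        if len(tokens) >= 2 and tokens[0] == "LOAD_WORKFLOW":
            path = tokens[1]
            names.add(tokens[2] if len(tokens) > 2 else os.path.basename(path))
    return names


def unloaded_hooks(config_lines):
    """Return hooked workflow names that have no matching LOAD_WORKFLOW line."""
    loaded = loaded_workflow_names(config_lines)
    missing = []
    for line in config_lines:
        tokens = line.split()
        if len(tokens) >= 2 and tokens[0] == "HOOK_WORKFLOW":
            if tokens[1] not in loaded:
                missing.append(tokens[1])
    return missing


config = [
    "LOAD_WORKFLOW /path/to/workflow/WFLOW1",
    "LOAD_WORKFLOW /path/to/workflow/workflow2 WFLOW2",
    "HOOK_WORKFLOW WFLOW1 PRE_SIMULATION",
    "HOOK_WORKFLOW WFLOW2 POST_SIMULATION",
    "HOOK_WORKFLOW QC_WFLOW1 POST_SIMULATION",  # hooked but never loaded
]

missing = unloaded_hooks(config)
print(missing)
```

In this example the second workflow is registered as WFLOW2 rather than workflow2, because the optional name argument overrides the filename.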
Locating the realisations: <RUNPATH_FILE>#
Context must be passed between the main ERT process and the script through the use of string substitution; in particular, the ‘magic’ key <RUNPATH_FILE> has been introduced for this purpose.
Many of the external workflow jobs involve looping over all the realisations in a construction like this:
for each realisation:
    // Do something for realisation
summarize()
When running an external job in a workflow there is no direct transfer of information between the main ERT process and the external script. We therefore must have a convention for transferring the information of which realisations we have simulated on, and where they are located in the filesystem. This is done through a file which looks like this:
0 /path/to/real0 CASE_0000
1 /path/to/real1 CASE_0001
...
9 /path/to/real9 CASE_0009
The name and location of this file is available as the magical string <RUNPATH_FILE> which is typically used as the first argument to external workflow jobs which should iterate over all realisations. The realisations referred to in the <RUNPATH_FILE> should be the last simulations you have run. The file is updated every time you run simulations. This implies that it is (currently) not so convenient to alter which directories should be used when running a workflow.
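As an illustration of the loop-and-summarize pattern, the following Python sketch reads a file in the three-column layout shown above and visits each realisation. It assumes whitespace-separated columns and treats the third column as an opaque label, since its meaning is not specified here; the names `read_runpath_file` and `main` are hypothetical. The sample file is written to a temporary location so the sketch runs on its own.

```python
import os
import tempfile


def read_runpath_file(path):
    """Yield (realisation_number, run_path, label) for each line of the file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            fields = line.split()
            if len(fields) < 3:
                continue  # skip blank or incomplete lines
            # Extra columns, if any, are ignored in this sketch.
            yield int(fields[0]), fields[1], fields[2]


def main(runpath_file):
    results = []
    for iens, run_path, label in read_runpath_file(runpath_file):
        # Do something for the realisation; here we only record where it lives.
        results.append((iens, run_path, label))
    # Summarize after all realisations have been visited.
    print(f"Visited {len(results)} realisations")
    return results


# Demo input mirroring the example layout (paths are placeholders).
sample = (
    "0 /path/to/real0 CASE_0000\n"
    "1 /path/to/real1 CASE_0001\n"
    "9 /path/to/real9 CASE_0009\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)

results = main(tmp.name)
os.remove(tmp.name)
```

In an external workflow job, the path would instead arrive as a command-line argument, for example by calling `main(sys.argv[1])` when <RUNPATH_FILE> is given as the first argument in the workflow file.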