Complete Workflows¶
A workflow is a list of calls to jobs, with additional arguments. The job name should be the first element on each line. Based on the two jobs PLOT and ECL_HIST we can create a small workflow example:
PLOT WWCT:OP_1 WWCT:OP_3 PRESSURE:10,10,10
PLOT FGPT FOPT
ECL_HIST <RUNPATH_FILE> <QC_PATH>/<ERTCASE>/wwct_hist WWCT:OP_1 WWCT:OP_2
This workflow creates plots of the nodes
WWCT:OP_1, WWCT:OP_3, PRESSURE:10,10,10, FGPT and FOPT. The PLOT job
in this example is general; if we limited ourselves to ECLIPSE
summary variables, we could get wildcard support. The workflow then
invokes the ECL_HIST example job to create a histogram. See the
documentation of RUNPATH_FILE and ERTCASE.
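The line format described above (job name first, arguments after it) can be illustrated with a short Python sketch. This is not ERT's own parser: the function name parse_workflow is hypothetical, and the sketch assumes whitespace-separated arguments with no quoting or comment handling.

```python
def parse_workflow(text):
    """Split workflow text into (job_name, [arguments]) pairs.

    Assumption: arguments are separated by whitespace and blank lines
    are ignored. This is an illustration, not ERT's own parser.
    """
    calls = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        calls.append((tokens[0], tokens[1:]))
    return calls


workflow_text = """\
PLOT WWCT:OP_1 WWCT:OP_3 PRESSURE:10,10,10
PLOT FGPT FOPT
ECL_HIST <RUNPATH_FILE> <QC_PATH>/<ERTCASE>/wwct_hist WWCT:OP_1 WWCT:OP_2
"""

for job, args in parse_workflow(workflow_text):
    print(job, args)
```

Note that PRESSURE:10,10,10 stays a single argument because only whitespace separates arguments in this sketch.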
DEFINE usage in workflows¶
Variables within workflows can be defined using the DEFINE keyword. If a DEFINE is already set in the ERT config
and is then re-specified within a workflow, the DEFINE within the workflow takes precedence over
the DEFINE from the ERT config. A DEFINE within
a workflow sets the value of that variable only within the scope of that workflow; it does not alter the
value outside of the workflow.
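The scoping rule can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not how ERT implements it: the helper substitute and all the values are hypothetical, and the keys reuse <QC_PATH> and <ERTCASE> from the workflow example.

```python
from collections import ChainMap


def substitute(text, defines):
    """Replace every defined key that occurs in text with its value."""
    for key, value in defines.items():
        text = text.replace(key, value)
    return text


# Hypothetical values set with DEFINE in the main ERT config.
config_defines = {"<QC_PATH>": "/qc/global", "<ERTCASE>": "default"}

# Hypothetical value re-specified with DEFINE inside one workflow.
workflow_defines = {"<QC_PATH>": "/qc/local"}

# Lookups hit workflow_defines first, then fall back to config_defines.
scoped = ChainMap(workflow_defines, config_defines)

inside = substitute("<QC_PATH>/<ERTCASE>/wwct_hist", scoped)
outside = substitute("<QC_PATH>/<ERTCASE>/wwct_hist", config_defines)
print(inside)   # /qc/local/default/wwct_hist
print(outside)  # /qc/global/default/wwct_hist
```

The config-level dictionary is never modified, which mirrors the statement that a DEFINE inside a workflow does not change the value outside it.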
Loading workflows¶
Workflows are loaded with the configuration option LOAD_WORKFLOW:
LOAD_WORKFLOW /path/to/workflow/WFLOW1
LOAD_WORKFLOW /path/to/workflow/workflow2 WFLOW2
The LOAD_WORKFLOW keyword takes the path to a workflow file as its first
argument. By default the workflow is labeled internally in ERT with the
filename, but you can optionally supply a second argument,
which is then used as the name of the workflow. Alternatively,
you can load a workflow interactively.
Automatically run workflows¶
With the keyword HOOK_WORKFLOW you can configure workflow
‘hooks’, meaning workflows which are run automatically at certain
points during ERT's execution. Currently there are seven points in ERT's
flow of execution where you can hook in a workflow:
- before the experiment starts, using PRE_EXPERIMENT,
- before the simulations (all forward models for a realization) start, using PRE_SIMULATION,
- after all the simulations have completed, using POST_SIMULATION,
- before the update step, using PRE_UPDATE,
- after the update step, using POST_UPDATE,
- only before the first update, using PRE_FIRST_UPDATE, and
- after the experiment has completed, using POST_EXPERIMENT.
For non-iterative algorithms, PRE_FIRST_UPDATE is equal to PRE_UPDATE.
The POST_SIMULATION hook is typically used to trigger QC workflows.
HOOK_WORKFLOW preExperimentWFLOW PRE_EXPERIMENT
HOOK_WORKFLOW initWFLOW PRE_SIMULATION
HOOK_WORKFLOW preUpdateWFLOW PRE_UPDATE
HOOK_WORKFLOW postUpdateWFLOW POST_UPDATE
HOOK_WORKFLOW QC_WFLOW1 POST_SIMULATION
HOOK_WORKFLOW QC_WFLOW2 POST_SIMULATION
HOOK_WORKFLOW postExperimentWFLOW POST_EXPERIMENT
In this example, the workflow preExperimentWFLOW runs first.
Then initWFLOW runs at the start of every iteration, when the
simulation directories have been created, just before the forward
model is submitted to the queue. The workflow preUpdateWFLOW
runs before the update step and postUpdateWFLOW runs
after the update step. Once the simulations have completed, the
two workflows QC_WFLOW1 and QC_WFLOW2 are run.
After all iterations are complete, postExperimentWFLOW
runs.
Note that workflows ‘hooked in’ with
HOOK_WORKFLOW must also be loaded with the LOAD_WORKFLOW
keyword.
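The requirement that every hooked workflow is also loaded can be checked before starting ERT. The Python sketch below does this for a configuration given as text. It is not part of ERT: check_hooks is a hypothetical helper, it assumes well-formed, whitespace-separated lines, and it does not care whether a LOAD_WORKFLOW line appears before or after the HOOK_WORKFLOW line that uses it.

```python
import os

# Hook points as listed in this section.
HOOK_POINTS = {
    "PRE_EXPERIMENT", "PRE_SIMULATION", "POST_SIMULATION",
    "PRE_UPDATE", "POST_UPDATE", "PRE_FIRST_UPDATE", "POST_EXPERIMENT",
}


def check_hooks(config_text):
    """Return a list of problems found in HOOK_WORKFLOW lines."""
    loaded = set()
    hooks = []
    for line in config_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "LOAD_WORKFLOW":
            # The optional second argument names the workflow;
            # otherwise the filename is used as the name.
            if len(tokens) > 2:
                name = tokens[2]
            else:
                name = os.path.basename(tokens[1])
            loaded.add(name)
        elif tokens[0] == "HOOK_WORKFLOW":
            hooks.append((tokens[1], tokens[2]))

    problems = []
    for name, point in hooks:
        if name not in loaded:
            problems.append(f"{name}: not loaded with LOAD_WORKFLOW")
        if point not in HOOK_POINTS:
            problems.append(f"{name}: unknown hook point {point}")
    return problems


config_text = """\
LOAD_WORKFLOW /path/to/workflow/QC_WFLOW1
LOAD_WORKFLOW /path/to/workflow/workflow2 QC_WFLOW2
HOOK_WORKFLOW QC_WFLOW1 POST_SIMULATION
HOOK_WORKFLOW QC_WFLOW2 POST_SIMULATION
HOOK_WORKFLOW initWFLOW PRE_SIMULATION
"""

print(check_hooks(config_text))
# ['initWFLOW: not loaded with LOAD_WORKFLOW']
```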
Locating the realisations: <RUNPATH_FILE>¶
Context must be passed between the main ERT process and the script through string substitution. The ‘magic’ key <RUNPATH_FILE> has been introduced for this purpose.
Many of the external workflow jobs involve looping over all the realisations in a construction like this:
for each realisation:
    // Do something for realisation
summarize()
When running an external job in a workflow there is no direct transfer of information between the main ERT process and the external script. We therefore need a convention for communicating which realisations have been simulated and where they are located in the filesystem. This is done through a file which looks like this:
0 /path/to/real0 CASE_0000
1 /path/to/real1 CASE_0001
...
9 /path/to/real9 CASE_0009
The name and location of this file are available as the magic string <RUNPATH_FILE>, which is typically used as the first argument to external workflow jobs that iterate over all realisations. The realisations listed in the <RUNPATH_FILE> are those of the most recent simulations you have run; the file is updated every time you run simulations. This implies that it is (currently) not convenient to change which directories a workflow uses.
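As an illustration, an external job script written in Python could read the file and loop over the realisations roughly as follows. This is a minimal sketch assuming the whitespace-separated, three-column layout shown above; the names read_runpath_file and main and the demo file name runpath_list are hypothetical. Any extra trailing columns are ignored.

```python
import tempfile
from pathlib import Path


def read_runpath_file(path):
    """Yield (realisation_number, run_directory, case_name) per line.

    Assumption: whitespace-separated columns in the order shown above.
    Extra trailing columns, if any, are ignored.
    """
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 3:
                continue  # skip blank or incomplete lines
            yield int(fields[0]), Path(fields[1]), fields[2]


def main(runpath_file):
    processed = []
    for number, run_dir, case_name in read_runpath_file(runpath_file):
        # Do something for this realisation, e.g. load results from run_dir.
        processed.append((number, case_name))
    # Summarise once, after the loop over all realisations.
    print(f"Visited {len(processed)} realisations")
    return processed


# Demo with a temporary file standing in for <RUNPATH_FILE>.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "runpath_list"
    demo.write_text(
        "0 /path/to/real0 CASE_0000\n"
        "1 /path/to/real1 CASE_0001\n"
    )
    result = main(demo)
```

In a real workflow job the path would not be hard-coded. Since <RUNPATH_FILE> is typically passed as the first argument, the script would read it from sys.argv[1] and call main with that value.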