Engine
SHAR processes each NATS message in the WORKFLOW stream (WORKFLOW.>).
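In NATS subject syntax, `>` matches one or more trailing tokens and `*` matches exactly one token, so WORKFLOW.> covers every subject that begins with WORKFLOW. followed by at least one more token. The following is a minimal sketch of that matching rule for illustration only; `Match` is a hypothetical helper, not part of SHAR or the NATS client, and it ignores edge cases such as empty tokens.

```go
package main

import (
	"fmt"
	"strings"
)

// Match reports whether subject is covered by a NATS-style pattern.
// Hypothetical helper for illustration: "*" matches exactly one token,
// ">" matches one or more remaining tokens and must be the last token.
func Match(pattern, subject string) bool {
	p := strings.Split(pattern, ".")
	s := strings.Split(subject, ".")
	for i, tok := range p {
		if tok == ">" {
			// ">" is only valid at the end and needs at least one token to consume.
			return i == len(p)-1 && len(s) > i
		}
		if i >= len(s) {
			return false // pattern is longer than the subject
		}
		if tok != "*" && tok != s[i] {
			return false // literal token mismatch
		}
	}
	// Without a trailing ">", both must have the same number of tokens.
	return len(p) == len(s)
}

func main() {
	subjects := []string{
		"WORKFLOW.x.State.Traversal.Execute",
		"WORKFLOW.x.State.Job.Execute.ServiceTask.y",
		"WORKFLOW",
	}
	for _, subj := range subjects {
		fmt.Println(subj, Match("WORKFLOW.>", subj))
	}
}
```

The bare subject WORKFLOW is not matched, because `>` requires at least one token after the prefix.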
The engine contains several message processors, each of which deals with a specific message type such as state transitions or activity execution.
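One way to picture the "one processor per message type" arrangement is a lookup table keyed on message type. The sketch below is an assumption about the general shape, not SHAR's actual code: `Processor`, `Engine`, `Register`, `Dispatch`, and `messageType` are hypothetical names, and treating everything after the second subject token as the message type is a simplification chosen for this example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Processor handles one kind of message. Hypothetical type, not SHAR's API.
type Processor func(subject string, payload []byte) error

// ErrNoProcessor is returned when no processor is registered for a message type.
var ErrNoProcessor = errors.New("no processor registered")

// Engine maps a message type to the processor that handles it.
type Engine struct {
	processors map[string]Processor
}

// NewEngine returns an Engine with no processors registered.
func NewEngine() *Engine {
	return &Engine{processors: make(map[string]Processor)}
}

// Register associates a message type with a processor.
func (e *Engine) Register(msgType string, p Processor) {
	e.processors[msgType] = p
}

// messageType drops the first two subject tokens and returns the rest.
// This split is an assumption made for the example.
func messageType(subject string) string {
	parts := strings.SplitN(subject, ".", 3)
	if len(parts) < 3 {
		return ""
	}
	return parts[2]
}

// Dispatch routes a message to its processor using an exact type match.
func (e *Engine) Dispatch(subject string, payload []byte) error {
	p, ok := e.processors[messageType(subject)]
	if !ok {
		return fmt.Errorf("%w: %s", ErrNoProcessor, subject)
	}
	return p(subject, payload)
}

func main() {
	e := NewEngine()
	e.Register("State.Traversal.Execute", func(subject string, _ []byte) error {
		fmt.Println("traversal processor received", subject)
		return nil
	})
	if err := e.Dispatch("WORKFLOW.x.State.Traversal.Execute", nil); err != nil {
		fmt.Println("dispatch failed:", err)
	}
}
```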
If an error occurs whilst processing a message, the error type defines whether the step should be retried, the activity aborted, or the workflow terminated.
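The mapping from error type to outcome could be sketched as below. This is illustrative only: `ActivityError`, `WorkflowFatalError`, `ErrTransient`, and `Classify` are hypothetical names, and the choice to retry any unrecognised error is an assumption of this sketch rather than documented SHAR behaviour.

```go
package main

import (
	"errors"
	"fmt"
)

// Action is what the engine does in response to a processing error.
type Action int

const (
	Retry             Action = iota // try the step again
	AbortActivity                   // give up on the current activity
	TerminateWorkflow               // stop the whole workflow
)

func (a Action) String() string {
	switch a {
	case Retry:
		return "retry step"
	case AbortActivity:
		return "abort activity"
	case TerminateWorkflow:
		return "terminate workflow"
	}
	return "unknown"
}

// ErrTransient stands in for a temporary failure such as a timeout.
var ErrTransient = errors.New("transient failure")

// ActivityError marks a failure that should abort the activity (hypothetical).
type ActivityError struct{ Err error }

func (e ActivityError) Error() string { return "activity failed: " + e.Err.Error() }
func (e ActivityError) Unwrap() error { return e.Err }

// WorkflowFatalError marks a failure that should end the workflow (hypothetical).
type WorkflowFatalError struct{ Err error }

func (e WorkflowFatalError) Error() string { return "workflow fatal: " + e.Err.Error() }
func (e WorkflowFatalError) Unwrap() error { return e.Err }

// Classify inspects the error chain, checking the most severe type first.
// Call it only with a non-nil error. Unrecognised errors are retried,
// which is an assumption of this sketch.
func Classify(err error) Action {
	var fatal WorkflowFatalError
	if errors.As(err, &fatal) {
		return TerminateWorkflow
	}
	var act ActivityError
	if errors.As(err, &act) {
		return AbortActivity
	}
	return Retry
}

func main() {
	errs := []error{
		ErrTransient,
		fmt.Errorf("processing job: %w", ActivityError{Err: ErrTransient}),
		WorkflowFatalError{Err: ErrTransient},
	}
	for _, err := range errs {
		fmt.Printf("%v -> %v\n", err, Classify(err))
	}
}
```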
The following example describes the way SHAR processes an activity. In this case SHAR is running a service task on a client that has previously registered that it can perform task y.
sequenceDiagram
    autonumber
    participant SHAR
    participant NATS
    participant Client
    NATS--)SHAR: WORKFLOW.x.State.Traversal.Execute
    activate SHAR
    SHAR--)NATS: WORKFLOW.x.State.Activity.Execute
    SHAR--)NATS: WORKFLOW.x.State.Traversal.Complete
    note over SHAR: Locate the activity in the WORKFLOW_DEF KV.<br/>This activity is a Service Task type.<br/>Export any variables the task needs.
    note over SHAR: Store state snapshot in WORKFLOW_VARSTATE KV.
    SHAR--)NATS: WORKFLOW.x.State.Job.Execute.ServiceTask.y
    deactivate SHAR
    NATS--)Client: WORKFLOW.x.State.Job.Execute.ServiceTask.y
    activate Client
    note over Client: The client performs any processing<br/>using the provided workflow variables<br/>and returns result variables.
    Client--)NATS: WORKFLOW.x.State.Job.Complete.ServiceTask
    deactivate Client
    NATS--)SHAR: WORKFLOW.x.State.Job.Complete.ServiceTask
    activate SHAR
    note over SHAR: The Service Task completed successfully.<br/>Merge any variables back in the workflow state<br/>from the WORKFLOW_VARSTATE KV.
    SHAR--)NATS: WORKFLOW.x.State.Activity.Complete
    deactivate SHAR
    NATS--)SHAR: WORKFLOW.x.State.Activity.Complete
    activate SHAR
    note over SHAR: Locate the next traversals in the WORKFLOW_DEF KV.
    loop For each traversal
        opt If traversal condition is met
            SHAR--)NATS: WORKFLOW.x.State.Traversal.Execute
        end
    end
    deactivate SHAR
NATS messages are used to trigger state-machine activities and transitions as seen above.
The same NATS messages are used by the engine for performing housekeeping tasks such as clearing up the key/value store.
These messages can also be used for extensibility. An example of this is the SHAR Telemetry Server, which listens to workflow and activity messages and converts them into Jaeger spans for tracing.
It is to be expected that the engine’s host may terminate abruptly during execution. The engine seeks to mitigate the effects of this by starting each critical piece of functionality using a message.
By default, NATS will retry delivery of a message if it is not acknowledged within the timeout period.
If the engine terminates during execution of a critical section, then the triggering message will be resent to another SHAR instance to be re-processed.
SHAR has been written in such a way that the follow-on NATS message is sent before the previous NATS message is acknowledged. This ensures that the workflow stays live even during a NATS outage or SHAR termination.
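The ordering described here, publish the follow-on first and acknowledge second, can be sketched with two small interfaces. `Publisher`, `Acker`, `Advance`, and `recorder` are hypothetical stand-ins for a real NATS client and are not SHAR's code; a real client would need an adapter to fit these interfaces.

```go
package main

import (
	"errors"
	"fmt"
)

// Publisher and Acker are hypothetical stand-ins for a messaging client.
type Publisher interface {
	Publish(subject string, data []byte) error
}

type Acker interface {
	Ack() error
}

// Advance sends the follow-on message and only then acknowledges the
// triggering message. If the publish fails, or the process dies before
// the ack, the trigger stays unacknowledged and is eligible for redelivery.
func Advance(pub Publisher, msg Acker, next string, data []byte) error {
	if err := pub.Publish(next, data); err != nil {
		return fmt.Errorf("publish follow-on %s: %w", next, err)
	}
	return msg.Ack()
}

// recorder is an in-memory fake that logs the order of calls.
type recorder struct {
	events      []string
	failPublish bool
}

func (r *recorder) Publish(subject string, _ []byte) error {
	if r.failPublish {
		return errors.New("broker unavailable")
	}
	r.events = append(r.events, "publish "+subject)
	return nil
}

func (r *recorder) Ack() error {
	r.events = append(r.events, "ack")
	return nil
}

func main() {
	r := &recorder{}
	if err := Advance(r, r, "WORKFLOW.x.State.Activity.Complete", nil); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println(r.events)
}
```

The cost of this ordering is that a crash between the publish and the ack causes the triggering message to be processed again, so the follow-on message may be sent more than once.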
It is imperative that all critical section code is idempotent, i.e. it can be re-entered with the same parameters without causing additional side effects. Code not designed this way could execute processes, tasks, and activities multiple times!
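A common way to make a handler safe against redelivery is to record which units of work have already been applied and skip repeats. The sketch below uses an in-memory map purely for illustration; `Ledger` and `ApplyOnce` are hypothetical names, and in a real deployment the record would have to live in durable storage shared between instances, since an in-memory map does not survive the abrupt termination discussed above.

```go
package main

import (
	"fmt"
	"sync"
)

// Ledger remembers which keys have already had their effect applied.
// Hypothetical type: the in-memory map is for illustration only.
type Ledger struct {
	mu   sync.Mutex
	done map[string]bool
}

// NewLedger returns an empty Ledger.
func NewLedger() *Ledger {
	return &Ledger{done: make(map[string]bool)}
}

// ApplyOnce runs effect the first time it sees key and skips it afterwards.
// It reports whether the effect ran on this call. A failed effect is not
// recorded, so a later redelivery can try again.
func (l *Ledger) ApplyOnce(key string, effect func() error) (bool, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.done[key] {
		return false, nil // already applied: redelivery is a no-op
	}
	if err := effect(); err != nil {
		return false, err
	}
	l.done[key] = true
	return true, nil
}

func main() {
	ledger := NewLedger()
	executions := 0
	handle := func() {
		applied, err := ledger.ApplyOnce("activity-123", func() error {
			executions++
			return nil
		})
		fmt.Println("applied:", applied, "err:", err)
	}
	handle() // first delivery
	handle() // simulated redelivery of the same message
	fmt.Println("executions:", executions)
}
```

Note that a crash between running the effect and recording the key would still cause a repeat, so the effect itself should tolerate being run twice, or the effect and the record should be written atomically.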