# Parallel execution in Apache Hop workflows

One of the first concepts new Apache Hop users learn is that pipelines are executed in parallel and workflows are executed sequentially.

However, there are cases where you want to override these defaults and execute pipelines sequentially or workflows in parallel. We’ll take a closer look at the latter use case and show how you can run actions in a workflow in parallel.

### Multiple workflow action hops

As you already know, actions in a workflow are executed sequentially. Each action in a workflow has an exit code (success or failure) that determines the path the workflow will follow. This exit code can be ignored in the case of an unconditional hop.

A workflow action can have multiple outgoing hops. However, this doesn’t mean the workflow will follow all hops in parallel. If an action has multiple outgoing hops, the default workflow behavior is to execute all actions sequentially in the order they were added to the workflow.

In the example below, the workflow will execute "sample-pipeline.hpl 1" first. Once that action is completed, the workflow will continue to "sample-pipeline.hpl 2".

<img src="/files/00N7yjoZDHchcsupie5P" alt="Sequential actions in Apache Hop workflows" data-size="original">
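The default behavior described above can be sketched in a few lines of plain Python. This is only a conceptual model, not Apache Hop code: `run_action` and `run_sequential` are hypothetical names, and the list order stands in for the order in which the actions were added to the workflow.

```python
# Conceptual model only: these functions are hypothetical, not part of Apache Hop.

def run_action(name, log):
    """Pretend to run one workflow action and report success."""
    log.append(f"start {name}")
    log.append(f"finish {name}")
    return True  # exit code: success


def run_sequential(outgoing_actions, log):
    """Follow each outgoing hop one at a time, in the order given."""
    for name in outgoing_actions:
        run_action(name, log)


log = []
run_sequential(["sample-pipeline.hpl 1", "sample-pipeline.hpl 2"], log)
for line in log:
    print(line)
```

In this model the second action cannot start before the first one has finished, which matches the sequential behavior of a workflow action with multiple outgoing hops.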

### Parallel execution

Parallel execution in a workflow is possible, but this needs to be specified explicitly. To do so, click on an action’s icon and click the "parallel execution" option. Once the parallel option has been activated, the hop line will be dotted and double-crossed, as shown in the screenshot below.

Keep in mind that parallel execution means that all actions that run in parallel will have to share the resources in the Java Virtual Machine (JVM). Small pipelines and workflow actions that run in parallel may be faster, but larger items that require a lot of memory or CPU power may be faster when executed sequentially.

<figure><img src="/files/6bGTzLVOcWKd6UlTQdoC" alt="" width="375"><figcaption></figcaption></figure>

<figure><img src="/files/SoDZTFEBxteG9k55RMwI" alt="" width="375"><figcaption></figcaption></figure>

When you run this workflow, the log messages show that both actions were started in parallel:

```highlight
2023/05/01 10:14:42 - parallel-workflow - Start of workflow execution
2023/05/01 10:14:42 - parallel-workflow - Starting action [sample-pipeline.hpl 1]
2023/05/01 10:14:42 - parallel-workflow - Launched action [sample-pipeline.hpl 1] in parallel.
2023/05/01 10:14:42 - parallel-workflow - Starting action [sample-pipeline.hpl 2]
2023/05/01 10:14:42 - parallel-workflow - Launched action [sample-pipeline.hpl 2] in parallel.
```
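To make the idea of "launched in parallel" concrete, here is a minimal Python sketch that starts two actions on separate threads. It is an analogy under stated assumptions, not how Apache Hop is implemented: `run_action` is a hypothetical stand-in for an action, and the barrier exists only to prove that both actions are running at the same time.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Both actions must reach the barrier before either may continue.
# If they ran one after the other, the first would time out here
# and raise threading.BrokenBarrierError.
barrier = threading.Barrier(2, timeout=5)


def run_action(name):
    """Hypothetical action: wait until the other action is running too."""
    barrier.wait()
    return name


actions = ["sample-pipeline.hpl 1", "sample-pipeline.hpl 2"]
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(run_action, name) for name in actions]
    results = [future.result() for future in futures]

print(results)
```

Note that both threads share one process, in the same way that parallel actions in a workflow share a single JVM.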

### Combining sequential and parallel execution

Once you tell a workflow to run in parallel from a given action, it will continue to run the subsequent actions in parallel.

Consider the extremely simple workflow below. This workflow starts both "sample pipeline" actions in parallel. After the sample pipelines, the workflow will execute the respective "Write to log" actions, and both parallel branches will execute the "Dummy" action.

The effective result will be what is shown in the second screenshot below:

<figure><img src="/files/LYYm0fklZjdzVqY7QrKL" alt=""><figcaption></figcaption></figure>

<figure><img src="/files/VtxeABT1OqpNxSEPlydu" alt=""><figcaption></figcaption></figure>
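The effect of branches that stay parallel can be modeled as follows. This is a hedged Python sketch, not Apache Hop code: `run_branch` is a hypothetical helper, and the numbered "Write to log" names are illustrative labels for the two respective actions.

```python
from concurrent.futures import ThreadPoolExecutor


def run_branch(index):
    """Hypothetical branch: each one runs its own chain of actions in order,
    including the "Dummy" action that both branches lead to."""
    return [
        f"sample-pipeline.hpl {index}",
        f"Write to log {index}",
        "Dummy",
    ]


# The two branches run independently of each other.
with ThreadPoolExecutor(max_workers=2) as pool:
    branches = list(pool.map(run_branch, [1, 2]))

dummy_runs = sum(branch.count("Dummy") for branch in branches)
print(dummy_runs)  # 2: once per branch
```

Because nothing merges the branches, the shared action at the end runs once for every branch that reaches it.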

In a lot of cases, you’ll only want to execute parts of a workflow in parallel. For example, you may want to load data into a number of relatively small database tables or generate a number of relatively small files before continuing with the heavier lifting.

In those scenarios, you’ll want to isolate the parallel processing in a separate child workflow.

In the screenshot below, we’ve isolated the part of the workflow we want to execute in parallel into a child workflow. When the parent workflow runs, the child workflow ("parallel workflow") will run both sample pipelines in parallel. When the last of these two pipelines finishes, the parent workflow will continue its (sequential) execution.

<figure><img src="/files/6PqMvfRKq5SMMvJVoruV" alt="" width="563"><figcaption></figcaption></figure>

<figure><img src="/files/SoDZTFEBxteG9k55RMwI" alt="" width="375"><figcaption></figcaption></figure>
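The parent and child relationship can be sketched in Python like this. It is a conceptual model only: `parent_workflow`, `child_workflow`, and `run_pipeline` are hypothetical names, and the thread pool merely imitates the parallel part.

```python
from concurrent.futures import ThreadPoolExecutor

events = []


def run_pipeline(index):
    """Hypothetical pipeline run inside the child workflow."""
    events.append(f"sample pipeline {index} finished")


def child_workflow():
    """Runs both pipelines in parallel and returns only when both are done."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        list(pool.map(run_pipeline, [1, 2]))


def parent_workflow():
    """Stays sequential: the child is just one step in the chain."""
    events.append("parent: before child workflow")
    child_workflow()
    events.append("parent: after child workflow")


parent_workflow()
for event in events:
    print(event)
```

From the parent's point of view the child workflow is a single action, so the parallelism stays contained inside it.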

#### Using the Join action instead of a child workflow

As of recent versions of Apache Hop, you can achieve the same effect **without a child workflow** by using the new `Join` action.

The `Join` action allows you to **synchronize multiple parallel branches directly within the same workflow**. It waits until all incoming branches have completed before allowing the workflow to continue. This makes it ideal when you want to combine parallel and sequential execution in a single workflow, without the added complexity of nesting child workflows.

To use the Join action:

1. From your starting action, create multiple outgoing hops.
2. Enable **parallel execution** on the hops you want to run simultaneously.
3. Add a `Join` action where those branches should merge.
4. Connect the `Join` action to the next sequential action(s).

<figure><img src="/files/C1jLP2ezUowHtEfvUs6z" alt="" width="352"><figcaption></figcaption></figure>

This approach simplifies workflow design, improves readability, and reduces the number of components to manage.
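The synchronization that the `Join` action provides resembles waiting on a set of futures. The Python sketch below illustrates the idea under that assumption. It does not use or describe Apache Hop's implementation, and `branch` and its durations are made up for the example.

```python
import time
from concurrent.futures import ALL_COMPLETED, ThreadPoolExecutor, wait


def branch(name, seconds):
    """Hypothetical parallel branch that takes a given amount of time."""
    time.sleep(seconds)
    return name


with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [
        pool.submit(branch, "branch A", 0.05),
        pool.submit(branch, "branch B", 0.2),
    ]
    # The "join": block until every incoming branch has completed.
    done, not_done = wait(futures, return_when=ALL_COMPLETED)

finished = sorted(future.result() for future in done)
print(finished)
print("continue with the next sequential action")
```

Even though the branches take different amounts of time, the step after the join runs once, and only after the slowest branch has finished.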

**Use the `Join` action when:**

* You want to synchronize parallel execution within a single workflow.
* You want to avoid external child workflows for basic parallel branching.
* You need to continue processing **only after all parallel branches are finished**.

### Summary

In this post, we walked through the various options to run workflow actions in parallel in Apache Hop. You also learned how to combine parallel and sequential execution, either through child workflows or with the `Join` action.
