5 minutes to configure Unit tests in Apache Hop - Bypass and Remove

Unit tests in Apache Hop - a quick recap

This is the second post in our three-part series on unit testing in Apache Hop. Check the first post to find out why you would want to build unit tests for your Apache Hop projects.

Briefly put: a unit test executes a pipeline and compares the generated output to the expected output (also known as the Golden Dataset).
If the generated output exactly matches the Golden Dataset, the test passes.
If there are any differences between the generated and the expected (golden) result, the test fails, and you have some work to do. 
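
The pass/fail rule above can be pictured with a minimal Python sketch. This is not how Hop implements it: `run_unit_test`, `double_number`, the row format, and the sample values are all hypothetical, and the order-sensitive comparison is an assumption made to keep the example short.

```python
from typing import Callable

Row = dict[str, int]


def run_unit_test(pipeline: Callable[[list[Row]], list[Row]],
                  input_rows: list[Row],
                  golden_rows: list[Row]) -> bool:
    """Run the pipeline on the input dataset and compare with the golden dataset."""
    generated = pipeline(input_rows)
    # The test passes only if the generated rows match the golden rows exactly.
    return generated == golden_rows


def double_number(rows: list[Row]) -> list[Row]:
    # Hypothetical pipeline: adds a calculated column to every row.
    return [{**row, "calc": row["number"] * 2} for row in rows]


input_rows = [{"number": 1}, {"number": 2}]
golden_rows = [{"number": 1, "calc": 2}, {"number": 2, "calc": 4}]

print(run_unit_test(double_number, input_rows, golden_rows))
```

With these sample rows the generated output equals the golden rows, so the sketch reports a passing test. Dropping or changing any golden row would make it fail.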

Bypass and remove in unit tests

The basic scenario we covered in the first post just runs a unit test pipeline "as is". That works for basic unit tests, but there are use cases where you may need more tweaks. There will be certain transforms you want to ignore or substreams in your pipeline that you want to exclude from the test.

Even if you don't really need any of these in your real unit test, they may be useful while you're developing your unit tests. 

Unit tests in Apache Hop provide two options to exclude parts of a pipeline from a unit test: 

  • Bypass a transform in the unit test. Transforms that are bypassed will be replaced with a dummy transform when the unit test runs.
  • Remove a transform from the unit test. This cuts all hops from incoming transforms to this transform, so no rows flow through this transform or any of the transforms that follow it.
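
To make the difference between the two options concrete, here is a toy Python model of a linear pipeline. It is only a sketch under stated assumptions: `run_linear_pipeline` and the transform names `add-one` and `keep-even` are hypothetical, and a real Hop pipeline is a graph of transforms and hops rather than a simple list.

```python
from typing import Callable, Iterable

Rows = list[int]
Transform = Callable[[Rows], Rows]


def run_linear_pipeline(transforms: list[tuple[str, Transform]],
                        rows: Rows,
                        bypass: Iterable[str] = (),
                        remove: Iterable[str] = ()) -> Rows:
    """Toy model: apply each transform in order, honoring bypass and remove."""
    bypassed, removed = set(bypass), set(remove)
    for name, transform in transforms:
        if name in removed:
            # Incoming hops are cut: nothing reaches this transform
            # or any transform after it.
            return []
        if name in bypassed:
            # Acts like a dummy transform: rows pass through unchanged.
            continue
        rows = transform(rows)
    return rows


transforms = [
    ("add-one", lambda rows: [r + 1 for r in rows]),
    ("keep-even", lambda rows: [r for r in rows if r % 2 == 0]),
]

print(run_linear_pipeline(transforms, [1, 2, 3]))
print(run_linear_pipeline(transforms, [1, 2, 3], bypass=["keep-even"]))
print(run_linear_pipeline(transforms, [1, 2, 3], remove=["keep-even"]))
```

In this model, bypassing `keep-even` lets every row through untouched, while removing it stops the stream at that point.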

Let's take a closer look at a couple of examples. As always, the examples here use a Hop project with environment variables to separate code and configuration in your Hop projects.

Sample unit test 1 - Bypass

The following pipeline generates a simple calculation from a generated number column and writes the results to a CSV file. The generate-rows-unit unit test includes the following datasets:

  • data-set-number as the input dataset
  • data-set-calc as the golden dataset

ut-sample1

If you run the pipeline with no modifications, you’ll receive a notification in the logs:

ut-sample1-2

But if we, for example, add a transform to modify the stream, the results will cause the test to fail.

ut-sample1-3

In this case, we’re filtering the row with sequence=12. When we run the pipeline, a popup shows the failed rows:

ut-sample1-4

Note that the test fails because the number of rows no longer matches the golden dataset.
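
A row-count failure like this one can be reproduced in a small Python sketch. It assumes the filter drops the row with sequence=12 from the stream; `describe_mismatch` and the three sample rows are hypothetical and do not reflect the actual contents of data-set-calc or the format of Hop's popup.

```python
def describe_mismatch(generated: list[dict], golden: list[dict]) -> list[str]:
    """List the differences between generated rows and golden rows."""
    problems = []
    if len(generated) != len(golden):
        problems.append(
            f"row count differs: expected {len(golden)}, got {len(generated)}"
        )
    # Compare the rows position by position as far as both lists go.
    for index, (actual, expected) in enumerate(zip(generated, golden)):
        if actual != expected:
            problems.append(f"row {index}: expected {expected}, got {actual}")
    return problems


golden = [{"sequence": 11}, {"sequence": 12}, {"sequence": 13}]
# Hypothetical effect of the extra filter: the sequence=12 row never arrives.
generated = [row for row in golden if row["sequence"] != 12]

for problem in describe_mismatch(generated, golden):
    print(problem)
```

One missing row is enough to fail the test, and every row after the gap is reported as different as well because the positions have shifted.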

Bypass in test

What if we want to disable this transform in the pipeline when we run the unit test?

While developing pipelines, you’ll often remove or disable transforms in a pipeline. We can do the same in unit tests.

In our example, we may want to bypass the transform that caused the test to fail (filter-seq-12).

Bypassing a transform in a test will replace the transform with a Dummy transform while executing the test.

To do so, follow the steps below.

Step 1: Set the transform as a bypass

Click on the filter-seq-12 transform icon and select the Bypass in test option. An arrow (→) icon then appears on the transform, as in the image below:

ut-sample1-8

Step 2: Run the pipeline

ut-sample1-9

Note that the pipeline now runs and all tests pass.

You can use the Remove bypass in test option to enable the transform again. Click on the filter-seq-12 transform icon and select the Remove bypass in test option:

ut-sample1-10

The arrow (→) icon is then removed from the transform, as in the image below:

ut-sample1-11

Sample unit test 2 - Remove

Let’s look at another sample pipeline. It generates the same simple calculation from a generated number column, but in this case it writes the results to one CSV file and the filtered result to another CSV file.

The generate-rows-unit unit test includes the following datasets:

  • data-set-number as the input dataset
  • data-set-calc as the golden dataset

ut-sample-ut2

Remove from test

What if you want to exclude the write-12-to-csv transform from the unit test execution? In other words, you want to write the rows for which the filter is false, but not the rows for which it is true.

In this case, you can use the Remove from test option.

To do so, follow the steps below:

Step 1: Set the transform as removed

Click on the write-12-to-csv transform icon and select the Remove from test option. An ✖ icon then appears on the transform, as in the image below:

ut-sample-ut2-1

Step 2: Run the pipeline

ut-sample-ut2-2

Note that the pipeline runs and all tests pass.

ut-sample-ut2-3
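
The effect of Remove from test on a branching pipeline like this one can be sketched in Python. This is a toy model, not Hop's behavior in detail: `run_split_pipeline` and the output name `write-to-csv` are hypothetical (only write-12-to-csv is named in this post), and the sample rows are made up.

```python
from typing import Iterable


def run_split_pipeline(rows: list[dict],
                       removed: Iterable[str] = ()) -> dict[str, list[dict]]:
    """Toy model: route rows to one of two writers, skipping removed ones."""
    removed_names = set(removed)
    outputs: dict[str, list[dict]] = {"write-to-csv": [], "write-12-to-csv": []}
    for row in rows:
        # The filter sends the sequence=12 row down the "true" branch.
        target = "write-12-to-csv" if row["sequence"] == 12 else "write-to-csv"
        if target in removed_names:
            # The hop into a removed transform is cut, so the row goes nowhere.
            continue
        outputs[target].append(row)
    return outputs


rows = [{"sequence": 11}, {"sequence": 12}, {"sequence": 13}]

print(run_split_pipeline(rows))
print(run_split_pipeline(rows, removed=["write-12-to-csv"]))
```

With write-12-to-csv removed, the "false" branch still receives its rows while the "true" branch receives nothing.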

You can use the Include in test option to enable the transform again. Click on the write-12-to-csv transform icon and select the Include in test option:

ut-sample-ut2-4

The ✖ icon is then removed from the transform, as in the image below:

ut-sample-ut2-5

We now have all the required tools to build standard and more advanced unit tests in Apache Hop.

Building and running tests manually, as we've done so far, is useful, but it only gets you so far. As your library of unit tests grows, you'll want a way to run all of them periodically (for example, daily). After all, you're interested in test results, not in building tests.

In the third and final post in this unit testing series, we'll look at how you can orchestrate your Apache Hop unit tests with the unit testing transform and action.  
