

Design once, run anywhere


Runtime Agnostic

Design a data process once and run it on any engine you want. Apache Hop lets you design a data pipeline and run it on your local laptop, on a remote server, or on Apache Spark, Apache Flink, or Google Dataflow through Apache Beam.

A Pipeline Run Configuration is a metadata object that decouples the design and execution phases of Apache Hop pipeline development. A pipeline defines how data is processed; a run configuration defines where the pipeline is executed.
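
The separation between "how" and "where" can be illustrated with a small conceptual sketch. This is not Apache Hop's API: the `Pipeline` and `RunConfiguration` classes, both engine functions, and the sample rows are hypothetical names invented for illustration. The point is only that the same pipeline definition is handed unchanged to two different run configurations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
Transform = Callable[[Iterable[Row]], Iterable[Row]]


@dataclass
class Pipeline:
    """HOW data is processed: an ordered list of transforms (hypothetical class)."""
    name: str
    transforms: List[Transform]


def local_engine(pipeline: Pipeline, rows: Iterable[Row]) -> List[Row]:
    """Run every transform in order, in the current process."""
    for transform in pipeline.transforms:
        rows = transform(rows)
    return list(rows)


def batched_engine(pipeline: Pipeline, rows: Iterable[Row]) -> List[Row]:
    """Toy stand-in for a different engine: process the input in batches of two."""
    all_rows = list(rows)
    result: List[Row] = []
    for start in range(0, len(all_rows), 2):
        result.extend(local_engine(pipeline, all_rows[start:start + 2]))
    return result


@dataclass
class RunConfiguration:
    """WHERE a pipeline runs: a named engine, kept apart from the pipeline itself."""
    name: str
    engine: Callable[[Pipeline, Iterable[Row]], List[Row]]

    def run(self, pipeline: Pipeline, rows: Iterable[Row]) -> List[Row]:
        return self.engine(pipeline, rows)


def keep_adults(rows: Iterable[Row]) -> Iterable[Row]:
    return (row for row in rows if row["age"] >= 18)


def add_greeting(rows: Iterable[Row]) -> Iterable[Row]:
    return ({**row, "greeting": f"Hello, {row['name']}"} for row in rows)


# One pipeline definition ...
pipeline = Pipeline("greet-adults", [keep_adults, add_greeting])
sample_rows = [
    {"name": "Ada", "age": 36},
    {"name": "Tim", "age": 12},
    {"name": "Bob", "age": 50},
]

# ... executed through two different run configurations.
local_result = RunConfiguration("local", local_engine).run(pipeline, sample_rows)
batched_result = RunConfiguration("batched", batched_engine).run(pipeline, sample_rows)
print(local_result == batched_result)
```

Because the pipeline holds no reference to an engine, switching where it runs means choosing a different run configuration, not editing the pipeline.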


Runtime engines

Apache Hop supports a number of different runtime engines:
  • Beam DataFlow pipeline engine: runs pipelines on Google Dataflow over Apache Beam.
  • Beam Direct pipeline engine: runs pipelines on the direct Beam runner (mainly for testing purposes).
  • Beam Flink pipeline engine: runs pipelines on Apache Flink over Apache Beam.
  • Beam Spark pipeline engine: runs pipelines on Apache Spark over Apache Beam.
  • Local pipeline engine: runs pipelines locally in the native Hop engine.
  • Remote pipeline engine: runs pipelines in the native Hop engine on a remote machine.
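
As a rough sketch of how the choice of engine can stay outside the pipeline file, the helper below builds a command line for Hop's `hop-run` launcher, which accepts a pipeline file and the name of a run configuration. The helper function, the pipeline file name, and the run configuration names `local` and `flink` are assumptions for illustration; the names must match run configurations defined in your own project metadata, and Beam-based configurations typically need further engine-specific setup that is left out here.

```python
from typing import List


def hop_run_command(pipeline_file: str,
                    run_configuration: str,
                    launcher: str = "./hop-run.sh") -> List[str]:
    """Build an argument list for hop-run (hypothetical helper, not part of Hop).

    Only the run configuration name changes between engines; the pipeline
    file stays the same.
    """
    return [
        launcher,
        f"--file={pipeline_file}",
        f"--runconfig={run_configuration}",
    ]


# Same pipeline file, two assumed run configuration names.
local_cmd = hop_run_command("my-pipeline.hpl", "local")
flink_cmd = hop_run_command("my-pipeline.hpl", "flink")

print(" ".join(local_cmd))
print(" ".join(flink_cmd))
```

The resulting list could be passed to `subprocess.run` on a machine where Hop is installed; the sketch only prints it so that it runs anywhere.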