Apache Hop 2.6.0 is available!
After three months of work on 65 tickets, the Apache Hop community released Apache Hop 2.6.0 earlier this week.
After the mostly bug-fixing-oriented releases earlier this year, the 2.6.0 release comes with a couple of exciting new features.
Apache Hop 2.6.0 is quite Google-heavy: Apache Beam (to which Google is a significant contributor) was upgraded to 2.50.0, and there are new transform plugins for Google Analytics 4 and Google Sheets (input and output).
Let's take a closer look.
Apache Beam 2.50.0 upgrade, additional docs
Every Apache Hop release ships with the latest Apache Beam version, and Apache Hop 2.6.0 is no exception.
The Beam integration in Apache Hop was upgraded to Apache Beam 2.50.0.
The Apache Hop docs already contained a section on how to run the Hop samples on Apache Spark, Apache Flink and the direct runner over Apache Beam. Instructions on how to set up your environment and run the samples on Google Cloud Dataflow were missing, and have been added in Apache Hop 2.6.0.
Three new Google transforms
Google Analytics 4
Google Analytics 4 is an analytics service that enables you to measure traffic and engagement across your websites and apps. Where Universal Analytics (the "old" Google Analytics) measured screen views in separate mobile-specific properties, Google Analytics 4 (GA4) combines both web and app data in the same property.
The Google Analytics 4 transform lets you read data from a number of dimensions (e.g. geography, website page paths) and metrics (e.g. active users, sessions) into your pipelines.
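To make the dimension/metric split concrete, here is a minimal sketch of a report request in the shape used by the GA4 Data API `runReport` method. It is an illustration only: the helper name `build_run_report_request` is hypothetical, and nothing here is taken from the Hop transform's own code, which may build its requests differently.

```python
import json


def build_run_report_request(dimensions, metrics, start_date, end_date):
    """Build a dict shaped like a GA4 Data API runReport request body.

    Hypothetical helper for illustration; not part of Apache Hop.
    """
    return {
        # Dimensions describe *what* is being grouped by (e.g. country, page path).
        "dimensions": [{"name": name} for name in dimensions],
        # Metrics are the numbers measured per group (e.g. active users, sessions).
        "metrics": [{"name": name} for name in metrics],
        # One date range; GA4 accepts relative values such as "7daysAgo" and "today".
        "dateRanges": [{"startDate": start_date, "endDate": end_date}],
    }


request_body = build_run_report_request(
    dimensions=["country", "pagePath"],
    metrics=["activeUsers", "sessions"],
    start_date="7daysAgo",
    end_date="today",
)
print(json.dumps(request_body, indent=2))
```

Each row of the resulting report holds one value per requested dimension and one value per requested metric, which maps naturally onto the fields of a pipeline row.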
The user interface for the existing Google Analytics plugin in the external Hop plugins repository was used as a starting point. Behind the scenes, all of the functionality was rebuilt to work with the Google Analytics 4 API.
Google Analytics 4 is an evolving platform, and this first iteration of this transform is very likely to evolve with it. Reach out if you need urgent bug fixes or enhancements.
Google Sheets Input and Output
Just like the Google Analytics plugin, a pre-existing version of the Google Sheets Input and Output transforms was available in the external plugins repository.
The Google Sheets Input and Output transforms were originally developed by Jeff Monteil as plugins for Pentaho Data Integration and were later ported to Apache Hop. After a couple of years of low maintenance, these plugins were dusted off and received some major work: the code was cleaned up and updated to use the latest Apache Hop, Google Drive and Google Sheets APIs. Additionally, a number of features that had only been added to the original (Pentaho) version of the plugins were implemented in the Hop version, along with a couple of extra tweaks and improvements.
Both plugins use a Google service account JSON file for authentication (email is optional). Once you've successfully tested the connection, using these plugins should be straightforward.
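If the connection test fails, a quick sanity check of the key file can help rule out the obvious causes. The sketch below is a hypothetical pre-check, not something the plugins do themselves: it assumes the standard fields found in a Google service account key (`type`, `project_id`, `private_key`, `client_email`), and the sample values are placeholders.

```python
import json

# Fields normally present in a Google service account key file.
REQUIRED_KEYS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(text):
    """Return a list of problems found in a service account key JSON document.

    Hypothetical helper for illustration; an empty list means nothing obvious is wrong.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {key}" for key in REQUIRED_KEYS if not data.get(key)]
    key_type = data.get("type")
    if key_type and key_type != "service_account":
        problems.append(f"unexpected type: {key_type!r}")
    return problems


# Placeholder key content; a real file would be read from disk instead.
sample = json.dumps({
    "type": "service_account",
    "project_id": "my-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
    "client_email": "hop-reader@my-project.iam.gserviceaccount.com",
})
print(check_service_account_key(sample))
```

A key file that passes this check can still be rejected, for example when the spreadsheet has not been shared with the service account's `client_email` address, so treat it as a first filter rather than a guarantee.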
Just like with the Google Analytics 4 transform, these transforms are likely to evolve in future Apache Hop versions. Reach out if you need urgent bug fixes or enhancements.
The Apache Hop community continues to grow.
Building a growing and thriving community is one of the tasks every incubating project takes on when it joins the ASF Incubator. The Apache Hop community is distributed all over the globe, and new members join continuously.
The Apache Hop team has already started working on Apache Hop 2.7.0, which is expected to be released in early to mid-November.
know.bi will continue to work with the Apache Hop community to make Apache Hop a stable and reliable data integration and orchestration platform with unparalleled technology support.
Reach out if you want to find out more about Apache Hop, or if you'd like to discuss how we can help you build a successful data platform with Apache Hop.