The Apache Hop community released Apache Hop 2.8.0 late last week. This release contains over three...
know.bi blog
building rock-solid data platforms, one step at a time
The Apache Hop community released version 2.7.0 earlier this month. Let's take a closer look at...
After three months of work on 65 tickets, the Apache Hop community released Apache Hop 2.6.0...
Apache Hop 2.5.0 was released late last week. 2.5.0 is the latest in a series of mostly hardening...
The Apache Beam project released 2.48.0 just in time to be included in the upcoming Apache Hop...
A number of questions have been raised in the Apache Hop chat about running workflows and pipelines...
One of the first concepts new Apache Hop users learn is that pipelines are executed in parallel and...
In any data engineering project, there are lots of use cases where you'll want the same process to...
We're two months into what more or less organically has become the bi-monthly release cycle for...
The Apache Hop team released Apache Hop 2.3.0 earlier this week.
Apache Hop 2.2.0 is available!
Apache Hop continues to evolve quickly. After the 2.1.0 release, less than two months ago and over...
The Apache Hop team just released version 2.1.0.
This new release is the result of four and a half...
What is data testing, and why should you test your data?
Apache Hop is a data engineering and data...
This guide will teach you the process of exporting data from a relational database (MySQL) and...
This guide will teach you the process of exporting data from a relational database (MySQL) and...
7 key points to successfully upgrade from Pentaho to Apache Hop
Why would you upgrade your Pentaho projects to Apache Hop?
Before going into the details of how you...
The Apache Hop PMC and community released Apache Hop 2.0.0 late last week. This is the second major...
Workflow Log
Apache Hop is a data engineering and data orchestration platform that allows data...
Pipeline Log
Apache Hop is a data engineering and data orchestration platform that allows data...
Neo4j is the world's leading graph database management system, designed for optimized fast...
Neo4j is the world's leading graph database management system, designed for optimized fast...
What is Apache Hop?
Apache Hop is a visual, metadata-driven data engineering platform that allows...
Earlier this month, the Apache Hop PMC and community released Apache Hop 1.2.0.
Hop 1.1.0 - Apache Hop continues to move fast!
Apache Hop 1.1.0, the first Hop release as an Apache...
Incubator - the Apache Way and Community
Just before the end of 2021, Apache Hop graduated from...
The Apache Hop (Incubating) project just released version 1.0, the first major release of the...
The Apache Hop community recently released Apache Hop 0.99. This will be the last release before...
Pentaho: the rise and fall of a platform
Imagine a couple of years ago. You are carefully exploring...
What is Data Lineage?
Wikipedia describes data lineage as:
Project Hop joins the Apache Software Foundation, is now Apache Hop (Incubating)
As explained in a previous post, know.bi has been working with Matt Casters (Neo4j) and our growing...
Project Hop was announced at KCM19 back in November 2019. The first preview release is available...
Catching the "bad guys" using graphs.
Figure 1: Gartner layered model for fraud detection
Amazon SageMaker is a "fully managed machine learning service". This means it provisions an...
3 reasons to automate your analytics projects
Automate everything!
Analytics projects are often treated as ad-hoc projects. Code and content are...
What size is this?
Suppose you want to predict the length or width of a flower petal. For this...
What's weird about this?
At certain times you might be faced with unexpected patterns or events...
How is this related?
In this post, we'll take a look at how we can find out in what way data is...
Is this A, or B?
As a follow-up to last week's machine learning tidbit, let's look at an example of...
Graph Databases - Analytical Use Cases
What is a graph database?
Although graph theory has been around for centuries, graph databases...
So you want to get started with Machine Learning?
5 Key Components For Your Cloud Analytics Project
Why move your BI to the cloud?
As discussed in a previous post, there are many reasons to move...
In-SQL machine learning
Vertica is a clustered analytical database that handles large, fast-growing...
Cloud computing is the way to the future, and the way to bring your company to the next level. With...
What is Amazon DMS?
Every day, more and more companies are moving towards cloud computing, with...
3 reasons to move your ETL to the web, cloud
ETL development heavily relies on the desktop with...