Data comes from a wide variety of sources, often at different quality levels.
Getting reliable results from data requires a lot of preparatory work.
Regardless of a project's goal or the desired solution (data warehousing, big data analytics, machine learning), a common rule of thumb is that 80% of the time is spent on data preparation. Data preparation is a broad term that can cover data acquisition, linking data sets, data cleaning, and the actual loading of data into the desired format or target platform.
To be efficient at data preparation, having the right tools for the task is crucial. Kettle (also known as Pentaho Data Integration) is an open source data integration platform with over 15 years of history.
Kettle lets you develop data streams or pipelines visually. After the initial development, the Kettle code is managed like any other software, including version control, testing, CI/CD, etc.
Additionally, visual development allows developers, data engineers and data scientists to focus on what needs to be done with the data, not on how to implement it.
Kettle supports a large number of data formats, can connect to every significant data platform on the market, and has extensive options for building scalable and extensible solutions.
know.bi has been involved in the development of Kettle from a very early stage. We know the platform inside out and can maximize your return on investment. Apart from standard services, we can provide help in tailoring the Kettle platform to your needs.