Data comes from many different sources, often of widely varying quality. Getting sound results from that data requires a lot of preparatory work.
Regardless of the goal of a project or the desired solution (data warehousing, big data analytics, machine learning), a commonly cited estimate is that about 80% of the time is spent on data preparation.
Data preparation is a broad concept that can involve the collection, linking, cleaning, and writing of data.
Merging and cleansing data from different sources turns raw data into useful information and yields better insights. Apache Hop is the tool we use to enable these processes.
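To make the collect, link, clean, and write steps concrete, here is a minimal sketch in plain Python. It only illustrates the concept and is not how Apache Hop pipelines are built (those are designed visually). The sample records and the helper names `clean_name`, `prepare`, and `to_csv` are hypothetical.

```python
import csv
import io

# Hypothetical sample data standing in for two separate sources.
customers = [
    {"id": "1", "name": "  Alice "},
    {"id": "2", "name": "BOB"},
]
orders = [
    {"customer_id": "1", "amount": "19.90"},
    {"customer_id": "2", "amount": "5.00"},
    {"customer_id": "3", "amount": "7.50"},  # no matching customer
]


def clean_name(raw):
    """Clean: trim whitespace and normalize capitalization."""
    return raw.strip().title()


def prepare(customers, orders):
    """Link orders to cleaned customer names; drop orders without a match."""
    names = {c["id"]: clean_name(c["name"]) for c in customers}
    return [
        {"customer": names[o["customer_id"]], "amount": float(o["amount"])}
        for o in orders
        if o["customer_id"] in names
    ]


def to_csv(rows):
    """Write: serialize the prepared rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["customer", "amount"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = prepare(customers, orders)
print(to_csv(rows))
```

Even in this toy case, most of the code deals with cleaning and linking rather than with producing the final result, which is the point the paragraph above makes about where project time goes.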
Data processes need to be easy to design, easy to test, easy to run, and easy to deploy. We believe that visually designing data processes greatly increases developer productivity.
Although visually designed, all our work items can be managed like any other piece of software: version control, testing, CI/CD, and documentation are all first-class citizens in the Apache Hop platform.