VERTICA
Distributed data warehouse for lightning-fast analytics at scale
What Is Vertica
Vertica is an advanced analytics platform built for the scale and complexity of today’s data-driven world.
It combines the power of a high-performance, Massively Parallel Processing query engine with advanced analytics and machine learning.
Through integration with Kafka, Spark and other platforms, and the option to read directly from your data lake, Vertica is more than just a database.
How fast is Vertica?
Vertica is lightning-fast on data volumes up to petabytes. As a distributed analytical database, Vertica was designed from the start for analytics.
Not only standard queries are fast. Vertica's in-SQL machine learning functions run directly in the database, saving you the time otherwise spent offloading and reloading data for training and model building.
Flexible bulk loading
Having a lightning-fast database isn't of much use if your data loading can't keep up.
Apache Hop's bulk loader for Vertica loads your data as quickly as Vertica returns results for your queries. Combined with the visual pipeline design and project life cycle management, Apache Hop and Vertica are a killer combination.
Analyzing your data not only in place, but in the right place, without data movement, while supporting any major cloud deployment for fast and reliable reads and writes across multiple data formats.
Supporting machine learning at scale to transform the way your data scientists and analysts interact with data, while removing barriers and accelerating time to value on predictive analytics projects.
Queries read only the columns they need, whereas traditional row-oriented databases read entire rows. This makes a huge difference for analytical queries over vast amounts of data.
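To make the column-oriented idea concrete, here is a minimal Python sketch. It is a toy model, not Vertica's actual storage engine: the sample data, field names, and helper functions are all hypothetical, and "values read" is only a rough stand-in for I/O cost.

```python
from typing import Dict, List, Tuple

# Hypothetical sample table: four orders with four fields each.
ROWS: List[Dict[str, object]] = [
    {"order_id": 1, "customer": "acme", "region": "EU", "amount": 120.0},
    {"order_id": 2, "customer": "globex", "region": "US", "amount": 80.0},
    {"order_id": 3, "customer": "initech", "region": "EU", "amount": 50.0},
    {"order_id": 4, "customer": "umbrella", "region": "US", "amount": 250.0},
]


def to_columns(rows: List[Dict[str, object]]) -> Dict[str, List[object]]:
    """Pivot row-oriented records into one list per column."""
    columns: Dict[str, List[object]] = {key: [] for key in rows[0]}
    for row in rows:
        for key, value in row.items():
            columns[key].append(value)
    return columns


def sum_row_store(rows: List[Dict[str, object]], field: str) -> Tuple[float, int]:
    """Sum one field in a row layout; every field of every row counts as read."""
    total = 0.0
    values_read = 0
    for row in rows:
        values_read += len(row)  # the whole row is fetched to get one field
        total += row[field]
    return total, values_read


def sum_column_store(columns: Dict[str, List[object]], field: str) -> Tuple[float, int]:
    """Sum one field in a column layout; only that column is read."""
    column = columns[field]
    return sum(column), len(column)
```

Calling `sum_row_store(ROWS, "amount")` and `sum_column_store(to_columns(ROWS), "amount")` gives the same total, but the row layout touches all 16 stored values while the column layout touches only the 4 values in `amount`. The gap widens as tables get wider.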
Future proofing your analytics with the freedom to deploy anywhere – on commodity hardware, across multiple clouds, and natively on any Hadoop distribution.
Bridging the gap between high-cost legacy EDWs and Hadoop data lakes with signature blazing fast ANSI SQL query execution at extreme scale.
Data is stored on a cluster of machines, and analytical workloads are likewise shared across the machines in the cluster.
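The idea of spreading both data and work over a cluster can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how Vertica distributes data internally: the hash-based placement, the node count, and every function name below are hypothetical.

```python
import zlib
from collections import defaultdict
from typing import Dict, List

# Hypothetical sample data: orders with a region and an amount.
ORDERS: List[Dict[str, object]] = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
    {"order_id": 4, "region": "US", "amount": 250.0},
]


def assign_node(key: str, node_count: int) -> int:
    """Pick a node for a key using a deterministic hash (assumed placement rule)."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def distribute(rows: List[Dict[str, object]], node_count: int) -> List[List[Dict[str, object]]]:
    """Place each row on exactly one simulated node."""
    nodes: List[List[Dict[str, object]]] = [[] for _ in range(node_count)]
    for row in rows:
        nodes[assign_node(str(row["order_id"]), node_count)].append(row)
    return nodes


def local_sum_by_region(rows: List[Dict[str, object]]) -> Dict[str, float]:
    """Work done independently on one node: aggregate only its own rows."""
    partial: Dict[str, float] = defaultdict(float)
    for row in rows:
        partial[row["region"]] += row["amount"]
    return dict(partial)


def merge(partials: List[Dict[str, float]]) -> Dict[str, float]:
    """Combine the per-node partial results into the final answer."""
    total: Dict[str, float] = defaultdict(float)
    for partial in partials:
        for region, amount in partial.items():
            total[region] += amount
    return dict(total)


def total_by_region(rows: List[Dict[str, object]], node_count: int) -> Dict[str, float]:
    """Run the full distribute, aggregate locally, merge flow."""
    nodes = distribute(rows, node_count)
    return merge([local_sum_by_region(node_rows) for node_rows in nodes])
```

`total_by_region(ORDERS, 3)` returns the same totals as a single-machine aggregation, whatever the node count. In a real cluster the per-node step would run in parallel on separate machines.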