Today, a large graph database may hold trillions of nodes. Suppose that your application suddenly becomes popular: traffic and data volume are growing fast, and your database gets more overloaded every day.
Imagine your business entities and relations as a single graph. The physical storage of that graph is divided, or sharded, across many servers or clusters, even though it remains a single logical graph dataset.
Although common in the world of relational databases, sharding is fairly new to graph databases.
Sharding is a method of splitting a single logical dataset and storing it across multiple databases. By distributing the data among multiple machines, a cluster of database systems can store a larger dataset and handle more requests.
Sharding is necessary when a dataset is too large to be stored in a single database. Moreover, many sharding strategies allow additional machines to be added, so a database cluster can scale along with its data and traffic growth.
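To make the idea concrete, here is a minimal sketch of one possible sharding strategy: placing each node on a shard chosen by a stable hash of its identifier. The names `shard_for` and `ShardedGraph` are hypothetical, the in-memory dictionaries merely stand in for separate servers, and real graph databases may partition their data quite differently.

```python
import hashlib
from collections import defaultdict


def shard_for(node_id: str, num_shards: int) -> int:
    """Map a node id to a shard index using a stable hash (illustrative only)."""
    digest = hashlib.sha256(node_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedGraph:
    """One logical graph whose adjacency lists are spread over several shards."""

    def __init__(self, num_shards: int):
        self.num_shards = num_shards
        # Each shard stands in for a separate server: node id -> set of neighbor ids.
        self.shards = [defaultdict(set) for _ in range(num_shards)]

    def add_edge(self, src: str, dst: str) -> None:
        # The outgoing edge is stored on the shard that owns the source node.
        self.shards[shard_for(src, self.num_shards)][src].add(dst)
        # Touching the entry registers the destination node on its own shard.
        self.shards[shard_for(dst, self.num_shards)][dst]

    def neighbors(self, node_id: str) -> set:
        # Only the shard that owns the node needs to be consulted.
        shard = self.shards[shard_for(node_id, self.num_shards)]
        return set(shard.get(node_id, set()))

    def node_count(self) -> int:
        return sum(len(shard) for shard in self.shards)


graph = ShardedGraph(num_shards=4)
graph.add_edge("alice", "bob")
graph.add_edge("alice", "carol")
print(graph.neighbors("alice"))
print(graph.node_count())
```

Callers still see a single graph, while each lookup touches only one shard. Note that with simple modulo placement like this, changing the number of shards moves most nodes to a different shard; strategies such as consistent hashing are commonly used to reduce that reshuffling when machines are added.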
In general terms, a federated database is a type of database management system that maps multiple autonomous database systems into one federated database. A federated graph makes several sharded graphs work together, so that all of them can be queried as a single big graph database.
Why use a federated graph? Because it lets you ask questions that span all of your data and perform graph analytics at scale. While sharding divides a graph, a federated graph brings multiple graphs together, supporting queries across graph databases that may have different logical structures.
And since “graphs are everywhere”, every organization has many of them. Suppose that you have a separate graph for each business process, goal, or department. In this case, a federated graph will allow you to run queries across all of your graphs.
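The departmental scenario can be sketched in a few lines. This is only an illustration under stated assumptions: `FederatedGraph` is a hypothetical class, each member graph is a plain adjacency dictionary, and the graphs are assumed to share node identifiers (here an order id) so that a traversal can cross from one graph into another. A real federated system would also need to handle differing schemas, remote calls, and query planning.

```python
from collections import deque


class FederatedGraph:
    """Presents several independent graphs as one queryable graph (sketch)."""

    def __init__(self, graphs: dict):
        # graphs: name -> adjacency dict {node id: set of neighbor ids}
        self.graphs = graphs

    def neighbors(self, node: str) -> set:
        # Ask every member graph and merge the answers.
        result = set()
        for adjacency in self.graphs.values():
            result |= adjacency.get(node, set())
        return result

    def reachable(self, start: str) -> set:
        # Breadth-first traversal that may hop between member graphs.
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in self.neighbors(node):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen


# Two hypothetical departmental graphs that share the node "order:7".
sales = {"customer:42": {"order:7"}}
shipping = {"order:7": {"warehouse:3"}}

federated = FederatedGraph({"sales": sales, "shipping": shipping})
print(federated.reachable("customer:42"))
```

Neither graph alone can connect the customer to the warehouse; the path exists only when both are queried together, which is the kind of cross-graph question a federated graph is meant to answer.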