NEO4J

Relationships matter

about

When talking about 'databases', people usually think about relational databases first (for now?).

In a relational database, data is organized in tables. Each table consists of a set of predefined columns, each with a predefined and fixed data type. Each record in a table contains a field that uniquely identifies that record (the primary key). Relationships between tables are defined by including references to other tables' primary keys in a table (foreign keys). The relationship between individual records is never stored as such; it is calculated by creating a 'join' between primary and foreign keys every single time a query is executed. Since relationships have to be recalculated every time a query runs, relational databases (contrary to what their name implies) are not very good at working with 'relationships', especially when a query has to follow many of them in a row.
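To make the join mechanism concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `person` and `friendship` tables, their columns and the `friends_of` helper are made up for illustration; the point is only that the link between two people exists as matching key values and has to be rebuilt by the query each time.

```python
import sqlite3

# Hypothetical two-table schema; all table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,   -- primary key: uniquely identifies a record
        name TEXT NOT NULL
    );
    CREATE TABLE friendship (
        person_id INTEGER NOT NULL REFERENCES person(id),  -- foreign key
        friend_id INTEGER NOT NULL REFERENCES person(id)   -- foreign key
    );
    INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO friendship VALUES (1, 2), (2, 3);
""")


def friends_of(name):
    """Return the direct friends of `name`; the joins run on every call."""
    rows = conn.execute(
        """
        SELECT f.name
        FROM person AS p
        JOIN friendship AS fs ON fs.person_id = p.id
        JOIN person AS f ON f.id = fs.friend_id
        WHERE p.name = ?
        ORDER BY f.name
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


print(friends_of("Alice"))
```

Finding friends of friends would need yet another pair of joins, and so on for every extra hop.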

Graph databases have a number of similarities with relational databases, but differ conceptually:

  • nodes are the primary entities in a graph database. Nodes can be annotated with 'properties' and can be grouped by applying 'labels' to them. Nodes can be thought of as the schema-less equivalent of records in a relational database, with labels playing a role loosely comparable to tables.
  • relationships specify the relation between two given nodes. Relationships can have a direction and, just like nodes, can contain properties. Just like nodes, relationships are stored in the database.
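The two building blocks above can be sketched as a toy in-memory model in Python. This is not how Neo4j stores data internally; the `Node`, `Relationship`, `relate` and `neighbours` names are hypothetical and only illustrate that a relationship is a stored object attached to both of its nodes.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    labels: set                      # e.g. {"Person"}; groups similar nodes
    properties: dict                 # free-form key/value pairs, no fixed schema
    outgoing: list = field(default_factory=list)
    incoming: list = field(default_factory=list)


@dataclass(eq=False)
class Relationship:
    type: str                        # e.g. "KNOWS"
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


def relate(start, rel_type, end, **properties):
    """Create a relationship and store it on both of its end nodes."""
    rel = Relationship(rel_type, start, end, properties)
    start.outgoing.append(rel)
    end.incoming.append(rel)
    return rel


def neighbours(node, rel_type):
    """Follow stored relationships of one type; no key matching is needed."""
    return [rel.end for rel in node.outgoing if rel.type == rel_type]


alice = Node({"Person"}, {"name": "Alice"})
bob = Node({"Person"}, {"name": "Bob"})
knows = relate(alice, "KNOWS", bob, since=2015)

print([n.properties["name"] for n in neighbours(alice, "KNOWS")])
```

Because each node keeps a list of its own relationships, moving from a node to its neighbours is a lookup on that node rather than a search through a separate table.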

Having nodes and their relationships stored together in the database, without having to recalculate the relationships (or joins) for every single query, opens up a whole series of new use cases that would be very hard or impossible to implement with relational databases.

  • fraud detection: by monitoring relationships in real time, 'fraud rings' or other scams can be detected before they cause lasting damage
  • network and IT infrastructure monitoring: graphs are inherently more suitable than relational databases for storing and analysing complex interdependencies in networks and IT infrastructure
  • social network analysis: analysing relations within a social network, detecting communities and inferring or recommending new relations is easy when all existing relationships can be queried
  • recommendation engines: existing relationships can be used to predict new relationships through recommendation algorithms
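As a rough illustration of the last two use cases, the sketch below recommends new contacts by counting mutual friends. The `knows` data and the `recommend` function are invented for this example, and a plain Python dictionary stands in for the graph; a real graph database would express the same two-hop traversal in its own query language.

```python
from collections import Counter

# Hypothetical social graph: each person maps to the set of people they know.
knows = {
    "Alice": {"Bob", "Carol"},
    "Bob": {"Alice", "Dave"},
    "Carol": {"Alice", "Dave", "Erin"},
    "Dave": {"Bob", "Carol"},
    "Erin": {"Carol"},
}


def recommend(person):
    """Rank friends-of-friends by the number of mutual friends."""
    direct = knows[person]
    counts = Counter()
    for friend in direct:
        for candidate in knows[friend]:
            # Skip the person themselves and people they already know.
            if candidate != person and candidate not in direct:
                counts[candidate] += 1
    # Most mutual friends first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


print(recommend("Alice"))  # [('Dave', 2), ('Erin', 1)]
```

Dave comes first because both Bob and Carol know him, while Erin is reachable only through Carol.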

Although graph theory has been around for centuries, it took until recently for graph databases to become popular. Increased access to cheap and powerful computing resources, together with the development of graph database products (most notably by market leader Neo4j), has created a huge increase in demand for graph databases.
