Graph vs Relational Databases
A Brief Comparison of the Old Guard Database versus the New Upstart NoSQL Database
Old Guard vs the New Upstart
Modern applications handle growing volumes and diversity of data, requiring new ways to store, process, and analyze very large and unstructured data sets. Typical processing tasks include modeling and interpreting unstructured data, finding relationships between elements, and finding patterns in the data. Other requirements include data distribution, replication, fast reads and writes, and high availability. Relational database management systems (RDBMS) don’t provide this combination of features right out of the box.
As a result, a number of new database engines have been developed to support these features. Graph databases belong to this new generation. Many are open source, and some multi-model engines support key-value, document, and graph models together while also supporting SQL. Such a tool can support a wide variety of schemas while providing high levels of performance and scalability.
Simple Example
Suppose we'd like to determine which of Andreas's friends have electric bikes so that he can borrow one. In the relational model, there is a table of people such as Andreas, a second table that acts as a crosswalk between people and their friends, and a third table of people who own electric bikes. In this simple example, three tables must be joined to answer the query. For a small number of records, this is relatively easy and performant. However, with tens of millions of records and more tables to join, performance can suffer.
In a graph database, this is a simple query: the "network" is already stored as relationships between Andreas and his friends, with attributes on each friend that include electric bike ownership. Even with tens of millions of records, retrieving a specific network defined by a multitude of "linking" attributes still performs relatively well.
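To make the relational version concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table names (people, friendships, ebike_owners), the column names, and the sample people other than Andreas are assumptions made up for illustration. Note that the people table is joined twice, once for Andreas and once for each friend.

```python
import sqlite3

# In-memory database with three hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE friendships (person_id INTEGER, friend_id INTEGER);
    CREATE TABLE ebike_owners (person_id INTEGER);
    INSERT INTO people VALUES (1, 'Andreas'), (2, 'Bea'), (3, 'Carl'), (4, 'Dana');
    INSERT INTO friendships VALUES (1, 2), (1, 3);
    INSERT INTO ebike_owners VALUES (2), (4);
""")

def friends_with_ebikes(conn, name):
    """Return the names of `name`'s friends who own an electric bike."""
    rows = conn.execute(
        """
        SELECT friend.name
        FROM people AS person
        JOIN friendships AS f ON f.person_id = person.id
        JOIN people AS friend ON friend.id = f.friend_id
        JOIN ebike_owners AS e ON e.person_id = friend.id
        WHERE person.name = ?
        ORDER BY friend.name
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]

print(friends_with_ebikes(conn, "Andreas"))  # ['Bea']
```

Each JOIN asks the engine to match rows across tables by key. With indexes this is fast at small scale, but every additional relationship in the question adds another join.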
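The same question can be sketched as a graph lookup. The snippet below is not a real graph database. It is a plain Python dictionary standing in for one, and the key names (has_ebike, friends) and the sample people are assumptions chosen for illustration. The point is that each person already holds direct references to their friends, so no table-wide matching is needed.

```python
# Each node stores its own attributes plus direct links to its neighbours.
graph = {
    "Andreas": {"has_ebike": False, "friends": ["Bea", "Carl"]},
    "Bea": {"has_ebike": True, "friends": ["Andreas"]},
    "Carl": {"has_ebike": False, "friends": ["Andreas"]},
    "Dana": {"has_ebike": True, "friends": []},
}

def ebike_friends(graph, person):
    """Follow the person's friend links and keep friends who have an e-bike."""
    return [
        friend
        for friend in graph[person]["friends"]
        if graph[friend]["has_ebike"]
    ]

print(ebike_friends(graph, "Andreas"))
```

In this sketch the work done depends on how many friends Andreas has, not on how many people are stored overall.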
More About Graph Databases
- A graph data structure is a data model that stores data in the form of Vertices (graph nodes) interconnected by Edges (graph arcs). By assembling vertices and edges to form relationships in connected graph structures, it is possible to build structures that closely represent the real-world problem domain.
- Each vertex in the graph database can contain any number of Properties that describe a real-world object, and any number of edges that represent the relationships to other vertices. These edges are organized by type and direction and may also hold additional Properties.
- When running the equivalent of a JOIN operation, the graph database follows the edge paths to access the connected nodes. This avoids the expensive search-and-match computation otherwise needed to find vertices.
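The three points above can be illustrated with a small property-graph sketch in plain Python. This is a toy model under stated assumptions, not how any particular graph database is implemented: the class names Vertex and Edge, the helper connect, and the edge types FRIEND_OF and OWNS are all hypothetical. Here the electric bike is modeled as its own vertex, which is one possible modeling choice among several.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Edge:
    type: str                      # edge type, e.g. "FRIEND_OF"
    source: "Vertex"               # direction runs from source ...
    target: "Vertex"               # ... to target
    properties: dict = field(default_factory=dict)

@dataclass(eq=False)
class Vertex:
    name: str
    properties: dict = field(default_factory=dict)
    out_edges: list = field(default_factory=list, repr=False)
    in_edges: list = field(default_factory=list, repr=False)

    def neighbors(self, edge_type):
        """Follow outgoing edges of one type; no global lookup is involved."""
        return [e.target for e in self.out_edges if e.type == edge_type]

def connect(source, edge_type, target, **properties):
    """Create a directed, typed edge and register it on both endpoints."""
    edge = Edge(edge_type, source, target, properties)
    source.out_edges.append(edge)
    target.in_edges.append(edge)
    return edge

andreas = Vertex("Andreas")
bea = Vertex("Bea")
carl = Vertex("Carl")
bike = Vertex("bike-1", {"kind": "electric bike"})

connect(andreas, "FRIEND_OF", bea, since=2019)
connect(andreas, "FRIEND_OF", carl)
connect(bea, "OWNS", bike)

# Equivalent of a two-step JOIN: walk FRIEND_OF edges, then OWNS edges.
owners = [
    friend.name
    for friend in andreas.neighbors("FRIEND_OF")
    if any(item.properties.get("kind") == "electric bike"
           for item in friend.neighbors("OWNS"))
]
print(owners)
```

Because every vertex keeps its own list of edges, each hop in the traversal is a direct reference lookup rather than a search through a separate table.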