5 Reasons for Using a Graph Database for Traceability
A Brief Explanation of Graph Technology
A Brief Introduction
Forgive us if this article is a bit long. The advent of graph databases is powering a new generation of applications that need to model interconnected data in a more natural and simpler way, often with substantial gains in performance and adaptability. Graph databases are well suited to traceability applications built on relationships among hundreds, thousands, or even millions of entities of different types. The best-known interconnected application is LinkedIn, where individuals connect to one another and form their own personal “network”. The same idea lies behind the adage that every actor in Hollywood is connected to Kevin Bacon within six degrees.
Best Apps for Graph Databases
Coupled with entity resolution technology, graph databases can power the new generation of interconnected traceability applications, such as ERP, supply chain management, fraud detection, compliance, target marketing, recommendation engines, master data management, digital asset management, and network management for telecom, IT, power grids, and sewers. In our own experience, it is a phenomenal technology for fraud and financial crimes analytics. It allowed us to keep adding data from independent third-party sources to enhance our analytics without a significant rewrite of existing queries.
The Problem with Relational Databases
Earlier traceability platforms that used traditional relational databases to describe relationships between multiple entities have proven limited, particularly as systems expand to track a large number of disparate entities. As the number of relationships between tables grows, so does the complexity of queries. When both the number of records and the number of relationships between entities explode, system responsiveness suffers badly from the multiple JOINs between database tables. In addition, relational databases have fixed schemas, so they do not adapt well to change. A small change to one table can cause a ripple of changes across the system that must be carefully accounted for. As a result, schema changes are problematic, and it takes a great deal of time to ensure they do not break the system. Unfortunately, even a simple change like adding or replacing a column in a table might become a million-dollar task when using a relational database.
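To make the JOIN problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `links` table and the farm-to-retailer entities are hypothetical, invented only for illustration. It shows that tracing a fixed three hops through a relational table takes one self-JOIN per additional hop.

```python
import sqlite3

# Hypothetical schema: one table of directed links between entities.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO links VALUES (?, ?)",
    [
        ("farm", "processor"),
        ("processor", "distributor"),
        ("distributor", "retailer"),
    ],
)

# Tracing three hops downstream from the farm: each extra hop
# adds another self-JOIN on the same table.
three_hops = conn.execute(
    """
    SELECT l1.src, l1.dst, l2.dst, l3.dst
    FROM links AS l1
    JOIN links AS l2 ON l2.src = l1.dst
    JOIN links AS l3 ON l3.src = l2.dst
    WHERE l1.src = ?
    """,
    ("farm",),
).fetchall()

print(three_hops)
```

In this style of query, a trace that is one hop longer means editing the query text to add one more JOIN, which is the growth in query complexity described above.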
Graph databases, by contrast, are designed specifically for recording, defining, searching, and visualizing genealogy linkages between assets, owners, places, and events. In traceability applications, they have numerous advantages over traditional relational databases, for the following 5 reasons.
5 Reasons for Graph Databases
- Support hundreds of millions of records without significant impact on performance, because queries "traverse" the relationships between entities instead of joining tables.
- Easily add new nodes across the value chain workflow without significantly impacting application performance or schema complexity. In a value chain, these new nodes could represent new inspections, new value chain participants, or compliance points.
- Provide a high level of user and data security and privacy, so that each data contributor can see only its own contributions and no other data.
- Allow participants across the value chain to export the natural data output of their IT systems as input to a decentralized traceability system, with no changes required to their databases. Contributing data that is already available in their systems takes very little additional work.
- Enable easy, intuitive interpretation of data through a node-link visualization that shows the full path between the source and destination nodes and all the interconnected entities between them.
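Two of the points above, adding new nodes and following the full path between a source and a destination, can be sketched in a few lines of Python. This is an illustration only, not how any particular graph database is implemented: the value chain, the node names, and the helpers `add_link` and `find_path` are all hypothetical, and the graph is a plain in-memory adjacency list.

```python
from collections import deque

# Hypothetical value chain as an adjacency list: node -> downstream nodes.
graph = {
    "farm": ["processor"],
    "processor": ["distributor"],
    "distributor": ["retailer"],
    "retailer": [],
}


def add_link(graph, src, dst):
    """Add a directed edge, creating either node if it does not exist yet."""
    graph.setdefault(src, []).append(dst)
    graph.setdefault(dst, [])


def find_path(graph, source, destination):
    """Breadth-first search; returns a shortest path as a list, or None."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


# A new inspection step is one more node and two more edges;
# nothing about the existing nodes or the search has to change.
add_link(graph, "processor", "inspection")
add_link(graph, "inspection", "distributor")

print(find_path(graph, "farm", "retailer"))
print(find_path(graph, "farm", "inspection"))
```

The same `find_path` function works for a trace of any length, and the list it returns is the kind of source-to-destination path that a node-link visualization would draw.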