More and more organizations are using graph databases rather than relational databases, says Johan Svensson, CTO of Neo Technology, maker of the graph database Neo4j. Why is that? And what's the difference between the two? In an interview with The Enterprisers Project, Svensson sheds more light on this new option for coping with big data.
The Enterprisers Project (TEP): Can you explain why graph databases are compelling? How do they differ from traditional databases?
Svensson: Yes, I would love to. Graph databases are compelling because they enable companies to make sense of the masses of connected data that exist today. Traditional relational databases have been the workhorse of software applications since the 1980s. They work fine when your data is predictable, fits well into tables, columns and rows, and your queries are not very join-intensive. But real-world data rarely looks like that: there are no isolated pieces of information, only rich, connected domains all around us.
Graph databases, on the other hand, work wonderfully when the relationships inside your data are important and your queries depend on exploring and exploiting them. This is because graph databases store relationship information as a first-class entity.
The flexibility of a graph database model allows you to add new nodes and relationships without compromising your existing network or expensively migrating your data. With data relationships at their center, graph databases are incredibly efficient when it comes to query speeds, even for deep and complex queries.
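To make the idea of relationships as first-class entities concrete, here is a minimal in-memory sketch in Python. It is an illustration only and not how Neo4j or any other product is implemented; the `TinyGraph` class, its method names, and the sample data are all hypothetical. It shows relationships stored as records of their own, a new node type and relationship type added without touching existing data, and a query that works by following relationships.

```python
from collections import defaultdict, deque


class TinyGraph:
    """Toy graph store (hypothetical): relationships are stored, not derived."""

    def __init__(self):
        self.nodes = {}                     # node id -> property dict
        self.out_edges = defaultdict(list)  # node id -> [(rel_type, target id)]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_relationship(self, source, rel_type, target):
        self.out_edges[source].append((rel_type, target))

    def reachable(self, start, rel_type, max_depth):
        """Breadth-first traversal along rel_type, at most max_depth hops."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, depth = queue.popleft()
            if depth == max_depth:
                continue  # do not expand beyond the requested depth
            for kind, target in self.out_edges[node]:
                if kind == rel_type and target not in seen:
                    seen.add(target)
                    found.append(target)
                    queue.append((target, depth + 1))
        return found


graph = TinyGraph()
graph.add_node("alice", kind="person")
graph.add_node("bob", kind="person")
graph.add_node("carol", kind="person")
graph.add_relationship("alice", "KNOWS", "bob")
graph.add_relationship("bob", "KNOWS", "carol")

# Later: a new kind of node and relationship, with no change to existing data.
graph.add_node("acme", kind="company")
graph.add_relationship("carol", "WORKS_AT", "acme")

friends_of_friends = graph.reachable("alice", "KNOWS", max_depth=2)
print(friends_of_friends)
```

The traversal only touches the relationships attached to the nodes it visits, which is the intuition behind the claim that deep queries stay fast when relationships are stored directly.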
TEP: Some very large enterprises are now using this technology. Why is that?
Svensson: I think the number one reason companies are increasingly using graph databases is that they allow them to gather insights that were previously unavailable. This gives them a competitive advantage, as they can make better decisions and give their customers better service. More and more businesses are realizing that in order to succeed in a highly connected world, they must leverage the connections within their data for all they're worth, but they need the right technology to do so.
TEP: Does it make sense for companies to use both relational databases and graph databases or should they standardize enterprise-wide on one or the other?
Svensson: Today it makes sense to use both, because different models have their pros and cons. The enterprise typically has a wide set of problems it needs to solve, and there is no single database or database model that is absolutely best at everything. Just knowing which databases to use is becoming an important skill.
TEP: So then how can you tell when the situation is right for graph databases?
Svensson: Start by drawing the domain on a whiteboard. If your domain entities have relationships to other entities and your queries rely on exploring those relationships, then a graph database is a great fit.
This especially applies to new projects, as developers find the model very convenient to work with, partly because of its adaptability. This ability to adapt is particularly useful as new information about the domain becomes known, or as changes in requirements cause the model to change.
Existing deployments are also potential candidates for introducing graph databases. Again, this is because graph databases can alleviate performance and scaling problems caused by joins. Data warehouse systems and offline analytical workloads can sometimes even be moved into a real-time environment using graph databases.
TEP: Are there certain types of software for which graph databases are most useful?
Svensson: The most common applications for graph databases include fraud detection, real-time recommendation engines, master data management (MDM), network and IT operations, and identity and access management (IAM). But really, a graph database makes sense for any organization seeking to make the most of its connected data.
TEP: What are the biggest drawbacks to using graph databases?
Svensson: The number of graph databases available is growing, but one thing to be aware of is that the technology is still fairly new compared with relational database (RDBMS) technology, which has now existed for a full generation. It takes time to build a solid database, regardless of data model.
Transactions, recovery, durability and other things you would take for granted when working with a database may not work as expected or, worse, may not be present at all. As many graph database implementations are still young, it may be a good idea to first verify that core features work as advertised.
Another thing to be aware of is that some graph databases only offer the graph model but the underlying implementation is backed by a traditional database. That can impact runtime behavior as queries may get translated into joins.
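The point about queries being translated into joins can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the edge table `KNOWS_ROWS`, both functions, and the sample data are hypothetical, and the join version models the worst case of an unindexed table scan per hop. Real relational engines use indexes, so the gap is smaller in practice, but the shape of the work is similar: each extra hop becomes another join.

```python
# Hypothetical edge table, as a relational backend might store it: (source, target).
KNOWS_ROWS = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "dave"),
    ("erin", "frank"),
]


def hops_via_joins(rows, start, depth):
    """Each hop acts like a self-join: scan every row and match the frontier."""
    frontier = {start}
    rows_examined = 0
    for _ in range(depth):
        next_frontier = set()
        for source, target in rows:
            rows_examined += 1
            if source in frontier:
                next_frontier.add(target)
        frontier = next_frontier
    return frontier, rows_examined


def hops_via_adjacency(adjacency, start, depth):
    """Each hop follows only the relationships attached to the current nodes."""
    frontier = {start}
    edges_followed = 0
    for _ in range(depth):
        next_frontier = set()
        for node in frontier:
            for target in adjacency.get(node, []):
                edges_followed += 1
                next_frontier.add(target)
        frontier = next_frontier
    return frontier, edges_followed


# Build the adjacency structure a native graph store would keep per node.
adjacency = {}
for source, target in KNOWS_ROWS:
    adjacency.setdefault(source, []).append(target)

join_result, rows_examined = hops_via_joins(KNOWS_ROWS, "alice", 3)
graph_result, edges_followed = hops_via_adjacency(adjacency, "alice", 3)
print(join_result, rows_examined)
print(graph_result, edges_followed)
```

Both functions return the same answer. The difference is that the join version's work grows with the size of the whole table, while the adjacency version's work grows only with the number of relationships actually traversed.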
TEP: What are your predictions for graph databases in the coming year? And in the next few years?
Svensson: This year we will see an increase in both the number of new entrants into the graph database space and the size of the overall market for graph databases. Over time, graph databases will become as commonplace as relational databases are today.
TEP: What advice would you offer IT leaders about working with graph databases?
Svensson: Give them a try. The investment required to find out if your organization can benefit from using graph databases is quite small, and the potential return extremely high.