NoSQL Graph Database feature comparison

A few days ago I published a short overview of the trendiest graph databases. Today I’m bringing you a review of their most important features: what they offer and how, based on information taken directly from their public web sites.

As you can see, the current ecosystem is quite sparse, with little uniformity among the products, although that is normal given that we are analyzing an ongoing technology movement.

As the previous table shows, even though these are all graph databases, there are substantial differences among them that can matter for our projects. Next we analyze what are, in our opinion, the main differences.

License

There are FLOSS licenses, commercial licenses and mixed licenses: a diverse ecosystem that lets you choose the best one according to your needs. Keep in mind the importance of FLOSS licenses: if they are backed by an active and diverse community, they can give the product high standards of quality.

Distribution

Facing high demands for computing, storage or both pushes us towards distributed computing. Any database product that aspires to succeed must take this into account.
In our review the only truly distributed one is HyperGraphDB, although it still has a lot to improve. The other communities are pushing hard and making great strides; surely they will present interesting solutions soon.

Indexing

Every data structure designed to store large quantities of data should allow efficient retrieval, and indexing provides this feature. An index is nothing more than a direct pointer from a key to a certain value, like a dictionary.
The analyzed solutions support indexing of attributes on nodes and edges, although there are some differences. Neo4J in particular uses Lucene to index node attributes. An interesting case is InfoGrid, where only UUIDs are indexed. An important peculiarity of Neo4J is that it does not index by default; always take this into account.
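The dictionary analogy can be made concrete with a small sketch. It is not how Neo4J, Lucene or any of the reviewed products implement indexing; the `AttributeIndex` class, the `scan` helper and the sample nodes are hypothetical names made up for illustration. It only shows the idea of going from an (attribute, value) key straight to the matching nodes instead of checking every node.

```python
from collections import defaultdict


class AttributeIndex:
    """Toy index (hypothetical): (attribute, value) -> set of node ids."""

    def __init__(self):
        self._entries = defaultdict(set)

    def add(self, node_id, attribute, value):
        # Register the node under the key built from attribute and value.
        self._entries[(attribute, value)].add(node_id)

    def lookup(self, attribute, value):
        # Direct dictionary access: no pass over the whole node set.
        return set(self._entries.get((attribute, value), set()))


def scan(nodes, attribute, value):
    # What a lookup costs without an index: every node is inspected.
    return {nid for nid, attrs in nodes.items() if attrs.get(attribute) == value}


# Made-up sample data, only for the example.
nodes = {
    1: {"name": "alice", "city": "Barcelona"},
    2: {"name": "bob", "city": "Madrid"},
    3: {"name": "carol", "city": "Barcelona"},
}

index = AttributeIndex()
for node_id, attrs in nodes.items():
    for attribute, value in attrs.items():
        index.add(node_id, attribute, value)

in_barcelona = index.lookup("city", "Barcelona")
```

Both `index.lookup("city", "Barcelona")` and `scan(nodes, "city", "Barcelona")` return the same nodes; the difference is that the scan grows with the number of nodes, while the index pays that cost once, when nodes are added.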

There are many different indexing techniques, with different properties and performance, but covering them would take a whole series of posts.

Storage system

The analyzed database products show different storage solutions: a custom storage system, a generic one, or the possibility of choosing your own storage solution. In my view this is an important decision, since a non-specialized storage system tends to perform worse than a specialized one. With a generic storage system, it also matters whether it is used at a low level, such as the storage API of MySQL, or at a high level, for example through the SQL interface of MySQL.
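To give an idea of what the high-level option looks like, here is a minimal sketch of a graph kept in a generic relational store and reached only through SQL. It is an assumption-laden illustration, not the design of any reviewed product: Python's built-in `sqlite3` stands in for MySQL so the example runs without a server, and the `edges` table and the helper functions are hypothetical.

```python
import sqlite3


def open_store():
    # In-memory SQLite database used as a stand-in for a generic SQL backend.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edges (source INTEGER NOT NULL, target INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX edges_by_source ON edges (source)")
    return conn


def add_edge(conn, source, target):
    conn.execute(
        "INSERT INTO edges (source, target) VALUES (?, ?)", (source, target)
    )


def neighbors(conn, node):
    # One hop is one SQL query against the edge table.
    rows = conn.execute(
        "SELECT target FROM edges WHERE source = ? ORDER BY target", (node,)
    )
    return [row[0] for row in rows]


def two_hops(conn, node):
    # Each extra hop becomes another self-join on the same table.
    rows = conn.execute(
        "SELECT DISTINCT e2.target FROM edges e1 "
        "JOIN edges e2 ON e2.source = e1.target "
        "WHERE e1.source = ? ORDER BY e2.target",
        (node,),
    )
    return [row[0] for row in rows]


store = open_store()
for source, target in [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]:
    add_edge(store, source, target)
```

Every traversal step here goes through the SQL layer as a query or a join, which is the kind of indirection a specialized or low-level storage can avoid.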

We found cases such as HyperGraphDB, where the use of Berkeley DB facilitates rapid development but also penalizes performance. Other solutions, such as VertexDB and DEX, have their own storage or use a low-level generic storage, getting better performance.

In an upcoming post we’ll see an in-depth performance benchmark; collaborations, ideas, anything is truly welcome.

Programming APIs

Basically, we want to develop in our favorite programming language with the best database. Our review shows several solutions to this problem.

Neo4J provides a web API as well as APIs for a variety of programming languages, while the majority only provide an API for Java. However, there is a generic solution, a web services API, which is also found in many of the existing databases. In conclusion, there are enough resources to use these databases from the most common programming languages.

After reviewing their common characteristics, here is a list of missing things that we believe are important.

  • A standard, independent benchmark would provide customers with comparison data that is useful when making a decision.
  • Transaction and indexing facilities are not present in all the solutions analyzed. In our opinion these are important features that must be in every decent solution.
  • A query language. Such languages make queries easier to develop.
  • Tools, tools and more tools. Without tools that facilitate development (maps, objects, management tools, etc.), development becomes harder.
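On the query language point, the sketch below shows the kind of traversal code that has to be written by hand when none is available. It is only an illustration under stated assumptions: the graph is a plain in-memory dictionary rather than any of the reviewed databases, and `within_hops` and the `follows` data are hypothetical.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Return the nodes reachable from start in 1..max_hops edges (breadth-first)."""
    seen = {start}
    reached = set()
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the requested distance
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                reached.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return reached


# Made-up "who follows whom" data, only for the example.
follows = {
    "ana": ["ben", "carla"],
    "ben": ["dani"],
    "carla": ["dani", "eva"],
    "dani": ["ana"],
    "eva": [],
}

friends_of_friends = within_hops(follows, "ana", 2)
```

Even a simple question such as "who is within two hops of ana" needs explicit bookkeeping for visited nodes and depth; a query language would let the developer state the question and leave that bookkeeping to the database.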

Although it is not a full graph database, Twitter presented FlockDB today. Backed by a MySQL database, this newcomer at least reinforces graph databases as an important solution.

Finally, if you find any mistake in this comparison, it is entirely my responsibility; please let me know and I’ll correct it as soon as possible.