Types of NoSQL Databases: Choosing the right one

Iván Alejandro Marugán Big Data Architecture, Technology

Continuing from the previous post, NoSQL vs Relational: Which database to use, we will explain the different types of NoSQL databases according to their target usage. Why should the target usage make a difference? Because not all data has the same structure when it is saved. For example, working with documents (XML, YAML, JSON…) is not the same as working with graphs. With that in mind, the most popular categories are:

Key Value Store

Example key-value

These are simple databases where data is saved in a map or dictionary (associative array) and represented as a collection of key-value pairs, each made up of a unique key and the rest of the data (the value). The most common uses are session caches with persistence, managing the shopping cart in an online store, message queues (like Celery, a distributed task queue where you can use Redis as a message broker), full-text search… even a chat application.
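As a rough illustration of the session-cache use case, here is a minimal in-memory sketch in Python. The `SessionStore` class, its `ttl` parameter and the `session:42` key are hypothetical names chosen for this example; they are not the API of Redis or any other product, and a real store would add persistence and networking on top.

```python
import time


class SessionStore:
    """Toy in-memory key-value store with optional per-key expiry (hypothetical)."""

    def __init__(self):
        # Maps key -> (value, expires_at); expires_at is None when there is no TTL.
        self._data = {}

    def set(self, key, value, ttl=None, now=None):
        # `now` can be injected to make the behaviour easy to test.
        now = time.monotonic() if now is None else now
        expires_at = None if ttl is None else now + ttl
        self._data[key] = (value, expires_at)

    def get(self, key, default=None, now=None):
        now = time.monotonic() if now is None else now
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and now >= expires_at:
            del self._data[key]  # lazily evict the expired entry
            return default
        return value


store = SessionStore()
# The value is opaque to the store: here it is a small shopping-cart session.
store.set("session:42", {"user": "alice", "cart": ["book", "pen"]}, ttl=30, now=0)
print(store.get("session:42", now=10))  # within the 30-second TTL
print(store.get("session:42", now=31))  # past the TTL, so the default (None)
```

The point of the sketch is that the store only ever looks data up by its key; it never inspects what is inside the value.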

There are plenty of key-value databases, but you can start with Aerospike, Apache Ignite or Redis (which is not a plain key-value store, because its values are not limited to strings).

Document Store

Example json

Databases of this kind are very similar to key-value stores: each record is still identified by a unique key. The main difference lies in the values, which are semi-structured documents in formats such as XML, JSON, BSON or YAML. The most common uses are storing tweets, ratings, log files…
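To show what "semi-structured" buys you over an opaque value, the following Python sketch keeps a few JSON-like documents under unique keys and filters them by field. The `documents` collection and the `find` helper are invented for this illustration and do not mirror the query API of MongoDB, CouchDB or Elasticsearch.

```python
import json

# Each document has its own shape: tweets and ratings carry different fields.
documents = {
    "tweet:1": {"type": "tweet", "user": "alice", "text": "Hello NoSQL", "tags": ["nosql"]},
    "rating:1": {"type": "rating", "user": "bob", "item": "book-7", "stars": 4},
    "tweet:2": {"type": "tweet", "user": "bob", "text": "Graphs are fun"},
}


def find(collection, **criteria):
    """Return the sorted keys of documents whose fields match every criterion."""
    return sorted(
        key
        for key, doc in collection.items()
        if all(doc.get(field) == value for field, value in criteria.items())
    )


print(find(documents, type="tweet"))              # both tweets
print(find(documents, type="tweet", user="bob"))  # only bob's tweet
print(json.dumps(documents["rating:1"]))          # a document serialises to JSON
```

Because the database understands the structure of the value, it can answer questions about fields inside it, which a plain key-value store cannot do without reading every value back into the application.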

There are several document-oriented databases, such as Apache CouchDB, MongoDB and Elasticsearch (which can be classed as a document-oriented NoSQL database, but remember that it is a search engine too).

Columnar Database

Table -> Column Store

Columnar databases store data in columns instead of rows, and access is carried out through a key. This kind of database has several advantages over row-oriented databases, such as better performance and more efficient disk access for queries that touch only a few columns, because all the values of a column are stored together (in memory, on disk or both). These databases also reduce disk space, because they let you choose the best compression option for each column (depending on the column's data type). Additionally, you can run queries over just the columns you want; unlike in row-oriented databases, a query does not have to read every column.

Some uses for these databases are search suggestions (autocomplete), real-time analytics and data processing (counting request hits by hour, IP, device…), etc.
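The column layout and per-column compression can be sketched in a few lines of Python. The request log, the `to_columns` and `run_length_encode` helpers, and the choice of run-length encoding are all assumptions made for this example; real columnar engines use their own storage formats and a range of compression schemes.

```python
from itertools import groupby

# Hypothetical request log in row-oriented form: one dict per request.
rows = [
    {"hour": 9, "ip": "10.0.0.1", "device": "mobile"},
    {"hour": 9, "ip": "10.0.0.2", "device": "mobile"},
    {"hour": 9, "ip": "10.0.0.1", "device": "desktop"},
    {"hour": 10, "ip": "10.0.0.3", "device": "desktop"},
]


def to_columns(rows):
    """Pivot a list of rows into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Compress consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)

# Counting hits by hour only needs the "hour" column; "ip" and "device" are never read.
hours_rle = run_length_encode(columns["hour"])
hits_by_hour = {}
for hour, count in hours_rle:
    hits_by_hour[hour] = hits_by_hour.get(hour, 0) + count

print(hours_rle)
print(hits_by_hour)
```

Values within one column share a type and often repeat, which is why a scheme as simple as run-length encoding already shrinks the `hour` column here, and why the aggregation can work directly on the compressed form.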

Great examples of this kind are Apache HBase and Apache Cassandra, two of the most widely used databases in this category (both are usually described as wide-column stores).

Graph Database

Example followers in Twitter with graph database

The data is stored in nodes, and relationships are represented as edges (arrows) connecting the nodes to each other. In databases of this kind there are two key aspects: native graph storage and a native graph processing engine. These properties offer lower latency and more efficient processing of connected data. Typical use cases are network impact analysis in social marketing, fraud detection and route optimization.
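A follower network like the one in the example can be modelled with a plain adjacency list, as in the Python sketch below. The `follows` data and the `followers_of` and `hops_between` helpers are hypothetical; a graph database such as Neo4j exposes this kind of traversal through its own query language rather than through code like this.

```python
from collections import deque

# Directed edges: each user maps to the list of users they follow.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "carol": ["dave"],
    "dave": [],
}


def followers_of(graph, user):
    """Return, in sorted order, everyone with an edge pointing at `user`."""
    return sorted(source for source, targets in graph.items() if user in targets)


def hops_between(graph, start, goal):
    """Breadth-first search: fewest 'follows' edges from start to goal, or None."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


print(followers_of(follows, "carol"))
print(hops_between(follows, "alice", "dave"))
print(hops_between(follows, "dave", "alice"))
```

Questions such as "who is two hops away from this user?" are traversals over relationships, which is the access pattern that native graph storage is built to serve.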

Neo4j takes the gold medal as the most common graph database. In second place we find databases such as ArangoDB and OrientDB, both multi-model, which can store key-value pairs, graphs and documents alike.

Conclusion

There are several kinds of NoSQL databases. The variety reviewed above should allow us to choose the right one for the job and make our applications work harder, better, faster (stronger).
