NoSQL (Not only SQL)

This page is still under development.

## NoSQL history

Large web companies such as Google, Amazon and eBay have to deal with very large volumes of data. They were the first to run into the limits of traditional Relational Database Management Systems (RDBMS).

RDBMS are built around ACID properties and are traditionally hosted on a single server, which led to scalability problems. Moreover, the relational model is a limited data structure. The information is divided into entities, each represented by a table, and each row of the table represents an instance of the entity. Each row captures a set of values. A value may be numeric, a date and time, or a string. The model does not allow complex record types such as List, Map and other nested structures.

To overcome these limitations, these companies started to develop their own database management systems. In order to cope with large volumes of data accessed from different parts of the world, it is necessary to be able to replicate this data on different physical machines, which is called a distributed environment. The resulting systems, BigTable (Google), Dynamo (Amazon), Cassandra then HBase (Facebook), MongoDB, CouchDB (Ubuntu One) and Hypertable (Baidu), were the first precursors of the NoSQL model.

The relational model is no longer well suited to distributed database systems.

The term NoSQL (meaning Not Only SQL) is used to describe this new information storage paradigm.

"Different models exist for NoSQL databases:

Aggregate models operate on data records that have a more complex structure than a set of tuples.

They allow complex records such as List, Map and other complex structures to be nested into them."

Key-value document to explore...

For example

A typical relational DBMS will model a purchase system by creating:

It will then perform joins between these entities to retrieve information. This architecture can be problematic when the number of transactions, clients or products becomes large, because the data is difficult to split between several servers.
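As an illustration of the relational approach, here is a minimal sketch using Python's built-in `sqlite3` module. The schema is an assumption made for this example only: the table names (`customers`, `products`, `purchases`), their columns and the sample rows are hypothetical and do not come from a real system.

```python
import sqlite3

# Hypothetical schema: table names, columns and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products  (id INTEGER PRIMARY KEY, label TEXT, price REAL);
    CREATE TABLE purchases (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        product_id  INTEGER REFERENCES products(id),
        quantity    INTEGER
    );
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO products  VALUES (10, 'Keyboard', 30.0), (11, 'Mouse', 15.0);
    INSERT INTO purchases VALUES (100, 1, 10, 1), (101, 1, 11, 2), (102, 2, 11, 1);
""")


def purchases_of(customer_name):
    """Return (label, quantity, line_total) rows for one customer.

    Answering this question requires joining all three tables.
    """
    cursor = conn.execute(
        """
        SELECT p.label, pu.quantity, p.price * pu.quantity
        FROM purchases AS pu
        JOIN customers AS c ON c.id = pu.customer_id
        JOIN products  AS p ON p.id = pu.product_id
        WHERE c.name = ?
        ORDER BY pu.id
        """,
        (customer_name,),
    )
    return cursor.fetchall()


alice_rows = purchases_of("Alice")
```

Every read of a customer's purchases touches three tables, which is cheap on one server but becomes harder once the tables have to live on different machines.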

A typical NoSQL DBMS architecture will tend to model this problem as a set of aggregates, each consisting of the information about a customer and that customer's purchases.
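A customer aggregate can be sketched as a single nested record. The field names below (`customer_id`, `name`, `purchases`) and the sample values are assumptions chosen for illustration; an actual NoSQL store would define its own document format.

```python
import json

# Hypothetical aggregate: one self-contained record per customer,
# with the purchases nested inside it as a list of maps.
customer_aggregate = {
    "customer_id": 1,
    "name": "Alice",
    "purchases": [
        {"product": "Keyboard", "price": 30.0, "quantity": 1},
        {"product": "Mouse", "price": 15.0, "quantity": 2},
    ],
}


def total_spent(aggregate):
    """Sum the purchases of one customer without reading any other record."""
    return sum(p["price"] * p["quantity"] for p in aggregate["purchases"])


# The whole aggregate can be serialized and stored or shipped as one unit.
serialized = json.dumps(customer_aggregate)
restored = json.loads(serialized)
```

Everything needed to answer a question about this customer is contained in the record itself, so no join is involved.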

  1. Advantages:

This architecture is more easily scalable because these aggregates interact little with each other and can easily be distributed on a cluster of servers.
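One simple way to distribute aggregates is to hash the customer identifier and use the result to pick a server. The sketch below assumes a hypothetical three-server cluster and a plain modulo placement; real systems often use more elaborate schemes such as consistent hashing.

```python
import hashlib

# Hypothetical cluster: the server names are illustrative only.
SERVERS = ["server-a", "server-b", "server-c"]


def server_for(customer_id):
    """Pick a server from a stable hash of the customer identifier."""
    digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]


def distribute(aggregates):
    """Group aggregates by the server that should store them."""
    placement = {name: [] for name in SERVERS}
    for aggregate in aggregates:
        placement[server_for(aggregate["customer_id"])].append(aggregate)
    return placement


aggregates = [{"customer_id": i, "purchases": []} for i in range(1, 11)]
placement = distribute(aggregates)
```

Because each aggregate is self-contained, a read or write for one customer only involves the single server returned by `server_for`.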

  2. Disadvantages:

This architecture can be problematic if for some reason it is necessary to run queries such as calculating the total number of customers or purchases: this may be less efficient than in a relational system that keeps all customers (or purchases) in a single table.
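The cost of such a global query can be sketched as follows. The `shards` dictionary is a hypothetical stand-in for a cluster in which each server holds some customer aggregates; in a real deployment each loop iteration would be a network request to a different machine.

```python
# Hypothetical cluster state: each server holds a few customer aggregates.
shards = {
    "server-a": [{"customer_id": 1, "purchases": ["keyboard", "mouse"]}],
    "server-b": [
        {"customer_id": 2, "purchases": ["mouse"]},
        {"customer_id": 3, "purchases": []},
    ],
    "server-c": [],
}


def count_on_shard(aggregates):
    """Count customers and purchases stored on a single server."""
    customers = len(aggregates)
    purchases = sum(len(a["purchases"]) for a in aggregates)
    return customers, purchases


def global_counts(shards):
    """Ask every server for its partial counts, then combine them."""
    total_customers = 0
    total_purchases = 0
    for aggregates in shards.values():  # one request per server
        customers, purchases = count_on_shard(aggregates)
        total_customers += customers
        total_purchases += purchases
    return total_customers, total_purchases


totals = global_counts(shards)
```

The answer requires contacting every server and scanning every aggregate, whereas a relational system could count the rows of a single table.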