The silliest question to ask anyone today, even a newborn baby, would be “What is data?” That is quite an exaggeration, I know, but you get the point. What I want to put across is that we are living in an age where data has become paramount. The modern economy is banking on data.
Databases – the way of life
From manual record keeping to so-called Object-Oriented Relational Database Management Systems (OORDBMS), databases have come a long way. Along the timeline, many technologies have come and gone, from the hierarchical and network database models to the relational model. The RDBMS was a major game changer and has ruled the planet. It also evolved from the simple relational DBMS to the object-oriented RDBMS to accommodate the object-oriented approach introduced in most programming languages.
Things are changing very fast
Next, open source database management systems made the use of sophisticated databases the norm, and soon everyone was using them. Usage and applications grew manyfold. Today the number of online users is growing rapidly, and a huge share of activity is shifting to the Internet. As a result, everybody is working hard to make the best use of ever-growing data. Traditional database management systems have no doubt shouldered the burden so far, but it now seems they may become “once upon a time” databases in the near future. Simply put, the shape and size of data have changed to a level that these traditional database management systems cannot cope with.
The object-oriented relational DBMS is still in use and is going to stay, but new technologies have emerged and are taking over. Traditional SQL-based databases might just become obsolete, as the next generation of data is not the same any more. Some key characteristics of the databases emerging today are:
• They are non-relational,
• They are distributed,
• They are open source, and
• They are horizontally scalable.
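To make "distributed" and "horizontally scalable" a little more concrete, here is a minimal sketch of hash partitioning, one common way to spread keys over several machines. The node names and the `node_for_key` and `distribute` helpers are made up for illustration and do not come from any particular database.

```python
import hashlib


def node_for_key(key: str, nodes: list[str]) -> str:
    """Pick a node by hashing the key (stable across runs, unlike built-in hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def distribute(keys: list[str], nodes: list[str]) -> dict[str, list[str]]:
    """Group keys by the node that would store them."""
    placement: dict[str, list[str]] = {node: [] for node in nodes}
    for key in keys:
        placement[node_for_key(key, nodes)].append(key)
    return placement


keys = [f"user:{i}" for i in range(1000)]

# Scaling out means adding a node: the same keys are now shared by four machines.
three_nodes = distribute(keys, ["node-a", "node-b", "node-c"])
four_nodes = distribute(keys, ["node-a", "node-b", "node-c", "node-d"])

for name, placement in (("3 nodes", three_nodes), ("4 nodes", four_nodes)):
    print(name, {node: len(owned) for node, owned in placement.items()})
```

Simple modulo placement like this moves many keys when a node is added, which is why real systems usually prefer schemes such as consistent hashing.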
Some people read NoSQL as “Not Only SQL”, meaning that SQL is going to stay but must be combined with new technologies to accommodate the changing characteristics of modern web-scale databases. Today’s databases are not bound to predefined schemas, must support replication, should be open source with simple APIs, and should handle huge and diverse data sets consistently. This has driven a great deal of research and development in the area of databases for more than a decade now.
Many NoSQL databases have emerged. They can be classified as:
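As a rough illustration of what "not bound to predefined schemas" means in practice, the sketch below keeps documents of different shapes side by side in one collection. The `insert` and `find` helpers and the sample records are hypothetical and only mimic the idea; they are not the API of any real document database.

```python
import json

collection: list[dict] = []


def insert(doc: dict) -> None:
    """Store a copy of the document; the JSON round trip also checks it is serializable."""
    collection.append(json.loads(json.dumps(doc)))


def find(**criteria) -> list[dict]:
    """Return documents whose top-level fields match every criterion."""
    return [d for d in collection if all(d.get(k) == v for k, v in criteria.items())]


# No table definition up front: each document carries its own fields.
insert({"_id": 1, "name": "Asha", "email": "asha@example.com"})
insert({"_id": 2, "name": "Ben", "tags": ["admin", "ops"], "address": {"city": "Pune"}})
insert({"_id": 3, "name": "Chen", "email": "chen@example.com", "tags": ["dev"]})

with_email = [d["name"] for d in collection if "email" in d]
print(with_email)
print(find(name="Ben"))
```

In a relational table the second record would need either extra nullable columns or a schema change; here it is just another document.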
Wide Column Store/ Column Families
Hadoop: An open source distributed processing framework for Big Data; its HBase component is the wide column store.
Cassandra: Its main features include:
• It is a massively scalable, partitioned row store,
• It has a masterless architecture,
• It provides linear scale performance,
• It has no single point of failure,
• It supports reads and writes across multiple data centers.
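The combination of a partitioned row store and a masterless architecture can be sketched with a small hash ring: every node can work out which replicas own a partition key, so no coordinator is needed. This is a simplified illustration with one token per node, not Cassandra's actual implementation; the `Ring` class, the node names, and the replication factor of 3 are assumptions made for the example.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class Ring:
    """Toy hash ring: one token per node, replicas are the next nodes clockwise."""

    def __init__(self, nodes: list[str], replication_factor: int = 3):
        self.replication_factor = replication_factor
        self.tokens = sorted((token(node), node) for node in nodes)

    def replicas(self, partition_key: str) -> list[str]:
        # First node whose token is at or after the key's token, wrapping around.
        start = bisect.bisect_right(self.tokens, (token(partition_key), ""))
        count = min(self.replication_factor, len(self.tokens))
        return [self.tokens[(start + i) % len(self.tokens)][1] for i in range(count)]


ring = Ring(["dc1-node1", "dc1-node2", "dc2-node1", "dc2-node2"], replication_factor=3)
owners = ring.replicas("user:42")
print(owners)
```

Because the ring is just data that every node holds, losing one node removes one replica of some partitions but leaves no single point of failure.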
MapR, Cloudera, Hortonworks
• These are Hadoop Distributions and professional services
Scylla: Its main features are:
• It is a Cassandra-compatible column store,
• It has consistently low latency,
• It can perform more transactions per second.
IBM Informix : It is also
• Horizontally as well as vertically scalable,
• Partitioned row store,
• Document store
eXtremeDB Financial Edition: It is
• Massively scalable,
• A persistent storage DBMS,
• Built specifically for analytics on market data.
Druid: It is
• An open source analytics data store for business intelligence,
• Designed for queries on event data.
ConCourseDB: Written in Java, it
• Is a self-tuning distributed database,
• Performs automatic indexing,
• Handles version control, and
• Supports ACID transactions.
Elastic : With this database, you can
• Retain your traditional SQL database benefits and
• Also gain new cloud efficiencies.
MongoDB: Written in C++, it is a document store that is
• Free to use, with a commercial license also available.
Cloud Datastore: Originally part of Google App Engine, this database
• Is a fully managed document store,
• Facilitates multi-master replication across different data centers.
Azure DocumentDB: A fully managed NoSQL database with the following features:
• It is globally distributed,
• It suits massive-scale databases with low latency needs,
• It guarantees:
o 99.99% availability,
o 99% of reads at <10ms, and
o 99% of writes at <15ms.
• SQL like syntax
• Swappable persistence stores.
• Performs Joins,
• Performs nested matches,
• Takes care of projections,
• Has Asynchronous cursors, and
• Can take care of streaming analytics,
• Also has
o Built-in predicates,
o Indexable computed values,
o Fully indexed dates and arrays,
o Built-in statistical sampling.
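Latency guarantees such as "99% of reads at <10ms" are percentile statements: they say nothing about the slowest 1% of requests. The sketch below shows how such a target could be checked against measured latencies, using a nearest-rank percentile. The sample numbers are invented for illustration and are not measurements of any service.

```python
def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # ceil(pct/100 * n) in integer arithmetic
    return ordered[rank - 1]


# Invented data: 990 fast reads and 10 slow outliers, in milliseconds.
read_latencies_ms = [2.0] * 990 + [40.0] * 10

p99 = percentile(read_latencies_ms, 99)
meets_target = p99 < 10
print(f"p99 = {p99} ms, meets '<10ms' target: {meets_target}")
```

With these numbers the target is met even though ten requests took 40 ms, which is exactly what a 99th-percentile guarantee allows.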
Key Value based or Tuple Store
DynamoDB: It is an automatically scaling NoSQL DB with
• Support for multiple Availability Zones,
• Elastic MapReduce integration.
Azure Table Storage: It has the following features:
• Blob Storage and Queue Storage are also available,
• Data is stored with 3 times redundancy,
• It can be accessed via REST or ATOM.
LevelDB: A fast key-value database from Google, written in C++, that supports batch updates.
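A batch update groups several puts and deletes so that they are applied together or not at all. LevelDB's C++ API offers this through a write batch object; the Python below is only a toy in-memory model of the idea, and the `TinyStore` and `WriteBatch` classes here are hypothetical, not a binding to the real library.

```python
class WriteBatch:
    """Collects operations to be applied in one step."""

    def __init__(self):
        self.ops = []

    def put(self, key, value):
        self.ops.append(("put", key, value))

    def delete(self, key):
        self.ops.append(("delete", key, None))


class TinyStore:
    """In-memory key-value store with all-or-nothing batch writes."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def write(self, batch: WriteBatch) -> None:
        staged = dict(self._data)  # work on a copy so a failure changes nothing
        for op, key, value in batch.ops:
            if op == "put":
                staged[key] = value
            elif op == "delete":
                staged.pop(key, None)
            else:
                raise ValueError(f"unknown operation: {op}")
        self._data = staged  # swap in the new state only after every op succeeded


store = TinyStore()
setup = WriteBatch()
setup.put("old-key", "payload")
store.write(setup)

# Move a value from one key to another as a single batch.
move = WriteBatch()
move.delete("old-key")
move.put("new-key", "payload")
store.write(move)
print(store.get("old-key"), store.get("new-key"))
```

Without the batch, a crash between the delete and the put could leave the value under neither key.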
Aerospike Database: Written in C++, it has the following features:
• It is a fast, web-scale database.
• It has predictable performance.
o Achieves 2.5 M TPS (reads and writes), 99% under 1 ms.
• Tunable consistency.
• Can be Replicated,
• Requires zero configuration,
• Has zero downtime,
• Performs auto-clustering,
• Performs rolling upgrades
• Has built in analytical functions
• Suitable for high load environment, it is a modern key-value database.
• It is a Key/Value store for Python object serialization.
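"Tunable consistency", mentioned in the list above, generally means the application can trade consistency for speed on each request. One widely used scheme, shown here as a generic sketch and not as a description of how Aerospike itself works, lets the client choose how many of N replicas must acknowledge a write (W) and answer a read (R). When R + W > N, every read set overlaps the latest write set. The function names are made up for this example.

```python
def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True when any r replicas must include at least one of the w written ones."""
    return r + w > n


def worst_case_read(n: int, w: int, r: int) -> int:
    """Write version 2 to the first w replicas, read the last r, return the newest seen."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    replicas = [1] * n            # every replica starts at version 1
    for i in range(w):
        replicas[i] = 2           # the write reaches only w replicas
    read_set = replicas[n - r:]   # the reader contacts the replicas least likely to have it
    return max(read_set)


for w, r in [(2, 2), (1, 1), (3, 1)]:
    print(f"N=3 W={w} R={r}: overlap={quorums_overlap(3, w, r)}, "
          f"newest version read={worst_case_read(3, w, r)}")
```

Smaller W and R give faster operations but allow stale reads; larger values cost latency but guarantee that the newest version is seen.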
Grid and Cloud Database Solutions
• It is an in-Memory Computing Platform
• Can perform real-time streaming, and
• Fast analytics in a single data access
Oracle Coherence: It
• Provides distributed processing,
• Provides querying,
• Performs session management.
One thing is for sure: these NoSQL databases are highly scalable and flexible database management systems. They allow you to store and process unstructured as well as semi-structured data efficiently, which is hard to achieve with RDBMS tools. They are future ready, fast, efficient, often open source, and able to handle and analyse massive amounts of data.