NoSQL DB Showdown Presentation

Updated 2010-10-16 18:00

Summary

This is the review for a presentation called NoSQL DB Showdown at the Columbus Ruby Brigade meeting on Oct 18, 2010.

 

If you're going, consider doing at least the first two or three tables before the meeting, and maybe the rest of them in matching mode.

Scope of Presentation

The 3 databases being covered here are:
MongoDB
CouchDB
Cassandra

Follow Along

Go to this URL to follow along:

 

memorize.com/nosql
(Click the "NoSQL DB Showdown Presentation" link)

Quote

"Ruby is to Java what Document DB is to Relational DB"
--Me

Should I Use One for my Project?

Consider augmenting, not only replacing
Consider not using an ORM

Document DB Background

Cassandra isn't a document DB, but it shares these qualities.

 

Question | Answer
"Records" are | "Documents": nested hashes (or arrays)
Adding new fields | Can be done on the fly (no migrations)
Typing | Loosely typed
Querying | Limited (no "joins")
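
The qualities in the table above can be sketched with plain Python dicts standing in for documents. This is only an illustration of the data model, not any database's API; the names `post`, `other`, and `collection` and all the sample values are made up for the example.

```python
# A "record" is a document: a nested hash (dict) that can also hold arrays (lists).
# All names and values here are hypothetical sample data.
post = {
    "title": "NoSQL DB Showdown",
    "tags": ["mongodb", "couchdb", "cassandra"],
    "author": {"name": "Pat", "group": "Columbus Ruby Brigade"},
}

# Adding a new field needs no migration: just set it on this one document.
post["comments"] = [{"by": "Sam", "text": "Nice talk"}]

# Loosely typed: another document may store a different type in the same field.
other = {"title": "Untitled", "tags": "misc"}

# A "collection" here is just a list of documents with no fixed schema.
collection = [post, other]

# No joins: related data is nested inside the document and read back directly.
comment_authors = [
    comment["by"]
    for doc in collection
    for comment in doc.get("comments", [])
]
```

Note that `other` has no `comments` field at all, so the lookup has to tolerate missing fields; that is the usual price of skipping migrations.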

Code Samples

Using MongoDB console

Introducing the Databases

Question | Answer
Type of database | Document: MongoDB, CouchDB; column-oriented: Cassandra
Closest to MySQL | MongoDB
Closest to Google BigTable | Cassandra

Origin of Databases

Question | Answer
MongoDB came from | A company called 10gen
CouchDB came from | Inspired by Lotus Notes
Cassandra came from | Facebook (searching "inbox")

Origin of Names

Question | Answer
MongoDB: origin of name | From "humongous"
CouchDB: origin of name | Acronym for "Cluster Of Unreliable Commodity Hardware"; also fits its REST API ("rest on a couch")
Cassandra: origin of name | Prophet who knew the future but wasn't believed (reference to oracle)

Language Implemented in

Question | Answer
MongoDB: implemented in | C++
CouchDB: implemented in | Erlang
Cassandra: implemented in | Java

Questions

Pause for questions

Unique Advantages

Question | Answer
MongoDB: unique advantage | Dynamic querying
CouchDB: unique advantage | Disconnected databases (can be synchronized when needed)
Cassandra: unique advantage | Lots of data (terabytes)
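
"Dynamic querying" means a query can be composed at run time against any field, without declaring it ahead of time. A minimal sketch of the idea over in-memory dicts follows; `find` is a hypothetical helper written for this example (it only mimics the general shape of a query-by-example call and is not the MongoDB driver), and the `users` data is invented.

```python
def find(collection, query):
    """Return the documents whose fields equal every key/value pair in query.

    Hypothetical helper for illustration; not a real database API.
    """
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in query.items())
    ]

# Invented sample documents; note that they do not all share the same fields.
users = [
    {"name": "Ann", "city": "Columbus", "langs": ["ruby"]},
    {"name": "Bob", "city": "Dayton"},
    {"name": "Cy", "city": "Columbus", "admin": True},
]

# Queries are ordinary data, so they can be built on the fly.
columbus = find(users, {"city": "Columbus"})
admins = find(users, {"city": "Columbus", "admin": True})
```

Without an index, a query like this has to scan every document, which is the trade-off behind "search without indexes" in the Indexes table.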

Unique Disadvantages

Question | Answer
MongoDB: unique disadvantage | Can lose data
CouchDB: unique disadvantage | Doesn't handle volatile data well (frequent changes)
Cassandra: unique disadvantage | Can't tell db to add an index

Indexes

Question | Answer
MongoDB: indexing | Index parts of documents, search without indexes
CouchDB: indexing | Can create indexes via "views"
Cassandra: indexing | You have to maintain them (write the keys/values)
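
The Cassandra row ("you have to maintain them") is the least obvious of the three, so here is a rough sketch of what maintaining your own index means. Two plain dicts stand in for two column families; `users`, `users_by_city`, `put_user`, and `keys_in_city` are all hypothetical names, and no Cassandra client is involved.

```python
users = {}          # primary data: row key -> columns
users_by_city = {}  # hand-maintained index: city -> set of row keys

def put_user(key, columns):
    """Write a row and keep the city index in step with it (illustrative only)."""
    old = users.get(key)
    if old is not None and "city" in old:
        # Remove the stale index entry before writing the new one.
        users_by_city[old["city"]].discard(key)
    users[key] = columns
    if "city" in columns:
        users_by_city.setdefault(columns["city"], set()).add(key)

def keys_in_city(city):
    """Look up row keys through the index instead of scanning every row."""
    return sorted(users_by_city.get(city, set()))

put_user("ann", {"city": "Columbus"})
put_user("bob", {"city": "Dayton"})
put_user("bob", {"city": "Columbus"})  # bob moves, so the index must follow
```

Every write path has to remember the index update; forgetting it in one place leaves the index silently wrong, which is why this counts as a disadvantage.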

Caching (in-memory)

Question | Answer
MongoDB | Yes
CouchDB | None
Cassandra | Yes (tunable)

Protocol

Question | Answer
MongoDB: protocol | BSON
CouchDB: protocol | REST/JSON
Cassandra: protocol | Thrift
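
To make the REST/JSON row concrete: storing a CouchDB document is an HTTP PUT to `/{database}/{document id}` with a JSON body. The sketch below only builds that request with the Python standard library and does not send it. The helper `couchdb_put_request`, the database name `talks`, and the document id are assumptions for the example; `localhost:5984` assumes a default local install.

```python
import json
import urllib.request

def couchdb_put_request(base_url, db, doc_id, doc):
    """Build (but do not send) the HTTP request that stores one document.

    Hypothetical helper; the database and document names are made up.
    """
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{db}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

# Assumes CouchDB on its default local port; nothing is sent over the network here.
req = couchdb_put_request(
    "http://localhost:5984",
    "talks",
    "nosql-showdown",
    {"title": "NoSQL DB Showdown"},
)
```

Passing `req` to `urllib.request.urlopen` would actually send it, given a running server and an existing `talks` database.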

Structure

Question | Answer
MongoDB: structure | Collection -> Document (nested hash)
CouchDB: structure | Database -> Document (nested hash)
Cassandra: structure | Column family -> Row -> (optional Super Column) -> Column -> Value
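
The three layouts above can be pictured as nested Python dicts. This is only a mental model of the nesting levels, with invented names and values; it says nothing about how each database stores data on disk.

```python
# MongoDB: collection -> document (nested hash)
mongo = {
    "users": [  # collection
        {"_id": 1, "name": "Ann", "address": {"city": "Columbus"}},  # document
    ]
}

# CouchDB: database -> document (nested hash), keyed by document id
couch = {
    "users_db": {  # database
        "ann": {"name": "Ann", "address": {"city": "Columbus"}},  # document
    }
}

# Cassandra: column family -> row -> column -> value
cassandra = {
    "Users": {  # column family
        "ann": {"name": "Ann", "city": "Columbus"},  # row of column: value pairs
    }
}

# Cassandra with the optional super column level:
# column family -> row -> super column -> column -> value
cassandra_super = {
    "Users": {
        "ann": {
            "address": {"city": "Columbus", "state": "OH"},  # super column
        }
    }
}

city_plain = cassandra["Users"]["ann"]["city"]
city_super = cassandra_super["Users"]["ann"]["address"]["city"]
```

The document stores nest to any depth inside one document, while the Cassandra layout has a fixed number of levels, with the super column adding exactly one more.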

Cluster Configuration

Question | Answer
MongoDB: cluster configuration | Master/slave (with auto-balancing)
CouchDB: cluster configuration | Dynamo (with BigCouch)
Cassandra: cluster configuration | Dynamo (nodes configure themselves)

Downtimes

Question | Answer
MongoDB | Foursquare downtime
CouchDB | Keeps running after hard server restart
Cassandra | Digg downtime (VP of Engineering was fired)

Scaling Trends

Question | Answer
Redundancy | Using a lot of disk space to achieve performance (Facebook inbox)
Key order | Hashed, so no iterating through keys (MongoDB is exception)
Cluster configuration | Dynamo / peer-to-peer (client can connect to any node)
Consistency | Tunable - from eventual to absolute
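
Two of the rows above, hashed key order and tunable consistency, are easier to see in code. The sketch below is a simplification under stated assumptions: real Dynamo-style systems place keys on a consistent-hashing ring rather than using a plain modulo, and the function names are hypothetical. The quorum rule shown (a read of `r` replicas overlaps a write to `w` replicas when `r + w > n`) is the usual way Dynamo-style tuning is described.

```python
import hashlib

def node_for_key(key, node_count):
    """Pick a node from a hash of the key.

    Simplified modulo placement for illustration; real clusters use a
    consistent-hashing ring so that adding a node moves fewer keys.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % node_count

def read_overlaps_write(n, w, r):
    """True when any r replicas read must include one of the w replicas written."""
    return r + w > n

# Neighbouring keys are scattered by the hash, so a scan "from user:1 to user:9"
# cannot be answered by walking one node in key order.
placement = {key: node_for_key(key, 4) for key in ("user:1", "user:2", "user:3")}

# With 3 replicas: quorum writes plus quorum reads overlap; single-replica
# writes and reads do not, which gives eventual consistency instead.
strong = read_overlaps_write(n=3, w=2, r=2)
eventual = not read_overlaps_write(n=3, w=1, r=1)
```

Raising `w` or `r` buys consistency at the cost of latency and availability, which is the "tunable" part.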

Other DBs

Question | Answer
BigTable | Runs on Google's servers
SimpleDB | Runs on Amazon's servers
Riak | Like Cassandra but has "links"
Redis | Holds whole db in memory

References

http://en.wikipedia.org/wiki/MongoDB
http://en.wikipedia.org/wiki/Couchdb
http://en.wikipedia.org/wiki/Apache_Cassandra
http://techcrunch.com/2010/09/07/digg-struggles-vp-engineering-door/
http://nosql.mypopescu.com/post/1265191137/foursquare-mongodb-outage-post-mortem