Monday, August 12, 2013

Notes from NoSQL meetup in ATLSWC

Some notes from "Overview of NoSQL Database systems" hosted by Thoughtworks, Atlanta.

Presenter: Noah Kriegel

Starters: Awesome pizza and drinks.

Presentation gist:

  • Beautiful Prezi layout (see the image below, courtesy of James Brechtel).



  • Most of the content is drawn from the book NoSQL Distilled, but Noah did a good job of extracting the essentials and supplementing them with his own experience.

Why we love RDBMS

  • ACID compliant
  • Standard, expressive, and powerful SQL.


Why the hate?

  • Data replication and multi-node setups are hard to get right, which makes the infrastructure expensive.
  • Application objects and database data types don't match. Tools like ActiveRecord and Hibernate go a long way toward bridging this gap, but they have their own pitfalls.
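
To make the object/data type mismatch concrete, here is a minimal sketch (my own, not from the talk) using Python's built-in sqlite3 module. The Order class and the orders/order_items tables are made-up names for illustration. The point is that one nested application object has to be split across two tables on save and stitched back together on load.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Order:
    # Hypothetical application object: one order with a nested list of items.
    order_id: int
    customer: str
    items: list = field(default_factory=list)


def save(conn, order):
    # The single object is flattened into rows in two separate tables.
    conn.execute(
        "INSERT INTO orders VALUES (?, ?)", (order.order_id, order.customer)
    )
    conn.executemany(
        "INSERT INTO order_items VALUES (?, ?)",
        [(order.order_id, item) for item in order.items],
    )


def load(conn, order_id):
    # Rebuilding the object takes two queries (or a join).
    customer = conn.execute(
        "SELECT customer FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()[0]
    rows = conn.execute(
        "SELECT item FROM order_items WHERE order_id = ? ORDER BY rowid",
        (order_id,),
    ).fetchall()
    return Order(order_id, customer, [row[0] for row in rows])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE order_items (order_id INTEGER, item TEXT)")

original = Order(1, "alice", ["pizza", "soda"])
save(conn, original)
restored = load(conn, 1)
```

An ORM such as ActiveRecord or Hibernate generates this kind of mapping code for you, which is exactly where the convenience and the pitfalls both come from.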


NoSQL movement

  • Inspired by two papers: Google's Bigtable and Amazon's Dynamo.
  • Gained traction with developers, unlike earlier attempts such as object databases, which failed to catch on.

Some important facts to remember

  • CAP theorem
  • Quorum-based voting techniques
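
As a quick reminder of the usual rule of thumb behind quorum-based voting (a sketch of my own, not code from the talk): with N replicas, a write acknowledged by W nodes, and a read that consults R nodes, a read is guaranteed to overlap the latest write when R + W > N. The function names below are made up for illustration.

```python
def is_strongly_consistent(n, w, r):
    """Return True when every read quorum overlaps every write quorum."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    # If r + w > n, any r nodes and any w nodes share at least one node.
    return r + w > n


def majority(n):
    # Smallest number of nodes that forms a majority of n replicas.
    return n // 2 + 1


# With 3 replicas, writing to 2 and reading from 2 always overlaps.
three_node_quorum = is_strongly_consistent(3, 2, 2)
# Writing to 1 and reading from 1 may miss the latest write.
three_node_fast = is_strongly_consistent(3, 1, 1)
```

Lowering W or R trades consistency for latency, which is one of the places where the CAP theorem shows up in practice.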


NoSQL types

  • Key-value store
  • Document-oriented database
  • Column-oriented DBMS
  • Graph databases
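
To get a feel for the difference, here is a rough sketch of how the same made-up user record might be shaped in each of the four models. It uses plain Python data structures rather than any real database client, and all names and values are illustrative assumptions of mine.

```python
import json

# Key-value store: the value is an opaque blob looked up by its key.
key_value = {"user:1": '{"name": "alice", "city": "Atlanta"}'}

# Document database: the store understands the structure of the value,
# so nested fields can be queried directly.
document = {
    "_id": 1,
    "name": "alice",
    "address": {"city": "Atlanta"},
    "follows": [2],
}

# Column-oriented store: row key -> column family -> columns.
column_family = {
    "1": {
        "profile": {"name": "alice", "city": "Atlanta"},
        "social": {"follows:2": ""},
    }
}

# Graph database: nodes plus explicit, typed edges between them.
nodes = {1: {"name": "alice"}, 2: {"name": "bob"}}
edges = [(1, "FOLLOWS", 2)]


def followed_by(user_id):
    # Walk the FOLLOWS edges that start at the given node.
    return [
        nodes[dst]["name"]
        for src, rel, dst in edges
        if src == user_id and rel == "FOLLOWS"
    ]


# With a key-value store the application has to parse the blob itself.
city_from_blob = json.loads(key_value["user:1"])["city"]
```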


Each type was discussed under the following headings:

  • Introduction and key feature
  • When to use
  • When not to use
  • Sample code


I'll post more details on each of these database types in upcoming posts.

Some key TIL:

  • CQL closely mirrors SQL in its syntax.
  • Redis stands for REmote DIctionary Server.
  • Mongo is derived from the word "humongous".

Tuesday, April 9, 2013

Hadoop installation in AWS EC2

I had been using Cloudera's CDH VM image on my local machine for hands-on experience with Hadoop. I thought I'd try it out on AWS and see how smooth the process is.

So I started following Cloudera's blog post on the topic. But the post had a lot of issues, and things didn't work as outlined. I got a little more help from this blog.

So here's the setup in brief, after creating the instance. As suggested in both posts, I used Whirr to install the cluster and avoid manual setup.

Step 1: Get the latest Whirr binary.
Step 2: Set up the Whirr config file. You can copy the contents below and update the AWS Access Key ID and Secret Access Key accordingly.
Step 3: Install Java.
Step 4: Generate the public key. I just pressed Enter at the first prompt.
Step 5: Launch the cluster. Wait until you get the instructions to ssh to the nodes.
Step 6: ssh to the nodes.
Step 7: Verify that the Hadoop installation works. We'll look at this in more detail later.

Sunday, February 10, 2013

Currently reading: NoSQL Distilled


When the rough cut version of "NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence" was released on Safari Books, I dived in and posted some comments. They were totally useless to the authors and the publisher, as they only pointed out incomplete sentences or grammatical mistakes that the proofreaders had already caught. However, Martin Fowler was kind enough to respond to each and every comment.

Later on, when the book was released, I joked to a friend who was looking to buy it that it's better to get the table of contents from the Amazon website, read up on each topic in Wikipedia, and save a few bucks. Last week, I thought I should seriously give it a read. So far, I've completed 3 chapters and have learnt a lot. Not only did I realize that I made a fool of myself with that joke, but I'm also learning many things a little late.

My upcoming posts will definitely reflect a few things that I'm learning from this book.