Saturday, April 2, 2011

Vertica @ NEDB Summit - Crescando, SharedDB, SwissBox

[This post is waaay late -- I wrote it back in January but I am just putting it up now]

Last Friday a crew from Vertica attended the New England Database Summit at MIT (many thanks to Sam Madden for organizing such a good event). Despite Mingsheng walking off with my backpack (by accident) and almost giving me a heart attack, it was a fun day.

I was especially impressed with the keynotes.

The first was by Donald Kossmann, a professor at ETH Zurich. His talk, entitled "Predictable Performance for Unpredictable Workloads", was about a system they had co-developed with Amadeus IT Sophia-Antipolis for their airline reservation tracking system: think of the computer that the boarding attendant uses at the gate to find out who is eligible for first class upgrades, who is a vegetarian, etc.

Overall I thought it was an excellent example of one-size-doesn't-fit-all: by designing a system for the specific workload, they ended up with something that was an order of magnitude less expensive and could actually meet the SLA promised by Amadeus to its customers.

The SLA promised: no query will exceed 2 seconds. Ever. Regardless of load on the system. It is totally fine if the system returns results in 2 seconds when it is running a single query, but it also must return results in 2 seconds if it is running thousands of queries simultaneously. Stewardesses can smile at you for 2 seconds and you won't even notice, but waiting minutes under heavy load was unacceptable. Think about the recent Icelandic volcano eruption which hosed air travel for months.

Their system is built around a storage manager called Crescando, which is a word play on how it is structured: a sharded, distributed, in-memory table which is scanned continuously. They chose an amount of memory (~1GB) that can be scanned in ~1 second by a single modern CPU core, and sharded the table across cores and nodes into enough chunks to store the entire database (~1000 nodes). On top they put a query processor called SharedDB, which plans queries and tells each Crescando scan node which rows to return (that is, which predicates to evaluate). Since the table is completely scanned each second, voila, you have your 2 second guarantee (presumably a query that just misses the start of one sweep waits at most one sweep for the next to begin and one more for it to finish). Since it is an in-memory database, it also supports arbitrary updates by applying them in batch during the sweep.
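To make the shared-scan idea concrete, here is a toy Python sketch of how I picture it. This is not Crescando's or SharedDB's actual code or API: the names (Query, Update, sweep, shard_rows), the round-robin sharding, and the passenger rows are all made up for illustration, and I am assuming that updates are applied to a row before the queries in the same sweep look at it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Query:
    """A pending query (hypothetical): a predicate plus a buffer for matches."""
    predicate: Callable[[Row], bool]
    results: List[Row] = field(default_factory=list)


@dataclass
class Update:
    """A pending update (hypothetical): matching rows get the new values."""
    predicate: Callable[[Row], bool]
    changes: Row


def shard_rows(rows: List[Row], n_shards: int) -> List[List[Row]]:
    """Round-robin partition of the table, one piece per scanning core."""
    return [rows[i::n_shards] for i in range(n_shards)]


def sweep(shard: List[Row], updates: List[Update], queries: List[Query]) -> None:
    """One full pass over a shard that serves every pending operation at once."""
    for row in shard:
        # Assumption: batched updates are applied first, so queries in the
        # same sweep see the updated row.
        for u in updates:
            if u.predicate(row):
                row.update(u.changes)
        # Every pending query is checked against the row in the same pass.
        for q in queries:
            if q.predicate(row):
                q.results.append(dict(row))  # copy, so later sweeps cannot change it


# Made-up data: a tiny "reservation table" split across two shards.
passengers = [
    {"name": "Ann", "flight": "LX38", "meal": "vegetarian"},
    {"name": "Bob", "flight": "LX38", "meal": "standard"},
    {"name": "Cy", "flight": "LX40", "meal": "vegetarian"},
    {"name": "Dee", "flight": "LX38", "meal": "standard"},
]
shards = shard_rows(passengers, 2)

updates = [Update(lambda r: r["name"] == "Bob", {"meal": "vegetarian"})]
veg_on_lx38 = Query(lambda r: r["flight"] == "LX38" and r["meal"] == "vegetarian")
on_lx40 = Query(lambda r: r["flight"] == "LX40")

# In the real system each shard would be swept by its own core in parallel;
# here the shards are simply visited one after another.
for shard in shards:
    sweep(shard, updates, [veg_on_lx38, on_lx40])

veg_names = sorted(r["name"] for r in veg_on_lx38.results)
lx40_names = sorted(r["name"] for r in on_lx40.results)
print(veg_names, lx40_names)
```

The point of the sketch is that the cost of a sweep is dominated by walking the data once, no matter how many queries are waiting, which is why the latency bound does not depend on load in the way it does with a one-query-at-a-time design.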

Professor Kossmann concluded with a short bit about the SwissBox project, which would package the software into an appliance form factor and add some custom hardware (maybe the networking layer?). I have to admit that I find it amusing that even though a parade of companies has proven time and time again that custom FPGAs and ASICs can't keep pace with well-written software for x86 processors, people keep trying to add custom hardware anyway. Of course, as a software guy I am biased, but I can't imagine being able to make hardware in small batches to compete with the economies of scale and R&D budget that Intel can throw at x86 development.

In another post, I will hopefully write up the other keynote, by Renee Miller, who I thought presented clearly and concisely the theoretical underpinnings of automatically determining functional dependencies from data (it is fascinating, I swear!), which I have discovered is a classic data mining technique. That one got me thinking about one of the directions we are headed at Vertica.