This document provides the agenda and logistical information for a one-day course on NoSQL in the Cloud. The course attracted attendees from large companies such as AOL and Symantec as well as from smaller companies. It aims to give an overview of the NoSQL landscape with a focus on Cassandra, Hadoop, and Pig. The schedule covers presentations on these topics, hands-on walkthroughs, and a discussion period. The instructor notes that there is a lot of content to cover and that documentation on the topic is scattered, so not all details can be explored in depth.
3. Who is attending this course?
● Good response to call
○ Did not do much publicity - LinkedIn, Twitter
● Objective was to determine interest in this topic
○ Evil agenda is to run longer, more hands-on courses
○ Hence very modest price point
● Attendees from big companies
○ AOL, Symantec, HP, D&B
● ...and small companies
○ Boxever, eSpatial, Ezora
● Capped attendance at 50 people
5. Course objective
● To have an overview of the NoSQL landscape with particular emphasis on analytics
● Conceptual understanding of specifics of Cassandra/Hadoop/Pig
● Understanding of some key tools used to work with Cassandra/Hadoop/Pig
● Understanding of example use case based on Gowalla data
6. Schedule (revised!)
8.45 - Registration/coffee
9.30 - Intro - course overview
9.35 - Overview of NoSQL landscape
10.15 - Overview of Cassandra
11.00 - Break
11.15 - Introduction to Hadoop and Hadoop ecosystem
12.00 - Introduction to Pig
12.25 - Integration of Cassandra/Hadoop/Pig
12.45 - Lunch
13.30 - Description of example problem, design of data models
13.45 - Design of cluster for this simple scenario - essential parameters
14.00 - Cassandra walkthrough
15.00 - Break
15.30 - Hadoop walkthrough
16.00 - Pig walkthrough
16.45 - Discussion/Q&A
17.00 - Close
7. Further points
● A lot of content to go through
○ In some cases don't have time/space to go into great detail
● Chose specific technologies which are most widely used in this space
○ although there is a large set of technologies to choose from
● Documentation on this topic is scattered across many sources
● For Cassandra/Hadoop, DataStax is a leading company - chose not to use their