Slides from the December 2012 Cloud Camp Chicago, including the lightning talks from our speakers: Dave Falck, Model Metrics: node.js on AWS; Paul Mantz, CohesiveFT: Working with APIs; Bob Chojnacki, Jellyvision Labs: Hadoop on AWS; Karl Zimmerman, Steadfast: Keep control with the Private Cloud
1. Sponsored by
Welcome to
Cloud Chicago
Hosted by
Live Tweet on the second
screen by using:
#cloudcamp
@cloudcamp_chi
Thursday, December 13, 12
2. Agenda
6:00pm Registration, Food, Drinks and Networking
6:30 Opening Remarks, Patrick Kerpan, CohesiveFT
6:45 Lightning Talks
Dave Falck, Model Metrics: node.js on AWS
Paul Mantz, CohesiveFT: Working with APIs
Bob Chojnacki, Jellyvision Labs: Hadoop on AWS
Karl Zimmerman, Steadfast: Keep control with the Private Cloud
7:45 Unpanel: “Who’s in Control of Your Cloud? Security and Visibility”
Emceed by Mike Dorosh, IBM & Patrick Kerpan, CohesiveFT
8:30 Breakout Sessions
9:00 Wrap Up - Drinks, anyone?
3. Dave Falck, Customer Solutions Engineer
5. Why the Node.js Buzz?
* LinkedIn’s entire mobile software stack is completely built in Node
* Why? Scale.
* Huge performance gains compared to what they were using before (Ruby on Rails)
* Went from running 15 servers with 15 instances (virtual servers) on each physical machine, to just four instances that can handle double the traffic.
6. What is Node.js?
* JavaScript platform based on Google Chrome V8 JS Engine
* Ryan Dahl (Joyent)
* Event-driven, non-blocking I/O model to allow your applications to scale while keeping you from having to deal with threads, polling, timeouts, and event loops
* FAST
* Used for real-time, data-intensive apps (mobile!)
* POPULAR
8. Hello World
var http = require('http');
// Answer every request with a plain-text greeting.
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(1337, '127.0.0.1'); // listen on port 1337, local connections only
9. What makes Node.js so fast?
* Thread-based networking is inefficient and difficult
* Node shows much better memory efficiency under high loads than systems which allocate 2 MB thread stacks for each connection.
* Users of Node are free from worries of dead-locking the process (*there are no locks*)
* Almost no function in Node directly performs I/O, so the process never blocks.
* Because nothing blocks, less-than-expert programmers are able to develop fast systems
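The non-blocking claim above can be seen in a few lines. This is a minimal sketch, not taken from the talk: it assumes a directory listing of the system temp directory as a stand-in for any I/O call, and the `order` array exists only to record what ran when.

```javascript
var fs = require('fs');
var os = require('os');

// Records the order in which the steps below actually run.
var order = [];

order.push('request issued');

// fs.readdir hands the work off and returns immediately; the callback
// runs later, once the event loop picks up the result.
fs.readdir(os.tmpdir(), function (err, entries) {
  order.push('result received');
  if (err) {
    console.error('readdir failed:', err.message);
    return;
  }
  console.log(order.join(' -> '));
});

// This line runs before the callback above, because nothing blocked.
order.push('kept working');
```

The process stays free between issuing the request and receiving the result, which is what lets one thread serve many connections.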
11. Under the Node.js hood
* JavaScript!
* Platform independent
* Easy to use
* Ubiquitous
* Google Chrome’s V8 JavaScript Engine
* Translates JS into machine code (not interpreted)
12. When not to use Node.js
* Node.js is not ideal for CPU intensive jobs like sorting, transformations, number crunching, analytics…
* In traditional CRUD web apps that need to be highly concurrent, performance will degrade when the data needs to be transformed…
* You can offload processing to another language that is better at making use of the CPU
* Cultural fit? Too new? You decide…
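One way to picture the offloading advice is to hand CPU-bound work to a separate process. The sketch below is an illustration under assumptions: `fib` is a deliberately slow stand-in for "number crunching", `fibInChild` is a hypothetical helper name, and the child is simply a second Node process, where the slide suggests a language better suited to the CPU.

```javascript
var childProcess = require('child_process');

// CPU-bound work: a deliberately naive recursive Fibonacci.
// Called directly, it occupies the event loop until it returns.
function fib(n) {
  return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

// Hypothetical helper: run the same calculation in a separate process so
// the calling process stays free to serve other requests meanwhile.
function fibInChild(n, callback) {
  if (!Number.isInteger(n) || n < 0) {
    return callback(new Error('n must be a non-negative integer'));
  }
  var script = 'function fib(n){return n<2?n:fib(n-1)+fib(n-2);}' +
               'console.log(fib(' + n + '));';
  return childProcess.execFile(process.execPath, ['-e', script],
    function (err, stdout) {
      if (err) { return callback(err); }
      callback(null, parseInt(stdout, 10));
    });
}
```

A caller would use it like any other asynchronous function, for example `fibInChild(30, function (err, value) { console.log(err || value); });`.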
13. Node.js + AWS
* Dec 6th: AWS released developer preview of node.js libraries to access AWS:
* DynamoDB
* S3
* EC2
* SWF
* Allows you to manage parallel calls to several AWS web services
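The idea of managing parallel calls can be sketched without the SDK itself. Everything below is an assumption made for illustration: `parallel` is a hand-written helper, and `listBuckets` and `listTables` are hypothetical stubs that answer from a timer, not functions from the AWS libraries.

```javascript
// Start every task at once and collect the results in their original order.
// Each task is a function that takes a Node-style callback (err, result).
function parallel(tasks, done) {
  var results = new Array(tasks.length);
  var pending = tasks.length;
  var finished = false;

  if (pending === 0) { return done(null, results); }

  tasks.forEach(function (task, index) {
    task(function (err, result) {
      if (finished) { return; }
      if (err) {
        finished = true;
        return done(err);
      }
      results[index] = result;
      pending -= 1;
      if (pending === 0) {
        finished = true;
        done(null, results);
      }
    });
  });
}

// Hypothetical stand-ins for service calls; a real client would issue
// network requests here instead of answering from a timer.
function listBuckets(callback) {
  setTimeout(function () { callback(null, ['logs', 'assets']); }, 20);
}
function listTables(callback) {
  setTimeout(function () { callback(null, ['visits', 'events']); }, 10);
}

// Both calls are in flight at the same time; the total wait is roughly the
// slower of the two rather than their sum.
parallel([listBuckets, listTables], function (err, results) {
  if (err) { return console.error(err.message); }
  console.log('buckets:', results[0], 'tables:', results[1]);
});
```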
15. More info
* http://nodejs.org
* http://en.wikipedia.org/wiki/Nodejs
* http://aws.typepad.com/aws/2012/12/aws-sdk-for-nodejs-now-available-in-preview-form.html
* http://www.jamesward.com/2011/06/21/getting-started-with-node-js-on-the-cloud/
* http://venturebeat.com/2011/08/16/linkedin-node/
16. Paul Mantz, Software Engineer
17. APIs in Cloud Environments
Paul Mantz
Copyright CohesiveFT - Dec 13, 2012
18. API Command-Line Clients
• Benefits to Creating API Command-Line Clients
• Lowers barrier to entry
• Familiar to technical consumers
• Advanced usage cases
• Integrates into existing toolsets
19. API Command-Line Clients
Excellent Internal Developer Tool
• Excellent for testing and rapid development
• Useful operations tool
20. API Command-Line Clients
Reference Implementation
• Gives developers an example to integrate the API
• Helps users model workflows
• DSL
21. API Command-Line Clients
Excellent Demo Tool
• Quick installation, often one file
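A one-file command-line client along the lines described in these slides might look like the sketch below. All specifics are hypothetical and chosen only for illustration: the `cloudctl` name, the `servers` resource, the routes in `COMMANDS`, and the `api.example.com` host do not come from the talk or from any real API.

```javascript
var http = require('http');

// Hypothetical mapping from "<resource> <action>" to an HTTP call.
var COMMANDS = {
  'servers list':   { method: 'GET',    path: '/servers' },
  'servers show':   { method: 'GET',    path: '/servers/:id' },
  'servers delete': { method: 'DELETE', path: '/servers/:id' }
};

// Turn command-line words into request options, or throw a usage error.
function buildRequest(argv) {
  var key = argv.slice(0, 2).join(' ');
  if (!Object.prototype.hasOwnProperty.call(COMMANDS, key)) {
    throw new Error('usage: cloudctl <resource> <action> [id]');
  }
  var command = COMMANDS[key];
  var path = command.path;
  if (path.indexOf(':id') !== -1) {
    if (!argv[2]) { throw new Error('missing id for "' + key + '"'); }
    path = path.replace(':id', encodeURIComponent(argv[2]));
  }
  return { host: 'api.example.com', method: command.method, path: path };
}

// Send the request and stream the response body to standard output.
function run(argv) {
  var req = http.request(buildRequest(argv), function (res) {
    res.pipe(process.stdout);
  });
  req.on('error', function (err) { console.error(err.message); });
  req.end();
}
```

Saved as a script, the last line would be `run(process.argv.slice(2));`, so that `node cloudctl.js servers show 42` issues `GET /servers/42`. Keeping `buildRequest` separate from `run` is what makes the client usable as a reference implementation: the mapping from commands to API calls can be read and tested without touching the network.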
22. Bob Chojnacki, Programmer
23. Big Data in the Cloud
A Journey into the unknown
24. Who Jellyvision is and why analytics are important to us
• We create interactive experiences
– Desktop
– Mobile
• … which ask questions, inform people, generate leads
• “Virtual Advisors”
• We also collect analytics in real time to generate reports about:
– How people answered a question
– Where they dropped out
– Lots of impressive stats!
25. The Problem
• Longer term projects and high volume projects causing MySQL to bust at the seams
• Some types of reports taking too long, or causing MySQL to crash if we include too much data
• In all fairness, we could probably tune MySQL, throw it on bigger servers, more memory
• Diminishing returns
• MySQL is fine for collecting the data…
26. The Solution
• Hadoop!
• Why Hadoop? Lots of possibilities out there, but which one to use? Cassandra, CouchDB, Hadoop, Membase, MongoDB, Neo4j, …
• Big Data meetups tended to have lots of people using Hadoop
• And I knew others using it.
• And Hortonworks had a fancy point and click solution I could use to get started quickly
27. Options with options
• Now that I picked Hadoop, I had several options, and options within options to use to analyze my data:
– Hive, Pig, MapReduce, Java, R
• I knew Java
• MapReduce seemed to make sense
• I’ll probably play with Hive and Pig next
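For readers new to the model, the shape of a MapReduce job can be shown in memory. This is only an illustration of the map, shuffle, and reduce steps: it is not the Hadoop API and not the speaker's Java job, and the record fields (`visit`, `question`, `answer`) are invented for the example.

```javascript
// Hypothetical event records; the field names are illustrative only.
var events = [
  { visit: 'v1', question: 'q1', answer: 'yes' },
  { visit: 'v2', question: 'q1', answer: 'no' },
  { visit: 'v3', question: 'q1', answer: 'yes' },
  { visit: 'v1', question: 'q2', answer: 'maybe' }
];

// Map: emit one [key, value] pair per record.
function mapEvent(event) {
  return [event.question + ':' + event.answer, 1];
}

// Shuffle: group the emitted values by key.
function shuffle(pairs) {
  var groups = Object.create(null);
  pairs.forEach(function (pair) {
    (groups[pair[0]] = groups[pair[0]] || []).push(pair[1]);
  });
  return groups;
}

// Reduce: collapse each key's values into a single result.
function reduceCounts(values) {
  return values.reduce(function (sum, value) { return sum + value; }, 0);
}

// Wire the three phases together for an in-memory list of records.
function mapReduce(records, mapper, reducer) {
  var groups = shuffle(records.map(mapper));
  var output = {};
  Object.keys(groups).forEach(function (key) {
    output[key] = reducer(groups[key]);
  });
  return output;
}

var counts = mapReduce(events, mapEvent, reduceCounts);
console.log(counts);
```

A report such as "how people answered a question" fits this pattern: the key is the question and answer, and the reduce step is a sum.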
28. It’s All About The Data
• Visit data
• Event data
• Denormalization of data
• Generated a ton of fake data:
– Started with 600K visits, 3M events
– Moved up to 1.8M visits, 60M events
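The slide does not say how the data was denormalized. One plausible reading, sketched here under that assumption, is that each event line carries a copy of its visit's fields so a job can process lines independently. The field names and the tab-separated layout are hypothetical.

```javascript
// Hypothetical normalized data: visits in one collection, events in another.
var visits = [
  { id: 1, product: 'advisor-a', device: 'desktop' },
  { id: 2, product: 'advisor-a', device: 'mobile' }
];
var events = [
  { visitId: 1, type: 'answer', value: 'yes' },
  { visitId: 1, type: 'dropout', value: '' },
  { visitId: 2, type: 'answer', value: 'no' }
];

// Copy the visit fields onto every event so each output line stands alone.
function denormalize(visits, events) {
  var byId = {};
  visits.forEach(function (visit) { byId[visit.id] = visit; });
  return events.map(function (event) {
    var visit = byId[event.visitId] || {};
    return [
      event.visitId, visit.product, visit.device, event.type, event.value
    ].join('\t');
  });
}

var lines = denormalize(visits, events);
console.log(lines.join('\n'));
```

The trade-off is larger files in exchange for not needing a join at analysis time.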
29. Make it so
• First experience: Hortonworks Virtual Sandbox
– Single node AMI at Amazon
– Hadoop 1.0
– 600K visits, 3M events
• On our existing platform we needed to break reports up into smaller chunks for some data because MySQL could not handle it.
• Results! What would have taken hours, took only 5 minutes on a single node Hadoop “cluster”
• In reality, some of the queries I could also run with command-line tools (wc, grep, awk) on the data considerably faster than even Hadoop.
• Important lessons learned so far:
– Think outside the RDBMS: they are great, but they may not make sense for all types of data
30. Looking at more real data
• Now, let’s generate data that is much closer to some of our products
• Instead of one question and answer, how about 15 questions? Adding in some other events gives a total of 34 events.
• Throw in some people returning, some of them multiple times
• Throw in some people who don't start the conversation, etc.
• Run my little auto-data-generator and BOOM! 20 million events and 4.4 GB later I have my data…
• … which took up too much disk space to run on the demo system I was using. Might as well turbo-charge this puppy...
31. More disk space!
• Full install of Hadoop (Hortonworks HDP)
• Single node
• 600K visits, 20M events
– 6m 29s, ~30s after map phase completed
• 1.8M visits, 60M events
– 18m 3s, ~90s after map phase completed
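The two timings above can be turned into rough throughput figures. The calculation below uses only the numbers on the slide and treats the rounded event counts (20M and 60M) as exact, which is an assumption.

```javascript
// Timings as reported on the slide, converted to seconds.
var runs = [
  { events: 20e6, seconds: 6 * 60 + 29 },  // 600K visits, 20M events
  { events: 60e6, seconds: 18 * 60 + 3 }   // 1.8M visits, 60M events
];

// Events processed per second, rounded to the nearest whole event.
var throughput = runs.map(function (run) {
  return Math.round(run.events / run.seconds);
});

// How much longer the larger run took compared with the smaller one.
var timeRatio = runs[1].seconds / runs[0].seconds;

console.log(throughput);            // events per second for each run
console.log(timeRatio.toFixed(2));  // ratio of the two wall-clock times
```

On these figures, three times the data took a little under three times as long, which suggests roughly linear scaling on the single node for this job.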
33. Caveats
• Not using Hadoop to its fullest / basically a weekend job
• Algorithms employed in this example probably won't end up in a book alongside Knuth’s
34. Next steps
• Make sure results on real data line up
• Integrate with team to generate reports they need
35. End stuff
• Thanks to the folks at Hortonworks who answered my frantic and spastic questions.
36. Karl Zimmerman, President
38. Private Cloud:
What do we mean?
Private cloud is a form of cloud computing where the
customer has some control/ownership of the service
implementation. It is a scalable, elastic IaaS solution
based on cloud computing but with more control over
resources.
39. Private Cloud:
What are the advantages?
Security
Availability
No vendor lock-in
Ease of management
41. Private Cloud:
Availability
Understanding and control of the infrastructure
Get the resources you need, when you need them
You're not subject to the whims of other users
43. Private Cloud:
Management
Easier to find employees with general IT knowledge
Utilize a broader array of tools and software
Get support/assistance from multiple levels
44. Private Cloud:
To Summarize
Private cloud can deliver what you need out of a public
cloud while giving you more control. Worries about losing
control over security and availability, and issues like vendor
lock-in and management, vanish into thin air like, well, a cloud.
And the fact that it doesn’t have to cost you more is a plus, too.
45. Unpanel: “Who’s in Control of Your Cloud? Security and Visibility”
Emceed by:
Mike Dorosh, Program Manager – Cloud Technical Partnerships, IBM
& Patrick Kerpan, CEO, CohesiveFT