A (very) short history of big data
1. 1
A Very Short History
of Big Data
Lightning Talk
photo by: exfordy, flickr
Copyright (c) 2008 Scale Unlimited, Inc. All Rights Reserved. Reproduction or distribution of this document in any form without prior written permission is forbidden.
Monday, December 19, 2011
2. 2
The First Big Data Problem
1880 Census
50 Million People
Age, gender, number of insane people
in household
3. 3
The First Big Data Solution
Hollerith Tabulating System
Punched cards - 80 variables
Used for 1890 Census
6 weeks instead of 7+ years
4. 4
What is Big Data?
I Know It When I See It
More than you can handle with
the computer you’ve got
And scaling up isn’t an option
5. 5
Big Science == Big Data
Weather predictions
Super-collider data
Astronomy images
6. 6
A Data Explosion
“Every two days now we create as much
information as we did from the dawn of
civilization up until 2003. That’s something
like five exabytes of data”
-- Google CEO Eric Schmidt
Gigabyte = 10^9 = 1,000,000,000
Terabyte = 10^12 = 1,000,000,000,000
Petabyte = 10^15 = 1,000,000,000,000,000
Exabyte = 10^18 = 1,000,000,000,000,000,000
OK, there’s a lot of data
Increased to 800 billion gigabytes in 2009.
If every person on earth tweeted continuously for a century...
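To put the slide's figures on one scale, here is a minimal sketch that converts between the decimal units listed above. The `convert` helper and its names are illustrative assumptions, not part of the talk, and the yearly figure simply takes the quoted "five exabytes every two days" at face value.

```python
# Decimal (SI) storage units, as listed on the slide.
UNITS = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
}


def convert(amount, from_unit, to_unit):
    """Convert an amount between two of the units above (hypothetical helper)."""
    return amount * UNITS[from_unit] / UNITS[to_unit]


# The slide's 2009 figure: 800 billion gigabytes, expressed in exabytes.
total_2009_eb = convert(800 * 10**9, "gigabyte", "exabyte")
print(f"2009 total: {total_2009_eb:.0f} exabytes")

# The quoted rate of 5 exabytes every two days, extrapolated to a 365-day year.
exabytes_per_year = 5 / 2 * 365
print(f"Quoted rate: {exabytes_per_year:.1f} exabytes per year")
```

The two numbers land in the same range (hundreds of exabytes), which is all this rough comparison is meant to show.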
7. 7
Search
Analyzing lots of data
Important pages are those that
important pages link to
Solving Satan’s spreadsheet
100 billion rows x 100 billion columns
10. 10
Advertising
Specifically online advertising
Lots of data in the form of log files
Lots of value if you increase sales
Targeted advertising can be good
But scary, when they know too much
Example ad:
Satisfy your Barney Fetish
Pictures of Barney being drop-kicked off bridges.
Discreet shipping. No questions asked.
Editor's Notes
What was the problem? The 1880 census took 7 years to tabulate, using people. Not a job I’d want.
24 values per variable = about 4.5 bits, so 80 variables = 360 bits of data, or 45 bytes per card. 45 bytes x 62M people = 2.8 GB. Held onto data for a few months.
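The back-of-the-envelope estimate in this note can be reproduced as a short sketch. It assumes one 80-variable card per person and uses the note's own rounding of 4.5 bits per variable. The function and constant names are illustrative only.

```python
import math

VALUES_PER_VARIABLE = 24    # from the note
VARIABLES_PER_CARD = 80     # "Punched cards - 80 variables"
PEOPLE = 62_000_000         # the note's 62M figure; one card per person is assumed


def census_size_gb(bits_per_variable=4.5):
    """Estimate total data size in decimal gigabytes (hypothetical helper)."""
    bits_per_card = bits_per_variable * VARIABLES_PER_CARD
    bytes_per_card = bits_per_card / 8
    return bytes_per_card * PEOPLE / 10**9


# log2(24) is slightly above the 4.5 bits used in the note.
print(f"Exact bits per variable: {math.log2(VALUES_PER_VARIABLE):.2f}")
print(f"Estimated size: {census_size_gb():.2f} GB")
```

With the note's rounding this gives 45 bytes per card and about 2.79 GB in total, which the note rounds to 2.8 GB.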
In 1880, it was 1-2 GB of data. And they couldn’t order bigger people to process the data faster.
Many years like this. So what changed? The world wide web, and social services
1999 - 2 exabytes generated in the entire year.
Images, movies, sensors.
But what’s driving interest in big data is two things -
Two Stanford students were trying to solve the problem of divining a web page's "importance". This is separate from search - e.g. if two pages had roughly the same content, which to show first? It comes down to finding the dominant eigenvalue and eigenvector of a matrix. Google came up with systems for storing & processing the web.
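The idea that "important pages are those that important pages link to" can be sketched with power iteration, a standard way to approximate a dominant eigenvector. This is a toy illustration, not Google's actual system: the four-page graph is made up, and the damping factor of 0.85 is an assumption that does not appear in the talk.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate page importance by power iteration on a small link graph.

    links maps each page to the list of pages it links to; every linked
    page must also appear as a key. damping=0.85 is an assumed default.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its rank evenly over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: every other page links to "hub".
toy_web = {
    "hub": ["a"],
    "a": ["hub"],
    "b": ["hub", "a"],
    "c": ["hub"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph "hub" comes out on top because every other page links to it, and "a" ranks well mainly because the important "hub" page links to it. At web scale the same computation runs over the "100 billion rows x 100 billion columns" matrix, which is why it needs distributed storage and processing.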