Data at Scale - Michael Peacock, Cloud Connect 2012

Data at Scale

Data problems and solutions with the
connected world

Michael Peacock
Web Systems Developer
Telemetry Team
Smith Electric Vehicles

Lead Developer
Occasional conference speaker
Technical Author

• World’s largest manufacturer of all-electric
commercial vehicles
• Founded in 1920
• US facility opened 2009
• US buyout in 2011

Electric Vehicles
• 16,500 – 26,000 lbs gross vehicle weight
• Commercial Electric Delivery Trucks
• 7,121 – 16,663 lbs payload
• 50 – 240 km range
• Top Speed 80 km/h

Electric Vehicles
• New, continually evolving technology
• Viability evidence required
• Government research

EV Data
• Performance analysis and metrics
• Proving the technology: Government
research
• Evaluating driver training conversions
• Diagnostics, Service and Warranty Issues
• Continuous Improvement

Current Status
• ~500 telemetry-enabled vehicles
• Telemetry is now fitted as standard in our
vehicles
• Our MySQL solution processes:
– 1.5 billion inserts per day
– Constant minimum of 4000 inserts per second

CANBus and Telemetry
• Sample the buses: once per second
• Only sample buses with useful
performance and diagnostic information on
them

Vehicle Data
• Drive train information:
– Motor speed
– Pedal positions
– Temperatures
– Fault Codes
• Battery information:
– Current, Voltage & Power
– Capacity
– Temperatures

Connected World: The Problem
• Connected infrastructure
– EV Charging stations
– Utilities
• Home based telemetry
– Smart Meters
– Smart Homes

Our problem
• Hundreds of connected devices, each with
numerous sensors giving us 2,500 pieces
of data per second per vehicle
• Broadcast time we can’t plan for
• Vehicles rolling off the production line
• New requirements for more data

Issue 2: Capacity
Sometimes data is too
much to cope with

www.flickr.com/photos/eveofdiscovery/3149008295

Option: Cloud Infrastructure
• Cloud based infrastructure gives:
– More capacity
– More failover
– Higher availability

Cloud Infrastructure: Problem
• Huge volumes of data inserts into a
MySQL solution: sub-optimal on virtualised
environments
• Existing enterprise hardware investment
• Security and legal issues with storing our
data off-site

www.flickr.com/photos/gadl/89650415/in/photostream

AMQP
Advanced Message Queuing Protocol

Queuing
• Downtime
• Capacity
• Maintenance Windows

What if...
• Queuing allows us to cope with:
– Downtime of our own systems
– Capacity problems
• Queuing doesn’t allow us to cope with:
– An outage of the queuing infrastructure itself

Buffer

www.flickr.com/photos/brapps/403257780

Cloud based infrastructure
• Use a Message Queue to ensure data is
only processed when you have the
resources to process it
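
A minimal sketch of the buffering idea above. An in-memory deque stands in for a real AMQP broker, and the burst sizes and per-tick capacity are made-up numbers for illustration only.

```python
from collections import deque

def simulate(bursts, capacity_per_tick):
    """Return the backlog size after each tick.

    bursts: number of messages arriving in each tick (hypothetical values).
    capacity_per_tick: how many messages the consumer can process per tick.
    """
    backlog = deque()
    history = []
    for tick, arriving in enumerate(bursts):
        # Producers never wait: everything that arrives is queued.
        backlog.extend((tick, n) for n in range(arriving))
        # The consumer takes only what it has the resources to process.
        for _ in range(min(capacity_per_tick, len(backlog))):
            backlog.popleft()
        history.append(len(backlog))
    return history

history = simulate(bursts=[5, 20, 3, 0, 0], capacity_per_tick=8)
print(history)  # [0, 12, 7, 0, 0]
```

The burst in the second tick is absorbed by the queue and drained over the following ticks, rather than overloading the consumer.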

SAN
• Backbone to most cloud-based systems
• Powers our MySQL solution
• Supports:
– Huge volumes of data
– Lots of processing
– Fast connection to your servers
– Backups and snapshots

SAN Tips
• When dealing with data on a huge scale,
every aspect of your application and
infrastructure needs to be optimised. This
includes your SAN, which is commonly
overlooked.

• http://www.samlambert.com/2011/07/how-to-push-your-san-with-open-iscsi_13.html

Speed: Stream → Batch
• Streams of continuously flowing data can
be difficult to process
• Turn the stream into small, quick batches

• MySQL: LOAD DATA INFILE
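
One way the stream-to-batch step could look, as a sketch only. The table name `datapoints`, its columns and the batch size are assumptions, not taken from the talk. The code writes each batch to a CSV file and builds the matching LOAD DATA INFILE statement; it does not connect to MySQL.

```python
import csv
import os
import tempfile

def batches(stream, size):
    """Group an iterable of rows into lists of at most `size` rows."""
    batch = []
    for row in stream:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

def write_batch(rows, directory):
    """Write one batch to a CSV file and return (path, SQL statement)."""
    fd, path = tempfile.mkstemp(suffix=".csv", dir=directory)
    with os.fdopen(fd, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)  # csv.writer ends lines with \r\n
    # Table and column names are hypothetical.
    sql = (
        f"LOAD DATA INFILE '{path}' INTO TABLE datapoints "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\r\\n' "
        "(recorded_at, signal_id, value)"
    )
    return path, sql

with tempfile.TemporaryDirectory() as tmp:
    stream = ((f"2012-02-14 09:00:{s:02d}", 7, 40 + s) for s in range(5))
    for rows in batches(stream, size=2):
        path, sql = write_batch(rows, tmp)
        print(len(rows), sql)  # a real loader would execute `sql` here
```

LOAD DATA INFILE reads the file on the database server's host; LOAD DATA LOCAL INFILE reads it from the client instead.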

Shard 1: Hardware
• As the amount of data increased, we hit a
huge performance problem. This was
solved by sharding at a hardware level.
• Each data collection device was given its
own database, which could be on any
number of separate machines, with a
single database acting as a registry
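
A rough sketch of the registry lookup described above. The host names and the `device_<id>` naming scheme are hypothetical, and a plain dictionary stands in for the registry database.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Shard:
    host: str
    database: str

class Registry:
    """Stands in for the single registry database described above."""

    def __init__(self):
        self._shards = {}

    def register(self, device_id, host):
        # One database per data collection device; naming is hypothetical.
        shard = Shard(host=host, database=f"device_{device_id}")
        self._shards[device_id] = shard
        return shard

    def locate(self, device_id):
        try:
            return self._shards[device_id]
        except KeyError:
            raise LookupError(f"device {device_id} is not registered") from None

registry = Registry()
registry.register(101, host="db1.example.internal")
registry.register(102, host="db2.example.internal")
print(registry.locate(102))
```

Every query then starts with a `locate` call, so a device's database can be moved to another machine by updating one registry entry.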

Rationalisation & Extrapolation
• Remember the CANBus
– Always telling us information, which we
sample every second?
– Do we always need that?
• Extrapolate and assume
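
The slides do not say how the data was rationalised. One possible reading, sketched here as an assumption, is to store a sample only when its value changes and to assume the value held steady in between when reading it back.

```python
def compress(samples):
    """Keep a (second, value) sample only when the value changes."""
    kept = []
    last = object()  # sentinel that never equals a real value
    for second, value in samples:
        if value != last:
            kept.append((second, value))
            last = value
    return kept

def expand(kept, start, end):
    """Rebuild one value per second in [start, end) by carrying the
    last stored value forward."""
    values = dict(kept)
    current = None
    out = []
    for second in range(start, end):
        current = values.get(second, current)
        out.append((second, current))
    return out

# Hypothetical motor speed readings, one per second.
samples = [(0, 0), (1, 0), (2, 0), (3, 12), (4, 12), (5, 0)]
kept = compress(samples)
print(kept)  # [(0, 0), (3, 12), (5, 0)]
print(expand(kept, 0, 6) == samples)  # True
```

Half of the rows are never stored in this example, yet the per-second series can still be reconstructed.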

Getting information from data
• Vehicle performance information involves:
– Looking at 20 – 30 data points for each
second of a vehicle’s operation in a day
– Analysing the data
– Performing calculations, which vary
depending on certain data points
• Getting this data was slow
– How far did Customer A’s fleet travel last
week?

Regular processing
• Instead of processing data on demand,
process it regularly
• Nightly scheduled task to evaluate
performance information

Regular Processing: Problems
You need to pull the data out faster
than before!

Shard 2: Tables
• All our data has a timestamp associated
with it
• Looking up data for a particular day was
slow. Very slow.
• We sharded the data again, this time with
a table per week within each vehicle’s own
database

Sharding: Fallbacks and logic
• What about data before you implemented
sharding?
• Which table do I need to look at?

Aggregation
• With data segregated on a per vehicle and
per week basis, lookups were much faster
• Performance calculations could be
scheduled nightly, with a single record
recorded for each vehicle for each day in a
central database
• Allows for easy aggregation:
– How far did my fleet travel last week?
– How much energy did they use last month?
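
To illustrate why one record per vehicle per day makes these questions cheap, here is a small sketch. An in-memory SQLite database stands in for the central MySQL database, and the `daily_summary` schema and the figures are invented.

```python
import sqlite3

central = sqlite3.connect(":memory:")  # stand-in for the central database
central.execute(
    "CREATE TABLE daily_summary ("
    " vehicle_id INTEGER, day TEXT, distance_km REAL, energy_kwh REAL,"
    " PRIMARY KEY (vehicle_id, day))"
)

def record_day(vehicle_id, day, distance_km, energy_kwh):
    """Store the nightly result: one row per vehicle per day."""
    central.execute(
        "INSERT OR REPLACE INTO daily_summary VALUES (?, ?, ?, ?)",
        (vehicle_id, day, distance_km, energy_kwh),
    )

def fleet_distance(vehicle_ids, first_day, last_day):
    """Total distance for a fleet over an inclusive date range."""
    marks = ",".join("?" for _ in vehicle_ids)
    row = central.execute(
        "SELECT COALESCE(SUM(distance_km), 0) FROM daily_summary"
        f" WHERE vehicle_id IN ({marks}) AND day BETWEEN ? AND ?",
        (*vehicle_ids, first_day, last_day),
    ).fetchone()
    return row[0]

record_day(1, "2012-02-06", 80.0, 60.0)
record_day(1, "2012-02-07", 95.5, 71.0)
record_day(2, "2012-02-06", 40.0, 33.0)
print(fleet_distance([1, 2], "2012-02-06", "2012-02-12"))  # 215.5
```

The weekly fleet total reads three summary rows here instead of every per-second data point behind them.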

Backups and Archives
• SAN backups and snapshots
• With date based sharding:
– Dump a table
– Copy it elsewhere
– Drop it / Flush it (if archiving)
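
The three steps above could be scripted roughly as follows. This sketch only builds the commands and does not run them; the archive host, the paths and the database and table names are hypothetical.

```python
import shlex

def archive_commands(database, table, archive_dir, drop=True):
    """Return the shell steps for archiving one weekly table."""
    dump_file = f"{archive_dir}/{database}.{table}.sql"
    steps = [
        # 1. Dump a table
        f"mysqldump {shlex.quote(database)} {shlex.quote(table)}"
        f" > {shlex.quote(dump_file)}",
        # 2. Copy it elsewhere (destination host is hypothetical)
        f"scp {shlex.quote(dump_file)} archive.example.internal:/archives/",
    ]
    if drop:
        # 3. Drop it, only once the copy has been verified
        drop_sql = f"DROP TABLE `{database}`.`{table}`"
        steps.append(f"mysql -e {shlex.quote(drop_sql)}")
    return steps

for step in archive_commands("device_101", "datapoints_2011_w23", "/var/archive"):
    print(step)
```

Because each week lives in its own table, archiving never has to delete rows from a table that is still being written to.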

Outsource to the cloud
• Why waste resources doing things that
cloud based services do better (where
legal, security and privacy considerations allow)?

• Maps
• Email delivery
• Even phone integration

Data Type Optimization
• When prototyping a system and designing
a database schema, it’s easy to be sloppy
with your data types and fields
• DON’T BE
• Use as little storage space as you can
– Choose the smallest data type that fits the data
– Use only the fields you need

Sharding: An excuse
• Sharding was a large project for us, and
involved extensive re-architecting of the
system.
• We had to make changes to every query
we have in our code
• Gave us an excuse to:
– Optimise the queries
– Optimise the indexes

Query Optimization
• Run every query through EXPLAIN
EXTENDED
• Check it hits the indexes
• Remove functions like CURDATE from
queries, to ensure query cache is hit
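
MySQL's query cache does not store results for queries that call functions such as CURDATE() or NOW(). A minimal sketch of the workaround is to compute the date in the application and send it as a literal. The table and column names here are hypothetical.

```python
from datetime import date

def todays_rows_sql(table, today=None):
    """Build a query with the date as a literal rather than CURDATE()."""
    today = today or date.today()
    # date.isoformat() yields YYYY-MM-DD, so the literal is safe to embed.
    # `table` is assumed to come from trusted application code.
    return (
        f"SELECT signal_id, value FROM {table} "
        f"WHERE recorded_at >= '{today.isoformat()}'"
    )

def explain(sql):
    """Prefix a query so MySQL reports its execution plan."""
    return "EXPLAIN EXTENDED " + sql

sql = todays_rows_sql("datapoints_2012_w07", today=date(2012, 2, 14))
print(sql)
print(explain(sql))
```

The query text is now identical for every request made on the same day, so repeated requests can be served from the cache.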

Index Optimization
• Keep it small
• From our legacy days of one database on
one server, we had a column that told us
which vehicle the data related to
– This was still there...as part of an
index...despite the fact the application
hadn’t required it for months

Live data
• Original database design dictated:
– Each type of data point required a separate
query, sub-query or join to obtain
• Collection device and processing service
dictated:
– GPS Co-ordinates can be up to 6 separate
data points, including: Longitude; Latitude;
Altitude; Speed; Number of Satellites used to
get location; Direction

Dashboards: Caching
• Don’t query if you don’t have to

• Cache what you can; read it directly from the cache

• With message queuing it’s possible to
route messages to two or more places:
one to be processed and another to
display the latest information directly
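
A toy illustration of that routing. The class below only imitates the behaviour of an AMQP fanout exchange in memory; it is not a real AMQP client, and the queue names and message fields are made up.

```python
from collections import deque

class FanoutExchange:
    """Toy stand-in for a fanout exchange: every bound queue gets a copy."""

    def __init__(self):
        self.queues = {}

    def bind(self, name):
        self.queues[name] = deque()
        return self.queues[name]

    def publish(self, message):
        for queue in self.queues.values():
            queue.append(message)

exchange = FanoutExchange()
to_process = exchange.bind("processing")
to_display = exchange.bind("live_dashboard")

latest = {}  # cache of the newest reading per vehicle, read by dashboards

exchange.publish({"vehicle": 101, "speed_kmh": 42})
exchange.publish({"vehicle": 101, "speed_kmh": 47})

# Dashboard consumer: keep only the latest message, never touch the database.
while to_display:
    message = to_display.popleft()
    latest[message["vehicle"]] = message

print(latest[101]["speed_kmh"], len(to_process))  # 47 2
```

The processing queue still holds both messages for the database pipeline, while the dashboard reads the newest value straight from the cache.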

Exporting data: Group
• Where possible group exports and reports
together by the same shard/table/index

Code considerations
• Race conditions
• Number of concurrent requests – group
them

Application Quality
• When dealing with lots of data, quickly,
you need to ensure:
– You process it correctly
– You can act fast if there is a bug
– You can act fast when refactoring

Deployment
• When dealing with a stream of data, rolling
out new code can mean pausing the
processing work that is done
• Put measures in place to make the
deployment switch-over instantaneous
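
The talk does not say which measures were used. One common approach, shown here purely as an assumption, is to keep each release in its own directory and repoint a `current` symlink with a single atomic rename. The directory names are hypothetical.

```python
import os
import tempfile

def switch_release(current_link, new_release):
    """Point `current_link` at `new_release` in a single atomic rename."""
    tmp_link = current_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(new_release, tmp_link)
    # rename() over an existing path is atomic on POSIX filesystems, so
    # workers see either the old release or the new one, never a gap.
    os.replace(tmp_link, current_link)

with tempfile.TemporaryDirectory() as root:
    for name in ("release-1", "release-2"):
        os.mkdir(os.path.join(root, name))
    current = os.path.join(root, "current")
    switch_release(current, os.path.join(root, "release-1"))
    switch_release(current, os.path.join(root, "release-2"))
    active = os.path.basename(os.readlink(current))
    print(active)  # release-2
```

Rolling back is the same call with the previous release directory.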

Technical Tips
• Measure your application’s performance,
data throughput and so on
– A data at scale problem itself
• Use as much RAM on your servers as is
safe to do so
– We give MySQL 80% of the 100 – 140 GB
of RAM on each DB server

What do we have now?
• Now we have a fast, stable, reliable system
• Pulling in millions of messages from a queue per
day
• Decoding those messages into 1.5 billion data
points per day
• Inserting 1.5 billion data points into MySQL per
day
• Performance data generated, and grant
authority reports exported daily
• More sleep at night than we used to get
