Introduction to Warehouse-Scale Computers
1. Warehouse-Scale Computers
CS4342 Advanced Computer Architecture
Dilum Bandara
Dilum.Bandara@uom.lk
Slides adapted from “Computer Architecture: A Quantitative Approach” by John L.
Hennessy and David A. Patterson, 5th Edition, 2012, MK Publishers, and
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale
Machines by Luiz André Barroso & Urs Hölzle
7. Warehouse-Scale Computer (WSC)
Provides Internet services
Search, social networking, online maps, video sharing,
online shopping, email, cloud computing, etc.
Differences with HPC clusters
Clusters use higher performance processors & network
Clusters emphasize thread-level parallelism, WSCs
emphasize request/task-level parallelism
Differences with datacenters
Datacenters consolidate different machines & software
into a single location
Datacenters emphasize virtual machines & hardware
heterogeneity to serve varied customers
8. Design Factors for WSC
Cost-performance
Small savings add up
Energy efficiency
Affects power distribution & cooling
Work per joule
Operational costs count
Power consumption is a primary constraint when
designing a system
Dependability via redundancy
Many low-cost components
9. Design Factors (Cont.)
Network I/O
Interactive & batch processing workloads
Web search – interactive
Web indexing – batch
Ample parallelism – finding parallelism isn’t a concern
Most jobs are totally independent, “Request-level
parallelism”
Scale – Its opportunities & problems
Can afford to build customized systems as WSC
require volume purchase
Frequent failures
10. Failure Example
Consider a WSC with 50,000 nodes. MTTF of a node is 5
years. How many failures will there be per day?
MTTF in days = 5 x 365 = 1,825
Failure rate = 1/1,825 per node per day
No of failures per day = 50,000/1,825 ≈ 27.4
Consider a WSC with 50,000 nodes, each node with 4
hard disks. Suppose the annual failure rate of a disk is 4%.
What is the mean time between disk failures?
No of disks = 50,000 x 4 = 200,000
No of failures per year = 200,000 x 0.04 = 8,000
Time between failures = 365 x 24 / 8,000 ≈ 1.095 hours/failure
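The arithmetic on this slide can be checked with a few lines of Python; the constants are the slide's own, and the model assumes independent node/disk failures at a constant rate:

```python
# Back-of-the-envelope failure-rate math from the slide.
NODES = 50_000
NODE_MTTF_YEARS = 5

node_mttf_days = NODE_MTTF_YEARS * 365            # 1,825 days
node_failures_per_day = NODES / node_mttf_days    # ~27.4 failures/day

DISKS_PER_NODE = 4
ANNUAL_DISK_FAILURE_RATE = 0.04

disks = NODES * DISKS_PER_NODE                            # 200,000 disks
disk_failures_per_year = disks * ANNUAL_DISK_FAILURE_RATE # 8,000/year
hours_between_disk_failures = 365 * 24 / disk_failures_per_year

print(round(node_failures_per_day, 1))         # 27.4
print(round(hours_between_disk_failures, 3))   # 1.095
```

At this scale, a disk fails roughly every hour, which is why dependability must come from redundancy rather than from individual components.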
11. Programming Models & Workloads
Batch processing framework
– MapReduce
Map
Applies a programmer-
supplied function to each
logical input record
Runs on thousands of
computers
Provides new set of (key,
value) pairs as intermediate
values
Reduce
Collapses values using
another function
http://www.cbsolution.net/techniques/ontarget/mapreduce_vs_data_warehouse
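The Map and Reduce phases above can be sketched on a single machine; this word-count example is illustrative (the function names are not a real framework API), and a WSC runtime would shard the records across thousands of nodes:

```python
# Minimal single-machine sketch of the MapReduce model (word count).
from collections import defaultdict

def map_fn(record):
    # Apply a programmer-supplied function to one logical input record,
    # emitting intermediate (key, value) pairs.
    for word in record.split():
        yield (word, 1)

def reduce_fn(key, values):
    # Collapse all intermediate values for a key using another function.
    return (key, sum(values))

def mapreduce(records):
    intermediate = defaultdict(list)
    for record in records:                      # map phase
        for key, value in map_fn(record):
            intermediate[key].append(value)     # group by key
    return dict(reduce_fn(k, vs) for k, vs in intermediate.items())  # reduce

print(mapreduce(["to be or", "not to be"]))
# {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```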
14. Programming Models & Workloads
(Cont.)
MapReduce runtime environment schedules
map & reduce tasks to WSC nodes
Availability
Use replicas of data across different servers
Use relaxed consistency
No need for all replicas to always agree
Workload demands
Often vary considerably
15. Computer Architecture of WSC
Often uses a hierarchy of networks for
interconnection
Each 19” rack holds 48 1U servers connected to
a rack switch
Rack switches are uplinked to one or more
switches higher in the hierarchy
Uplink has 48/n times lower bandwidth –
Oversubscription
n – No of uplink ports
Goal is to maximize locality of communication relative
to the rack
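The 48/n oversubscription ratio can be made concrete with assumed figures (48 servers on 1 Gbit/s links; the uplink count is a hypothetical parameter):

```python
# Oversubscription ratio of a rack switch: bandwidth available inside
# the rack vs. bandwidth leaving the rack. Figures are illustrative.
def oversubscription(servers=48, uplinks=8, link_gbps=1.0):
    return (servers * link_gbps) / (uplinks * link_gbps)

print(oversubscription(uplinks=8))   # 6.0 -> uplink is 48/8 = 6x slower
```

With only 8 uplink ports, traffic leaving the rack sees one-sixth of the in-rack bandwidth, which is why software tries to keep communication local to the rack.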
19. Infrastructure & Costs
Location
Proximity to Internet backbones, electricity cost, property tax rates,
low risk from earthquakes, floods, & hurricanes
Power distribution
20. Power Usage
U.S. EPA Report 2007 – 1.5% of total U.S.
power consumption was used by data centers,
which had more than doubled since 2000 &
cost $4.5 billion
21. How Many Nodes can a WSC Support?
Each node
“Nameplate power rating” gives maximum power
consumption
To get actual, measure power under actual workloads
Oversubscribe cumulative nodes power by 40%,
but monitor power closely
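A small sketch of the sizing rule above; the facility budget and per-node nameplate power are assumed figures, not from the slide:

```python
# How many nodes fit under a facility power budget when cumulative
# node power is oversubscribed by 40%? Nodes rarely all draw their
# nameplate maximum at once, so we host 40% more than a naive
# nameplate division allows -- while monitoring power closely.
def max_nodes(facility_watts, nameplate_watts, oversubscribe=1.40):
    return int(facility_watts / nameplate_watts * oversubscribe)

print(max_nodes(8_000_000, 300))   # 37333 nodes for an 8 MW budget
```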
23. Cooling (Cont.)
Cooling system also uses water (evaporation & spills)
e.g. 70,000 to 200,000 gallons per day for an 8 MW facility
24. Efficiency
Power Utilization Effectiveness (PUE)
= Total facility power / IT equipment power
≥ 1
Median PUE in a 2006 study was 1.69
Source: http://hightech.lbl.gov/benchmarking-guides/data-a1.html
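The PUE definition is a one-line calculation; the input powers below are made up to reproduce the study's median value:

```python
# Power Utilization Effectiveness from the slide's definition.
def pue(total_facility_kw, it_equipment_kw):
    # Total facility power includes cooling & distribution losses,
    # so it can never be less than the IT equipment power: PUE >= 1.
    assert total_facility_kw >= it_equipment_kw
    return total_facility_kw / it_equipment_kw

print(pue(1690, 1000))   # 1.69, the median in the 2006 study
```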
25. Performance
Latency is an important metric because it is
seen by users
Bing study
Users will use search less as response time
increases
Service Level Objectives (SLOs) & Service Level
Agreements (SLAs)
Typically given at application level
e.g., 99% of requests complete in under 100 ms
In clouds typically given only for static resources
CPU speed, no of cores, & memory
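An SLO like the one above is easy to check against measured latencies; the sample data here is invented for illustration:

```python
# Check an SLO of the form "99% of requests complete in under 100 ms"
# against a list of measured request latencies.
def meets_slo(latencies_ms, threshold_ms=100.0, fraction=0.99):
    under = sum(1 for t in latencies_ms if t < threshold_ms)
    return under / len(latencies_ms) >= fraction

latencies = [20] * 990 + [250] * 10   # exactly 99.0% under 100 ms
print(meets_slo(latencies))           # True
```

Note that the SLO constrains the tail of the distribution, not the mean: a handful of slow requests can break it even when average latency looks fine.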
26. Cost
Capital expenditures (CAPEX)
Cost to build a WSC
Hardware cost dominates
Operational expenditures (OPEX)
Cost to operate a WSC
Power for nodes & cooling dominates
28. Cloud Computing (Cont.)
WSCs offer economies of scale that can’t be
achieved with a datacenter
5.7 times reduction in storage costs
7.1 times reduction in administrative costs
7.3 times reduction in networking costs
This has given rise to cloud services such as Amazon
Web Services
“Utility Computing”
Based on using open source virtual machine & operating
system software
29. Amazon Web Services
Virtual machines
Xen
Very low cost
$ 0.10 per hour per instance
Primarily relies on open source software
No (initial) service guarantees
No contract required
Amazon S3
Simple Storage Service
Amazon EC2
Elastic Compute Cloud
30. Amazon Web Services – Example
http://www.ryhug.com/free-art-available-on-amazon-amazon-web-services-that-is/
Editor’s notes
1U - A rack unit (abbreviated U or RU) is a unit of measure for rack height, defined as 44.50 mm (1.75 in)
computer room air conditioning (CRAC)
DCiE = 1/PUE
S3 - Simple Storage Service
EC2 - Elastic Compute Cloud