Architecting for the Cloud
Len and Matt Bass
Elasticity
Link to Yesterday’s lectures
http://www.slideshare.net/lenbass/architecting-for-the-cloud-scabilityavailability
Topics
Scalability is about acquiring resources but once
they are acquired, they still must be used.
Elasticity is about how to use the resources.
This requires understanding
• Concurrency
• State
and their interactions
What is concurrency?
• Concurrency means performing several activities simultaneously.
• Concurrency is used to improve performance.
How do concurrent activities come to
be?
• Explicitly through your code creating a new
thread or process.
• Implicitly through some support system creating
a new thread or process
– Operating system
– Web server
– Database management system
• Implicitly through the infrastructure creating a
new virtual machine
– Elasticity in the cloud
– During deployment of your system
Key concepts
• Atomicity
– An atomic operation cannot be divided. It is all or nothing.
• Time
– It takes time to perform an operation.
• Computation
• Messages transferred over a network
• Reading/writing information from a disk (rotating or solid state)
• Dependency
– Coordination among concurrent activities is necessary if
they are sharing resources or results
• Problems arise because operations take time and can
be interrupted, i.e. they are not atomic.
Synchronous vs asynchronous
• Synchronous coordination between two
concurrent processes means that process A sends
a message to process B and waits for a response.
• Asynchronous coordination means that process A
does not wait for a response.
– It can poll for a response
– A response from process B can be sent as an event.
• In either case, coordination takes time and so
coordination is not an atomic operation.
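The difference between waiting and polling can be sketched in C. This is a single-threaded simulation rather than real inter-process messaging: `process_b`, the three-tick processing time and the doubling computation are all assumptions made for illustration.

```c
// Hypothetical stand-in for process B: it needs a few "ticks" to answer.
typedef struct {
    int ticks_left;  // time still needed before the response is ready
    int response;    // value B will eventually return
} process_b;

// What process A gets back, plus how much other work A managed to do.
typedef struct {
    int response;
    int other_work;
} outcome;

// Process A sends a message to process B.
static void send_request(process_b *b, int input) {
    b->ticks_left = 3;        // assumed processing time
    b->response = input * 2;  // assumed computation
}

// One poll: lets one tick pass and reports whether the response is ready.
static int poll_response(process_b *b) {
    if (b->ticks_left > 0) b->ticks_left--;
    return b->ticks_left == 0;
}

// Synchronous: A does nothing else until B responds.
outcome call_sync(int input) {
    process_b b;
    outcome out = { 0, 0 };
    send_request(&b, input);
    while (!poll_response(&b)) { /* just wait */ }
    out.response = b.response;
    return out;
}

// Asynchronous with polling: A does one unit of other work between polls.
outcome call_async(int input) {
    process_b b;
    outcome out = { 0, 0 };
    send_request(&b, input);
    while (!poll_response(&b)) out.other_work++;
    out.response = b.response;
    return out;
}
```

Both calls return the same response; only the asynchronous one gets other work done while the response is outstanding. In both, time passes between the request and the reply.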
Some problems with concurrent
activities
• Time stamps.
• Many protocols involve putting a time stamp on messages
for error detection and ordering purposes.
• Time stamps are often used to identify log messages used
for debugging problems.
• In some environments, e.g. stock market, trades must be
satisfied in the sequence in which they arrive.
• Race conditions – two processes are simultaneously
accessing the same resource.
• Inconsistency – If two activities are being performed
simultaneously, data may become inconsistent.
Clock synchronization
• Suppose two different computers are connected via a
network. How do they synchronize their clocks?
• If one computer sends its time reading to another, it
takes time for the message to arrive.
• NTP (Network Time Protocol) can be used to
synchronize time on a collection of computers.
– Accurate to around 1 millisecond in local area networks
– Accurate to around 10 milliseconds over public internet
– Congestion can cause errors of 100 milliseconds or more.
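NTP estimates a clock offset from four timestamps taken during one request/reply exchange. A minimal sketch of that arithmetic in C follows; the function names and the microsecond unit are assumptions for illustration, not part of any NTP library.

```c
// t0: client sends request     (client clock)
// t1: server receives request  (server clock)
// t2: server sends reply       (server clock)
// t3: client receives reply    (client clock)
// All values are in microseconds (an assumed unit for this sketch).

// Estimated amount by which the server clock is ahead of the client clock.
long long ntp_offset_us(long long t0, long long t1, long long t2, long long t3) {
    return ((t1 - t0) + (t2 - t3)) / 2;
}

// Round-trip network delay, excluding the server's processing time.
long long ntp_delay_us(long long t0, long long t1, long long t2, long long t3) {
    return (t3 - t0) - (t2 - t1);
}
```

The offset estimate is exact only when the outbound and return trips take the same time. For example, with a true offset of 5000 µs and 1000 µs each way, the estimate is 5000 µs. If the return trip takes 3000 µs instead, the estimate drops to 4000 µs. This is how congestion turns into clock error.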
Suppose NTP is insufficiently accurate
• The financial industry is spending hundreds of millions of dollars to
reduce latency between Chicago and New York by 3
milliseconds.
– Well within error range of NTP
• GPS time is accurate to within
– 14 nanoseconds (theoretically)
– 100 nanoseconds (mostly)
• Timestamp messages with GPS time
– Used by electric companies to measure phase angle
– Used by Google to coordinate time across all of their distributed
systems.
– Requires specialized hardware and installation not yet cheaply
available.
Example of a race condition
• Suppose withdrawals are being made from a bank
account. If there are two users simultaneously
withdrawing, the following sequence can occur.
User 1                   User 2                   Acct amount
                                                  1000
Read account (1000)                               1000
                         Read account (1000)      1000
Withdraw 100 (900)                                1000
Write new amount (900)                            900
                         Withdraw 100 (900)       900
                         Write new amount (900)   900
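Two withdrawals of 100 were made, yet the account ends at 900 instead of 800. A minimal sketch in C replays that interleaving deterministically, without real threads, and compares it with the same two withdrawals run one after the other. The `account` type and the function names are hypothetical.

```c
// Hypothetical in-memory account used only for this illustration.
typedef struct { int balance; } account;

// Step 1 of a withdrawal: read the current balance.
static int read_balance(const account *a) { return a->balance; }

// Steps 2 and 3: compute the new amount from the value read earlier
// and write it back, overwriting whatever is there now.
static void write_balance(account *a, int seen, int amount) {
    a->balance = seen - amount;
}

// Replays the sequence from the table: both users read before either writes.
int interleaved_withdrawals(int start, int amount) {
    account acct = { start };
    int seen1 = read_balance(&acct);      // user 1 reads
    int seen2 = read_balance(&acct);      // user 2 reads the same value
    write_balance(&acct, seen1, amount);  // user 1 writes
    write_balance(&acct, seen2, amount);  // user 2 overwrites user 1's update
    return acct.balance;
}

// Each read-modify-write runs as one unit, as it would under a lock.
int serialized_withdrawals(int start, int amount) {
    account acct = { start };
    int seen1 = read_balance(&acct);
    write_balance(&acct, seen1, amount);
    int seen2 = read_balance(&acct);
    write_balance(&acct, seen2, amount);
    return acct.balance;
}
```

With a starting balance of 1000 and withdrawals of 100, the interleaved version ends at 900 and the serialized version at 800. The read-modify-write is not atomic, so one update is lost.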
Example of inconsistency
• A cache is frequently used to keep data locally rather
than requiring it to be fetched for each request. Web
browsers, for example, cache web pages.
• For every request, the sequence is
1) look in cache to see if the request can be satisfied with the
contents of the cache
2) If no, then retrieve information and return it to the
requester and place it in the cache.
• Now suppose the web page is changed at its source
• Retrievals of the web page from the cache will retrieve
an out of date version of the web page.
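The two-step sequence above, and the stale read it can produce, can be sketched in C with a one-entry cache. The origin page, the cache and `cache_invalidate` are hypothetical names; invalidation is shown only as one possible remedy and is an assumption beyond what the slide states.

```c
#include <string.h>

#define PAGE_MAX 64

// Hypothetical origin server holding a single page.
static char origin_page[PAGE_MAX] = "version 1";

// Hypothetical one-entry cache.
static char cached_page[PAGE_MAX];
static int cache_filled = 0;

// The page is changed at its source.
void origin_update(const char *content) {
    strncpy(origin_page, content, PAGE_MAX - 1);
    origin_page[PAGE_MAX - 1] = '\0';
}

// 1) look in the cache; 2) on a miss, retrieve from the origin and cache it.
const char *fetch(void) {
    if (!cache_filled) {
        memcpy(cached_page, origin_page, PAGE_MAX);
        cache_filled = 1;
    }
    return cached_page;
}

// Assumed remedy: drop the cached copy so the next fetch goes to the origin.
void cache_invalidate(void) { cache_filled = 0; }

// Returns 1 if the cache serves out-of-date content after the origin changes.
int stale_read_demo(void) {
    fetch();                      // fills the cache with "version 1"
    origin_update("version 2");   // origin changes, cache is not told
    return strcmp(fetch(), origin_page) != 0;
}

// Returns 1 if a fetch after invalidation matches the origin again.
int fresh_after_invalidate_demo(void) {
    cache_invalidate();
    return strcmp(fetch(), origin_page) == 0;
}
```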
Solutions bring new problems
• One technique to prevent race conditions is to
lock critical resources.
• Can lead to deadlock – two processes waiting for
each other to release critical resources
– Process one gets a lock on row 1 of a database
– Process two gets a lock on row 2.
– Process one waits for process two to release its lock on
row 2
– Process two waits for process one to release its lock on
row 1
– No progress.
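The deadlock sequence can be replayed in C with a simulated lock table. The sketch also shows one common way to avoid it: every process acquires locks in the same order. Lock ordering is an assumption added here, not something the slide prescribes, and `row_lock` and the function names are hypothetical.

```c
#define FREE 0

// Hypothetical row lock: holds the id of the owning process, or FREE.
typedef struct { int owner; } row_lock;

// Non-blocking acquire: returns 1 on success, 0 if another process holds it.
static int try_lock(row_lock *l, int process) {
    if (l->owner == FREE || l->owner == process) {
        l->owner = process;
        return 1;
    }
    return 0;
}

static void unlock(row_lock *l, int process) {
    if (l->owner == process) l->owner = FREE;
}

// Replays the sequence above; returns 1 if neither process can proceed.
int opposite_order_deadlocks(void) {
    row_lock row1 = { FREE }, row2 = { FREE };
    try_lock(&row1, 1);                    // process one locks row 1
    try_lock(&row2, 2);                    // process two locks row 2
    int p1_blocked = !try_lock(&row2, 1);  // process one wants row 2
    int p2_blocked = !try_lock(&row1, 2);  // process two wants row 1
    return p1_blocked && p2_blocked;
}

// Both processes take row 1 before row 2; returns 1 only if they get stuck.
int same_order_deadlocks(void) {
    row_lock row1 = { FREE }, row2 = { FREE };
    int p1_has1 = try_lock(&row1, 1);  // process one gets row 1
    int p2_has1 = try_lock(&row1, 2);  // process two must wait, holding nothing
    int p1_has2 = try_lock(&row2, 1);  // row 2 is free, process one proceeds
    unlock(&row2, 1);                  // process one finishes and releases
    unlock(&row1, 1);
    int p2_done = try_lock(&row1, 2) && try_lock(&row2, 2);
    return !(p1_has1 && p1_has2 && !p2_has1 && p2_done);
}
```

In the second case the waiting process holds no lock that the other one needs, so the wait always ends.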
Yet more problems
• Locks are logical structures maintained in
software or in persistent storage.
• Getting a lock across distributed systems is not an
atomic operation.
– It is possible that while requesting a lock another
process can acquire the lock. This can go on for a long
time (it is called livelock if there is no possibility of
ever acquiring a lock)
• Suppose the virtual machine holding the lock
fails. Then the owner of the lock can never
release it.
Is there a solution?
• The general problem is that you want to manage
synchronization of data across a distributed set of
servers where fewer than half of the servers can fail.
• Paxos is a family of algorithms that use consensus to
manage state concurrency. Complicated and difficult to
implement.
• An example of the problems
– Choose one server as the master that keeps the
“authoritative” state.
– Now master server fails. Need to
• Find a new master
• Make sure it is up to date with the authoritative state.
Luckily
• Several open source systems are now available
that
– Implement Paxos or an alternative consensus
algorithm
– Are reasonably easy to use.
• Two such systems are
– Memcached – discussed at the end of this lecture
– Zookeeper – discussed in tomorrow’s lecture.
In general
• Introducing concurrency will improve
performance but also introduces problems.
• Concurrency is a constant consideration when
architecting for the cloud.
– Coordinating activities across concurrent processes is
difficult and prone to many errors.
– Allowing for failure complicates coordination of
activities.
• Systems are available to provide concurrency for
small amounts of data without your having to
worry about the details.
Topics
In order to understand how to achieve elasticity
you must understand
• Concurrency
• State
and their interactions
Recall Load Balancer
• Client makes a request that is routed to a
server through a load balancer
Message sequence – client makes a request
[Diagram: Clients, Load Balancer, Servers]
Message sequence – request arrives at load balancer
Message sequence – request is sent to one server
Message sequence – reply goes back to client
Message sequence – now client makes second request – does it matter which server it goes to?
“Sticky” http requests
• Normally load balancer will route requests
depending on load of servers attached to it.
• This is why it is called “load balancer”
• Client can request to be always routed to same
server. This is done by making a “sticky”
http request.
• Dangerous for two reasons:
– Server may be overloaded and response delayed
– Server may have failed and no response is
forthcoming.
• We assume non sticky http requests.
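The two routing policies can be contrasted in a short C sketch. The `server` struct, the load figures and the convention that `pinned < 0` means "not sticky" are assumptions for illustration; a real load balancer tracks load and health differently.

```c
// Hypothetical view of one server as seen by the load balancer.
typedef struct {
    int load;   // current load, arbitrary units
    int alive;  // 1 if the server is responding, 0 if it has failed
} server;

// Hypothetical pool: server 0 is busy, server 1 is idle, server 2 has failed.
static const server example_pool[3] = { {90, 1}, {10, 1}, {0, 0} };

// Returns the index of the server that will receive the request.
// pinned >= 0 models a sticky request: load and health are ignored.
// pinned <  0 models normal routing: least-loaded live server, or -1 if none.
int route(const server *pool, int n, int pinned) {
    if (pinned >= 0) return pinned;
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (!pool[i].alive) continue;
        if (best < 0 || pool[i].load < pool[best].load) best = i;
    }
    return best;
}
```

With the example pool, a non-sticky request goes to server 1. A request pinned to server 0 lands on the overloaded server, and one pinned to server 2 lands on a server that has failed. These are the two dangers listed above.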
Suppose message is routed to an
arbitrary instance.
• Understanding what happens requires a
digression into state.
• A computation has two inputs
– Instructions
– Data
• The data input of a computation is called the
state.
How does this work with functions?
• Consider a function that counts how many times it is called.
• Option 1:
int countv1()
{
    static int i = 0; //declare i once; static keeps its value between calls
    i = i + 1; //add 1 to the last value of i
    return i;
}
• The function count remembers i from one call to the next.
• State is maintained inside the function – it is stateful
Option 2
int countv2(int i)
{
    int a;
    a = i + 1; //add 1 to the last value of i
    return a;
}
• The function count does not remember the value of i
from one call to the next.
• The client must pass the last value returned.
• State is passed into the function. The function is stateless
Option 3
int countv3()
{
    int a;
    a = dbase_get("count"); //retrieve current value
    a = a + 1; //add 1 to the last value of a
    dbase_write("count", a); //save current value
    return a;
}
• The count is stored in a database.
• Neither the client nor the function remembers the value.
• The function is stateless.
What is the difference?
• In option 1, the function kept track of the
count value.
• In option 2, the client must keep track of the
count value.
• In option 3, the count value is kept in an
external database.
• In each case, the state (count value) must be
kept somewhere.
Suppose the functions are packaged as
processes in virtual machines
[Diagram: Option 1: Client calls Countv1. Option 2: Client calls Countv2. Option 3: Client calls Countv3, which uses the DB.]
Processes communicate via messages
• Message from client to process is a call
• Message from process back to client is the return
of a value
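Now suppose each process has two clients – what is computed by option 1?
[Diagram: two clients calling Countv1]
What is computed by option 2?
[Diagram: two clients calling Countv2]
What is computed by option 3?
[Diagram: two clients calling Countv3, which uses the DB]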
Where state is kept matters
• Option 1 – counts number of times called by
either client. Process remembers value
• Option 2 – counts number of times called by
each client. Client remembers value
• Option 3 – counts number of times called by
either client. Database remembers value.
Options 1 & 3 calculate different things than
option 2.
Now suppose each process has two instances – remember the load balancer
Load balancer distributes messages to servers
What is computed by option 1?
[Diagram: two instances of Countv1]
What is computed by option 2?
[Diagram: two instances of Countv2]
What is computed by option 3?
[Diagram: two instances of Countv3 sharing the DB]
Now what do the options compute?
• Option 1 – each instance of the function
countv1 computes how many times it was
invoked
• Option 2 – each instance of the function
countv2 computes how many times each
client invoked either instance
• Option 3 – the database contains the number
of times either instance was invoked by either
client.
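The difference between options 1 and 3 with two instances can be sketched in C. The round-robin dispatch (request n goes to instance n % 2) and the struct names are assumptions for illustration, and the shared store stands in for the database.

```c
// Option 1: each server instance keeps its own count in memory.
typedef struct { int count; } stateful_instance;

static int stateful_call(stateful_instance *s) {
    s->count = s->count + 1;
    return s->count;
}

// Option 3: instances hold no state; the count lives in a shared store.
typedef struct { int count; } shared_store;

static int stateless_call(shared_store *db) {
    int a = db->count;  // retrieve current value
    a = a + 1;
    db->count = a;      // save current value
    return a;
}

// Sends the given number of requests through a round-robin load balancer
// to two stateful instances and returns the last reply.
int last_reply_stateful(int requests) {
    stateful_instance servers[2] = { {0}, {0} };
    int reply = 0;
    for (int n = 0; n < requests; n++)
        reply = stateful_call(&servers[n % 2]);
    return reply;
}

// Same requests, but whichever instance handles one uses the shared store.
int last_reply_stateless(int requests) {
    shared_store db = { 0 };
    int reply = 0;
    for (int n = 0; n < requests; n++)
        reply = stateless_call(&db);
    return reply;
}
```

After four requests the stateful instances each report 2, while the shared store reports 4. Only option 3 counts every invocation regardless of which instance served it.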
What have we seen?
• When there was one instance of a client and
one instance of the count process- all three
versions were identical
• When there were two clients and one instance
of the count process– two versions were the
same, one was different
• When there were two clients and two
instances of the count process– all three
versions produced different results.
Message so far
• How state is managed is important and will
lead to different results when there are
multiple instances of clients or functions.
• Now we return to elasticity
• Remember the sequence?
Message sequence – client makes a request, the request arrives at the load balancer and is sent to one server, and the reply goes back to the client
Message sequence – now client makes second request – does it matter which server it goes to?
It depends where state is kept
• If state is kept in the client, then it does not
matter since the client keeps track of the calls
• If state is kept in a database then it does not
matter since the results are kept external to
the servers
• If state is kept in the server then it does
matter since sending the message back to server 1
will give a different result than sending it to
server 2.
Keeping servers stateless enables
elasticity
• A new instance of a server can be
– Created/stopped
– Registered/unregistered with the load balancer
– Placed in/removed from service
without
– Requiring the client to be aware of which server
instance it is interacting with
– Requiring that clients be notified if a server is taken
out of service
Types of State
• Session state
• Client side state
• Server side
• Persistent
What is a session?
• A session typically refers to a series of
interactions between one client login to a
system and the termination of that login –
whether through logging out or through
timing out.
• A session can also span multiple logins. E.g.
Netflix keeps track of where you are in a
movie and returns you to that location the
next time you log in.
Session State
• Session state is information that persists for a
session. We are considering a single login here.
The multiple login case is a special case of
persistent state.
• What happens when you log in
– When you successfully log in to a service, the service
returns a code that identifies you. This is the session
ID.
– Other information can also be included such as MAC
address (to prevent man in the middle attacks).
– It is typically managed on the client side. Your
browser does all of this.
Client Side State
• Keeping state on the client can be difficult if there is
significant state to save
– This means you’ll need to pass all of this state with
each request
– This requires more network overhead
• This also means you’ll need to store data on
the client machine
– This can have security implications
Stateful Services
• If your services are stateful that makes
scalability more difficult
• If you’re able to design your system such that
the services are stateless you’ll make scaling
much easier
• If an operation is dependent on the results of a
previous operation it’s more difficult to make
services stateless
Management of state between
services and persistent tier
• Non client side state can be either kept in the
services or in a persistent store.
• The choice depends on the volume of data,
the latency involved, the synchronization
needs for the servers and the time the state is
expected to persist.
Important latency numbers
• Main memory reference                  100 ns
• Send 1K bytes over 1 Gbps network      0.01 ms
• Read 4K randomly from SSD              0.15 ms
• Read 1 MB sequentially from memory     0.25 ms
• Round trip within same datacenter      0.5 ms
• Read 1 MB sequentially from SSD        1 ms    (4x memory)
• Disk seek                              10 ms   (20x datacenter round trip)
• Read 1 MB sequentially from disk       20 ms   (80x memory, 20x SSD)
• Send packet CA->Netherlands->CA        150 ms
* dean-keynote-ladis2009_scalable_distributed_google_system
Implications of latency numbers
• State stored in persistent storage (disk or SSD) will
take longer to fetch than state stored in memory.
• State stored in a different datacenter will take longer
to access than state stored locally, especially across
continents.
• Persistent store is typically replicated both for
performance (latency) reasons and for availability
(failure) reasons.
• => keeping data consistent across different
occurrences of it is important but difficult.
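To get a feel for these implications, the latency figures can be combined into rough per-fetch costs. The sketch below is back-of-the-envelope arithmetic in C under stated assumptions: the function names are hypothetical, values are in microseconds, and the time to transfer the data itself over the network is ignored.

```c
// Latencies taken from the table of latency numbers, in microseconds.
enum {
    MEM_READ_1MB_US        = 250,
    DC_ROUND_TRIP_US       = 500,
    SSD_READ_1MB_US        = 1000,
    DISK_SEEK_US           = 10000,
    DISK_READ_1MB_US       = 20000,
    INTERCONTINENTAL_RT_US = 150000
};

// State held in the memory of the server handling the request.
int fetch_local_memory_us(int megabytes) {
    return megabytes * MEM_READ_1MB_US;
}

// State on an SSD-backed store in the same datacenter: one round trip plus the read.
int fetch_remote_ssd_us(int megabytes) {
    return DC_ROUND_TRIP_US + megabytes * SSD_READ_1MB_US;
}

// State on a disk-backed store in the same datacenter: round trip, seek, read.
int fetch_remote_disk_us(int megabytes) {
    return DC_ROUND_TRIP_US + DISK_SEEK_US + megabytes * DISK_READ_1MB_US;
}

// State on an SSD-backed store on another continent.
int fetch_other_continent_ssd_us(int megabytes) {
    return INTERCONTINENTAL_RT_US + megabytes * SSD_READ_1MB_US;
}
```

For 1 MB of state this gives about 0.25 ms from local memory, 1.5 ms from an SSD store in the same datacenter, 30.5 ms from a disk store, and 151 ms across continents.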
Keeping data consistent
• We will discuss persistent data consistency
when we discuss databases.
• Memcached is an open source tool that
provides in-memory synchronization of data
across different instances of a service.
• Now consider these layers deployed onto
multiple servers.
Layers of a service
Business logic for the service
Memcached
Memcached in multiple servers
• Memcached keeps a small amount of state in all
servers consistent.
• At a small cost in latency as long as they are in
the same physical location.
[Diagram: two servers, each with business logic layered on Memcached]
When to use Memcached
• Data must be synchronized among servers.
• Memcached takes care of concurrency issues
• Data is relatively small
– One object < 1MB
– Total memory used per server depends on how much
you are willing to give it per server since it is stored in
memory, not on a persistent store
• Lifetime of the data should not exceed the time that
the servers are alive, i.e. if all the servers die,
then the data disappears.
Summary
• The cloud doesn’t guarantee elasticity
• You’ll need to design your system to be elastic
• State management, your storage solution, and
consistency, are all factors that you’ll need to
consider
QUESTIONS?
Architecting for the Cloud
Introduction to Security
Agenda
• What is security?
• Understanding the threat
• Architectural approaches to security
• Designing for security
• Summary
Your Experience
• Think about your past experience
– How have you thought about security?
– What steps have you (or your organization) taken
to protect the system?
• Do you remember Assignment 2?
– Security was equivalent to having a login feature
or encryption
Security … What is it?
• What do we mean when we say security?
• In your experience what does this mean?
Let’s Look at some Examples
Fort Knox
• Fort Knox is a US Army post in Kentucky
• In addition to housing various US Army
functions it is also the home to a gold bullion
depository
– 5000+ tons of gold housed there
Security
• What is the business asset that needs
protection in this case?
• What does protect mean here?
What About the CIA?
• The Central Intelligence Agency (CIA) is a US civilian
intelligence organization
• Primary purpose is to collect information about
foreign governments, corporations, and individuals
• It uses this information to influence public
policymakers
– It does at times engage in tactical operations as well
Security
• What is the business asset that needs
protection?
• What does protect mean in this case?
Power Distribution
• What would security mean if you have a
system that manages the power grid?
Business Context
• The business need differs from one context to another
• Organizations have assets they need to protect
• They need to protect these assets for different reasons
– Business continuity
– Liability reasons
– Regulation
– Protection of IP
– …
Security – A Set of Concerns
• The related concerns are typically classified as
“security” concerns
• In software these concerns are typically:
– Confidentiality
– Data integrity
– Non repudiation
– Availability
Confidentiality
• The property that reflects the extent to which:
– Data and services are only available to those that
are authorized to access them
• Is this a concern for a Museum? How about a
Financial Institution?
Integrity
• This property can also refer to data or services
• It reflects the extent to which data or services can
be delivered as intended
• E.g. hopefully the grade that we have recorded
for you in this course is correct …
Non Repudiation
• Nonrepudiation refers to the ability to guarantee
that the sender cannot later repudiate or deny
having sent the message
• It can also refer to the guarantee that the recipient
cannot later deny having received the message
• When might this be important?
Availability
• This is the property that reflects the extent to
which the system will be available for
legitimate use
• A denial of service attack is meant to disrupt
the availability of a system
Protection Against What?
• Now that we understand the business asset,
what are we protecting against?
• In order to appropriately protect our system
we need to understand the threat
• Let’s look at example exploits …
Agenda
• What is security?
• Understanding the threat
• Architectural approaches to security
• Summary
Threat Sources?
• Insider threats
• Physical threats
• Social engineering
• External attacks
Who is Leveraging These Techniques?
• The art of hacking has gone from an individual
activity to a highly coordinated and sophisticated
effort
– It can now be quite lucrative as well
• Today many legitimate and illegitimate organizations
routinely launch attacks
– Just run a port scan detector on your system
• Let’s look at the progression of exploits
Progression of Exploits
• Mischievous individuals:
– The first generation of hackers were technical youth performing mischievous acts
• Revenue generation: a proof of concept
– These were the first examples of hacking for money
– Still small scale
• Organized crime
– These were criminal organizations involved in larger scale criminal activity
• Widespread adoption
– The infrastructure needed to launch Cyber attacks is now widespread
– The barrier to entry has been lowered
– Legitimate entities enter the game
• Advanced persistent threats
Hackers – First Generation
• In the 1990s hackers were by and large not
malicious
• They were in it for the challenge
• Notable hackers
– Kevin Mitnick
– Chen Ing-Hau
– Jeffrey Lee Parson
– Sven Jaschan
Kevin Mitnick
• Broke into dozens of computer networks
– Pac Bell
– DEC
– MCI
– Digital
– …
• Wasn’t in it for financial gain
• Largely used “social engineering” techniques
• Arrested twice, in 1988 and again in 1995
[Photo: Mitnick, 1995]
Mitnick’s Techniques
• Largely used “social engineering” to gain
access to passwords and insider information
• Used this information to gain access to target
system
• Mitnick claims that he never “hacked” a
system (still a point of controversy)
Chen Ing-Hau
• University student who created and released the CIH
virus in 1998
– Wrote the virus to “make a fool of the software vendors”
• The virus would render the computer essentially
inoperable on a specified date
• Became one of the most widespread viruses
• Some versions of this virus have shown up multiple
times
CIH Virus
• Exploited vulnerability in Windows 95, 98, &
ME
– Along with an issue in various BIOS chipsets
• Would overwrite the first megabyte of the
hard drive and attempt to overwrite the flash BIOS
• Result rendered the PC inoperable
Jeffrey Lee Parson
• Was 18 when he confessed to being the creator of
the Blaster worm
• A Chinese “cracking” collective reverse engineered a
MS patch
• Parson created a worm to exploit a buffer overflow
issue
• Affected DCOM’s RPC service
– Worm could spread without users opening an attachment
Blaster Worm
• In addition to changing RPC service it would
– Change registry to launch msblast.exe
• Worm would launch a distributed denial of
service attack from infected computers
– Attack was against windowsupdate.com
• Sent messages to Bill Gates
Sven Jaschan
• Authored Sasser and Netsky worms
• Claims to have written them to remove
Mydoom and Bagle worms
• Worms were responsible for 70% of the
infections in 2004
Netsky
• Sent out as an email attachment
• Contained insults aimed at the author of
Mydoom and Bagle
• Other symptoms included “beeping” in the
early morning hours of specific dates
Sasser
• Would connect to computers through a
particular port that was often open by default
• Exploited a buffer overflow
• Would shut the computer down after
displaying a shutdown timer
Cyber Criminals – Proof of Concept
• After the turn of the century a new breed emerged
• They took the techniques employed by the
mischievous youth and used them for monetary gain
• These were the first real “cyber criminals”
– Farid Essebar
– Atilla Ekici
– Jeanson James Ancheta
Farid Essebar & Atilla Ekici
• The two people behind the Zotob computer worm
• Worm affected CNN, ABC News, NY Times, US
Dept of Homeland Security, …
• Intention was to facilitate credit card forgery
scams
Zotob
• Exploited vulnerability in Windows 2000
• Caused the computer to restart continuously
• Files would be created with every reboot
• Spyware was installed on the system
– The spyware remained after the virus was removed
• The goal was to facilitate scams (for money)
Jeanson James Ancheta
• First person to be arrested for controlling a
large number of hijacked computers
• Created a large Botnet
– Network of bots or “software robots”
• Offered his collection of bots for hire
• Leveraged rxbot to increase his network
Rxbot
• Contained a proxy server
• Server can be spawned by a remote attacker
• Typically used for denial of service attacks
Cyber Gangs
• “Organized” crime gets involved
• Coordinated attacks against high value targets
• Often involve groups and large sums of money
• Examples
– Yaron Bolondi
– Maria Zarubina
– Albert Gonzalez
Yaron Bolondi
• Part of a gang that attempted to steal £220
million from a Japanese bank
• Used keylogging to gain access to the bank’s
computers
• Software was installed on employees’ computers
– Via malware or other virus
Maria Zarubina
• Part of a gang that used cyber attacks as a means for
extortion
• Attacked British “bookmakers”
– Agreed to stop attacks if ransom was paid
• Used denial of service attacks to shut down gambling
sites
• Would then threaten additional attacks unless
payment was made
Albert Gonzalez
• Responsible for largest credit card theft in history
• Stole and resold more than 170 million cards
• Used SQL injection to introduce “malware
backdoors”
– These allowed packet sniffing attacks
• Targets included Target, TJ Maxx, Dave & Buster’s, 7-Eleven, JCPenney, …
ARP Spoofing
• Used to attack an Ethernet network
• Allows attacker to “sniff” data on a LAN and modify
or stop the traffic
• Attacker sends a spoofed ARP message to Ethernet
LAN
• “Man in the middle” attack
– Attacker’s computer masquerades as destination computer
and gets intended traffic
Advanced Persistent Threat
• Today we’ve started to see a new class of threat
emerge
• These threats are against specific high value targets
• They are characterized by coordinated activity taking
place over a long period of time
– The individual actions may seem isolated
• The perpetrator doesn’t act on the exploit until
sufficient penetration has been achieved
• Has anyone heard of Stuxnet?
• How about Gauss or Flame?
Software as a Weapon
• In 2010 Iran announced they put their nuclear
program on hold
– No one was sure why
• It turns out the reason was that more than 1000
centrifuges in their uranium enrichment facilities
were destroyed
• How were these centrifuges destroyed?
– By the first known weapon that was 100% software
Stuxnet
• Stuxnet was a worm that infected SCADA systems
made by Siemens
– Think power plant and power distribution control systems
• It was capable of
– Increasing the pressure inside nuclear reactors
– Switching off oil pipelines
• Additionally it would report that the systems were operating
normally
Sophisticated Attack
• Why is Stuxnet special?
• First, it didn’t use a forged security clearance
– It used a genuine security clearance that was stolen
• Second, it had a specific target
– It infected many systems worldwide but remained dormant until
it found the systems controlling the intended target
• Third, it exploited not 1, but 4 zero day vulnerabilities
Response
• Iran responded to the attack with an open call
for hackers to join the Iranian Revolutionary
Guard
• Iran now has reportedly amassed the 2nd
largest online army in the world
Side Note
• Stuxnet is now open source
• This is code that is capable of crashing power
plants and disrupting oil pipelines
• Go to YouTube and search for Stuxnet
– You’ll get many videos of people dissecting
Stuxnet …
Advanced Persistent Threats
• Stuxnet is an example of what we call “Advanced
Persistent Threats”
• In some cases exploits are not opportunistic
reactions to discovering a vulnerability
• They are coordinated multipronged attacks that can
take place over an extended period of time
Coordinated Attack
• Intruders will look for some way to find access
to a system
• They will then try to move laterally until they
are able to access the intended target
• This can take days, weeks, months, or even
years
Email
What’s the Point?
• Almost all of these incidents exploited
vulnerabilities
• These vulnerabilities came along with the
commercially available software used in the
attacked systems
• Vulnerabilities continue to exist in the
software that we use
Vulnerabilities
• Many organizations (legitimate and illegitimate) try to find these
vulnerabilities
– CERT is an example of such an organization
• Organizations like CERT would inform the developers of the
software of the vulnerability
• Historically companies were slow to react
• CERT didn’t want to release it publicly without a fix being available
• So CERT would notify the organization and then release the
vulnerability publicly after a given time elapsed
X Day Vulnerabilities
• Vulnerabilities are characterized by the time since
they were made public
– 1 day vulnerabilities were released 1 day ago
• The newer the vulnerability the less likely it is to be
patched
• Zero day vulnerabilities are those that the
manufacturer doesn’t yet know about
– Clearly these are the most attractive to attackers
Vulnerability Market
• A market has emerged for these vulnerabilities
• If you discover a vulnerability you can sell it
• The value of the vulnerability is determined by:
– The “day” of the vulnerability
– The number of instances of the software containing the
vulnerability
Selling The Vulnerability
• Many entities buy these vulnerabilities
– Governments (including the US)
– Organized crime syndicates
– Individuals
• Prices range from $10 - $250,000 or more
– Depending on the exclusivity of the sale as well as the value of the exploit
• Check out:
– http://www.forbes.com/sites/andygreenberg/2012/03/23/shopping-for-zero-days-an-price-list-for-hackers-secret-software-exploits/
– http://www.zdnet.com/blog/security/black-market-for-zero-day-vulnerabilities-still-thriving/2108
Exploit Auction Houses
• There are now auction houses that sell
vulnerabilities (or exploits)
– Like the eBay of exploits
– In fact exploits were originally sold on eBay
• It’s actually legal to sell these exploits
– Even though the attacks themselves may be
illegal
Exploit as a Service (EaaS)
• Believe it or not you can now get a service to
manage your attacks
• One issue if you’re going to launch an attack is
finding a “bulletproof” provider
– A provider willing to host a malware server
• These services will provide “exploit kits” and
manage the hosting
• In some cases they even offer analytics for the
consumer’s campaigns (think Google Analytics)
Widespread Adoption
• All of this has lowered the barrier to entry for
exploiting vulnerabilities
• There are large numbers of people with the
means and motive to attack any system online
• Furthermore secure practices are often not
followed
– See next slide
Many Systems Remain Vulnerable
• Remember the issues with OpenSSL that surfaced in early 2014?
– Despite widespread news reports, many systems continue to be vulnerable
• June 2014 survey of TLS vulnerabilities
Cloud Related Issues
• In many respects security in the cloud is not
different from security for a traditional system
• Some threats are magnified, and some
additional threats exist
• We’ll look at:
– VM sprawl
– Insecure interfaces or APIs
– Malicious insiders
– Shared resources
VM Sprawl
• VM creation is quick and easy
– It can be done in seconds without procuring hardware,
administrative knowledge, or securing permissions
• As a result it’s done often
– Sometimes for transient needs
• Once created the VM is often forgotten about
– It might still exist even if it is no longer doing any work
• Keeping track of the existing VMs is difficult to do
– It requires different processes than tracking physical assets
• This results in something called VM Sprawl
Consequences of VM Sprawl
• VM Sprawl is bad for many reasons
• First, it imposes additional overhead on the
overall solution
– The VM still costs money even if it is offline
• Second, it is less likely to be included in the
normal maintenance efforts
– Updates and patches might not be applied
• As a result the VM can remain vulnerable
Insecure Interfaces or APIs
• IaaS and PaaS providers expose a set of APIs
• These APIs are used by customers to:
– Provision
– Manage
– Orchestrate
– Monitor
– …
• The security of the cloud is dependent on the security
of these APIs
• These APIs must be designed in a way to resist
accidental and malicious attempts to circumvent policy
3rd Party APIs
• We need to trust the expertise and procedures
not only of the cloud providers but of 3rd party vendors
as well
• Organizations often layer capability on top of the
provided API in order to add value to the consumer e.g.
– Deployment tools
– Monitoring aggregation tools
– Data management services
– …
• The security of these providers also needs to be trusted
How Does This Work?
[Diagram: User, 3rd Party Service, Cloud Provider]
Malicious Insiders
• Malicious insiders are a known and significant
threat to corporate security
– E.g. former and disgruntled employees
• When deploying your application on the cloud
you need to worry about employees of the
cloud provider as well
Shared Resources
• When software running in a process within
a VM can elevate privileges sufficiently it
can “escape” the bounds of the VM
• This is called “guest to host VM escape”
• Once this happens the software is able to
control all of the instances within that
hypervisor
Hypervisor Vulnerabilities
• The most commonly used hypervisors have all
been exploited
• Vulnerabilities continue to be discovered in all of
the major hypervisors
– Discovered by both the good guys and bad guys
• Do a Google search on VM Escape for the latest
vulnerabilities …
Addressing Security Issues
• The strategies for dealing with security issues
typically fall into one of three categories
– Secure coding practices
– Processes and policy
– Architectural approaches
Secure Coding Practices
• Looking at the source of the vulnerabilities it may seem that secure
coding practices will solve the problem
• While this is true to some extent, as we said, these vulnerabilities
exist in most commercially available software
• We must therefore assume that our software is to some extent
insecure
• It’s also the case that we will miss issues
• Inevitably the software will have defects, will be used in a context
other than what was intended, or will be used with software that it
wasn’t intended to work with
Processes and Policy
• A large aspect of dealing with security includes
processes and procedures
• The security of the system is impacted by things
like:
– Physical security
– IT policy governing computers on the network
– Updating and patching procedures
– Organizational structure and access policies
• Defining appropriate practices is a key
component to security
Agenda
• What is security?
• Understanding the threat
• Architectural approaches to security
• Designing for security
• Summary
Security Strategies
• Security strategies fall in one of several categories
– Policy/process
– Secure coding practices
– Architectural
• We will now look at some architectural strategies
• The thing to keep in mind is that you cannot easily eliminate
all vulnerabilities
– Some of the approaches are aimed at minimizing vulnerabilities
– Some are aimed at reducing the impact if the vulnerabilities are
exploited
Resisting Attacks
• Resisting attacks is analogous to securing the
perimeter
• Strategies for resisting attacks include:
– Encryption
– Checking data integrity
– Limiting exposure
– Limiting access
Encryption
• Applied to data and communications can help
maintain confidentiality
• Can be symmetric
– Both parties use the same key
• Or asymmetric
– Public/private key
Encryption
• What kind of attack would encryption protect
against?
• What kind of attack would it not protect
against?
• What kind of security concern would it
address?
Data Integrity
• Attaching a checksum or hash to the data
can help detect whether the data has been
tampered with
• This additional data can be encrypted along
with or independently from the original data
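The checksum and hash approach can be sketched with Python's standard `hashlib` and `hmac` modules. This is a minimal illustration, assuming a hypothetical shared `SECRET_KEY`. A plain digest detects modification only if the attacker cannot also replace the digest, while a keyed digest (HMAC) cannot be recomputed without the key.

```python
import hashlib
import hmac

# Assumption: a key shared out of band; this value is for illustration only.
SECRET_KEY = b"example-key-not-for-production"


def checksum(data: bytes) -> str:
    """Plain SHA-256 digest of the data."""
    return hashlib.sha256(data).hexdigest()


def sign(data: bytes, key: bytes = SECRET_KEY) -> str:
    """Keyed digest (HMAC-SHA256) of the data."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(data: bytes, tag: str, key: bytes = SECRET_KEY) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(data, key), tag)


message = b"transfer 100 to account 42"
tag = sign(message)
```

Calling `verify(message, tag)` succeeds, while any change to the message makes verification fail.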
Data Integrity
• Think about data integrity concerns in the context of
some of the recent attacks
– Stuxnet
– Gauss
– …
• These techniques can be important for detecting an
attack
– Additional techniques might be needed to recover
Limiting Exposure
• Attacks depend on exploiting weaknesses to
gain access to data and services
• Limiting access to the attack surface limits
risk*
• The following are approaches to limiting
exposure
* Manadhata 2006
Client Data Storage
• Problem: many applications store data at
potentially untrusted clients.
– These clients could tamper with the data
• Solution: this pattern uses encryption to store
security-critical data client-side
Client Data Storage II
• Manual inspection of this data could reveal
details of the application that could be used to
compromise the site
Client Input Filters
• Problem: in many cases clients execute
outside the control of the system developer.
– These clients can be tampered with to behave in
an untrustworthy manner
• Solution: treat all data provided by clients as
suspect
Client Input Filters II
• Perform (or re-execute) data validity checks on the server
• Examine headers and URLs for malicious code
• Text input should be checked for scripts
• Calculated fields should be re-computed on the server
• Considerations:
– Use a symmetric key, as it is less computationally expensive
– The key should not be stored in a file
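The server-side checks listed above can be sketched as a single validation function. This is a minimal sketch, assuming a hypothetical order form and price list. The script check is deliberately crude; a production system would rely on a vetted sanitizer and output encoding rather than one regular expression.

```python
import re

PRICES = {"widget": 5, "gadget": 12}   # hypothetical server-side price list
SCRIPT_PATTERN = re.compile(r"<\s*script", re.IGNORECASE)


def validate_order(form: dict) -> dict:
    """Re-check a client-submitted order on the server; raise ValueError if suspect."""
    item = form.get("item")
    if item not in PRICES:
        raise ValueError("unknown item")
    try:
        quantity = int(form.get("quantity", ""))
    except (TypeError, ValueError):
        raise ValueError("quantity is not a number")
    if not 1 <= quantity <= 100:
        raise ValueError("quantity out of range")
    note = str(form.get("note", ""))
    if SCRIPT_PATTERN.search(note):
        raise ValueError("note contains script markup")
    # Ignore any total sent by the client and recompute it on the server.
    return {"item": item, "quantity": quantity, "note": note,
            "total": PRICES[item] * quantity}
```

A client that submits `{"item": "widget", "quantity": "3", "total": "0"}` gets back an order whose total is recomputed from the server's own prices.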
Trusted Proxy
• Problem: it may be necessary to expose
inadequately protected aspects of the system
to untrusted users
• Solution: create a trusted proxy that acts as a
buffer between the component and the users
Trusted Proxy II
• This proxy intercepts and filters all
communication
• In that way it can compensate for the lack
of protections
• Typically two options
– Filter requests for bad input
– Create a new request containing only the essential
parts of the old request
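The second option, building a new request from only the essential parts, can be sketched as an allowlist. The field names and permitted actions below are hypothetical; the point is that nothing from the raw request reaches the protected component unless it was copied deliberately.

```python
# Hypothetical essential fields and the type each must convert to.
ALLOWED_FIELDS = {"account_id": int, "action": str}
ALLOWED_ACTIONS = {"balance", "history"}


def rebuild_request(raw: dict) -> dict:
    """Build a new request from scratch, copying only vetted fields."""
    clean = {}
    for name, expected_type in ALLOWED_FIELDS.items():
        if name not in raw:
            raise ValueError(f"missing field: {name}")
        try:
            clean[name] = expected_type(raw[name])
        except (TypeError, ValueError):
            raise ValueError(f"bad value for field: {name}")
    if clean["action"] not in ALLOWED_ACTIONS:
        raise ValueError("action not permitted")
    return clean   # every other field in the raw request is dropped


def proxy(raw: dict, backend):
    """Forward only the rebuilt request to the protected component."""
    return backend(rebuild_request(raw))
```

Extra fields such as a `debug` flag never reach the backend because they are not copied into the rebuilt request.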
Single Access Point
• Problem: a system is more difficult to secure if it has multiple
entry points
• With multiple entry points:
– You may need to separately secure multiple applications
– You may have duplicate authentication logic to maintain
– Unix is an example with multiple entry points
– Different services can be set up on different machines
Single Access Point II
• The solution is to create a single point of entry
• A session is then created
• This allows global tracking of session state and
authorization information
• There is a single “gateway” or “check point” through
which each user’s login is validated
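A single access point can be sketched as one gateway object that validates the login, creates the session, and dispatches every later request. The `USERS` store and `HANDLERS` table are hypothetical, and passwords are kept in plaintext only to keep the sketch short.

```python
import hmac
import secrets

USERS = {"alice": "correct horse"}   # hypothetical credential store
HANDLERS = {"profile": lambda user: f"profile of {user}"}   # hypothetical internal services


class Gateway:
    """The only entry point: every request is authenticated here before dispatch."""

    def __init__(self):
        self._sessions = {}   # session token -> user name

    def login(self, user: str, password: str) -> str:
        expected = USERS.get(user)
        if expected is None or not hmac.compare_digest(
                expected.encode(), password.encode()):
            raise PermissionError("login failed")
        token = secrets.token_urlsafe(16)   # unguessable session identifier
        self._sessions[token] = user
        return token

    def request(self, token: str, service: str) -> str:
        user = self._sessions.get(token)
        if user is None:
            raise PermissionError("no valid session")
        return HANDLERS[service](user)
```

Because the services are reachable only through `Gateway.request`, the authentication logic lives in one place.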
Single Access Point III
• Which aspects of security does this pattern
address?
• What are some of the implications of using
this pattern?
Partitioned Application
• Problem: large complex applications often
require root privileges in some portions of the
application
– If these elements are compromised the entire
system is at risk
• Solution: partition the large application into
smaller elements each adhering to least
privilege principle
Partitioned Application II
• A partitioned application is more difficult to manage
• Additionally performance can suffer as
interprocess communication increases
• Additional points of entry are introduced
– Even though the impact of being compromised is
diminished
Password Propagation
• Problem: most applications manage user data
under a single database account
– Thus if the single account is compromised all user
data can be accessed
• Solution: the user’s password is required with
each backend database request
Password Propagation II
• This is essentially an instance of application
partitioning
• The front end will cache the password and
provide it with each back end request
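A minimal sketch of the idea, assuming a hypothetical in-memory `Backend` in place of a real database. The back end checks the user's own password on every read, so compromising the front end does not by itself expose every user's data. In practice this is often done with per-user database accounts rather than application code like this.

```python
import hashlib
import hmac
import os


class Backend:
    """Hypothetical data store that checks the user's password on every request."""

    def __init__(self):
        self._accounts = {}   # user -> (salt, password hash, data)

    def _hash(self, password: str, salt: bytes) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def create(self, user: str, password: str, data: str) -> None:
        salt = os.urandom(16)
        self._accounts[user] = (salt, self._hash(password, salt), data)

    def read(self, user: str, password: str) -> str:
        salt, stored, data = self._accounts[user]
        if not hmac.compare_digest(stored, self._hash(password, salt)):
            raise PermissionError("bad credentials")
        return data


class FrontEnd:
    """Caches the password for the session and forwards it with each back-end call."""

    def __init__(self, backend: Backend, user: str, password: str):
        self._backend = backend
        self._user = user
        self._password = password

    def show_data(self) -> str:
        return self._backend.read(self._user, self._password)
```

The trade-off is that the front end now holds the password for the life of the session, so the cache itself becomes something to protect.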
Limiting Access
• You can think of this as “securing the
perimeter”
• This is a widely used approach of limiting
access to data and services
• The following are examples of techniques for
limiting access
Session
• Background: Systems need to keep track of
each user’s login status, level of authorization, and
so forth
– The Singleton pattern is often used for this
– This pattern can be difficult to use when the
system supports concurrent logins
• The solution is to create a “session” object to
hold these global variables
Session II
• This session object is accessible by all
components of the application
• This facilitates having a common interface for
accessing this information
– Easier to implement and maintain than having a
number of variables passed around
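The session object can be sketched as a small data class. The field names are assumptions chosen for illustration. Each login gets its own instance, which is what a Singleton cannot offer when logins are concurrent.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Per-login state that would otherwise live in global variables."""
    user: str
    authenticated: bool = False
    roles: set = field(default_factory=set)


def greeting(session: Session) -> str:
    # Components read login state through the one session interface.
    return f"Hello, {session.user}" if session.authenticated else "Please log in"


# Two concurrent logins each get their own object.
alice = Session("alice", authenticated=True, roles={"admin"})
bob = Session("bob")
```

Components receive the one `Session` argument instead of a growing list of separate variables.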
Roles
• Background: when an application supports
many types of users, security becomes more
complicated
– It can be difficult to track and maintain all of the
things that every user has access to
• It eases implementation issues if a smaller
number of “roles” are created
• Each role has a given set of rights
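The role idea can be sketched with two lookup tables: rights are attached to roles, and users are attached to roles. The role names, rights, and users below are hypothetical.

```python
# Hypothetical roles and the rights each one grants.
ROLE_RIGHTS = {
    "viewer": {"read"},
    "editor": {"read", "write"},
    "admin": {"read", "write", "delete", "manage_users"},
}

# Hypothetical assignment of users to roles.
USER_ROLES = {
    "alice": {"admin"},
    "bob": {"viewer"},
    "carol": {"viewer", "editor"},
}


def is_allowed(user: str, right: str) -> bool:
    """A user holds a right if any of the user's roles grants it."""
    return any(right in ROLE_RIGHTS[role] for role in USER_ROLES.get(user, ()))
```

Changing what editors may do is then a single edit to `ROLE_RIGHTS` rather than an edit per user.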
Roles II
• What kinds of security does this address?
• Implications?
Account Lockout
• Problem: a growing number of password-guessing tools can be
used to compromise systems that require user authentication
• Solution: lock the user account after some number of
incorrect attempts
• How it works:
– The system records each incorrect login attempt
– When a predetermined number of attempts is reached the account is
locked
– Each time there is a correct login the count of incorrect attempts is reset
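The three steps above can be sketched as a small guard class. The threshold of three attempts is an assumption (the slides do not give a number), and passwords are compared in plaintext only for brevity.

```python
MAX_ATTEMPTS = 3   # assumed threshold


class LoginGuard:
    """Counts consecutive failed logins per user and locks the account at the limit."""

    def __init__(self, passwords: dict):
        self._passwords = passwords   # user -> password
        self._failures = {}           # user -> consecutive failed attempts

    def is_locked(self, user: str) -> bool:
        return self._failures.get(user, 0) >= MAX_ATTEMPTS

    def login(self, user: str, password: str) -> bool:
        if self.is_locked(user):
            return False                      # locked accounts reject everything
        if self._passwords.get(user) == password:
            self._failures[user] = 0          # a correct login resets the count
            return True
        self._failures[user] = self._failures.get(user, 0) + 1
        return False
```

Once an account is locked, even the correct password is refused, which is the denial-of-service risk this pattern carries.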
Account Lockout II
• Issues:
– Doesn’t address the situation where different user
IDs are used
– Usability can be adversely affected
– Availability can be adversely affected
• Can facilitate denial of service
Detecting Attacks
• Detect Intrusion
• Detect Denial of Service
Minefield
• Problem: hackers are likely familiar with the
vulnerabilities of various configurations
– Once they figure out your setup they’ll know how
to get in
• Solution: change your setup to a non-standard
configuration
Minefield II
• Even small changes can increase the effort enough to
discourage hackers
• You can do things like:
– Alter file structure
– Rename common administrative commands
– Instrument commands to alert administrators
– Add booby traps that will recognize tampering
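Instrumenting commands and planting a booby trap can be sketched with a decorator. The `ALERTS` list stands in for a real alerting channel, and the `add_user` and `su` commands are hypothetical.

```python
import functools

ALERTS = []   # stand-in for a real alerting channel such as email or a pager


def instrumented(func):
    """Wrap a command so that every invocation alerts administrators."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        ALERTS.append(f"{func.__name__} called with args={args!r}")
        return func(*args, **kwargs)
    return wrapper


@instrumented
def add_user(name: str) -> str:
    """Hypothetical administrative command."""
    return f"user {name} added"


@instrumented
def su(*_args) -> str:
    """Booby trap: a decoy left under a well-known name.

    Legitimate administrators use the renamed command, so any call here is suspect.
    """
    return "permission denied"
```

An intruder who tries the standard command name both fails and announces their presence.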
Secure Assertion
• Problem: the activities performed by a
malicious intruder may look legitimate at the
local level
– E.g. transferring money from an account
• Solution: create a framework for reporting
specific activities that violate assertions
Secure Assertion II
• The application developer is in a position to determine
activities that may be suspicious
– They can create assertions
• If the application is being developed in an environment
that supports exceptions, assertion violations could be
reported in a similar fashion
• The violations could be collected globally to provide
additional insight on the current activities
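In a language with exceptions, a secure assertion can be sketched as a helper that records the violation in a global collection and then raises. The `transfer` function and its `DAILY_LIMIT` are hypothetical examples of application knowledge that a network-level monitor would not have.

```python
class SecurityAssertionViolation(Exception):
    """Raised when an application-level security assertion fails."""


VIOLATIONS = []   # global collection point; a real system would feed a monitor


def secure_assert(condition: bool, description: str) -> None:
    """Record and report the violation if the condition does not hold."""
    if not condition:
        VIOLATIONS.append(description)
        raise SecurityAssertionViolation(description)


DAILY_LIMIT = 10_000   # hypothetical business rule


def transfer(balance: int, amount: int) -> int:
    """Return the new balance, asserting that the transfer looks legitimate."""
    secure_assert(amount > 0, "non-positive transfer amount")
    secure_assert(amount <= DAILY_LIMIT, f"transfer of {amount} exceeds daily limit")
    secure_assert(amount <= balance, "transfer exceeds balance")
    return balance - amount
```

Reviewing `VIOLATIONS` across all users can reveal a pattern that no single failed request would show.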
Recovering From Attacks
• Availability tactics
– We will discuss these in a future class
• Auditing
– Keeps a trail of the users and their actions
– Helps to maintain a record of the attack
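An audit trail can be sketched as an append-only list of timestamped entries. The class and method names are assumptions; a real audit log would also be written to storage that the audited application cannot modify.

```python
from datetime import datetime, timezone


class AuditLog:
    """Append-only trail of who did what, and when."""

    def __init__(self):
        self._entries = []

    def record(self, user: str, action: str, when=None) -> None:
        timestamp = when or datetime.now(timezone.utc)
        self._entries.append((timestamp, user, action))

    def entries(self) -> list:
        return list(self._entries)   # return a copy so callers cannot edit the trail

    def actions_by(self, user: str) -> list:
        return [action for _, who, action in self._entries if who == user]


log = AuditLog()
log.record("alice", "login")
log.record("mallory", "export customer table")
log.record("alice", "logout")
```

After an incident, `log.actions_by("mallory")` reconstructs what that account did.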
Network Address Blacklist
• Problem: all systems with an online presence
are subject to attack
– Locking individual accounts doesn’t address
systemic attacks
• Solution: block network addresses that are the
source of attack
Network Address Blacklist II
• The server will monitor requests from clients
– Any suspicious requests will be logged
– If there are repeated suspicious requests the address is
blocked
• One question is where to implement the check
– Network (e.g. firewall) or application
• Performance as list grows can be an issue
• Can still be subject to denial of service attack
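An application-level version of the check can be sketched as follows. The threshold of three suspicious requests is an assumption, and the addresses used with it should be treated as placeholders.

```python
from collections import Counter

BLOCK_THRESHOLD = 3   # assumed number of suspicious requests before blocking


class AddressBlacklist:
    """Logs suspicious requests and blocks addresses that repeat them."""

    def __init__(self):
        self._suspicious = Counter()   # address -> number of suspicious requests
        self._blocked = set()
        self.log = []                  # (address, reason) for every suspicious request

    def report_suspicious(self, address: str, reason: str) -> None:
        self.log.append((address, reason))
        self._suspicious[address] += 1
        if self._suspicious[address] >= BLOCK_THRESHOLD:
            self._blocked.add(address)

    def is_blocked(self, address: str) -> bool:
        # A set keeps the lookup cost flat as the list of blocked addresses grows.
        return address in self._blocked
```

Doing the same check in a firewall moves the cost off the application but loses the application's knowledge of what counts as suspicious.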
Agenda
• What is security?
• Understanding the threat
• Architectural approaches to security
• Designing for security
• Summary
So How Do We Decide?
• There are many options, which ones are
required?
• What are the side effects of selecting these
security mechanisms?
Fit for Purpose
• It is (hopefully) clear that each of these techniques
addresses a different concern
• What concerns does your organization have?
– This depends on the business assets that need
protection
– And the ways in which these assets could be
compromised given the system
Threat Modeling
Threat Modeling and Analysis in a nutshell:
– Identify the business asset to protect
– Brainstorm the known threats to the system
– Rank the threats by decreasing risk
– Choose techniques to mitigate the threats
– Choose appropriate technologies from the identified
techniques
Business Asset
• The reason for security is to protect some
aspect of the business
• You need to identify those aspects of the
business that need protection
• You also need to determine what “protection”
means
Brainstorm Threats
• Given a particular design what might happen to
compromise the business asset?
• You should think about these from two perspectives
– Likelihood
– Impact
• At this point you don’t worry about whether they need
mitigation
Rank the Threats
• Based on the likelihood and the impact you
can determine the “risk exposure”
– Look at risk management techniques
• Prioritize the risks according to the exposure
• Determine the threshold that requires
mitigation
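The ranking step can be sketched with a common risk management convention in which exposure is likelihood multiplied by impact. The threats, the 1 to 5 scales, and the threshold below are illustrative assumptions, not values from the slides.

```python
# (name, likelihood 1-5, impact 1-5): illustrative values only
THREATS = [
    ("Defaced help page", 3, 1),
    ("SQL injection", 4, 5),
    ("Insider data theft", 1, 5),
    ("Stolen API key", 3, 4),
]
MITIGATION_THRESHOLD = 6   # assumed: exposures above this require mitigation


def rank_threats(threats):
    """Return (name, exposure) pairs sorted by decreasing risk exposure."""
    scored = [(name, likelihood * impact) for name, likelihood, impact in threats]
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranked = rank_threats(THREATS)
needs_mitigation = [name for name, exposure in ranked
                    if exposure > MITIGATION_THRESHOLD]
```

With these numbers, SQL injection and the stolen API key fall above the threshold, while the other two are tolerated.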
Mitigation Techniques
• Look for generic patterns that will mitigate the
risks
• Mitigate means lower the risk exposure to a
tolerable level
– You lower the exposure by reducing the likelihood
or reducing the impact
– A tolerable level means below the threshold
defined previously
Choose Technologies
• Basically you need to map the generic pattern to
some concrete solution
• This is where you factor in the costs
• Costs could come in terms of level of effort to
implement
• Costs could also come in terms of tradeoffs
– You might need to iterate these steps
Consider Trade Offs
• Most of these mechanisms adversely impact
performance
– Blindly selecting these capabilities can bring the
system to a standstill
• They also have an impact on the flexibility of
the system
• Balancing concerns is key
References
• STRIDE: http://msdn2.microsoft.com/en-us/library/aa302419.aspx
• Hinton, Hondo, Hutchison: Security Patterns within a Service Oriented Architecture,
IBM, 2005
• Hafiz, Johnson: Security Patterns and their Classification Schemes
• Thomas Erl: Service Oriented Architecture, Chapters 4 and 11
• SEI/CERT OCTAVE: Operationally Critical Threat, Asset, and Vulnerability
Evaluation: http://www.cert.org/octave
• Manadhata et al.: Measuring the Attack Surfaces of Two FTP Daemons, 2006
Questions??

Contenu connexe

Tendances

Design principles of scalable, distributed systems
Design principles of scalable, distributed systemsDesign principles of scalable, distributed systems
Design principles of scalable, distributed systemsTinniam V Ganesh (TV)
 
Load Balancing In Distributed Computing
Load Balancing In Distributed ComputingLoad Balancing In Distributed Computing
Load Balancing In Distributed ComputingRicha Singh
 
The Architect's Two Hats
The Architect's Two HatsThe Architect's Two Hats
The Architect's Two HatsBen Stopford
 
Building large scale, job processing systems with Scala Akka Actor framework
Building large scale, job processing systems with Scala Akka Actor frameworkBuilding large scale, job processing systems with Scala Akka Actor framework
Building large scale, job processing systems with Scala Akka Actor frameworkVignesh Sukumar
 
Load Balancing in Cloud Computing Environment: A Comparative Study of Service...
Load Balancing in Cloud Computing Environment: A Comparative Study of Service...Load Balancing in Cloud Computing Environment: A Comparative Study of Service...
Load Balancing in Cloud Computing Environment: A Comparative Study of Service...Eswar Publications
 
Designing Distributed Systems: Google Cas Study
Designing Distributed Systems: Google Cas StudyDesigning Distributed Systems: Google Cas Study
Designing Distributed Systems: Google Cas StudyMeysam Javadi
 
Designing large scale distributed systems
Designing large scale distributed systemsDesigning large scale distributed systems
Designing large scale distributed systemsAshwani Priyedarshi
 
Open west 2015 talk ben coverston
Open west 2015 talk ben coverstonOpen west 2015 talk ben coverston
Open west 2015 talk ben coverstonbcoverston
 
A load balancing model based on cloud partitioning
A load balancing model based on cloud partitioningA load balancing model based on cloud partitioning
A load balancing model based on cloud partitioningLavanya Vigrahala
 
An efficient approach for load balancing using dynamic ab algorithm in cloud ...
An efficient approach for load balancing using dynamic ab algorithm in cloud ...An efficient approach for load balancing using dynamic ab algorithm in cloud ...
An efficient approach for load balancing using dynamic ab algorithm in cloud ...bhavikpooja
 
MariaDB High Availability Webinar
MariaDB High Availability WebinarMariaDB High Availability Webinar
MariaDB High Availability WebinarMariaDB plc
 
Bosun Monitoring Talk at LISA14
Bosun Monitoring Talk at LISA14Bosun Monitoring Talk at LISA14
Bosun Monitoring Talk at LISA14Kyle Brandt
 
Distributed systems and consistency
Distributed systems and consistencyDistributed systems and consistency
Distributed systems and consistencyseldo
 
Distributed Database practicals
Distributed Database practicals Distributed Database practicals
Distributed Database practicals Vrushali Lanjewar
 
Fault tolerant mechanisms in Big Data
Fault tolerant mechanisms in Big DataFault tolerant mechanisms in Big Data
Fault tolerant mechanisms in Big DataKaran Pardeshi
 
Running MariaDB in multiple data centers
Running MariaDB in multiple data centersRunning MariaDB in multiple data centers
Running MariaDB in multiple data centersMariaDB plc
 

Tendances (20)

Design principles of scalable, distributed systems
Design principles of scalable, distributed systemsDesign principles of scalable, distributed systems
Design principles of scalable, distributed systems
 
Load Balancing In Distributed Computing
Load Balancing In Distributed ComputingLoad Balancing In Distributed Computing
Load Balancing In Distributed Computing
 
The Architect's Two Hats
The Architect's Two HatsThe Architect's Two Hats
The Architect's Two Hats
 
Load balancing
Load balancingLoad balancing
Load balancing
 
Building large scale, job processing systems with Scala Akka Actor framework
Building large scale, job processing systems with Scala Akka Actor frameworkBuilding large scale, job processing systems with Scala Akka Actor framework
Building large scale, job processing systems with Scala Akka Actor framework
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Load Balancing in Cloud Computing Environment: A Comparative Study of Service...
Load Balancing in Cloud Computing Environment: A Comparative Study of Service...Load Balancing in Cloud Computing Environment: A Comparative Study of Service...
Load Balancing in Cloud Computing Environment: A Comparative Study of Service...
 
Designing Distributed Systems: Google Cas Study
Designing Distributed Systems: Google Cas StudyDesigning Distributed Systems: Google Cas Study
Designing Distributed Systems: Google Cas Study
 
Designing large scale distributed systems
Designing large scale distributed systemsDesigning large scale distributed systems
Designing large scale distributed systems
 
Open west 2015 talk ben coverston
Open west 2015 talk ben coverstonOpen west 2015 talk ben coverston
Open west 2015 talk ben coverston
 
A load balancing model based on cloud partitioning
A load balancing model based on cloud partitioningA load balancing model based on cloud partitioning
A load balancing model based on cloud partitioning
 
An efficient approach for load balancing using dynamic ab algorithm in cloud ...
An efficient approach for load balancing using dynamic ab algorithm in cloud ...An efficient approach for load balancing using dynamic ab algorithm in cloud ...
An efficient approach for load balancing using dynamic ab algorithm in cloud ...
 
MariaDB High Availability Webinar
MariaDB High Availability WebinarMariaDB High Availability Webinar
MariaDB High Availability Webinar
 
Bosun Monitoring Talk at LISA14
Bosun Monitoring Talk at LISA14Bosun Monitoring Talk at LISA14
Bosun Monitoring Talk at LISA14
 
Distributed systems and consistency
Distributed systems and consistencyDistributed systems and consistency
Distributed systems and consistency
 
My Dissertation 2016
My Dissertation 2016My Dissertation 2016
My Dissertation 2016
 
Distributed Database practicals
Distributed Database practicals Distributed Database practicals
Distributed Database practicals
 
Fault tolerant mechanisms in Big Data
Fault tolerant mechanisms in Big DataFault tolerant mechanisms in Big Data
Fault tolerant mechanisms in Big Data
 
Running MariaDB in multiple data centers
Running MariaDB in multiple data centersRunning MariaDB in multiple data centers
Running MariaDB in multiple data centers
 
Big Data for QAs
Big Data for QAsBig Data for QAs
Big Data for QAs
 

En vedette

Introduction to dev ops
Introduction to dev opsIntroduction to dev ops
Introduction to dev opsLen Bass
 
MOSFETs (10EC63) Notes for Electronics & Communication Engineering Students o...
MOSFETs (10EC63) Notes for Electronics & Communication Engineering Students o...MOSFETs (10EC63) Notes for Electronics & Communication Engineering Students o...
MOSFETs (10EC63) Notes for Electronics & Communication Engineering Students o...Hanumantha Raju
 
Microelectronic Circuits Notes (10EC63) by Dr. M. C. Hanumantharaju of BMS In...
Microelectronic Circuits Notes (10EC63) by Dr. M. C. Hanumantharaju of BMS In...Microelectronic Circuits Notes (10EC63) by Dr. M. C. Hanumantharaju of BMS In...
Microelectronic Circuits Notes (10EC63) by Dr. M. C. Hanumantharaju of BMS In...BMS Institute of Technology and Management
 
EASA PART-66 MODULE 5.3 : DATA CONVERSION
EASA PART-66 MODULE 5.3 : DATA CONVERSIONEASA PART-66 MODULE 5.3 : DATA CONVERSION
EASA PART-66 MODULE 5.3 : DATA CONVERSIONsoulstalker
 
Principles of software architecture design
Principles of software architecture designPrinciples of software architecture design
Principles of software architecture designLen Bass
 
The logic gate circuit
The logic gate circuitThe logic gate circuit
The logic gate circuitroni Febriandi
 
Microelectronic Circuits (10EC63) Notes for Visvesvaraya Technological Univer...
Microelectronic Circuits (10EC63) Notes for Visvesvaraya Technological Univer...Microelectronic Circuits (10EC63) Notes for Visvesvaraya Technological Univer...
Microelectronic Circuits (10EC63) Notes for Visvesvaraya Technological Univer...BMS Institute of Technology and Management
 
Introduction to VLSI
Introduction to VLSI Introduction to VLSI
Introduction to VLSI illpa
 
Unit 6 Operational Amplifiers Notes by Dr. M. C. Hanumantharaju of BMSIT Bang...
Unit 6 Operational Amplifiers Notes by Dr. M. C. Hanumantharaju of BMSIT Bang...Unit 6 Operational Amplifiers Notes by Dr. M. C. Hanumantharaju of BMSIT Bang...
Unit 6 Operational Amplifiers Notes by Dr. M. C. Hanumantharaju of BMSIT Bang...BMS Institute of Technology and Management
 

En vedette (9)

Introduction to dev ops
Introduction to dev opsIntroduction to dev ops
Introduction to dev ops
 
MOSFETs (10EC63) Notes for Electronics & Communication Engineering Students o...
MOSFETs (10EC63) Notes for Electronics & Communication Engineering Students o...MOSFETs (10EC63) Notes for Electronics & Communication Engineering Students o...
MOSFETs (10EC63) Notes for Electronics & Communication Engineering Students o...
 
Microelectronic Circuits Notes (10EC63) by Dr. M. C. Hanumantharaju of BMS In...
Microelectronic Circuits Notes (10EC63) by Dr. M. C. Hanumantharaju of BMS In...Microelectronic Circuits Notes (10EC63) by Dr. M. C. Hanumantharaju of BMS In...
Microelectronic Circuits Notes (10EC63) by Dr. M. C. Hanumantharaju of BMS In...
 
EASA PART-66 MODULE 5.3 : DATA CONVERSION
EASA PART-66 MODULE 5.3 : DATA CONVERSIONEASA PART-66 MODULE 5.3 : DATA CONVERSION
EASA PART-66 MODULE 5.3 : DATA CONVERSION
 
Principles of software architecture design
Principles of software architecture designPrinciples of software architecture design
Principles of software architecture design
 
The logic gate circuit
The logic gate circuitThe logic gate circuit
The logic gate circuit
 
Microelectronic Circuits (10EC63) Notes for Visvesvaraya Technological Univer...
Microelectronic Circuits (10EC63) Notes for Visvesvaraya Technological Univer...Microelectronic Circuits (10EC63) Notes for Visvesvaraya Technological Univer...
Microelectronic Circuits (10EC63) Notes for Visvesvaraya Technological Univer...
 
Introduction to VLSI
Introduction to VLSI Introduction to VLSI
Introduction to VLSI
 
Unit 6 Operational Amplifiers Notes by Dr. M. C. Hanumantharaju of BMSIT Bang...
Unit 6 Operational Amplifiers Notes by Dr. M. C. Hanumantharaju of BMSIT Bang...Unit 6 Operational Amplifiers Notes by Dr. M. C. Hanumantharaju of BMSIT Bang...
Unit 6 Operational Amplifiers Notes by Dr. M. C. Hanumantharaju of BMSIT Bang...
 

Similaire à Architecting for the cloud elasticity security

Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and HadoopEventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and HadoopAyon Sinha
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesDavid Martínez Rego
 
Architecting for the cloud scability-availability
Architecting for the cloud scability-availabilityArchitecting for the cloud scability-availability
Architecting for the cloud scability-availabilityLen Bass
 
02 Models of Distribution Systems.pdf
02 Models of Distribution Systems.pdf02 Models of Distribution Systems.pdf
02 Models of Distribution Systems.pdfRobeliaJoyVillaruz
 
Limitations of memory system performance
Limitations of memory system performanceLimitations of memory system performance
Limitations of memory system performanceSyed Zaid Irshad
 
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInLinkedIn
 
Cloud Computing - Geektalk
Cloud Computing - GeektalkCloud Computing - Geektalk
Cloud Computing - GeektalkMalisa Ncube
 
UNIT II DIS.pptx
UNIT II DIS.pptxUNIT II DIS.pptx
UNIT II DIS.pptxSamPrem3
 
Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...
Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...
Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...Continuent
 
Openstack meetup lyon_2017-09-28
Openstack meetup lyon_2017-09-28Openstack meetup lyon_2017-09-28
Openstack meetup lyon_2017-09-28Xavier Lucas
 
An Introduction to Cloud Computing and Lates Developments.ppt
An Introduction to Cloud Computing and Lates Developments.pptAn Introduction to Cloud Computing and Lates Developments.ppt
An Introduction to Cloud Computing and Lates Developments.pptHarshalUbale2
 
Architectural Tactics for Large Scale Systems
Architectural Tactics for Large Scale SystemsArchitectural Tactics for Large Scale Systems
Architectural Tactics for Large Scale SystemsLen Bass
 
CSA unit5.pptx
CSA unit5.pptxCSA unit5.pptx
CSA unit5.pptxAbcvDef
 
Anatomy behind Fast Data Applications.pptx
Anatomy behind Fast Data Applications.pptxAnatomy behind Fast Data Applications.pptx
Anatomy behind Fast Data Applications.pptxdusavamsikrisna
 

Similaire à Architecting for the cloud elasticity security (20)

Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and HadoopEventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming Architectures
 
Architecting for the cloud scability-availability
Architecting for the cloud scability-availabilityArchitecting for the cloud scability-availability
Architecting for the cloud scability-availability
 
02 Models of Distribution Systems.pdf
02 Models of Distribution Systems.pdf02 Models of Distribution Systems.pdf
02 Models of Distribution Systems.pdf
 
Limitations of memory system performance
Limitations of memory system performanceLimitations of memory system performance
Limitations of memory system performance
 
Lecture1
Lecture1Lecture1
Lecture1
 
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
 
Cloud Computing - Geektalk
Cloud Computing - GeektalkCloud Computing - Geektalk
Cloud Computing - Geektalk
 
UNIT II DIS.pptx
UNIT II DIS.pptxUNIT II DIS.pptx
UNIT II DIS.pptx
 
Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...
Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...
Webinar Slides: Tungsten Connector / Proxy – The Secret Sauce Behind Zero-Dow...
 
Openstack meetup lyon_2017-09-28
Openstack meetup lyon_2017-09-28Openstack meetup lyon_2017-09-28
Openstack meetup lyon_2017-09-28
 
Chap2 slides
Chap2 slidesChap2 slides
Chap2 slides
 
An Introduction to Cloud Computing and Lates Developments.ppt
An Introduction to Cloud Computing and Lates Developments.pptAn Introduction to Cloud Computing and Lates Developments.ppt
An Introduction to Cloud Computing and Lates Developments.ppt
 
Architectural Tactics for Large Scale Systems
Architectural Tactics for Large Scale SystemsArchitectural Tactics for Large Scale Systems
Architectural Tactics for Large Scale Systems
 
Real time database
Real time databaseReal time database
Real time database
 
CSA unit5.pptx
CSA unit5.pptxCSA unit5.pptx
CSA unit5.pptx
 
Replication in Distributed Systems
Replication in Distributed SystemsReplication in Distributed Systems
Replication in Distributed Systems
 
Anatomy behind Fast Data Applications.pptx
Anatomy behind Fast Data Applications.pptxAnatomy behind Fast Data Applications.pptx
Anatomy behind Fast Data Applications.pptx
 
Database System Architectures
Database System ArchitecturesDatabase System Architectures
Database System Architectures
 
Introduction
IntroductionIntroduction
Introduction
 

Architecting for the cloud elasticity security

  • 1. Architecting for the Cloud Len and Matt Bass Elasticity
  • 2. Link to Yesterday’s lectures http://www.slideshare.net/lenbass/architecting-for-the-cloud-scabilityavailability
  • 3. Topics • Scalability is about acquiring resources, but once they are acquired they still must be used. • Elasticity is about how to use the resources. • This requires understanding concurrency, state, and their interactions.
  • 4. What is concurrency? • Concurrency means performing several activities simultaneously. • Concurrency is used to improve performance.
  • 5. How do concurrent activities come to be? • Explicitly through your code creating a new thread or process. • Implicitly through some support system creating a new thread or process – Operating system – Web server – Database management system • Implicitly through the infrastructure creating a new virtual machine – Elasticity in the cloud – During deployment of your system
  • 6. Key concepts • Atomicity – An atomic operation cannot be divided. It is all or nothing. • Time – It takes time to perform an operation: computation, messages transferred over a network, reading/writing information from a disk (rotating or solid state). • Dependency – Coordination among concurrent activities is necessary if they share resources or results. • Problems arise because operations take time and can be interrupted, i.e. they are not atomic.
  • 7. Synchronous vs asynchronous • Synchronous coordination between two concurrent processes means that process A sends a message to process B and waits for a response. • Asynchronous coordination means that process A does not wait for a response. – It can poll for a response – A response from process B can be sent as an event. • In either case, coordination takes time, so coordination is not an atomic operation.
  • 8. Some problems with concurrent activities • Time stamps – Many protocols put a time stamp on messages for error detection and ordering purposes. – Time stamps are often used to identify log messages used for debugging problems. – In some environments, e.g. the stock market, trades must be satisfied in the sequence in which they arrive. • Race conditions – two processes simultaneously access the same resource. • Inconsistency – if two activities are performed simultaneously, data may become inconsistent.
  • 9. Clock synchronization • Suppose two different computers are connected via a network. How do they synchronize their clocks? • If one computer sends its time reading to another, it takes time for the message to arrive. • NTP (Network Time Protocol) can be used to synchronize time on a collection of computers. – Accurate to around 1 millisecond in local area networks – Accurate to around 10 milliseconds over the public internet – Congestion can cause errors of 100 milliseconds or more.
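The slide notes that a time reading is already stale when it arrives because the message takes time to travel. One common way to compensate, sketched below, is to time a full request/response exchange and assume the outbound and return trips take equally long. The struct and function names are hypothetical, and this is only the basic offset arithmetic, not an implementation of NTP itself.

```c
#include <assert.h>

/* Timestamps for one request/response exchange, in microseconds.
 * t0: client sends the request      (client clock)
 * t1: server receives the request   (server clock)
 * t2: server sends the reply        (server clock)
 * t3: client receives the reply     (client clock) */
typedef struct {
    long long t0, t1, t2, t3;
} exchange_t;

/* Time spent on the network, excluding server processing time. */
long long round_trip_delay(exchange_t e) {
    assert(e.t3 >= e.t0 && e.t2 >= e.t1);
    return (e.t3 - e.t0) - (e.t2 - e.t1);
}

/* Estimated offset of the server clock relative to the client clock.
 * Assumption: the outbound and return trips take the same time. */
long long clock_offset(exchange_t e) {
    return ((e.t1 - e.t0) + (e.t2 - e.t3)) / 2;
}
```

If the two directions are not equally fast, the estimate is off by half the difference between them, which is one reason congestion degrades accuracy.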
  • 10. Suppose NTP is insufficiently accurate • The financial industry is spending hundreds of millions of dollars to reduce latency between Chicago and New York by 3 milliseconds. – Well within the error range of NTP • GPS time is accurate within – 14 nanoseconds (theoretically) – 100 nanoseconds (mostly) • Timestamp messages with GPS time – Used by electric companies to measure phase angle – Used by Google to coordinate time across all of their distributed systems. – Requires specialized hardware and installation not yet cheaply available.
  • 11. Example of a race condition • Suppose withdrawals are being made from a bank account. If two users withdraw simultaneously, the following sequence can occur:
      User 1                    User 2                    Acct amount
                                                          1000
      Read account (1000)                                 1000
                                Read account (1000)       1000
      Withdraw 100 (900)                                  1000
      Write new amount (900)                              900
                                Withdraw 100 (900)        900
                                Write new amount (900)    900
    • Both withdrawals succeed, but the final amount is 900 rather than 800.
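The sequence above can be replayed deterministically, without real threads, by writing out the read, withdraw and write steps in the same order. This is a minimal sketch; `user_t` and the function names are made up for illustration, and a plain variable stands in for the stored account.

```c
#include <assert.h>

/* Shared account balance, standing in for the stored account record. */
static int balance;

/* Each user works on a private copy between the read and the write. */
typedef struct {
    int local_amount;
} user_t;

void read_account(user_t *u) { u->local_amount = balance; }

void withdraw(user_t *u, int amount) {
    assert(amount <= u->local_amount);  /* no overdrafts in this sketch */
    u->local_amount -= amount;
}

void write_account(const user_t *u) { balance = u->local_amount; }

/* The interleaving from the slide: both users read before either writes. */
int run_interleaved(void) {
    user_t user1, user2;
    balance = 1000;
    read_account(&user1);
    read_account(&user2);
    withdraw(&user1, 100);
    write_account(&user1);
    withdraw(&user2, 100);
    write_account(&user2);   /* overwrites user 1's update */
    return balance;
}

/* The same two withdrawals performed one after the other. */
int run_one_at_a_time(void) {
    user_t user1, user2;
    balance = 1000;
    read_account(&user1);
    withdraw(&user1, 100);
    write_account(&user1);
    read_account(&user2);
    withdraw(&user2, 100);
    write_account(&user2);
    return balance;
}
```

`run_interleaved()` ends with 900 and `run_one_at_a_time()` with 800; the difference is the lost update.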
  • 12. Example of inconsistency • A cache is frequently used to keep data locally rather than requiring it to be fetched for each request. Web browsers, for example, cache web pages. • For every request, the sequence is 1) look in the cache to see if the request can be satisfied with the contents of the cache; 2) if not, retrieve the information, return it to the requester and place it in the cache. • Now suppose the web page is changed at its source. • Retrievals of the web page from the cache will return an out-of-date version of the web page.
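A minimal sketch of the two-step cache lookup described above, showing how a change at the source goes unnoticed. All names are hypothetical, and the explicit `invalidate_cache` call is an assumption added for illustration: it is one common remedy, not something the slide prescribes.

```c
#include <assert.h>
#include <string.h>

#define PAGE_LEN 64

static char origin_page[PAGE_LEN] = "version 1";  /* the page at its source */
static char cached_page[PAGE_LEN];                /* the local copy */
static int cache_valid = 0;
static int origin_fetches = 0;

/* Step 1: look in the cache. Step 2: on a miss, fetch and fill the cache. */
const char *get_page(void) {
    if (!cache_valid) {
        strncpy(cached_page, origin_page, PAGE_LEN - 1);
        cached_page[PAGE_LEN - 1] = '\0';
        cache_valid = 1;
        origin_fetches++;
    }
    return cached_page;
}

/* The page changes at its source; the cache is not told. */
void update_origin(const char *new_content) {
    assert(strlen(new_content) < PAGE_LEN);
    strcpy(origin_page, new_content);
}

/* Assumed remedy: drop the cached copy so the next request refetches. */
void invalidate_cache(void) { cache_valid = 0; }
```

After `update_origin("version 2")`, `get_page()` keeps returning "version 1" until `invalidate_cache()` is called.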
  • 13. Solutions bring new problems • One technique to prevent race conditions is to lock critical resources. • Locking can lead to deadlock – two processes waiting for each other to release critical resources – Process one gets a lock on row 1 of a database – Process two gets a lock on row 2 – Process one waits for process two to release its lock on row 2 – Process two waits for process one to release its lock on row 1 – No progress.
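The deadlock above can be reproduced with a small lock table and detected by following the chain of "who is waiting for whom". This is a single-threaded sketch with hypothetical names; processes and rows are numbered from 0 here, so the slide's process one and row 1 become index 0.

```c
#include <assert.h>

#define NUM_PROCS 2
#define NUM_ROWS  2
#define NOBODY   (-1)

static int row_owner[NUM_ROWS];     /* process holding each row lock */
static int waiting_for[NUM_PROCS];  /* row each process is blocked on */

void reset_locks(void) {
    for (int r = 0; r < NUM_ROWS; r++) row_owner[r] = NOBODY;
    for (int p = 0; p < NUM_PROCS; p++) waiting_for[p] = NOBODY;
}

/* Returns 1 if the lock is granted; otherwise records the wait, returns 0. */
int request_lock(int proc, int row) {
    assert(proc >= 0 && proc < NUM_PROCS && row >= 0 && row < NUM_ROWS);
    if (row_owner[row] == NOBODY || row_owner[row] == proc) {
        row_owner[row] = proc;
        waiting_for[proc] = NOBODY;
        return 1;
    }
    waiting_for[proc] = row;
    return 0;
}

/* Follow "proc waits for a row held by an owner, who waits for ...".
 * A deadlock exists if the chain leads back to the starting process. */
int is_deadlocked(int proc) {
    int current = proc;
    for (int steps = 0; steps < NUM_PROCS; steps++) {
        int row = waiting_for[current];
        if (row == NOBODY) return 0;      /* someone in the chain can run */
        current = row_owner[row];
        if (current == NOBODY) return 0;  /* the row is actually free */
        if (current == proc) return 1;    /* cycle back to the start */
    }
    return 0;
}
```

Replaying the slide: process 0 locks row 0, process 1 locks row 1, then each requests the other's row. After the second blocked request, `is_deadlocked` returns 1 for both processes.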
  • 14. Yet more problems • Locks are logical structures maintained in software or in persistent storage. • Getting a lock across distributed systems is not an atomic operation. – While one process is requesting a lock, another process can acquire it. This can go on for a long time (it is called livelock if there is no possibility of ever acquiring the lock). • Suppose the virtual machine holding the lock fails. Then the owner of the lock can never release it.
  • 15. Is there a solution? • The general problem is that you want to manage synchronization of data across a distributed set of servers where fewer than half of the servers can fail. • Paxos is a family of algorithms that use consensus to manage state concurrency. Complicated and difficult to implement. • An example of the problems – Choose one server as the master that keeps the “authoritative” state. – Now the master server fails. Need to • Find a new master • Make sure it is up to date with the authoritative state.
  • 16. Luckily • Several open source systems are now available that – Implement Paxos or an alternative consensus algorithm – Are reasonably easy to use. • Two such systems are – Memcached – discussed at the end of this lecture – Zookeeper – discussed in tomorrow’s lecture.
  • 17. In general • Introducing concurrency will improve performance but also introduces problems. • Concurrency is a constant consideration when architecting for the cloud. – Coordinating activities across concurrent processes is difficult and prone to many errors. – Allowing for failure complicates coordination of activities. • Systems are available to provide concurrency for small amounts of data without your having to worry about the details.
  • 18. Topics In order to understand how to achieve elasticity you must understand • Concurrency • State and their interactions
  • 19. Recall Load Balancer • Client makes a request that is routed to a server through a load balancer
  • 20. Message sequence – client makes a request [Diagram: Clients, Load Balancer, Servers]
  • 21. Message sequence – request arrives at load balancer [Diagram: Clients, Load Balancer, Servers]
  • 22. Message sequence – request is sent to one server [Diagram: Clients, Load Balancer, Servers]
  • 23. Message sequence – reply goes back to client [Diagram: Clients, Load Balancer, Servers]
  • 24. Message sequence – now client makes second request – does it matter which server it goes to? [Diagram: Clients, Load Balancer, Servers]
  • 25. “Sticky” HTTP requests • Normally the load balancer routes requests depending on the load of the servers attached to it. This is why it is called a “load balancer”. • A client can request to always be routed to the same server. This is done by making a “sticky” HTTP request. • Dangerous for two reasons: – The server may be overloaded and the response delayed – The server may have failed and no response is forthcoming. • We assume non-sticky HTTP requests.
  • 26. Suppose message is routed to an arbitrary instance. • Understanding what happens requires a digression into state. • A computation has two inputs – Instructions – Data • The data input of a computation is called the state.
  • 27. How does this work with functions? • Consider a function that counts how many times it is called. • Option 1:
      int countv1() {
          static int i = 0;  // initialized once; static keeps the value between calls
          i = i + 1;         // add 1 to the last value of i
          return i;
      }
    • The function countv1 remembers i from one call to the next. • State is maintained inside the function – it is stateful
  • 28. Option 2
      int countv2(int i) {
          int a;
          a = i + 1;         // add 1 to the last value of i
          return a;
      }
    • The function countv2 does not remember the value of i from one call to the next. • The client must pass the last value returned. • State is passed into the function. The function is stateless
  • 29. Option 3
      int countv3() {
          int a;
          a = dbase_get("count");    // retrieve current value
          a = a + 1;                 // add 1 to the last value of a
          dbase_write("count", a);   // save current value
          return a;
      }
    • The count is stored in a database. • Neither the client nor the function remembers the value. • The function is stateless.
  • 30. What is the difference? • In option 1, the function kept track of the count value. • In option 2, the client must keep track of the count value. • In option 3, the count value is kept in an external database. • In each case, the state (count value) must be kept somewhere.
  • 31. Suppose the functions are packaged as processes in virtual machines [Diagram: Option 1 (Countv1), Option 2 (Countv2), Option 3 (Countv3), with Client and DB]
  • 32. Processes communicate via messages • Message from client to process is a call • Message from process back to client is the return of a value
  • 33. Now suppose each process has two clients – what is computed by option 1? [Diagram: Countv1]
  • 34. What is computed by option 2? [Diagram: Countv2]
  • 35. What is computed by option 3? [Diagram: Countv3, DB]
  • 36. Where state is kept matters • Option 1 – counts number of times called by either client. Process remembers value. • Option 2 – counts number of times called by each client. Client remembers value. • Option 3 – counts number of times called by either client. Database remembers value. • Options 1 & 3 calculate different things than option 2.
  • 37. Now suppose each process has two instances – remember the load balancer [Diagram: two Countv1 instances] • Load balancer distributes messages to servers
  • 38. What is computed by option 1? [Diagram: two Countv1 instances]
  • 39. What is computed by option 2? [Diagram: two Countv2 instances]
  • 40. What is computed by option 3? [Diagram: two Countv3 instances, DB]
  • 41. Now what do the options compute? • Option 1 – each instance of the function countv1 computes how many times it was invoked • Option 2 – each instance of the function countv2 computes how many times each client invoked either instance • Option 3 – the database contains the number of times either instance was invoked by either client.
  • 42. What have we seen? • When there was one instance of a client and one instance of the count process – all three versions were identical • When there were two clients and one instance of the count process – two versions were the same, one was different • When there were two clients and two instances of the count process – all three versions produced different results.
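The two-client, two-instance case can be checked with a small simulation. This is a sketch under assumptions that are not in the slides: a round-robin load balancer, a fixed request order (client A, A, B, A), and a plain variable standing in for the database. The counters are modelled per instance, so the functions take an `instance` argument that the slide versions do not have.

```c
#include <assert.h>

#define NUM_INSTANCES 2

/* Option 1: every server instance keeps its own counter (stateful). */
static int instance_count[NUM_INSTANCES];
int countv1(int instance) {
    assert(instance >= 0 && instance < NUM_INSTANCES);
    instance_count[instance] += 1;
    return instance_count[instance];
}

/* Option 2: the client passes in the last value it received (stateless). */
int countv2(int instance, int last_value) {
    (void)instance;              /* the result does not depend on the server */
    return last_value + 1;
}

/* Option 3: the count lives in a shared store standing in for the database. */
static int db_count;
int countv3(int instance) {
    (void)instance;
    db_count += 1;
    return db_count;
}

/* Assumed routing policy: simple round robin over the instances. */
static int next_instance;
int route(void) {
    int chosen = next_instance;
    next_instance = (next_instance + 1) % NUM_INSTANCES;
    return chosen;
}

typedef struct {
    int v1_instance0, v1_instance1;  /* option 1: one count per instance */
    int v2_client_a, v2_client_b;    /* option 2: one count per client */
    int v3_database;                 /* option 3: one global count */
} results_t;

results_t run_scenario(void) {
    const char callers[] = {'A', 'A', 'B', 'A'};
    const int num_requests = 4;
    int last_a = 0, last_b = 0;

    instance_count[0] = instance_count[1] = 0;
    db_count = 0;
    next_instance = 0;

    for (int r = 0; r < num_requests; r++) {
        int instance = route();
        countv1(instance);
        if (callers[r] == 'A') last_a = countv2(instance, last_a);
        else                   last_b = countv2(instance, last_b);
        countv3(instance);
    }
    results_t out = { instance_count[0], instance_count[1],
                      last_a, last_b, db_count };
    return out;
}
```

With this request order, option 1 reports 2 and 2 (per instance), option 2 reports 3 and 1 (per client), and option 3 reports 4 (all calls), so the three versions disagree.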
  • 43. Message so far • How state is managed is important and will lead to different results when there are multiple instances of clients or functions. • Now we return to elasticity • Remember the sequence?
  • 44. Message sequence – client makes a request [Diagram: Clients, Load Balancer, Servers]
  • 45. Message sequence – request arrives at load balancer [Diagram: Clients, Load Balancer, Servers]
  • 46. Message sequence – request is sent to one server [Diagram: Clients, Load Balancer, Servers]
  • 47. Message sequence – reply goes back to client [Diagram: Clients, Load Balancer, Servers]
  • 48. Message sequence – now client makes second request – does it matter which server it goes to? [Diagram: Clients, Load Balancer, Servers]
  • 49. It depends where state is kept • If state is kept in the client, then it does not matter, since the client keeps track of the calls • If state is kept in a database, then it does not matter, since the results are kept external to the servers • If state is kept in the server, then it does matter, since sending the message back to server 1 will give a different result than sending it to server 2.
  • 50. Keeping servers stateless enables elasticity • A new instance of a server can be – Created/stopped – Registered/unregistered with the load balancer – Placed in/removed from service • without – Requiring the client to be aware of which server instance it is interacting with – Requiring that clients be notified if a server is taken out of service
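A minimal sketch of why statelessness makes registering and unregistering servers invisible to the client. The load balancer type, its functions and the round-robin policy are all hypothetical; the handler follows the option 2 style, where the client carries the count.

```c
#include <assert.h>

#define MAX_SERVERS 8

/* A hypothetical round-robin load balancer holding registered server ids. */
typedef struct {
    int ids[MAX_SERVERS];
    int count;
    int next;
} load_balancer_t;

void lb_init(load_balancer_t *lb) { lb->count = 0; lb->next = 0; }

/* Returns 1 on success, 0 if the table is full. */
int lb_register(load_balancer_t *lb, int id) {
    if (lb->count == MAX_SERVERS) return 0;
    lb->ids[lb->count++] = id;
    return 1;
}

/* Returns 1 if the server was found and removed, 0 otherwise. */
int lb_unregister(load_balancer_t *lb, int id) {
    for (int i = 0; i < lb->count; i++) {
        if (lb->ids[i] == id) {
            lb->ids[i] = lb->ids[lb->count - 1];  /* fill the gap */
            lb->count--;
            if (lb->next >= lb->count) lb->next = 0;
            return 1;
        }
    }
    return 0;
}

/* Picks the next registered server in turn. */
int lb_route(load_ballancer_placeholder);
```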
  • 51. Types of State • Session state • Client side state • Server side • Persistent
  • 52. What is a session? • A session typically refers to a series of interactions between one client login to a system and the termination of that login – whether through logging out or through timing out. • A session can also span multiple logins. E.g. Netflix keeps track of where you are in a movie and returns you to that location the next time you log in.
  • 53. Session State • Session state is information that persists for a session. We are considering a single login here. The multiple login case is a special case of persistent state. • What happens when you log in – When you successfully log in to a service, the service returns a code that identifies you. This is the session ID. – Other information can also be included such as MAC address (to prevent man in the middle attacks). – It is typically managed on the client side. Your browser does all of this.
  • 54. Client Side State • This can be difficult if there is significant state to save, however – This means you’ll need to pass all of this state with each request – This requires more network overhead • This also means you’ll need to store data on the client machine – This can have security implications
  • 55. Stateful Services • If your services are stateful that makes scalability more difficult • If you’re able to design your system such that the services are stateless you’ll make scaling much easier • If an operation is dependent on the results of a previous operation it’s more difficult to make services stateless
  • 56. Management of state between services and persistent tier • Non client side state can be either kept in the services or in a persistent store. • The choice depends on the volume of data, the latency involved, the synchronization needs for the servers and the time the state is expected to persist.
  • 57. Important latency numbers
      Main memory reference                  100 ns
      Send 1K bytes over 1 Gbps network      0.01 ms
      Read 4K randomly from SSD              0.15 ms
      Read 1 MB sequentially from memory     0.25 ms
      Round trip within same datacenter      0.5 ms
      Read 1 MB sequentially from SSD        1 ms     (4x memory)
      Disk seek                              10 ms    (20x datacenter round trip)
      Read 1 MB sequentially from disk       20 ms    (80x memory, 20x SSD)
      Send packet CA->Netherlands->CA        150 ms
    Source: dean-keynote-ladis2009_scalable_distributed_google_system
  • 58. Implications of latency numbers • State stored in persistent storage (disk or SSD) will take longer to fetch than state stored in memory. • State stored in a different datacenter will take longer to access than state stored locally, especially across continents. • Persistent store is typically replicated both for performance (latency) reasons and for availability (failure) reasons. • => keeping data consistent across different occurrences of it is important but difficult.
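The comparisons quoted with the latency numbers (4x, 20x, 80x) can be reproduced with simple arithmetic. The sketch below copies a subset of the listed values into nanoseconds; the enum names and the `slowdown` helper are made up for illustration.

```c
#include <assert.h>

/* A subset of the latencies listed above, converted to nanoseconds. */
enum {
    MAIN_MEMORY_REFERENCE_NS = 100,
    READ_1MB_MEMORY_NS       = 250000,     /* 0.25 ms */
    DATACENTER_ROUND_TRIP_NS = 500000,     /* 0.5 ms  */
    READ_1MB_SSD_NS          = 1000000,    /* 1 ms    */
    DISK_SEEK_NS             = 10000000,   /* 10 ms   */
    READ_1MB_DISK_NS         = 20000000,   /* 20 ms   */
    CA_NETHERLANDS_CA_NS     = 150000000   /* 150 ms  */
};

/* How many times slower one operation is than another. */
long slowdown(long slow_ns, long fast_ns) {
    assert(fast_ns > 0);
    return slow_ns / fast_ns;
}
```

For example, `slowdown(READ_1MB_DISK_NS, READ_1MB_MEMORY_NS)` gives 80, and `slowdown(CA_NETHERLANDS_CA_NS, DATACENTER_ROUND_TRIP_NS)` gives 300: one intercontinental round trip costs as much as 300 round trips inside a datacenter, which is why the location of state matters.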
  • 59. Topics In order to understand how to achieve elasticity you must understand • Concurrency • State and their interactions
  • 60. Keeping data consistent • We will discuss persistent data consistency when we discuss databases. • Memcached is an open source tool that provides in-memory synchronization of data across different instances of a service.
  • 61. Layers of a service • Business logic for the service • Memcached • Now consider these layers deployed onto multiple servers.
  • 62. Memcached in multiple servers • Memcached keeps a small amount of state consistent across all servers. • This comes at a small cost in latency, as long as the servers are in the same physical location. [Diagram: two servers, each with Business logic and Memcached]
  • 63. When to use Memcached • Data must be synchronized among servers. • Memcached takes care of concurrency issues • Data is relatively small – One object < 1 MB – Total memory used per server depends on how much you are willing to give it per server, since it is stored in memory, not in a persistent store • The lifetime of the data should not exceed the time that at least one of the servers is alive, i.e. if all the servers die, then the data disappears.
  • 64. Summary • The cloud doesn’t guarantee elasticity • You’ll need to design your system to be elastic • State management, your storage solution, and consistency, are all factors that you’ll need to consider
  • 66. Architecting for the Cloud Introduction to Security
  • 67. Agenda • What is security? • Understanding the threat • Architectural approaches to security • Designing for security • Summary
  • 69. Your Experience • Think about your past experience – How have you thought about security? – What steps have you (or your organization) taken to protect the system? • Do you remember Assignment 2? – Security was equivalent to having a login feature or encryption
  • 70. Security … What is it? • What do we mean when we say security? • In your experience what does this mean?
  • 71. Let’s Look at some Examples
  • 72. Fort Knox • Fort Knox is a US Army post in Kentucky • In addition to housing various US Army functions it is also the home to a gold bullion depository – 5000+ tons of gold housed there
  • 73. Security • What is the business asset that needs protection in this case? • What does protect mean here?
  • 74. What About the CIA? • The Central Intelligence Agency (CIA) is a US civilian intelligence organization • Primary purpose is to collect information about foreign governments, corporations, and individuals • It uses this information to influence public policymakers – It does at times engage in tactical operations as well
  • 75. Security • What is the business asset that needs protection? • What does protect mean in this case?
  • 76. Power Distribution • What would security mean if you have a system that manages the power grid?
  • 77. Business Context • The business need differs from one context to another • Organizations have assets they need to protect • They need to protect these assets for different reasons – Business continuity – Liability reasons – Regulation – Protection of IP – …
  • 78. Security – A Set of Concerns • The related concerns are typically classified as “security” concerns • In software these concerns are typically: – Confidentiality – Data integrity – Non repudiation – Availability
  • 79. Confidentiality • The property that reflects the extent to which: – Data and services are only available to those that are authorized to access them • Is this a concern for a Museum? How about a Financial Institution?
  • 80. Integrity • This property can also refer to data or services • It reflects the extent to which data or services can be delivered as intended • E.g. hopefully the grade that we have recorded for you in this course is correct …
  • 81. Non Repudiation • Nonrepudiation refers to the ability to guarantee that the sender cannot later repudiate or deny having sent the message • It can also refer to the guarantee that the recipient cannot later deny having received the message • When might this be important?
  • 82. Availability • This is the property that reflects the extent to which the system will be available for legitimate use • A denial of service attack is meant to disrupt the availability of a system
  • 83. Protection Against What? • Now that we understand the business asset, what are we protecting against? • In order to appropriately protect our system we need to understand the threat • Let’s look at example exploits …
  • 84. Agenda • What is security? • Understanding the threat • Architectural approaches to security • Summary
  • 85. Threat Sources? • Insider threats • Physical threats • Social engineering • External attacks
  • 86. Who is Leveraging These Techniques? • The art of hacking has gone from an individual activity to a highly coordinated and sophisticated effort – It can now be quite lucrative as well • Today many legitimate and illegitimate organizations routinely launch attacks – Just run a port scan detector on your system • Let’s look at the progression of exploits
  • 87. Progression of Exploits • Mischievous individuals: – The first generation of hackers were technical youth performing mischievous acts • Revenue generation: a proof of concept – These were the first example of hacking for money – Still small scale • Organized crime – These were criminal organizations involved in larger scale criminal activity • Widespread adoption – The infrastructure needed to launch Cyber attacks is now widespread – The barrier to entry has been lowered – Legitimate entities enter the game • Advanced persistent threats
  • 88. Hackers – First Generation • In the 1990s hackers were by and large not malicious • They were in it for the challenge • Notable hackers – Kevin Mitnick – Chen Ing-Hau – Jeffery Lee Parson – Sven Jaschan
  • 89. Kevin Mitnick • Broke into dozens of computer networks – Pac Bell – DEC – MCI – Digital – … • Wasn’t in it for financial gain • Largely used “social engineering” techniques • Arrested twice: in 1988 and again in 1995
  • 91. Mitnick’s Techniques • Largely used “social engineering” to gain access to passwords and insider information • Used this information to gain access to target system • Mitnick claims that he never “hacked” a system (still a point of controversy)
  • 92. Chen Ing-Hau • University student who created and released the CIH virus in 1998 – Wrote the virus to “make a fool of the software vendors” • The virus would render the computer essentially inoperable on a specified date • Became one of the most widespread viruses • Some versions of this virus have shown up multiple times
  • 93. CIH Virus • Exploited a vulnerability in Windows 95, 98, & ME – Along with an issue in various BIOS chipsets • Would overwrite the first megabyte of the hard drive and attempt to overwrite the flash BIOS • The result rendered the PC inoperable
  • 94. Jeffery Lee Parson • Was 18 when he confessed to being the creator of the Blaster worm • A Chinese “cracking” collective reverse engineered a MS patch • Parson created a worm to exploit a buffer overflow issue • Affected DCOM’s RPC service – Worm could spread without users opening an attachment
  • 95. Blaster Worm • In addition to changing RPC service it would – Change registry to launch msblast.exe • Worm would launch a distributed denial of service attack from infected computers – Attack was against windowsupdate.com • Sent messages to Bill Gates
  • 96. Sven Jaschan • Authored Sasser and Netsky worms • Claims to have written them to remove Mydoom and Bagle worms • Worms were responsible for 70% of the infections in 2004
  • 97. Netsky • Sent out as an email attachment • Contained insults aimed at the author of Mydoom and Bagle • Other symptoms included “beeping” in the early morning hours of specific dates
  • 98. Sasser • Would connect to computers through a particular port that was often open by default • Exploited a buffer overflow • Would shut the computer down after displaying a shutdown timer
  • 99. Cyber Criminals – Proof of Concept • After the turn of the century a new breed emerged • They took the techniques employed by the mischievous youth and used them for monetary gain • These were the first real “cyber criminals” – Farid Essebar – Atilla Ekici – Jeanson James Ancheta
  • 100. Farid Essebar & Atilla Ekici • The two people behind the Zotob computer worm • Worm affected CNN, ABC News, NY Times, US Dept of Homeland Security, … • Intention was to facilitate credit card forgery scams
  • 101. Zotob • Exploited a vulnerability in Windows 2000 • Caused the computer to restart continuously • Files would be created with every reboot • Spyware was installed on the system – The spyware remained after the virus was removed • The goal was to facilitate scams (for money)
  • 102. Jeanson James Ancheta • First person to be arrested for controlling a large number of hijacked computers • Created a large botnet – Network of bots or “software robots” • Offered his collection of bots for hire • Leveraged rxbot to increase his network
  • 103. Rxbot • Contained a proxy server • Server can be spawned by a remote attacker • Typically used for denial of service attacks
  • 104. Cyber Gangs • “Organized” crime gets involved • Coordinated attacks against high value targets • Often involve groups and large sums of money • Examples – Yaron Bolondi – Maria Zarubina – Albert Gonzalez
  • 105. Yaron Bolondi • Part of a gang that attempted to steal £220 million from a Japanese bank • Used keylogging to gain access to the bank’s computers • Software is installed on employees’ computers – Via malware or other virus
  • 106. Maria Zarubina • Part of a gang that used cyber attacks as a means for extortion • Attacked British “bookmakers” – Agreed to stop attacks if ransom was paid • Used denial of service attacks to shut down gambling sites • Would then threaten additional attacks unless payment was made
  • 107. Albert Gonzalez • Responsible for the largest credit card theft in history • Stole and resold more than 170 million cards • Used SQL injection to introduce “malware backdoors” – These allowed packet sniffing attacks • Targets included Target, TJ Maxx, Dave & Buster’s, 7-Eleven, JCPenney, …
  • 108. ARP Spoofing • Used to attack an Ethernet network • Allows the attacker to “sniff” data on a LAN and modify or stop the traffic • Attacker sends a spoofed ARP message to the Ethernet LAN • “Man in the middle” attack – Attacker’s computer masquerades as the destination computer and gets the intended traffic
  • 109. Advanced Persistent Threat • Today we’ve started to see a new class of threat emerge • These threats are against specific high value targets • They are characterized by coordinated activity taking place over a long period of time – The individual actions may seem isolated • The perpetrator doesn’t act on the exploit until sufficient penetration has been achieved • Has anyone heard of Stuxnet? • How about Gauss or Flame?
  • 110. Software as a Weapon • In 2010 Iran announced they put their nuclear program on hold – No one was sure why • It turns out the reason was that more than 1000 centrifuges in their uranium enrichment facilities were destroyed • How were these centrifuges destroyed? – By the first known weapon that was 100% software
  • 111. Stuxnet • Stuxnet was a worm that infected SCADA systems made by Siemens – Think power plant and power distribution control systems • It was capable of – Increasing the pressure inside nuclear reactors – Switching off oil pipelines • Additionally it would report that the systems were operating normally
  • 112. Sophisticated Attack • Why is Stuxnet special? • First, it didn’t use a forged security clearance – It used a genuine security clearance that was stolen • Second, it had a specific target – It infected many systems worldwide but remained dormant until it found the systems controlling the intended target • Third, it exploited not 1, but 4 zero day vulnerabilities
  • 113. Response • Iran responded to the attack with an open call for hackers to join the Iranian Revolutionary Guard • Iran now has reportedly amassed the 2nd largest online army in the world
  • 114. Side Note • Stuxnet is now open source • This is code that is capable of crashing power plants and disrupting oil pipelines • Go to YouTube and search for Stuxnet – You’ll get many videos of people dissecting Stuxnet …
  • 115. Advanced Persistent Threats • Stuxnet is an example of what we call “Advanced Persistent Threats” • In some cases exploits are not opportunistic reactions to discovering a vulnerability • They are coordinated multipronged attacks that can take place over an extended period of time
  • 116. Coordinated Attack • Intruders will look for some way to find access to a system • They will then try to move laterally until they are able to access the intended target • This can take days, weeks, months, or even years
  • 117. Email
  • 118. What’s the Point? • Almost all of these incidents exploited vulnerabilities • These vulnerabilities came along with the commercially available software used in the attacked systems • Vulnerabilities continue to exist in the software that we use
  • 119. Vulnerabilities • Many organizations (legitimate and illegitimate) try to find these vulnerabilities – CERT is an example of such an organization • Organizations like CERT would inform the developers of the software of the vulnerability • Historically companies were slow to react • CERT didn’t want to release a vulnerability publicly without a fix being available • So CERT would notify the organization and then release the vulnerability publicly after a given time elapsed
  • 120. X Day Vulnerabilities • Vulnerabilities are characterized by the time since they were made public – 1 day vulnerabilities were released 1 day ago • The newer the vulnerability the less likely it is to be patched • Zero day vulnerabilities are those that the manufacturer doesn’t yet know about – Clearly these are the most attractive to attackers
  • 121. Vulnerability Market • A market has emerged for these vulnerabilities • If you discover a vulnerability you can sell it • The value of the vulnerability is determined by: – The “day” of the vulnerability – The number of instances of the software containing the vulnerability
  • 122. Selling The Vulnerability • Many entities buy these vulnerabilities – Governments (including the US) – Organized crime syndicates – Individuals • Prices range from $10 - $250,000 or more – Depending on the exclusivity of the sale as well as the value of the exploit • Check out: – http://www.forbes.com/sites/andygreenberg/2012/03/23/shopping-for-zero-days-an-price-list-for-hackers-secret-software-exploits/ – http://www.zdnet.com/blog/security/black-market-for-zero-day-vulnerabilities-still-thriving/2108
  • 123. Exploit Auction Houses • There are now auction houses that sell vulnerabilities (or exploits) – Like the eBay of exploits – In fact exploits were originally sold on eBay • It’s actually legal to sell these exploits – Even though the attacks themselves may be illegal
  • 124. Exploit as a Service (EaaS) • Believe it or not you can now get a service to manage your attacks • One issue if you’re going to launch an attack is finding a “bulletproof” provider – A provider willing to host a malware server • These services will provide “exploit kits” and manage the hosting • In some cases they even offer analytics for the consumer’s campaigns (think Google Analytics)
  • 125. Widespread Adoption • All of this has lowered the barrier to entry for exploiting vulnerabilities • There are large numbers of people with the means and motive to attack any system online • Furthermore secure practices are often not followed – See next slide
  • 126. Many Systems Remain Vulnerable • Remember the issues with OpenSSL that surfaced in early 2014? – Despite widespread news reports, many systems continue to be vulnerable • June 2014 survey of TLS vulnerabilities
  • 127. Cloud Related Issues • In many respects security in the cloud is not different from security for a traditional system • Some threats are magnified, and some additional threats exist • We’ll look at: – VM sprawl – Insecure interfaces or APIs – Malicious insiders – Shared resources
  • 128. VM Sprawl • VM creation is quick and easy – It can be done in seconds without procuring hardware, administrative knowledge, or securing permissions • As a result it’s done often – Sometimes for transient needs • Once created the VM is often forgotten about – It might still exist even if it is no longer doing any work • Keeping track of the existing VMs is difficult to do – It requires different processes than tracking physical assets • This results in something called VM Sprawl
  • 129. Consequences of VM Sprawl • VM Sprawl is bad for many reasons • First, it imposes additional overhead on the overall solution – The VM still costs money even if it is offline • Second, it is less likely to be included in the normal maintenance efforts – Updates and patches might not be applied • As a result the VM can remain vulnerable
  • 130. Insecure Interfaces or APIs • IaaS and PaaS providers expose a set of APIs • These APIs are used by customers to: – Provision – Manage – Orchestrate – Monitor – … • The security of the cloud is dependent on the security of these APIs • These APIs must be designed in a way to resist accidental and malicious attempts to circumvent policy
  • 131. 3rd Party API • We not only need to trust the expertise and procedures of the cloud providers but 3rd party vendors as well • Organizations often layer capability on top of the provided API in order to add value to the consumer e.g. – Deployment tools – Monitoring aggregation tools – Data management services – … • The security of these providers also needs to be trusted
  • 132. How Does This Work? User 3rd Party Service Cloud Provider
  • 133. Malicious Insiders • Malicious insiders are a known and significant threat to corporate security – E.g. former and disgruntled employees • When deploying your application on the cloud you need to worry about employees of the cloud provider as well
  • 135. Shared Resources • When software running in a process within a VM can elevate its privileges sufficiently, it can “escape” the bounds of the VM • This is called “guest to host VM escape” • Once this happens the software is able to control all of the instances within that hypervisor
  • 136. Hypervisor Vulnerabilities • The most commonly used hypervisors have all been exploited • Vulnerabilities continue to be discovered in all of the major hypervisor software – Discovered by both the good guys and bad guys • Do a Google search on VM Escape for the latest vulnerabilities …
  • 137. Addressing Security Issues • The strategies for dealing with security issues typically fall into one of three categories – Secure coding practices – Processes and policy – Architectural approaches
  • 138. Secure Coding Practices • Looking at the source of the vulnerabilities it may seem that secure coding practices will solve the problem • While this is true to some extent, as we said, these vulnerabilities exist in most commercially available software • We must therefore assume that our software is to some extent insecure • It’s also the case that we will miss issues • Inevitably the software will have defects, will be used in a context other than what was intended, or will be used with software that it wasn’t intended to work with
  • 139. Processes and Policy • A large aspect of dealing with security includes processes and procedures • The security of the system is impacted by things like: – Physical security – IT policy governing computers on the network – Updating and patching procedures – Organizational structure and access policies • Defining appropriate practices is a key component to security
  • 140. Agenda • What is security? • Understanding the threat • Architectural approaches to security • Designing for security • Summary
  • 141. Security Strategies • Security strategies fall in one of several categories – Policy/process – Secure coding practices – Architectural • We will now look at some architectural strategies • The thing to keep in mind is that you cannot easily eliminate all vulnerabilities – Some of the approaches are aimed at minimizing vulnerabilities – Some are aimed at reducing the impact if the vulnerabilities are exploited
  • 142. Resisting Attacks • Resisting attacks is analogous to securing the perimeter • Strategies for resisting attacks include: – Encryption – Checking data integrity – Limiting exposure – Limiting access
  • 143. Encryption • Applied to data and communications can help maintain confidentiality • Can be symmetric – Both parties use the same key • Or asymmetric – Public/private key
  • 144. Encryption • What kind of attack would encryption protect against? • What kind of attack would it not protect against? • What kind of security concern would it address?
  • 145. Data Integrity • Encoding data with checksum or hash results can help ensure the data has not been tampered with • This additional data can be encrypted along with or independently from the original data
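A minimal sketch of the tamper check described on this slide, using Python's standard library. It uses a keyed hash (HMAC-SHA256) rather than a plain checksum, on the assumption that an attacker who can change the data could also recompute an unkeyed hash. The key value and the function names `protect` and `verify` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real key must be kept secret.
SECRET_KEY = b"example-key-not-for-production"
TAG_LENGTH = 32  # size of a SHA-256 digest in bytes


def protect(data: bytes, key: bytes = SECRET_KEY) -> bytes:
    """Return the data with an HMAC-SHA256 tag appended."""
    tag = hmac.new(key, data, hashlib.sha256).digest()
    return data + tag


def verify(blob: bytes, key: bytes = SECRET_KEY) -> bytes:
    """Return the original data, or raise ValueError if it was altered."""
    data, tag = blob[:-TAG_LENGTH], blob[-TAG_LENGTH:]
    expected = hmac.new(key, data, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed")
    return data
```

This only detects tampering. As the next slide notes, recovering from it may need additional techniques.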
  • 146. Data Integrity • Think about data integrity concerns in the context of some of the recent attacks – Stuxnet – Gauss – … • These techniques can be important for detecting an attack – Additional techniques might be needed to recover
  • 147. Limiting Exposure • Attacks depend on exploiting weaknesses to gain access to data and services • Limiting access to the attack surface limits risk* • The following are approaches to limiting exposure * Manadhata 2006
  • 148. Client Data Storage • Problem: many applications store data at potentially untrusted clients. – These clients could tamper with the data • Solution: this pattern uses encryption to store security-critical data client-side
  • 149. Client Data Storage II • Manual inspection of this data could reveal details of the application that could be used to compromise the site
  • 150. Client Input Filters • Problem: in many cases clients execute outside the control of the system developer. – These clients can be tampered with to behave in an untrustworthy manner • Solution: treat all data provided by clients as suspect
  • 151. Client Input Filters II • Perform (or re-execute) data validity checks on the server • Examine headers and URLs for malicious code • Text input should be checked for scripts • Calculated fields should be re-computed on the server • Considerations: – Should use a symmetric key as it’s less computationally expensive – The key should not be stored in a file
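The server-side checks listed above could be sketched as follows. The order form, the `PRICES` table, the field names, and the quantity limits are all hypothetical, and the script check is deliberately crude; it illustrates the idea and is not a complete defense against script injection.

```python
import re
from decimal import Decimal

# Hypothetical server-side price table; the client never supplies prices.
PRICES = {"widget": Decimal("2.50")}
SCRIPT_PATTERN = re.compile(r"<\s*script", re.IGNORECASE)


def validate_order(form: dict) -> dict:
    """Re-check a client-submitted order and recompute its total."""
    item = form.get("item")
    if item not in PRICES:
        raise ValueError("unknown item")
    try:
        quantity = int(form.get("quantity", ""))
    except (TypeError, ValueError):
        raise ValueError("quantity must be an integer") from None
    if not 1 <= quantity <= 100:
        raise ValueError("quantity out of range")
    note = str(form.get("note", ""))
    if SCRIPT_PATTERN.search(note):
        raise ValueError("note contains script markup")
    # Any total sent by the client is ignored and recomputed here.
    total = PRICES[item] * quantity
    return {"item": item, "quantity": quantity, "note": note, "total": total}
```

A client that submits `"total": "0.01"` for three widgets still gets a server-computed total of 7.50.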
  • 152. Trusted Proxy • Problem: it may be necessary to expose inadequately protected aspects of the system to untrusted users • Solution: create a trusted proxy that acts as a buffer between the component and the users
  • 153. Trusted Proxy II • This proxy intercepts and filters all communication • In that way it can compensate for the lack of protections • Typically two options – Filter requests for bad input – Recreate a new request with only the essential parts of the old request
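The second option, recreating the request from only its essential parts, might look like the sketch below. The field names, allowed actions, and the `rebuild_request` helper are hypothetical; a real proxy would also handle headers, authentication, and transport.

```python
# Hypothetical whitelist: field name -> type the backend expects.
ALLOWED_FIELDS = {"account_id": int, "action": str}
ALLOWED_ACTIONS = {"balance", "history"}


def rebuild_request(raw: dict) -> dict:
    """Build a fresh request containing only whitelisted, typed fields."""
    clean = {}
    for name, kind in ALLOWED_FIELDS.items():
        if name not in raw:
            raise ValueError(f"missing field: {name}")
        # Conversion raises ValueError when the value has the wrong shape.
        clean[name] = kind(raw[name])
    if clean["action"] not in ALLOWED_ACTIONS:
        raise ValueError("action not permitted")
    # Anything not copied above never reaches the protected component.
    return clean
```

Because the proxy copies known fields instead of stripping bad ones, unexpected parameters such as a stray `debug` flag are dropped by default.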
  • 154. Single Access Point • Problem: a system is more difficult to secure if it has multiple entry points • With multiple entry points: – You may need to separately secure multiple applications – You may have duplicate authentication logic to maintain – Unix is an example with multiple entry points – Different services can be set up on different machines
  • 155. Single Access Point II • The solution is to create a single point of entry • A session is then created • This allows global tracking of session state and authorization information • There is a single “gateway” or “check point” through which user’s login is validated
  • 156. Single Access Point III • Which aspects of security does this pattern address? • What are some of the implications of using this pattern?
  • 157. Partitioned Application • Problem: large complex applications often require root privileges in some portions of the application – If these elements are compromised the entire system is at risk • Solution: partition the large application into smaller elements each adhering to least privilege principle
  • 158. Partitioned Application II • This becomes more difficult to manage • Additionally performance can suffer as interprocess communication increases • Additional points of entry are introduced – Even though the impact of being compromised is diminished
  • 159. Password Propagation • Problem: most applications manage user data under a single database account – Thus if the single account is compromised all user data can be accessed • Solution: the user’s password is required with each backend database request
  • 160. Password Propagation II • This is essentially an instance of application partitioning • The front end will cache the password and provide it with each back end request
  • 161. Limiting Access • You can think of this as “securing the perimeter” • This is a widely used approach of limiting access to data and services • The following are examples of techniques for limiting access
  • 162. Session • Background: Systems need to keep track of the user’s login status, level of authorization, and so forth – The Singleton pattern is often used for this – This pattern can be difficult to use when the system supports concurrent logins • The solution is to create a “session” object to hold these global variables
  • 163. Session II • This session object is accessible by all components of the application • This facilitates having a common interface for accessing this information – Easier to implement and maintain than having a number of variables passed around
  • 164. Roles • Background: when an application supports many types of users security becomes more complicated – It can be difficult to track and maintain all of the things that every user has access to • It eases implementation issues if a smaller number of “roles” are created • Each role has a given set of rights
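A small sketch of the idea: rights are attached to a handful of roles, and each user is mapped to a role rather than to an individual list of rights. The role names, rights, and users here are hypothetical, loosely following the course-grade example used earlier.

```python
# Hypothetical roles and the rights each one carries.
ROLE_RIGHTS = {
    "student": {"view_own_grade"},
    "instructor": {"view_own_grade", "view_all_grades", "edit_grade"},
}

# Hypothetical user-to-role assignment.
USER_ROLES = {"alice": "instructor", "bob": "student"}


def is_allowed(user: str, right: str) -> bool:
    """Return True if the user's role grants the requested right."""
    role = USER_ROLES.get(user)
    # Unknown users and unknown roles get an empty set of rights.
    return right in ROLE_RIGHTS.get(role, set())
```

Changing what instructors may do now means editing one role entry instead of every instructor's account.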
  • 165. Roles II • What kinds of security does this address? • Implications?
  • 166. Account Lockout • Problem: there is an increased number of password guessing tools to compromise systems requiring user authentication • Solution: lock the user account after some number of incorrect attempts • How it works: – The system records each incorrect login attempt – When a predetermined number of attempts is reached the account is locked – Each time there is a correct login the account is reset
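The three steps under "How it works" can be sketched as below. The `AccountLockout` class, its in-memory counter, and the default of three attempts are assumptions for illustration; a real system would persist the counts and define how a locked account is unlocked.

```python
class AccountLockout:
    """Track failed logins per user and lock after too many failures."""

    def __init__(self, max_attempts: int = 3):
        self.max_attempts = max_attempts
        self.failures = {}  # user id -> consecutive failed attempts

    def is_locked(self, user: str) -> bool:
        return self.failures.get(user, 0) >= self.max_attempts

    def record_login(self, user: str, success: bool) -> bool:
        """Record one attempt; return True only if the login is accepted."""
        if self.is_locked(user):
            return False  # even a correct password is refused once locked
        if success:
            self.failures.pop(user, None)  # a correct login resets the count
            return True
        self.failures[user] = self.failures.get(user, 0) + 1
        return False
```

The refusal of correct passwords on a locked account is exactly what makes the denial of service issue on the next slide possible.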
  • 167. Account Lockout II • Issues: – Doesn’t address the situation where different user IDs are used – Usability can be adversely affected – Availability can be adversely affected • Can facilitate denial of service
  • 168. Detecting Attacks • Detect Intrusion • Detect Denial of Service
  • 169. Minefield • Problem: hackers are likely familiar with the vulnerabilities of various configurations – Once they figure out your setup they’ll know how to get in • Solution: change your setup to a non-standard configuration
  • 170. Minefield II • Even small changes can increase the effort enough to discourage hackers • You can do things like: – Alter file structure – Rename common administrative commands – Instrument commands to alert administrators – Add booby traps that will recognize tampering
  • 171. Secure Assertion • Problem: the activities performed by a malicious intruder may look legitimate at the local level – E.g. transferring money from an account • Solution: create a framework for reporting specific activities that violate assertions
  • 172. Secure Assertion II • The application developer is in a position to determine activities that may be suspicious – They can create assertions • If the application is being developed in an environment that supports exceptions, assertion violations could be reported in a similar fashion • The violations could be collected globally to provide additional insight on the current activities
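One way to picture this, assuming an environment with exceptions: a failed assertion is both raised locally and appended to a global collection. The `secure_assert` helper, the `VIOLATIONS` list, and the transfer limit are hypothetical names and values chosen for the sketch.

```python
class AssertionViolation(Exception):
    """Raised when an application-level security assertion fails."""


# Global collection point; a real system might send these to a monitor.
VIOLATIONS = []


def secure_assert(condition: bool, user: str, description: str) -> None:
    """Record and raise a violation when the condition does not hold."""
    if not condition:
        VIOLATIONS.append((user, description))
        raise AssertionViolation(description)


def transfer(user: str, amount: int, daily_limit: int = 1000) -> str:
    """Hypothetical money transfer guarded by developer-written assertions."""
    secure_assert(amount > 0, user, "non-positive transfer amount")
    secure_assert(amount <= daily_limit, user, "transfer exceeds daily limit")
    return f"transferred {amount}"
```

Each violation may look harmless by itself, but reviewing `VIOLATIONS` across users can reveal a pattern that no single component would notice.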
  • 173. Recovering From Attacks • Availability tactics – We will discuss these in a future class • Auditing – Keeps a trail of the users and their actions – Helps to maintain a record of the attack
  • 174. Network Address Blacklist • Problem: all systems with an online presence are subject to attack – Locking individual accounts doesn’t address systemic attacks • Solution: block network addresses that are the source of attack
  • 175. Network Address Blacklist II • The server will monitor requests from clients – Any suspicious requests will be logged – If there are repeated suspicious requests the address is blocked • One question is where to implement the check – Network (e.g. firewall) or application • Performance as list grows can be an issue • Can still be subject to denial of service attack
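An application-level version of this check could be sketched as follows. The `AddressBlacklist` class and the threshold of three reports are assumptions; deciding what counts as a suspicious request is left to the caller. Blocked addresses are kept in a set so that lookups stay fast as the list grows.

```python
from collections import Counter


class AddressBlacklist:
    """Block network addresses after repeated suspicious requests."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.suspicious = Counter()  # address -> logged suspicious requests
        self.blocked = set()

    def report_suspicious(self, address: str) -> None:
        """Log one suspicious request and block the address if repeated."""
        self.suspicious[address] += 1
        if self.suspicious[address] >= self.threshold:
            self.blocked.add(address)

    def allows(self, address: str) -> bool:
        return address not in self.blocked
```

For example, three calls to `report_suspicious("203.0.113.5")` make `allows("203.0.113.5")` return False while other addresses are unaffected.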
  • 176. Agenda • What is security? • Understanding the threat • Architectural approaches to security • Designing for security • Summary
  • 177. So How Do We Decide? • There are many options, which ones are required? • What are the side effects of selecting these security mechanisms?
  • 178. Fit for Purpose • It is (hopefully) clear that each of these techniques addresses a different concern • What concerns does your organization have? – This depends on the business assets that need protection – And the ways in which these assets could be compromised given the system
  • 179. Threat Modeling • Threat Modeling and Analysis in a nutshell: – Identify the business asset to protect – Brainstorm the known threats to the system – Rank the threats by decreasing risk – Choose techniques to mitigate the threats – Choose appropriate technologies from the identified techniques
  • 180. Business Asset • The reason for security is to protect some aspect of the business • You need to identify those aspects of the business that need protection • You also need to determine what “protection” means
  • 181. Brainstorm Threats • Given a particular design what might happen to compromise the business asset? • You should think about these from two perspectives – Likelihood – Impact • At this point you don’t worry about if they need mitigation
  • 182. Rank the Threats • Based on the likelihood and the impact you can determine the “risk exposure” – Look at risk management techniques • Prioritize the risks according to the exposure • Determine the threshold that requires mitigation
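A sketch of the ranking step, assuming the common risk management definition of exposure as likelihood multiplied by impact. The 1 to 5 scales, the example threats, their scores, and the threshold of 8 are all made up for illustration.

```python
def rank_threats(threats, threshold):
    """Rank threats by exposure and flag those that need mitigation.

    `threats` is a list of (name, likelihood, impact) tuples, each scored
    on a hypothetical 1 to 5 scale. Exposure is likelihood * impact.
    """
    scored = [(name, likelihood * impact) for name, likelihood, impact in threats]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    # The third element says whether the exposure reaches the threshold.
    return [(name, exposure, exposure >= threshold) for name, exposure in scored]


# Hypothetical brainstormed threats: (name, likelihood, impact).
example = [
    ("VM escape", 1, 5),
    ("SQL injection", 4, 5),
    ("password guessing", 3, 3),
]
ranking = rank_threats(example, threshold=8)
# ranking lists SQL injection (20) and password guessing (9) as needing
# mitigation, and VM escape (5) as below the threshold.
```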
  • 183. Mitigation Techniques • Look for generic patterns that will mitigate the risks • Mitigate means lower the risk exposure to a tolerable level – You lower the exposure by reducing the likelihood or reducing the impact – A tolerable level means below the threshold defined previously
  • 184. Choose Technologies • Basically you need to map the generic pattern to some concrete solution • This is where you factor in the costs • Costs could come in terms of level of effort to implement • Costs could also come in terms of tradeoffs – You might need to iterate these steps
  • 185. Consider Trade Offs • Most of these mechanisms adversely impact performance – Blindly selecting these capabilities can bring the system to a standstill • They also have an impact on the flexibility of the system • Balancing concerns is key
  • 186. References • STRIDE: http://msdn2.microsoft.com/en-us/library/aa302419.aspx • Hinton, Hondo, Hutchison: Security Patterns within a Service Oriented Architecture, IBM 2005 • Hafiz, Johnson: Security Patterns and their Classification Schemes • Thomas Erl: Service Oriented Architecture, Chapters 4 and 11 • SEI/CERT OCTAVE: Operationally Critical Threat, Asset, and Vulnerability Evaluation: http://www.cert.org/octave • Manadhata et al.: Measuring the Attack Surfaces of Two FTP Daemons, 2006