4. Motivation
Today’s Web applications are complex
• Complex applications -> Modules
• Modules -> Separate Teams
• Separate Teams -> No unified security model
Security is a cross-cutting concern; we need an
abstraction for expressing policies and
enforcing them across the entire application
7. Deputy Confusion in Web Apps
John’s Browser Cloud Application
App John
String
Logic
App Jane
Jane’s Browser
App
Backend Server E-mail Server Database
8. Deputy Confusion at Facebook
Test your privacy settings by displaying your
profile as it is shown to your friends
10. Deputy Confusion at Facebook
Victor Victor
Instant
Nickolai’s Profile: Victor Feed:
Victor’s Profile: Nickolai Victor
Chat:
Victor
Victor’s Feed: Victor’s Friends:
Nickolai Nickolai
11. Deputy Confusion at Facebook
“Facebook Chat is now down for maintenance. The feature was
presumably disabled following a report that exposed a Facebook security
bug that allowed users to access and view friends’ live chats, friend
requests and friends in common.
The report indicates that access to this personal information was
accessible via Facebook’s privacy settings, with the Preview My Profile
feature creating the loophole to access the private live chats of friends.
With Preview My Profile, users can view how their profile appears to any
given Facebook friend. The bug apparently let those users see the live
chats and friend requests of the friend in question.
Unfortunately for the company, this is not the first time users’ personal
information has been exposed without consent. Earlier this year, user
e-mail addresses were exposed in a hiccup following a site update.”
12. Deputy Confusion in Web Apps
John’s Browser Cloud Application
App John
String
Logic
App Jane
Jane’s Browser
App
Backend Server E-mail Server Database
13. Encoding Confusion in Web Apps
Correct login:
• name: pwnall
• password: awesome
Correct login query:
SELECT * FROM users WHERE name=“pwnall” AND password=“awesome” LIMIT 1
Password-less login:
• name: pwnall
• password: awesome” OR “”=“
Password-less login query:
SELECT * FROM users WHERE name=“pwnall” AND password=“awesome” OR “”=“” LIMIT 1
15. Encoding Confusion in Web Apps
Browser Application Server
HTML View
CSS JavaScript String
Model
String
HTTP Request Controller
Form Cookies String
Text SQL
Backend Server E-mail Server Database
16. Encoding Confusion in Web Apps
params[:user]
Field      Value
email      costan@mit.edu
password   mit
password2  mit
Code:
@user = User.new(params[:user])
@user.save
Database:
email       password  admin
costan@mit  mit       false
it@mit      secret    true
…
17. Encoding Confusion in Web Apps
params[:user]
Field      Value
email      costan@mit.edu
password   mit
password2  mit
admin      true
Code:
@user = User.new(params[:user])
@user.save
Database:
email       password  admin
costan@mit  mit       true
it@mit      secret    true
…
18. Encoding Confusion at GitHub
"The root cause of the vulnerability was a failure to properly check incoming
form parameters, a problem known as the mass-assignment vulnerability,"
GitHub co-founder Tom Preston-Werner wrote in a blog post on Sunday. "In
parallel to the attack investigation we initiated a full audit of the GitHub
codebase to ensure that no other instances of this vulnerability were
present."
There is little doubt that the vulnerability was serious. As Homakov himself
noted on his blog, it gave him access to wipe any post in the Rails project and
even "pull/commit/push in any repository on GitHub". He said "lots of Rails
apps" were similarly vulnerable.
20. Eliminate the Confusion!
Add Labels to Data
Labels address deputy confusion
• This text was typed by Victor
• Only show this to Victor’s friends
Labels address encoding confusion
• Unsafe text supplied by users
• Safe to splice in a HTML page
• Safe to splice in a SQL query
Filter Output Data
• Prevent deputy confusion
– Check security policies before making database changes
– Check privacy policies before outputting data to the user
• Prevent encoding confusion
– Only output HTML-safe pages
– Only issue SQL-safe database queries
21. Encoding Confusion in Web Apps
params[:user]
Field      Value
email      costan@mit.edu
password   mit
password2  mit
admin      true
Code:
@user = User.new(params[:user])
@user.save
Database:
email       password  admin
costan@mit  mit       true
it@mit      secret    true
…
22. Eliminating Encoding Confusion
params[:user]
Field      Value
email      costan@mit.edu
password   mit
password2  mit
admin      true
Code:
@user = User.new(params[:user])
Blocked: No security policy for user dictionaries
Database:
email       password  admin
costan@mit  (not created)
it@mit      secret    true
…
23. Eliminating Deputy Confusion
params[:user]
Field      Value
email      costan@mit.edu
password   mit
password2  mit
admin      true
Code:
@user = User.new(params[:user])
@user.save
Blocked
Security policy: only admins can write the admin field
Field     Policy
email     Users can edit their own
password  Users can edit their own
admin     Admins can edit any
26. Data Flow Assertions in Rails
• Labeling and Filtering
– Inserted automatically in the Rails stack
• Label propagation
– Hard to do without changing the interpreter
• API for security policies
– Domain-Specific Language (DSL) for model code
27. Labels and Filters in Rails
Database
Request
Model
Controller
Rack
Response View
29. Labels and Filters in Rails
Database
Filter Label
queries results
Request Label
input
Security policies
Response Filter
output
30. Labels and Filters in Rails
Database
Filter Label
queries results
Request Label
input
Model
Controller Security policies
Rack
Response View
Filter
output
31. Label Propagation:
Only show this to Victor’s friends
Unsafe text supplied by users
Safe to splice in a HTML page
(646) 434-8887
<dl>
<dt>Phone number:</dt>
<dd>(646) 434-8887</dd>
</dl>
<dl>
<dt>Phone number:</dt>
<dd><%= phone %></dd>
</dl>
Privacy labels (for deputy confusion) propagate automatically
32. Label Propagation:
Only show this to Victor’s friends
Unsafe text supplied by users
Safe to splice in a HTML page
(646) 434-8887
<dl>
<dt>Phone number:</dt>
<dd>(646) 434-8887</dd>
</dl>
<dl>
<dt>Phone number:</dt>
<dd><%= phone %></dd>
</dl>
Unsafe text labels propagate automatically
Other encoding labels do not propagate automatically
33. Label Propagation:
(646) 434-8887 Only show this to Victor’s friends
Unsafe text supplied by users
HTML escape Safe to splice in a HTML page
(646) 434-8887
<dl>
<dt>Phone number:</dt>
<dd>(646) 434-8887</dd>
</dl>
<dl>
<dt>Phone number:</dt>
<dd><%= phone %></dd>
</dl>
Unsafe text labels propagate automatically
Other encoding labels do not propagate automatically
34. Label Propagation:
Only show this to Victor’s friends
Unsafe text supplied by users
Safe to splice in a HTML page
(646) 434-8887
<dl>
<dt>Phone number:</dt>
<dd>(646) 434-8887</dd>
</dl>
<dl>
<dt>Phone number:</dt>
<dd><%= phone %></dd>
</dl>
Operations on labeled data are non-trivial, and
making them fast is challenging.
35. Label Propagation:
(646) 434-8887 Only show this to Victor’s friends
Unsafe text supplied by users
HTML escape Safe to splice in a HTML page
(646) 434-8887
<dl>
<dt>Phone number:</dt>
<dd>(646) 434-8887</dd>
</dl>
<dl>
<dt>Phone number:</dt>
<dd><%= phone %></dd>
</dl>
Operations on labeled data are non-trivial, and
making them fast is challenging.
Show of hands: who used the Internet in 1995? How about 2005? Who uses Facebook now? You’ve gotta admit, we’ve come a long way. Look at how many things you can do with this Web application. How do you build something this complex without having it crash millions of times a day? Use modularization, as you would to build any other complex software system. Break down the application into modules, and have different people in the development team work on different modules. If you try to break down Facebook’s page into modules…
…you’ll see that it “comes apart” quite easily. The modules are disjoint, so developers can work on their own features without stepping on each other’s toes. Very efficient!
However, this efficiency has its price. The applications we build are so complex that no programmer can understand the entire application in detail. We have security experts, or entire teams devoted to security. But this means that all the other programmers don’t concern themselves with security!
Most security problems in cloud applications stem from two issues that we call deputy confusion and encoding confusion. I’d like to explain these issues, so I can show you what we’re doing to mitigate them.
A cloud application stores and processes data on behalf of its users, but uses a single set of credentials to access the database. This means that the database fully trusts every request that it receives from the application. So there is no mechanism to protect against a malicious user who manages to trick the application into sending a dangerous request to the database. We say that the database is a confused deputy, because it processes requests without knowing on whose behalf it’s doing the work. Furthermore, since the database is not aware of the application’s multiple users, the application code is responsible for keeping track of data ownership, as well as of any security requirements. This means that every line in the application code must be written with security in mind, which is really hard and error-prone. Last but not least, we have a similar situation on the browser side: the browser fully trusts all the JavaScript code received from the application, and executes it with the application’s credentials.
The confused deputy problem is not limited to the application’s interfaces with the database and browser. Large applications, such as Facebook, are broken into loosely coupled modules. Now each module is prone to deputy confusion, and all it takes is one tiny mistake at the interface between two modules. Did you know Facebook has a feature that shows you someone else’s view of your profile, so you can check your privacy settings? Show of hands: how many of you used this?
Here’s what the feature looks like. In this example, I’m making sure that Nickolai, my professor, doesn’t get to see any pictures of me drinking beer. Please look at each module, and think: which user does it represent? Which modules should use my credentials, and which modules should use Nickolai’s credentials?
Here’s my answer. Did you get all of them right? If you couldn’t solve this right away, think of the poor guy who had to code this up! One single mistake means private information leakage, and front-page news coverage!
Yup. Exactly. Front-page news coverage. Facebook’s engineers messed up when they coded this feature! We could say that Facebook has bad engineers and move on. But, let’s face it, Facebook is quite wealthy, and does attract good coders. So we must accept that programmers make mistakes. We can’t fight this problem by asking application developers to “pay more attention”; we have to give them better tools!
That was a rather extreme example. Most applications look like this, not like Facebook, so let’s go back to the interfaces between the cloud application, the browser, and the database. When discussing confused deputies, I said “if a malicious user tricks the application into sending a dangerous query to the database”… how does that happen? Why would an application send a bad query to the server?
Requests sent from the cloud application to the database use a language called SQL. I have a couple of examples up here. As you can see, SQL is a text-based language. It turns out that the easiest way of creating SQL requests is to take the user’s input, and stick it straight into a template. Look at this login example: a user provides a name and a password, and the application checks them against its database. If the name and password match, the user is allowed in. See how the application takes the user strings, and combines them with a pre-defined query string? Now look on the right: a clever user can take advantage of this and log in without a password! The problem here is that the application combines the name and password strings with the query string, as if they were all SQL strings. If all the strings were pieces of SQL, they could be combined together like this. But the truth is, the name and password strings are free-form text, so they have to be SQL-escaped, which makes them SQL strings, before they can be combined with other SQL strings.
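To make the login example concrete, here is a minimal sketch of both ways of building the query. It assumes single-quoted SQL string literals (the slide uses double quotes), and `sql_escape`, `unsafe_login_query`, and `escaped_login_query` are hypothetical helpers written for this illustration, not part of Rails or of any database driver.

```ruby
# Hypothetical helper: doubles single quotes, the standard SQL way to
# escape a string literal. Real applications should prefer the
# parameterized queries offered by their database driver.
def sql_escape(text)
  text.gsub("'", "''")
end

# Splices raw user input straight into the query template (vulnerable).
def unsafe_login_query(name, password)
  "SELECT * FROM users WHERE name='#{name}' AND password='#{password}' LIMIT 1"
end

# SQL-escapes the free-form text before combining it with the SQL template.
def escaped_login_query(name, password)
  "SELECT * FROM users WHERE name='#{sql_escape(name)}' " \
    "AND password='#{sql_escape(password)}' LIMIT 1"
end

attack = "awesome' OR ''='"
puts unsafe_login_query("pwnall", attack)   # the OR clause becomes part of the SQL
puts escaped_login_query("pwnall", attack)  # the OR clause stays inside the password literal
```

In the first query the attacker's text changes the structure of the SQL statement; in the second it is only ever data.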
This sort of attack is called a SQL injection attack, and it used to be so common that there’s a Web comic dedicated to it!
By the way, this isn’t just a database layer problem! Web development consists almost entirely of string-based languages, and the problem I mentioned spans all these languages. For example, the user interfaces that you see in the browser are HTML, combined with CSS and JavaScript. All these are text-based languages, which means they’re prone to the same issue as SQL queries. In fact, if you Google for “cross-site scripting attack”, you’ll find that a lot of companies had this issue. The story doesn’t stop here for complicated applications! Want to send e-mail? Your application needs to talk to an e-mail server, using SMTP, another text-based language. And so on, and so forth.
Let’s look at a more complex case of encoding confusion. This is a user sign-up form. You give it an e-mail address and a password, and it makes an account for you. Let’s take a look at what happens behind the scenes. When you click that “Create Account” button, your browser sends all the data you filled out to the application, as a gob of text. Rails, the application framework that I’m working with, takes this text, and turns it into a dictionary of key-value pairs, for convenient access. You can see the dictionary in the top-right. Then there is some code (actually, there’s very little code) that takes the values in this dictionary, and puts them into a new User object. Then the user is saved to the database. Very nice and simple, right?
Well, suppose a bad user guesses that our users have an “admin” field in the database, and makes his browser send *these* values to the application. Can you guess what happens? … Oh, I already gave that away. The code will work exactly like before: it will take the input values, and put them into a new User object. So the attacker is now an administrator in our application. That can’t be a good thing! How do we solve this problem? We can’t go around saying “don’t use this feature”, because then we’d be asking programmers to give up productivity for security, and that’s not going to happen. The real problem here is that the dictionary created from the user’s input looks just like a dictionary produced by the application code, which *should* be completely trusted.
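The attack can be reproduced in a few lines of plain Ruby. This is only a sketch: the `User` class below is a hypothetical stand-in that imitates mass-assignment, not ActiveRecord itself, and the `password2` confirmation field from the slide is left out for brevity.

```ruby
# Hypothetical stand-in for a Rails model; it copies every key of the
# dictionary into the object, which is what mass-assignment does.
class User
  attr_accessor :email, :password, :admin

  def initialize(attributes = {})
    @admin = false
    attributes.each { |field, value| public_send("#{field}=", value) }
  end
end

# The dictionary an honest browser sends.
honest = User.new({ "email" => "costan@mit.edu", "password" => "mit" })

# The dictionary an attacker sends, with one extra guessed field.
attacker = User.new({ "email" => "costan@mit.edu", "password" => "mit", "admin" => true })

puts honest.admin
puts attacker.admin
```

Nothing in `User.new` can tell the two dictionaries apart, which is exactly the confusion described above.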
I’m sorry for talking about mass-assignment, it’s a bit too much for a morning presentation! But I really wanted to explain it to you, so you can understand this. GitHub is an application that stores source code in the cloud, and helps programmers collaborate. They’re the best at what they do. And yet, less than a month ago, a security researcher uncovered a vulnerability that would have let a malicious user modify anyone else’s code. It turns out that one piece of GitHub was vulnerable to the mass-assignment attack that I described earlier. And one piece is all it takes!
So, writing secure Web applications is really really hard. But don’t worry, we’re going to make all the problems go away! So let me tell you *how* we’re going to make the problems go away.
Do you know this joke? Some guy goes to the doctor and says “hey, doctor, if I twist my leg like this, it hurts”, and the doctor says “then stop twisting your leg like that!” In our case, the advice would be “stop being confused!” So let’s start tracking the information that we need to avoid deputy confusion and encoding confusion! Let’s call this additional information “labels”. The left side of this slide has some labels that might be useful. The first two labels solve deputy confusion by tracking the data’s owner and his security requirements. The labels on the bottom solve encoding confusion.
Let’s see how labels work to prevent the mass-assignment attack that I described earlier. Remember, the problem was that a malicious user guessed the name of a sensitive field in our application, and took advantage of a convenience feature in Rails, which takes any input coming to the application, and saves it to the database.
Now here’s what happens when the application uses our plug-in. The Rails framework still creates a dictionary out of the data, but this time the dictionary is labeled to reflect that its content is controlled by a user. That label will not allow Rails to create a User object out of the dictionary data. The user will be told an internal error occurred, and the database will remain unchanged. Does this mean we disabled mass-assignment? Not quite: the application developer can get that convenience back by declaring a security policy for the User model.
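One way to picture the labeled dictionary is sketched below. Every name in it (`UserControlled`, `label_user_input`, `PolicyViolation`, and the stand-in `User` class) is hypothetical and chosen for illustration; this is not the plug-in's actual API.

```ruby
class PolicyViolation < StandardError; end

# Marker module acting as the label "content controlled by a user".
module UserControlled; end

# Hypothetical labeling step applied to dictionaries built from requests.
def label_user_input(hash)
  hash.dup.extend(UserControlled)
end

# Stand-in model that refuses mass-assignment from labeled dictionaries.
class User
  attr_accessor :email, :password, :admin

  def initialize(attributes = {})
    if attributes.is_a?(UserControlled)
      raise PolicyViolation, "no security policy for user dictionaries"
    end
    @admin = false
    attributes.each { |field, value| public_send("#{field}=", value) }
  end
end

params = label_user_input({ "email" => "costan@mit.edu", "password" => "mit", "admin" => true })
begin
  User.new(params)
rescue PolicyViolation => e
  puts "Blocked: #{e.message}"
end

# A dictionary produced by application code carries no label and is trusted.
trusted = User.new({ "email" => "it@mit.edu", "password" => "secret", "admin" => true })
puts trusted.admin
```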
After our application’s developer puts a security policy in place, the Rails code will filter the dictionary containing user input, and remove any fields that aren’t covered by the security policy. Before the User object is written to the database, a filter checks the security policy, which is shown on the bottom-left. In this example, the email, password, and admin values are all labeled to indicate that they were provided by the new user. The labels are checked against the security policy before the new user is saved to the database. This triggers an error, because the admin value is supplied by the new user, but the security policy says only administrators can set that value.
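The policy check can be sketched as a small table of rules, one per field, mirroring the slide. The `POLICY` constant, the `check_and_filter` helper, and the shape of the `author` hash are all assumptions made for this example; the real policy language is not shown in the talk.

```ruby
class PolicyViolation < StandardError; end

# Hypothetical policy table: each field maps to a rule deciding whether
# `author` may write that field on `record`.
POLICY = {
  email:    ->(author, record) { author[:email] == record[:email] },  # users can edit their own
  password: ->(author, record) { author[:email] == record[:email] },  # users can edit their own
  admin:    ->(author, _record) { author[:admin] }                    # admins can edit any
}.freeze

# Drops fields the policy does not cover, then checks every remaining
# field against its rule before anything would be saved.
def check_and_filter(input, author)
  covered = input.select { |field, _value| POLICY.key?(field) }
  covered.each_key do |field|
    unless POLICY[field].call(author, covered)
      raise PolicyViolation, "#{author[:email]} may not write #{field}"
    end
  end
  covered
end

new_user = { email: "costan@mit.edu", admin: false }

# Honest sign-up: password2 is not covered by the policy and is removed.
p check_and_filter({ email: "costan@mit.edu", password: "mit", password2: "mit" }, new_user)

# Attack: the admin value was supplied by a non-admin, so the save is blocked.
begin
  check_and_filter({ email: "costan@mit.edu", password: "mit", admin: true }, new_user)
rescue PolicyViolation => e
  puts "Blocked: #{e.message}"
end
```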
Alright, demo time!
Time permitting, let me talk a little bit about how we’re going to code this up.
Here’s what we need to do to get the functionality that I just described. First, we need to decide where we’re going to label and filter the data. Next, we need to add magic to the programming language, so the labels are automatically propagated. Last, we need to come up with a language that lets programmers describe the security policies.
This is what a Rails application looks like. At the front of the application, there’s Rack, turning HTTP requests into Ruby objects, and Ruby objects into HTTP responses. Then there’s controller code, which is responsible for putting together all the bits and pieces of data needed by the response. The controller achieves this by coordinating across models, which implement the application’s business logic and interface with the database. Last, the controller hands off the data to the view code, which is responsible for formatting it in a nice way that will make the user happy. Let’s see how labels and filters fit in this picture.
Here’s a neat trick: abstraction! We look at the entire Rails application as a system, a big black box.
Data coming into the system must be labeled. Data that’s about to leave the system must be filtered, to make sure that it meets the security policies specified by the application.
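The black-box view can be sketched as a wrapper around the application. This is a simplified illustration, not Rack's real interface: `Labeled`, `LabelAndFilter`, and the two toy applications are hypothetical names, and the application is modeled as a lambda from a params dictionary to a labeled response.

```ruby
require "erb"

# Hypothetical wrapper pairing a value with its labels.
Labeled = Struct.new(:value, :labels)

# Treats the application as a black box: labels data on the way in and
# filters data on the way out.
class LabelAndFilter
  def initialize(app)
    @app = app
  end

  def call(params)
    # Label everything that enters the system.
    labeled = params.transform_values { |v| Labeled.new(v, [:user_supplied]) }
    output = @app.call(labeled)
    # Filter everything that is about to leave the system.
    if output.labels.include?(:user_supplied)
      raise "blocked: response contains unescaped user input"
    end
    output.value
  end
end

# Echoes user input back untouched, so the label is still attached.
careless_app = ->(params) { params[:name] }

# HTML-escapes the input and relabels the result as safe HTML.
careful_app = lambda do |params|
  Labeled.new(ERB::Util.html_escape(params[:name].value), [:html_safe])
end

puts LabelAndFilter.new(careful_app).call({ name: "<b>Victor</b>" })
begin
  LabelAndFilter.new(careless_app).call({ name: "<b>Victor</b>" })
rescue RuntimeError => e
  puts e.message
end
```

The application code in the middle never has to mention security; the wrapper at the boundary does the checking.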
And that’s about it! It turns out that Rack needs labeling and filtering, since it interacts with the user, and the model needs labeling and filtering, since it interacts with the database. Security policy descriptions belong in the model, because
From a programming standpoint, label propagation is the most difficult part. Propagation is essential to the system, because it lets people write their code without worrying about security. Let’s take a look at a couple of examples. The box with green text on the top-left is my phone number. I only want to show my phone number to my friends, so there’s a privacy label associated with it. The box below is a template for an HTML page that shows my phone number. Now suppose someone goes to the page displaying my phone number, so the Web application has to produce an HTML fragment with my phone number. It will do that by combining the template with my actual phone number, and get the HTML fragment on the right. How do we label that? Show of hands: who thinks it’s private information? Who thinks it’s not private information? Hey, as far as I’m concerned, it has my phone number in it, so it’s private! So, some labels propagate automatically, which is a fancy way of saying that they “stick” with the data they’re labeling, as it is processed by the application.
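Here is a minimal sketch of a privacy label that sticks to its data through string concatenation. `LabeledString` and the `:victors_friends_only` label are hypothetical names; the sketch only covers the `+` operation on a wrapper class, whereas propagating labels through every string operation is the hard part that the slides say is difficult without changing the interpreter.

```ruby
require "set"

# Hypothetical string wrapper that carries a set of privacy labels.
class LabeledString
  attr_reader :text, :labels

  def initialize(text, labels = [])
    @text = text
    @labels = Set.new(labels)
  end

  # Concatenation keeps the privacy labels of both operands.
  def +(other)
    other = LabeledString.new(other.to_s) unless other.is_a?(LabeledString)
    LabeledString.new(text + other.text, labels | other.labels)
  end

  def to_s
    text
  end
end

phone = LabeledString.new("(646) 434-8887", [:victors_friends_only])
fragment = LabeledString.new("<dd>") + phone + "</dd>"

puts fragment
puts fragment.labels.include?(:victors_friends_only)
```

The fragment contains the phone number, so it inherits the phone number's privacy label.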
Let’s look at the same process from a different perspective. We still have the same phone number and an HTML template on the left, and the same HTML fragment on the right. But this time, we care about encoding. The phone number is something I typed into the Web application, so it’s unsafe. The template on the bottom is part of the application’s source code, so we know it’s safe HTML. What happens when we combine the two? Show of hands: who thinks the result is safe HTML? Who thinks it’s not? Well, my phone number isn’t safe HTML, it might contain a cross-site scripting attack, and that makes the whole result unsafe. The take-away is that encoding labels should not propagate automatically. If you start out with safe HTML and change it, chances are the result will not be safe HTML anymore.
By the way, the right thing to do, in this case, is to HTML-escape the phone number, which produces a string that can be safely included in HTML pages. After HTML escaping, the result is labeled as safe HTML, so the HTML fragment coming from the “plus” operation would also be safe HTML that can be sent to the user.
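The encoding side can be sketched the same way. In this hypothetical `HtmlString` class the "safe HTML" label survives concatenation only when both operands are safe, and HTML-escaping is what produces a safe string from unsafe text. `ERB::Util.html_escape` is Ruby's standard escaping helper; everything else is made up for the example.

```ruby
require "erb"

# Hypothetical string wrapper that tracks whether its text is safe HTML.
class HtmlString
  attr_reader :text

  def initialize(text, safe:)
    @text = text
    @safe = safe
  end

  def html_safe?
    @safe
  end

  # The result is safe only if both operands are safe, so the
  # "safe HTML" label never spreads to unsafe text.
  def +(other)
    HtmlString.new(text + other.text, safe: html_safe? && other.html_safe?)
  end

  # HTML-escaping turns unsafe text into safe HTML.
  def escape
    return self if html_safe?
    HtmlString.new(ERB::Util.html_escape(text), safe: true)
  end
end

open_tag  = HtmlString.new("<dd>", safe: true)    # from the template
close_tag = HtmlString.new("</dd>", safe: true)   # from the template
phone     = HtmlString.new("(646) 434-8887", safe: false)  # typed by a user

puts (open_tag + phone + close_tag).html_safe?
puts (open_tag + phone.escape + close_tag).html_safe?
```

Only the second fragment would pass a filter that insists on HTML-safe output.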
And, just for completeness, here’s what the labels look like when the phone number is HTML-escaped correctly.