A Structure Preserving Approach for Securing XML Documents
1. A Structure Preserving Approach
for Securing XML Documents
TrustCol-2007
The Department of Computer Science
Purdue University
Mohamed Nabeel
nabeel@cs.purdue.edu
2. Outline
• Introduction and Basic Concepts
• Annotation and Encoding Scheme
• Enforcing and Verifying Security
Requirements
• Experimental Results
• Conclusion and Future Work
3. Secure Sharing
• Hierarchical Data such as XML
• Correct Data
• Access Control
[Figure: Alice's document tree (A; B C D; E F G H I J; K L) and the subtrees shared with recipients such as Bob (B; E F; K L)]
4. Secure Sharing – Access Control
[Figure: applying the access control policy to the full tree (A; B C D; E F G H I J; K L) yields the subtree Bob receives (B; E F; K L)]
5. Secure Sharing – Correct Data
[Figure: the subtree sent to Bob (B; E F; K L) passes through Eve. In one case Eve has modified the values (Bob receives B; E X; Y L); in the other Eve has dropped elements (Bob receives B; E F).]
6. Why Preserving Structure
• Partial access to secured documents
• Applying content filters
• Querying secured documents
Late processing → High scalability
7. Message Level Security
• Point-to-point (P2P) vs. end-to-end (E2E)
– Transport level security (HTTPS, IPSec, etc.) is sufficient to provide P2P security
– But E2E requires more than transport level security
– We need message level security
[Figure: Source → Intermediary → Destination. P2P security covers each hop; E2E security spans source to destination.]
9. XML Node Orderings
• Two types of ordering
1.Hierarchical ordering
2.Sibling ordering
• What orderings are significant?
• What is the relationship between them?
• How do schema validation tools treat
these orderings?
10. XML Node Orderings
• Is Hierarchical ordering significant?
– Yes, it is!
• Is Sibling ordering significant?
– Depends on the application
Two orderings → Two-level structural integrity
11. XML Node Orderings
Document 1:
<Review>
<p>Einstein is a
<b>genius</b>;
<b>ordinary</b>
people may not understand his work.</p>
</Review>
XSLT output: Einstein is a genius; ordinary people may not understand his work.

Document 2 (the two <b> siblings swapped):
<Review>
<p>Einstein is a
<b>ordinary</b>;
<b>genius</b>
people may not understand his work.</p>
</Review>
XSLT output: Einstein is a ordinary; genius people may not understand his work.
Sibling ordering in document centric
applications is significant
12. XML Node Orderings
Document 1:
<person>
<firstname>nabeel</firstname>
<country>sri lanka</country>
<major>cs</major>
</person>

Document 2 (siblings reordered):
<person>
<country>sri lanka</country>
<firstname>nabeel</firstname>
<major>cs</major>
</person>

Both map to the same row of the person table:
firstname | country | major
nabeel | sri lanka | cs

Both also map to the same object:
Class Person {
String firstname;
String country;
String major;
};
Sibling ordering in data centric
applications may not be significant
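A minimal sketch of why sibling order may not matter in a data-centric application, assuming the application reads each person element into a record keyed by child name. The helper name to_record is hypothetical and not part of the original slides.

```python
import xml.etree.ElementTree as ET

def to_record(xml_text: str) -> dict:
    """Read the children of a flat element into a name -> value record."""
    return {child.tag: child.text for child in ET.fromstring(xml_text)}

doc1 = ("<person><firstname>nabeel</firstname>"
        "<country>sri lanka</country><major>cs</major></person>")
doc2 = ("<person><country>sri lanka</country>"
        "<firstname>nabeel</firstname><major>cs</major></person>")

# The two documents differ only in sibling order, so the records compare equal.
same = to_record(doc1) == to_record(doc2)
```

A document-centric consumer, such as the XSLT rendering on the previous slide, would not be able to ignore the order this way.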
13. Information Leakage
[Figure: three panels (Direct Leakage, Indirect Leakage, No Leakage) showing the document tree with parts encrypted under keys K1 and K2. Bob only knows K1. Hiding the existence of the elements Bob cannot access gives no information leakage.]
14. One Example
• Delta-publishing
[Figure: first message at t1, second message at t2, and the delta-message at t2]
The smallest unit of change: An Element
15. Our Approach
• Recognize two-level ordering
• Provide E2E security for hierarchical data
• Reason about security at the smallest
possible change
• Minimal indirect information leakage
16. Next
• Introduction and Basic Concepts
• Annotation and Encoding Scheme
• Enforcing and Verifying Security
Requirements
• Experimental Results
• Conclusion and Future Work
17. XML Document
• A graph G = {V, v, E, f, g}
– V = Ve ∪ Va ∪ Vr, where Ve = {x | x is an element}, Va = {x | x is
an attribute}, Vr = {x | x is a node not in Ve ∪ Va}
– v = document root
– E = Ee ∪ Ea ∪ Er, where Ee = {e | e is an edge representing an
element-element connection or a link}, Ea = {e | e is an edge
representing an element-attribute connection}, Er = {e | e is an
edge not in Ee ∪ Ea but starts from an element}
– f: E → L, where L = {l | l is a node name or an attribute name or
a pre-defined label}; f is called the labeling function
– g: (Ve, i) → Ver, where g returns the ith child of an element in Ve, and Ver = Ve ∪ Vr
18. XML Document
• Example
<?xml version="1.0" encoding="UTF-8" ?>
<quote type='bid'>
<market>NY</market>
<price cur='USD' size='5m'>750</price>
</quote>
[Figure: graph of the document rooted at v. Circles are elements (quote, market, price), squares are attributes (type = bid, cur = USD, size = 5m), and ellipses are other nodes (the text nodes NY and 750).]
19. Properties of the Annotation Scheme
• Two independent annotation schemes for
– Hierarchical ordering and
– Sibling ordering
• Time complexity = O(height of the XML DOM tree)
• Provides the flexibility to incrementally
annotate
21. Hierarchical Ordering
• Should be able to unambiguously identify
parent-child relationships
• Annotate each element with its parent HID
• Element HIDs need not be unique
• Example: using XPath as HIDs
– Element x is the parent of y
– Annotate y with h(XPx || name of y), where h
is a collision-resistant hash function and XPx
is the XPath of x. XPath sequence numbers
are omitted to prevent indirect
information leakage.
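The XPath-based example can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 stands in for h, a "/" separator joins the parent path and the element name, and the attribute name hid is hypothetical. Positional indices such as price[2] are deliberately left out of the path.

```python
import hashlib
import xml.etree.ElementTree as ET

def hid(parent_xpath: str, name: str) -> str:
    # h(XPx || name of y); SHA-256 and the "/" separator are assumptions.
    return hashlib.sha256(f"{parent_xpath}/{name}".encode("utf-8")).hexdigest()

def annotate(elem: ET.Element, parent_xpath: str = "") -> None:
    # Store the HID in a hypothetical "hid" attribute, then recurse.
    elem.set("hid", hid(parent_xpath, elem.tag))
    for child in elem:
        annotate(child, f"{parent_xpath}/{elem.tag}")

root = ET.fromstring(
    "<quote><market>NY</market><price>750</price><price>760</price></quote>"
)
annotate(root)
```

Because no sequence numbers enter the hash, the two price siblings receive the same HID, which matches the point that element HIDs need not be unique.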
22. Sibling Ordering
• Maintain the following condition
– Given that elements x and y are siblings and x
is to the left of y, seqx < seqy where seqx and
seqy are secure random numbers assigned to
x and y respectively.
Secure random numbers make it difficult to infer
anything about hidden elements, thus
preventing indirect information leakage.
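One way to satisfy the condition seqx < seqy is to draw distinct secure random numbers and hand them out in sorted order, left to right. The sketch below assumes 64-bit values; the function name sibling_seqs and the bit width are illustrative choices, not taken from the slides.

```python
import secrets

def sibling_seqs(count: int, bits: int = 64) -> list[int]:
    """Return `count` strictly increasing secure random numbers."""
    seqs: set[int] = set()
    while len(seqs) < count:
        # secrets.randbits draws from the OS cryptographic random source.
        seqs.add(secrets.randbits(bits))
    # Sorting distinct values gives a strictly increasing sequence.
    return sorted(seqs)

# Assign the values to five siblings in document order.
seqs = sibling_seqs(5)
```

The gaps between consecutive values are random, so a recipient who sees only some of the siblings cannot tell from the numbers how many elements were withheld.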
23. Encoding Scheme
[Figure: the document graph before encoding (elements and non-elements) and after encoding (only elements). The text nodes NY and 750 become content attributes of market and price.]
High reduction in |V| and |E| for document-centric
applications.
24. Encoding Scheme
• New graph G' = {V', v, E', f', g'}
• V' = V ∪ {x | x is an attribute for ID, seq or
content} − Vr
• E' = E ∪ {e | e is an attribute-element edge for
ID, seq or content} − Er
• f': V' → L', where L' = L ∪ {ID, seq, content}
• g': (Ve, i) → Ve, where Ve consists only of
elements
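A minimal sketch of the encoding step, assuming text is moved into an attribute named content as the definition of L' suggests. The ID and seq attributes are omitted here, and dropping the tail text of mixed content is a simplification made only to keep the example short.

```python
import xml.etree.ElementTree as ET

def encode(elem: ET.Element) -> None:
    """Move text into a 'content' attribute so only elements remain."""
    if elem.text and elem.text.strip():
        elem.set("content", elem.text.strip())
    elem.text = None
    for child in elem:
        encode(child)
        child.tail = None  # simplification: mixed-content tails are dropped

root = ET.fromstring(
    "<quote type='bid'><market>NY</market>"
    "<price cur='USD' size='5m'>750</price></quote>"
)
encode(root)
encoded = ET.tostring(root, encoding="unicode")
```

After encoding, the tree contains no text nodes, which corresponds to removing Vr and Er from the graph.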
25. Next
• Introduction and Basic Concepts
• Annotation and Encoding Scheme
• Enforcing and Verifying Security
Requirements
• Experimental Results
• Conclusion and Future Work
26. Integrity
• Two types of integrity
– Structural integrity
– Content integrity
• Introduce a new attribute (signed)
• Attribute value = h(E.attrs || E.content)
– h – hash function
– E.attrs – concatenation of attribute name-value pairs of
element E
– E.content – content of element E
• Merkle hash vs. Our approach
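The signed attribute can be sketched as below. SHA-256 stands in for h, and the attribute ordering and separators are assumptions added so that the digest is deterministic; the slides do not specify them. A bare hash is shown because that is what the formula states; a deployed scheme would presumably sign or key this value.

```python
import hashlib
import xml.etree.ElementTree as ET

def signed_value(elem: ET.Element) -> str:
    # E.attrs: name-value pairs, sorted for determinism (an assumption).
    # The "signed" attribute itself is excluded from its own input.
    attrs = "".join(
        f"{k}={v};" for k, v in sorted(elem.attrib.items()) if k != "signed"
    )
    content = elem.text or ""  # E.content
    return hashlib.sha256((attrs + "|" + content).encode("utf-8")).hexdigest()

price = ET.fromstring("<price cur='USD' size='5m'>750</price>")
price.set("signed", signed_value(price))

# Verification recomputes the value for this element alone.
intact = price.get("signed") == signed_value(price)
```

Since the value depends only on the element's own attributes and content, changing one element does not force recomputation over its ancestors, which is the contrast with a Merkle hash.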
27. Integrity
[Figure: the full tree (A; B C D; E F G H I J; K L) and the subtree Bob should receive (B; E F; K L), followed by four tampered versions of what Bob receives, illustrating violations of content integrity, sibling integrity, hierarchical integrity, and completeness.]
28. Confidentiality
• Content of each element is encrypted
• Introduce a new attribute (encrypted)
• Attribute value = keys(keyr||keyr (E.attrs ||
E.content || E.signed))
– keyr – a randomly generated key
– keys – shared key
– E.attrs – concatenation of attribute name-value pairs
of element E
– E.content – content of element E
– E.signed – digital signature computed for E
29. Verifying and Updating
• Each element can be verified
independently
• Hierarchical and Sibling integrity can be
verified independently
• Each element can be updated
independently
• Structure can be updated without affecting
the existing values
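As an illustration of independent verification, sibling integrity can be checked from the sequence numbers alone, without touching content or hierarchy annotations. The sketch assumes each element stores its number as an integer in an attribute named seq; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

def sibling_order_ok(parent: ET.Element) -> bool:
    """True if the children's seq values strictly increase left to right."""
    seqs = [int(child.get("seq")) for child in parent]
    return all(a < b for a, b in zip(seqs, seqs[1:]))

good = ET.fromstring("<B><E seq='17'/><F seq='42'/></B>")
swapped = ET.fromstring("<B><F seq='42'/><E seq='17'/></B>")

good_ok = sibling_order_ok(good)        # siblings in original order
swapped_ok = sibling_order_ok(swapped)  # siblings reordered in transit
```

The check also passes on any subset of the siblings, so a recipient with partial access can still verify the order of the elements it does receive.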
30. Example: Updating
<?xml version="1.0" encoding="UTF-8" ?>
<quote type='bid'>
<market>NY</market>
<price cur='USD' size='5m'>765</price>
</quote>
Re-calculate the signed and encrypted attributes only for this element.
[Figure: the encoded tree (v; quote; market, price), in which each element carries signed and encrypted attributes]
31. Next
• Introduction and Basic Concepts
• Annotation and Encoding Scheme
• Enforcing and Verifying Security
Requirements
• Experimental Results
• Conclusion and Future Work
32. Global vs. Local Annotation
Local Annotation Global Annotation
[Figure: time taken to annotate (ms, axis 0 to 400) vs. number of elements in the XML document (in units of 500, 1 to 8), for local and global annotation]
33. Updating XML Document
Our Scheme W3C Scheme
[Figure: time taken (ms, axis 0 to 800) vs. percentage of the document updated (1 to 5), for our scheme and the W3C scheme]
34. Division of Labor
encoding signing encrypting
[Figure: time taken (axis 0 to 45000) vs. number of elements in the XML document (in units of 500, 1 to 8), broken down into encoding, signing, and encrypting]
35. Outline
• Introduction and Basic Concepts
• Annotation and Encoding Scheme
• Enforcing and Verifying Security
Requirements
• Implementation and Experimental Results
• Conclusion and Future Work
36. Conclusion and Future Work
• We presented an interesting approach to
secure XML documents while preserving
the structure
• We plan to extend the work presented to
– Explore ways to reduce the signing time
– Explore possible hybrid combinations of our
approach and the standard approach
• We are planning to publish the library
under the ASF license