Techniques for static analysis of PHP, including Facebook’s HipHop for PHP (HpHp) and ClamAV. First presented at PHPDay 2012, Verona, Italy, May 19, 2012. Companion source code is available at https://github.com/client9/hphp-tools
3. Static Analysis
• Typically analyzes source code “at rest” for
bugs, security problems, leaks, threading
problems.
• We’ll cover simple checks and HpHp
• Some commercial tools exist too.
Veracode runs off of PHP byte code
http://www.veracode.com/products/static
Nick Galbreath @ngalbreath PHPDay Verona, Italy 2012
4. Dynamic Analysis
Analysis of code while running
• valgrind http://valgrind.org/
• xdebug http://xdebug.org/
• xhprof http://pecl.php.net/package/xhprof
Great tools, but not for this talk.
6. The Littlest Static Analysis
php -l
• Syntax errors should never be committed.
• Syntax errors should never go to prod!
• Make sure dev and prod versions of PHP
are identical
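A minimal sketch of such a check, assuming PHP 7 or later and a CLI `php` binary. `lint_files()` is a hypothetical helper name, not something from the talk's tooling; it simply runs `php -l` on each path and reports the failures.

```php
<?php
// Sketch only: lint_files() is a hypothetical helper, not part of hphp-tools.

/**
 * Run `php -l` on each file and return the paths that fail the syntax check.
 *
 * @param string[] $files paths to check
 * @param string   $php   PHP binary to use (should match the production version)
 * @return string[] paths with syntax errors
 */
function lint_files(array $files, string $php = PHP_BINARY): array
{
    $failed = [];
    foreach ($files as $file) {
        $output = [];
        $cmd = escapeshellarg($php) . ' -l ' . escapeshellarg($file) . ' 2>&1';
        exec($cmd, $output, $status);
        // `php -l` exits with a non-zero status when the file does not parse.
        if ($status !== 0) {
            $failed[] = $file;
        }
    }
    return $failed;
}
```

In a pre-commit hook, a non-empty result would be a reason to reject the commit.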
7. PHP Leading Whitespace
Pre-commit check: every file must start with
exactly #! or <?php, with nothing before it.
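One way this check could look, as a sketch assuming PHP 7 or later. `starts_cleanly()` is a hypothetical name; the function only compares the first bytes of the file contents.

```php
<?php
// Sketch only: starts_cleanly() is a hypothetical helper name.
// Returns true when the source begins with "#!" or "<?php" and nothing else
// (no spaces, blank lines, or byte-order mark) comes first.
function starts_cleanly(string $source): bool
{
    return strncmp($source, '#!', 2) === 0
        || strncmp($source, '<?php', 5) === 0;
}
```

Typical use would be `starts_cleanly(file_get_contents($path))` for each staged file.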
8. PHP Trailing Whitespace
Check that each file either ends exactly with ?>
or has no closing tag at all.
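A rough sketch of the trailing check, assuming PHP 7 or later. `ends_cleanly()` is a hypothetical name, and the check only looks at the tail of the file; it is stricter than PHP itself, which ignores a single newline directly after the closing tag.

```php
<?php
/* Sketch only: ends_cleanly() is a hypothetical helper name.
   A file passes if it does not end in a closing tag at all, or if the
   closing tag is the very last thing in the file. */
function ends_cleanly(string $source): bool
{
    $trimmed = rtrim($source);
    if (substr($trimmed, -2) !== '?>') {
        return true; /* no closing tag at the end, so nothing can leak */
    }
    return $trimmed === $source; /* reject any whitespace after the tag */
}
```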
9. Anti-Virus
On Source Code
• It’s static analysis too!
• Not so much a concern for PHP itself, but do
you have JavaScript, Flash, Word, PowerPoint,
PDFs, or ZIPs in your source tree?
13. Why not use... AST?
http://docs.php.net/manual/en/function.token-get-all.php
• token_get_all($source) takes PHP source as a
string and returns a flat array of tokens in PHP,
the raw material for an Abstract Syntax Tree.
• Orders of magnitude slower -- can’t use for
pre-commit check on large code bases
• Too low level -- need to turn it into an
intermediate representation.
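To make "too low level" concrete, here is a small sketch (PHP 7 or later, tokenizer extension enabled) that prints what `token_get_all` hands back. `token_names()` is a hypothetical helper written for this illustration.

```php
<?php
// Sketch only: token_names() is a hypothetical helper for illustration.
// token_get_all() yields a flat list: each entry is either an
// [id, text, line] array or a bare single-character string such as "(".
function token_names(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $token) {
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}

// print_r(token_names('<?php foo(1);'));
// The call structure (which arguments belong to which function call) is not
// represented; a checker has to rebuild it from this stream.
```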
14. Why not use... CodeSniffer?
http://pear.php.net/package/PHP_CodeSniffer
• Excellent tool, but...
• Based on token_get_all
• SAX-style API
• Too slow for pre/post commit hooks
15. Why not use...
• php-SAT: Orphaned 2009
• php-AST: Orphaned 2008
• phc: active but doesn’t support... OBJECTS
• Every other PHP to Java translator or
converter is orphaned or has other
problems
16. Facebook’s HpHp
• A full re-implementation of Apache+PHP
• Compiles PHP to its own byte code format
and executes it in its own runtime.
• May also translate to C++ for further
compilation, or use a JIT
• Does type inference for speed-ups
• Also includes an HTTP web server
17. Bad News #1
No action since
2011-12-06
Facebook appears to use “code drops”
instead of true “streaming” open source
model. BOO.
18. Bad News #2
Missing Many Common
Modules
• Has: apc, array, bcmath, bzip2, ctype,
curl, iconv, gd, imap, ipc, json, ldap, math, mb,
mcrypt, memcache, mysql, network,
openssl, pdo, posix, preg, process, session,
simplexml, soap, socket, sqlite3, stream,
string, thread, thrift, url, xml*, zlib.
• That’s it! (No filter_var, no ftp, no ...)
19. Bad News #3
Doesn’t Track PHP 5.3
PHP 5.4? No way!
• Some function signatures aren’t quite
right, e.g. debug_print_backtrace:
• HpHp: 2 arguments
• PHP 5.3.6: 3 arguments
• PHP 5.4: 4 arguments
• (You end up needing to whitelist this to ignore
false positives)
20. Bad News #4
Seriously #*$%&!#
annoying to build
• My crappy CentOS build script
https://github.com/client9/hphp-tools
• Ubuntu users are slightly better off (see
HpHp wiki)
• Takes hours.
21. Bad News #5
Won’t help with
Dynamic Evaluation
$fn = "foo";
$fn(1,2,3); // function not found
eval("foo(1,2,3)"); // no
• This is more a job for runtime dynamic analysis.
• Try to avoid this anyway.
22. Conclusion
• You aren’t going to run your application
under HpHp (at least not as is)
• But, it has a great static analyzer that works
and finds real bugs really fast.
• Scans thousands of files in a few seconds
24. Step 1: Make a constants file
• HpHp doesn’t know about hardwired constants
• Nifty script generates the constants
• May need to hand edit
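The talk's own generator lives in the companion repository and is not reproduced here. As an illustration only, a script along the following lines (PHP 7 or later) could dump extension constants as `define()` calls; `generate_constants_source()` is a hypothetical name, and the result may well need the hand editing mentioned above.

```php
<?php
// Sketch only: generate_constants_source() is a hypothetical helper and is
// not the script from hphp-tools.
// Input has the shape returned by get_defined_constants(true):
// [extension name => [constant name => value]].
function generate_constants_source(array $constantsByExtension): string
{
    $out = "<?php\n";
    foreach ($constantsByExtension as $extension => $constants) {
        if ($extension === 'user') {
            continue; // skip constants defined by application code
        }
        foreach ($constants as $name => $value) {
            if (!is_scalar($value) && $value !== null) {
                continue; // e.g. STDIN is a resource and cannot be exported
            }
            $out .= sprintf(
                "define(%s, %s);\n",
                var_export($name, true),
                var_export($value, true)
            );
        }
    }
    return $out;
}

// Possible use:
// file_put_contents('constants.php',
//     generate_constants_source(get_defined_constants(true)));
```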
25. Step 2: Make a stubs file
• HpHp doesn’t have many binary extensions
• But... the analyzer doesn’t care. Just make a
stub function.
// http://php.net/manual/en/function.filter-var.php
function filter_var($var, $filter=0, $options=NULL) {
    return $var;
}
You can make stub classes as well.
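A stub class follows the same idea. The sketch below uses a made-up class name, `GeoLookup`, standing in for a class that some binary extension would normally provide; only the names and parameter lists matter to the analyzer, so the bodies are placeholders.

```php
<?php
// Sketch only: GeoLookup is a hypothetical class name used for illustration.
// The analyzer needs the class, its methods, and their arities to exist;
// the stub is never meant to run in production.
class GeoLookup
{
    public function __construct($database = null)
    {
    }

    public function country($ip)
    {
        return '';
    }

    public function close()
    {
        return true;
    }
}
```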
26. Step 3: Create the file list
• Create a list of all php files to be analyzed
and include your constants and stubs file.
• Ignore phpunit and other tests
• HpHp implements much of PHP’s base
functionality as PHP code (e.g. the
Exception class is written in PHP). You
need to add these system PHP files as well.
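A sketch of how the list could be built, assuming PHP 7 or later. `build_file_list()` and the default excluded directory names are assumptions for illustration; the constants file, stubs file, and HpHp system files would be appended by the caller.

```php
<?php
// Sketch only: build_file_list() and its default exclusions are hypothetical.
// Returns paths of *.php files under $root, relative to $root, skipping any
// file that sits inside an excluded directory.
function build_file_list(string $root, array $excludeDirs = ['tests', 'phpunit']): array
{
    $root = rtrim($root, '/\\');
    $files = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->getExtension() !== 'php') {
            continue;
        }
        $relative = substr($file->getPathname(), strlen($root) + 1);
        $dirs = explode(DIRECTORY_SEPARATOR, $relative);
        array_pop($dirs); // drop the file name, keep the directory parts
        if (array_intersect($dirs, $excludeDirs)) {
            continue;
        }
        $files[] = $relative;
    }
    sort($files);
    return $files;
}
```

Writing the result out one path per line gives a plain text file list.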
28. Step 4: Do it
Include paths are a bit mysterious.
You’ll have to play around to get it right.
29. Step 5: Analyze it
• /tmp/hphp/Stats.js contains some...
statistics in JSON format.
• /tmp/hphp/CodeError.js is where the
good stuff is.
• JSON format, includes:
Error type, file, line number, code snippet
30. UseUndeclaredVariable
• #1 bug.
• Typically typos, scoping or cut-n-paste
errors
• Found frequently in error handling cases
if (!$ok) {
    error_log("$user_id has a problem"); // $user_id was never declared
}
31. TooManyArgument
TooFewArgument
Too Many Arguments typically indicates the
caller is confused and has logic errors (bug).
Too Few Arguments is frequently a serious
bug as PHP silently fails and defaults to null.
hash_hmac('sha1', 'foo'); // oops, no key
32. BadDefine
UsesEvaluation
define($k, $v);
eval("1+1");
• “Bad” since HpHp can’t compile it, but likely
legal PHP.
• Avoid using dynamic constant generation.
Use configuration file instead.
33. UseUndeclaredGlobalVariable
• HpHp only defines certain globals.
• Used only by Smarty?
• $GLOBALS['HTTP_SERVER_VARS']
• $GLOBALS['HTTP_SESSION_VARS']
• $GLOBALS['HTTP_COOKIE_VARS']
34. UseVoidReturn
Some function returns “nothing” but the
value is used
function foo() {
    if (time() % 60 == 0) { return true; }
    // oops void
}
$now = foo(); // error
35. RequiredAfterOptionalParam
function foo($first, $second=2, $third) { }
• IMHO should be a PHP syntax error
• Confusing
• (Oops, I haven’t investigated behavior)
36. DeclaredConstantTwice
• Probably not invalid PHP, but HpHp
analyzes all files at once.
• Best to have one file that defines constants
or just not use them.
37. UnknownFunction
UnknownObjectMethod
UnknownClass
UnknownBaseClass
• Is your file list complete?
• Do you need to make stubs?
38. BadPHPIncludeFile
Likely a PHP file trying to include/require
itself, an invalid file name, or an ambiguous
autoloader.
39. PHPIncludeFileNotFound
• Really common
• Probably unique to your autoloader.
• Not sure I quite understand how HpHp
computes file names and loads includes,
requires, require_once... yet
41. Every Commit
• Every commit gets checked in real-time
• “try-server” also allows developers to test
before committing.
• Finds and prevents bugs before
they go live every day.
• Almost no false positives (!!)
• Developers love it (especially the Java
groups)
42. Analysis
• CodeError.js is processed through a
custom script.
• Has a large blacklist of checks or files we
don’t care about (3rd party, known bad,
etc).
• File and line info are passed to git blame
to find the author and date/time.
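The custom script itself is not shown in the talk. The filtering step might look roughly like the sketch below (PHP 7 or later), which assumes the findings have already been read from CodeError.js and normalized into arrays with `type`, `file`, and `line` keys. That normalized shape and the name `filter_findings()` are assumptions for illustration, not HpHp's actual output format.

```php
<?php
// Sketch only: filter_findings() and the ['type', 'file', 'line'] record
// shape are hypothetical; they are not HpHp's native CodeError.js layout.
function filter_findings(array $findings, array $ignoredChecks, array $ignoredPathPrefixes): array
{
    $kept = [];
    foreach ($findings as $finding) {
        // Drop whole categories of checks that are blacklisted.
        if (in_array($finding['type'], $ignoredChecks, true)) {
            continue;
        }
        // Drop anything under a blacklisted path (3rd party, known bad, ...).
        foreach ($ignoredPathPrefixes as $prefix) {
            if (strncmp($finding['file'], $prefix, strlen($prefix)) === 0) {
                continue 2;
            }
        }
        $kept[] = $finding;
    }
    return $kept;
}
```

Each surviving finding's file and line could then be handed to `git blame -L <line>,<line> -- <file>` to look up the author.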
43. hphp-try runs in Jenkins
oops
Console Output
gives details
44. Work in Progress
• It took a lot of work to get the code base in
shape so we could add a pre-commit hook.
• Over 200 real problems first identified.
• We still have blacklisted some checks since
we are still cleaning up legacy code (and
figuring out how HpHp works)
46. Checks aren’t that complicated
• HpHp’s runtime type-inference isn’t used for
static analysis (good since type-inference is
hard)
• All checks are fairly simple book-keeping.
• All could be done in CodeSniffer/AST but
too slow
47. Slice off HpHp?
• The HpHp Runtime is nice, but really
complicated and a moving target.
• Can we slice out the analysis part of HpHp?
• Much simpler to build, easier to hack on.
48. Or Build New?
• Can this run off “byte code” or hook into
the parsing step of PHP?
• Exec a snippet of PHP for loading the script
files?
• Seems feasible
50. Thanks
• The Facebook Team!
• Sebastian Bergmann, who first blogged about
using HpHp for static analysis
• Rasmus, who first hacked up a version of
HpHp in house at Etsy
• The QA and DevTools teams at Etsy
• All the Etsy developers who had some
painful weeks getting the code in shape!
51. Facebook References
• https://github.com/facebook/hiphop-php
Main source repo + wiki
• http://developers.facebook.com/blog/post/2010/02/02/hiphop-for-php--move-fast/
Main announcement, 2010-02-02
• https://www.facebook.com/note.php?note_id=416880943919
Update 2012-08-13
52. Notes from Sebastian Bergmann
• http://sebastian-bergmann.de/archives/894-Using-HipHop-for-Static-Analysis.html
Static Analysis Intro, 2010-07-27
• http://sebastian-bergmann.de/archives/918-Static-Analysis-with-HipHop-for-PHP.html
Tool to help process output, 2012-01-27
53. Misc References
• http://arstechnica.com/business/2011/12/facebook-looks-to-fix-php-performance-with-hiphop-virtual-machine/
ArsTechnica overview, 2011-12-13
• http://www.serversidemagazine.com/news/10-questions-with-facebook-research-engineer-andrei-alexandrescu/
Lots of good stuff in here, 2012-01-29
54. This Talk
• These slides are posted at
http://slidesha.re/KzTfLy
• Tools for building on CentOS
https://github.com/client9/hphp-tools
• More about Nick Galbreath
http://client9.com/