Static Analysis
   for PHP
          PHPDay Verona, Italy 2012
 Nick Galbreath @ngalbreath nickg@etsy.com
http://slidesha.re/KzTfLy
github.com/client9/hphp-tools

    Nick Galbreath @ngalbreath PHPDay Verona, Italy 2012
Static Analysis
• Typically analyzes source code “at rest” for
  bugs, security problems, leaks, threading
  problems.
• We’ll cover simple checks and HpHp
• Some commercial tools exist too.
  Veracode runs off of PHP byte code
  http://www.veracode.com/products/static

Dynamic Analysis
        Analysis of code while running


• valgrind http://valgrind.org/
• xdebug http://xdebug.org/
• xhprof http://pecl.php.net/package/xhprof

      Great tools, but not for this talk.
Simple Static Analysis
The Littlest Static Analysis
                  php -l
  • Syntax errors should never be committed.
  • Syntax errors should never go to prod!
  • Make sure dev and prod versions of PHP
    are identical
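
A minimal sketch of such a lint gate, assuming the hook hands it a list of changed files. The helper names lintFile() and lintAll() are hypothetical; only `php -l` itself comes from the slide.

```php
<?php
// Sketch of a lint gate built on `php -l`. lintFile() and lintAll() are
// hypothetical helper names, not part of PHP.

/**
 * Run `php -l` on one file. Returns null if the file parses, otherwise the
 * linter's output.
 */
function lintFile(string $path): ?string
{
    // PHP_BINARY is the interpreter running this script; a real hook should
    // call the same PHP version that production uses.
    $command = escapeshellarg(PHP_BINARY) . ' -l ' . escapeshellarg($path) . ' 2>&1';
    exec($command, $output, $exitCode);
    return $exitCode === 0 ? null : implode("\n", $output);
}

/** Lint every path and return [path => linter output] for the failures. */
function lintAll(array $paths): array
{
    $failures = [];
    foreach ($paths as $path) {
        $error = lintFile($path);
        if ($error !== null) {
            $failures[$path] = $error;
        }
    }
    return $failures;
}
```

A hook would reject the commit whenever lintAll() returns a non-empty array.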




PHP Leading Whitespace

pre-commit check that every file starts with
either #! or <?php exactly
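
One way to express this check, as a sketch. The function name startsCleanly() is made up for illustration, and the check is deliberately strict: a byte-order mark or a blank line before the opening tag fails.

```php
<?php
/**
 * True if the file contents begin with "#!" or with the PHP opening tag at
 * byte 0 (no BOM, no whitespace, no blank line before it).
 */
function startsCleanly(string $contents): bool
{
    return strncmp($contents, '#!', 2) === 0
        || strncmp($contents, '<?php', 5) === 0;
}
```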




PHP Trailing Whitespace
 Check that file ends exactly with ?> or make
 sure it doesn’t have a closing tag.
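
A rough sketch of this rule. The name endsCleanly() is hypothetical, and the tokenizer is used only so that a closing tag inside a string literal is not mistaken for a real one. As written it also rejects templates that close and reopen PHP mid-file, which may or may not be what you want.

```php
<?php
/**
 * True if the source either ends with the two characters of a closing tag
 * and nothing after them, or contains no closing tag at all.
 */
function endsCleanly(string $source): bool
{
    if (substr($source, -2) === '?>') {
        return true;
    }
    foreach (token_get_all($source) as $token) {
        if (is_array($token) && $token[0] === T_CLOSE_TAG) {
            /* a closing tag with bytes after it, or one in mid-file */
            return false;
        }
    }
    return true;
}
```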




Anti-Virus
      On Source Code

• It’s static analysis too!
• Not so concerned with PHP, but do you
  have JavaScript, Flash, Word, PowerPoint,
  PDFs, ZIPs in your source tree?



ClamAV
• http://www.clamav.net/
• Free anti-virus.
• Available on every OS.
ClamAV Performance




   1G of Source Code / Minute
      Why not do it?
Advanced Static Analysis
Why not use... AST?
http://docs.php.net/manual/en/function.token-get-all.php

    •    token_get_all($source) takes PHP source as a
         string and returns a flat array of tokens in PHP,
         not a true Abstract Syntax Tree.
    • Orders of magnitude slower -- can’t use for
         pre-commit check on large code bases
    • Too low level -- need to turn it into an
         intermediate representation.
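
To get a feel for how low-level the tokenizer output is, here is a small sketch that prints only the token names. tokenNames() is a hypothetical helper; token_get_all() and token_name() are the real PHP functions.

```php
<?php
/**
 * Map PHP source to a flat list of token names. Single-character tokens
 * such as "=" and ";" come back from the tokenizer as plain strings and
 * are kept as-is.
 */
function tokenNames(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $token) {
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}
```

Even a one-line assignment yields seven entries, with no nesting and no notion of statements or scopes. Any check beyond pattern matching has to build that structure first.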


Why not use...
         CodeSniffer?
 http://pear.php.net/package/PHP_CodeSniffer

• Excellent tool, but...
• Based on token_get_all
• SAX-style API
• Too slow for pre/post commit hooks

Why not use...
• php-SAT: Orphaned 2009
• php-AST: Orphaned 2008
• phc: active but doesn’t support... OBJECTS
• Every other PHP to Java translator or
  converter is orphaned or has other
  problems


Facebook’s HpHp
• A full re-implementation of Apache+PHP
• Compiles PHP to its own byte code format
  and executes in its own runtime.
• May also translate to C++ for other
  compilation or use JIT
• Does type-inference for speed-ups
• Also includes an HTTP web server
Bad News #1
      No action since
        2011-12-06
Facebook appears to use “code drops”
instead of true “streaming” open source
model. BOO.




Bad News #2
Missing Many Common
       Modules
• Has: apc, array, bcmath, bzip2, ctype,
  curl, iconv, gd, imap, ipc, json, ldap, math, mb,
  mcrypt, memcache, mysql, network,
  openssl, pdo, posix, preg, process, session,
  simplexml, soap, socket, sqlite3, stream,
  string, thread, thrift, url, xml*, zlib.
• That’s it!   (No filter_var, no ftp, no ..)
Bad News #3
Doesn’t Track PHP 5.3
                PHP 5.4? No way!
•   Some function signatures aren’t quite
    right. e.g. debug_print_backtrace
    • HpHp 2 arguments
    • PHP 5.3.6 3 arguments
    • PHP 5.4 4 arguments
• (End up needing to whitelist this to ignore
    false positives)
Bad News #4
    Seriously #*$%&!#
     annoying to build
• My crappy CentOS build script
  https://github.com/client9/hphp-tools
• Ubuntu users are slightly better off (see
  HpHp wiki)
• Takes hours.
Bad News #5
    Won’t help with
   Dynamic Evaluation
$fn = "foo";
$fn(1,2,3); // function not found
eval("foo(1,2,3);"); // no

  • This is more for runtime dynamic analysis.
  • Try to avoid this anyways.
Conclusion

• You aren’t going to run your application
  under HpHp (at least not as is)
• But, it has a great static analyzer that works
  and finds real bugs really fast.
• Scans thousands of files in a few seconds

Using HpHp
Step 1: Make a
         constants file
• HpHp doesn’t know about hardwired constants
• Nifty script generates the constants
• May need to hand edit
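
The generator script itself is not shown on the slide. As a rough sketch of the idea, assuming the constants can be read from a running PHP via get_defined_constants(), something like the following would emit a file of define() calls. renderConstantsFile() is a hypothetical name.

```php
<?php
/**
 * Render define() calls for a [name => value] map. Only scalar and null
 * values are emitted; anything else (resources such as STDIN, arrays) is
 * skipped and left for hand editing.
 */
function renderConstantsFile(array $constants): string
{
    $lines = ['<?php'];
    foreach ($constants as $name => $value) {
        if (is_scalar($value) || $value === null) {
            $lines[] = sprintf(
                'define(%s, %s);',
                var_export((string) $name, true),
                var_export($value, true)
            );
        }
    }
    return implode("\n", $lines) . "\n";
}

// Usage idea: dump every constant the local PHP build knows about.
// file_put_contents('constants.php', renderConstantsFile(get_defined_constants()));
```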



Step 2: Make a stubs file

• HpHp doesn’t have many binary extensions
• But... the analyzer doesn’t care. Just make a
  stub function.
  // http://php.net/manual/en/function.filter-var.php
  function filter_var($var, $filter=0, $options=NULL) {
     return $var;
  }

  You can make stub classes as well.
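
A stub class follows the same pattern as the stub function. The class below is entirely hypothetical (GeoLookup is not a real extension class); it stands in for whatever binary extension class your code uses. Only the names and signatures matter to the analyzer, so the bodies are dummies.

```php
<?php
// Hypothetical stub for a class provided by a binary extension that the
// analyzer does not ship. Class name, constant and methods are made up.
class GeoLookup
{
    const DEFAULT_COUNTRY = 'ZZ';

    public function __construct($databasePath = '') {}

    public function countryCode($ipAddress)
    {
        return self::DEFAULT_COUNTRY;
    }
}
```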

Step 3: Create the file list
 • Create a list of all php files to be analyzed
   and include your constants and stubs file.
 • Ignore phpunit and other tests
 • HpHp implements much of PHP base
   functionality as PHP code. (e.g. the
   Exception class is written in PHP). You
   need to add these system PHP files as well
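
The file list can be built with any tool. A sketch in PHP, assuming tests live in directories named tests or phpunit (adjust the fragments to your tree); buildFileList() is a hypothetical helper.

```php
<?php
/**
 * Collect every *.php file under $root, skipping any path that contains one
 * of the $skipFragments. The result is sorted so the list is stable.
 */
function buildFileList(string $root, array $skipFragments = ['/tests/', '/phpunit/']): array
{
    $files = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        $path = $file->getPathname();
        if ($file->isFile() === false || substr($path, -4) !== '.php') {
            continue;
        }
        foreach ($skipFragments as $fragment) {
            if (strpos($path, $fragment) !== false) {
                continue 2; // skip this file entirely
            }
        }
        $files[] = $path;
    }
    sort($files);
    return $files;
}
```

The constants and stubs files are picked up automatically if they live under $root; otherwise append them, along with the HpHp system PHP files, before writing the list out.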


correction:
       grep -v helper.idl.php | grep -v constants.php >> $JUNK
Step 4: Do it




Include paths are a bit mysterious.
You’ll have to play around to get it right.
Step 5: Analyze it

•   /tmp/hphp/Stats.js contains some...
    statistics in JSON format.
•   /tmp/hphp/CodeError.js is where the
    good stuff is.
•    JSON format, includes:
    Error type, file, line number, code snippet
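
The slides do not show the exact JSON layout, so the sketch below assumes a simplified one purely for illustration: an object that maps each error type to a list of records. The real CodeError.js may be shaped differently, so inspect it before adapting this. countErrorsByType() is a hypothetical helper.

```php
<?php
/**
 * Count records per error type, most frequent first.
 * ASSUMED layout (not confirmed by the slides): the JSON decodes to
 * [errorType => [record, record, ...]].
 */
function countErrorsByType(string $json): array
{
    $decoded = json_decode($json, true);
    if (!is_array($decoded)) {
        throw new InvalidArgumentException('Not valid JSON: ' . json_last_error_msg());
    }
    $counts = [];
    foreach ($decoded as $type => $records) {
        $counts[$type] = is_array($records) ? count($records) : 0;
    }
    arsort($counts); // most frequent first
    return $counts;
}
```

A per-type count like this is a quick way to decide which checks to clean up first and which to ignore for now.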


UseUndeclaredVariable

 • #1 bug.
 • Typically typos, scoping or cut-n-paste
   errors
 • Found frequently in error handling cases
if (!$ok) {
    error_log("$user_id has a problem"); // $user_id was never set
}


TooManyArgument
TooFewArgument
Too Many Arguments typically indicates the
caller is confused and has logic errors (bug).
Too Few Arguments is frequently a serious
bug as PHP silently fails and defaults to null.

hash_hmac('sha1', 'foo'); // oops, no key



BadDefine
    UsesEvaluation
define($k, $v);
eval("1+1;");


 • “Bad” since HpHp can’t compile it, but likely
   legal PHP.
 • Avoid using dynamic constant generation.
   Use configuration file instead.

UseUndeclaredGlobalVariable


  • HpHp only defines certain globals.
  • Used only by Smarty?
   •   $GLOBALS['HTTP_SERVER_VARS']

   •   $GLOBALS['HTTP_SESSION_VARS']

   •   $GLOBALS['HTTP_COOKIE_VARS']




UseVoidReturn
Some function returns “nothing” but the
value is used
function foo() {
   if (time() % 60 == 0) { return true; }
   // oops void
}

$now = foo(); // error




RequiredAfterOptionalParam
   function foo($first, $second=2, $third) { }




   • IMHO should be a PHP syntax error
   • Confusing
   • (Oops, I haven’t investigated behavior)


DeclaredConstantTwice


• Probably not invalid PHP, but HpHp
  analyzes all files at once.
• Best to have one file that defines constants
  or just not use them.



UnknownFunction
 UnknownObjectMethod
    UnknownClass
  UnknownBaseClass

• Is your file list complete?
• Do you need to make stubs?

BadPHPIncludeFile


 Likely a PHP file trying to include/require
 itself or invalid file name or your autoloader
 is ambiguous.




PHPIncludeFileNotFound


 • Really common
 • Probably unique to your autoloader.
 • Not sure I quite understand how HpHp
   computes file names and loads includes,
   requires, require_once... yet



HpHp at Etsy
Every Commit
• Every commit gets checked in real-time
• “try-server” also allows developers to test
  before committing.
• Finds and prevents bugs before
  they go live every day.
• Almost no false positives (!!)
• Developers love it (especially the Java
  groups)
Analysis
•   CodeError.js is processed through a
    custom script.
• Has a large blacklist of checks or files we
    don’t care about (3rd party, known bad,
    etc).
• File and line info pass through to git blame
    to find author and date/time.
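
The custom script is not shown, so the following is only a sketch of the two steps described above, under the assumption that each finding has already been reduced to a type, a file and a line. filterFindings() and blameCommand() are hypothetical names; `git blame -L` and `--porcelain` are standard git options.

```php
<?php
/**
 * Drop findings whose type or file is on an ignore list. Each finding is a
 * hypothetical ['type' => ..., 'file' => ..., 'line' => ...] record.
 */
function filterFindings(array $findings, array $ignoredTypes, array $ignoredPathPrefixes): array
{
    return array_values(array_filter(
        $findings,
        function (array $f) use ($ignoredTypes, $ignoredPathPrefixes) {
            if (in_array($f['type'], $ignoredTypes, true)) {
                return false;
            }
            foreach ($ignoredPathPrefixes as $prefix) {
                if (strncmp($f['file'], $prefix, strlen($prefix)) === 0) {
                    return false;
                }
            }
            return true;
        }
    ));
}

/** Build a `git blame` command for one line; --porcelain output carries author and time. */
function blameCommand(string $file, int $line): string
{
    return sprintf('git blame -L %d,%d --porcelain -- %s', $line, $line, escapeshellarg($file));
}
```

Keeping the ignore lists as plain data makes it easy to shrink them as legacy code gets cleaned up.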

hphp-try runs in Jenkins

[Screenshot: Jenkins build page for hphp-try, annotated “oops”. The Console Output gives details.]




Work in Progress
• It took a lot of work to get the code base in
  shape so we could add a pre-commit hook.
• Over 200 real problems first identified.
• We still have blacklisted some checks since
  we are still cleaning up legacy code (and
  figuring out how HpHp works)


Can We Do Better?
Checks aren’t that
      complicated
• HpHp’s runtime type-inference isn’t used for
  static analysis (good since type-inference is
  hard)
• All checks are fairly simple book-keeping.
• All could be done in CodeSniffer/AST but
  too slow


Slice off HpHp?

• The HpHp Runtime is nice, but really
  complicated and a moving target.
• Can we slice out the analysis part of HpHp?
• Much simpler to build, easier to hack on.

Or Build New?

• Can this run off “byte code” or hook into
  the parsing step of PHP?
• Exec a snippet of PHP for the loading script
  files ?
• Seems feasible

Acknowledgments
 and References
Thanks
• The Facebook Team!
• Sebastian Bergmann who first blogged about
  using HpHp for static analysis
• Rasmus who first hacked up a version of
  HpHp in house at Etsy
• The QA and DevTools teams at Etsy
• All the Etsy developers who had some
  painful weeks getting the code in shape!
Facebook References
• https://github.com/facebook/hiphop-php
  Main source repo + wiki
• http://developers.facebook.com/blog/post/2010/02/02/hiphop-for-php--move-fast/
  Main announcement, 2010-02-02
• https://www.facebook.com/note.php?note_id=416880943919
  Update 2012-08-13

Notes from
    Sebastian Bergmann
• http://sebastian-bergmann.de/archives/894-Using-HipHop-for-Static-Analysis.html
  Static Analysis Intro, 2010-07-27
• http://sebastian-bergmann.de/archives/918-Static-Analysis-with-HipHop-for-PHP.html
  Tool to help process output, 2012-01-27


Misc References
• http://arstechnica.com/business/2011/12/facebook-looks-to-fix-php-performance-with-hiphop-virtual-machine/
  ArsTechnica overview, 2011-12-13
• http://www.serversidemagazine.com/news/10-questions-with-facebook-research-engineer-andrei-alexandrescu/
  Lots of good stuff in here, 2012-01-29

This Talk
• These slides are posted at
  http://slidesha.re/KzTfLy
• Tools for building on CentOS
  https://github.com/client9/hphp-tools
• More about Nick Galbreath
  http://client9.com/



Nick Galbreath nickg@etsy.com @ngalbreath
      PHPDay Verona Italy May 19, 2012
            http://2012.phpday.it/

Contenu connexe

En vedette

Static code analysis
Static code analysisStatic code analysis
Static code analysisRune Sundling
 
libinjection: from SQLi to XSS  by Nick Galbreath
libinjection: from SQLi to XSS  by Nick Galbreathlibinjection: from SQLi to XSS  by Nick Galbreath
libinjection: from SQLi to XSS  by Nick GalbreathCODE BLUE
 
The promise of asynchronous PHP
The promise of asynchronous PHPThe promise of asynchronous PHP
The promise of asynchronous PHPWim Godden
 

En vedette (6)

Static code analysis
Static code analysisStatic code analysis
Static code analysis
 
Static code analysis
Static code analysisStatic code analysis
Static code analysis
 
libinjection: from SQLi to XSS  by Nick Galbreath
libinjection: from SQLi to XSS  by Nick Galbreathlibinjection: from SQLi to XSS  by Nick Galbreath
libinjection: from SQLi to XSS  by Nick Galbreath
 
How To Detect Xss
How To Detect XssHow To Detect Xss
How To Detect Xss
 
Content security policy
Content security policyContent security policy
Content security policy
 
The promise of asynchronous PHP
The promise of asynchronous PHPThe promise of asynchronous PHP
The promise of asynchronous PHP
 

Plus de Nick Galbreath

Making operations visible - devopsdays tokyo 2013
Making operations visible  - devopsdays tokyo 2013Making operations visible  - devopsdays tokyo 2013
Making operations visible - devopsdays tokyo 2013Nick Galbreath
 
Faster Secure Software Development with Continuous Deployment - PH Days 2013
Faster Secure Software Development with Continuous Deployment - PH Days 2013Faster Secure Software Development with Continuous Deployment - PH Days 2013
Faster Secure Software Development with Continuous Deployment - PH Days 2013Nick Galbreath
 
Fixing security by fixing software development
Fixing security by fixing software developmentFixing security by fixing software development
Fixing security by fixing software developmentNick Galbreath
 
DevOpsDays Austin 2013 Reading List
DevOpsDays Austin 2013 Reading ListDevOpsDays Austin 2013 Reading List
DevOpsDays Austin 2013 Reading ListNick Galbreath
 
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013Nick Galbreath
 
SQL-RISC: New Directions in SQLi Prevention - RSA USA 2013
SQL-RISC: New Directions in SQLi Prevention - RSA USA 2013SQL-RISC: New Directions in SQLi Prevention - RSA USA 2013
SQL-RISC: New Directions in SQLi Prevention - RSA USA 2013Nick Galbreath
 
Rebooting Software Development - OWASP AppSecUSA
Rebooting Software Development - OWASP AppSecUSA Rebooting Software Development - OWASP AppSecUSA
Rebooting Software Development - OWASP AppSecUSA Nick Galbreath
 
libinjection and sqli obfuscation, presented at OWASP NYC
libinjection and sqli obfuscation, presented at OWASP NYClibinjection and sqli obfuscation, presented at OWASP NYC
libinjection and sqli obfuscation, presented at OWASP NYCNick Galbreath
 
Continuous Deployment - The New #1 Security Feature, from BSildesLA 2012
Continuous Deployment - The New #1 Security Feature, from BSildesLA 2012Continuous Deployment - The New #1 Security Feature, from BSildesLA 2012
Continuous Deployment - The New #1 Security Feature, from BSildesLA 2012Nick Galbreath
 
New techniques in sql obfuscation, from DEFCON 20
New techniques in sql obfuscation, from DEFCON 20New techniques in sql obfuscation, from DEFCON 20
New techniques in sql obfuscation, from DEFCON 20Nick Galbreath
 
Data Driven Security, from Gartner Security Summit 2012
Data Driven Security, from Gartner Security Summit 2012Data Driven Security, from Gartner Security Summit 2012
Data Driven Security, from Gartner Security Summit 2012Nick Galbreath
 
Slide show font sampler, black on white
Slide show font sampler, black on whiteSlide show font sampler, black on white
Slide show font sampler, black on whiteNick Galbreath
 
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012Nick Galbreath
 
Rate Limiting at Scale, from SANS AppSec Las Vegas 2012
Rate Limiting at Scale, from SANS AppSec Las Vegas 2012Rate Limiting at Scale, from SANS AppSec Las Vegas 2012
Rate Limiting at Scale, from SANS AppSec Las Vegas 2012Nick Galbreath
 
DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012
DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012
DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012Nick Galbreath
 

Plus de Nick Galbreath (15)

Making operations visible - devopsdays tokyo 2013
Making operations visible  - devopsdays tokyo 2013Making operations visible  - devopsdays tokyo 2013
Making operations visible - devopsdays tokyo 2013
 
Faster Secure Software Development with Continuous Deployment - PH Days 2013
Faster Secure Software Development with Continuous Deployment - PH Days 2013Faster Secure Software Development with Continuous Deployment - PH Days 2013
Faster Secure Software Development with Continuous Deployment - PH Days 2013
 
Fixing security by fixing software development
Fixing security by fixing software developmentFixing security by fixing software development
Fixing security by fixing software development
 
DevOpsDays Austin 2013 Reading List
DevOpsDays Austin 2013 Reading ListDevOpsDays Austin 2013 Reading List
DevOpsDays Austin 2013 Reading List
 
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
 
SQL-RISC: New Directions in SQLi Prevention - RSA USA 2013
SQL-RISC: New Directions in SQLi Prevention - RSA USA 2013SQL-RISC: New Directions in SQLi Prevention - RSA USA 2013
SQL-RISC: New Directions in SQLi Prevention - RSA USA 2013
 
Rebooting Software Development - OWASP AppSecUSA
Rebooting Software Development - OWASP AppSecUSA Rebooting Software Development - OWASP AppSecUSA
Rebooting Software Development - OWASP AppSecUSA
 
libinjection and sqli obfuscation, presented at OWASP NYC
libinjection and sqli obfuscation, presented at OWASP NYClibinjection and sqli obfuscation, presented at OWASP NYC
libinjection and sqli obfuscation, presented at OWASP NYC
 
Continuous Deployment - The New #1 Security Feature, from BSildesLA 2012
Continuous Deployment - The New #1 Security Feature, from BSildesLA 2012Continuous Deployment - The New #1 Security Feature, from BSildesLA 2012
Continuous Deployment - The New #1 Security Feature, from BSildesLA 2012
 
New techniques in sql obfuscation, from DEFCON 20
New techniques in sql obfuscation, from DEFCON 20New techniques in sql obfuscation, from DEFCON 20
New techniques in sql obfuscation, from DEFCON 20
 
Data Driven Security, from Gartner Security Summit 2012
Data Driven Security, from Gartner Security Summit 2012Data Driven Security, from Gartner Security Summit 2012
Data Driven Security, from Gartner Security Summit 2012
 
Slide show font sampler, black on white
Slide show font sampler, black on whiteSlide show font sampler, black on white
Slide show font sampler, black on white
 
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
Fraud Engineering, from Merchant Risk Council Annual Meeting 2012
 
Rate Limiting at Scale, from SANS AppSec Las Vegas 2012
Rate Limiting at Scale, from SANS AppSec Las Vegas 2012Rate Limiting at Scale, from SANS AppSec Las Vegas 2012
Rate Limiting at Scale, from SANS AppSec Las Vegas 2012
 
DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012
DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012
DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012
 

Dernier

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 

Dernier (20)

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

Static Analysis for PHP, from PHPDay Italy 2012

  • 1. Static Analysis for PHP PHPDay Verona, Italy 2012 Nick Galbreath @ngalbreath nickg@etsy.com
  • 2. http://slidesha.re/ KzTfLy github.com/client9/hphp-tools Nick Galbreath @ngalbreath PHPDay Verona, Italy 2012
  • 3. Static Analysis • Typically analyzes source code “at rest” for bugs, security problems, leaks, threading problems. • We’ll cover simple checks and HpHp • Some commercial tools exists too. Veracode runs off of PHP byte code http://www.veracode.com/products/static Nick Galbreath @ngalbreath PHPDay Verona, Italy 2012
  • 4. Dynamic Analysis Analysis of code while running • valgrind http://valgrind.org/ • xdebug http://xdebug.org/ • xhprof http://pecl.php.net/package/xhprof Great tools, but not for this talk. Nick Galbreath @ngalbreath PHPDay Verona, Italy 2012
  • 6. The Littlest Static Analysis php -l • Syntax errors should never be committed. • Syntax errors should never go to prod! • Make sure dev and prod versions of PHP are identical Nick Galbreath @ngalbreath PHPDay Verona, Italy 2012
  • 7. PHP Leading Whitespace: pre-commit check that every file starts with exactly #! or <?php
  • 8. PHP Trailing Whitespace: check that the file ends exactly with ?> or make sure it doesn’t have a closing tag at all.
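The two whitespace checks above can be sketched as a single function over a file's raw contents. This is an illustration under assumptions, not the hook used in the talk: `check_php_boundaries()` and the `'leading'` / `'trailing'` labels are hypothetical names.

```php
<?php
/**
 * Return a list of boundary problems for one file's raw contents.
 * Hypothetical helper; the problem labels are made up for this sketch.
 */
function check_php_boundaries($source)
{
    $problems = array();

    // The file must begin with exactly "#!" or the PHP open tag:
    // no BOM, blank line, or space may come first.
    if (strncmp($source, '#!', 2) !== 0 && strncmp($source, '<?php', 5) !== 0) {
        $problems[] = 'leading';
    }

    // If a closing tag is present, nothing may follow the last one.
    // A file with no closing tag at all passes.
    $close = strrpos($source, '?' . '>');
    if ($close !== false && $close + 2 !== strlen($source)) {
        $problems[] = 'trailing';
    }

    return $problems;
}
```

The closing-tag test is a plain string search, so it will also flag a file whose last `?>` sits inside a string literal or is followed by template HTML. A real hook would likely need an exception list for such files.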
  • 9. Anti-Virus on Source Code
      • It’s static analysis too!
      • Not so concerned with PHP, but do you have JavaScript, Flash, Word, PowerPoint, PDFs, or ZIPs in your source tree?
  • 10. ClamAV
      • http://www.clamav.net/
      • Free anti-virus.
      • Available on every OS.
  • 11. ClamAV Performance: 1G of source code / minute. Why not do it?
  • 13. Why not use... AST? http://docs.php.net/manual/en/function.token-get-all.php
      • token_get_all($source) takes PHP source code as a string and returns its tokens as a PHP array (a flat token list rather than a full Abstract Syntax Tree).
      • Orders of magnitude slower -- can’t use for pre-commit checks on large code bases.
      • Too low level -- need to turn it into an intermediate representation.
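To make the “too low level” point concrete, here is a small sketch of what working directly with `token_get_all()` looks like. The function `variable_names()` is a hypothetical example; it only collects variable tokens and knows nothing about scopes or declarations.

```php
<?php
/**
 * List every variable token in a piece of PHP source, in order.
 * Hypothetical helper built on the bundled tokenizer extension.
 */
function variable_names($source)
{
    $names = array();
    foreach (token_get_all($source) as $token) {
        // Each token is either a one-character string such as ";" or an
        // array of (token id, text, line number).
        if (is_array($token) && $token[0] === T_VARIABLE) {
            $names[] = $token[1];
        }
    }
    return $names;
}
```

The result is just a flat list. Deciding whether a variable was declared before use would need extra bookkeeping for functions, classes and scopes on top of it, which is the intermediate representation the slide mentions.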
  • 14. Why not use... CodeSniffer? http://pear.php.net/package/PHP_CodeSniffer
      • Excellent tool, but...
      • Based on token_get_all
      • SAX-style API
      • Too slow for pre/post commit hooks
  • 15. Why not use...
      • php-SAT: orphaned 2009
      • php-AST: orphaned 2008
      • phc: active, but doesn’t support... OBJECTS
      • Every other PHP to Java translator or converter is orphaned or has other problems
  • 16. Facebook’s HpHp
      • A full re-implementation of Apache+PHP
      • Compiles PHP to its own byte code format and executes it in its own runtime.
      • May also translate to C++ for other compilation, or use a JIT
      • Does type inference for speed-ups
      • Also includes an HTTP web server
  • 17. Bad News #1: No action since 2011-12-06. Facebook appears to use “code drops” instead of a true “streaming” open source model. BOO.
  • 18. Bad News #2: Missing Many Common Modules
      • Has: apc, array, bcmath, bzip2, ctype, curl, iconv, gd, imap, ipc, json, ldap, math, mb, mcrypt, memcache, mysql, network, openssl, pdo, posix, preg, process, session, simplexml, soap, socket, sqlite3, stream, string, thread, thrift, url, xml*, zlib.
      • That’s it! (No filter_var, no ftp, no ..)
  • 19. Bad News #3: Doesn’t Track PHP 5.3. PHP 5.4? No way!
      • Some function signatures aren’t quite right, e.g. debug_print_backtrace:
          • HpHp: 2 arguments
          • PHP 5.3.6: 3 arguments
          • PHP 5.4: 4 arguments
      • (You end up needing to whitelist this to ignore false positives.)
  • 20. Bad News #4: Seriously #*$%&!# annoying to build
      • My crappy CentOS build script: https://github.com/client9/hphp-tools
      • Ubuntu users are slightly better off (see the HpHp wiki)
      • Takes hours.
  • 21. Bad News #5: Won’t help with Dynamic Evaluation
        $fn = "foo";
        $fn(1,2,3);           // function not found
        eval("foo(1,2,3);");  // no
      • This is more for runtime dynamic analysis.
      • Try to avoid this anyway.
  • 22. Conclusion
      • You aren’t going to run your application under HpHp (at least not as is).
      • But it has a great static analyzer that works and finds real bugs really fast.
      • Scans thousands of files in a few seconds.
  • 24. Step 1: Make a constants file
      • HpHp doesn’t know about hardwired constants
      • Nifty script generates the constants
      • May need to hand edit
  • 25. Step 2: Make a stubs file
      • HpHp doesn’t have many binary extensions
      • But... the analyzer doesn’t care. Just make a stub function.
        // http://php.net/manual/en/function.filter-var.php
        function filter_var($var, $filter=0, $options=NULL) { return $var; }
      You can make stub classes as well.
  • 26. Step 3: Create the file list
      • Create a list of all PHP files to be analyzed, and include your constants and stubs files.
      • Ignore phpunit and other tests
      • HpHp implements much of PHP’s base functionality as PHP code (e.g. the Exception class is written in PHP). You need to add these system PHP files as well.
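Step 3 could be sketched roughly as follows. This is an assumption-laden illustration rather than the script from the talk: `build_file_list()` is a hypothetical name, the skipped directory names are placeholders, and the caller supplies the constants, stubs and HpHp system files in `$extraFiles`, since the slides do not give their locations.

```php
<?php
/**
 * Collect all .php files under $root (paths relative to $root), skipping
 * test directories, then append extra files such as constants and stubs.
 * Hypothetical helper; $skipDirs defaults are placeholders.
 */
function build_file_list($root, array $extraFiles = array(), array $skipDirs = array('tests', 'phpunit'))
{
    $root = rtrim($root, '/\\');
    $files = array();

    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->getExtension() !== 'php') {
            continue;
        }
        // Path relative to the root, split into its directory components.
        $relative = substr($file->getPathname(), strlen($root) + 1);
        $dirs = explode(DIRECTORY_SEPARATOR, $relative);
        array_pop($dirs); // drop the file name, keep only directories

        // Skip anything that lives under one of the ignored directories.
        if (count(array_intersect($dirs, $skipDirs)) > 0) {
            continue;
        }
        $files[] = $relative;
    }

    sort($files);
    return array_merge($files, $extraFiles);
}
```

Writing the result out one path per line (for example with `implode("\n", ...)`) gives a file list that can be handed to the analyzer.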
  • 27. correction: grep -v helper.idl.php | grep -v constants.php >> $JUNK
  • 28. Step 4: Do it. Include paths are a bit mysterious. You’ll have to play around to get it right.
  • 29. Step 5: Analyze it
      • /tmp/hphp/Stats.js contains some... statistics in JSON format.
      • /tmp/hphp/CodeError.js is where the good stuff is.
      • JSON format, includes: error type, file, line number, code snippet
  • 30. UseUndeclaredVariable
      • #1 bug.
      • Typically typos, scoping, or cut-n-paste errors
      • Found frequently in error handling cases
        if (!$ok) {
            error_log("$user_id has a problem");
        }
  • 31. TooManyArgument, TooFewArgument
      Too Many Arguments typically indicates the caller is confused and has logic errors (a bug). Too Few Arguments is frequently a serious bug, as PHP silently fails and defaults to null.
        hash_hmac('sha1', 'foo'); // oops, no key
  • 32. BadDefine, UsesEvaluation
        define($k, $v);
        eval("1+1;");
      • “Bad” since HpHp can’t compile it, but likely legal PHP.
      • Avoid using dynamic constant generation. Use a configuration file instead.
  • 33. UseUndeclaredGlobalVariable
      • HpHp only defines certain globals.
      • Used only by Smarty?
      • $GLOBALS['HTTP_SERVER_VARS']
      • $GLOBALS['HTTP_SESSION_VARS']
      • $GLOBALS['HTTP_COOKIE_VARS']
  • 34. UseVoidReturn: some function returns “nothing” but the value is used
        function foo() {
            if (time() % 60 == 0) {
                return true;
            }
            // oops void
        }
        $now = foo(); // error
  • 35. RequiredAfterOptionalParam
        function foo($first, $second=2, $third) {
      • IMHO should be a PHP syntax error
      • Confusing
      • (Oops, I haven’t investigated behavior)
  • 36. DeclaredConstantTwice
      • Probably not invalid PHP, but HpHp analyzes all files at once.
      • Best to have one file that defines constants, or just not use them.
  • 37. UnknownFunction, UnknownObjectMethod, UnknownClass, UnknownBaseClass
      • Is your file list complete?
      • Do you need to make stubs?
  • 38. BadPHPIncludeFile: likely a PHP file trying to include/require itself, an invalid file name, or an ambiguous autoloader.
  • 39. PHPIncludeFileNotFound
      • Really common
      • Probably unique to your autoloader.
      • Not sure I quite understand how HpHp computes file names and loads includes, requires, require_once... yet
  • 41. Every Commit
      • Every commit gets checked in real time
      • “try-server” also allows developers to test before committing.
      • Finds and prevents bugs before they go live, every day.
      • Almost no false positives (!!)
      • Developers love it (especially the Java groups)
  • 42. Analysis
      • CodeError.js is processed through a custom script.
      • Has a large blacklist of checks or files we don’t care about (3rd party, known bad, etc).
      • File and line info pass through to git blame to find author and date/time.
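The post-processing described above might look something like the sketch below. The slides do not show the exact layout of CodeError.js, so this assumes the errors have already been flattened into arrays with hypothetical keys `type`, `file` and `line`. The function names are also made up for illustration, and this is not the custom script used at Etsy.

```php
<?php
/**
 * Drop errors whose check type or file path is on an ignore list.
 * Assumes each error is array('type' => ..., 'file' => ..., 'line' => ...);
 * that shape is an assumption, not the documented CodeError.js format.
 */
function filter_errors(array $errors, array $ignoredTypes, array $ignoredPrefixes)
{
    $kept = array();
    foreach ($errors as $error) {
        if (in_array($error['type'], $ignoredTypes, true)) {
            continue;
        }
        foreach ($ignoredPrefixes as $prefix) {
            // Skip files under an ignored path, e.g. third-party code.
            if (strncmp($error['file'], $prefix, strlen($prefix)) === 0) {
                continue 2;
            }
        }
        $kept[] = $error;
    }
    return $kept;
}

/**
 * Build the `git blame` command that reports who last touched one line.
 * --porcelain gives machine-readable author and timestamp fields.
 */
function blame_command($file, $line)
{
    $line = (int) $line;
    return 'git blame --porcelain -L ' . $line . ',' . $line . ' -- ' . escapeshellarg($file);
}
```

Each surviving error's file and line can then be passed to `blame_command()` and the command run with `exec()` to attach an author and date to the report.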
  • 43. hphp-try runs in Jenkins. Oops! Console Output gives details.
  • 44. Work in Progress
      • It took a lot of work to get the code base in shape so we could add a pre-commit hook.
      • Over 200 real problems were identified at first.
      • We have still blacklisted some checks, since we are still cleaning up legacy code (and figuring out how HpHp works).
  • 45. Can We Do Better?
  • 46. Checks aren’t that complicated
      • HpHp’s runtime type-inference isn’t used for static analysis (good since type-inference is hard)
      • All checks are fairly simple book-keeping.
      • All could be done in CodeSniffer/AST but too slow
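As an illustration of how simple the book-keeping can be, the sketch below compares a call's argument count with a function's declared parameters. It is not how HpHp works: a static analyzer records declarations from parsed source, whereas this sketch uses PHP's Reflection API on functions that are already loaded. The name `argument_count_problem()` is hypothetical, and the returned labels borrow HpHp's error names.

```php
<?php
/**
 * Compare an argument count against a loaded function's signature.
 * Returns 'TooFewArgument', 'TooManyArgument', or null when the count fits.
 * Hypothetical helper for illustration only.
 */
function argument_count_problem($function, $argCount)
{
    $ref = new ReflectionFunction($function);

    if ($argCount < $ref->getNumberOfRequiredParameters()) {
        return 'TooFewArgument';
    }
    // Variadic functions accept any number of extra arguments.
    if (!$ref->isVariadic() && $argCount > $ref->getNumberOfParameters()) {
        return 'TooManyArgument';
    }
    return null;
}
```

For the earlier `hash_hmac('sha1', 'foo')` example, `argument_count_problem('hash_hmac', 2)` should report `TooFewArgument`, because the key is a required third parameter. Functions that read extra arguments through `func_get_args()` would be flagged as false positives by this sketch.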
  • 47. Slice off HpHp?
      • The HpHp runtime is nice, but really complicated and a moving target.
      • Can we slice out the analysis part of HpHp?
      • Much simpler to build, easier to hack on.
  • 48. Or Build New?
      • Can this run off “byte code” or hook into the parsing step of PHP?
      • Exec a snippet of PHP for loading the script files?
      • Seems feasible
  • 50. Thanks
      • The Facebook Team!
      • Sebastian Bergmann, who first blogged about using HpHp for static analysis
      • Rasmus, who first hacked up a version of HpHp in house at Etsy
      • The QA and DevTools teams at Etsy
      • All the Etsy developers who had some painful weeks getting the code in shape!
  • 51. Facebook References
      • https://github.com/facebook/hiphop-php Main source repo + wiki
      • http://developers.facebook.com/blog/post/2010/02/02/hiphop-for-php--move-fast/ Main announcement, 2010-02-02
      • https://www.facebook.com/note.php?note_id=416880943919 Update 2012-08-13
  • 52. Notes from Sebastian Bergmann
      • http://sebastian-bergmann.de/archives/894-Using-HipHop-for-Static-Analysis.html Static Analysis Intro, 2010-07-27
      • http://sebastian-bergmann.de/archives/918-Static-Analysis-with-HipHop-for-PHP.html Tool to help process output, 2012-01-27
  • 53. Misc References
      • http://arstechnica.com/business/2011/12/facebook-looks-to-fix-php-performance-with-hiphop-virtual-machine/ ArsTechnica overview, 2011-12-13
      • http://www.serversidemagazine.com/news/10-questions-with-facebook-research-engineer-andrei-alexandrescu/ Lots of good stuff in here, 2012-01-29
  • 54. This Talk
      • These slides are posted at http://slidesha.re/KzTfLy
      • Tools for building on CentOS: https://github.com/client9/hphp-tools
      • More about Nick Galbreath: http://client9.com/
  • 55. Nick Galbreath nickg@etsy.com @ngalbreath PHPDay Verona Italy May 19, 2012 http://2012.phpday.it/
