PHP Internals and Virtual Machine


Published in: Internet

  1. PHP's Virtual Machine
  2. Hello
     ● Julien PAULI
     ● Programming in PHP since the early 2000s
     ● PHP Internals hacker and trainer
     ● PHP 5.5/5.6 Release Manager
     ● Working at SensioLabs in Paris - Blackfire
     ● Writing PHP tech articles and books
     ● http://phpinternalsbook.com
     ● @julienpauli - github.com/jpauli - jpauli@php.net
     ● Likes working on OSS such as PHP :-)
  3. PHP
     ● A program in itself
       ● Written in C
     ● Goal: define a programming language for the Web
       ● High level, interpreted
     ● Interpreted language
       ● Less efficient than a language compiled to native instructions
       ● But simpler to handle
  4. PHP
  5. PHP from inside
     ● A software virtual machine
       ● Compiler/Executor
       ● Intermediate OPCode
       ● Single-threaded, single-process
     ● Automatic dynamic memory management
       ● Memory manager
       ● Garbage collector
  6. Zend Engine
     ● The heart of PHP
     ● An extensible part
       ● extensions and zend_extensions can change it
     ● A Virtual Machine
       ● A compiler
       ● An executor
       ● Some utilities
     ● OPCache
       ● A Zend extension that plays with the engine deeply
       ● The compiler optimizer is stored in OPCache
  7. Request handling steps
     ● Startup (memory allocations)
     ● Compilation
       ● Lexical and syntactic analysis
       ● Compilation (OPCode generation)
     ● Execution
       ● OPCode interpretation
       ● Several VM flavors
       ● Include/require/eval = go back to compilation
     ● Shutdown (free resources)
       ● "Share nothing architecture"
     [Diagram labels: Startup, Shutdown, zend_compile_file(), zend_execute()]
  8. Script execution
     ● Compilation
     ● Optimization (OPCache)
     ● Execution
     ● Destruction
  9. Lexical analysis (lexing)
     ● Character recognition
     ● Transforms chars into tokens
     ● Lexer generator: Re2c
       ● http://re2c.org/
     ● http://www.php.net/manual/fr/tokens.php
     ● highlight_file()
     ● highlight_string()
     ● compile_file()
     ● compile_string()
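To make "transform chars to tokens" concrete, here is a minimal hand-written sketch of a lexer in C. It is not the scanner re2c generates for PHP: the `toy_token` kinds and the `toy_next_token()` function are hypothetical names invented for this illustration, and only the keyword `print`, single-quoted strings (without escape handling) and `;` are recognized.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds for this sketch (PHP's real tokens are T_PRINT etc.). */
typedef enum { TOK_PRINT, TOK_STRING, TOK_SEMICOLON, TOK_UNKNOWN, TOK_END } toy_token;

/*
 * Scan one token starting at *cursor and advance *cursor past it.
 * Recognizes only: the keyword "print", single-quoted strings, and ';'.
 */
toy_token toy_next_token(const char **cursor)
{
    const char *p;

    assert(cursor != NULL && *cursor != NULL);
    p = *cursor;

    /* Whitespace separates tokens but is not a token itself here. */
    while (isspace((unsigned char)*p)) {
        p++;
    }
    if (*p == '\0') {
        *cursor = p;
        return TOK_END;
    }
    /* Keyword: "print" not followed by an identifier character. */
    if (strncmp(p, "print", 5) == 0 && !isalnum((unsigned char)p[5]) && p[5] != '_') {
        *cursor = p + 5;
        return TOK_PRINT;
    }
    /* Single-quoted string: scan up to the closing quote (escapes are ignored). */
    if (*p == '\'') {
        const char *q = p + 1;
        while (*q != '\0' && *q != '\'') {
            q++;
        }
        if (*q == '\'') {
            *cursor = q + 1;
            return TOK_STRING;
        }
        *cursor = q;            /* unterminated string */
        return TOK_UNKNOWN;
    }
    if (*p == ';') {
        *cursor = p + 1;
        return TOK_SEMICOLON;
    }
    *cursor = p + 1;            /* skip one unrecognized character */
    return TOK_UNKNOWN;
}
```

Calling `toy_next_token()` repeatedly on the text `print 'foo';` yields a keyword token, a string token and a semicolon token, which is the kind of stream a parser then consumes.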
  10. Syntactic analysis (parsing)
      ● "Understands" a set of tokens
      ● Defines the language syntax
      ● Parser generator: GNU/Bison (LALR)
      ● For each token or token set
        ● → Execute a function to generate an AST statement
        ● → Go to the next token
        ● → Can generate a "Parse error" and halt
      ● Tightly coupled to the lexical analyzer
  11. zend_language_parser.y
      ● ext/tokenizer

      statement:
              '{' inner_statement_list '}' { $$ = $2; }
          |   if_stmt { $$ = $1; }
          |   alt_if_stmt { $$ = $1; }
          |   T_WHILE '(' expr ')' while_statement
                  { $$ = zend_ast_create(ZEND_AST_WHILE, $3, $5); }
          |   T_DO statement T_WHILE '(' expr ')' ';'
                  { $$ = zend_ast_create(ZEND_AST_DO_WHILE, $2, $5); }
          |   T_FOR '(' for_exprs ';' for_exprs ';' for_exprs ')' for_statement
                  { $$ = zend_ast_create(ZEND_AST_FOR, $3, $5, $7, $9); }
          |   T_SWITCH '(' expr ')' switch_case_list
                  { $$ = zend_ast_create(ZEND_AST_SWITCH, $3, $5); }
          |   T_BREAK optional_expr ';' { $$ = zend_ast_create(ZEND_AST_BREAK, $2); }
          |   T_CONTINUE optional_expr ';' { $$ = zend_ast_create(ZEND_AST_CONTINUE, $2); }
          |   T_RETURN optional_expr ';' { $$ = zend_ast_create(ZEND_AST_RETURN, $2); }

      $(YACC) -p zend -v -d $(srcdir)/zend_language_parser.y -o zend_language_parser.c
  12. Compilation
      ● Invoked on the final AST
        ● Userland AST: https://github.com/nikic/php-ast
      ● Creates an OPCodes array
      ● OPCode = low-level VM instruction
        ● Somewhat similar to low-level assembly
      ● The compilation step is very heavy
        ● Lots of checks and memory accesses
        ● Address resolutions and computations
        ● Many stacks and memory pools
      ● Some early optimizations/computations are performed
  13. Optimization
      ● Optimizations are done by ext/opcache
      ● The optimizer is very heavy (in PHP 7)
      ● Steps are defined in the opcache.optimization_level INI setting

      #define ZEND_OPTIMIZER_PASS_1  (1<<0)  /* CSE, STRING construction */
      #define ZEND_OPTIMIZER_PASS_2  (1<<1)  /* Constant conversion and jumps */
      #define ZEND_OPTIMIZER_PASS_3  (1<<2)  /* ++, +=, series of jumps */
      #define ZEND_OPTIMIZER_PASS_4  (1<<3)  /* INIT_FCALL_BY_NAME -> DO_FCALL */
      #define ZEND_OPTIMIZER_PASS_5  (1<<4)  /* CFG based optimization */
      #define ZEND_OPTIMIZER_PASS_6  (1<<5)  /* DFA based optimization */
      #define ZEND_OPTIMIZER_PASS_7  (1<<6)  /* CALL GRAPH optimization */
      #define ZEND_OPTIMIZER_PASS_8  (1<<7)  /* SCCP (constant propagation) */
      #define ZEND_OPTIMIZER_PASS_9  (1<<8)  /* TMP VAR usage */
      #define ZEND_OPTIMIZER_PASS_10 (1<<9)  /* NOP removal */
      #define ZEND_OPTIMIZER_PASS_11 (1<<10) /* Merge equal constants */
      #define ZEND_OPTIMIZER_PASS_12 (1<<11) /* Adjust used stack */
      #define ZEND_OPTIMIZER_PASS_13 (1<<12) /* Remove unused variables */
      #define ZEND_OPTIMIZER_PASS_14 (1<<13) /* DCE (dead code elimination) */
      #define ZEND_OPTIMIZER_PASS_15 (1<<14) /* (unsafe) Collect constants */
      #define ZEND_OPTIMIZER_PASS_16 (1<<15) /* Inline functions */
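Since each pass is one bit, the optimization level can be read as a bitmask. The sketch below shows how such a mask could be tested for a given pass number; `TOY_OPTIMIZER_PASS` and `toy_pass_enabled()` are hypothetical helpers written for this illustration, not functions from ext/opcache, and they only assume the bit layout shown on the slide (pass N is bit N - 1).

```c
#include <assert.h>
#include <stdbool.h>

/* Same bit layout as the slide: pass N corresponds to bit (N - 1). */
#define TOY_OPTIMIZER_PASS(n) (1u << ((n) - 1))

/* Returns true when pass `n` (1..16) is enabled in `optimization_level`. */
bool toy_pass_enabled(unsigned int optimization_level, int n)
{
    assert(n >= 1 && n <= 16);
    return (optimization_level & TOY_OPTIMIZER_PASS(n)) != 0;
}
```

For instance, a level of `0x5` has bits 0 and 2 set, so under this layout passes 1 and 3 would run while pass 2 would be skipped.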
  14. First easy example

      <?php
      print 'foo';
  15. Compilation easy example

      <?php
      print 'foo';

      lexing:
      <ST_IN_SCRIPTING>"print" { return T_PRINT; }

      parsing:
      T_PRINT expr { $$ = zend_ast_create(ZEND_AST_PRINT, $2); }
  16. Compilation easy example

      T_PRINT expr { $$ = zend_ast_create(ZEND_AST_PRINT, $2); }

      compiling:
      case ZEND_AST_PRINT:
          zend_compile_print(result, ast);
          return;

      void zend_compile_print(znode *result, zend_ast *ast) /* {{{ */
      {
          zend_op *opline;
          zend_ast *expr_ast = ast->child[0];

          znode expr_node;
          zend_compile_expr(&expr_node, expr_ast);

          opline = zend_emit_op(NULL, ZEND_ECHO, &expr_node, NULL);
          opline->extended_value = 1;

          result->op_type = IS_CONST;
          ZVAL_LONG(&result->u.constant, 1);
      }
  17. Execution
      ● Executes OPCodes
      ● Most complex part of the Zend Engine
      ● VM executor
        ● zend_vm_execute.h
      ● Each OPCode
        ● is run through a handler() function
        ● "zend_vm_handler"
        ● runs its instructions in an infinite dispatch loop
        ● Branching is possible (loops, catch blocks, gotos, etc.)
      [Diagram labels: Startup, Shutdown, zend_compile_file(), zend_execute()]
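The idea of "one handler per OPCode, run in a dispatch loop, with branching" can be sketched in a few lines of C. This is a toy accumulator machine and not the Zend executor: the opcodes, the `toy_vm` structure and `toy_execute()` are all hypothetical names chosen for this illustration, and the real VM offers several dispatch strategies that this sketch does not attempt to reproduce.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for this sketch. */
enum { TOY_SET_ACC, TOY_SET_CNT, TOY_ADD, TOY_LOOP, TOY_RETURN, TOY_OPCODE_COUNT };

typedef struct { int opcode; int operand; } toy_op;

typedef struct {
    const toy_op *ops;  /* OPCode array produced by a "compiler" */
    size_t ip;          /* index of the next OPCode to run */
    int acc;            /* accumulator */
    int counter;        /* loop counter */
    int running;
} toy_vm;

typedef void (*toy_handler)(toy_vm *vm, const toy_op *op);

/* Each handler performs its work and decides where execution continues. */
static void h_set_acc(toy_vm *vm, const toy_op *op) { vm->acc = op->operand; vm->ip++; }
static void h_set_cnt(toy_vm *vm, const toy_op *op) { vm->counter = op->operand; vm->ip++; }
static void h_add(toy_vm *vm, const toy_op *op)     { vm->acc += op->operand; vm->ip++; }

/* Branching: decrement the counter and jump back while it is still positive. */
static void h_loop(toy_vm *vm, const toy_op *op)
{
    if (--vm->counter > 0) {
        vm->ip = (size_t)op->operand;
    } else {
        vm->ip++;
    }
}

static void h_return(toy_vm *vm, const toy_op *op) { (void)op; vm->running = 0; }

/* Handler table indexed by opcode number. */
static const toy_handler toy_handlers[TOY_OPCODE_COUNT] = {
    h_set_acc, h_set_cnt, h_add, h_loop, h_return
};

/* Dispatch loop: fetch the current OPCode, call its handler, repeat until RETURN. */
int toy_execute(const toy_op *ops)
{
    toy_vm vm = { ops, 0, 0, 0, 1 };

    assert(ops != NULL);
    while (vm.running) {
        const toy_op *op = &vm.ops[vm.ip];
        assert(op->opcode >= 0 && op->opcode < TOY_OPCODE_COUNT);
        toy_handlers[op->opcode](&vm, op);
    }
    return vm.acc;
}
```

A program made of `SET_ACC 0`, `SET_CNT 3`, `ADD 5`, `LOOP 2`, `RETURN` runs the `ADD` three times through the backward jump before reaching `RETURN`.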
  18. ZEND_ECHO

      ZEND_VM_HANDLER(40, ZEND_ECHO, CONST|TMPVAR|CV, ANY)
      {
          USE_OPLINE
          zend_free_op free_op1;
          zval *z;

          SAVE_OPLINE();
          z = GET_OP1_ZVAL_PTR_UNDEF(BP_VAR_R);

          if (Z_TYPE_P(z) == IS_STRING) {
              zend_string *str = Z_STR_P(z);

              if (ZSTR_LEN(str) != 0) {
                  zend_write(ZSTR_VAL(str), ZSTR_LEN(str));
              }
          } else {
              zend_string *str = _zval_get_string_func(z);

              if (ZSTR_LEN(str) != 0) {
                  zend_write(ZSTR_VAL(str), ZSTR_LEN(str));
              } else if (OP1_TYPE == IS_CV && UNEXPECTED(Z_TYPE_P(z) == IS_UNDEF)) {
                  GET_OP1_UNDEF_CV(z, BP_VAR_R);
              }
              zend_string_release(str);
          }
  19. OPCode?
      ● php -dzend_extension=opcache -dopcache.enable_cli=1 -dopcache.opt_debug_level=0x30000 /tmp/script.php

      L0 (3): CV0($options) = RECV 1
      L1 (5): CV1($invalid) = QM_ASSIGN array(...)
      L2 (6): V4 = FE_RESET_R CV0($options) L18
      L3 (6): T5 = FE_FETCH_R V4 CV2($value) L18
      L4 (6): ASSIGN CV3($key) T5
      ...
  20. Many OPCodes
      ● OPCodes are low-level VM instructions
      ● There are many of them, more and more as PHP evolves
        ● ~200 flavors in PHP 7
      ● See the list in Zend/zend_vm_opcodes.h
        ● ZEND_ADD
        ● ZEND_DECLARE_ANON_CLASS
        ● ZEND_FE_RESET
        ● ZEND_ADD_TRAIT
        ● ZEND_YIELD_FROM
        ● ...
  21. OPCode handlers
      ● Each OPCode is handled by a handler (a function)
      ● It takes up to 3 arguments and produces exactly one result
      ● Arguments and result are "variables" as you know them
        ● ZEND_ADD(num1, num2) : result_num
        ● ZEND_DECLARE_ANON_CLASS(class) : result_bool
        ● ZEND_FE_RESET(array_or_object) : result
        ● ZEND_ADD_TRAIT(class, trait) : result_bool
        ● ZEND_YIELD_FROM(cur_gen, gen_from) : result
  22. ZEND_CONCAT example

      ZEND_VM_HANDLER(8, ZEND_CONCAT, CONST|TMPVAR|CV, CONST|TMPVAR|CV)
      {
          USE_OPLINE
          zend_free_op free_op1, free_op2;
          zval *op1, *op2;

          op1 = GET_OP1_ZVAL_PTR_UNDEF(BP_VAR_R);
          op2 = GET_OP2_ZVAL_PTR_UNDEF(BP_VAR_R);

          if ((OP1_TYPE == IS_CONST || EXPECTED(Z_TYPE_P(op1) == IS_STRING)) &&
              (OP2_TYPE == IS_CONST || EXPECTED(Z_TYPE_P(op2) == IS_STRING))) {
              zend_string *op1_str = Z_STR_P(op1);
              zend_string *op2_str = Z_STR_P(op2);
              zend_string *str;

              if (OP1_TYPE != IS_CONST && OP1_TYPE != IS_CV &&
                  !ZSTR_IS_INTERNED(op1_str) && GC_REFCOUNT(op1_str) == 1) {
                  size_t len = ZSTR_LEN(op1_str);

                  str = zend_string_extend(op1_str, len + ZSTR_LEN(op2_str), 0);
                  memcpy(ZSTR_VAL(str) + len, ZSTR_VAL(op2_str), ZSTR_LEN(op2_str)+1);
                  ZVAL_NEW_STR(EX_VAR(opline->result.var), str);
                  FREE_OP2();
              }
              ...
  23. More complex example

      function check_options(array $options)
      {
          $invalid = [];
          foreach ($options as $key => $value) {
              if (array_key_exists($key, $this->options)) {
                  $this->options[$key] = $value;
              } else {
                  $invalid[] = $key;
              }
          }
      }

      L0 (4): CV0($options) = RECV 1
      L1 (6): ASSIGN CV1($invalid) array(...)
      L2 (7): V5 = FE_RESET_R CV0($options) L18
      L3 (7): T6 = FE_FETCH_R V5 CV2($value) L18
      L4 (7): ASSIGN CV3($key) T6
      L5 (8): INIT_FCALL 2 112 string("array_key_exists")
      L6 (8): SEND_VAR CV3($key) 1
      L7 (8): T8 = FETCH_OBJ_R THIS string("options")
      L8 (8): SEND_VAL T8 2
      L9 (8): V9 = DO_ICALL
      L10 (8): JMPZ V9 L15
      L11 (9): V10 = FETCH_OBJ_W THIS string("options")
      L12 (9): ASSIGN_DIM V10 CV3($key)
      L13 (9): OP_DATA CV2($value)
      L14 (9): JMP L17
      L15 (11): ASSIGN_DIM CV1($invalid) NEXT
      L16 (11): OP_DATA CV3($key)
      L17 (7): JMP L3
      L18 (7): FE_FREE V5
      L19 (14): RETURN null
  24. OPCode Cache
      ● First time
        ● Compile
        ● Cache to SHM or cache file
        ● Execute
      ● Then, if the file did not change
        ● Load from SHM or cache file
        ● Execute
      ● Compilation is very heavy
        ● Optimization can be as well
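The "compile once, then reuse while the file is unchanged" flow can be sketched as a small lookup table. The code below is a toy model and not OPCache's implementation: the table lives in ordinary process memory instead of SHM, "file did not change" is assumed to mean "same modification time" (passed in by the caller), and `toy_cache_fetch()`, `toy_compile()` and `toy_compile_count` are hypothetical names.

```c
#include <assert.h>
#include <string.h>

#define TOY_CACHE_SLOTS 8
#define TOY_PATH_MAX    64

typedef struct {
    char path[TOY_PATH_MAX];
    long mtime;        /* modification time seen when the file was compiled */
    int  compiled_id;  /* stand-in for the cached OPCode array */
    int  used;
} toy_cache_entry;

static toy_cache_entry toy_cache[TOY_CACHE_SLOTS];
static int toy_compile_count = 0;  /* how many times the heavy step ran */

/* Stand-in for the expensive compile (and optimize) step. */
static int toy_compile(const char *path)
{
    (void)path;
    return ++toy_compile_count;
}

/*
 * Return the cached "OPCodes" for `path`, compiling only when the file is
 * unknown or its modification time changed. Returns -1 if the table is full.
 */
int toy_cache_fetch(const char *path, long mtime)
{
    int i, free_slot = -1;

    assert(path != NULL && strlen(path) < TOY_PATH_MAX);

    for (i = 0; i < TOY_CACHE_SLOTS; i++) {
        if (!toy_cache[i].used) {
            if (free_slot < 0) {
                free_slot = i;
            }
            continue;
        }
        if (strcmp(toy_cache[i].path, path) == 0) {
            if (toy_cache[i].mtime != mtime) {       /* file changed: recompile */
                toy_cache[i].mtime = mtime;
                toy_cache[i].compiled_id = toy_compile(path);
            }
            return toy_cache[i].compiled_id;          /* cache hit otherwise */
        }
    }
    if (free_slot < 0) {
        return -1;
    }
    strcpy(toy_cache[free_slot].path, path);          /* first time: compile and store */
    toy_cache[free_slot].mtime = mtime;
    toy_cache[free_slot].compiled_id = toy_compile(path);
    toy_cache[free_slot].used = 1;
    return toy_cache[free_slot].compiled_id;
}
```

Fetching the same path twice with the same modification time compiles only once; a different modification time triggers a recompilation.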
  25. Compilation / Execution

      function foo()
      {
          $data = file('/etc/fstab');
          sort($data);
          return $data;
      }

      for($i=0; $i<=$argv[1]; $i++) {
          $a = foo();
          $a[] = range(0, $i);
          $result[] = $a;
      }
      var_dump($result);

      argv = 1:
      main()==>run_init::tmp/php.php//1 241
      main()==>compile::tmp/php.php//1 89

      argv = 10:
      main()==>run_init::tmp/php.php//1 1731
      main()==>compile::tmp/php.php//1 89
  26. Other topics, in brief
  27. CLI
      ● Even if it is used more and more, PHP has not been designed to run in CLI (for long-running scripts)
      ● In long-running CLI scripts ("consumers"), the VM never stops
        ● PHP never stops, thus never reaches its memory cleanup step
        ● The "current request" memory is then never freed
      ● Even with the GC on, the programmer has to take real care not to create "memory leaks"
        ● And for that, they have to master how PHP works internally
        ● Or use a low-level memory debugger, like valgrind/massif
      ● OPCode caches and optimizers are pretty useless to CLI
        ● Optimization can be worth it
        ● Avoiding compilation is useless, as runtime takes most of the time
  28. JIT?
      ● JIT is a complex topic, and it is coming to PHP 8
        ● Still under development
      ● It should accelerate very CPU-intensive tasks
        ● That is: usually not web applications
        ● Unless you really process that much data per request, which you shouldn't do anyway with PHP
      ● But CLI scripts will mainly benefit from JIT (composer?)
      ● Take care, as it won't accelerate any IO-intensive tasks
        ● And we tend to run some using PHP nowadays
        ● (That is, "event loops" and things like that)
  29. PHP's memory consumption
      ● Know what you are talking about and what you're doing
      ● Know your OS and memory allocators
      ● memory_get_usage(): size used by your runtime code
      ● memory_get_usage(true): size allocated through the OS
      ● ZendMM caches blocks
        ● Use gc_mem_caches() to reclaim them if needed
      ● Use your OS to be accurate

      php> echo memory_get_usage();
      625272
      php> echo memory_get_usage(1);
      786432

      cat /proc/13399/status
      Name:   php
      State:  S (sleeping)
      VmPeak: 154440 kB
      VmSize: 133700 kB
      VmRSS:  10304 kB
      VmData: 4316 kB
      VmStk:  136 kB
      VmExe:  9876 kB
  30. A software VM
      ● PHP internally works the same as
        ● Java
        ● Python
        ● Ruby
        ● Lua
        ● [... others ]
      ● But PHP's VM is not threaded, it runs a monolithic path
      ● PHP's VM compiler/optimizer/interpreter are merged into the PHP source code
        ● zend_compile.c
        ● ext/opcache/Optimizer/zend_optimizer.c
        ● zend_vm_def.h / zend_execute.c
  31. Thank you for listening
