Hacking Go Compiler Internals / GoCon 2014 Autumn


  1. 1. Hacking Go Compiler Internals Moriyoshi Koizumi <mozo@mozo.jp>
  2. 2. Intended Audience • An eccentric Go programmer who happens to want to add feature XX to the language, knowing her patch will never be merged. • A keen-minded programmer who wants to know how the compiler works.
  3. 3. Overall Architecture • Lexer → Parser → Escape Analysis → Typegen → GCproggen → Codegen
  4. 4. Phase 1. Lexer
  5. 5. Lexer • A lexer scans over the source code and cuts it into a bunch of meaningful chunks called tokens (the first abstraction). • Example: a := b + c() → LNAME LCOLAS LNAME + LNAME ( )
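The same cut-into-tokens step can be observed from ordinary Go code. The sketch below is an illustration only: it uses the standard library's go/scanner package, which is a separate implementation from the C lexer discussed in these slides, so the token names (IDENT, :=, +) differ from gc's L-prefixed names. The helper name tokenize is made up for this example.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokenize returns one string per token that go/scanner finds in src.
// Tokens that carry a literal (identifiers, inserted semicolons) show it quoted.
func tokenize(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("example.go", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // nil error handler, default mode

	var out []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		if lit != "" {
			out = append(out, fmt.Sprintf("%s %q", tok, lit))
		} else {
			out = append(out, tok.String())
		}
	}
	return out
}

func main() {
	for _, t := range tokenize("a := b + c()") {
		fmt.Println(t)
	}
}
```

The last token printed is a semicolon whose literal is a newline: go/scanner inserts it automatically at the end of the statement, which is the standard-library counterpart of the implicit-semicolon handling in the C lexer.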
  6. 6. Lexer • src/cmd/gc/lex.c
         static int32
         _yylex(void)
         {
             ...
         l0:
             c = getc();
             if(yy_isspace(c)) {
                 if(c == '\n' && curio.nlsemi) {
                     ungetc(c);
                     DBG("lex: implicit semi\n");
                     return ';';
                 }
                 goto l0;
             }
             ...
  7. 7. Lexer
         ...
         switch(c) {
         ...
         case '+':
             c1 = getc();
             if(c1 == '+') {
                 c = LINC;
                 goto lx;
             }
             if(c1 == '=') {
                 c = OADD;
                 goto asop;
             }
             break;
         ...
         }
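The two C excerpts show two recurring lexer tricks: turning a newline into an implicit semicolon, and using one character of lookahead to tell +, ++ and += apart. A toy Go version of just those two tricks might look like the following. Every name in it (lexer, next, lexAll, the nlsemi field) is hypothetical and only mirrors the shape of _yylex; it is not how gc stores its state.

```go
package main

import "fmt"

// lexer is a toy scanner over a string. It only knows identifiers,
// whitespace, and the operators "+", "++" and "+=".
type lexer struct {
	src    string
	pos    int
	nlsemi bool // true when the next newline should be reported as ";"
}

func isLetter(c byte) bool {
	return c == '_' || ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z')
}

// next returns the next token as a string, or "" at end of input.
func (l *lexer) next() string {
	for l.pos < len(l.src) {
		c := l.src[l.pos]
		l.pos++
		switch {
		case c == '\n' && l.nlsemi:
			l.nlsemi = false
			return ";" // implicit semicolon
		case c == ' ' || c == '\t' || c == '\n':
			continue // skip whitespace
		case c == '+':
			// One character of lookahead decides between "+", "++" and "+=".
			l.nlsemi = false
			if l.pos < len(l.src) {
				switch l.src[l.pos] {
				case '+':
					l.pos++
					l.nlsemi = true // "x++" may end a statement
					return "++"
				case '=':
					l.pos++
					return "+="
				}
			}
			return "+"
		case isLetter(c):
			start := l.pos - 1
			for l.pos < len(l.src) && isLetter(l.src[l.pos]) {
				l.pos++
			}
			l.nlsemi = true // an identifier may end a statement
			return l.src[start:l.pos]
		default:
			return "?" // anything else is out of scope for this sketch
		}
	}
	return ""
}

// lexAll collects every token of src.
func lexAll(src string) []string {
	l := &lexer{src: src}
	var out []string
	for tok := l.next(); tok != ""; tok = l.next() {
		out = append(out, tok)
	}
	return out
}

func main() {
	fmt.Println(lexAll("a += b\ni++\nc + d"))
}
```

Unlike the real lexer, this sketch does not emit a semicolon at end of input and returns plain strings instead of token codes with attached values.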
  8. 8. When do you want to hack the lexer • To change the spelling of a keyword or predeclared name such as func and make. • To change an operator only cosmetically (e.g. != → ~=). • To change how literals and identifiers are represented. • To add a new keyword or operator to the language for later use in the parser.
  9. 9. Example: Emojis for identifiers • http://moriyoshi.hatenablog.com/entry/2014/06/03/121728 • Go doesn’t treat emojis as part of identifiers: ./sushi.go:8: invalid identifier character U+1f363 • But I wanted to have 寿司 (in the source)
  10. 10. Example: Emojis for identifiers • Patched the following place to let it accept emojis:
          if(c >= Runeself) {
              ungetc(c);
              rune = getr();
              // 0xb7 · is used for internal names
              if(!isalpharune(rune) && !isdigitrune(rune) && (importpkg == nil || rune != 0xb7))
                  yyerror("invalid identifier character U+%04x", rune);
              cp += runetochar(cp, &rune);
          } else if(!yy_isalnum(c) && c != '_')
              break;
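Why U+1F363 is rejected can be checked with Go's unicode package: the sushi emoji is neither a letter nor a digit, it belongs to the "Symbol, other" (So) category. The sketch below approximates the stock rule with unicode.IsLetter and unicode.IsDigit, which is an assumption (gc uses its own isalpharune and isdigitrune). The relaxed predicate is one hypothetical way to loosen the check and is not necessarily what the actual patch does.

```go
package main

import (
	"fmt"
	"unicode"
)

// strictIdentRune approximates the stock rule: beyond '_',
// only letters and digits may appear in an identifier.
func strictIdentRune(r rune) bool {
	return r == '_' || unicode.IsLetter(r) || unicode.IsDigit(r)
}

// relaxedIdentRune is a hypothetical relaxation that also accepts runes
// in the "Symbol, other" category, which covers many emoji.
func relaxedIdentRune(r rune) bool {
	return strictIdentRune(r) || unicode.Is(unicode.So, r)
}

// validIdent reports whether name is non-empty, does not start with a digit,
// and consists only of runes accepted by ok.
func validIdent(name string, ok func(rune) bool) bool {
	for i, r := range name {
		if !ok(r) || (i == 0 && unicode.IsDigit(r)) {
			return false
		}
	}
	return name != ""
}

func main() {
	for _, name := range []string{"sushi", "寿司", "🍣"} {
		fmt.Printf("%s strict=%v relaxed=%v\n", name,
			validIdent(name, strictIdentRune), validIdent(name, relaxedIdentRune))
	}
}
```

Only the emoji changes its answer between the two predicates; kanji such as 寿司 are letters and pass the strict rule already.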
  11. 11. Phase 2. Parser
  12. 12. Parser • The parser repeatedly calls the lexer to fetch tokens and builds an abstract syntax tree (AST) that represents the source code. • The AST is then retouched during the type inference and assertion phase (the “typecheck” and “walk” sub-phases) so that it becomes less verbose and carries information helpful for the later stages. • src/cmd/gc/go.y, src/cmd/gc/dcl.c, src/cmd/gc/typecheck.c, src/cmd/gc/walk.c, src/cmd/gc/reflect.c
  13. 13. Parser • Tokens: LNAME LCOLAS LNAME + LNAME ( ) • AST: OAS(ONAME, OADD(ONAME, OCALL(ONAME, ∅)))
  14. 14. Parser • src/cmd/gc/go.y
          …
          /*
           * expressions
           */
          expr:
              uexpr
          |   expr LOROR expr
              {
                  $$ = nod(OOROR, $1, $3);
              }
          |   expr LANDAND expr
              {
                  $$ = nod(OANDAND, $1, $3);
              }
          …
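To see the tokens-to-tree step without building the compiler, the standard library's go/parser can stand in for go.y. This is an illustration under that assumption: go/parser is a hand-written parser with its own node types (ast.AssignStmt, ast.BinaryExpr, ast.CallExpr), not the yacc grammar and OAS/OADD/OCALL nodes shown in the slides. The helpers parseStmt and sexpr are invented for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// sexpr renders the few node types this example needs as an S-expression.
func sexpr(n ast.Node) string {
	switch x := n.(type) {
	case *ast.AssignStmt:
		return "(" + x.Tok.String() + " " + sexpr(x.Lhs[0]) + " " + sexpr(x.Rhs[0]) + ")"
	case *ast.BinaryExpr:
		return "(" + x.Op.String() + " " + sexpr(x.X) + " " + sexpr(x.Y) + ")"
	case *ast.CallExpr:
		parts := []string{"call", sexpr(x.Fun)}
		for _, arg := range x.Args {
			parts = append(parts, sexpr(arg))
		}
		return "(" + strings.Join(parts, " ") + ")"
	case *ast.Ident:
		return x.Name
	}
	return fmt.Sprintf("<%T>", n) // node kind not covered by this sketch
}

// parseStmt parses a single statement by wrapping it in a function body.
func parseStmt(stmt string) (ast.Stmt, error) {
	src := "package p\nfunc f() {\n" + stmt + "\n}\n"
	file, err := parser.ParseFile(token.NewFileSet(), "example.go", src, 0)
	if err != nil {
		return nil, err
	}
	return file.Decls[0].(*ast.FuncDecl).Body.List[0], nil
}

func main() {
	for _, src := range []string{"a := b + c()", "x = p || q && r"} {
		stmt, err := parseStmt(src)
		if err != nil {
			panic(err)
		}
		fmt.Println(sexpr(stmt))
	}
}
```

The second statement shows precedence at work: && binds tighter than ||, so the || node ends up at the top of the expression, much as the OOROR rule in go.y would build it.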
  15. 15. Example: Bracket operator overload! • Let the following code (A) expand to (B) • https://gist.github.com/moriyoshi/c0e2b2f9be6883e33251
          (A)
          a := &struct{}{}
          fmt.Println(a[1])
          a[1] = "test2"
          (B)
          fmt.Println(a.__getindex(1))
          a.__setindex(1, "test2")
  16. 16. Example: Bracket operator overload! • Things to do: • Introduce a new AST node type (e.g. OINDEXINTER). • Add a branch point in “typecheck” to handle the case where the indexed target is neither a string, an array, a slice nor a map. • Add code to “walk” that specially treats assignments and dereferences involving that kind of node. That code synthesizes nodes that invoke the special functions, then typechecks and walks the synthesized nodes recursively. • Don’t forget to take care of the evaluation order corrector.
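The (A) to (B) expansion can be prototyped outside the compiler as a source-level AST rewrite. The sketch below uses go/parser and go/ast, not the typecheck and walk passes of gc. It has no type information, so it rewrites every index expression it sees, whereas the real hack only takes this path when the operand is not a string, array, slice or map. The helper names methodCall, rewriteExpr, rewriteStmt and rewrite are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// methodCall builds the expression recv.name(args...).
func methodCall(recv ast.Expr, name string, args ...ast.Expr) ast.Expr {
	return &ast.CallExpr{
		Fun:  &ast.SelectorExpr{X: recv, Sel: ast.NewIdent(name)},
		Args: args,
	}
}

// rewriteExpr replaces x[i] with x.__getindex(i). Only index expressions
// and call arguments are traversed, which is enough for the example.
func rewriteExpr(e ast.Expr) ast.Expr {
	switch x := e.(type) {
	case *ast.IndexExpr:
		return methodCall(rewriteExpr(x.X), "__getindex", rewriteExpr(x.Index))
	case *ast.CallExpr:
		for i, arg := range x.Args {
			x.Args[i] = rewriteExpr(arg)
		}
	}
	return e
}

// rewriteStmt turns x[i] = v into x.__setindex(i, v) and rewrites reads in
// expression statements. It returns nil for statements it does not cover.
func rewriteStmt(s ast.Stmt) ast.Expr {
	switch x := s.(type) {
	case *ast.AssignStmt:
		ix, ok := x.Lhs[0].(*ast.IndexExpr)
		if ok && x.Tok == token.ASSIGN && len(x.Lhs) == 1 && len(x.Rhs) == 1 {
			return methodCall(rewriteExpr(ix.X), "__setindex",
				rewriteExpr(ix.Index), rewriteExpr(x.Rhs[0]))
		}
	case *ast.ExprStmt:
		return rewriteExpr(x.X)
	}
	return nil
}

// rewrite parses one statement and returns its rewritten source text.
func rewrite(stmt string) string {
	src := "package p\nfunc f() {\n" + stmt + "\n}\n"
	file, err := parser.ParseFile(token.NewFileSet(), "example.go", src, 0)
	if err != nil {
		return "error: " + err.Error()
	}
	e := rewriteStmt(file.Decls[0].(*ast.FuncDecl).Body.List[0])
	if e == nil {
		return stmt // left untouched
	}
	return types.ExprString(e)
}

func main() {
	fmt.Println(rewrite(`fmt.Println(a[1])`))
	fmt.Println(rewrite(`a[1] = "test2"`))
}
```

Handling the assignment before the generic read case matters: the index expression on the left-hand side must become a __setindex call, not a __getindex call that is then assigned to.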
  17. 17. Helpful functions to debug your hack • print(const char *, …) • This is actually printf() of standard libc. • Accepts the following extra format specifiers: • %N (node) • %T (type) • %E, %J, %H, %L, %O, %S, %V, %Z, %B, %F
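For readers more at home in Go than in the compiler's C code, the idea of a print function with extra verbs has a rough analogue in Go's fmt.Formatter interface. The sketch below is only that analogue: the Node type and its %N verb are invented here and say nothing about how gc's print implements its own specifiers.

```go
package main

import (
	"fmt"
	"io"
)

// Node is a hypothetical AST node used only for this illustration.
type Node struct {
	Op          string
	Left, Right *Node
}

// Format implements fmt.Formatter. The custom verb %N prints the whole
// subtree; any other verb prints just the operator name.
func (n *Node) Format(f fmt.State, verb rune) {
	if n == nil {
		io.WriteString(f, "<nil>")
		return
	}
	if verb != 'N' || (n.Left == nil && n.Right == nil) {
		io.WriteString(f, n.Op)
		return
	}
	fmt.Fprintf(f, "%s(%N, %N)", n.Op, n.Left, n.Right)
}

// dump formats a tree with the custom %N verb.
func dump(n *Node) string {
	return fmt.Sprintf("%N", n)
}

func main() {
	name := func() *Node { return &Node{Op: "ONAME"} }
	// The shape produced for: a := b + c()
	tree := &Node{Op: "OAS", Left: name(), Right: &Node{
		Op: "OADD", Left: name(), Right: &Node{Op: "OCALL", Left: name()},
	}}
	fmt.Println(dump(tree))
}
```

A dedicated verb keeps debug statements short: one format string can mix plain values with whole subtrees.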
  18. 18. Roll-up • Go’s compiler internals may look complex at first glance, but they turn out to be pretty straightforward and hacker-friendly ;)
