This presentation shows how to hack the Ruby parser (parse.y) to add new syntax to the language. It covers four cases: 1) allowing :-) as hash-rocket syntax, 2) treating single quotes as symbols, 3) adding an increment operator ++, and 4) allowing method definitions such as def A#b. For each case, the presenter explains which changes to the lexer and parser in parse.y and related files implement the feature.
16. Colons in Ruby
•A::B, ::C
•:symbol, :"sy-m-bol"
•a ? b : c
•{a: b}
•when 1: something (in 1.8)
17. static int
parser_yylex(struct parser_params *parser) {
...
switch (c = nextc()) {
...
case '#': /* it's a comment */
...
case ':':
c = nextc();
if (c == ':') {
if (IS_BEG() ||...
...
}
... (about 1300 lines)
18. How does the parser deal with a colon?
•:: → tCOLON2 or tCOLON3
•tCOLON2: Net::URI
•tCOLON3: ::Kernel
19. enum lex_state_e {
EXPR_BEG, /* ignore newline, +/- is a sign. */
EXPR_END, /* newline significant, +/- is an operator. */
EXPR_ENDARG, /* ditto, and unbound braces. */
EXPR_ARG, /* newline significant, +/- is an operator. */
EXPR_CMDARG, /* newline significant, +/- is an operator. */
EXPR_MID, /* newline significant, +/- is an operator. */
EXPR_FNAME, /* ignore newline, no reserved words. */
EXPR_DOT, /* right after `.' or `::', no reserved words. */
EXPR_CLASS, /* immediate after `class', no here document. */
EXPR_VALUE /* alike EXPR_BEG but label is disallowed. */
};
lex_state
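The enum comments say that the same character can mean different things depending on lex_state. A minimal sketch of that idea for '-', assuming only two states; the names toy_state, toy_token, and toy_classify_minus are hypothetical and not part of parse.y, and the real lexer also looks at whitespace and the following character:

```c
#include <assert.h>

/* Hypothetical, reduced versions of two states from the enum above. */
enum toy_state { TOY_EXPR_BEG, TOY_EXPR_END };

/* Hypothetical token kinds, used only in this sketch. */
enum toy_token { TOY_UMINUS, TOY_MINUS };

/* Follow the enum comments: at the beginning of an expression '-' is a
 * sign; after a complete expression it is a binary operator. */
enum toy_token toy_classify_minus(enum toy_state state)
{
    return state == TOY_EXPR_BEG ? TOY_UMINUS : TOY_MINUS;
}

/* Self-check: "x = -1" reaches '-' at the beginning of an expression,
 * "a - 1" reaches it after a complete one. */
void toy_minus_check(void)
{
    assert(toy_classify_minus(TOY_EXPR_BEG) == TOY_UMINUS);
    assert(toy_classify_minus(TOY_EXPR_END) == TOY_MINUS);
}
```

The point of the sketch is that the lexer is not context-free: it needs the state left behind by the previous token to pick the right token for the current character.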
20. case ':':
c = nextc();
if (c == ':') {
if (IS_BEG() ||
lex_state == EXPR_CLASS ||
(IS_ARG() && space_seen)) {
lex_state = EXPR_BEG;
return tCOLON3;
}
lex_state = EXPR_DOT;
return tCOLON2;
}
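The branch above can be tried in isolation. The following is a simplified sketch, not the code from parse.y: colon_state, colon_token, and lex_colon2 are hypothetical names, and a single CS_BEG or CS_ARG value stands in for the IS_BEG() and IS_ARG() macros, which cover more states in the real lexer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced lexer states for this sketch. */
enum colon_state { CS_BEG, CS_END, CS_ARG, CS_CLASS, CS_DOT };

/* Hypothetical stand-ins for tCOLON2 and tCOLON3. */
enum colon_token { CT_COLON2, CT_COLON3 };

/* Decide which token "::" becomes and update the state, mirroring the
 * branch on the slide. */
enum colon_token lex_colon2(enum colon_state *state, bool space_seen)
{
    if (*state == CS_BEG || *state == CS_CLASS ||
        (*state == CS_ARG && space_seen)) {
        *state = CS_BEG;   /* an expression starts after a leading "::" */
        return CT_COLON3;  /* top-level lookup, as in ::Kernel */
    }
    *state = CS_DOT;       /* a name follows, reserved words are allowed */
    return CT_COLON2;      /* scoped lookup, as in Net::URI */
}
```

With this sketch, a "::" seen in CS_BEG yields CT_COLON3, while a "::" seen in CS_END, or in CS_ARG without a preceding space, yields CT_COLON2 and leaves the state at CS_DOT.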
44. #
(in parser_yylex)
case '#': /* it's a comment */
/* no magic_comment in shebang line */
if (!parser_magic_comment(parser, lex_p, lex_pend - lex_p)) {
if (comment_at_top(parser)) {
set_file_encoding(parser, lex_p, lex_pend);
}
}
lex_p = lex_pend;
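The last line of that case, lex_p = lex_pend, is what actually discards the comment: the read pointer jumps to the end of the current line. A toy sketch of that mechanism, assuming a hypothetical struct toy_line in place of the real parser_params fields:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical cursor over one line of source; in parse.y the
 * corresponding pointers are reached through lex_p and lex_pend. */
struct toy_line {
    const char *p;     /* next character to read */
    const char *pend;  /* one past the last character of the line */
};

/* Return the next character, or -1 at the end of the line. */
int toy_nextc(struct toy_line *l)
{
    return l->p < l->pend ? (unsigned char)*l->p++ : -1;
}

/* Skip a comment: after '#' has been read, move the cursor to the end
 * of the line so the rest of the line is never tokenized. */
void toy_skip_comment(struct toy_line *l)
{
    l->p = l->pend;
}
```

Reading "a # note" with toy_nextc and calling toy_skip_comment right after the '#' leaves nothing more to read on that line, which is why any change that wants '#' to be a token has to return before this assignment runs.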
45. #
(in parser_yylex)
case '#': /* it's a comment */
c = nextc();
pushback(c);
if(lex_state == EXPR_END && ISALNUM(c)) return '#';
/* no magic_comment in shebang line */
if (!parser_magic_comment(parser, lex_p, lex_pend - lex_p)) {
if (comment_at_top(parser)) {
set_file_encoding(parser, lex_p, lex_pend);