1. Parsing JSON with a single regex
brian d foy
Houston Perl Mongers, October 17, 2013
2. Mastering Perl, 2e
• Read for free now
• http://chimera.labs.oreilly.com/books/1234000001527/index.html
• http://goo.gl/lmqAKX
• This stuff is in Chapter 2
3. Randal is wicked
• JSON is on a single line (minimized)
• ASCII only
• Fails very quickly
• Doesn't handle everything
• Uses many advanced regex features
• http://www.perlmonks.org/?node_id=995856
7. sub from_json {
local $_ = shift;
local $^R;
eval { m{\A$FROM_JSON\z}; } and return $_;
die $@ if $@;
return 'no match';
}
while (<>) {
chomp;
print Dumper from_json($_);
}
11. • Uses grammars: (?(DEFINE))
• Recurses: (?&KV), et alia
• Runs code during the regex: (?{ ... })
• Builds up a data structure: $^R
• At the end, replaces the string with a data structure: (?{ $_ = $^R->[1] })
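The four features listed above can be seen together in a much smaller grammar than JSON. What follows is a minimal sketch, not Randal's code: it assumes Perl 5.18 or later, a made-up input of bracketed integers, and hypothetical names ($input, $list, LIST, NUMBER). Instead of overwriting $_ at the end, it copies the finished structure into a lexical variable.

```perl
use v5.18;
use warnings;

# Hypothetical input and variable names, chosen only for this sketch.
my $input = '[1,22,333]';
my $list;

# Each code block returns a fresh [ previous, value ] pair, and that return
# value becomes the next value of the special result variable. The rules
# live in the DEFINE block and are called by name. The pattern itself has
# no comments, because sigils and braces inside regex comments can confuse
# the parser.
my $ok = $input =~ m{
    \A (?&LIST) \z
    (?{ $list = $^R->[1] })

    (?(DEFINE)
        (?<LIST>
            \[  (?{ [ $^R, [] ] })
            (?&NUMBER)  (?{ [ $^R->[0][0], [ @{ $^R->[0][1] }, $^R->[1] ] ] })
            (?: , (?&NUMBER)  (?{ [ $^R->[0][0], [ @{ $^R->[0][1] }, $^R->[1] ] ] }) )*
            \]
        )
        (?<NUMBER>
            ( \d+ )  (?{ [ $^R, $^N ] })
        )
    )
}x;

my $message = $ok ? "Parsed: @$list" : 'no match';
say $message;
```

The NUMBER rule pushes its capture onto the result stack, and the LIST rule pops that pair and appends the value to the array it is building, which is the same shape of bookkeeping the bullets describe.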
12. $_ =<<'HERE';
Amelia said "I am a camel"
HERE
say "Matched [$+{said}]!" if m/
( ['"] )
(?<said>.*?)
( ['"] )
/x;
13. $_ =<<'HERE';
Amelia said 'I am a camel'
HERE
say "Matched [$+{said}]!" if m/
( ['"] )
(?<said>.*?)
( \1 )
/x;
14. $_ =<<'HERE';
Amelia said 'I am a camel'
HERE
say "Matched [$+{said}]!" if m/
( ['"] )
(?<said>.*?)
(?1)
/x;
15. $_ =<<'HERE';
Amelia said 'I am a camel"
HERE
say "Matched [$+{said}]!" if m/
( ['"] )
(?<said>.*?)
(?1)
/x;
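Slides 13 through 15 turn on one distinction: a backreference such as \1 must match the same text that group 1 captured, while (?1) reruns the pattern of group 1 and so accepts either quote mark. A small sketch comparing the two on the matched and mismatched strings from those slides (the variable names are made up for illustration):

```perl
use v5.10;
use strict;
use warnings;

# The two sample strings from the slides: matched and mismatched quotes.
my @texts = (
    q(Amelia said 'I am a camel'),
    q(Amelia said 'I am a camel"),
);

my @results;    # hypothetical name: one [ backref, recursion ] pair per string
for my $text (@texts) {
    # \1 demands the same character that group 1 captured.
    my $backref = $text =~ m/ (['"]) (?<said>.*?) \1   /x ? 1 : 0;
    # (?1) reruns the pattern of group 1, so any quote mark will do.
    my $recurse = $text =~ m/ (['"]) (?<said>.*?) (?1) /x ? 1 : 0;
    push @results, [ $backref, $recurse ];
    say join ' ', $text, "backreference=$backref", "recursion=$recurse";
}
```

The backreference version rejects the mismatched string; the recursion version accepts both, which is why the later slides switch to a named backreference for the closing quote.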
16. $_ =<<'HERE';
He said 'Amelia said "I am a camel"'
HERE
say "Matched [$+{said}]!" if m/
( ['"] )
(?<said>.*?)
(?1)
# Matches wrong quote!
/x;
17. $_ =<<'HERE';
He said 'Amelia said "I am a camel"'
HERE
say "Matched [$+{said}]!" if m/
(?<said>
(?<quote>['"])
(?:
[^'"]++
|
(?<said> (?1) )
)*
\g{quote}
)
/x;
# $1
18. $_ =<<'HERE';
Out "Top 'Middle "Bottom" Middle' Out"
HERE
say "Matched [$+{said}]!" if m/
(?<said>
(?<quote>['"])
(?:
[^'"]++
|
(?R)
)*
\g{quote}
)
(?{ say "Inside regex: $+{said}" })
/x;
19. $_ =<<'HERE';
Out "Top 'Mid "Bottom" Mid' Out"
HERE
say "Matched [$+{said}]!" if m/
(?(DEFINE)
(?<QUOTE> ['"])
(?<NOT_QUOTE> [^'"])
)
(?<said>
(?<quote>(?&QUOTE))
(?:
(?&NOT_QUOTE)++
|
(?R)
)*
\g{quote}
)
(?{ say "Inside regex: $+{said}" })
/x;
20. my @matches;
say "Matched!" if m/
(?(DEFINE)
(?<QUOTE_MARK> ['"])
(?<NOT_QUOTE_MARK> [^'"])
)
(
(?<quote>(?&QUOTE_MARK))
(?:
(?&NOT_QUOTE_MARK)++
|
(?R)
)*
\g{quote}
)
(?{ push @matches, $^N })
/x;
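Slide 20 shows only the pattern, without an input string or any output. A runnable sketch, assuming the input from slide 19 (without the trailing newline a here-doc would add) and Perl 5.18 or later, prints whatever the code block collected:

```perl
use v5.18;
use warnings;

# Assumption: the same nested-quote input as slide 19.
$_ = q(Out "Top 'Mid "Bottom" Mid' Out");

my @matches;
say "Matched!" if m/
    (?(DEFINE)
        (?<QUOTE_MARK> ['"])
        (?<NOT_QUOTE_MARK> [^'"])
    )
    (
        (?<quote>(?&QUOTE_MARK))
        (?:
            (?&NOT_QUOTE_MARK)++
            |
            (?R)
        )*
        \g{quote}
    )
    (?{ push @matches, $^N })
/x;

# Each completed level of recursion pushed its own quoted string,
# innermost first.
say for @matches;
```

Because the code block sits at the end of the whole pattern, it runs once for every level of (?R) that completes, so the innermost quotation is collected first and the outermost last.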