25. Next time, PLAN!
Harvesting 6.4GB is slow
Parsing 6.4GB is slower
● Especially in PHP
Running grep because you’ve forgotten to extract data beforehand is slow AND stupid
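One way to avoid the after-the-fact grep is to pull out the needed rows while the data streams past once. A minimal sketch, assuming a tab-separated file with the word in the first column; the column layout, the file names and the `extract_words` helper are all hypothetical, not taken from the talk:

```shell
#!/bin/sh
# Hypothetical helper: keep only the rows whose first column appears in a
# word list, in a single pass over the input.
#   $1    = file with one wanted word per line
#   stdin = tab-separated data, word in column 1 (assumed layout)
extract_words() {
    awk -F '\t' '
        FILENAME == ARGV[1] { wanted[$1] = 1; next }  # load the word list
        $1 in wanted                                  # print matching data rows
    ' "$1" -
}

# Example usage (file names are placeholders):
# gzip -dc 1grams.tsv.gz | extract_words wanted.txt > extracted.tsv
```

Because the filter runs during decompression, the full file never has to be scanned a second time.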
26. Next time, PLAN!
Most data is available
Extraction is still running for 1-grams...
27. Next time, PLAN!
sed s/=''/=''/g $filename | sed s/'' /'''' /g | sed "s/$/;/g" |
sed "s/\([a-z]\)'\(s\)/\1'\2/g" | sed "s/\([A-Z]\)'\(s\)/\1'\2/g" |
sed "s/\([a-z]\)'\(l\)/\1'\2/g" | sed "s/\([a-z]\)'\(r\)/\1'\2/g" |
sed "s/\(n\)'\(t\)/\1'\2/g" | sed "s/\(o\)'\(c\)/\1'\2/g" |
sed "s/\(e\)'\(v\)/\1'\2/g" | sed "s/\(I\)'\(v\)/\1'\2/g" |
sed "s/\(u\)'\(v\)/\1'\2/g" | sed "s/\([a-z]\)'\([A-Z]\)/\1'\2/g" |
sed "s/\(O\)'\([a-z]\)/\1'\2/g" | sed "s/\(O\)'\([A-Z]\)/\1'\2/g" |
sed "s/\(I\)'\(m\)/\1'\2/g" | sed "s/\([A-Z]\)'\(l\)/\1'\2/g" |
sed "s/\([a-z]\)'\([a-z]\)/\1'\2/g" | sed "s/\([a-z]\)'-\([a-z]\)/\1'-\2/g" |
sed "s/\([A-Z]\)'\([A-Z]\)/\1'\2/g" | sed "s/\([A-Z]\)'\([a-z]\)/\1'\2/g" |
sed "s/'\([a-z]\)'\([a-z]\)/'\1'\2/g" | sed "s/-'n'/-'n'/g" |
sed "s/-'\([a-z]\)/-'\1/g" | sed "s/-o'-/-o'-/g" | sed "s/ght'-le/ght'-le/g" |
sed "s/cats'-meat/cats'-meat/g" | sed "s/n'-roll/n'-roll/g" |
sed "s/sou'-w/sou'-w/g" | sed "s/gleaf'-for/gleaf'-for/g"
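Each `sed` in a pipeline like this one starts its own process and passes the whole stream along again. A sketch of the same idea in a single `sed` invocation with several `-e` expressions follows. The two rules are placeholders chosen for illustration (doubling an apostrophe between two letters, then appending `;` to every line), and `fix_quotes` is a hypothetical name; neither is the talk's actual rule set:

```shell
#!/bin/sh
# Hypothetical function: apply several substitutions in one sed process
# instead of one process per rule.
fix_quotes() {
    # Rule 1 (placeholder): double an apostrophe that sits between two letters.
    # Rule 2 (placeholder): append ';' to the end of every line.
    sed -e "s/\([A-Za-z]\)'\([A-Za-z]\)/\1''\2/g" \
        -e 's/$/;/'
}

printf "don't\n" | fix_quotes   # prints: don''t;
```

The expressions run in order on each line, so rules that depend on one another keep the ordering they had in the pipeline.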