Map, Grep, and Sort are powerful Perl built-in functions that can simplify your code and improve performance.
New Perl users often skip these three functions because they are considered an advanced topic, and when they do use them, they may not use them to their full advantage. Experienced Perl users exploit these functions in amazing ways, but often the result is a confusing mess of unreadable and thus unmaintainable code. Refactoring existing code into these functions where appropriate can significantly improve the quality of the code *and* improve performance as well.
This talk will present how to take advantage of these powerful functions while keeping the resulting code understandable.
6. Copyright 2014 Daina Pettit
map, grep, sort – slide 6
General Form—code blocks
Damian Conway in Perl Best Practices*
recommends:
“Always use a block with a map and grep”
This is a syntactic aid that helps you avoid
errors in grouping arguments. Block enclosures
actually incur more overhead than the expression
form. Not much, but some.
*Conway, Damian, Perl Best Practices, O'Reilly Media, Sebastopol, CA, 2005, pp. 169-170.
@array = map { exp } @list;
@array = grep { exp } @list;
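As a minimal sketch of the two general forms, the following compares the block form with the expression form on made-up data. The variable names and values are assumptions chosen for illustration, not taken from the talk.

```perl
use strict;
use warnings;

my @list = ( 1, 2, 3, 4 );

# Block form: the braces mark exactly where the expression ends.
my @doubled_block = map  { $_ * 2 } @list;
my @even_block    = grep { $_ % 2 == 0 } @list;

# Expression form: a comma separates the expression from the list.
my @doubled_expr = map  $_ * 2, @list;
my @even_expr    = grep $_ % 2 == 0, @list;

print "@doubled_block\n";
print "@even_block\n";
```

Both forms return the same lists here; the block form simply makes the boundary between the expression and the list visible.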
Boolean Scalar Context
● Anywhere in Perl where a true/false value is expected:
if, while, and, or, not, &&, ||, !, etc.
● If evaluation results in 0, “0”, 0.0, “”, or undef, then
it is false. Everything else is true.
if ( 0 ) {} # False
if ( 400 ) {} # True
if ( 1 ) {} # True
if ( "false" ) {} # True!
if ( "00" ) {} # True!
undef $x;
if ( $x ) {} # False
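The rules above can be checked with a small table-driven sketch. The labels and the `%truth` hash are assumptions added for illustration. Note that the number 0.0 is false, while the string "0.0" is true because it is neither the empty string nor exactly "0".

```perl
use strict;
use warnings;

# Each case pairs a printable label with the value under test.
my @cases = (
    [ '0 (number)',     0       ],
    [ '"0" (string)',   "0"     ],
    [ '0.0 (number)',   0.0     ],
    [ '"" (empty)',     ""      ],
    [ 'undef',          undef   ],
    [ '"00"',           "00"    ],
    [ '"0.0" (string)', "0.0"   ],
    [ '"false"',        "false" ],
);

my %truth;
for my $case (@cases) {
    my ( $label, $value ) = @$case;
    # The ternary operator evaluates $value in boolean context.
    $truth{$label} = $value ? 'true' : 'false';
    printf "%-16s %s\n", $label, $truth{$label};
}
```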
Examples of grep
● Expression can be any valid Perl expression.
● Expression is in scalar boolean context.
@ones = grep { $_ < 10 } @numbers;
@dirs = grep { -d } @files;
@no_dup = grep { ! $h{$_}++ } @old;
@errors = grep { /error/i } @log;
@true = grep { $_ } @all;
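A runnable sketch of the same grep patterns follows. The sample data, the `%seen` hash name, and the file name `no-such-entry-here` (assumed not to exist in the current directory) are illustrative assumptions.

```perl
use strict;
use warnings;

my @numbers = ( 3, 42, 7, 10, 9 );
my @old     = qw( a b a c b );
my @log     = ( 'Error: disk full', 'ok', 'fatal ERROR', 'done' );
my @all     = ( 0, 1, '', 'x', undef, '0', 'yes' );

# Keep numbers below 10.
my @ones = grep { $_ < 10 } @numbers;

# Keep the first occurrence of each value; the hash counts sightings.
my %seen;
my @no_dup = grep { !$seen{$_}++ } @old;

# Keep lines matching a case-insensitive pattern.
my @errors = grep { /error/i } @log;

# Keep only values that are true in boolean context.
my @true = grep { $_ } @all;

# -d with no argument tests $_; '.' is always a directory.
my @dirs = grep { -d } ( '.', 'no-such-entry-here' );

print "@ones\n@no_dup\n@true\n";
```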
Sorting Basics
Sort can be called in three ways:
1. With no comparison directives
2. With a subroutine that returns comparison
directives
3. With a code block (an anonymous subroutine) that
returns comparison directives
@sorted = sort @unsorted;
@sorted = sort subname @unsorted;
@sorted = sort { exp } @unsorted;
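The three calling styles can be sketched as follows. The subroutine name `by_number` and the sample values are assumptions for illustration. With no comparison given, sort compares elements as strings, which is why 10 and 100 land before 9.

```perl
use strict;
use warnings;

my @unsorted = ( 10, 9, 100, 1 );

# A named comparison subroutine (hypothetical name); it reads $a and $b.
sub by_number { $a <=> $b }

# 1. No comparison directives: default string order.
my @default = sort @unsorted;

# 2. A named subroutine: ascending numeric order.
my @by_sub = sort by_number @unsorted;

# 3. A code block: descending numeric order.
my @by_block = sort { $b <=> $a } @unsorted;

print "@default\n@by_sub\n@by_block\n";
```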
Sorting Basics
Sort requires a comparison value that is negative, zero, or
positive (conventionally -1, 0, or 1) to tell whether any two
elements, $a and $b, are in order (-1), the same (0), or out
of order (1).
cmp and <=> conveniently provide this for string or
numeric comparisons, respectively.
We don't have to use cmp and <=>. We just have to
return a negative number, 0, or a positive number.
$a <=> $b
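To illustrate that a comparison routine only has to return the right sign, here is a sketch that returns -1 and 1 by hand and falls back to cmp for ties. The name `by_length` and the word list are assumptions, not part of the talk.

```perl
use strict;
use warnings;

# Hypothetical comparator: shorter strings first, ties broken alphabetically.
sub by_length {
    return -1 if length($a) < length($b);   # $a belongs before $b
    return  1 if length($a) > length($b);   # $a belongs after $b
    return $a cmp $b;                       # same length: string order
}

my @words  = qw( pear fig banana kiwi apple );
my @sorted = sort by_length @words;

print "@sorted\n";
```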
Complicated Sorting
We can sort with multiple keys, such as sorting
by year, then by month, then by day, even if the
data is mm-dd-yyyy.
@sorted_dates = sort {
    my ( $ma, $da, $ya ) = split /-/, $a;
    my ( $mb, $db, $yb ) = split /-/, $b;
    $ya <=> $yb || $ma <=> $mb || $da <=> $db;
} @dates;
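The same chaining of comparisons with || works for any mix of numeric and string keys. As a sketch on assumed data (the `@people` records are hypothetical), this sorts by age first and then by name without regard to case:

```perl
use strict;
use warnings;

# Hypothetical records: [ name, age ].
my @people = ( [ 'Carol', 35 ], [ 'alice', 30 ], [ 'Bob', 35 ], [ 'Dave', 30 ] );

# Primary key: age (numeric). Secondary key: name (case-insensitive string).
# The right side of || is only evaluated when the ages compare equal (0).
my @sorted = sort {
    $a->[1] <=> $b->[1]
        ||
    lc( $a->[0] ) cmp lc( $b->[0] )
} @people;

my @names = map { $_->[0] } @sorted;
print "@names\n";
```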
Optimizing sort
Now use map to extract just element 0 and we
are back to the original list and sorted by date.
This is known as the Schwartzian Transform.*
*Perl idiom named for Randal Schwartz, author of Learning Perl, coined by Tom Christiansen.
@order =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    map  { [ $_, -M ] } @files;
Example file names: “x.pl”, “file1”, “5.dat”, “file7”, “a.out”
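A self-contained sketch of the Schwartzian Transform follows. So that it runs without real files, a hypothetical `file_age` subroutine and an `%age_in_days` table with invented values stand in for -M; the `$calls` counter is added only to show that the key is computed once per element.

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical stand-in for an expensive key such as -M on a real file.
my %age_in_days = (
    'x.pl'  => 1.5,
    'file1' => 2,
    '5.dat' => 3.25,
    'file7' => 10,
    'a.out' => 40,
);
sub file_age { $calls++; return $age_in_days{ $_[0] } }

my @files = ( 'file7', 'a.out', 'x.pl', '5.dat', 'file1' );

my @order =
    map  { $_->[0] }                        # 3. unwrap: keep the name only
    sort { $a->[1] <=> $b->[1] }            # 2. sort on the precomputed key
    map  { [ $_, file_age($_) ] } @files;   # 1. wrap: [ name, key ]

print "@order\n";
print "key computed $calls times\n";
```

Read the pipeline from the bottom up: the key is computed once for each of the five elements, no matter how many comparisons the sort performs.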
Optimizing sort—Orcish Maneuver*
Uses “or” cache (in a hash) to remember values
already computed: ||=
● Simpler than ST
● Almost as fast as ST
● Faster if list contains duplicates
*Term coined by Joseph Hall in Effective Perl Programming, Addison-Wesley Professional, Boston, MA, 1998.
@order = sort {
    ( $cache{$a} ||= -M $a ) <=>
    ( $cache{$b} ||= -M $b ) }
    @files;
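The caching behavior can be sketched with a counter. The `key_for` subroutine, the `%size` table, and its values are hypothetical stand-ins for an expensive operation such as -M. Because the list contains duplicates, the key is computed only once per distinct name.

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical key lookup standing in for an expensive computation.
my %size = ( 'a.txt' => 300, 'b.txt' => 20, 'c.txt' => 150 );
sub key_for { $calls++; return $size{ $_[0] } }

my @files = qw( a.txt b.txt c.txt b.txt a.txt c.txt a.txt );

my %cache;
my @order = sort {
    # ||= computes and stores the key only when no true value is cached yet.
    ( $cache{$a} ||= key_for($a) ) <=>
    ( $cache{$b} ||= key_for($b) )
} @files;

print "@order\n";
print "key_for called $calls times for ", scalar(@files), " elements\n";
```

One caveat: ||= treats a cached key of 0 or "" as missing and recomputes it every time. On Perl 5.10 or later, //= caches any defined value.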
Optimizing sort—Guttman-Rosler Transform*
This is a tweak on ST. Takes advantage of
substr and sprintf being faster than array
manipulation. Also uses default string sort which
is slightly faster.
*A Fresh Look at Efficient Perl Sorting, Uri Guttman and Larry Rosler, approx. 1999.
@order = map { substr $_, 10 }
         sort
         map { m#(\d{4})/(\d+)/(\d+)#;
               sprintf "%d-%02d-%02d%s",
                   $1, $2, $3, $_
         } @dates;
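A runnable sketch of the Guttman-Rosler Transform on assumed data follows. The sample strings are invented, and every element is assumed to contain a yyyy/m/d date with a four-digit year. Each element gets a fixed-width 10-character key prepended, the default string sort does the ordering, and substr removes the key again.

```perl
use strict;
use warnings;

# Hypothetical records, each containing a yyyy/m/d date.
my @dates = (
    '2014/3/9 release',
    '2013/12/25 holiday',
    '2014/3/10 patch',
    '2014/1/2 kickoff',
);

my @order =
    map { substr $_, 10 }          # 3. strip the 10-character key
    sort                           # 2. plain string sort on key . original
    map {                          # 1. prepend a zero-padded yyyy-mm-dd key
        m#(\d{4})/(\d+)/(\d+)#;
        sprintf "%d-%02d-%02d%s", $1, $2, $3, $_;
    } @dates;

print "$_\n" for @order;
```

A plain string sort of the original strings would place 2014/3/10 before 2014/3/9; the zero-padded key is what makes string order agree with date order.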
Further List & Sort Options
List::Util
shuffle, reduce, any, first, max, min, ...
List::MoreUtils
uniq, natatime, ...
Sort::Key
May be faster than ST or GRT
Sort::Naturally
Automatically sorts numeric when appropriate
Sort::Maker
Internally uses OM, ST, or GRT.
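As a brief sketch of what List::Util offers, the following uses a few of its functions on made-up numbers. List::Util ships with Perl; `sum` is one of the functions covered by the "..." in the list above, and the data is an assumption for illustration.

```perl
use strict;
use warnings;
use List::Util qw( first max min reduce sum );

my @numbers = ( 4, 8, 15, 16, 23, 42 );

my $largest   = max @numbers;                    # largest value
my $smallest  = min @numbers;                    # smallest value
my $total     = sum @numbers;                    # sum of all values
my $first_odd = first { $_ % 2 } @numbers;       # first element that is odd
my $product   = reduce { $a * $b } 1, 2, 3, 4;   # fold the list with *

print "$largest $smallest $total $first_odd $product\n";
```

Unlike grep, first stops at the first match, which can save work on long lists.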