This document provides an overview of higher-order functions in Python. It discusses functions as parameters, examples of higher-order functions like map, filter and reduce, and how they work. It also covers anonymous functions, examples and problems demonstrating the use of map, filter and reduce. Additional topics covered include regular expressions, metacharacters, and solving problems using regex patterns.
2. Slide 2
Functions as parameters
• Have you ever wanted to pass an entire function as a
parameter?
• Python treats functions as first-class citizens, so you can do
this
• You simply pass the function by name, without parentheses
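A minimal sketch of passing a function by name. The names apply_twice and add_three are made up for illustration and do not come from the slides.

```python
def apply_twice(func, value):
    """Call func on value, then call func again on the result."""
    return func(func(value))

def add_three(n):
    return n + 3

# add_three is passed by name; writing add_three() would call it instead.
result = apply_twice(add_three, 10)
print(result)  # 16
```

Any callable can be passed the same way, including built-ins such as len or str.upper.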
3. Slide 3
Higher-Order Functions
• A higher-order function is a function that takes another
function as a parameter or returns a function as its result
• They are “higher-order” because they operate on other functions
• Examples
– Map
– Filter
– Reduce
• A lambda is not itself a higher-order function, but it works
well as an argument to one if you can deal with its limitations
4. Slide 4
Anonymous functions
• Anonymous functions are also called lambda functions
• The difference between lambda functions and
regular Python functions is that the body of a lambda is a
single expression, whose value is returned automatically.
• This means that lambda functions cannot contain
statements such as if/for blocks, assignments or the return
keyword. A conditional expression (a if c else b) is allowed.
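The points above can be illustrated with a short sketch. All names here (square, sign, words) are chosen for illustration only.

```python
# A lambda and the equivalent regular function.
square = lambda x: x * x

def square_def(x):
    return x * x

# Statements are not allowed in a lambda, but a conditional
# expression is an expression, so this is valid.
sign = lambda n: "negative" if n < 0 else "non-negative"

# Typical use: a short throwaway function passed as an argument.
words = ["pear", "fig", "banana"]
by_length = sorted(words, key=lambda w: len(w))

print(square(4), square_def(4))  # 16 16
print(sign(-2))                  # negative
print(by_length)                 # ['fig', 'pear', 'banana']
```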
5. Slide 5
Map
map(function, iterable, ...)
• Map applies function to each element of iterable
and yields the results in order
• You can optionally provide more iterables as
parameters to map; function must then take one argument
per iterable, and map stops at the shortest iterable
• In Python 3, map returns an iterator, which can be converted with list()
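A small sketch of both behaviours in Python 3: the iterator returned by map can be consumed only once, and with several iterables the function receives one argument from each. The variable names are illustrative.

```python
lengths = map(len, ["a", "bb", "ccc"])
first_pass = list(lengths)   # consumes the iterator
second_pass = list(lengths)  # nothing left to consume

# Two iterables: the lambda takes two arguments.
# map stops when the shorter list runs out.
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20, 30, 40]))

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
print(sums)         # [11, 22, 33]
```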
7. Slide 7
Map Problem
Goal: given a list of three dimensional points in the
form of tuples, create a new list consisting of the
distances of each point from the origin
Loop Method:
- distance(x, y, z) = sqrt(x**2 + y**2 + z**2)
- loop through the list and add results to a new list
8. Slide 8
Map Problem
Solution
from math import sqrt
points = [(2, 1, 3), (5, 7, -3), (2, 4, 0), (9, 6, 8)]
def distance(point):
    x, y, z = point
    return sqrt(x**2 + y**2 + z**2)
distances = list(map(distance, points))
9. Slide 9
Filter
filter(function, iterable)
• filter runs through each element of iterable (any
iterable object such as a list or another collection)
• It applies function to each element of iterable
• If function returns a true value for that element, the
element is kept in the result
• In Python 2, filter returns the result as a list
• In Python 3, filter returns an iterator, which can be converted
to a list with list()
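A minimal sketch of filter in Python 3, using made-up data. Passing None as the function keeps the elements that are themselves truthy.

```python
numbers = [0, 1, 2, 3, 4, 5, 6]

# Keep only the elements for which the function returns a true value.
evens = list(filter(lambda n: n % 2 == 0, numbers))

# With None as the function, falsy elements (here 0) are dropped.
truthy = list(filter(None, numbers))

print(evens)   # [0, 2, 4, 6]
print(truthy)  # [1, 2, 3, 4, 5, 6]
```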
11. Slide 11
Filter Problem
import math
NaN = float("nan")
scores = [[NaN, 12, .5, 78, math.pi],
[2, 13, .5, .7, math.pi / 2],
[2, NaN, .5, 78, math.pi],
[2, 14, .5, 39, 1 - math.pi]]
Goal: given a list of lists containing answers to an
algebra exam, filter out those that did not submit a
response for one of the questions, denoted by NaN
12. Slide 12
Filter Problem
Solution
from math import isnan, pi
NaN = float("nan")
scores = [[NaN, 12, .5, 78, pi], [2, 13, .5, .7, pi / 2],
          [2, NaN, .5, 78, pi], [2, 14, .5, 39, 1 - pi]]
# Solution 1 - intuitive
def has_no_NaN(answers):
    # True only if every answer is a real number
    for num in answers:
        if isnan(float(num)):
            return False
    return True
valid = list(filter(has_no_NaN, scores))
print(valid)
# Solution 2 - concise Python solution
# Works only because every missing answer is the same NaN object
valid = list(filter(lambda x: NaN not in x, scores))
print(valid)
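One caveat about the membership test in Solution 2, sketched with made-up rows: NaN never compares equal to anything, including itself, so the in operator finds it only when the list holds the very same object. If the NaN values come from elsewhere (parsed input, for example), an isnan-based test is the safer choice.

```python
from math import isnan

NaN = float("nan")
row_same_object = [NaN, 1.0]            # contains the NaN object itself
row_other_object = [float("nan"), 1.0]  # contains a different NaN object

found_same = NaN in row_same_object     # matched by identity
found_other = NaN in row_other_object   # nan == nan is False, so not found

# isnan works regardless of which NaN object is stored.
found_by_isnan = any(isnan(v) for v in row_other_object)

print(found_same, found_other, found_by_isnan)  # True False True
```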
13. Slide 13
Reduce
reduce(function, iterable[,initializer])
• Reduce applies function to a running result and the next element
of iterable, from left to right, and returns a single cumulative
value
• function must take two parameters
• If initializer is provided, it is used as the starting value,
placed before the first element of iterable
• In Python 3, reduce() is no longer a built-in and requires an import
statement
• from functools import reduce
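A short sketch of reduce with and without an initializer. The data and lambda bodies are illustrative.

```python
from functools import reduce

# acc is the running result, n is the next element.
product = reduce(lambda acc, n: acc * n, [1, 2, 3, 4])

# With an initializer, reduce also works on an empty iterable.
empty_total = reduce(lambda acc, n: acc + n, [], 0)

# The cumulative value does not have to be a number.
longest = reduce(lambda a, b: a if len(a) >= len(b) else b,
                 ["fig", "banana", "pear"])

print(product)      # 24
print(empty_total)  # 0
print(longest)      # banana
```

Without an initializer, calling reduce on an empty iterable raises TypeError.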
15. Slide 15
Reduce Problem
Goal: given a list of numbers I want to find the
average of those numbers in a few lines using
reduce()
For Loop Method:
- sum up every element of the list
- divide the sum by the length of the list
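One possible reduce-based version of the two steps above, as a sketch. The sample numbers are made up.

```python
from functools import reduce

numbers = [4, 8, 15, 16, 23, 42]

# Step 1: sum every element, starting from 0.
total = reduce(lambda acc, n: acc + n, numbers, 0)

# Step 2: divide the sum by the length of the list.
average = total / len(numbers)

print(total)    # 108
print(average)  # 18.0
```

In everyday code the built-in sum() does the first step more directly. reduce is used here only because the problem asks for it.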
17. Slide 17
MapReduce
A framework for processing huge datasets on certain kinds of distributable
problems.
MapReduce is a programming model for processing large datasets across
computer clusters. Hadoop MapReduce is a framework built on this model,
used for writing applications that process vast amounts of data on
large clusters.
[Figure: Hadoop MapReduce]
18. Slide 18
MapReduce
There are two primary tasks in MapReduce: map and reduce.
Map Step:
- master node takes the input, chops it up into smaller sub-
problems, and distributes those to worker nodes.
- a worker node may chop its work into yet smaller pieces and
redistribute them again
Reduce Step:
- master node then takes the answers to all the sub-
problems and combines them in a way to get the output
19. Slide 19
MapReduce
Problem: Given an email how do you tell if it is spam?
- Count occurrences of certain words. If they
occur too frequently the email is spam.
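A single-machine sketch of this idea using Python's map and reduce, not a real Hadoop job. The word list SPAM_WORDS, the THRESHOLD value and the function names are all hypothetical. Each line of the email plays the role of a sub-problem.

```python
from collections import Counter
from functools import reduce

SPAM_WORDS = {"free", "winner", "prize"}  # hypothetical word list
THRESHOLD = 3                             # hypothetical cut-off

def count_spam_words(chunk):
    """Map step: count the suspicious words in one chunk of text."""
    words = (w.strip(".,!?") for w in chunk.lower().split())
    return Counter(w for w in words if w in SPAM_WORDS)

def merge_counts(a, b):
    """Reduce step: combine two partial counts."""
    return a + b

def is_spam(email_text):
    chunks = email_text.splitlines()
    partial_counts = map(count_spam_words, chunks)
    totals = reduce(merge_counts, partial_counts, Counter())
    return sum(totals.values()) >= THRESHOLD

spam = "You are a WINNER!\nClaim your free prize now.\nFree entry."
ham = "Lunch at noon?\nSee you there."
print(is_spam(spam), is_spam(ham))  # True False
```

In a distributed setting the map calls would run on worker nodes and only the small Counter objects would travel back to be merged.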
21. Slide 21
Regular Expressions
A specialized pattern language embedded in Python (the re module)
A pattern defines rules that are then applied to a string
Useful for complex pattern matching
For simple pattern matching it is better to use string methods such as find() or the in operator
Also called regex, and available in many languages
22. Slide 22
Regular Expressions : Steps
1. Submit a regular expression string
2. Compile this string to get a regex byte code
3. Apply the compiled regex byte code to target string
4. Obtain results of the match
5. Use the results
23. Slide 23
Regular Expressions : Steps
import re
pattern = re.compile(<regex expression>)
matchObject = pattern.match("target String")
if matchObject:
    # use matchObject for various operations
24. Slide 24
Regular Expressions
The match object has the following methods
group()  # returns the matching substring
start()  # returns the starting position of the match
end()    # returns the position just after the end of the match
span()   # returns a tuple of start and end positions
import re
pattern = re.compile("A..T")  # . matches any one character
matchObject = pattern.match("ACGTAAT")
if matchObject:  # if match was found
    print(matchObject.group())  # displays 'ACGT'
    print(matchObject.start(), matchObject.end())  # 0 4
start and end are positions relative to the string provided
25. Slide 25
Regular Expressions
pattern = re.compile("A..T")
Compiles the regular expression "A..T" to a regex object, pattern
This object has methods to apply the byte code to any string
pattern.match()  pattern.search()  pattern.findall()
The regex object methods are detailed below
match("in this string")  # matches only at the beginning of the string
search("in this string")  # scans the string for the first match
These functions return a match object, or None if nothing matches. A match object has methods to use the search results
findall("in this string")  # returns a list of all non-overlapping matching strings
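The difference between the three methods can be seen on one string. This is a sketch with a made-up target string in which the pattern does not occur at position 0.

```python
import re

pattern = re.compile("A..T")
text = "GGACGTAAAT"

at_start = pattern.match(text)     # None: the string starts with "GG"
first_hit = pattern.search(text)   # first match anywhere in the string
all_hits = pattern.findall(text)   # every non-overlapping match

print(at_start)                            # None
print(first_hit.group(), first_hit.span()) # ACGT (2, 6)
print(all_hits)                            # ['ACGT', 'AAAT']
```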
26. Slide 26
Regular Expressions : Metacharacters
. matches any one character except \n
\w matches any alphanumeric character or underscore
\W matches any character that \w does not match
\d matches any digit
\D matches any non-digit
\s matches any whitespace character
\S matches any non-whitespace character
27. Slide 27
Regular Expressions : Metacharacters
[ ] any one character from this set
[AT] means either A or T is matched
[A-G] matches any character between A and G (inclusive)
[^AT] means any character except those in this set
( ) used for grouping expressions
| for using an or condition
\ for negating the special meaning of a metacharacter
28. Slide 28
Regular Expressions : Metacharacters
^ match at the beginning of the string
$ match at the end of the string
* zero or more occurrences of the previous character or group
+ one or more occurrences of the previous character or group
? zero or one occurrence of the previous character or group
{m,n} minimum m occurrences and maximum n occurrences of the previous character or group
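A few quick checks of the anchors and quantifiers listed above, as a sketch. The patterns and sample strings are chosen for illustration.

```python
import re

# ^ and $ force the whole string to be two to four digits.
digits_2_to_4 = re.compile(r"^\d{2,4}$")
checks = [bool(digits_2_to_4.search(s)) for s in ("1", "123", "12345")]

# ? makes the "u" optional.
optional_u = re.findall(r"colou?r", "color colour colr")

# + needs at least one "o" between "g" and "l".
one_or_more_o = re.findall(r"go+l", "gl gol goool")

print(checks)         # [False, True, False]
print(optional_u)     # ['color', 'colour']
print(one_or_more_o)  # ['gol', 'goool']
```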
29. Slide 29
Metacharacters : Examples
"\d[A-Z].."
This would match the following:
9G99 5DD0 4B23 and so on
Will not match
9g99
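The strings on this slide can be checked directly. This sketch writes the pattern as a raw string (r"...") so that Python keeps the backslash in \d, and uses fullmatch so the whole string must fit the pattern.

```python
import re

pattern = re.compile(r"\d[A-Z]..")
samples = ["9G99", "5DD0", "4B23", "9g99"]

# Keep the samples that match the pattern from start to end.
matching = [s for s in samples if pattern.fullmatch(s)]

print(matching)  # ['9G99', '5DD0', '4B23']
```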
30. Slide 30
Problem Solving with REGEX
Write a function that would validate an enrolment number. valid=validateEnrolment(eno )
Example enrolment number U101113FCS498
U-1011-13-F-CS-498
13 – is the year of registration, can be between 00 to 99
CS, EC or BT
Last 3 are digits
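One possible validateEnrolment, as a sketch rather than the intended solution. The slide only specifies the year, the branch code and the last three digits. This version assumes that the leading "U" and the "F" are fixed letters and that "1011" can be any four digits.

```python
import re

# Assumed layout: U, 4 digits, 2-digit year, F, branch code, 3 digits.
ENROLMENT = re.compile(r"U\d{4}\d{2}F(CS|EC|BT)\d{3}")

def validateEnrolment(eno):
    """Return True if eno matches the assumed enrolment layout."""
    return ENROLMENT.fullmatch(eno) is not None

print(validateEnrolment("U101113FCS498"))  # True
print(validateEnrolment("U101113FME498"))  # False: ME is not a listed branch
print(validateEnrolment("U101113FCS49"))   # False: only two final digits
```

fullmatch is used so that extra characters before or after the number are rejected.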
Example 2:
"AA..T...A"
Means the pattern should have "AA" followed by any two characters followed by "T" followed by any
three characters and then followed by "A"
3: String contains "af" followed by 0 or more "s" characters: "afs*"
4: String contains "af" followed by 1 or more "s" characters: "afs+"
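Items 3 and 4 differ only in the quantifier applied to "s". A small sketch with made-up sample strings, using * for "0 or more" and + for "1 or more":

```python
import re

zero_or_more_s = re.compile(r"afs*")
one_or_more_s = re.compile(r"afs+")

samples = ["af", "afs", "afsss", "a"]

with_star = [s for s in samples if zero_or_more_s.search(s)]
with_plus = [s for s in samples if one_or_more_s.search(s)]

print(with_star)  # ['af', 'afs', 'afsss']
print(with_plus)  # ['afs', 'afsss']
```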
31. Slide 31
Metacharacters in Detail
regex = re.compile(r"\s\d\d\d\s")
m = regex.search("Hello 12 cats ran after 432 dogs")
if m:
    print(m.group())
    print(m.span())
NOTE: We have used search here
GROUPING OF EXPRESSIONS
regex = re.compile(r"(\d\d)-(\d\d)-(\d\d\d\d)")
match = regex.match("12-03-2016")
if match:
    print("Day: ", match.group(1))
GROUPING OF EXPRESSIONS
match.group(0) # gives the entire match
match.group(1) # gives the first group
match.groups(default=None) # returns all groups as a tuple
If default is set to a value then that value is returned for
any group that did not take part in the match
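The default argument only matters when a group is optional. This sketch uses a variant of the date pattern in which the year is optional; that variant is an assumption made for illustration and is not on the slide.

```python
import re

# The year part is wrapped in an optional non-capturing group (?: ... )?
regex = re.compile(r"(\d\d)-(\d\d)(?:-(\d\d\d\d))?")

partial = regex.match("12-03")
full = regex.match("12-03-2016")

print(partial.groups())                   # ('12', '03', None)
print(partial.groups(default="unknown"))  # ('12', '03', 'unknown')
print(full.group(0), full.group(3))       # 12-03-2016 2016
```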
32. Slide 32
Metacharacters in Detail
regex = re.compile(r"\s\w\w\w\s")
mList = regex.findall("A cat ran after 432 dogs")
for m in mList:
    print(m)
33. Slide 33
Regular Expressions
Additional methods in regex:
Syntax: re.sub(pattern, repl, string, count=0, flags=0)
regex.sub(repl, string, count=0) # replaces matches in string with repl
regex.subn(repl, string, count=0) # replaces and returns (new string, number of replacements)
>>> import re
>>> text = 'Learn to sing because singing is fun.'
>>> re.sub('sing', 'program', text)
'Learn to program because programing is fun.'
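A short sketch of the compiled-pattern forms on the same sentence, showing count, subn, and a word-boundary variant that leaves "singing" alone. The \b pattern is an addition for illustration.

```python
import re

text = 'Learn to sing because singing is fun.'
regex = re.compile('sing')

# count=1 replaces only the first occurrence.
first_only = regex.sub('program', text, count=1)

# subn returns the new string together with the number of replacements.
replaced, how_many = regex.subn('program', text)

# \b matches at a word boundary, so only the whole word "sing" is replaced.
whole_word = re.sub(r'\bsing\b', 'program', text)

print(first_only)  # Learn to program because singing is fun.
print(how_many)    # 2
print(whole_word)  # Learn to program because singing is fun.
```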