1. QR Factorizations and SVDs for Tall-and-skinny
Matrices in MapReduce Architectures
Austin Benson
ICME, Stanford University
Work completed in part at UC-Berkeley
SIAM CSE 2013
February 27, 2013
3. Quick QR and SVD review
For a tall-and-skinny matrix A (m × n with m ≫ n):

    QR:  A = Q R,        Q is m × n, R is n × n
    SVD: A = U Σ V^T,    U is m × n, Σ and V^T are n × n

Figure: Q, U, and V are orthogonal matrices. R is upper triangular and Σ is diagonal with positive entries.
5. TS-QR → TS-SVD
R is small (n × n), so computing its SVD is cheap: if R = U_R Σ V^T, then A = QR = (Q U_R) Σ V^T is the SVD of A.
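As a quick illustration (a serial numpy sketch, not the talk's MapReduce code), the small SVD of R upgrades a QR factorization into the SVD of A:

```python
import numpy as np

# Tall-and-skinny matrix: many rows, few columns.
A = np.random.rand(1000, 10)

# QR gives Q (1000 x 10) and a small R (10 x 10).
Q, R = np.linalg.qr(A)

# The SVD of the 10 x 10 R is cheap.
Ur, s, Vt = np.linalg.svd(R)

# Combine: A = Q R = (Q Ur) diag(s) Vt, so U = Q @ Ur gives the SVD of A.
U = Q @ Ur
assert np.allclose(U @ np.diag(s) @ Vt, A)
```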
6. Why Tall-and-skinny QR and SVD?
1. Regression with many samples
2. Principal Component Analysis (PCA)
3. Model Reduction
Figure: Dynamic mode decomposition of a rectangular supersonic screeching jet (panels: pressure, dilatation, jet engine). Joe Nichols, Stanford University.
7. MapReduce overview
Two functions that operate on key-value pairs:

    map:    (key, value) → (key, value)
    reduce: (key, ⟨value1, …, valuen⟩) → (key, value)

A shuffle stage runs between map and reduce to sort the values by key.
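A minimal in-process sketch of this model (word count as the toy problem; a real framework like Hadoop performs the shuffle across machines):

```python
from collections import defaultdict

def mapper(key, value):
    # value is a line of text; emit a (word, 1) pair per word.
    for word in value.split():
        yield word, 1

def reducer(key, values):
    # values is the list of everything emitted under this key.
    yield key, sum(values)

def run(records):
    # Shuffle: group the mapped values by key.
    groups = defaultdict(list)
    for key, value in records:
        for k, v in mapper(key, value):
            groups[k].append(v)
    # Reduce each key's group of values.
    return dict(kv for k, vs in groups.items() for kv in reducer(k, vs))

counts = run([(0, "qr svd qr"), (1, "svd qr")])
# counts == {'qr': 3, 'svd': 2}
```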
8. MapReduce overview
Scalability: many map tasks and many reduce tasks are used
Figure: map and shuffle stages (image: https://developers.google.com/appengine/docs/python/images/mapreduce_mapshuffle.png)
9. MapReduce overview
The idea is data-local computation. The programmer implements:
- map(key, value)
- reduce(key, ⟨value1, …, valuen⟩)
The shuffle and data I/O are implemented by the MapReduce framework, e.g., Hadoop.
This is a very restrictive programming environment! We sacrifice flexibility in exchange for what the framework provides.
11. Matrix representation
We have matrices, so what are the key-value pairs? We can just use unique row identifiers:
    A = [ 1.0  0.0          (a3320354, [1.0, 0.0])
          2.4  3.7    →     (5da011e2, [2.4, 3.7])
          0.8  4.2          (94d80025, [0.8, 4.2])
          9.0  9.0 ]        (90764917, [9.0, 9.0])
Each value in a key-value pair is a row of the matrix.
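A hypothetical serial sketch of this encoding (the random hex keys are illustrative, not the talk's actual key scheme):

```python
import uuid
import numpy as np

A = np.array([[1.0, 0.0],
              [2.4, 3.7],
              [0.8, 4.2],
              [9.0, 9.0]])

# One key-value pair per row; the key only needs to be unique,
# so a random hex identifier works.
pairs = [(uuid.uuid4().hex[:8], list(row)) for row in A]

assert len(pairs) == A.shape[0]
assert all(len(value) == A.shape[1] for _, value in pairs)
```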
13. Matrix representation: an example
We can encode more information in the keys, e.g., (x, y, z)
coordinates and model number:
((47570,103.429767811242,0,-16.525510963787,iDV7), [0.00019924
-4.706066e-05 2.875293979e-05 2.456653e-05 -8.436627e-06 -1.508808e-05
3.731976e-06 -1.048795e-05 5.229153e-06 6.323812e-06])
Figure: Aircraft simulation data. Paul Constantine, Stanford University
14. Why MapReduce?
1. Fault tolerance
2. Structured data I/O
3. Cheaper clusters with more data storage
4. Easy to program
Generate lots of data on a supercomputer → post-process and analyze on a MapReduce cluster.
(image: http://www.nersc.gov/assets/About-Us/hopper1.jpg)
15. Fault tolerance
Figure: Performance with injected faults. Job time (secs.) vs. 1/(probability of task fault), compared against the running time with no faults: roughly a 25% performance penalty with 1 out of 5 tasks failing.
16. Why MapReduce?
Not convinced MapReduce is good for computational science?
MS218: Is MapReduce Good for Science and Simulation Data?,
Thursday, 4:30-6:30pm.
18. Communication-avoiding TSQR
    A = [ A1 ]   [ Q1          ] [ R1 ]
        [ A2 ] = [    Q2       ] [ R2 ]
        [ A3 ]   [       Q3    ] [ R3 ]
        [ A4 ]   [          Q4 ] [ R4 ]
        (8n×n)      (8n×4n)      (4n×n)

Each Ai = Qi Ri can be computed in parallel. If we only need R, then we can throw out the Q factors.
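The same reduction, sketched serially with numpy (in the real version the local QRs are independent map tasks):

```python
import numpy as np

def tsqr_r(A, nblocks=4):
    """R factor of A via a one-level TSQR reduction."""
    # Local QRs of the row blocks; these are independent (map tasks).
    local_rs = [np.linalg.qr(Ai, mode='r') for Ai in np.array_split(A, nblocks)]
    # QR of the stacked small R factors gives the final R (reduce task).
    return np.linalg.qr(np.vstack(local_rs), mode='r')

A = np.random.rand(800, 5)
R = tsqr_r(A)
# R^T R = A^T A up to rounding, since the discarded Q factors are orthogonal.
assert np.allclose(R.T @ R, A.T @ A)
```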
19. MapReduce TSQR
Figure: MapReduce TSQR data flow. Map tasks run a local TSQR on each block Ai and emit Ri. A shuffle collects the rows of the Ri into S(1); reduce tasks run local TSQRs on blocks of S(1) and emit the R2,j factors. A second iteration (identity map, single reduce) produces the final R. S(1) is the matrix consisting of the rows of all of the Ri factors; similarly, S(2) consists of all of the rows of the R2,j factors.
20. MapReduce TSQR: the implementation
import random, numpy, hadoopy

class SerialTSQR:
    def __init__(self, blocksize, isreducer):
        self.bsize, self.data = blocksize, []
        self.__call__ = self.reducer if isreducer else self.mapper

    def compress(self):
        R = numpy.linalg.qr(numpy.array(self.data), 'r')
        self.data = []
        for row in R:
            self.data.append([float(v) for v in row])

    def collect(self, key, value):
        self.data.append(value)
        if len(self.data) > self.bsize * len(self.data[0]):
            self.compress()

    def close(self):
        self.compress()
        for row in self.data:
            yield random.randint(0, 2000000000), row

    def mapper(self, key, value):
        self.collect(key, value)

    def reducer(self, key, values):
        for value in values:
            self.mapper(key, value)

if __name__ == '__main__':
    mapper = SerialTSQR(blocksize=3, isreducer=False)
    reducer = SerialTSQR(blocksize=3, isreducer=True)
    hadoopy.run(mapper, reducer)
21. MapReduce TSQR: Getting Q
We have an efficient way of getting R in QR (and in the SVD).
What if I want Q (or my left singular vectors)?

    A = QR → Q = A R⁻¹

This is a numerically unstable computation, and Q can be far from orthogonal. (This may be OK in some cases.)
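A small numpy experiment (not from the talk; the matrix and its condition number are made up for illustration) shows the loss of orthogonality:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a tall matrix with condition number ~1e8.
m, n = 500, 4
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = U @ np.diag(np.logspace(0, -8, n)) @ V.T

# "Indirect" Q via Q = A R^{-1}.
R = np.linalg.qr(A, mode='r')
Q_ind = A @ np.linalg.inv(R)

# Householder Q for comparison.
Q_dir, _ = np.linalg.qr(A)

err_ind = np.linalg.norm(Q_ind.T @ Q_ind - np.eye(n))
err_dir = np.linalg.norm(Q_dir.T @ Q_dir - np.eye(n))
# err_ind is typically many orders of magnitude larger than err_dir.
```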
26. Direct TSQR
Why is it hard to reconstruct Q?
- The only maintained state is the data written to disk at the end of each map/reduce iteration
- No controlled communication between processors → can only control data movement via the shuffle
- File names and keys are the only methods of labeling data components
27. Direct TSQR: Steps 1 and 2
Figure: Direct TSQR, steps 1 and 2. First step: map tasks factor each block Ai = Q_i1 R_i and emit both Q_i1 and R_i. Second step: after a shuffle, a single reduce task stacks R1, …, R4, computes its QR factorization, and emits the final R along with the small factors Q_12, …, Q_42.
29. Cholesky QR
We also have a Cholesky QR:
    A^T A = (QR)^T (QR) = R^T Q^T Q R = R^T R

Computing A^T A translates nicely to MapReduce. This method has worse stability issues (forming A^T A squares the condition number) but requires fewer flops.
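A serial sketch of Cholesky QR (the MapReduce version would accumulate A^T A block-by-block in the map tasks):

```python
import numpy as np

def cholesky_qr(A):
    # A^T A = R^T R with R the upper-triangular Cholesky factor,
    # then Q = A R^{-1}. A^T A is only n x n, so it is cheap to form.
    G = A.T @ A
    R = np.linalg.cholesky(G).T        # upper-triangular factor
    Q = np.linalg.solve(R.T, A.T).T    # Q = A R^{-1} without forming the inverse
    return Q, R

A = np.random.rand(1000, 8)
Q, R = cholesky_qr(A)
assert np.allclose(Q @ R, A)
```

Note that the squared condition number of A^T A is what makes this less stable than Householder-based TSQR.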
30. Performance Data
Figure: Performance of various MapReduce algorithms for computing QR (Cholesky, Indirect TSQR, Cholesky + IR, Indirect TSQR + IR, Direct TSQR). Time to completion (seconds, 0 to 8000) for matrix sizes 4B×4 (135 GB), 2.5B×10 (193 GB), 600M×25 (112 GB), 500M×50 (184 GB), and 150M×100 (110 GB).
31. End
Questions??
- code: https://github.com/arbenson/mrtsqr
- arXiv paper: http://arxiv.org/abs/1301.1071
- my email: arbenson@stanford.edu