1. Sparse Coding for Image and Video Understanding Jean Ponce http://www.di.ens.fr/willow/ Willow team, LIENS, UMR 8548, École normale supérieure, Paris. Joint work with Julien Mairal, Francis Bach, Guillermo Sapiro and Andrew Zisserman
2. What this is all about.. (Courtesy Ivan Laptev) Object class recognition 3D scene reconstruction Face recognition Action recognition (Furukawa & Ponce’07) (Sivic & Zisserman’03) (Laptev & Perez’07) Drinking
4. Outline What this is all about A quick glance at Willow Sparse linear models Learning to classify image features Learning to detect edges On-line sparse matrix factorization Learning to restore an image
33. Finding human actions in videos (O. Duchenne, I. Laptev, J. Sivic, F. Bach, J. Ponce, ICCV’09)
34. Sparse linear models Dictionary: D = [d1, …, dp] ∈ ℝ^(m×p). Signal: x ∈ ℝ^m. D may be overcomplete, i.e., p > m. x ≈ α1 d1 + α2 d2 + … + αp dp
35. Sparse linear models Dictionary: D = [d1, …, dp] ∈ ℝ^(m×p). Signal: x ∈ ℝ^m. D is adapted to x when x admits a sparse decomposition on D, i.e., x ≈ Σ_{j∈J} αj dj, where |J| = ‖α‖₀ is small.
36. Sparse linear models Dictionary: D = [d1, …, dp] ∈ ℝ^(m×p). Signal: x ∈ ℝ^m. A priori dictionaries such as wavelets, as well as learned dictionaries, are adapted to sparse modeling of audio signals and natural images (see, e.g., [Bruckstein, Donoho, Elad, 2009]).
37. Sparse coding and dictionary learning: a hierarchy of problems
Least squares: min_α ‖x − Dα‖₂²
Sparse coding: min_α ‖x − Dα‖₂² + λ‖α‖₀, or with a general sparsity-inducing penalty ψ, min_α ‖x − Dα‖₂² + λψ(α)
Dictionary learning: min_{D∈C, α1,…,αn} Σ_{1≤i≤n} [ ½‖xi − Dαi‖₂² + λψ(αi) ]
Learning for a task: min_{D∈C, α1,…,αn} Σ_{1≤i≤n} [ f(xi, D, αi) + λψ(αi) ]
Learning structures: min_{D∈C, α1,…,αn} Σ_{1≤i≤n} [ f(xi, D, αi) + λ Σ_{1≤k≤q} ψ(dk) ]
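The sparse-coding level of this hierarchy, in its ℓ1 (Lasso) form min_α ½‖x − Dα‖₂² + λ‖α‖₁, can be solved by simple proximal-gradient (ISTA) iterations. A minimal NumPy sketch of that idea (an illustration, not the authors' implementation):

```python
import numpy as np

def ista(x, D, lam, n_iter=2000):
    """Solve min_a 0.5*||x - D a||_2^2 + lam*||a||_1 by ISTA:
    a gradient step on the quadratic term followed by soft-thresholding,
    the proximal operator of the l1 penalty."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - D.T @ (D @ a - x) / L      # gradient step
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # shrinkage
    return a
```

With a small λ and a signal that truly is a sparse combination of atoms, the residual ‖x − Dα‖ becomes small while most coefficients stay exactly zero.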
38. Discriminative dictionaries for local image analysis (Mairal, Bach, Ponce, Sapiro, Zisserman, CVPR’08)
α*(x, D) = argmin_α ‖x − Dα‖₂² s.t. ‖α‖₀ ≤ L, computed with orthogonal matching pursuit (Mallat & Zhang’93, Tropp’04)
R*(x, D) = ‖x − Dα*‖₂²
Reconstruction (MOD: Engan, Aase, Husoy’99; K-SVD: Aharon, Elad, Bruckstein’06): min_D Σ_l R*(xl, D)
Discrimination: min_{D1,…,Dn} Σ_{i,l} Ci[R*(xl, D1), …, R*(xl, Dn)] + R*(xl, Di)
(Both MOD and K-SVD versions, with truncated Newton iterations.)
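The ℓ0-constrained code α*(x, D) above is computed with orthogonal matching pursuit: greedily pick the atom most correlated with the residual, then re-fit all selected coefficients by least squares. A minimal NumPy sketch (an illustration, not the CVPR’08 code):

```python
import numpy as np

def omp(x, D, L):
    """Orthogonal matching pursuit: select at most L atoms of D,
    re-fitting the coefficients on the current support by least
    squares at every step (assumes unit-norm columns in D)."""
    residual = x.copy()
    support = []
    a = np.zeros(D.shape[1])
    for _ in range(L):
        j = int(np.argmax(np.abs(D.T @ residual)))  # most correlated atom
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    a[support] = coef
    return a
```

The least-squares re-fit is what distinguishes *orthogonal* matching pursuit from plain matching pursuit: once the true atoms are all selected, the residual drops to zero exactly.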
40. Pixel-level classification results Qualitative results, Graz-02 data Quantitative results Comparison with Pantofaru et al. (2006) and Tuytelaars & Schmid (2007).
41. L1 local sparse image representations (Mairal, Leordeanu, Bach, Hebert, Ponce, ECCV’08)
α*(x, D) = argmin_α ‖x − Dα‖₂² s.t. ‖α‖₁ ≤ L (Lasso: convex optimization; LARS: Efron et al.’04)
R*(x, D) = ‖x − Dα*‖₂²
Reconstruction (Lee, Battle, Raina, Ng’07): min_D Σ_l R*(xl, D)
Discrimination: min_{D1,…,Dn} Σ_{i,l} Ci[R*(xl, D1), …, R*(xl, Dn)] + R*(xl, Di)
(Partial dictionary update with Newton iterations on the dual problem; partial fast sparse coding with projected gradient descent.)
42. Edge detection results Quantitative results on the Berkeley segmentation dataset and benchmark (Martin et al., ICCV’01)
43. Pascal’07 data L’07 Us + L’07 Comparison with Leordeanu et al. (2007) on the Pascal’07 benchmark. Mean error rate reduction: 33%. Input edges Bike edges Bottle edges People edges
45. Online sparse matrix factorization (Mairal, Bach, Ponce, Sapiro, ICML’09)
Problem: min_{D∈C, α1,…,αn} Σ_{1≤i≤n} [ ½‖xi − Dαi‖₂² + λ‖αi‖₁ ], or, in matrix form, min_{D∈C, A} ½‖X − DA‖_F² + λ‖A‖₁
Algorithm: iteratively draw one random training sample xt and minimize the quadratic surrogate function gt(D) = (1/t) Σ_{1≤i≤t} [ ½‖xi − Dαi‖₂² + λ‖αi‖₁ ]
(LARS/Lasso for sparse coding, block-coordinate descent with warm restarts for dictionary updates, mini-batch extensions, etc.)
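The surrogate gt(D) is quadratic in D, so it can be minimized from two accumulated sufficient statistics, A = Σ αi αiᵀ and B = Σ xi αiᵀ, with one pass of block-coordinate descent over the dictionary columns. A minimal self-contained NumPy sketch of this scheme (an illustration of the idea, not the ICML’09 implementation; it uses ISTA for the sparse-coding step where the paper uses LARS):

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def online_dictionary_learning(X, p, lam, n_epochs=1, seed=0):
    """Online dictionary learning: draw one sample at a time,
    sparse-code it, accumulate A = sum a a^T and B = sum x a^T,
    then update D column by column and project onto the unit ball."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    D = rng.standard_normal((m, p))
    D /= np.linalg.norm(D, axis=0)
    A = np.zeros((p, p))
    B = np.zeros((m, p))
    for _ in range(n_epochs):
        for t in rng.permutation(n):
            x = X[:, t]
            # sparse-coding step: ISTA on the Lasso subproblem
            a = np.zeros(p)
            L = np.linalg.norm(D, 2) ** 2
            for _ in range(50):
                a = soft_threshold(a - D.T @ (D @ a - x) / L, lam / L)
            A += np.outer(a, a)
            B += np.outer(x, a)
            # dictionary update: block-coordinate descent on g_t(D)
            for j in range(p):
                if A[j, j] > 1e-10:
                    D[:, j] += (B[:, j] - D @ A[:, j]) / A[j, j]
                    D[:, j] /= max(1.0, np.linalg.norm(D[:, j]))
    return D
```

Because A and B summarize all past samples, each dictionary update costs the same no matter how many samples have been seen, which is what makes the method scale to millions of patches.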
59. Inpainting a 12MP image with a dictionary learned from 7×10⁶ patches (Mairal et al., 2009)
61. State of the art in image denoising BM3D (Dabov et al.’07); non-local means filtering (Buades et al.’05); dictionary learning for denoising (Elad & Aharon’06; Mairal, Elad & Sapiro’08): min_{D∈C, α1,…,αn} Σ_{1≤i≤n} [ ½‖yi − Dαi‖₂² + λ‖αi‖₁ ], x = (1/n) Σ_{1≤i≤n} Ri D αi
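The final step x = (1/n) Σ Ri D αi places each denoised patch D αi back at its image location (the operator Ri) and averages overlapping pixels. A minimal NumPy sketch of that averaging step (names and calling convention are illustrative, not the paper's code):

```python
import numpy as np

def average_patches(patches, positions, shape, psize):
    """Reassemble an image from (possibly overlapping) denoised patches,
    averaging every pixel over all the patches that cover it."""
    acc = np.zeros(shape)   # sum of patch values per pixel
    cnt = np.zeros(shape)   # number of patches covering each pixel
    for patch, (r, c) in zip(patches, positions):
        acc[r:r + psize, c:c + psize] += patch.reshape(psize, psize)
        cnt[r:r + psize, c:c + psize] += 1.0
    cnt[cnt == 0] = 1.0     # leave uncovered pixels at zero
    return acc / cnt
```

Averaging overlapping estimates is itself a denoiser: independent per-patch errors partially cancel where many patches cover the same pixel.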
62. Non-local sparse models for image restoration (Mairal, Bach, Ponce, Sapiro, Zisserman, ICCV’09) Sparsity vs. joint sparsity: min_{D∈C, A1,…,An} Σ_i Σ_{j∈Si} ½‖yj − Dαij‖₂² + λ‖Ai‖_{p,q}, where ‖A‖_{p,q} = Σ_{1≤i≤k} ‖αⁱ‖_q^p and (p,q) = (1,2) or (0,1).
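The grouped norm ‖A‖_{p,q} sums the q-norms of the rows of A raised to the power p, so penalizing it forces similar patches in a group Si to select the *same* atoms (joint sparsity). A small NumPy sketch of the norm itself:

```python
import numpy as np

def group_norm(A, p, q):
    """||A||_{p,q} = sum_i ||a^i||_q^p over the rows a^i of A.
    p = 1, q = 2 gives the convex group penalty (sum of row l2 norms);
    p = 0 counts the number of nonzero rows."""
    row_norms = np.linalg.norm(A, ord=q, axis=1)
    if p == 0:
        return int(np.count_nonzero(row_norms))
    return float(np.sum(row_norms ** p))
```

A whole row must be zero for it to stop contributing, which is exactly what couples the sparsity patterns of the columns (the codes of the patches in one group).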
64. PSNR comparison between our method (LSSC) and Portilla et al.’03 [23]; Roth & Black’05 [25]; Elad & Aharon’06 [12]; and Dabov et al.’07 [8].
65. Demosaicking experiments LSSC LSC Bayer pattern PSNR comparison between our method (LSSC) and Gunturk et al.’02 [AP]; Zhang & Wu’05 [DL]; and Paliy et al.’07 [LPA] on the Kodak PhotoCD data.
66. Real noise (Canon PowerShot G9, ISO 1600) Raw camera JPEG output Adobe Photoshop DxO Optics Pro LSSC