CVPR 2020 Report

This is the "CVPR 2020 Report" compiled by cvpaper.challenge.

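cvpaper.challenge is an effort to reflect the current state of the computer vision field and to create its trends. Members work on writing paper summaries, generating ideas, discussion, implementation, and paper submission, and share all of their knowledge. The goal for 2020 is to submit 30+ papers to top conferences.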
http://xpaperchallenge.org/cv/

  1. 1. 1 1 2 .. / 1 1 1 2 .. / 02
  2. 2. • 2 IRP 201 Ig c – e HI • 201 8 9 6 8 6 6 -9 : / : 4C • 2 8 9 6 8 6 6 -9 : / : 644C • 201 8 9 6 8 6 6 4C 6 48 6 76 4C • . 2 8 9 6 8 6 6 4C 6 48 6 76 944C • 201 8 9 6 8 6 6 4C 6 48 6 76 4C • 201 8 9 6 8 6 6 4C 6 48 6 76 4C – ik V 4C 6 48 6 76 H – H hKa I dEpl o np
  3. 3. x • ms urV x dkleV – ms • . Vdkle • t R • Vv .Vdkle – w V No p • – ah . a g_ • C 0 3 1 0883: 3 1 3D 1 / 0 3 8 088 • y m nV – d c u Vi a g_ P • 8 23 0 3 :3 1 0 3 1 0883: 3 3 3: 0 : • 70 70 0 70 :3 C 0 3 1 0883: 3 8
  4. NIDB (Near-Miss Incident DB) • HP: http://xpaperchallenge.org/ • Twitter: @CVpaperChalleng
  5. http://xpaperchallenge.org/cv/recruit
  6. 6. o py • -9 6 tu sk – -9 6 u kSjl RcgR i vkT • ALLI AH L A L L L -9 6 LM P • ALLI C H I H N L – -9 6 thdR_S. 3 O 2H LP 8 A H H T • ALLI I D D H HLHD M NI IH L – -9 6 : V S 0 7-/41 6T • ALLI AH L A L L L NI : AH : – -9 6 x kSrjl ibeR a i skT • n ALLI D LH N H I H N L • m ALLI D LH N H I H N L
  7. 7. e , • 1 67 hbcdUja – 1 67 hbcdV1R A /CAI T /4 A D O W • D . R A CAI G C A A D O A P I ( – 1 67 3 CGA igfV A 0 C2 W • D . A C – 1 67 :I A R 0A 6 A /II OI A RI A • D . RI A AP A P OI A R A A II OI A – 1 67 . 9DA 9 5 A 2A A I 6 A • D . DA A B S P DA A A A I A B- A) ((
  8. 8. bad • 1/0 D 25 @ – 1 /@ 5 A R IP – 1/0 F 2 2 C G • 8 A 8C 2 2 C D 25 1/0 – c V H I . 1 F 2:282A8 C G • 8 A 8C 2:282A8 C D 25. 1
  9. 9. - C 2 0 - 0 9 -
  10. https://scholar.google.com/citations?view_op=top_venues&hl=en&vq=en
  11. 11. 1./ c CeiI • 1./ 19> 2: – dP I fabp V bp • -1 Smo • O fa V >9 bp hl • 1./ . > 2: VD 1./ P0 2 : HR19> 2: 8 > 8
  12. 12. 3/0 U c bh • 3/0 3 C 7= – l V • U XVe • a_ YH i ? 1 8:Ra_P Y T Ye CC ? C 8: 9? 7C9 , +7.2 C8 =
  13. 13. 3856 % %Vc iR ) P ( • 3856 % % 8A H – ho l ) , l % P • Y Y 23 . _ ema( )_Up • 23 _ (% X b HH / H =? C H 1 0 47:9H=
  14. 14. .:5 & & c tVs X &Y • .:5 & & :FLN – } cghf • D FNF X ow U {ruY 4 • 1-80a m lnd e_v • yic/ L F Dghf 8L C L E N S HF T LPF A L F D 0CCF F N NL F F D F C L L N NF L F D -AP L LF L F D 4 L D L NFP H A b lp ENN R N ? H N E,P 2 N? =
  15. 15. / H P • / /8=? – ?? C ? 6 5 : ?5 .10? 2
  16. 16. / 23 Xf ip P # % • / 23 G 9 – Xaw • %m P m d P • nRV m u P Y% V m H = = Yv • rsgbU Yty e 0 ? = _loc e h H,## := C# 9 . 91 76 : 8
  17. 17. 0856 hl o ,#, U % • 0856 8 M F – kwc dv • %m f UR m b U • nV m t , UR a m 7H ? uX • rieY 2 F 1 = s _g 047 f p CMM ##PPP M = G#P M C/ . 37: M=PF ,
  18. 18. 3 9: & &r“ ea % b • 3 9: & & TVW N – y qpk fon]? 2 d l • 3 W WE V a r“ _b. VVRU.%%YYY WVWE F O%F PP N%=3&P F TU7C ?Y Y • 3 8R P 1FF UU A3 9: & & 6 P 3 P T PF B. VVR.%% R P FF UU V F F O%3 9: & & R • 3 8R P 1FF UU A3 9: & & ? TMU RB. VVR.%% R P FF UU V F F O%3 9: & &CY TMU RU%O PW R • Y VV T [ 3 9: & & . VVRU.%%VY VV T F O%U TF 0S/ 3 9: & & UTF/V R CSW T /N – t w r“ shviv gvmcu_
  19. https://www.youtube.com/watch?v=aHUYXtbwl_8
  20. http://cvpr2020.thecvf.com/
  21. https://www.youtube.com/watch?v=aHUYXtbwl_8
  22. https://www.youtube.com/watch?v=aHUYXtbwl_8
  23. https://www.youtube.com/watch?v=aHUYXtbwl_8
  24. https://www.youtube.com/watch?v=LkSBxpLOBx8
  25. 25. 5 i S V W • L O hcY – t sT • h Uu U af h • v r nl fi fi Xo • hcY y p g k – U ws YT • l – 2L AL h Xb m_Y • 2KAI IC 0ARI A .i O O?Ae – D K QQQ R O O?A K R 1=?-P/ 8 O /? F QR 0 :?6:6/
  26. 26. . % % 0 %5 % • % – RV 2 • 6 27 P V 8 P6 7 6 / 17 – • V / • V 5 P C P
  27. 27. *2/0 lr nwM I • – 3 C3 / 1AB A /IV N da • C3 t f ae – x u p c hM N C3 i s P 8 / A* B 2 D , A1B 3 SRt Smo
  28. 28. . 89 i b • hjl(/ – (/ hycl i – “uxg o]sa efbd • )t v k1 1. [/FF 1FOFRBT VF P FMS[ / GGFRFOT BCMF 9FO FRFRhycl a m • (/ -FODINBRL F IB F FT .- • np ri F 3BPM O 2/2, A 8 PRDI(/ 0BDFCPPL A FSI 9 . 1 1L PWBR FT BM 2.. A Y OSU FRV SF FBRO O PG 8RPCBCM NNFTR D /FGPRNBCMF (/ 7CKFDTS GRPN 2NB FS O TIF M U FT BM . 89 A . 89 -FST 8B FR
  29. Leveraging 2D Data to Learn Textured 3D Mesh Generation https://openaccess.thecvf.com/content_CVPR_2020/html/Henderson_Leveraging_2D_Data_to_Learn_Textured_3D_Mesh_Generation_CVPR_2020_paper.html • From Image Collections to Point Clouds With Self-Supervised Shape and Pose Networks https://openaccess.thecvf.com/content_CVPR_2020/html/Navaneet_From_Image_Collections_to_Point_Clouds_With_Self-Supervised_Shape_and_CVPR_2020_paper.html
  30. Learning Unsupervised Hierarchical Part Decomposition of 3D Objects From a Single RGB Image https://openaccess.thecvf.com/content_CVPR_2020/html/Paschalidou_Learning_Unsupervised_Hierarchical_Part_Decomposition_of_3D_Objects_From_a_CVPR_2020_paper.html • PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes https://openaccess.thecvf.com/content_CVPR_2020/papers/Wu_PQ-NET_A_Generative_Part_Seq2Seq_Network_for_3D_Shapes_CVPR_2020_paper.pdf
  31. Colored Voxel http://www.krematas.com/nvr/index.html • Neural Rendering Workshop https://www.neuralrender.com/
  32. Workshop https://nvlabs.github.io/nvs-tutorial-cvpr2020/ • Link https://shihmengli.github.io/3D-Photo-Inpainting/
  33. Articulation • Implicit Function • Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion https://openaccess.thecvf.com/content_CVPR_2020/html/Henderson_Leveraging_2D_Data_to_Learn_Textured_3D_Mesh_Generation_CVPR_2020_paper.html
  34. 3D Dynamic Voxel • 3DV: 3D Dynamic Voxel for Action Recognition in Depth Video https://openaccess.thecvf.com/content_CVPR_2020/html/Wang_3DV_3D_Dynamic_Voxel_for_Action_Recognition_in_Depth_Video_CVPR_2020_paper.html
  35. 35. , Pb A IC • , E X – / no • / 0680 / • 06 8 22 8 0 6 6 08 0 82 083 62 – che pi r e ta • V Ps V P e uRTdFl I
  36. 36. 2/0 e R • – c e • w e fbek d P • d i X pon • 2 e f x u dkn i ogb R E A A A Tna lon 3 3G I D8E , C A AE A G A . GD . I D E 3 A7 C D G A 2/0 rvs d n t e “ n ad km Vn N V e n be X d” W n h N
  37. 37. e R Z V • 8 0 – jrul mtwd d • c e xa XSi “d • h e dch xsoviopnk h , AE K D P IG I EE KA F E FKA HHIG KG ,OHD AFAF F AF 2 I D 2 KNGIC 3 B K K KGI ” bekv wkd g i / F K D P0FK IHI K D F I K -AF I AF G FAKAGF A AGF .IG HAF
  38. 38. 2 • 3, – . Zi • V R R L • .c a a R L • I d P a tMr uml Zi a TR I V R R R fI Io t ns p / 9E 99B C FDE9C AC DF /9 C 2 a ” “ Zio r p aTR e agho ” “ A 9E 0 E D C E A 8 C 9DD DD . C9 9 E /9 C 2
  39. 39. 0:78 j V • 2/91 – j2/91 j • 2/9 2/??9 e 4 =CA6ANj • zy j 9EI 4 =CAMj V V h s lbm V wtx R ERAG f h i n je io f V9 =G = 2A C M 2 AA =I j NN M, C M ?M=EG EN A PEME I 9EI 4 =CAM V s b j NN M, AI APEA IAN .E -M A T=/G3 4 4 =CA6ANj u v j V i a u v j c e h[f rb s V V V hg s hdb V4 =CA6AN 7 FA?N]pj NN , E =CA IAN C =NA MA
  40. 40. 0 N n r A • 0 MC tMhe – ui • 3 1 2 32 3 /31 2 /AL E • n N ui Mo o V N lR P 44 2 38 3 MS A • N c a i fR
  41. 41. : 7 jl uXS T • i xh f t x s – 4,8S1 27 T /AAD D L4 LS1 32 T faR • ec ognb S/AAD D L. LT j S .T rp • d V S /AAD D L. L .T 0 D CL C A L F O . /NI D , CDL LM A /AAD D L :D 7 DLD P : 7 3 9 L F O/AAD D L. L 8 F F /AAD D L E L . L LD P : 7
  42. 42. 2/0 P • w R w – ifg Ic • 0 8. A , 4 A. A n Vo” a • . s , 4 AD N I I tr – P – ~v k a S “edcV l n R V pa 035 3 4 A 3 E 8 8 . AC : 8 34 2/0
  43. 43. V4 8 • – cP4i P4 P4 – R3 eR – C R K /0 2/- & 7 nc
  44. A. Andonian*, C. Fosco*, ... & A. Oliva. We Have So Much in Common: Modeling Relational Set Abstractions in Videos. [arXiv, soon] • Aude Oliva, MIT • http://ai.stanford.edu/~jingweij/cicv/
  45. Twitter: Strong Accept x3 https://twitter.com/doubledaibo/status/1232832762270236672 • http://openaccess.thecvf.com/content_CVPR_2020/html/Shao_FineGym_A_Hierarchical_Video_Dataset_for_Fine-Grained_Action_Understanding_CVPR_2020_paper.html
  46. 46. 30 Xt u PM • t xfghr s r – , xfghrM • oVt R YXca y – - 8 : E 2C7 3 :A /A : DM • da np eiX s lW WR T oVs TP 4F :E 7 /F E C /:E A AC - 8 : E 2C7 3 :A /A : D 30 .: 8 E: A :C :E 7 , - 7 C8 E:8EFC:D AC - 8 : E 3 :A :8A E A 30
  47. 47. -312 P • ʼ – mpdenh xk o • / 7 m iln V ,3, urvP AI c • y _ R w egtl V ba ,7 0 - : xs – -312K .--3K _ m c R N C C7 :AA: 7A 8A A 8 7 AC: 7 : : 4 7
  48. 48. /,- Pf Ci • e g – a o e 384 3 32 3 9 9 RPst • hO L Z – vl F r upP w P ZRP d P o e o FVe no FVe e 03 93 3 S V
  49. 49. 4 b N VP ) • – P • 0 A A 7 0 U I7 A i • 7 2 2l d(tso L agi b gie U . , 7 0 A A 7 5A C H 7 4 C A 7 A / 7 A A K 4 a“ b A 7 H / 7 f o r p 8 c Ln Oa d 0 A U am Ow bU ” RiL 7 H r l d U V Mr l a d U VL f a b vul agiL a OtU l M
  50. 50. - 8& & a Z & • – x Z • u v m ihln d W e y • rvks v 80, / KR PAFK d • FNFKJ 4 J “ caopf gtw 5PN FGK B B U B C PLBM FNBA FBRLKFJ 4B MJFJ /MKI 1I B -K B FKJNV - 8 & & / EP B U FNFKJ 4 J P B F FKJ F E B C PLBM FNBA PSF F MT 8B NKJFJ : NGNV - 8 & & 2 FBM FK JJF B U. K FJ 4KNNBN CKM JNPLBM FNBA FABK 8BLMBNBJ FKJ 4B MJFJ V - 8 & &
  51. 51. 40 Sf Mip O P • e g – n e O o hP • ya cS s • b Z • UV N ldtr R vu yaMXN m - 0C / E C 2 2 D E : -C C E D , A C 40 - . E 2 2 D E : DDEC CE : E E C8 40 5 2 2 D E : 2 / C 40
  52. 52. +5 2% % g L %M • ap sLlkM – rioapL gM – ioThnRV e • 0 apTN rio uPf ioStm • 3 +.2L +/. % M 4 + I 3 A , 8 + C F . 8 5 C 2 A C C +/. % %
  53. 53. .412 P oImt aehgP o fr TR V – u v • u v AC C 78C 5 8 8 6 A5A8 6 5 8 :8 • 3 5 7C 18 9 5 4 78 286 : AC C 78C 5 8 8 6 A5A8 6 5 8 :8 5 7C A8 9 5 78 86 : – sxi • sxi AC C 78C 5 8 8 6 A5A8 6 5 8 :8 C8 9CEA8 C87 85 : – pl • .4Pwn -0P pl AC C 78C 5 8 8 6 A5A8 6 5 8 :8 5 – cd • -7 8 C5 5 / 5 A 8CaeP o AC C 78C 5 8 8 6 A5A8 6 5 8 :8 57 8 C5 5 8 5 A 8C
  54. 54. / B N • G – 45 0 2 B A 6 S RV BP – G C 8
  55. 55. 0 : ey T • s nmo d h – PRIC .8 /FD .8cbe • PRIC .8 S.B 58e • /FD .8 nmoe T mVle – kpjc s dc • 5 251a geeSrteud bi PRIC .8 0 : 9N I /COP : MCN 4L LN IC 7C PFL PPMO DFP AL 8 I O OPRICD /FD .8 506 9N I PPMO DFP AL G NLAH /FD .8 :R LNA
  56. MixNMatch https://github.com/Yuheng-Li/MixNMatch • Towards Unsupervised Learning of Generative Models for 3D Controllable Image Synthesis http://openaccess.thecvf.com/content_CVPR_2020/html/Liao_Towards_Unsupervised_Learning_of_Generative_Models_for_3D_Controllable_Image_CVPR_2020_paper.html • Self-Supervised Scene De-Occlusion https://openaccess.thecvf.com/content_CVPR_2020/html/Zhan_Self-Supervised_Scene_De-Occlusion_CVPR_2020_paper.html
  57. 57. 8 0 • P – 85 8 V/ – 8 V 2 – 8R 9 – V – C 48 5
  58. 58. -62 a VN • w r mhn pas Z – A9 BD RdaT • ec A9 BDg P at • S d2B F G 8F G a“R • eg Z v a 8D: B A oil e 4 8A F 8 ,D : A F 08C , F A A9 BD ,8 : 8A: A9 BD /D . F 9F BA G 8 :8CF G D8 A A 48 C 4 9F BA -62
  59. 59. - 85 6 • V R Coe – V r C - /25 - 6 • - : 9 • : P N 01 H - C 9 t
  60. 60. , 12 d “O ” R S • snu”Zj – - dsnu • - e a e j i • / C D D F e 3 6 A 8 D - 08 H 5 - D 8 2 D: D 3-. 3 F 1 H , 12 O b3 D : - H D . :R3-.S Wg j O mo v mr cV ktl sp cfh Pc d j
  61. 61. a • po 635 S – 25 28 2 t ec: r Dn • /00/ 2 2 N R iC • 25 83 2 / 2 V ld P
  62. 62. 0:6 s v [U ( , V ) • hie . t ru p n – 5 1/. / 2 0:6 - • hie y R2 HI p ] f g dl R4 IED 2 IED m cS] oR4 2 a hie dr S 3 6 E P9 I D 5 HF D / I DH AH 8 F D 8 F DH C H C 8 EA 8 I D DH 0:6 H E
  63. 63. ,62 gmN P • a r hp ue sd – E A , hp ue sd • cbytlw a rin SRVMhp oUR V2EG :B 8 E a 0 , . A A / A: 4 A BA A - B A 6 : BE H 5A A .G : : 0B: : C BA ,62
  64. 64. /- H P • – V O • R • 60 , 532 3 02 8 60 328 2 8 • CD H IH /
  65. 65. 20 WfnH u • r dHafs – MZ hoidX gVle m d 8 8C 6 E/8C6- 8 0A8 CA6 87 - / 78 A . 6 0 8 , C 6C F 20 d R W P V t . 6 06A : 5 5 6 : 8C 6 E AA8 6C : ,7:8 0 8 C 06A :F 20
  66. 66. ,:67 l Z • 0PI 5 G O O M OF - 0PI 7 – o beaih rd n i l s c x u gW s “ P O H , N A A 0PI 5 G O O M OF 7 D FOF V ,:67 t pW l s me j 8 8 FO O H 6 .P0- 3PHOF 2 H 6FR H HFD A ILHF FO .P OF C M 0FD 7 N HPOF - 0PI -FDFOFT OF V ,:67 leml j - ”yxY s c -:Z - -S IF : R H D O H -: - -S IF : R H C M OF 7 D FOF F - LO :FA V ,:67 l jv s fj pW i i l szY ibg h s P O H FH M O 6M AF OF D ,H O F D F - N .P OF C 0PI 6 N 8 L A / MI O 8OSH V ,:67
  67. 67. 0 Fn • drs WR r PV – e io fh PVSm g – 5 96 /5 9 2 5 9 9 5 5 5D5C9D A: AD5D98 5 -5 9C • r lt PV H a – • dL cbc H 9 C90AC9
  68. Visual-Textual Capsule Routing for Text-Based Video Segmentation (Text, Video, Video Segmentation) • Object Relational Graph With Teacher-Recommended Learning for Video Captioning (External language model, dataset, long-tail)
  69. Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection (Graph CN, Shape, Scene Text) • SwapText: Image Based Texts Transfer in Scenes (Scene Text)
  70. REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments (Grounding, Embodied AI) • SQuINTing at VQA Models: Introspecting VQA Models With Sub-Questions (Sub-questions, Reasoning)
  71. 71. 23 n ou & • 8 1 A 9 – / 6 8 98 0 L 23 P V – 1 3 6 7 1 R I D – d iaI V C • Ecbegts • ca ts • mtsV
  72. 72. P 40 • – / I 67 – RX 0 / AC – P • 82 • P V 3 2
  73. CVPR 2019 SKU-110K dataset https://github.com/eg4000/SKU110K_CVPR19
  74. 74. -6 5 ʼA D E • n p V – 88 L V /:7 8:7 tM V • C de r o N • F- 51 R C L P : 8 /:7 8:7 o /:7 /:7 2 789 0 88 51 -6 5 - 51 l
  75. 75. 20 • p y - r RtP – n u a s c Pb • r jLg emilfh mR dt b 8 / 86E 7 687 6E A 86 D A AC EA A A D 28 6 8D O V d X : EA A A D C : I 8 C : 7 EA 7 2 D EE8 E A EE8 E A x oO v
  76. 76. 80 • 0 / V 0 – R – • 6 • • / W • C 7 – R P 2 7 6 2 2 S 79 6 7 6 6
  77. https://vislab.ucr.edu/Biometrics2020/index.php
  78. https://people.eecs.berkeley.edu/~malik/
  79. http://www.scan-net.org/cvpr2020workshop/ • http://www.scan-net.org/cvpr2020workshop/static/img/splash.jpg
  80. https://lidchallenge.github.io/
  81. Learning from Unlabeled Videos • https://sites.google.com/view/luv2020
  82. https://sites.google.com/view/ieeecvf-cvpr2020-precognition/home
  83. https://sites.google.com/view/cvcreative2020
  84. https://vap.aau.dk/cvsports/
  85. https://eyewear-computing.org/EPIC_CVPR20/
  86. 86. -9 m R T ( () • 0 C . M / C C M 8 N M -IG NM 9C CI – 0 8/ ro S le gm:7 • 8CG CM N u ibt0 8/ m – / F C C A 0 CFN 2 MCA MCI I 7N C / M MCI C - • u a m w u bt ih mB MG l f hn sl dc m u • VB MG kj u ac u WiV w Sm pm mr l l ak u Wm k – , F 0 C MB :CF ,0: . M M • l t xv m mc m BG E BMM M ACMBN CI BMM ACMBN IG C CI DI C P
  87. http://ai.stanford.edu/~jingweij/cicv/
  88. https://embodied-ai.org/
  89. http://activity-net.org/challenges/2020/index.html
  90. https://matsui528.github.io/cvpr2020_tutorial_retrieval/
  91. https://www.youtube.com/watch?v=W1zPtTt43LI
  92. 92. 5 9 - 5
  93. 93. yVe ,623 R D G 0 GD D 2G 9 9B CC IG - GC 9B - 19A I G C /C D I B + I 2 GR – p D B /C RUd - • - I B9 s /BB C D I D wg • hmjo t p Udr - cmi lnO w g R2 I C IG I D D gyS - P3 + Ud g , D D C 6 L DI - I B9 0 IRO 6 L - I 0 I B9 g f u c w g W sv3 +g P, D D C b a 3 D IG I D 0 Rg P
  94. 94. n P RNV- 23 M )( • , 2 9D /9 9B D : - A 7D 09C 9C , B A 79 2 BD D : ,9CD D 9 D 2 A9B , B A 79 2 BD D :ao S vr y lm S , 2 9Dau , 2 9D s y lm BAV tp eh i dcgV alm w : 9 9G 397 CDB 7D A9 D 9 7 :V , 2 9D w
  95. 95. plU VR 2 P )( • - /DCD9 B C D B C9 H C 3 DC H 3H : CH DCD 8 / CH DC – / H B DCS M: C B C D k 9 tW -00 vsN – DC o W u nwW M DD 9 DH i N m cbdeg fh ar
  96. 96. m a 34 ”O P )) • 0 O I7 A P – 0 A A 7 – p Nb L0 A U ml bo sN • b 7 H t n • b c wn . , 7 0 A A 7 A C H 7 4 C A 7 A / 7 A A K 34 c A 7 H / 7 t r l d RL U l ceo sN lcU 9 7 H t n fl U Vl Mt n b fl U VL b c wn l bi L b NvU l n 7 2 2n f(vu L bi “c i g R U
  97. 97. nb g/ ( • / 5 M -P 5 GDH – H DMD H -1 o c D D MD 2 0 D H G am o T – M HM o G DHB S3NG Hi y u i H M DHM o tam f hjlS2 y u h y u nd g w D H G i o h cT 4N MD D HM E M am R hjl4N MD T sU) M D 5 Mf M D 5 Me DG H D H 2 N D H M HM D o mV 0 MC/ MD H5 Me G HMD BND e v o pr V D H 5 Me D H A MN G o V D MD H5 Me0 M GDHD MD f M C MD D H G o
  98. 98. lg 12 t K) (L • C ACC A A A A 3 / 2 A D A 0A – 6A 3 / 2 A D A 63/2 W a C CSC A CWo V v wV -CA D C A CT C C CW fe A sr i wSN VPRI – sign words SOTA 63/2W w u WcnV I -CA D 3 6A CT3 0 CdW A sr i wI v w mgPR63/2 a
  99. Slow drift phenomena • Embedding features drift • Hard negative examples
  100. 100. l Vc 3 S) (T ( • G 0 F CB 8 2CA B – 8 G A 1- / 8BB B e geD C FF cG G F CB m B FG8B B G CBdG G G Fm A G G G C m 8 G A a b G e m gil – 1- / F 8BB B tyRv Rod d a NFD b m il tyRv Roe lP 1 CA G D 8G afrup bnx R m nwxsxwR b e m V G D 8G af l b G e m gN D FD G 8DD B d e- B B m V
  101. 101. wVd T , 34 R (S ) • E 9 (- D 9D D 9 1EC D – /E D D J Ub EEC 9 E (- E A EL E A C e r D E D mjln ac sO – (- ihjl ( ig 2 ue xO /E D D D 9D D E A ED EDeoT ye O D J Ub r 09 E N2 A EL C e s t P sv x
  102. 102. wrRb )0 I • 0 1 ) 1 C (A1 55 – O TaN NIds – ) 1 C O x mcTlO P t – u d O j d o ) 1 C d V a /5 I h i 2 53 I e 2 53 5 13 IdnV) 1 CN Ndy
  103. https://arxiv.org/pdf/1911.09070.pdf
  104. http://openaccess.thecvf.com/content_CVPR_2020/papers/Feichtenhofer_X3D_Expanding_Architectures_for_Efficient_Video_Recognition_CVPR_2020_paper.pdf
  105. 105. TgWM 12 • 8 D 1 C D – iro tn a h V – 1 C F D A E D Ec • d sIo – a iro pl NL c30 a I Il 2 / Ea rlzI Oe Il 3 -ADD 2 A 1CA AD 20 1AA Oea h L iro s C E -ADD h M H Il a 20 1AA c V H P Tg tn H L m dM ca R c I h S H
  106. 106. g R - 4 P • /G HF2 , BB – b oOmlnp – h Sf “h a 0 . B /G HF2 , BB 1 G : B , : C GA G / G B 39 : /G HF - 4 L “h aoOmlnp c eL npr cfsni kh t fN oOmlnp V j FH G HF
  107. https://www.actiongenome.org/
  108. 108. f , 5 V W • 1.M0- – hsv ce dx - g • tmwq j rikw b z g e • 2G 0A 5 GDMLAGFa “ou vwpln – ,GD : mt • LLH GD : GG D G A :D 6 G C.IC 3 / 8 L 6 6 ALG L D R 1.M0- 3MDLA 2 D AP D DA F 1 HDA AL .MF LAGF G 0A 5 GDMLAGF - 0M F -A ALA LAGFS , 5 LLH MF MC ALG AL M: AG 1.M0-
  109. 109. r e 3/0 N • C - B G 7 AC D 7 AC C 7 A – pxlaSP g h b c s ib /AD . b Tt d f • /AD . bm V w aSVdg h v A n o f RZ “f 1 C - B G 7 AC D 7 AC ,C 7 A F B F 2C 1 B 1 7 A 3/0
  110. 110. Zj a 3/0 R S • CA A B . 8 F – ” c ec p • ic da k“ • C B B C k ” Ti 1 CI A CA A B . 8 F B A 0 B B C 1 , 1 D / C F 3/0 a1 B F B8 , A R1 ,S Wg ” k ” “ bVT n bfh Pb “ c k v sc 20 Wgc c cuxlo vrs t bah iO
  111. 111. ic • 49 1 3/ 13 1 2 / 4031 143 – e gt r w nl • R Pu TV d C o Ps • aF r w C
  112. 112. D F ea • 2 310 51/ 1 1/ – lf ikjg lf – / C PR cd V O b
  113. 113. S FN 0- D E • 1 / / 6 1 – it sc C m g – e V C fL g R – 2 2 lP noP r / / 3 1 6P d ap
  114. 114. • 0 1 3 7 2 – bT A 0 1 0202/ – dR P dR a dRCe g V
  115. 115. n P S 0 us • / 3 ?8 -38 -? 3 32 3 1 0 ?18 1 – xl Uwr f gt a o iva U iv eag Ti k – fd oUpIeagTi V wr H C Ti RU 3 3 wr H
  116. 116. 9 - D0 2 2 - C 9 1
  117. Timeline (1960, 1970, 1980, 1990, 2000, 2010, 2020): Perceptron, Neocognitron, Backprop., CNN, HOG/SIFT, BoF/SVM, Deep Learning; 1st AI, 2nd AI, 3rd AI • F. Rosenblatt et al. “Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms” in 1961. • K. Fukushima, “Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position”, in 1980. • Rumelhart et al. “Learning representations by back-propagating errors” in Nature 1986. • Y. LeCun et al. “Gradient-based learning applied to document recognition” in IEEE 1998.
  118. 118. .66 y - i s • AL AKNL I 6A CIENL I - IP6AN – AL AKNL I • tv fkegos zwu ab – GNE G SAL AL AKNL I • AL AKNL I w – 6A CIENL I - IP6AN • “d t mehp u c x AL AKNL I lYhnirs 6A CIENL I 3 0 F MDE U6A CIENL I 9AGB LC IE EIC 6A L G 6AN LF AG B L A D IEM B NNALI 8A CIENE I :I BBA NA S 9DEBN EI MENE I V ,E G -S AIANE M KK DNNKM L NI LC L I K GE K KALM 0 F MDE K B - IP G NE I G 6A L G 6AN A- I AN G U1L EAIN , MA A LIEIC KKGEA N . AIN 8A CIENE I V 2/// DNNK S II GA I AR K GEM K B G A I K B
  119. 119. 1 0 kp l R % • 4 9 80d U ihf x – .E N - g 4 9 80 % • % s r Ywt u r Pn494R hoela B , EC B I G BC D : BCDL E I C A – / EC @ 7IG A CG 8 : 931 1IG GL d1227 U S V
  120. 120. -- . N ! --C D – /31-1 IC V – 8 73 4 !! /31 71 8:3! 4 !!0 : 410 2 08 ! ×
  121. http://fungai.org/images/blog/imagenet-logo.png • https://www.ted.com/talks/fei_fei_li_how_we_re_teaching_computers_to_understand_pictures/up-next?language=ja
  122. https://www.nextplatform.com/2015/03/18/nvidia-tweaks-pascal-gpus-for-deep-learning-push/
  123. 123. 177e 0 epuv , ( + • e z )V + – )w mik l age x] [ – d e 9 KC 7 L H D / P7 L 5 CR NKD 46 90 227 L C HG G 46 90 ) 2HHA6 7 L R A 46 90 ) 0 89 9 K7 L 3 46 90 0 89 + 46 90 CGG S16e + rnpS I sote ILSVRC2014 winner 22 46 90 CGG S y ( h
  124. 124. 3:: lnZ2 ehif - (- , • p tukZ lk – R: Sms adbcg • R: S 3 NR : S 4: S – lkZrok : T BL 0 DI S DST B DI 4GG D NS: S • :0 : S 4GG D NS: S 4: S 4 CL D J. Hu et al. “Squeeze-and-Excitation Networks, N 097 . ISSPR/ B H BCR , . , S BL 0HH HBS E R ETBL BNRG MBS NR G 3 P : T BL : SV R N 2 , ISSPR/ B H BCR , . , R: S 5 6TBNH S BL 3 NR LX 2 NN DS E 2 N LTS NBL : SV R N 2 , ISSPR/ B H BCR - ..( 3 NR : S 1 A PI S BL 8 B N NH BNRG BCL 0 DI S DST R G DBLBCL 7MBH D HN S N N 2 - ISSPR/ B H BCR , , , :0 : S 2 8 T S BL H RR : T BL 0 DI S DST B DI N 2 - ISSPR/ B H BCR , . :0 : S 9 BN S BL 4GG D NS: S/ SI N NH 9 E L DBL NH G 2 N LTS NBL : T BL : SV R N 7298 . ISSPR/ B H PEG . .) PEG 4GG D NS: S
  125. 125. 1 gU ] V-&), (, • s [ Y neVmt rU m r – idha. : 2 OP&2 OP N : 5 1 – l. 2 F P P – O L 5 LFQ F . mtock 9/ OQ 1 F – m r. S OPN LR PO )1 LR ( 1 LR Person Uma G S LD LT O 8: +A : 3 NOG CI 8: A 2 5 LF 8: +A S PN LT L 48 A
  126. 126. 5 o a4Cox yd( /e ()0 Hito Uma 9HH SPRL FCPVSH 4C @ (G 2 H3VVXY 7HXY @ 4 F8P XOP R 44C(,G @ VVSP N SYP YHXR VXX 7HXYL @ 4 F@L A(,G @ @ 4 dju 9H HMYL I L Y LXXe f ” L XOVY 5LYL YV nr nrd” v klle E [( [) [ F@L TV 4C @(- 4C @(. H DP[(/G L XOVY LYL YV M SS V L Y SH L HYLXY 2SNV PYOT ” HXR @ 4 F9L 44C(.G @V 2SPN 5LY ALN IIV] XLNTL YHYPV o i n n tms HXR @ 4 v jh +( /2 1 A4 4 IIV]o i n tms@LYP H LYv gtoi wb + /2 1 A4 4 AA5 F P 644C(-G L XOVY LYL YV 2 OV 3V] 9H HMYL MLHY L o 9 8 F5HSHS 4C @ ,G AC 47 FDollár 3 C4 0G AVMY HX H L 5 FFelzenszwalb B 2 ()G HYL Y AC • o ud) (c) (.e @ 4 F8P XOP R 4C @(+G ALSL YP[L ALH O 4
  127. 127. 3 h 2Bhrvwsa ,b h la , b 1 MRKL I EP 0R 0REP VMV S EPI 7R E MER I MR FNI 3I I MSR MR 2B , L TV. E M S K EFV + , ,- 7 1 MRKL I EP 7 4 . 4 M MIR :XP M EPI A EMRMRK MR IX 7 , L TV. TETI V RMTV TETI , VRMTI I M MIR QXP M V EPI EMRMRK TH 7 4 > DLES I EP : 3I . 0 MRKPI LS FNI 3I I S FEVIH SR :XP M 9I IP 5IE X I EQMH I ZS O MR 0007 - L TV. E M S K EFV , : 3I D CMRK M I EP FNI V EV SMR V MR E CM - L TV. E M S K EFV - +, 2IR I I 9EZ I EP 2S RI I . 3I I MRK FNI V EV EM IH 8I TSMR V MR 422B , L TV. E M S K EFV , , 2S RI I 8 3XER I EP 2IR I I . 8I TSMR A MTPI V S FNI 3I I MSR MR 722B - L TV. E M S K EFV - , ,- 2IR I I :EVO 2 + / 2IR I I + ig d e “fk m x h u t a bm y cj onpm y onpm
  128. 128. / a a T (+U ( • s l k t s Sp • ls g – hl t S b R – g ] ] • nS St 0 ID 78 ) d dPv CD CN 3CI HH L F ) CN 8 IICACLDCL 52 -2 ) Pv S i o r 5 M 8 1C 2 8 2 -HFDI /CN CD k t s Sp at lg [ X g ] /CCK A ( CI 7-52 /FH NC I a V bs l k t s Sp g e
  129. https://visualqa.org/ • Show and Tell: A Neural Image Caption Generator
  130. 130. /33 N.8 mv nP )+ )) • sluP3 / C DC .DC – (/, D .DC 3 Pow – (/ slu, .33 1 2P r g R cb – )/, )/ .33Pl iS Tb VbS – ( /, :A .DC P“ l i fp t )/ a aP.)/ )/ 3 0)/ )/oeui a arkp P)/ / C A L- .AD 1DD D D A .DC DAI DC D - DC D C DC M C .84 ( +
  131. https://arxiv.org/abs/1705.00754 • https://arxiv.org/abs/1710.06236 • https://arxiv.org/abs/1906.05571
  132. GAN https://medium.com/@sunnerli/the-missing-piece-of-gan-d091604a615a • BigGAN https://arxiv.org/pdf/1809.11096.pdf
  133. 133. 288i [1 ix y # ,] • 3/8i g l 3/8 p u i3/8] • A3 GNN X 84: B R .##RCRGS PKR #RCRGS# IGPGSC KWG C WGS CSKCN PG R 213/8 j j i ] • A C S 416 B R .##CS KW SI#CD # :K :K RK GN af e1 P K K PCNg3/8] • A4 NC 1 : B R .##CS KW SI#CD # 1Z NG3/8 RK RK i gb ] • A V 411 B R .##CS KW SI#R # - R /13/8 vs k h bdr wot fbc] • A9 GSC 4176 B R .##CS KW SI#CD # - , 3/8# 83/8 ] • A/SL W MZ 4176 B R.##RS GG KPI ONS RSG #W #CSL W MZ C ON • A7KZC 416 ,B R .##CS KW SI#CD # , - :33/8 ] • A5CSSC 416 ,B R .##CS KW SI#CD # - , GN / GP K P 3/8 nv t m ] • A CPI CS KW , , ,B R .##CS KW SI#CD # , , , - 0KI3/8 3/8] • A0S M 416 -B R .##CS KW SI#CD # , - - , ei
  134. Razavi et al. “Generating Diverse High-Fidelity Images with VQ-VAE-2,” NeurIPS, 2019.
  140. 140. https://arxiv.org/abs/2002.05709
  143. 143. https://venturebeat.com/2018/05/02/facebook-is-using-instagram-photos-and-hashtags-to-improve-its-computer-vision/
  145. 145. https://chainer.org/images/logo.png http://pytorch.org/docs/master/_static/pytorch-logo-dark.svg https://www.tensorflow.org/_static/images/tensorflow/logo.png
  146. 146. https://commons.wikimedia.org/wiki/File:TSUBAME_3.0_PA075096.jpg
  149. 149. http://openaccess.thecvf.com/content_CVPR_2019/papers/Giancola_Leveraging_Shape_Completion_for_3D_Siamese_Tracking_CVPR_2019_paper.pdf
  150. 150. H. Kataoka, K. Abe, M. Minoguchi, A. Nakamura, Y. Satoh, "Ten-million-order Human Database for World-wide Fashion Culture Analysis", in CVPR 2019 Workshop on FFSS-USAD.
  151. 151. “Scaling Egocentric Vision: The EPIC-KITCHENS Dataset”; “AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions”; M. Monfort et al. “Moments in Time Dataset: one million videos for event understanding,” in arXiv pre-print 1801.03150, 2018. http://moments.csail.mit.edu/ ; “HACS: Human Action Clips and Segments Dataset for Recognition and Temporal Localization”
  157. 157. https://github.com/cvpaperchallenge/ECCV2018_Survey/blob/master/ECCV2018_Survey.md http://xpaperchallenge.org/cv/survey/cvpr2019_summaries/
  159. 159. x = {cv, nl, robot}; robotpaper.challenge
