SlideShare une entreprise Scribd logo
1  sur  20
Télécharger pour lire hors ligne
Knowledge Augmented
     Visual Learning


           Qiang Ji

Rensselaer Polytechnic Institute

       qji@ecse.rpi.edu
                                   1
Motivation
• Machine learning (ML) is playing an
  increasingly important role in computer vision.
• As an enabler for computer vision, it allows
  automatically extracting pattern from the data,
  a significant progress over traditional hand-
  crafted AI-based knowledge acquisition
  models
• Current wisdom: powerful image features +
  large amount of data+ advanced learning
  techniques is the solution to CV ?
                                             2
Motivation (cont’d)
• Current ML methods are mostly data-driven, and
  they are brittle, lack of robustness, and cannot
  generalize well when the training data is
  inadequate in either quality or quantity.
• Current ML learning methods cannot lend
  themselves easily to exploit the readily available
  prior knowledge.
• Prior knowledge is essential to alleviating the
  problems with data and to regularize the ill-
  posed vision problems.
                                                   3
Knowledge-Augmented
            Visual Learning
• Identify the related prior knowledge from
  different sources
• Use the Probabilistic Graphical Models (PGM)
  to capture and encode such knowledge
  systematically and automatically to produce a
  prior model
• Combine the prior model with image
  measurements (features) in a principle manner
  to perform visual understanding
                                              4
Sources of Knowledge
• Permanent theoretical knowledge
  – Various theories or principles or laws that govern the properties and
    behavior of the objects (e.g physics for body tracking)
  – Tend to be generic, applicable to different objects and different
    situations, but hard to capture
• Subjective and experiential knowledge (expert)
  – Knowledge gained from experience based on long time observations
  – Tend to be qualitative, inexact, and approximate
• Circumstantial and contextual knowledge
  – Auxiliary information or context that is available during training or testing
• Temporary-statistical pattern-based
  – Tend to be object, situation or database specific
  – widely used in CV.                                                      5
Methods for Knowledge
  Representation and Encoding
• Convert knowledge into constraints on parameters
  or structure of the PGM
  – Model learning can then be formulated as constrained
    ML/EM (either closed form or iterative )

• Numerically sample the knowledge to generate
  pseudo-data
  – Propose a MCMC sampling approach to efficiently
    explore the parameter space to acquire samples that
    satisfy the knowledge.
  – Encode the knowledge by the distribution of synthetic
    samples
  – Combine the real data with the pseudo-data to train the
                                                          6
    model
Knowledge Representation
          MCMC Sampling
– Determine the valid range for each parameter
– Generate new sample in the valid parameter space,
  using the proposal distribution



– Reject samples inconsistent with the knowledge
– Repeat until enough samples are collected

The proposal distribution allows efficiently exploring the parameter
space by associating high probability for unexplored regions to
                                                                  7
produce representative samples.
Facial Action Recognition
                  (Tong and Ji, CVPR07, PAMI07, and PAMI 10)

  Facial Action Units (AUs) capture the non-rigid muscular activities
  that produce facial appearance changes (defined in Facial Action
  Coding System)
• Each AU is related to the contraction of a set of facial muscles.
  A small set of AUs can describe a large number of facial behaviors




(a) A list of AUs and their interpretations       (b) Muscles underlying facial AUs
                                                                              8
AU Knowledge
– Positive and negative causal influences
   • Mouth stretch increases the chance of lips apart; it decreases the chance
     of cheek raiser and lip presser.
   • Cheek raiser and lid compressor increases the chance of lip corner puller.
   • Outer brow raiser increases the chance of inner brow raiser.
   • Upper lid raiser increases the chance of inner brow raiser and decreases
     the chance of nose wrinkler.
   • Lip tightener increases the chance of lip presser.
   • Lip presser increases the chance of lip corner depressor and chin raiser.
– Group AU constraints
   • Group of AUs happen together or never happen together to produce a
     meaningful or spontaneous expression due to underlying facial anatomy
– Dynamic knowledge
   • Each AU evolves smoothly over time                                    9
   • Dynamic dependencies among AUs
Positive and Negative Influences




  For an AUi with positive influence by its parent node AUjP(AUi =1| AUj
  =1)>P(AUi =1| AUj =0)

  For an AUi with negative influence by its parent node AUj                10
  P(AUi =1| AUj =1)<P(AUi =1| AUj =0)
AU Prior Model Learning
• Use a DBN to encode the knowledge on
  the relationships among AUs

• Convert the knowledge into constraints on
  DBN or into pseudo-data

• Learn the DBN with both pseudo and real
  data under constraints
                                            11
The Learnt DBN for AU
Relationship Modeling

                        • Solid line: spatial
                          relationship among AUs
                        • Self-arrow: temporal
                          evolution of a single AU
                        • Dashed line from time t-
                          1 to time t: temporal
                          relationship between two
                          different AUs



 AU1*.. N = arg max P( AU1.. N | OAU1.. N )
           AU1.. N                            12
AU Recognition Results




                         13
Human Body Tracking
• Goal: Recover the 3D upper-body pose           given the image
  observation .




                                             2
                                     3




                                                       5
                                                   6
                                             1
 O: Image observation               S : 3D upper-body pose
 from multiple views
  The pose state is represented as the joint angles among the six rigid
  body parts:

                                                                      14
Our Approach
• Bayesian Approach
  – Pose estimation is interpreted as the maximization of the
    posterior probability:                    .

  – Based on Bayes rule, the posterior can be factorized as


              Image likelihood   Prior model of the body pose

   A good prior model can handle the uncertainty and
   ambiguity of the image observation


                                                                15
Human Body Pose Prior Model
 We construct a Bayesian Network (BN) to model the
 prior probability of upper body pose.


                        2



                                    5
                        1
           4




                                6




• Node :       represent the joint angle.
• Link :           represent the probabilistic relationship (mixture of Gaussians) :



• Probability of body pose :                                                 16
Human Body Knowledge
• Anatomical Constraints
  – Restrict body structure based on anatomy.
    • Connectivity, kinesiology, symmetric, etc.
• Biomechanics Constraints
  – Restrict the body joint angle ranges.
• Physical Constraints
  – Exclude the physically infeasible pose
    • Non-penetrating constraint
• Dynamics Constraints
  – Restrict the body movement                     17
    • movement speed and movement smoothness
Knowledge-driven Model Learning
– Using the pseudo-data and constraints, learn a
  DBN by maximizing the score of the DBN
  structure (B), given pseudo data (D):
                                         d
   Score( B) = P( B) + p ( D | θ B , B) − log( K )
                                         2




                                                     18
Body Tracking Experiment
       Comparison with Model from Training Data.


         Table 1. Result of baseline system (particle filter) on 5 test sequences.




                       Table 2. Results of different models .
BN_Activity is learned from specific activity. BN_HumanEva is learned from 5 activities.
                                                                                     19
BN_CMU is learned from CMU database. BN_C is learned from Constraints.
Conclusions
• Knowledge is a crucial component of visual
  understanding, and that the long-term success of
  computer vision requires a union of domain knowledge
  and the data.
• We advocate for a hybrid approach for machine learning,
  whereby both knowledge and data can be integrated to
  result in a robust and generalizable learning.
• We propose to systemically identify related knowledge
  from different sources that govern the functions,
  properties, and behaviors of the objects being studied
• We propose to use the probabilistic graphical models to
  automatically and systematically capture the related
  knowledge and to combine with image measurements. 20

Contenu connexe

Similaire à Fcv poster ji

Facial emotion detection on babies' emotional face using Deep Learning.
Facial emotion detection on babies' emotional face using Deep Learning.Facial emotion detection on babies' emotional face using Deep Learning.
Facial emotion detection on babies' emotional face using Deep Learning.Takrim Ul Islam Laskar
 
An analysis of_machine_and_human_analytics_in_classification
An analysis of_machine_and_human_analytics_in_classificationAn analysis of_machine_and_human_analytics_in_classification
An analysis of_machine_and_human_analytics_in_classificationSubhashis Hazarika
 
2015EDM: A Framework for Multifaceted Evaluation of Student Models (Polygon)
2015EDM: A Framework for Multifaceted Evaluation of Student Models (Polygon)2015EDM: A Framework for Multifaceted Evaluation of Student Models (Polygon)
2015EDM: A Framework for Multifaceted Evaluation of Student Models (Polygon)Yun Huang
 
A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...
A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...
A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...AnuragVijayAgrawal
 
NIPS2015 reading - Learning visual biases from human imagination
NIPS2015 reading - Learning visual biases from human imaginationNIPS2015 reading - Learning visual biases from human imagination
NIPS2015 reading - Learning visual biases from human imaginationAkisato Kimura
 
PPT ON INTRODUCTION TO AI- UNIT-1-PART-3.pptx
PPT ON INTRODUCTION TO AI- UNIT-1-PART-3.pptxPPT ON INTRODUCTION TO AI- UNIT-1-PART-3.pptx
PPT ON INTRODUCTION TO AI- UNIT-1-PART-3.pptxRaviKiranVarma4
 
photo detection in personal photo collection
photo detection in personal photo collectionphoto detection in personal photo collection
photo detection in personal photo collectionsonalijagtap15
 
Multimodal behavior signal analysis and interpretation for young kids with ASD
Multimodal behavior signal analysis and interpretation for young kids with ASDMultimodal behavior signal analysis and interpretation for young kids with ASD
Multimodal behavior signal analysis and interpretation for young kids with ASDdiannepatricia
 
Fcv bio cv_cottrell
Fcv bio cv_cottrellFcv bio cv_cottrell
Fcv bio cv_cottrellzukun
 
Fcv bio cv_cottrell
Fcv bio cv_cottrellFcv bio cv_cottrell
Fcv bio cv_cottrellzukun
 
[Mmlab seminar 2016] deep learning for human pose estimation
[Mmlab seminar 2016] deep learning for human pose estimation[Mmlab seminar 2016] deep learning for human pose estimation
[Mmlab seminar 2016] deep learning for human pose estimationWei Yang
 
Criminal Detection System
Criminal Detection SystemCriminal Detection System
Criminal Detection SystemIntrader Amit
 
Human Pose Estimation by Deep Learning
Human Pose Estimation by Deep LearningHuman Pose Estimation by Deep Learning
Human Pose Estimation by Deep LearningWei Yang
 
Deep analytics via learning to reason
Deep analytics via learning to reasonDeep analytics via learning to reason
Deep analytics via learning to reasonDeakin University
 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Ian Morgan
 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Bayes Nets meetup London
 
Face recognition using artificial neural network
Face recognition using artificial neural networkFace recognition using artificial neural network
Face recognition using artificial neural networkSumeet Kakani
 
adaptive metric learning for saliency detection base paper
adaptive metric learning for saliency detection base paperadaptive metric learning for saliency detection base paper
adaptive metric learning for saliency detection base paperSuresh Nagalla
 

Similaire à Fcv poster ji (20)

Learning where to look: focus and attention in deep vision
Learning where to look: focus and attention in deep visionLearning where to look: focus and attention in deep vision
Learning where to look: focus and attention in deep vision
 
Facial emotion detection on babies' emotional face using Deep Learning.
Facial emotion detection on babies' emotional face using Deep Learning.Facial emotion detection on babies' emotional face using Deep Learning.
Facial emotion detection on babies' emotional face using Deep Learning.
 
An analysis of_machine_and_human_analytics_in_classification
An analysis of_machine_and_human_analytics_in_classificationAn analysis of_machine_and_human_analytics_in_classification
An analysis of_machine_and_human_analytics_in_classification
 
2015EDM: A Framework for Multifaceted Evaluation of Student Models (Polygon)
2015EDM: A Framework for Multifaceted Evaluation of Student Models (Polygon)2015EDM: A Framework for Multifaceted Evaluation of Student Models (Polygon)
2015EDM: A Framework for Multifaceted Evaluation of Student Models (Polygon)
 
A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...
A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...
A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...
 
NIPS2015 reading - Learning visual biases from human imagination
NIPS2015 reading - Learning visual biases from human imaginationNIPS2015 reading - Learning visual biases from human imagination
NIPS2015 reading - Learning visual biases from human imagination
 
PPT ON INTRODUCTION TO AI- UNIT-1-PART-3.pptx
PPT ON INTRODUCTION TO AI- UNIT-1-PART-3.pptxPPT ON INTRODUCTION TO AI- UNIT-1-PART-3.pptx
PPT ON INTRODUCTION TO AI- UNIT-1-PART-3.pptx
 
photo detection in personal photo collection
photo detection in personal photo collectionphoto detection in personal photo collection
photo detection in personal photo collection
 
Multimodal behavior signal analysis and interpretation for young kids with ASD
Multimodal behavior signal analysis and interpretation for young kids with ASDMultimodal behavior signal analysis and interpretation for young kids with ASD
Multimodal behavior signal analysis and interpretation for young kids with ASD
 
AIISC Projects via Posters at 2022 Retreat.pptx
AIISC Projects via Posters at 2022 Retreat.pptxAIISC Projects via Posters at 2022 Retreat.pptx
AIISC Projects via Posters at 2022 Retreat.pptx
 
Fcv bio cv_cottrell
Fcv bio cv_cottrellFcv bio cv_cottrell
Fcv bio cv_cottrell
 
Fcv bio cv_cottrell
Fcv bio cv_cottrellFcv bio cv_cottrell
Fcv bio cv_cottrell
 
[Mmlab seminar 2016] deep learning for human pose estimation
[Mmlab seminar 2016] deep learning for human pose estimation[Mmlab seminar 2016] deep learning for human pose estimation
[Mmlab seminar 2016] deep learning for human pose estimation
 
Criminal Detection System
Criminal Detection SystemCriminal Detection System
Criminal Detection System
 
Human Pose Estimation by Deep Learning
Human Pose Estimation by Deep LearningHuman Pose Estimation by Deep Learning
Human Pose Estimation by Deep Learning
 
Deep analytics via learning to reason
Deep analytics via learning to reasonDeep analytics via learning to reason
Deep analytics via learning to reason
 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
 
Face recognition using artificial neural network
Face recognition using artificial neural networkFace recognition using artificial neural network
Face recognition using artificial neural network
 
adaptive metric learning for saliency detection base paper
adaptive metric learning for saliency detection base paperadaptive metric learning for saliency detection base paper
adaptive metric learning for saliency detection base paper
 

Plus de zukun

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009zukun
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVzukun
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Informationzukun
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statisticszukun
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibrationzukun
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionzukun
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluationzukun
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-softwarezukun
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptorszukun
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectorszukun
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-introzukun
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video searchzukun
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video searchzukun
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video searchzukun
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learningzukun
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionzukun
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick startzukun
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysiszukun
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structureszukun
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities zukun
 

Plus de zukun (20)

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
 

Dernier

INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdfMr Bounab Samir
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWQuiz Club NITW
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operationalssuser3e220a
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmStan Meyer
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
Unraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptxUnraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptx
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptxDhatriParmar
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Projectjordimapav
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQuiz Club NITW
 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvRicaMaeCastro1
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
week 1 cookery 8 fourth - quarter .pptx
week 1 cookery 8  fourth  -  quarter .pptxweek 1 cookery 8  fourth  -  quarter .pptx
week 1 cookery 8 fourth - quarter .pptxJonalynLegaspi2
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseCeline George
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxSayali Powar
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDhatriParmar
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...DhatriParmar
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfPrerana Jadhav
 
Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1GloryAnnCastre1
 

Dernier (20)

INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdf
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITW
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and Film
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
Unraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptxUnraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptx
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Project
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
week 1 cookery 8 fourth - quarter .pptx
week 1 cookery 8  fourth  -  quarter .pptxweek 1 cookery 8  fourth  -  quarter .pptx
week 1 cookery 8 fourth - quarter .pptx
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 Database
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdf
 
Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1
 

Fcv poster ji

  • 1. Knowledge Augmented Visual Learning Qiang Ji Rensselaer Polytechnic Institute qji@ecse.rpi.edu 1
  • 2. Motivation • Machine learning (ML) is playing an increasingly important role in computer vision. • As an enabler for computer vision, it allows automatically extracting pattern from the data, a significant progress over traditional hand- crafted AI-based knowledge acquisition models • Current wisdom: powerful image features + large amount of data+ advanced learning techniques is the solution to CV ? 2
  • 3. Motivation (cont’d) • Current ML methods are mostly data-driven, and they are brittle, lack of robustness, and cannot generalize well when the training data is inadequate in either quality or quantity. • Current ML learning methods cannot lend themselves easily to exploit the readily available prior knowledge. • Prior knowledge is essential to alleviating the problems with data and to regularize the ill- posed vision problems. 3
  • 4. Knowledge-Augmented Visual Learning • Identify the related prior knowledge from different sources • Use the Probabilistic Graphical Models (PGM) to capture and encode such knowledge systematically and automatically to produce a prior model • Combine the prior model with image measurements (features) in a principle manner to perform visual understanding 4
  • 5. Sources of Knowledge • Permanent theoretical knowledge – Various theories or principles or laws that govern the properties and behavior of the objects (e.g physics for body tracking) – Tend to be generic, applicable to different objects and different situations, but hard to capture • Subjective and experiential knowledge (expert) – Knowledge gained from experience based on long time observations – Tend to be qualitative, inexact, and approximate • Circumstantial and contextual knowledge – Auxiliary information or context that is available during training or testing • Temporary-statistical pattern-based – Tend to be object, situation or database specific – widely used in CV. 5
  • 6. Methods for Knowledge Representation and Encoding • Convert knowledge into constraints on parameters or structure of the PGM – Model learning can then be formulated as constrained ML/EM (either closed form or iterative ) • Numerically sample the knowledge to generate pseudo-data – Propose a MCMC sampling approach to efficiently explore the parameter space to acquire samples that satisfy the knowledge. – Encode the knowledge by the distribution of synthetic samples – Combine the real data with the pseudo-data to train the 6 model
  • 7. Knowledge Representation MCMC Sampling – Determine the valid range for each parameter – Generate new sample in the valid parameter space, using the proposal distribution – Reject samples inconsistent with the knowledge – Repeat until enough samples are collected The proposal distribution allows efficiently exploring the parameter space by associating high probability for unexplored regions to 7 produce representative samples.
  • 8. Facial Action Recognition (Tong and Ji, CVPR07, PAMI07, and PAMI 10) Facial Action Units (AUs) capture the non-rigid muscular activities that produce facial appearance changes (defined in Facial Action Coding System) • Each AU is related to the contraction of a set of facial muscles. A small set of AUs can describe a large number of facial behaviors (a) A list of AUs and their interpretations (b) Muscles underlying facial AUs 8
  • 9. AU Knowledge – Positive and negative causal influences • Mouth stretch increases the chance of lips apart; it decreases the chance of cheek raiser and lip presser. • Cheek raiser and lid compressor increases the chance of lip corner puller. • Outer brow raiser increases the chance of inner brow raiser. • Upper lid raiser increases the chance of inner brow raiser and decreases the chance of nose wrinkler. • Lip tightener increases the chance of lip presser. • Lip presser increases the chance of lip corner depressor and chin raiser. – Group AU constraints • Group of AUs happen together or never happen together to produce a meaningful or spontaneous expression due to underlying facial anatomy – Dynamic knowledge • Each AU evolves smoothly over time 9 • Dynamic dependencies among AUs
  • 10. Positive and Negative Influences For an AUi with positive influence by its parent node AUjP(AUi =1| AUj =1)>P(AUi =1| AUj =0) For an AUi with negative influence by its parent node AUj 10 P(AUi =1| AUj =1)<P(AUi =1| AUj =0)
  • 11. AU Prior Model Learning • Use a DBN to encode the knowledge on the relationships among AUs • Convert the knowledge into constraints on DBN or into pseudo-data • Learn the DBN with both pseudo and real data under constraints 11
  • 12. The Learnt DBN for AU Relationship Modeling • Solid line: spatial relationship among AUs • Self-arrow: temporal evolution of a single AU • Dashed line from time t- 1 to time t: temporal relationship between two different AUs AU1*.. N = arg max P( AU1.. N | OAU1.. N ) AU1.. N 12
  • 14. Human Body Tracking • Goal: Recover the 3D upper-body pose given the image observation . 2 3 5 6 1 O: Image observation S : 3D upper-body pose from multiple views The pose state is represented as the joint angles among the six rigid body parts: 14
  • 15. Our Approach • Bayesian Approach – Pose estimation is interpreted as the maximization of the posterior probability: . – Based on Bayes rule, the posterior can be factorized as Image likelihood Prior model of the body pose A good prior model can handle the uncertainty and ambiguity of the image observation 15
  • 16. Human Body Pose Prior Model We construct a Bayesian Network (BN) to model the prior probability of upper body pose. 2 5 1 4 6 • Node : represent the joint angle. • Link : represent the probabilistic relationship (mixture of Gaussians) : • Probability of body pose : 16
  • 17. Human Body Knowledge • Anatomical Constraints – Restrict body structure based on anatomy. • Connectivity, kinesiology, symmetric, etc. • Biomechanics Constraints – Restrict the body joint angle ranges. • Physical Constraints – Exclude the physically infeasible pose • Non-penetrating constraint • Dynamics Constraints – Restrict the body movement 17 • movement speed and movement smoothness
  • 18. Knowledge-driven Model Learning – Using the pseudo-data and constraints, learn a DBN by maximizing the score of the DBN structure (B), given pseudo data (D): d Score( B) = P( B) + p ( D | θ B , B) − log( K ) 2 18
  • 19. Body Tracking Experiment Comparison with Model from Training Data. Table 1. Result of baseline system (particle filter) on 5 test sequences. Table 2. Results of different models . BN_Activity is learned from specific activity. BN_HumanEva is learned from 5 activities. 19 BN_CMU is learned from CMU database. BN_C is learned from Constraints.
  • 20. Conclusions • Knowledge is a crucial component of visual understanding, and that the long-term success of computer vision requires a union of domain knowledge and the data. • We advocate for a hybrid approach for machine learning, whereby both knowledge and data can be integrated to result in a robust and generalizable learning. • We propose to systemically identify related knowledge from different sources that govern the functions, properties, and behaviors of the objects being studied • We propose to use the probabilistic graphical models to automatically and systematically capture the related knowledge and to combine with image measurements. 20