3AMIGAS - Paper5: Feifei Huo
1. Detection, Tracking and Recognition of Human Poses
for a Real-Time Spatial Game
Feifei Huo, Emile Hendriks, A.H.J. Oomes, Pascal van Beek, Remco Veltkamp
Presenter: Feifei Huo
Information and Communication Theory (ICT) Group
Delft University of Technology
June 16, 2009
2. Outline:
• Introduction to visual analysis system
• People detection, tracking and pose recognition system
– Human body detection and body parts segmentation
– Feature points representation and tracking
– Pose recognition
• Experimental results and conclusion
• Spatial game application and future works
3. Introduction to Visual Analysis System
Video-based applications:
1. virtual reality
2. smart environment systems
3. sports video indexing
4. advanced user interfaces
→ Pose-Driven Spatial Game
5. The state of the art:
• combining bottom-up and top-down approaches.
• incorporating appearance, kinematic, temporal constraints, etc.
The proposed system:
• real time system
• a variety of poses
• spatial game control
Fig.1. The flowchart of the proposed system
6. People Detection, Tracking and Pose Recognition System
Video Sequence → People Detection → People Tracking → Pose Recognition → Spatial Game

Initial Frame → Whole Human Blob Detection → Different Body Parts Segmentation
7. Methodology
• Background subtraction
– Mixture of Gaussians
• Head and torso detection and tracking
– 2D upper-body model
Area(F) = Area(B)
Fig.2. (a) Foreground binary image of the initial frame, (b) 2D upper-body model for human torso detection and tracking.
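The background-subtraction step above can be sketched as follows. For brevity this uses a single running Gaussian per pixel rather than the full Mixture-of-Gaussians model of the slides, and all names and parameters (`alpha`, `k`) are illustrative:

```python
import numpy as np

def update_background(mean, var, frame, alpha=0.05):
    """Running single-Gaussian background model (a simplified stand-in
    for the Mixture-of-Gaussians model used in the system)."""
    mean = (1 - alpha) * mean + alpha * frame
    var = (1 - alpha) * var + alpha * (frame - mean) ** 2
    return mean, var

def foreground_mask(mean, var, frame, k=2.5):
    """A pixel is foreground if it deviates more than k sigma
    from the background model."""
    return np.abs(frame - mean) > k * np.sqrt(var)
```

The resulting binary mask corresponds to the foreground image of Fig.2(a), on which the 2D upper-body model is fitted.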
8. Particle Filtering
$\{ s^{(n)},\ n = 1, 2, 3, \dots, N \} \;\rightarrow\; P(B \mid A = s^{(n)}) \;\rightarrow\; \{ \pi^{(n)},\ n = 1, 2, 3, \dots, N \}$
9. People detection and tracking
• A sample set $\{ s^{(n)}, \pi^{(n)},\ n = 1, 2, \dots, N \}$ is generated with an initial distribution $s^{(n)} = p^{(n)} = ( x^{(n)}, y^{(n)}, \mathrm{scale}^{(n)} )$.
• Then the observation step takes place:
$P(B \mid A = s^{(n)}) = \omega^{(n)} = \frac{1}{\mathrm{Area}(F)} \times \begin{cases} \sum F^{(n)} - \sum B^{(n)}, & \text{if } \sum F^{(n)} > \sum B^{(n)} \\ 0, & \text{otherwise} \end{cases}$
10. People detection and tracking
• This observation is updated by taking the prior weight into account:
$\omega_t^{(n)} = \omega^{(n)} \times \frac{\pi_{t-1}^{(n)}}{\sum_{n=1}^{N} \pi_{t-1}^{(n)}}$
• The normalized observation forms a new set of particle weights:
$\pi_t^{(n)} = \frac{\omega_t^{(n)}}{\sum_{n=1}^{N} \omega_t^{(n)}}$
Fig.3. 2D upper-body model for human torso detection and tracking.
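The three weighting steps on the last two slides (observation, prior update, normalization) can be sketched in a few lines; the array names are illustrative, not the paper's API:

```python
import numpy as np

def particle_weights(fg_counts, bg_counts, area_f, prior):
    """One weighting step of the particle filter from the slides.

    fg_counts[n], bg_counts[n]: foreground/background pixel sums inside
    particle n's upper-body model; area_f: model area Area(F);
    prior: the previous weights pi_{t-1}.
    """
    # Observation omega^(n): positive foreground excess, else zero.
    omega = np.where(fg_counts > bg_counts,
                     (fg_counts - bg_counts) / area_f, 0.0)
    # Incorporate the normalized prior weights pi_{t-1}.
    omega_t = omega * prior / prior.sum()
    # Normalize to obtain the new particle weights pi_t.
    return omega_t / omega_t.sum()
```

For example, a particle whose model window covers far more foreground than background ends up dominating the weight set.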
11. Methodology
• Hand detection and tracking
– Foreground pixels are segmented into skin-color and non-skin-color regions:
$\left| \arctan\frac{B}{R} - \frac{\pi}{4} \right| < \frac{\pi}{8}, \quad \left| \arctan\frac{G}{R} - \frac{\pi}{6} \right| < \frac{\pi}{18}, \quad \left| \arctan\frac{B}{G} - \frac{\pi}{5} \right| < \frac{\pi}{15}$
– The face is excluded from the candidate hand regions by using the size of the connected skin-color area.
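The three angular skin-colour conditions can be checked directly per pixel. `is_skin` is an illustrative name, and the inputs are assumed to be positive RGB channel values:

```python
import math

def is_skin(r, g, b):
    """Skin-colour test from the slides: the arctangents of the
    B/R, G/R and B/G channel ratios must each lie inside a fixed
    angular band around pi/4, pi/6 and pi/5 respectively."""
    return (abs(math.atan(b / r) - math.pi / 4) < math.pi / 8 and
            abs(math.atan(g / r) - math.pi / 6) < math.pi / 18 and
            abs(math.atan(b / g) - math.pi / 5) < math.pi / 15)
```

Connected components of skin pixels then form the candidate hand regions, with the (largest) face component excluded by size.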
12. People Detection, Tracking and Pose Recognition System
Video Sequence → People Detection → People Tracking → Pose Recognition → Spatial Game

Multiple Views → Feature Points Location
Subsequent Video Frames → Feature Points Tracking
13. Torso and Hand Segmentation
Fig.4. Results of torso and hand segmentation
14. 3D Reconstruction
• Three synchronized cameras are used.
– One front view
– Two side views
• The 3D positions of torso and hands can be obtained.
Fig.5. Multiple camera settings
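A standard way to obtain 3D positions from synchronized, calibrated views is linear (DLT) triangulation; the slide does not name the exact method used, so this is a generic sketch with two of the three cameras:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen in two calibrated
    cameras. P1, P2: 3x4 projection matrices; x1, x2: (u, v) image
    coordinates of the same torso/hand point in each view."""
    # Each view contributes two linear constraints on the homogeneous 3D point.
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    # The solution is the right singular vector for the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

With the front and side views of Fig.5, this recovers the 3D torso and hand positions used by the pose recognizer.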
15. People Detection, Tracking and Pose Recognition System
Video Sequence → People Detection → People Tracking → Pose Recognition → Spatial Game

Predefined Key Poses → Classifier Construction → Pose Recognition
16. Pose Recognition
• Feature space construction
2D and 3D positions of the torso center and the hands
→ relative positions between hands and torso center
→ normalized feature space
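A minimal sketch of this normalized feature construction, assuming the torso scale is available from the tracker (all names are illustrative):

```python
import numpy as np

def pose_features(torso, left_hand, right_hand, torso_scale):
    """Feature vector: hand positions relative to the torso centre,
    divided by the torso scale so the features are invariant to the
    user's position in the frame and body size."""
    t = np.asarray(torso, dtype=float)
    return np.concatenate([(np.asarray(left_hand) - t) / torso_scale,
                           (np.asarray(right_hand) - t) / torso_scale])
```

The same construction applies to the 3D positions; the relative, scale-normalized coordinates are what the pose classifiers consume.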
17. Predefined Key Poses
Pose Classification
• 9 key poses, one class per pose
• 15 persons
• 1515 samples in total
18. Results and Discussion
Cross-validation results of pose classifiers (mean errors with standard deviation)
method       LOPO mean err.  LOPO max err.  FORO mean err.  FORO max err.
NMC          0.06 (0.09)     0.18 (0.35)    0.04 (0.02)     0.09 (0.10)
LDC          0.06 (0.07)     0.14 (0.35)    0.01 (0.01)     0.04 (0.05)
QDC          0.10 (0.11)     0.23 (0.34)    0.01 (0.01)     0.04 (0.06)
LDA+QDC      0.07 (0.09)     0.16 (0.35)    0.02 (0.01)     0.04 (0.06)
Parzen       0.07 (0.09)     0.16 (0.35)    0.01 (0.01)     0.02 (0.04)
LDA+Parzen   0.06 (0.07)     0.14 (0.35)    0.00 (0.00)     0.01 (0.03)
Conclusion: the simplest method (NMC) provides comparable
performance to more complex classifiers.
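The nearest-mean classifier (NMC) singled out in the conclusion is indeed very simple; a minimal sketch:

```python
import numpy as np

class NearestMeanClassifier:
    """Minimal nearest-mean classifier (NMC): each pose class is
    summarized by the mean of its training feature vectors, and a
    sample is assigned to the closest class mean."""

    def fit(self, X, y):
        self.labels = np.unique(y)
        self.means = np.array([X[y == c].mean(axis=0) for c in self.labels])
        return self

    def predict(self, X):
        # Euclidean distance from every sample to every class mean.
        d = np.linalg.norm(X[:, None, :] - self.means[None, :, :], axis=2)
        return self.labels[d.argmin(axis=1)]
```

Its low cost is what makes it attractive for a real-time system when its accuracy is close to that of the heavier classifiers in the table.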
19. Results and Discussion
Confusion matrix of the nine poses (rows: true labels, columns: estimated labels)
True \ Est.   P1   P2   P3   P4   P5   P6   P7   P8   P9
P1           198    0    0    0    0    0    0    0    0
P2             0  193    0    0    0    0    0    0    0
P3             2    0  157    0    0    0    0    0    0
P4             0    0    0  159    0   20    0    0    0
P5             1    0    1    0  164    0    2    0    0
P6             2    3    6    0    0  129    0    0    0
P7             0    0    1    0    3    0  164    0    0
P8             0    0    9    0    6    0    1  162    0
P9             0    0    5    3    0    0    0    0  133
Conclusion: most of the poses are recognized very well. However, there is considerable confusion between Pose 4 and Pose 6 (20 Pose 4 samples are classified as Pose 6).
20. People Detection, Tracking and Pose Recognition System
Video Sequence → People Detection → People Tracking → Pose Recognition → Spatial Game

Pose → Color Control
Location → Position Control
22. Application: Spatial Game
• Real-time application: 20 frames/second (PRSD Studio, http://prsysdesign.net/)
• Robust to different environments: tested in different indoor settings
• Adapts to different users: tested with various users
23. Future Work
• Improve the robustness of the system
– better skin-colour detection, more robust feature detection
• Develop multiple-user applications
– solve the occlusion problem