The document describes a method for measuring pulse rate of a person using video analysis. It discusses using the mean pixel values of red, green, and blue channels in regions of interest of the face to extract pulse signals. Two approaches for selecting regions of interest are described: object tracking and active appearance modeling. The method involves preprocessing signals, applying independent component analysis and filtering, and using Fourier transforms to detect peak pulse rate. Evaluation tests focused on factors like motion, distance from camera, and video duration needed.
Measuring pulse rate from videos using ROI selection and signal processing
1. Measurement of the pulse rate of a person from video
By Sahil Shah
Date: 30-11-2012
2. •Literature review: Previous work has shown that human pulse information can be extracted from the video of a stationary person.
•One such method takes the mean values of the R, G, and B streams over a specific region of interest on the face and plots them over time.
•The analysis was done using Matlab.
4. [Figure: power spectrum (power vs. frequency) of the mean-value signals for the R, G, and B streams. A peak for the green signal can be seen at 1.2 Hz, which corresponds to 72 beats per minute.]
6. [Figure: block diagram of the processing pipeline. Top row: Video, Face Detection, ROI selection, Interpolation of RGB values, Normalize Intensity. Bottom row: Raw RGB signal, Independent Component Analysis, Bayes Filter, Hann Windowing, Processed Signals.]
8. •Two approaches:
1. Object tracking: We use the standard object tracking implementation in MIRA to detect the face. The ROIs are specified in the configuration file of the Pulse Detector unit as subregions of the face. We initially select the largest detected object as the face and subsequently select the object closest to the last detection.
Advantages:
• Faster
• Generalized
Disadvantages:
• Breaks when the first detection is wrong (generally when the face occupies a small area of the image)
• 'Jumping' detections
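The selection rule on this slide (largest object first, then the object closest to the last detection) can be sketched as follows. This is only an illustration in Python, not the MIRA implementation: the box format `(x, y, width, height)`, the helper names, and the idea of expressing a ROI as fractions of the face box are all assumptions.

```python
from math import hypot

def area(box):
    # box is (x, y, width, height) in pixels (assumed format)
    return box[2] * box[3]

def center(box):
    return (box[0] + box[2] / 2.0, box[1] + box[3] / 2.0)

def select_face(detections, last_face=None):
    """Pick the face box among the detections of one frame.

    Without a previous face: the largest detection.
    With a previous face: the detection whose center is closest to it.
    Returns last_face unchanged if the frame has no detections.
    """
    if not detections:
        return last_face
    if last_face is None:
        return max(detections, key=area)
    lx, ly = center(last_face)
    return min(
        detections,
        key=lambda b: hypot(center(b)[0] - lx, center(b)[1] - ly),
    )

def sub_region(face, rel):
    """Map a ROI given as fractions (fx, fy, fw, fh) of the face box to pixels.

    The fractional ROI format is a hypothetical stand-in for the
    configuration file entries.
    """
    x, y, w, h = face
    fx, fy, fw, fh = rel
    return (x + fx * w, y + fy * h, fw * w, fh * h)
```

The sketch also shows why a wrong first detection is costly: every later frame is matched against the previous choice, so an initial error propagates.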
9. 2. Active Appearance Model (AAM): We use the active appearance model algorithm to recognize faces based on multiple features. It returns triangles that delineate different features of the face. We configure the AAM face detector to return some preselected triangles as ROIs.
Advantages:
• More robust to small movements
• Exact ROIs
Disadvantages:
• No generalized model for all kinds of faces
10. •Average the R, G, and B pixel values of the regions of interest on the face for each timestamp.
•Interpolate to get RGB values at the timestamps of the images (since detections arrive slightly later).
•The sampling rate can be changed and need not match the image frame rate, because interpolation can provide intensity values for any timestamp.
•Interpolation also keeps the intervals between samples equal, which increases accuracy.
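A minimal Python sketch of these two steps, assuming linear interpolation (the slides do not state which interpolation scheme is used) and hypothetical function names:

```python
def mean_rgb(pixels):
    """Average the (r, g, b) tuples of all pixels inside one ROI."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def resample(times, values, rate_hz):
    """Linearly interpolate irregular samples onto a uniform time grid.

    times must be strictly increasing and contain at least two entries.
    rate_hz is the virtual sampling rate; it need not match the frame rate.
    """
    step = 1.0 / rate_hz
    n_out = int((times[-1] - times[0]) / step + 1e-9) + 1
    out_t, out_v = [], []
    j = 0
    for k in range(n_out):
        t = times[0] + k * step
        # advance to the input interval [times[j], times[j + 1]] containing t
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        w = (t - t0) / (t1 - t0)
        out_t.append(t)
        out_v.append(values[j] + w * (values[j + 1] - values[j]))
    return out_t, out_v
```

Each color channel would be resampled separately, for example `resample(timestamps, green_means, 10.0)` for a 10 Hz virtual sampling rate.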
11. •Intensity Normalization:
rn = r/(r+g+b)
gn = g/(r+g+b)
bn = b/(r+g+b)
•Independent Component Analysis
•Hann window: Reduces frequency resolution but works better when the signal-to-noise ratio is low.
•Bayes filtering: Kernel with +/-1 bin change (+/-3 bpm for a window of 200 frames at 10 Hz).
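The normalization formulas, the Hann window, and the bin-width arithmetic behind the +/-3 bpm figure can be illustrated with a short Python sketch. The function names and the handling of all-black pixels are assumptions, and the window shown is the symmetric Hann variant, which may differ from the one used in the implementation.

```python
import math

def normalize_rgb(r, g, b):
    """Intensity normalization: each channel divided by r + g + b."""
    total = r + g + b
    if total == 0:
        return (0.0, 0.0, 0.0)  # assumption: map black pixels to zeros
    return (r / total, g / total, b / total)

def hann(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def apply_window(signal):
    """Multiply a signal sample by sample with a Hann window of equal length."""
    return [s * w for s, w in zip(signal, hann(len(signal)))]

def bin_width_bpm(n_frames, rate_hz):
    """Width of one DFT bin in beats per minute: 60 * rate / N."""
    return 60.0 * rate_hz / n_frames
```

For 200 frames at 10 Hz, one bin is 10 / 200 = 0.05 Hz wide, which is 3 bpm, so a kernel that allows a change of one bin per step allows a change of 3 bpm.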
12. •Fast Fourier Transform: Discrete Fourier transform of the processed signals to obtain their power spectra.
•Band-pass filter: A band-pass filter (0.75 to 1.5 Hz, i.e. 45 to 90 bpm) restricts the frequency spectrum to the range in which the human pulse can lie.
•Peak detection: Finds the frequency with maximum power.
•Parabola estimation
•Calculate pulse
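The steps on this slide can be sketched in Python as follows. Several simplifications are assumptions rather than the authors' method: a plain O(N^2) DFT stands in for the FFT, the band-pass step is reduced to searching only the bins inside the band, and the parabola estimation is taken to be the common three-point parabolic fit around the peak bin.

```python
import cmath
import math

def power_spectrum(signal, rate_hz):
    """Return (frequencies, power) for bins 0..N/2 using a plain DFT."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]  # remove the DC offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * rate_hz / n)
        power.append(abs(acc) ** 2)
    return freqs, power

def pulse_bpm(signal, rate_hz, low_hz=0.75, high_hz=1.5):
    """Estimate the pulse in bpm from the strongest in-band spectral peak."""
    freqs, power = power_spectrum(signal, rate_hz)
    band = [k for k, f in enumerate(freqs) if low_hz <= f <= high_hz]
    k = max(band, key=lambda i: power[i])  # peak detection
    delta = 0.0
    if 0 < k < len(power) - 1:
        # fit a parabola through the peak bin and its two neighbours
        a, b, c = power[k - 1], power[k], power[k + 1]
        denom = a - 2.0 * b + c
        if denom != 0.0:
            delta = 0.5 * (a - c) / denom
    return 60.0 * (k + delta) * rate_hz / len(signal)
```

With 200 samples at 10 Hz, a clean 1.2 Hz sinusoid falls exactly on bin 24, so `pulse_bpm` returns a value very close to 72 bpm. For frequencies between bins, the parabolic fit moves the estimate off the bin center.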
13. •The Pulse Detector can be configured through various parameters, such as:
Number of frames
Virtual Sampling Frequency
Regions of Interest
Use AAM
Use ICA
Bayes Filter
Windowing (Hann)
Filter Bands
Parabola Estimation
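As an illustration only, the parameters listed above could be grouped as in the following Python sketch. The real unit is configured through a MIRA configuration file; every field name and default here is hypothetical, except that 200 frames, 10 Hz, and the 0.75 to 1.5 band are the values mentioned on the other slides.

```python
from dataclasses import dataclass, field

@dataclass
class PulseDetectorConfig:
    """Hypothetical grouping of the Pulse Detector parameters."""
    num_frames: int = 200            # number of frames per analysis window
    sampling_hz: float = 10.0        # virtual sampling frequency
    # ROIs as fractions (x, y, width, height) of the face box (assumed format)
    rois: list = field(default_factory=lambda: [(0.25, 0.10, 0.50, 0.20)])
    use_aam: bool = False            # AAM instead of object tracking
    use_ica: bool = True
    use_bayes_filter: bool = True
    use_hann_window: bool = True
    band_hz: tuple = (0.75, 1.5)     # filter band
    use_parabola_estimation: bool = True

    def window_seconds(self):
        """Duration of video covered by one analysis window."""
        return self.num_frames / self.sampling_hz

    def validate(self):
        """Reject settings that cannot produce a meaningful spectrum."""
        low, high = self.band_hz
        if self.num_frames < 2 or self.sampling_hz <= 0:
            raise ValueError("need at least 2 frames and a positive rate")
        if not (0 < low < high <= self.sampling_hz / 2):
            raise ValueError("filter band must lie below the Nyquist frequency")
```

With the defaults, `PulseDetectorConfig().window_seconds()` gives 20.0, matching the 20-second window discussed later in the slides.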
14. •We evaluated the Pulse Detector Unit on the following factors:
Motion vs. stationary
AAM vs. object tracking
Near vs. far (resolution)
Jumping vs. non-jumping detections
Different ROIs
ICA vs. no ICA
16. •The analysis and testing were done in Matlab, while the entire implementation is in C++ using the Middleware for Robotic Applications (MIRA) framework.
18. Which algorithm is the most promising for use?
• The object tracking algorithm currently gives better results.
• The AAM tends to lose detections as movement increases.
• However, a better-trained AAM should be more robust, because it is more accurate and returns the exact ROI, so the effect of small noise becomes negligible.
19. What is the maximum distance of people from the camera at which robust pulse extraction is possible?
• For images of a stationary person taken with the Kinect sensor, we got good results even for a face size of 107x107 pixels in a 640x480 image.
• This was at around 80 cm from the camera.
20. To what degree can people move in the image without losing the pulse observation?
• A well-trained AAM would almost nullify the noise effects. Currently, however, face tracking is not robust to larger noise (>10 pixels), especially when the person is farther from the camera.
21. What is the minimum duration of a video sequence that allows pulse rate extraction?
• 20-second blocks of video are sufficient for pulse rate extraction. We apply a 20-second sliding window continuously for as long as the video is captured.
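The sliding-window behaviour can be sketched in Python as below. The class name and interface are hypothetical; the defaults of 20 seconds and 10 Hz give the 200-sample window mentioned on the earlier slides.

```python
from collections import deque

class SlidingWindow:
    """Keep the most recent samples covering a fixed duration."""

    def __init__(self, seconds=20.0, rate_hz=10.0):
        self.size = int(round(seconds * rate_hz))
        self.samples = deque(maxlen=self.size)  # old samples drop out automatically

    def push(self, value):
        """Add one sample; return the full window once enough data is available."""
        self.samples.append(value)
        if len(self.samples) == self.size:
            return list(self.samples)
        return None
```

Until the first 20 seconds have been collected, `push` returns `None`. After that, every new sample yields an updated window, so the pulse estimate can be refreshed continuously.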
22. [1] W. Verkruysse, L. O. Svaasand, and J. S. Nelson, "Remote plethysmographic imaging using ambient light," Optics Express, vol. 16, no. 26, pp. 21434-21445, Optical Society of America, 2008.
[2] H. Y. Wu, M. Rubinstein, E. Shih, J. Guttag, F. Durand, and W. Freeman, "Eulerian video magnification for revealing subtle changes in the world," ACM Transactions on Graphics (TOG), vol. 31, no. 4, p. 65, ACM, 2012.
[3] M. Z. Poh, D. J. McDuff, and R. W. Picard, "Non-contact, automated cardiac pulse measurements using video imaging and blind source separation," Optics Express, vol. 18, no. 10, pp. 10762-10774, Optical Society of America, 2010.