6. Intel Confidential
Creative* Interactive Gesture Camera
For use with the Intel® Perceptual Computing SDK
Small, light-weight, low-power
Tuned for close-range interactivity
Designed for ease of setup and portability
Includes:
HD web camera
Depth sensor
Dual-array microphone
Sign up to purchase a camera at intel.com/software/perceptual
*Other brands and trademarks may be claimed as the property of their respective owners
8. Intel Confidential
What is Perceptual Computing?
Interactivity Beyond Touch, Mouse and Keyboard …
Facial Tracking
Speech Recognition
Close-range Finger Tracking
Augmented Reality
Close-range Gesture Tracking
Helps Application Developers Implement:
Games
Entertainment
Productivity
Accessibility
Immersive Teleconferencing
Education
Medical / Health
Enterprises
Retail
Industrial
9. Intel Confidential
SDK Usage H/W Requirements
SDK Usage Mode                Speech-Certified Dual-Array Microphones   RGB Webcam   Creative* Camera
Close-range Depth Tracking                                                           X
Speech Recognition            X                                                      X
Face Recognition                                                        X            X
Augmented Reality                                                       X            X
Close-range depth tracking requires Creative camera
Speech requires dual-array microphones OR Creative* camera
2H’13 4th Gen Ultrabook devices are required to have speech
certified microphones
Dell XPS 13* has speech-certified microphones
Facial tracking requires RGB Webcam OR Creative* Camera
Augmented Reality requires RGB Webcam OR Creative* Camera
10. Intel Confidential
Programming Language and Framework
Support
• C++, C#, Java
• Supported Frameworks
– processing
– openFrameworks
– Unity
– Havok
– Total Immersion AR
12. Intel Confidential
PXCSession, PXCImage, PXCAudio, PXCCapture, PXCGesture,
PXCFaceAnalysis, PXCVoice
UtilCapture, UtilPipeline
C#
PXCMSession
PXCMImage
PXCMAudio
PXCMCapture
PXCMGesture
PXCMFaceAnalysis
PXCMVoice
UtilMCapture
UtilMPipeline
pxcupipeline
Unity* Pro
Processing
openFrameworks*
Applications
Core Functionalities
Module Interaction
Additional Language and
Framework Support
SDK API Hierarchy
13. Intel Confidential
SDK files
• Samples are located in:
– $(PCSDK_DIR)bin (executables)
– $(PCSDK_DIR)samples (source code)
– $(PCSDK_DIR)framework (framework and game engines)
– $(PCSDK_DIR)demo
15. Intel Confidential
User Experience considerations
• Reality inspired, not cloning
– Wrapping fingers around objects
• Literal, not abstract
– Universal visual cues – switches and knobs
• Consistency!!!
• Extensible – prepare for future improvements
• Manage persistence
– Sometimes the hand will go out of the view
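One way to manage persistence is to keep the last known hand position alive for a short grace period after tracking drops out. The sketch below is only an illustration of that idea: `HandState`, `PersistentHand`, and the grace-frame count are hypothetical names and values, not part of the SDK.

```cpp
// Hypothetical helper (not an SDK type): holds the last known hand
// position for a few frames after the sensor loses the hand, so that
// on-screen feedback does not flicker when the hand briefly leaves view.
struct HandState {
    float x;
    float y;
    bool  visible;
};

class PersistentHand {
public:
    explicit PersistentHand(int graceFrames)
        : m_grace(graceFrames), m_missed(0) {
        m_last.x = 0.0f;
        m_last.y = 0.0f;
        m_last.visible = false;
    }

    // Call once per frame. `tracked` says whether the sensor saw the hand.
    HandState update(bool tracked, float x, float y) {
        if (tracked) {
            m_last.x = x;
            m_last.y = y;
            m_last.visible = true;
            m_missed = 0;
        } else if (m_last.visible && ++m_missed > m_grace) {
            m_last.visible = false;  // grace period expired
        }
        return m_last;
    }

private:
    int m_grace;        // frames to keep showing the hand after loss
    int m_missed;       // consecutive frames without tracking
    HandState m_last;   // last reported state
};
```

The grace period is a tuning value: too short and the cursor flickers, too long and the application appears to track a hand that has left.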
16. Intel Confidential
Process Overview – prop 1
• Subclass the SDK Pipeline Object
• Enable features in the constructor
• Reimplement OnAlert|Gesture|NewFrame…() methods
• Instantiate the object and call LoopFrames() on it.
17. Intel Confidential
Hello World – C++
class GesturePipeline: public UtilPipeline {
public:
GesturePipeline(void):UtilPipeline(), m_render(L"Gesture Viewer") {
EnableGesture();
}
virtual void PXCAPI OnGesture(PXCGesture::Gesture *data) {
if (data->active) m_gdata = (*data);
switch (data->label) {
case PXCGesture::Gesture::LABEL_NAV_SWIPE_LEFT: break; //do something
case PXCGesture::Gesture::LABEL_NAV_SWIPE_RIGHT: break; //do something
default: break;
}
}
virtual bool OnNewFrame(void) {
return m_render.RenderFrame(QueryImage(PXCImage::IMAGE_TYPE_DEPTH),
QueryGesture(), &m_gdata);
}
protected:
GestureRender m_render;
PXCGesture::Gesture m_gdata;
};
18. Intel Confidential
Process Overview – prop 2
• Declare The SDK Pipeline Object
• Select Features and Initialize
• Acquire A Frame
• Query The Data
• Cleanup
You can also subclass the pipeline object and reimplement OnXXX methods
19. Intel Confidential
Declare The SDK Object – C++/C#
• C++
UtilPipeline pipeline;
• C#
UtilMPipeline pipeline;
pipeline = new UtilMPipeline();
20. Intel Confidential
Select Features and Initialize – C++/C#
• Select Features Using .Enable*() Methods
• Use Init() To Set Features and Enable SDK Access
pipeline.EnableGesture();
pipeline.EnableImage(PXCImage::COLOR_FORMAT_RGB24);
//C# PXCImage.ColorFormat.COLOR_FORMAT_RGB24
pipeline.Init();
21. Intel Confidential
Declare The SDK Object - Frameworks
• Unity
private PXCUPipeline pipeline;
pipeline = new PXCUPipeline();
• Processing
private PXCUPipeline pipeline;
pipeline = new PXCUPipeline(this);
22. Intel Confidential
Select Features and Initialize - Frameworks
• Select Features using PXCUPipeline.Mode enum
• Use Bitwise OR (|) for Multiple Features
• Use Init() To Set Features and Enable SDK Access
pipeline.Init(PXCUPipeline.Mode.COLOR_VGA|
PXCUPipeline.Mode.DEPTH_QVGA|
PXCUPipeline.Mode.GESTURE);
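Combining modes with bitwise OR works when each mode occupies its own bit. The C++ sketch below illustrates the general pattern only; the enum and its numeric values are hypothetical stand-ins and are not the SDK's actual constants.

```cpp
#include <cstdint>

// Hypothetical stand-in for a mode enum. The values are illustrative,
// not the SDK's real constants. Each feature gets its own bit.
enum Mode : std::uint32_t {
    COLOR_VGA  = 1u << 0,
    DEPTH_QVGA = 1u << 1,
    GESTURE    = 1u << 2
};

// True when every bit of `feature` is set in `mask`.
constexpr bool isEnabled(std::uint32_t mask, Mode feature) {
    return (mask & feature) == feature;
}
```

A mask built as `COLOR_VGA | DEPTH_QVGA | GESTURE` can then be tested one feature at a time with `isEnabled`.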
23. Intel Confidential
Capture A Frame
• Poll For A Frame Using AcquireFrame(bool);
– Can be blocking or non-blocking
– AcquireFrame(true) is blocking, AcquireFrame(false) is non-blocking
• Returns true If A Frame Is Available
if(pipeline.AcquireFrame(false))
{
}
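The non-blocking form suits render loops that must stay responsive when no frame is ready. The sketch below mimics that polling pattern with a mock frame source; `FakeFrameSource`, `tryAcquire`, and `drainAvailableFrames` are hypothetical names for illustration and do not exist in the SDK.

```cpp
#include <queue>

// Hypothetical stand-in for a frame source; not the SDK's pipeline class.
class FakeFrameSource {
public:
    // Simulates the camera delivering a frame.
    void push(int frameId) { m_pending.push(frameId); }

    // Non-blocking poll: returns true and hands out one frame if one is
    // waiting, otherwise returns false immediately.
    bool tryAcquire(int &frameId) {
        if (m_pending.empty()) return false;
        frameId = m_pending.front();
        m_pending.pop();
        return true;
    }

private:
    std::queue<int> m_pending;
};

// Processes every frame that is already available and then returns,
// so the caller's UI loop never stalls waiting for the sensor.
int drainAvailableFrames(FakeFrameSource &src) {
    int processed = 0;
    int id = 0;
    while (src.tryAcquire(id)) ++processed;
    return processed;
}
```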
24. Intel Confidential
Retrieve The Data
• Data Is Retrieved via Query*(<T>)
– QueryRGB(), QueryLabelMapAsImage(), etc…
• Unity
Texture2D rgbTexture = new
Texture2D(640,480,TextureFormat.ARGB32, false);
pipeline.QueryRGB(rgbTexture);
• processing
PImage rgbTexture = createImage(640,480,RGB);
pipeline.QueryRGB(rgbTexture);
26. Intel Confidential
SDK Features
• User Tracking
– Hand Detection
– Finger Detection
– Static Pose Detection
– Dynamic Gesture Detection
27. Intel Confidential
User experience on Gestures
• Innate vs. learned gestures
– Aim for lower cognitive load
• Actionable gestures
– What if the user turns the head, drinks coffee…
• Prevent Occlusion!!!
• Left-right gestures easier than up-down
• Grabbed objects
– Make it obvious where objects can be dropped
35. Intel Confidential
Select Features and Initialize – C++
• Only Need To Enable Gestures, Images Optional for
Feedback or Visualization
pipeline.EnableGesture();
pipeline.Init();
36. Intel Confidential
Select Features and Initialize - Frameworks
• Only Need ‘GESTURE’, Images Optional for Feedback or
Visualization
pipeline.Init(PXCUPipeline.Mode.GESTURE);
47. Intel Confidential
Retrieve The Data (C++)
• Data Is Retrieved via QueryImage() And Accessing The
Data Buffers
• Image Is Retrieved as PXCImage, Data Is Accessed Via
PXCImage::ImageData.planes
PXCImage *rgb = pipeline.QueryImage(PXCImage::IMAGE_TYPE_COLOR);
PXCImage::ImageData rgbData;
rgb->AcquireAccess(PXCImage::ACCESS_READ, &rgbData);
//Data can be loaded from rgbData.planes[0]
rgb->ReleaseAccess(&rgbData);
48. Intel Confidential
Retrieve The Data (C#)
• Image Data Can Be Retrieved Via QueryBitmap()
System.Drawing.Bitmap rgb;
PXCMImage rgbImage =
pipeline.QueryImage(PXCMImage.ImageType.IMAGE_TYPE_COLOR);
rgbImage.QueryBitmap(pipeline.QuerySession(), out rgb);
49. Intel Confidential
Retrieve The Data - Frameworks
• Image Data Can Be Retrieved Via QueryBitmap()
System.Drawing.Bitmap rgb;
PXCMImage rgbImage =
pipeline.QueryImage(PXCMImage.ImageType.IMAGE_TYPE_COLOR);
rgbImage.QueryBitmap(pipeline.QuerySession(), out rgb);
50. Intel Confidential
• Track any 2D planar surfaces
– Position, orientation and other
parameters
• Track limited 3D objects
– Based on 3D models
• Track face orientation
SDK Features
53. Intel Confidential
class PXCDVTracker: public PXCBase
{
enum TargetType
{
TARGET_UNDEFINED,
TARGET_PLANE,
TARGET_OBJECT3D,
TARGET_FACE,
TARGET_PLANEBLACKBOX,
TARGET_MARKER
};
typedef struct
{
TrackingStatus status; // (-1) not initialized, 0 not tracking (recognition in process), 1 tracking
pxcF64 position[3]; // Resulting pose (X,Y, Z)
pxcF64 orientation[4]; // Quaternion to express the orientation
int index; // Recognized keyFrame index (-1 none)
} TargetData;
QueryProfile(…); // Retrieve configuration(s)
SetProfile(…); // Set active configuration
GetTargetCount(…); //
ActivateTarget(…); // Retrieve object tracking data
GetTargetData(…); //
ProcessImageAsync(…); // Data processing
};
Algorithm Modules: PXCDVTracker
Module Interface
54. Intel Confidential
SDK Features
• User Tracking
– Face Detection
– Face Location Detection
– Face Feature Detection
– Face Recognition
55. Intel Confidential
Face Detection/Tracking
•Locate and track
multiple faces
•Unique identifier
for each face
Algorithm Modules: PXCFaceAnalysis
Face tracking and analysis
Landmark Detection
•6/7-point detection
including eyes,
nose, and mouth
Facial Attribute Detection
•Age-group including
baby/youth/adult/senior
•Gender detection
•Smile/blink detection
Face Recognition
•Similarity among a set of
faces
56. Intel Confidential
class PXCFaceAnalysis: public PXCBase
{
class Detection {
QueryProfile(…);
SetProfile(…);
QueryData(…);
};
class Landmark {
QueryProfile(…);
SetProfile(…);
QueryLandmarkData(…);
QueryPoseData(…);
};
class Recognition {
QueryProfile(…);
SetProfile(…);
CreateModel(…);
};
class Attribute {
QueryProfile(…);
SetProfile(…);
QueryData(…);
};
QueryProfile(…);
SetProfile(…);
ProcessImageAsync(…);
}
Algorithm Modules: PXCFaceAnalysis
Module interface
Face location detection/tracking configuration and data retrieval
Face landmark detection configuration and data retrieval
Face attribute detection configuration and data retrieval
Face recognition configuration and data retrieval
Face analysis overall configuration and data processing
57. Intel Confidential
• Nuance* Voice Command and Control
– Recognize from a list of predefined commands
• Nuance Voice Dictation
– Recognize short sentences (<30 seconds)
• Nuance Voice Synthesis
– Text to speech for short sentences
SDK Features
58. Intel Confidential
class PXCVoiceRecognition: public PXCBase
{
struct Recognition {} // Recognized data structure
struct Alert {} // Event data structure
QueryProfile(…); // Retrieve configuration(s)
SetProfile(…); // Set active configuration
SubscribeRecognition(…);// Recognition event setup
SubscribeAlert(…); // Alert event setup
CreateGrammar(…); //
AddGrammar(…); // Command list construction
SetGrammar(…); //
DeleteGrammar(…); //
ProcessAudioAsync(…); // Data processing
};
Algorithm Modules: PXCVoiceRecognition
Module Interface
59. Intel Confidential
class MyHandler: public PXCVoiceRecognition::Recognition::Handler, public PXCVoiceRecognition::Alert::Handler
{
public:
MyHandler(std::vector<pxcCHAR*> &commands) { this->commands = commands; }
virtual void PXCAPI OnRecognized(PXCVoiceRecognition::Recognition *cmd)
{
// A non-negative label indexes the command list; otherwise print the dictated text
wprintf_s(L"\nRecognized: <%s>\n", (cmd->label>=0)?commands[cmd->label]:cmd->dictation);
}
virtual void PXCAPI OnAlert(PXCVoiceRecognition::Alert *alert)
{
switch (alert->label)
{
case PXCVoiceRecognition::Alert::LABEL_SNR_LOW:
wprintf_s(L"\nAlert: <Low SNR>\n");
break;
case PXCVoiceRecognition::Alert::LABEL_VOLUME_LOW:
wprintf_s(L"\nAlert: <Low Volume>\n");
break;
case PXCVoiceRecognition::Alert::LABEL_VOLUME_HIGH:
wprintf_s(L"\nAlert: <High Volume>\n");
break;
default:
wprintf_s(L"\nAlert: <0x%x>\n",alert->label);
break;
}
}
protected:
std::vector<pxcCHAR*> commands; // command strings, indexed by recognition label
};
Algorithm Modules: PXCVoiceRecognition
Voice recognition example – callback handlers
61. Intel Confidential
// Queue the sentence to the speech synthesis module
pxcUID tuid = 0;
sts = vtts->QueueSentence(cmdl.m_ttstext, wcslen(cmdl.m_ttstext), &tuid);
…
while (1)
{
PXCSmartPtr<PXCAudio> audio;
PXCSmartSP sp;
// Read audio frame
sts = vtts->ProcessAudioAsync(tuid, &audio, &sp);
if (sts < PXC_STATUS_NO_ERROR)
break;
sts = sp->Synchronize();
if (sts < PXC_STATUS_NO_ERROR)
{
if ((sts == PXC_STATUS_PARAM_UNSUPPORTED) || (sts == PXC_STATUS_EXEC_TIMEOUT))
wprintf_s(L"Error in ProcessAudio\n");
if (sts == PXC_STATUS_ITEM_UNAVAILABLE)
wprintf_s(L"Voice synthesis completed successfully\n");
break;
}
}
return 0;
Algorithm Modules: PXCVoiceSynthesis
Speech synthesis example - generation
62. Intel Confidential
User Experience on Face Recognition
• More expressions will be available in the future SDK
• Give feedback on distance
• Give feedback on lighting
– So the user avoids shadows
• Notify the user if moving too fast to be tracked
63. Intel Confidential
Overall Visual feedback
• Give instant feedback acknowledging command
– Gesture, voice, or anything else
• Show what’s actionable
• Show the current state
• Consider physics
66. Intel Confidential
class aPXCInterface: public PXCBase {
public:
PXC_CUID_OVERWRITE(PXC_UID('M','Y','I','F'));
// configurations & inquiries
struct ProfileInfo {
…
};
virtual pxcStatus PXCAPI QueryProfile(pxcU32 idx, ProfileInfo *pinfo)=0;
virtual pxcStatus PXCAPI SetProfile(ProfileInfo *pinfo)=0;
// data processing
virtual pxcStatus PXCAPI ProcessDataAsync(…, PXCScheduler::SyncPoint **sp)=0;
};
Each interface has a unique ID used by
PXCBase::QueryInterface
Consistent way of querying and
setting configurations
Asynchronous execution returns SP for
later synchronization
Core: PXCSession
Module interface conventions
PXC interfaces derive from the PXCBase
class
SDK interfaces contain only pure virtual
functions
No exception handling or dynamic_cast
(replaced with PXCBase::DynamicCast)
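The asynchronous convention (start the work, receive a sync point, synchronize later) has a close analogue in the C++ standard library. The sketch below uses `std::future` in the role of the sync point. It is an analogy only: `Status`, `processDataAsync`, and the doubling "work" are hypothetical and do not reflect how the SDK scheduler is implemented.

```cpp
#include <future>

// Hypothetical status codes for this sketch, not SDK values.
enum Status { STATUS_OK = 0, STATUS_FAILED = -1 };

// Starts the work and returns immediately. The returned future plays
// the role of the sync point: the caller waits on it only when the
// result is actually needed. `output` must outlive the future.
std::future<Status> processDataAsync(int input, int &output) {
    return std::async(std::launch::async, [input, &output]() {
        output = input * 2;   // stand-in for real data processing
        return STATUS_OK;
    });
}
```

Calling `get()` on the returned future blocks until the work finishes, much as synchronizing on an SP does, so other work can be scheduled between the call and the wait.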
67. Intel Confidential
• Users are notified when SDK
accesses Personally Identifiable
Information (PII)
• Can also launch a viewer from
the taskbar icon that shows any
apps currently accessing the
sensor and what, in particular,
they are accessing
Core: Privacy Notification
Keeping users informed
68. Intel Confidential
• Image Capture:
– 8-bit RGB in RGBA/RGB24/NV12/YUY2
– Creative* camera supports up to 1280x720@30p.
– 16-bit depthmap, confidence map and vertices.
– Creative camera supports up to QVGA@60p
– Depthmap smoothing by default
• Audio capture:
– 1-2 channel PCM/IEEE-Float audio streams
– Creative camera supports 44.1 kHz and 48 kHz
• Device properties:
– Standard camera properties such as brightness and exposure.
– Depth-related properties such as confidence threshold, depthmap value range etc.
I/O Modules
Audio and video capture
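To show how a confidence map and a confidence threshold relate to the depthmap, the sketch below zeroes depth samples whose confidence is below a threshold. This is a hypothetical application-side illustration: the buffer layout, the use of 0 as "invalid", and the function name are assumptions, and the SDK's own threshold property may behave differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical post-processing step: invalidate (set to 0) every depth
// sample whose confidence is below `threshold`. Both buffers are assumed
// to be tightly packed 16-bit arrays with one entry per pixel.
// Returns the number of samples that were kept.
std::size_t maskLowConfidence(std::vector<std::uint16_t> &depth,
                              const std::vector<std::uint16_t> &confidence,
                              std::uint16_t threshold) {
    const std::size_t n =
        depth.size() < confidence.size() ? depth.size() : confidence.size();
    std::size_t kept = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (confidence[i] < threshold) {
            depth[i] = 0;
        } else {
            ++kept;
        }
    }
    return kept;
}
```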
69. Intel Confidential
1. Enumerate and create capture device
QueryDevice Query capture device names
CreateDevice Create a capture device instance
2. Enumerate and select streams
QueryStream Query stream type
CreateVideoStream Select a video stream
CreateAudioStream Select an audio stream
3. Perform stream operations
QueryProfile Query stream configurations
SetProfile Set a stream configuration
ReadStreamAsync Read samples from the stream
I/O Modules: PXCCapture
PXCCapture interface hierarchy
71. Intel Confidential
• Alert and callback interface used for low-frequency events and notifications
• Subscribe to events
PXCGesture::SubscribeAlert
PXCGesture::SubscribeGesture
PXCVoiceCommand::SubscribeAlert
PXCVoiceCommand::SubscribeCommand
• Implement the callback handler
Algorithm Modules: PXCGesture
Alerts and callback notifications
class Handler: public PXCBaseImpl<PXCGesture::Gesture::Handler>
{
public:
virtual pxcStatus PXCAPI OnGesture(Gesture *gesture) {
…
}
};
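The subscribe-then-callback flow can be pictured as a plain observer pattern. The sketch below is self-contained and uses hypothetical stand-in types (`GestureEvent`, `GestureHandler`, `GestureSource`, `CountingHandler`); it mirrors only the shape of the interaction, not the SDK's actual classes or signatures.

```cpp
#include <vector>

// Hypothetical event payload, loosely modeled on a gesture notification.
struct GestureEvent {
    int  label;
    bool active;
};

// Handler interface that the application implements.
class GestureHandler {
public:
    virtual ~GestureHandler() {}
    virtual void OnGesture(const GestureEvent &gesture) = 0;
};

// Stand-in for a module that raises low-frequency events.
class GestureSource {
public:
    void Subscribe(GestureHandler *handler) { m_handlers.push_back(handler); }

    // Invoked by the module when an event fires; notifies every subscriber.
    void Emit(const GestureEvent &e) {
        for (GestureHandler *h : m_handlers) h->OnGesture(e);
    }

private:
    std::vector<GestureHandler*> m_handlers;  // non-owning pointers
};

// Example handler that counts active gestures.
class CountingHandler : public GestureHandler {
public:
    int count = 0;
    void OnGesture(const GestureEvent &gesture) override {
        if (gesture.active) ++count;
    }
};
```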
72. Intel Confidential
class MyPipeline: public UtilPipeline {
public:
MyPipeline(void):UtilPipeline() {
EnableGesture();
}
virtual void PXCAPI OnGesture
(PXCGesture::Gesture *data) {
printf_s("%d\n",data->label);
}
};
int wmain(int argc, WCHAR* argv[]) {
MyPipeline pipeline;
pipeline.LoopFrames();
return 0;
}
class MyPipeline: UtilMPipeline {
public MyPipeline():base() {
EnableGesture();
}
public override void OnGesture
(ref PXCMGesture.Gesture data) {
Console.WriteLine(data.label);
}
};
class Program {
static void Main(string[] args) {
MyPipeline pipeline=new MyPipeline();
pipeline.LoopFrames();
pipeline.Dispose();
}
}
C++ C#
Enable Finger Tracking
Gesture Callback
Data Flow Loops
UtilPipeline Class
Gesture Recognition “Hello World”
73. Intel Confidential
• Multiple processing modules on single
input device
– Live streaming or file-based
recording/playback
– Synchronized image (or audio) processing
UtilPipeline pp;
pp.EnableImage(PXCImage::COLOR_FORMAT_RGB32);
pp.EnableImage(PXCImage::COLOR_FORMAT_DEPTH);
for (;;) {
if (!pp.AcquireFrame(true)) break;
PXCImage *color, *depth;
color=pp.QueryImage(PXCImage::IMAGE_TYPE_COLOR);
depth=pp.QueryImage(PXCImage::IMAGE_TYPE_DEPTH);
pp.ReleaseFrame();
}
pp.Close();
UtilPipeline Class
UtilPipeline-based application
Color and depth are synchronized
74. Intel Confidential
Speech Recognition:
Voice command and control, short sentence dictation, and
text to speech synthesis
SDK Usage Modes Today [1]
[1] New usage modes may be added in the future
Close-range Depth Tracking (6 in. to 3 ft.):
Recognize the positions of each of the user’s hands, fingers,
static hand poses and moving hand gestures.
Facial Analysis:
Face detection and recognition (six- and seven-point landmark and
attribute detection, including smiles, blinks, and age groups)
Augmented Reality:
Combine real-time images from the camera and close-range tracking
from the depth sensor with 2D or 3D graphical images.
75. Intel Confidential
Your SDK “One-Stop-Shop”
intel.com/software/perceptual
@PerceptualSDK (Twitter)
CHALLENGE INFO
DOWNLOAD SDK
ORDER CAMERA
DOCUMENTS
DEMO APPS
SUPPORT
76. Intel Confidential
Key Upcoming Items
Creative* Senz3D – Q3 2013
Integration in Intel devices – H2 2014
77. Intel Confidential
Intel® Perceptual Computing Challenge
The $1Million 2013 Application Development Contest*
Enter Phase 2: perceptualchallenge.intel.com/
Focus: Games, Productivity, Creative UI
& Multi-modal
Process: Developers submit working
prototypes, panel judged
Two Phases:
Phase 1 (CLOSED): See Winner
Showcase at http://goo.gl/EnNHv
Phase 2: March (GDC) to September -
$800,000+ in prizes
Categories: Perceptual Gaming,
Productivity, Creative User Interface
and Open Innovation
Available in 16 countries
*Terms and Conditions Apply
78. Intel Confidential
Speech Recognition & Dragon Assistant*
Perceptual
Computing SDK
Runtime
Speech Recognition
Application
Drivers & Hardware
Dragon Assistant*
Dragon Assistant*
Engine and
Language Pack
• Perceptual Computing speech recognition applications require Dragon Assistant* Engine and
Language Packs to be installed on target platform
• For app developers, Engine and Language Packs are available on SDK download site (THESE
ARE FOR DEVELOPER INTERNAL USE ONLY AND NOT TO BE DISTRIBUTED).
• For consumers, Dragon Assistant* (with Engine) is expected to be available as follows:
• Expected to be bundled with Creative* Camera (when available)
• Expected to be pre-installed on speech-certified 4th Gen Core Ultrabook devices in late
2013
SDK Speech APIs
use the Dragon
Assistant* Engine
and Language
Packs
83. Intel Confidential
Image Conversion
Visual Computing
Products
RGB24 RGB32 NV12 YUY2 GRAY
RGB24 Y Y Y
RGB32 Y Y Y
NV12 Y Y Y
YUY2 Y Y Y Y Y
GRAY Y Y Y Y
DEPTH Y
VERTICES Y
For instance, the raw DepthSense color image format is RGB24. When the application calls
AcquireAccess(PXCImage::ACCESS_READ, PXCImage::COLOR_FORMAT_RGB32, &data)
the SDK framework converts the color image data from RGB24 to RGB32
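To make the RGB24-to-RGB32 step concrete, the sketch below performs that kind of conversion by hand on a raw byte buffer. It is not the SDK's converter: it assumes tightly packed pixels with no row padding, keeps the channel order unchanged, and fills the added fourth byte with 0xFF, all of which are assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands 3-byte pixels to 4-byte pixels. Assumptions: tightly packed
// input (no pitch padding), channel order carried over as-is, and the
// extra fourth byte set to 0xFF. Trailing bytes that do not form a
// whole pixel are ignored.
std::vector<std::uint8_t> rgb24ToRgb32(const std::vector<std::uint8_t> &src) {
    std::vector<std::uint8_t> dst;
    dst.reserve(src.size() / 3 * 4);
    for (std::size_t i = 0; i + 2 < src.size(); i += 3) {
        dst.push_back(src[i]);
        dst.push_back(src[i + 1]);
        dst.push_back(src[i + 2]);
        dst.push_back(0xFF);
    }
    return dst;
}
```

Requesting the target format in AcquireAccess avoids writing this kind of loop in application code.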