The document provides an overview and instructions for creating Kinect hacks. It introduces the author and their background with Kinect projects. Chapter 1 discusses the necessary hardware, software, and programming techniques used to create hacks like transforming the player into a superhero and shooting lasers with poses. Chapter 2 notes that while the author did not intend commercial success, factors like differentiating hacks, quick completion, and making the experience entertaining led to unexpectedly large public response.
2. Table of Contents
Introduction
◦ Who is the author?
◦ Overview
◦ Kinect basics
Chapter 1: Tech Side
◦ Hardware/software preparation
◦ Hacks and tips
Chapter 2: Biz Side
◦ Original intention and actual feedback
◦ Video view analyses
◦ What else happened
3. Who is the Author?
Timeline of the CPUs and programming languages I am able to speak, and where/what I was
◦ 1973: Born in Japan
◦ 1979~: Elementary school; first programming, in assembly language
CPU: MN1610
Language: BASIC
◦ 1985~: High school (golden age!); posting original games to a computer magazine and getting money
CPUs: Z80, SC62015
◦ 1991~: Part-time programmer 30%, guitar player and composer 65%, university student 5%
CPUs: x86, ARM, other RISCs
Languages: LISP, C, VB, C++, Perl, Java, Tcl/Tk
◦ 1997~: Software product R&D at Hitachi (large-scale OO design, middleware, HCI, system management, user-centric design, …)
◦ 2003~: California, at Stanford!
Languages: Javascript, C#, Ruby, Haskell
◦ 2009~: California again, at Hitachi Data Systems!
Languages: ActionScript, Python
4. Overview (1)
What is this presentation all about?
◦ My Kinect Hacks as holiday project:
http://code.google.com/p/kinect-ultra/
http://code.google.com/p/kinect-kamehameha/
◦ How much fun Kinect hacking could be
What are Kinect Hacks?
◦ Creating your own cool stuff using Kinect, the motion-sensing gaming
system for Xbox 360
When and how did I start Kinect Hacks?
◦ In Dec 2010, a month after Kinect’s release
◦ From my friend’s tweet about Kinect and Kinect Hacks
◦ Me: “Wow, I’ve gotta do this! I don’t mind spending all my Winter
Holidays!”
5. Overview (2)
What is interesting about my Kinect Hacks?
◦ Intensive crash project
Major part(*) done in a week (before my wife runs out of patience)
No special knowledge of motion detection or 3D CG at the beginning
◦ Challenge for “the silliest thing ever”
Me: “I’ll take my hat off to the smart hacks created by the brilliant
people all around the world. Then, I’ll create something silly nobody ever
thought of, dedicating the best of my intelligence, energy, and CPU &
GPU power. It must be fun!”
◦ Got an unexpectedly huge response from the public
Huge view counts on YouTube & Nicovideo (300k in 1st week)
Appeared on news blogs, newspapers, TV, and other media
Won contest awards
Contacted by an investor about commercialization
…
(*) kinect-ultra V1, which earned the largest public response
6. Overview (3)
What you may learn today
◦ How to start cool Kinect Hacks by yourself
Chapter 1: Tech Side
◦ Some hints for a geek to make a “hit” (Well, I hope so)
Chapter 2: Biz Side
Disclaimers
◦ I am a total amateur in image recognition, motion detection, and
3D CG
◦ I only know the things that are interesting and/or necessary for me
◦ I do not care much about academic accuracy (be careful, I may be lying)
◦ I am a geek but not a business person
7. Kinect Basics (1)
What is Kinect actually?
◦ Gaming system for Xbox 360 that enables intuitive and natural
game play without controllers
◦ Released in Nov 2010
What is Kinect Sensor?
◦ Input device with RGB camera, IR depth sensor, and some other
auxiliary sensors
640x480@30fps, 1280x1024@10fps(*)
Internals developed by PrimeSense
◦ Connectable to PC via USB
Drivers and libraries available for free
◦ In this presentation, “Kinect” refers to Kinect Sensor
(*) With Avin’s Windows driver
8. Kinect Basics (2)
What can you do with Kinect? Generally speaking…
◦ RGB camera + depth sensor: Kinect provides the color of, and the distance to, the object for each pixel
(Figure: depth image shaded by distance: near, far, very far)
◦ Skeleton recognition by PC (so you will get 3D positions for each joint)
◦ 3D object recognition by PC
Don’t you see you can build any cool stuff on this? Let’s hack!
9. Chapter 1: Tech Side
This chapter explains the nuts and bolts behind this crash
project
◦ Like the secret behind a magic trick, it’s nothing surprising once you get
to know it
◦ General mathematics (especially geometry) required
How much time did I spend?
◦ Study: 3 days
◦ kinect-ultra: 7 days (for V1, which got the huge public response) + 2 days (for V2)
◦ kinect-kamehameha: 1 day (for V1) + 1 day (for V2)
I think I should count “nights” rather than “days” actually
10. Hardware Preparation
Kinect, of course!
◦ Caution: buy the standalone package, not the Xbox bundle
The Xbox bundle does not include the adapter for the USB connector
Windows PC
◦ With fairly fast CPU and GPU
The more powerful your hardware is, the more energy you can use for
cool essential stuff rather than performance optimization
Mine: Core i7 2600 + GeForce GTX 285
◦ How about Mac and Linux?
I am not so familiar with them, but Windows is probably safer because of good
driver support(*) and Microsoft’s SDK in the future
You don’t need Xbox
(*) Avin’s Windows driver can automatically calibrate RGB camera and IR depth sensor, but I was
not able to find the same feature in Linux drivers when I tried. It could be better now.
11. Software Preparation (1)
OpenNI + NITE + Avin’s SensorKinect
◦ Basic software component set for sensor information access and
recognition algorithms
OpenNI: Framework
NITE: OpenNI-compatible middleware that implements the recognition algorithms
Avin’s SensorKinect: OpenNI-compatible Kinect driver
◦ Advantages over other options (such as OpenKinect)
Released by PrimeSense
Player recognition and skeleton tracking available out-of-the-box!
Actually, this was the key success factor for me to get this project done so
quickly without any special knowledge about motion recognition
Auto calibration between RGB camera and IR sensor
Thanks to Avin for nice driver implementation
◦ In this presentation, “OpenNI” refers to all of these software
components as a set
12. Software Preparation (2)
OpenGL support libraries
◦ Chose OpenGL as my first 3D API to learn
◦ Just followed “OpenGL SuperBible 5th Edition”
Standard support libraries (e.g. freeglut)
Original library in this book (GLTools)
Others
◦ OpenCV
Only used for reading image files and generating Gaussian random numbers
13. Hack 0: Study with Sample Programs
Study for 3 days before starting kinect-ultra
◦ Surveyed both OpenKinect and OpenNI, and chose the latter
◦ Learned basic pixel information access and OpenGL usage from
OpenNI’s sample programs
First practice piece: depth-aware delayed-overlay
See “Algorithm March by Kinect”
http://www.youtube.com/watch?v=j4ABDmFhkgA
14. Hack 1: Transformation
Use “calibration complete” event to trigger transformation
◦ Calibration by “psi pose” is common for Kinect apps to start skeleton tracking
◦ “Something happens on calibration complete” is Kinect-ish entertainment
Modulate color of player area to represent the superhero suit
◦ OpenNI reports “hey, this pixel seems a part of player #1” so the app easily knows
which pixels should be modulated
◦ Switch color (red or gray) for each pixel based on its distance from head
App can calculate the Euclidean distance between any pixels/joints in real world coordinates
It is slow, however; some optimization is required
◦ You: “Isn’t it too rough?” Me: “Well, that’s OK, this is meant to be funny after all!”
Skinning would be ideal, but it is too serious and challenging
Ψ
psi pose
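The color switching in Hack 1 could be sketched roughly as follows. This is only an illustration, not the code used in kinect-ultra: the function name `suit_color`, the band edges, and the alternating red/gray pattern are all assumptions. It compares squared distances, which is one simple way to avoid a square root per pixel.

```python
def suit_color(pixel_xyz, head_xyz, band_edges_mm=(300.0, 600.0, 900.0)):
    """Pick 'red' or 'gray' for one player pixel from its distance to the head.

    Positions are (x, y, z) tuples in real world millimeters. The band edges
    and the alternating pattern are hypothetical, chosen for illustration.
    """
    # Compare squared distances so no square root is needed per pixel.
    dist_sq = sum((p - h) ** 2 for p, h in zip(pixel_xyz, head_xyz))
    # Count how many band edges this pixel lies beyond.
    band = sum(1 for edge in band_edges_mm if dist_sq >= edge * edge)
    # Alternate the two suit colors from band to band.
    return "red" if band % 2 == 1 else "gray"


head = (0.0, 0.0, 2000.0)
print(suit_color((0.0, -400.0, 2000.0), head))
```

In a real app this would run only on the pixels that OpenNI labels as belonging to the player.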
15. TIP: A Bit about Coordinate Systems
Kinect coordinates
• Raw pixel & depth data from Kinect
• Each XY plane: (0, 0)~(640, 480)
• Depth (Z): 0 to 10000~ (seems linear)
OpenGL coordinates
• Raw vertex & pixel data for OpenGL
• Each XY plane: (-1.0, -1.0)~(1.0, 1.0)
• Z-buffer: 0.0~1.0 (non-linear)
Real world coordinates
• Skeleton positions from OpenNI
• Virtual 3D polygon objects
Conversions
• Kinect coordinates ↔ real world coordinates: projected by OpenNI API (a little slow)
• Real world coordinates → OpenGL coordinates: transformed by OpenGL API
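To give a feeling for what a conversion between Kinect (projective) coordinates and real world coordinates involves, here is a minimal pinhole-camera sketch. It is not OpenNI's API; the function names and the field-of-view values are assumptions made for illustration.

```python
import math

# Assumed sensor parameters (hypothetical values, for illustration only).
WIDTH, HEIGHT = 640, 480
H_FOV = math.radians(57.0)
V_FOV = math.radians(43.0)
X_TO_Z = 2.0 * math.tan(H_FOV / 2.0)  # visible width per unit of depth
Y_TO_Z = 2.0 * math.tan(V_FOV / 2.0)  # visible height per unit of depth


def projective_to_real_world(u, v, depth):
    """Map a pixel (u, v) with its depth to real world (x, y, z)."""
    x = (u / WIDTH - 0.5) * depth * X_TO_Z
    y = (0.5 - v / HEIGHT) * depth * Y_TO_Z  # image rows grow downward
    return (x, y, depth)


def real_world_to_projective(x, y, z):
    """Inverse mapping: real world (x, y, z) back to pixel (u, v) and depth."""
    u = (x / (z * X_TO_Z) + 0.5) * WIDTH
    v = (0.5 - y / (z * Y_TO_Z)) * HEIGHT
    return (u, v, z)


print(projective_to_real_world(320, 240, 1000.0))
```

Doing this for every one of the 640x480 pixels in every frame is the kind of per-pixel work that makes the real world route slow.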
16. Hack 2: Detect Pose → Shoot Laser
No motion detection, only pose detection!
◦ Calculation is tremendously easy without time derivative
◦ Once the positions of skeleton parts are given, elementary vector
operations (distance, dot product, cross product) work very well
◦ Trial and error to decide good parameters (e.g. thresholds)
Spawn laser while pose is detected
◦ Laser is flat rectangle object in 3D space with alpha texture, and laid over
image from RGB camera
◦ Position/direction/initial velocity calculated from the pose
Same approach for shooting Eye Slugger
◦ With an additional stability check
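As an illustration of pose detection with elementary vector operations, the sketch below checks whether an arm is held straight and extended, and derives a laser direction from it. The pose itself, the function names, and the thresholds are assumptions for this example; the actual poses and parameters used in kinect-ultra were tuned by trial and error and are not given here.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def normalize(a):
    length = norm(a)
    # Leave a zero vector unchanged rather than dividing by zero.
    return a if length == 0 else tuple(x / length for x in a)


def detect_pointing_arm(shoulder, elbow, hand, min_cos=0.9, min_reach_mm=400.0):
    """Return a unit laser direction if the arm is straight and extended, else None.

    Joints are (x, y, z) positions in real world millimeters. The thresholds
    are hypothetical and would need tuning against a real skeleton.
    """
    upper = sub(elbow, shoulder)
    fore = sub(hand, elbow)
    # Dot product of unit vectors is the cosine of the elbow bend.
    straight = dot(normalize(upper), normalize(fore)) >= min_cos
    # Distance check: the hand must be far enough from the shoulder.
    extended = norm(sub(hand, shoulder)) >= min_reach_mm
    if straight and extended:
        return normalize(sub(hand, shoulder))
    return None


print(detect_pointing_arm((0, 0, 0), (0, 0, -300), (0, 0, -600)))
```

Because only the current joint positions are used, there is no time derivative and no history to keep between frames.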
17. Hack 3: Hidden Surface Processing
Place each pixel from Kinect as point object in 3D space
◦ Not texture mapping
◦ So pixels and other 3D objects hide each other
Handle pixels in projective coords for good performance
◦ 3D objects basically reside in real world coords, but mapping all pixels into real
world is too slow
◦ Instead, directly map pixels from Kinect coords to OpenGL raw coords by
transforming depth value to OpenGL Z-buffer value
◦ See next page, it was a hack
18. TIP: Fast Depth Transformation
Direct transformation from Kinect’s depth value to OpenGL Z-buffer value is much faster! Some hacking was needed to figure out the formula.
Unifying everything in real world coordinates makes the logic easier, but it is slow.
(Same diagram as in “A Bit about Coordinate Systems”, with the direct path added)
• Kinect coordinates: raw pixel & depth data from Kinect; each XY plane (0, 0)~(640, 480); depth 0 to 10000~ (seems linear)
• OpenGL coordinates: raw vertex & pixel data for OpenGL; each XY plane (-1.0, -1.0)~(1.0, 1.0); Z-buffer 0.0~1.0 (non-linear)
• Real world coordinates: skeleton positions from OpenNI; virtual 3D polygon objects
• Kinect coordinates ↔ real world coordinates: projected by OpenNI API (a little slow)
• Real world coordinates → OpenGL coordinates: transformed by OpenGL API
• Kinect coordinates → OpenGL coordinates: direct transformation of the depth value to the Z-buffer value (fast)
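The slides do not give the formula that was actually used. For reference, the standard OpenGL perspective mapping from a linear distance to a Z-buffer value in [0, 1] looks like the sketch below; the near and far plane values are assumptions, and the real hack may differ.

```python
def depth_to_zbuffer(depth_mm, near_mm=500.0, far_mm=10000.0):
    """Map a linear depth to the non-linear OpenGL Z-buffer range [0, 1].

    Uses the standard perspective projection relation
        z_buf = far * (d - near) / (d * (far - near))
    The near/far planes here are hypothetical values.
    """
    # Clamp so that out-of-range depths land on the near or far plane.
    d = min(max(depth_mm, near_mm), far_mm)
    return far_mm * (d - near_mm) / (d * (far_mm - near_mm))


for d in (500.0, 1000.0, 5000.0, 10000.0):
    print(d, depth_to_zbuffer(d))
```

Note how most of the [0, 1] range is spent on nearby depths, which is why the Z-buffer is labeled non-linear while the Kinect depth seems linear.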
19. Hack 4: Hit testing
Hit-test between lasers (= rectangles in 3D space) and image pixels (=
points in 3D space), and convert lasers into sparks
◦ Impractical to check the distance between all the objects
◦ Instead, divide the real world space into coarse 1-bit voxels, and mark
voxels that contain points
No distance calculation, just voxel look up is enough for hit testing
Mark voxels with down-sampled pixels
Marking voxels needs to be done in real world coordinates and is therefore slow
◦ Maybe inaccurate, but fun!
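A coarse 1-bit voxel grid of this kind could be sketched as follows. The class name, grid origin, voxel size, and dimensions are assumptions for illustration, not the values used in kinect-ultra.

```python
class VoxelGrid:
    """Coarse occupancy grid storing one bit per voxel."""

    def __init__(self, origin, voxel_size_mm, dims):
        self.origin = origin          # real world position of the grid corner
        self.size = voxel_size_mm     # edge length of one cubic voxel
        self.dims = dims              # number of voxels along x, y, z
        self.bits = bytearray((dims[0] * dims[1] * dims[2] + 7) // 8)

    def _index(self, point):
        """Return the flat voxel index for a point, or None if outside the grid."""
        cell = []
        for c, o, n in zip(point, self.origin, self.dims):
            i = int((c - o) // self.size)
            if not 0 <= i < n:
                return None
            cell.append(i)
        x, y, z = cell
        return (z * self.dims[1] + y) * self.dims[0] + x

    def mark(self, point):
        """Mark the voxel containing a (down-sampled) image point."""
        i = self._index(point)
        if i is not None:
            self.bits[i >> 3] |= 1 << (i & 7)

    def is_hit(self, point):
        """Hit test: a single bit lookup, no distance calculation."""
        i = self._index(point)
        return i is not None and bool(self.bits[i >> 3] & (1 << (i & 7)))


grid = VoxelGrid(origin=(-2000.0, -2000.0, 0.0), voxel_size_mm=100.0, dims=(40, 40, 40))
grid.mark((10.0, 20.0, 1500.0))
print(grid.is_hit((50.0, 60.0, 1550.0)), grid.is_hit((500.0, 20.0, 1500.0)))
```

A laser would be tested by looking up a few sample points along its rectangle; anything inside the same voxel as an image point counts as a hit, which is the source of the inaccuracy mentioned above.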
20. TIP: How Does Kinect Work in Darkness?
IR laser depth sensing works even in a dark room
◦ http://www.youtube.com/watch?v=nvvQJxgykcU
◦ Casts a random dot pattern and analyzes the parallax
(capture from above URL)
21. Hack 5: Light Ball
Drawing a white circle does not look like a light ball at all…
Instead, brighten the surroundings according to their distance from the light ball center
You feel dazzling light and heat! (Thanks to human illusion)
Use an approximation because a real Euclidean distance calculation for all pixels
is slow
Calculate a “pseudo” distance in projective coordinates (tweaking the Z value a bit)
Trial and error to decide how to modulate brightness by the pseudo distance
Not 100% scientific and realistic, but good enough and, most importantly, fun!
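One way to picture the pseudo-distance brightening is the sketch below. The falloff curve, the Z scaling factor, and all parameter values are assumptions; the real modulation was found by trial and error and is not documented in the slides.

```python
def brighten(rgb, pixel_uvz, ball_uvz, radius_px=50.0, z_scale=0.1, strength=2.0):
    """Brighten one pixel according to its pseudo distance from the light ball.

    pixel_uvz and ball_uvz are (u, v, depth) in projective coordinates.
    z_scale squeezes depth units so they are comparable to pixels
    (a hypothetical tweak). All parameters are illustrative.
    """
    du = pixel_uvz[0] - ball_uvz[0]
    dv = pixel_uvz[1] - ball_uvz[1]
    dz = (pixel_uvz[2] - ball_uvz[2]) * z_scale
    # Squared pseudo distance: no square root and no real world conversion.
    pseudo_sq = du * du + dv * dv + dz * dz
    # Gain is 1 + strength at the center and falls toward 1 far away.
    gain = 1.0 + strength / (1.0 + pseudo_sq / (radius_px * radius_px))
    return tuple(min(255, int(round(c * gain))) for c in rgb)


ball = (320, 240, 1000)
print(brighten((50, 50, 50), (320, 240, 1000), ball))
print(brighten((50, 50, 50), (0, 0, 1000), ball))
```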
22. Hack 6: Energy Wave (1)
Represented by long-stretched polygon sphere
Decide transparency by dot product between normal of polygonal
surface and sight vector (for nebular effect)
◦ Solid around center, transparent around edge
◦ Implemented by GLSL (shading language)
Although it was my first time working with this language, it was done in about 30 minutes by
tweaking sample code from a book
Add random fluctuation to the normal (for misty/swirly effect)
◦ Accidentally discovered from a bug
23. Hack 6: Energy Wave (2)
Simple reflection: the term acts as brightness
$rgb = rgb \cdot (n \cdot v / |v|)^k$
After a quick tweak…
Nebular effect: the same term acts as transparency
$a = (n \cdot v / |v|)^k$
Here $n$ is the normal of the polygon surface and $v$ is the sight vector
Add random fluctuation to the normal to make the transparency roughly modulated by position and time. This makes the energy wave look misty or swirly
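The original was written in GLSL; the following is only a CPU-side Python sketch of the same arithmetic, so the numbers can be checked by hand. The function names, the clamping of negative dot products to zero, the value of k, and the renormalization after the jitter are assumptions.

```python
import math
import random


def nebula_alpha(normal, view, k=2.0):
    """Alpha from the angle between a unit surface normal and the sight vector.

    Computes (n . v / |v|) ** k, clamped at zero for back-facing normals.
    Facing the viewer gives 1 (solid); grazing angles give values near 0.
    """
    n_dot_v = sum(n * v for n, v in zip(normal, view))
    view_len = math.sqrt(sum(v * v for v in view))
    return max(0.0, n_dot_v / view_len) ** k


def jitter_normal(normal, amount, rng):
    """Add a small random fluctuation to a normal and renormalize it."""
    noisy = [n + rng.uniform(-amount, amount) for n in normal]
    length = math.sqrt(sum(c * c for c in noisy))
    return tuple(c / length for c in noisy)


rng = random.Random(0)
view = (0.0, 0.0, 1.0)
print(nebula_alpha((0.0, 0.0, 1.0), view))                           # center of the wave
print(nebula_alpha(jitter_normal((0.0, 0.6, 0.8), 0.2, rng), view))  # perturbed surface
```

Re-seeding or advancing the random generator every frame would make the alpha vary with time as well as position.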
24. Hack 7: Hair!
Secret formula to model the hair
◦ O = center of head, P = each pixel on player’s border near and above O
◦ Render narrow triangle from P to the direction of OP with length of n|OP|
where n is a simple linear saw-wave function of r
where r is the angle of OP against the horizon
Add some repulsion against energy ball
Randomly blend graded yellow (for “goldish shine” effect)
Everything is calculated/rendered in 2D on projective plane
◦ Easy and unrealistic, but cartoonish and funny
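(Figure: a narrow triangle of length n|OP| drawn from point P on the player’s border in the direction of OP, away from head center O; r is the angle of OP against the horizon. Inset plot: n as a simple linear saw-wave function of r, for r from 0 to π/2)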
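The hair formula can be sketched in 2D as below. The number of saw teeth, the range of n, and the use of a y-up coordinate system are assumptions for illustration; only the overall shape (a linear saw wave of the angle r, and a spike of length n|OP| from P along OP) comes from the slide.

```python
import math


def saw_wave(r, teeth=4, n_min=0.2, n_max=1.0):
    """Simple linear saw wave of the angle r over [0, pi/2].

    The tooth count and the range of n are hypothetical values.
    """
    period = (math.pi / 2.0) / teeth
    phase = (r % period) / period          # 0 at the start of each tooth
    return n_min + (n_max - n_min) * phase


def hair_tip(o, p, teeth=4):
    """Tip of the hair spike that starts at border point P and points away from O.

    o and p are 2D points on the projective plane (y up in this sketch).
    The spike has length n * |OP| along the direction of OP.
    """
    dx, dy = p[0] - o[0], p[1] - o[1]
    # Angle of OP against the horizon, folded into [0, pi/2].
    r = math.atan2(abs(dy), abs(dx))
    n = saw_wave(r, teeth)
    return (p[0] + n * dx, p[1] + n * dy)


head_center = (0.0, 0.0)
for border_point in ((10.0, 0.0), (9.0, 4.0), (6.0, 8.0)):
    print(border_point, hair_tip(head_center, border_point))
```

A narrow triangle would then be rendered from P to each computed tip.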
25. Chapter 2: Biz Side
Got an unexpectedly huge response to the uploaded videos
Maybe we can read from it some hints for a geek to make a “hit”…
26. What Did I Intend Actually?
Absolutely no intention to be “successful”, but had other clear
intentions which might be eventual success factors
◦ Desire to be in the same line as other Kinect Hackers
◦ Must be differentiated -- useless, nonsense, and never-seen
◦ Must be quickly done
Before real game studios publish their serious work
Before someone else (as crazy as myself) shoots lasers
◦ Completeness of entertainment
First created laser shooting only (in 2 days), then added other features one
by one till satisfied with “completeness”
Motivated by “hey, this idea is too good! I couldn’t finish without it!”
Transformation, hidden surface, hit testing, Eye Slugger, timeout, flying out, …
◦ Targeted at a worldwide audience
Created videos in both Japanese and English, and uploaded them to both
YouTube and Nicovideo (Japanese video site)
Creating only for one community would mean not to welcome the other
27. Examples of unexpected feedback
It’s for kids!
◦ “My kid keeps PC and never leaves.”
◦ “When my kids and I play heroes and bad guys, they identify
themselves with the heroes in their mind. If they can actually
become the heroes out of their imagination, it will be wonderful.”
It makes my dream come true!
◦ “I wanted to do this since I was a kid.”
◦ “The kid’s part of me says ‘Look! He transforms! I wanna do it!’ and
drown out my adult’s words.”
Me: “I did not mean it at all. I just tried to be silly and funny.
But, it is definitely a pleasure to see people get excited
about the future of the technology demonstrated by this.”
28. Video View Analysis of kinect-ultra
Exploded within 24 hours and reached 300k in a week
◦ More discussion in next page
Japan heats up and cools down very quickly while worldwide seems a
little slower
Forgotten while nothing happens, and remembered by occasional
events
(Charts: “Total Views”, vertical axis 0 to 600,000, annotated with “Explosion” and “Nicovideo award nominee”; “Views/day in first two weeks”, vertical axis 0 to 140,000. Series: Nico (ja), YT (ja), YT (en))
29. Hypothesis of explosion mechanism
Interesting to think about how the views could grow so large and so
rapidly
Hypothesis: multistage explosive chain reaction among video,
tweets, and news sites(*)
Stage 1 (~10h)
• Maniac communities first notice the video, and start tweeting
• Views and tweets increase slowly
Stage 2 (~20h)
• Number of tweets crosses some threshold
• News sites notice it and post articles (independent blog sites first and then major news sites such as Yahoo! News)
• Views and tweets rapidly increase by positive feedback effect
Stage 3
• Number of views crosses some threshold and the video ranks among the most popular videos
• Feedback effect accelerates even more
Cool down (48h~)
• Tweets cool down and the feedback effect gradually stops
Is it possible to make it happen intentionally?
Not sure, probably very difficult
(*) My colleague tracked the public activity and came up with this hypothesis. Great job by him.
30. Video View Analysis of kinect-kamehameha
No explosion
◦ Got many views at first on Nicovideo (more than ultra, in fact), but they did not ignite an
explosion
◦ Probably insufficient impact to make viewers tweet and cross the threshold
Sustained popularity, more from worldwide than from Japan
◦ From DBZ fans around the world? Most views come from Brazil
◦ Sporadic jumps – don’t know what is happening
(Charts: “Total Views”, vertical axis 0 to 200,000; “Views/day in first two weeks”, vertical axis 0 to 20,000. Series: Nico (ja), YT (ja), YT (en))
31. What else happened (1)
Appeared in media
◦ Blog, news, and tech review sites
◦ Papers and magazines (e.g. Japan Times)
◦ TV shows (e.g. NHK BS1/2 in Japan)
◦ Net casting (in Japan and France)
◦ For more information:
http://code.google.com/p/kinect-ultra/wiki/Articles
http://code.google.com/p/kinect-kamehameha/wiki/Articles
Public demos and presentations
◦ 3D Vision & Kinect Hacking Meetup
◦ JTPA Geek Saloon
◦ Maker Faire (Thanks to Matt Bell for involving me)
◦ Campus Party (Did not make it, though)
32. What else happened (2)
Won and nominated for awards
◦ Matt Cutts’s Kinect Contest Winner
◦ Maker Faire 2011 Bay Area Editor’s Choice Winner
◦ Nicovideo Award 2011 Spring Nominee
Other interesting contacts from
◦ Other hackers, of course!
◦ Investors
◦ Artists (who wanted to use the video in their artwork)
◦ 3D modelers (who kindly contributed Eye Slugger model)