BiDi Screen: Depth and Lighting Aware Interactive Display
Matthew Hirsch (MIT Media Lab), Douglas Lanman (Brown University), Ramesh Raskar (MIT Media Lab), Henry Holtzman (MIT Media Lab)
BiDi Screen
Inspiration: Light-Sensitive Displays, Depth Cameras, Multi-touch
Beyond Multi-touch: Hover Interaction
Beyond Multi-touch: Mobile
Results: Analysis
BiDi Screen
Design Overview: Display with embedded optical sensors (Sharp Microelectronics optical multi-touch prototype)
Design Overview: LCD displaying a mask, ~2.5 cm in front of an optical sensor array; interaction volume extends ~50 cm
Design Vision: Collocated capture and display with a bare sensor and a spatial light modulator
Design Overview: Mask and sensor form an array of virtual cameras
Manipulating an object with 3D gesture
Alternatives to capture depth from an LCD
Adapting Traditional Touch?
Camera Arrays?
Depth Cameras and Sensors
Theory: Depth from Light-Field Capture
Theory: Spatial Heterodyning with MURA (two-plane parameterization, u′ = u + (d_m / d_ref) · s; received spectrum L_received(f_u, f_s))
Theory: Light Field (a ray is parameterized by position x and angle θ; the sensor integrates these rays)
Theory: Light Field Frequency Domain (Fourier slice theorem)
Theory: Light-Field Skew in Free-Space Propagation
Theory: Convolution with Delta Functions
Theory: Transforms (tiled broadband mask and light field, with their Fourier transforms)
Theory: Spatial Heterodyning (multiplication in the spatial domain is convolution in the frequency domain)
Theory: Spatial Heterodyning (spectral copies)
Theory: Spatial Heterodyning (reconstructed band-limited light-field spectrum)
Desired Prototype Actual Prototype
Prototype
Prototype: Backlights and Cameras
Pipeline: Max Contrast Operator
Pipeline
Theory: Depth and Spatial Resolution
Theory: Depth and Spatial Resolution
Results: Analysis
Manipulating an object with 3D gesture
Navigating a 3D world with 3D gesture
Lighting a virtual scene with a real light
Limitations
Conclusions
BiDi Screen: A Thin, Depth-Sensing LCD for 3D Interaction using Light Fields. Matthew Hirsch, Douglas Lanman, Henry Holtzman, Ramesh Raskar

Speaker notes

  1. We have all heard a lot about touch interaction recently. In this talk I’m going to describe a new way to interact with thin-screen devices. We have developed a device we are calling the BiDi Screen, short for bi-directional, which supports a seamless transition from on-screen multi-touch to hover-based gestural interaction, among other features, in an LCD-thin package.
  2. Here’s a quick teaser to illustrate the capabilities I’m describing. <wait for multi- to hover part to pass> Here you see a user pulling her hands away to rotate and zoom a 3-D model. We also show the use of 3D gesture to navigate a 3D world. We support these modes by creating an array of virtual cameras on an LCD using a technique known as Spatial Heterodyning. Because we’re using an optical technique, we also enable dynamic relighting applications, where real-world lighting is transferred to a rendered scene.
  3. We are inspired by the next generation of multi-touch devices that rely on optical sensors embedded in an LCD matrix typically used for display. We also take inspiration from developments in commercializing depth-sensitive cameras, and from the explosion of multi-touch interaction devices in consumer electronics and media, with their ability to provide a smooth and intuitive user experience. What if we could combine all of these features into a single device?
  4. This device would of course support multi-touch on-screen interaction, but because it can measure the distance to objects in the scene, a user’s hands can be tracked in a volume in front of the screen, without gloves or other fiducials.
  5. Since we are adapting LCD technology, we can fit a BiDi screen into laptops and mobile devices.
  6. So here is a preview of our quantitative results. I’ll explain this in more detail later on, but you can see we’re able to accurately distinguish the depth of a set of resolution targets. We show above a portion of the views from our virtual cameras, a synthetically refocused image, and the depth map derived from it.
  7. You may be wondering at this point how you can build a thin device to enable touch and 3D gesture interaction with bare hands, and still display images without interference.
  8. Recall that one of our inspirations was this new class of optical multi-touch device. At the top you can see a prototype that Sharp Microelectronics has published. These devices are basically arrays of naked phototransistors. Like a document scanner, they are able to capture a sharp image of objects in contact with the surface of the screen. But as objects move away from the screen, without any focusing optics, the images captured by this device are blurred.
  9. Our observation is that by moving the sensor plane a small distance from the LCD in an optical multi-touch device, we enable mask-based light-field capture. We use the LCD screen to display the desired masks, multiplexing between images displayed for the user and masks displayed to create a virtual camera array. I’ll explain more about the virtual camera array in a moment, but suffice it to say that once we have measurements from the array we can extract depth.
  10. Thus the ideal BiDi screen consists of a normal LCD panel separated by a small distance from a bare sensor array. This format creates a single device that spatially collocates a display and capture surface.
  11. In order to see what’s going on here, it’s useful to consider the case where we display a pattern of pinhole masks on the LCD. This essentially creates an array of virtual cameras, each of which has a slightly different view of the objects in front of the screen. Pinholes allow very little light through to the sensor layer, however. Some of us (Lanman and Raskar) have shown in previous work that with a technique called Spatial Heterodyning, a tiled-broadband mask, such as the MURA code shown here, has equivalent resolution properties to a pinhole array but allows 50 times more light through to the sensor.
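A MURA tile of the kind mentioned above can be sketched in a few lines of numpy. This follows the standard Gottesman–Fenimore construction for a prime side length p; the function names are our own, and this is an illustrative sketch rather than the exact mask used in the prototype.

```python
import numpy as np

def quadratic_residue(i, p):
    # Euler's criterion: +1 if i is a quadratic residue mod prime p, else -1.
    return 1 if pow(i, (p - 1) // 2, p) == 1 else -1

def mura(p):
    # p x p MURA aperture (p prime), following the Gottesman-Fenimore rule:
    # row 0 closed; column 0 open (except [0, 0]); elsewhere open when the
    # residue characters of the row and column indices agree.
    a = np.zeros((p, p), dtype=int)
    for i in range(1, p):
        a[i, 0] = 1
        for j in range(1, p):
            if quadratic_residue(i, p) * quadratic_residue(j, p) == 1:
                a[i, j] = 1
    return a

m = mura(5)
print(m.sum() / m.size)  # open fraction is close to 1/2
```

Roughly half of a MURA's cells are open, which is where the large light advantage over a pinhole array comes from.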
  12. With this virtual camera array we’re able to capture depth information, which can be used for a wide range of purposes. We show this simple Luke Skywalker-like interaction to demonstrate the abilities of the BiDi screen, but much more is possible.
  13. This sounds somewhat complicated, I know. In our thinking about this topic we considered a number of alternative solutions. Touch and gesture interaction is a popular field, with many techniques and variety of design goals. I’ll show you why I think the BiDi screen has much to offer over these alternatives. /* From top left, clockwise: Jeff Han’s multitouch screen used by King on CNN, the Apple iPhone, gSpeak by Oblong Industries, the Stanford Multi-Camera array, the Canesta Depth Cam, the ThinSight, by Microsoft Research, and the Microsoft Surface */
  14. To start, let’s consider the simplest cases: traditional touch technology. Resistive and capacitive touch screens have been the staple of touch interaction for years, and the techniques are well understood and cheap. As they stand commercially, neither can sense off-screen gesture. It may be possible using arrays of capacitive sensing RF antennas to detect objects with sufficient resolution for gesture interaction. In this mode, calibration becomes a significant issue. It is important to note that in contrast to optical multi-touch, neither resistive nor capacitive touch techniques are currently on a technology trend that will lead to gesture interaction.
  15. More sophisticated methods can be imagined to support our goals. One such approach would be to put a real camera array behind the screen. This would duplicate the measurements that the BiDi screen makes, but would create dark spots by blocking the backlight. The Microsoft SecondLight uses a pair of cameras far behind the screen with a switchable diffuser. This adds significant depth to the device. Other near-field optical techniques exist as well, such as the Microsoft Research ThinSight and the work shown by Jeff Han, but they do not address off-screen interaction. Another option would be to place the cameras to the side of the screen. In this scenario there will be a transition region between touch and gesture that is not covered by the cameras. A multi-modal approach could help to fill this gap, but will still leave some region of gestural interaction uncovered. An extreme fisheye lens might cover the entire screen, but would add distortion that makes object correspondence for depth from stereo difficult. Additional cameras might be turned in to face the screen, but would add to the thickness of the device. Importantly, none of these solutions is poised to ride an existing technology trend. Manufacturers are already building LCDs with embedded light sensors.
  16. Employing sensors specifically designed to measure depth is another option. Time-of-flight depth cameras can be placed behind the screen and coupled with projectors or other types of display, such as the Microsoft TouchLight. This approach generally yields a deep device not suitable for applications that require portability. These approaches cannot replicate the relighting demo that I showed earlier.
  17. To achieve collocated display and capture with a virtual array of cameras in a thin form factor, the BiDi screen uses an LCD as a spatial light modulator for mask-based light-field capture, and also to display images to the user. A depth map for the scene in front of the screen is extracted from the light field using a depth-from-focus technique, enabling gesture recognition. In order to more fully explain this process I’ll briefly present some background information. Some of this may seem a little unrelated, but bear with me and I’ll tie it together at the end.
  18. Not shown
  19. It is sufficient to consider a light field as the set of light rays traveling from objects in a scene to our sensor. We’ll consider the flatland case to make visualizing it simple. A ray striking the sensor on the x-axis at an angle theta is represented by a point in the x-theta parameterization of the light field on the right. A standard sensor integrates light rays striking it from all angles, measuring a line in x-theta space. A sensor array consisting of many sensors will measure a set of lines in x-theta space.
  20. Here we consider what happens when we take the Fourier transform in x-theta space. The set of rays emitted from a real-world scene will produce some function in x-theta space. This light field will have some spectrum when transformed. The type of measurement shown here is actually a projection in x-theta space onto the x axis. By the Fourier slice theorem, this projection will have a Fourier transform pair in a slice along the f-x axis in the frequency domain. The important point here is that because of the type of measurement made by a standard light sensor, we are only able to measure a slice in the frequency domain along the f-x axis. This limits us to measuring only this region of the spectrum of the light field.
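The claim that a bare sensor only ever samples the spectrum along the f-x axis can be checked numerically. A minimal flatland sketch, using a randomly generated stand-in for the light field:

```python
import numpy as np

# Flatland light field L(x, theta): hypothetical 64 spatial x 16 angular samples.
rng = np.random.default_rng(0)
L = rng.random((64, 16))

# A bare sensor integrates over all incident angles, i.e. it measures the
# projection of the light field onto the x axis.
projection = L.sum(axis=1)

# Fourier slice theorem: the 1-D FT of that projection equals the
# f_theta = 0 slice of the 2-D FT of the light field.
slice_fx = np.fft.fft2(L)[:, 0]
assert np.allclose(np.fft.fft(projection), slice_fx)
```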
  21. Another interesting property of the light field is the skew transform that is applied to a set of rays that propagate through free space. Here we see a single ray passing through a mask and hitting a sensor at x. The ray is plotted in two light field spaces, one for the mask and one for the sensor. -- -- As we add rays we can see why the skew occurs.
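In the paraxial two-plane picture, that skew is a shear of the (x, theta) coordinates. A small sketch; the separation d and the sample rays are chosen arbitrarily for illustration:

```python
import numpy as np

# Free-space propagation acts on the (x, theta) light field as a shear:
# a ray leaving the mask plane at position x with slope theta strikes the
# sensor plane at x + d * theta, with theta unchanged.
d = 2.5  # hypothetical mask-to-sensor separation

shear = np.array([[1.0, d],
                  [0.0, 1.0]])

rays_at_mask = np.array([[0.0, 0.1],   # (x, theta) pairs
                         [1.0, -0.2],
                         [2.0, 0.0]])
rays_at_sensor = rays_at_mask @ shear.T

print(rays_at_sensor)  # x column is skewed by d*theta; theta column unchanged
```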
  22. Another important property in Spatial Heterodyning is the fact that convolving an arbitrary function with a delta function creates a copy of that function at the location of the delta function.
  23. Recall that the light field in the x-theta plane has some spectrum when transformed. A critical property of the mask we choose for light field decoding is that its transform contains a series of delta functions.
  24. The name Spatial Heterodyning is inspired by AM radio broadcasts, in which a voice signal is multiplied (or modulated) with a high-frequency carrier to shift it in the frequency domain. We accomplish a similar multiplication each time a ray passes through our mask. Recall from signal processing that multiplication in the spatial domain is convolution in the frequency domain.
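The multiplication/convolution duality this step relies on can be verified directly for the DFT (with numpy's convention, a 1/N factor appears in the frequency-domain convolution). The signals here are arbitrary stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)
incident = rng.random(32)  # stand-in for light arriving at the mask plane
mask = rng.random(32)      # stand-in for the attenuating mask pattern

# Passing light through the mask multiplies the two point-wise; in the
# frequency domain this is a circular convolution of their spectra.
N = len(incident)
F_in, F_mask = np.fft.fft(incident), np.fft.fft(mask)
circular = np.array([sum(F_in[m] * F_mask[(k - m) % N] for m in range(N))
                     for k in range(N)]) / N

assert np.allclose(np.fft.fft(incident * mask), circular)
```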
  25. So to tie everything together, when the rays of the scene pass through our mask, their frequency spectrum is convolved with the series of deltas in the mask’s spectrum, creating copies of the scene spectrum. When we take into account the skew incurred by propagation through free space between the mask and sensor we can see something interesting.
  26. Now, different regions of each spectral copy lie on the measurable f-x axis. By measuring these regions... ...and rearranging the measured sections, we can reconstruct the original light field spectrum, from which we may obtain the light field itself. Note that we must have a band-limited light field in order for this technique to work, as the spectral copies cannot overlap. In practice this means we must limit the angle of incident light into the sensor.
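A 1-D AM-style analogue of this recovery (not the actual 2-D light-field decoding): modulation shifts spectral copies of a band-limited signal, and remodulating plus low-pass filtering brings one copy back to baseband intact, precisely because the copies do not overlap. Frequencies and lengths are illustrative:

```python
import numpy as np

N = 256
t = np.arange(N)
signal = np.cos(2 * np.pi * 3 * t / N)    # band-limited "scene" (freq 3)
carrier = np.cos(2 * np.pi * 40 * t / N)  # high-frequency carrier (freq 40)

# Modulation shifts copies of the signal spectrum to 40 +/- 3 ...
modulated = signal * carrier
# ... and remodulating shifts one copy back to baseband (plus copies near 80).
demod = 2 * modulated * carrier

# Low-pass filter: keep only the baseband copy (|f| < 10).
spectrum = np.fft.fft(demod)
spectrum[10:N - 9] = 0
recovered = np.fft.ifft(spectrum).real

assert np.allclose(recovered, signal, atol=1e-9)
```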
  27. So, armed with this technique, we went about building a prototype. One immediate stumbling block is the relative unavailability of large-area, high-resolution light sensor arrays. Though these will be a commodity in the near future, they are rare today. To compensate, we replace the sensor array in our device with a diffuser/camera pair. The camera images light cast on the diffuser, simulating a sensor array. This is not ideal, as it creates a thick device, but it allows us to validate our design. Here you can see the cameras, LCD, and diffuser frame of our prototype device.
  28. The camera/diffuser pair has one convenient advantage: it provides an angle-limiting property almost automatically. Consider light entering the system: if a ray enters the diffuser at a shallow angle, it is diffused and reaches the camera; a ray entering at a steep angle never reaches the camera. With a sensor array, various angle-limiting materials could be used to fulfill this function.
  29. The diffuser also provides a convenient surface to light with an array of backlight LEDs. Here you see the cameras we used, two Point Grey Flea2s; two were needed to cover the area of the screen with sufficient resolution. And here you see the frame we built to hold the LCD screen and diffuser.
  30. I'll step through the elements of our processing pipeline in order to demonstrate how each works over the course of a frame capture. With the backlights off, the mask pattern is displayed on the screen. Raw data is captured from the sensor; note the high-frequency modulation in the hand photo on the left. Spatial heterodyne decoding is applied, giving us an array of virtual cameras, each with a slightly different view of the scene. Synthetic aperture refocusing is used to obtain a stack of images focused at different depths; we run through them in a video here. A maximum-contrast operator is applied to each pixel in this focal stack to find the image with the sharpest focus. A depth map is obtained from the maximum-contrast operator, and hands can be tracked in the depth map.
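The final depth-from-focus step can be sketched in a few lines of NumPy (the Laplacian-based contrast measure is an assumption for illustration; the talk only specifies a maximum-contrast operator):

```python
import numpy as np

def depth_from_focus(focal_stack):
    """Per-pixel depth index via a maximum-contrast operator.

    focal_stack: (n_depths, H, W) array of synthetically refocused images.
    Returns an (H, W) map of the depth index with the sharpest focus.
    """
    contrast = np.empty_like(focal_stack)
    for i, img in enumerate(focal_stack):
        # Local contrast ~ squared discrete Laplacian (wraps at borders).
        lap = (np.roll(img, 1, 0) + np.roll(img, -1, 0) +
               np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4 * img)
        contrast[i] = lap ** 2
    return contrast.argmax(axis=0)

# Toy focal stack: only slice 1 contains sharp detail.
cb = (np.indices((8, 8)).sum(0) % 2).astype(float)
stack = np.stack([np.zeros((8, 8)), cb, np.zeros((8, 8))])
print(depth_from_focus(stack))  # every pixel selects depth index 1
```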
  31. Finally, we render the scene based on the input we just received, and turn on the backlight.
  32. We show some further analysis to understand what is possible with the BiDi screen. In the case of pinholes, we can see that the resolution of our array of virtual cameras decreases with distance, since each sensor pixel projects to a larger and larger patch in space.
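The fall-off follows from similar triangles; a minimal sketch of the geometry (the pixel pitch and separation below are illustrative numbers, not the prototype's exact specifications):

```python
def projected_pixel_size(p_mm, d_mm, z_mm):
    """Size of a sensor pixel projected into the scene by a pinhole.

    A pixel of pitch p behind a pinhole at separation d subtends an
    angle of p/d, so at distance z it covers a patch of size p * z / d.
    """
    return p_mm * z_mm / d_mm

# Illustrative values: 0.5 mm pixel, 25 mm pinhole-to-sensor gap.
print(projected_pixel_size(0.5, 25.0, 500.0))  # 10.0 mm patch at 50 cm
```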
  33. The measurable resolution for an object, given its distance from the screen, is shown in the top plot; this accounts for diffraction and geometric blur as well. The gold bars represent the measured performance of our prototype. The measurable resolution also depends on the separation between the sensor and the mask. For a given sensor and display pixel size (that of our prototype), we see that the optimal separation of mask and sensor is about 2.2 cm.
  34. Returning to this analysis, we see in practice how the resolution of distant objects drops off (red arrow on targets). The plot in the bottom left shows the output of the maximum contrast operator (depth from focus) at the corresponding locations in the depth map on the right.
  35. Here I show the user manipulating more objects. Notice how the manipulation is started with a touch, and that her hand is tracked in free space in front of the screen to move and zoom the models. Remember that these simple interactions are meant to demonstrate just the tip of what is possible with this technology.
  36. Here the user uses 3D gesture to navigate a 3D world. Hand movement into and out of the screen moves her view forward and back. The speed at which the view turns up, down, left, and right is determined by the hand position over the screen. You may notice the logo shown here is for the wrong conference; we're pleased to announce that a technical paper based on this work has been accepted to SIGGRAPH Asia 2009.
  37. The approach we lay out here has some clear limitations. The device requires a separation between the LCD and the sensor, which necessarily makes the screen somewhat thicker than a normal LCD. The device's sensitivity to room lighting, while a plus for enabling lighting-aware interaction, makes the screen difficult or impossible to use in low lighting conditions unless there is an internal lighting source; this is a problem common to many optical capture techniques. Time multiplexing of our display and capture frames causes an undesirable flicker; high-refresh-rate LCDs coming onto the market will make this problem simple to overcome.
  38. In the future, we will consider using masks that respond to the scene content to optimize the imaging properties of the display. With higher resolution devices we can facilitate novel video chat experiences with mixed reality rendering and live background subtraction. Look out for the technical paper we wrote on this topic, accepted to SIGGRAPH Asia 2009. Hopefully we’ll see you there.
  39. Questions?