Detecting silent data corruptions
and memory leaks using
DMA Debug API

Shuah Khan
Senior Linux Kernel Developer – Open Source Group
Samsung Research America (Silicon Valley)
shuah.kh@samsung.com

Open Source Group – Silicon Valley

© 2013 SAMSUNG Electronics Co.
Abstract
Linux kernel drivers map and unmap dynamic DMA buffers using the DMA API.
DMA map operations can fail. Failure to check for errors can result in a variety
of problems ranging from panics to silent data corruptions. Kernel panics can
be fixed easily; data corruptions, however, are hard to debug.
DMA mapping error analysis performed by the presenter found that more than
50% of map interface return values go unchecked in the kernel. Furthermore,
several drivers fail to unmap buffers when an error occurs in the middle of a
multi-page DMA mapping attempt.
The presenter added a new DMA Debug interface in Linux 3.9 to check for missing
mapping error checks.
This talk will share the results of the analysis and discuss how to find and fix
missing mapping error checks using the new interface. It will also discuss
possible enhancements to the DMA Debug API to detect and flag unmap errors.

Agenda
DMA API and its usage rules
DMA-debug API
DMA-debug API – what is missing?
Why check dma mapping errors?
Analysis results
After debug_dma_mapping_error()
Checking mapping errors (examples: incorrect and correct)
Use or not use unlikely()
dma_mapping_error()
Why unmap after use?
Next steps – possible enhancements to DMA-debug API
Questions

DMA API
Linux kernel supports Dynamic DMA mapping
– dma_map_single() and dma_unmap_single()
– dma_map_page() and dma_unmap_page()
– dma_map_sg() and dma_unmap_sg()
References:
– https://www.kernel.org/doc/Documentation/DMA-API.txt
– https://www.kernel.org/doc/Documentation/DMA-API-HOWTO.txt
– https://www.kernel.org/doc/Documentation/DMA-attributes.txt

DMA API usage rules
Drivers
– can map and unmap DMA buffers at run-time
– should map buffers, use, and unmap when buffers are
no longer needed – don't hoard buffers
– ensure all mapped buffers are unmapped
– should use the generic DMA API as opposed to a bus-specific
DMA API, e.g. pci_dma_*()
– should call dma_mapping_error() to check for mapping
errors before using the returned dma handle
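The rules above amount to a map, check, use, unmap sequence. The following is a minimal userspace sketch of that sequence, not kernel code: every `mock_` function, `MOCK_DMA_ERROR`, and the slot counter are hypothetical stand-ins for dma_map_single(), dma_mapping_error(), dma_unmap_single() and a limited mapping resource.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for illustration only; not the kernel API. */
typedef uint64_t mock_dma_addr_t;
#define MOCK_DMA_ERROR ((mock_dma_addr_t)UINT64_MAX) /* hypothetical error cookie */

static int mock_free_slots = 2; /* pretend pool of mapping resources */

/* Stands in for dma_map_single(): fails when the pool is empty. */
static mock_dma_addr_t mock_dma_map_single(void *buf, size_t len)
{
	(void)len;
	if (mock_free_slots == 0)
		return MOCK_DMA_ERROR;
	mock_free_slots--;
	return (mock_dma_addr_t)(uintptr_t)buf;
}

/* Stands in for dma_mapping_error(): the one portable way to test a handle. */
static bool mock_dma_mapping_error(mock_dma_addr_t handle)
{
	return handle == MOCK_DMA_ERROR;
}

/* Stands in for dma_unmap_single(): returns the resource to the pool. */
static void mock_dma_unmap_single(mock_dma_addr_t handle)
{
	(void)handle;
	mock_free_slots++;
}

/* Map, check, use, unmap. Returns 0 on success, -1 on mapping failure. */
static int send_buffer(void *buf, size_t len)
{
	mock_dma_addr_t handle = mock_dma_map_single(buf, len);

	if (mock_dma_mapping_error(handle))
		return -1; /* never hand an unchecked address to the device */

	/* ... a real driver would program the device with 'handle' here ... */

	mock_dma_unmap_single(handle); /* unmap when done, don't hoard */
	return 0;
}
```

Calling send_buffer() with an empty pool returns -1 without touching the device, and a successful call leaves the pool at its original size.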

DMA-debug API
designed for debugging driver DMA API
usage errors
keeps track of DMA mappings per device
debug_dma_map_page() - adds newly
mapped entry to keep track. Sets flag to track
missing mapping error checks
detects missing mapping error checks in
driver code after DMA mapping.
debug_dma_mapping_error() - checks and
clears flag set by debug_dma_map_page()

DMA-debug API
detects unmap attempts on invalid dma
addresses
generates warning message for missing
dma_mapping_error() calls with call trace
leading up to dma_unmap()
debug_dma_unmap_page() - checks if buffer
is valid and checks dma mapping error flag
requires CONFIG_HAVE_DMA_API_DEBUG and
CONFIG_DMA_API_DEBUG to be enabled
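The tracking described on these two slides can be modelled in a few lines. This is a simplified userspace sketch under stated assumptions: the table layout, the `model_` function names and the `warnings` counter are hypothetical, and the real DMA-debug code is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace model; names and layout are hypothetical. */
enum err_check_state { NOT_CHECKED, CHECKED };

struct tracked_mapping {
	uint64_t addr;
	bool in_use;
	enum err_check_state state;
};

#define MAX_TRACKED 8
static struct tracked_mapping table[MAX_TRACKED];
static int warnings; /* number of problems the model has reported */

static struct tracked_mapping *find_mapping(uint64_t addr)
{
	for (size_t i = 0; i < MAX_TRACKED; i++)
		if (table[i].in_use && table[i].addr == addr)
			return &table[i];
	return NULL;
}

/* Models debug_dma_map_page(): record the mapping, flag it as unchecked. */
static void model_debug_map(uint64_t addr)
{
	for (size_t i = 0; i < MAX_TRACKED; i++) {
		if (!table[i].in_use) {
			table[i] = (struct tracked_mapping){ addr, true, NOT_CHECKED };
			return;
		}
	}
}

/* Models debug_dma_mapping_error(): the driver checked, so clear the flag. */
static void model_debug_mapping_error(uint64_t addr)
{
	struct tracked_mapping *m = find_mapping(addr);

	if (m)
		m->state = CHECKED;
}

/* Models debug_dma_unmap_page(): validate the address, then the flag. */
static void model_debug_unmap(uint64_t addr)
{
	struct tracked_mapping *m = find_mapping(addr);

	if (!m) {
		warnings++; /* unmap of an address that was never mapped */
		return;
	}
	if (m->state == NOT_CHECKED)
		warnings++; /* driver never called dma_mapping_error() */
	m->in_use = false;
}
```

In this model a map, check, unmap sequence produces no warning, while skipping the check or unmapping an unknown address each add one.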

DMA-debug API
CONFIG_DMA_API_DEBUG disabled
– dma_mapping_error() calls debug_dma_mapping_error(), which is a stub
CONFIG_DMA_API_DEBUG enabled
– dma_mapping_error() calls debug_dma_mapping_error(); its code is
executed to clear the mapping error flag set by debug_dma_map()
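The stub-versus-real arrangement is the usual compile-time pattern, sketched here in userspace. The macro `MODEL_DMA_API_DEBUG` stands in for CONFIG_DMA_API_DEBUG, and the function names, the `flag_cleared` variable and the `~0UL` error value are assumptions made for illustration.

```c
#include <assert.h>

static int flag_cleared; /* hypothetical: set when the debug hook really runs */

#ifdef MODEL_DMA_API_DEBUG /* stands in for CONFIG_DMA_API_DEBUG */
/* Debug build: the hook does real work (here, it records that it ran). */
static inline void model_debug_mapping_error(void)
{
	flag_cleared = 1;
}
#else
/* Non-debug build: an empty stub the compiler can drop entirely. */
static inline void model_debug_mapping_error(void)
{
}
#endif

/* Models dma_mapping_error(): always calls the hook, then tests the handle. */
static inline int model_dma_mapping_error(unsigned long handle)
{
	model_debug_mapping_error();
	return handle == ~0UL; /* hypothetical error value */
}
```

Because the caller invokes the hook unconditionally, drivers need no #ifdef of their own; building with `-DMODEL_DMA_API_DEBUG` switches the hook from the stub to the real body.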

DMA-debug API – what is missing?
Missing: Detecting missing unmap cases that
would result in dangling DMA buffers
Weakness: Detection of missing mapping error checks
is done in debug_dma_unmap_page().
– Missing checks go undetected when the driver fails to
unmap.

Why check dma mapping errors?
Failure to check mapping error prior to using
the address
– could result in panics, silent data
corruptions
– panics could be fixed easily once they
occur
– data corruptions are very hard to debug, not
to mention the damage they do.

Why check dma mapping errors?
Preventing errors is better than the alternative
Detection allows
– Taking corrective action that is right for the
condition
– Preventing uncertain failure modes

DMA API error handling

dma_map_single() or dma_map_page()
– dma_mapping_error() called:
  • bad: error handling, return
  • good: use buffer
– no error checking: use buffer, unpredictable failure mode

Analysis results
First analysis (linux-next August 6 2012):
– large % (> 50%) of addresses returned by
dma_map_single() and dma_map_page() go
unchecked
Current status (linux 3.12-rc5) October 2013:
– large % (> 50%) of addresses returned by
dma_map_single() and dma_map_page() go
unchecked
No change in the % of missing mapping error checks.

Problem classification - broken
Broken
– no dma mapping error checks done on the returned address.
Partially Broken
– not all dma_map_single() and dma_map_page() calls are
followed by mapping error checks.
Unmap broken
– checks dma mapping errors
– doesn't unmap already mapped pages when mapping error
occurs in the middle of a multiple page mapping attempt.

Problem classification - good
Good
– checks dma mapping errors correctly
– checks dma mapping errors with unlikely()
– unmaps already mapped pages when
mapping error occurs in the middle of a
multiple page mapping attempt.
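The last point, unwinding a partially completed multi-page mapping, is the one the "Unmap broken" drivers miss. A minimal userspace sketch of the unwind follows; the `mock_` functions, `MOCK_DMA_ERROR` and the failure-injection variables are hypothetical stand-ins, not the kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for illustration only; not the kernel API. */
#define MOCK_DMA_ERROR UINT64_MAX /* hypothetical error cookie */

static int mock_mapped;       /* mappings currently outstanding */
static int mock_calls;        /* map calls made so far */
static int mock_fail_at = -1; /* which map call should fail, -1 = never */

/* Stands in for dma_map_page(). */
static uint64_t mock_dma_map_page(int page_index)
{
	if (mock_calls++ == mock_fail_at)
		return MOCK_DMA_ERROR;
	mock_mapped++;
	return UINT64_C(0x1000) * (uint64_t)(page_index + 1);
}

/* Stands in for dma_mapping_error(). */
static bool mock_dma_mapping_error(uint64_t handle)
{
	return handle == MOCK_DMA_ERROR;
}

/* Stands in for dma_unmap_page(). */
static void mock_dma_unmap_page(uint64_t handle)
{
	(void)handle;
	mock_mapped--;
}

/* Map npages pages; on failure, unmap the pages mapped so far. */
static int map_pages(uint64_t *handles, int npages)
{
	int i;

	for (i = 0; i < npages; i++) {
		handles[i] = mock_dma_map_page(i);
		if (mock_dma_mapping_error(handles[i]))
			goto unwind;
	}
	return 0;

unwind:
	/* handles[i] is the failed one; release handles[i-1] down to handles[0]. */
	while (--i >= 0)
		mock_dma_unmap_page(handles[i]);
	return -1;
}
```

If the third of four map calls fails, the two earlier mappings are released before map_pages() returns, so nothing is left dangling.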

dma_map_single() results
Broken - 46%
Partially broken - 11%
Unmap broken - 6%
Good - 35%

dma_map_page() results
Broken - 59%
Partially broken - 11%
Unmap broken - 15%
Good - 19%

Things considered and action taken
When do mapping errors get detected?
How often do these errors occur?
Why don't we see failures related to missing
dma mapping error checks?
Are they silent failures?
What is done - a new DMA-debug interface is
added after the first analysis

After debug_dma_mapping_error()
debug_dma_mapping_error() went into
Linux-3.9
Several drivers have been fixed as a result of
the warnings.
Intel drivers deserve a special mention.
– drivers flagged in the first analysis have
been fixed.
New code and drivers that fail to check errors
have been added since the last analysis

Checking mapping errors
Incorrect example 1:
Non-generic and not portable - depends on architecture
specific dma implementations
dma_addr_t dma_handle;
dma_handle = dma_map_single(dev, addr, size, direction);
if (((dma_handle & 0xffff) != 0) || (dma_handle >= 0x1000000)) {
    goto map_error;
}

Checking mapping errors
Incorrect example 2:
Non-generic and not portable - depends on architecture
specific DMA_ERROR_CODE definitions
dma_addr_t dma_handle;
dma_handle = dma_map_single(dev, addr, size, direction);
if (dma_handle == DMA_ERROR_CODE) {
    goto map_error;
}

Checking mapping errors
Generic and portable:
dma_addr_t dma_handle;
dma_handle = dma_map_page(dev, page, offset, size, direction);
if (dma_mapping_error(dev, dma_handle)) {
    /*
     * reduce current DMA mapping usage,
     * delay and try again later or
     * reset driver.
     */
    goto map_error_handling;
}

Use or not use unlikely()
likely() and unlikely() aren't efficient in all cases.
Don't use them, especially in driver code.
if (unlikely(dma_mapping_error(dev, dma_handle))) {
    --
}
More on this topic:
https://lkml.org/lkml/2012/10/18/150
http://blog.man7.org/2012/10/how-much-do-builtinexpect-likely-and.html

dma_mapping_error()
It is provided by all DMA implementations: the ones that don't
implement error detection simply return 0
– e.g: arch/openrisc/include/asm/dma-mapping.h
Some architectures return DMA_ERROR_CODE
– e.g: arch/sparc/include/asm/dma-mapping.h
Some implement it invoking underlying dma_ops
– e.g: arch/x86/include/asm/dma-mapping.h
Good practice to use dma_mapping_error() to check errors and
let the underlying DMA layer handle the architecture specifics.
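The three styles listed above can sit behind one entry point, which is why drivers should not open-code the check. A userspace sketch follows, assuming a hypothetical operations table; `struct model_dma_ops`, `MODEL_ERROR_CODE` and the function names are illustrative and do not mirror the kernel's actual structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-architecture operations table; not the kernel's struct. */
struct model_dma_ops {
	int (*mapping_error)(uint64_t handle); /* NULL: no error detection */
};

#define MODEL_ERROR_CODE UINT64_MAX /* hypothetical arch-specific error cookie */

/* An architecture that signals failure with a reserved cookie value. */
static int cookie_mapping_error(uint64_t handle)
{
	return handle == MODEL_ERROR_CODE;
}

static const struct model_dma_ops no_detection_ops = { NULL };
static const struct model_dma_ops cookie_ops = { cookie_mapping_error };

/* One portable entry point: drivers call this and nothing else. */
static int model_dma_mapping_error(const struct model_dma_ops *ops,
				   uint64_t handle)
{
	if (ops->mapping_error)
		return ops->mapping_error(handle); /* arch-specific check */
	return 0; /* no detection implemented: report success */
}
```

A driver that compares the handle against a cookie itself gets the wrong answer on the `no_detection_ops` variant, where that value carries no special meaning; going through the entry point gives the right answer on both.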

Why unmap after use?
Timely unmap of DMA buffers ensures buffers are
available when needed
Failure to unmap when a mapping error occurs in the
middle of a multi-page DMA map attempt is a problem
– equivalent to a memory leak condition
– leaves dangling DMA buffers that will never get
unmapped and reclaimed.
Note: failure to unmap is not a problem on some
architectures
– however, calling dma_unmap() from drivers is still
good practice.

Next steps
possible enhancements to DMA-debug API
– enhance to check for unmap errors.
– I am working on the following ideas – stay
tuned
• when should the unmap error check get
triggered?
–one possible option is when device object
is released.
• dynamic DMA-debug API?
http://linuxdriverproject.org/mediawiki/index.php/User_talk:Shuahkhan
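To make the "check when the device object is released" option concrete, here is a speculative userspace sketch of the idea. It does not describe an existing DMA-debug feature: the table, the `model_` functions and the release hook are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device tracking table for illustration only. */
struct model_mapping {
	int dev_id;
	uint64_t addr;
	bool in_use;
};

#define MAX_TRACKED 8
static struct model_mapping table[MAX_TRACKED];

/* Record a new mapping for a device. */
static void model_map(int dev_id, uint64_t addr)
{
	for (size_t i = 0; i < MAX_TRACKED; i++) {
		if (!table[i].in_use) {
			table[i] = (struct model_mapping){ dev_id, addr, true };
			return;
		}
	}
}

/* Drop the record when the driver unmaps. */
static void model_unmap(int dev_id, uint64_t addr)
{
	for (size_t i = 0; i < MAX_TRACKED; i++) {
		if (table[i].in_use && table[i].dev_id == dev_id &&
		    table[i].addr == addr) {
			table[i].in_use = false;
			return;
		}
	}
}

/*
 * Hypothetical release hook: anything still recorded for the device was
 * never unmapped. Returns the number of leaked mappings and clears them.
 */
static int model_check_on_release(int dev_id)
{
	int leaked = 0;

	for (size_t i = 0; i < MAX_TRACKED; i++) {
		if (table[i].in_use && table[i].dev_id == dev_id) {
			leaked++;
			table[i].in_use = false;
		}
	}
	return leaked;
}
```

Running the check at release time needs no cooperation from the driver, which matters because the drivers of interest are exactly the ones that never call unmap.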

Questions?

Thank you.

Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 

Recently uploaded (20)

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

Detecting Silent Data Corruptions using Linux DMA Debug API

  • 1. Detecting silent data corruptions and memory leaks using DMA Debug API
      Shuah Khan, Senior Linux Kernel Developer – Open Source Group, Samsung Research America (Silicon Valley), shuah.kh@samsung.com
      Open Source Group – Silicon Valley © 2013 SAMSUNG Electronics Co.
  • 2. Abstract
      Linux kernel drivers map and unmap Dynamic DMA buffers using the DMA API. DMA map operations can fail. Failure to check for errors can result in a variety of problems ranging from panics to silent data corruptions. Kernel panics can be fixed easily; data corruptions, however, are hard to debug.
      DMA mapping error analysis performed by the presenter found that more than 50% of map interface return values go unchecked in the kernel. Furthermore, several drivers fail to unmap buffers when an error occurs in the middle of a multi-page DMA mapping attempt.
      The presenter added a new DMA Debug interface in Linux 3.9 to detect missing mapping error checks.
      This talk will share the results of the analysis and discuss how to find and fix missing mapping error checks using the new interface. It will also discuss possible enhancements to the DMA Debug API to detect and flag unmap errors.
  • 3. Agenda
      – DMA API and its usage rules
      – DMA-debug API
      – DMA-debug API – what is missing?
      – Why check dma mapping errors?
      – Analysis results
      – After debug_dma_mapping_error()
      – Checking mapping errors (examples: incorrect and correct)
      – Use or not use unlikely()
      – dma_mapping_error()
      – Why unmap after use?
      – Next steps – possible enhancements to DMA-debug API
      – Questions
  • 4. DMA API
      Linux kernel supports Dynamic DMA mapping
      – dma_map_single() and dma_unmap_single()
      – dma_map_page() and dma_unmap_page()
      – dma_map_sg() and dma_unmap_sg()
      References:
      – https://www.kernel.org/doc/Documentation/DMA-API.txt
      – https://www.kernel.org/doc/Documentation/DMA-API-HOWTO.txt
      – https://www.kernel.org/doc/Documentation/DMA-attributes.txt
  • 5. DMA API usage rules
      Drivers
      – can map and unmap DMA buffers at run-time
      – should map buffers, use them, and unmap them when they are no longer needed
      – should not hoard buffers
      – should ensure all mapped buffers are unmapped
      – should use the generic DMA API as opposed to a bus-specific DMA API, e.g. pci_dma_*()
      – should call dma_mapping_error() to check for mapping errors before using the returned dma handle
  • 6. DMA-debug API
      – designed for debugging driver DMA API usage errors
      – keeps track of DMA mappings per device
      – debug_dma_map_page() - adds the newly mapped entry to keep track of it, and sets a flag to track missing mapping error checks
      – detects missing mapping error checks in driver code after DMA mapping
      – debug_dma_mapping_error() - checks and clears the flag set by debug_dma_map_page()
  • 7. DMA-debug API
      – detects unmap attempts on invalid dma addresses
      – generates a warning message for missing dma_mapping_error() calls, with the call trace leading up to dma_unmap()
      – debug_dma_unmap_page() - checks if the buffer is valid and checks the dma mapping error flag
      – CONFIG_HAVE_DMA_API_DEBUG and CONFIG_DMA_API_DEBUG enabled
  • 8. DMA-debug API
      – CONFIG_DMA_API_DEBUG disabled: dma_mapping_error() calls debug_dma_mapping_error(), which is a stub
      – CONFIG_DMA_API_DEBUG enabled: dma_mapping_error() calls debug_dma_mapping_error(), whose code is executed to clear the mapping error flag set by debug_dma_map()
  • 9. DMA-debug API – what is missing?
      Missing: Detecting missing unmap cases that would result in dangling DMA buffers
      Weakness: Detecting missing mapping error checks is done in debug_dma_unmap_page().
      – These go undetected when the driver fails to unmap.
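
The bookkeeping described on the DMA-debug slides can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel's implementation: every `model_` name, the fixed-size table, and the warning counter are hypothetical stand-ins for the roles played by debug_dma_map_page(), debug_dma_mapping_error(), and debug_dma_unmap_page().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_MAX_ENTRIES 16

struct model_entry {
    bool in_use;        /* slot holds a live mapping */
    bool err_checked;   /* driver has checked this mapping for errors */
    uint64_t dma_addr;  /* address handed back to the driver */
};

struct model_device {
    struct model_entry entries[MODEL_MAX_ENTRIES];
    unsigned int warnings;  /* stands in for the kernel warning + call trace */
};

static struct model_entry *model_find(struct model_device *dev, uint64_t addr)
{
    for (size_t i = 0; i < MODEL_MAX_ENTRIES; i++) {
        if (dev->entries[i].in_use && dev->entries[i].dma_addr == addr)
            return &dev->entries[i];
    }
    return NULL;
}

/* Role of debug_dma_map_page(): record the mapping, mark it "not checked". */
static bool model_debug_map(struct model_device *dev, uint64_t addr)
{
    assert(dev != NULL);
    for (size_t i = 0; i < MODEL_MAX_ENTRIES; i++) {
        struct model_entry *e = &dev->entries[i];
        if (!e->in_use) {
            e->in_use = true;
            e->err_checked = false;
            e->dma_addr = addr;
            return true;
        }
    }
    return false;  /* table full */
}

/* Role of debug_dma_mapping_error(): note that the driver did check. */
static void model_debug_mapping_error(struct model_device *dev, uint64_t addr)
{
    struct model_entry *e = model_find(dev, addr);
    if (e)
        e->err_checked = true;
}

/* Role of debug_dma_unmap_page(): validate the address and the check flag. */
static void model_debug_unmap(struct model_device *dev, uint64_t addr)
{
    struct model_entry *e = model_find(dev, addr);
    if (!e) {
        dev->warnings++;   /* unmap of an address that was never mapped */
        return;
    }
    if (!e->err_checked)
        dev->warnings++;   /* mapping was used without an error check */
    e->in_use = false;
}
```

In this model a mapping that is checked and then unmapped raises no warning, a mapping that is unmapped without a check raises one, and a mapping that is never unmapped raises none at all. The last case is the weakness noted above: because detection happens at unmap time, a driver that never unmaps is never flagged.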
  • 10. Why check dma mapping errors?
      Failure to check for a mapping error prior to using the address
      – could result in panics, silent data corruptions
      – panics could be fixed easily once they occur
      – data corruptions are very hard to debug, not to mention the damage they do.
  • 11. Why check dma mapping errors?
      Preventing errors is better than the alternative
      Detection
      – allows taking corrective action that is right for the condition
      – prevents uncertain failure modes
  • 12. DMA API error handling
      [Flowchart: dma_map_single() or dma_map_page() is followed either by dma_mapping_error() (bad: error handling, return; good: use buffer) or by no error checking (use buffer: unpredictable failure mode).]
  • 13. Analysis results
      First analysis (linux-next, August 6 2012):
      – large % (> 50%) of addresses returned by dma_map_single() and dma_map_page() go unchecked
      Current status (linux 3.12-rc5, October 2013):
      – large % (> 50%) of addresses returned by dma_map_single() and dma_map_page() go unchecked
      No change in the % of missing mapping error checks.
  • 14. Problem classification - broken
      Broken
      – no dma mapping error checks done on the returned address.
      Partially Broken
      – not all dma_map_single() and dma_map_page() calls are followed by mapping error checks.
      Unmap broken
      – checks dma mapping errors
      – doesn't unmap already mapped pages when a mapping error occurs in the middle of a multiple page mapping attempt.
  • 15. Problem classification - good
      Good
      – checks dma mapping errors correctly
      – checks dma mapping errors with unlikely()
      – unmaps already mapped pages when a mapping error occurs in the middle of a multiple page mapping attempt.
  • 16. dma_map_single() results
      – Broken - 46%
      – Partially broken - 11%
      – Unmap broken - 6%
      – Good - 35%
  • 17. dma_map_page() results
      – Broken - 59%
      – Partially broken - 11%
      – Unmap broken - 15%
      – Good - 19%
  • 18. Things considered and action taken
      – When do mapping errors get detected?
      – How often do these errors occur?
      – Why don't we see failures related to missing dma mapping error checks?
      – Are they silent failures?
      What is done - a new DMA-debug interface is added after the first analysis
  • 19. After debug_dma_mapping_error()
      – debug_dma_mapping_error() went into Linux-3.9
      – Several drivers have been fixed as a result of the warnings.
      – Intel drivers deserve a special mention: the drivers flagged in the first analysis have been fixed.
      – New code and drivers that fail to check errors have been added since the last analysis
  • 20. Checking mapping errors
      Incorrect example 1: Non-generic and not portable - depends on architecture specific dma implementations

          dma_addr_t dma_handle;

          dma_handle = dma_map_single(dev, addr, size, direction);
          if ((dma_handle & 0xffff != 0) || (dma_handle >= 0x1000000)) {
                  goto map_error;
          }
  • 21. Checking mapping errors
      Incorrect example 2: Non-generic and not portable - depends on architecture specific DMA_ERROR_CODE definitions

          dma_addr_t dma_handle;

          dma_handle = dma_map_single(dev, addr, size, direction);
          if (dma_handle == DMA_ERROR_CODE) {
                  goto map_error;
          }
  • 22. Checking mapping errors
      Generic and portable:

          dma_addr_t dma_handle;

          dma_handle = dma_map_page(dev, page, offset, size, direction);
          if (dma_mapping_error(dev, dma_handle)) {
                  /*
                   * reduce current DMA mapping usage,
                   * delay and try again later or
                   * reset driver.
                   */
                  goto map_error_handling;
          }
  • 23. Use or not use unlikely()
      likely() and unlikely() aren't efficient in all cases. Don't use them, especially in driver code.

          if (unlikely(dma_mapping_error(dev, dma_handle))) { -- }

      More on this topic:
      – https://lkml.org/lkml/2012/10/18/150
      – http://blog.man7.org/2012/10/how-much-do-builtinexpect-likely-and.html
  • 24. dma_mapping_error()
      It is implemented by all DMA implementations:
      – the ones that don't implement it simply return 0, e.g. arch/openrisc/include/asm/dma-mapping.h
      – some architectures return DMA_ERROR_CODE, e.g. arch/sparc/include/asm/dma-mapping.h
      – some implement it by invoking the underlying dma_ops, e.g. arch/x86/include/asm/dma-mapping.h
      It is good practice to use dma_mapping_error() to check for errors and let the underlying DMA layer handle the architecture specifics.
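
To illustrate why a single generic check is more portable than comparing against one architecture's error value, here is a minimal userspace sketch. It is not kernel code: the `model_` types, the two backends, and their error conventions (an all-ones cookie and address 0) are assumptions made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types; they only mimic the shape of a dma_ops table. */
typedef uint64_t model_dma_addr_t;

struct model_dma_ops {
    /* Optional hook; NULL when the backend never reports mapping failures. */
    int (*mapping_error)(model_dma_addr_t addr);
};

/* Backend A (assumed convention): an all-ones cookie signals failure. */
#define MODEL_A_ERROR_COOKIE ((model_dma_addr_t)~0ULL)

static int model_a_mapping_error(model_dma_addr_t addr)
{
    return addr == MODEL_A_ERROR_COOKIE;
}

/* Backend B (assumed convention): address 0 signals failure. */
static int model_b_mapping_error(model_dma_addr_t addr)
{
    return addr == 0;
}

static const struct model_dma_ops model_ops_a = { .mapping_error = model_a_mapping_error };
static const struct model_dma_ops model_ops_b = { .mapping_error = model_b_mapping_error };
static const struct model_dma_ops model_ops_none = { .mapping_error = NULL };

/*
 * Generic check: the caller never needs to know which convention applies.
 * Backends without a hook are treated as "no error", i.e. return 0.
 */
static int model_dma_mapping_error(const struct model_dma_ops *ops,
                                   model_dma_addr_t addr)
{
    assert(ops != NULL);
    if (ops->mapping_error)
        return ops->mapping_error(addr);
    return 0;
}
```

A caller that hard-codes backend A's cookie would miss every failure on backend B, and the reverse. Routing the question through one function keeps driver code identical across both.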
  • 25. Why unmap after use?
      Timely unmap of DMA buffers ensures buffers are available when needed
      Failure to unmap when a mapping error occurs in the middle of a multi-page DMA map attempt is a problem
      – equivalent to a memory leak condition
      – leaves dangling DMA buffers that will never get unmapped and reclaimed.
      Note: failure to unmap is not a problem on some architectures
      – however, calling dma_unmap() from drivers is good practice.
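
The unwind pattern for a multi-page mapping attempt can be sketched in userspace as follows. This is an illustration only: `mock_map_page()`, `mock_mapping_error()`, `mock_unmap_page()`, and `struct mock_dev` are hypothetical stand-ins for the real DMA API, with a configurable failure point so the error path can be exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical failure value for the mock; not a kernel definition. */
#define MOCK_MAPPING_ERROR UINT64_MAX

struct mock_dev {
    size_t fail_at;      /* index of the map call that fails; SIZE_MAX = never */
    size_t calls;        /* map calls made so far */
    size_t outstanding;  /* mappings that have not been unmapped yet */
};

static uint64_t mock_map_page(struct mock_dev *dev, size_t page_index)
{
    if (dev->calls++ == dev->fail_at)
        return MOCK_MAPPING_ERROR;
    dev->outstanding++;
    return 0x1000u * (uint64_t)(page_index + 1);
}

static bool mock_mapping_error(uint64_t handle)
{
    return handle == MOCK_MAPPING_ERROR;
}

static void mock_unmap_page(struct mock_dev *dev, uint64_t handle)
{
    assert(!mock_mapping_error(handle));
    assert(dev->outstanding > 0);
    dev->outstanding--;
}

/*
 * Map n pages. On failure, unmap every page mapped so far before returning,
 * so that no dangling mapping is left behind.
 * Returns 0 on success, -1 on failure.
 */
static int map_pages_or_unwind(struct mock_dev *dev, uint64_t *handles, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        handles[i] = mock_map_page(dev, i);
        if (mock_mapping_error(handles[i]))
            goto unwind;
    }
    return 0;

unwind:
    /* i is the index that failed; release indices i-1 down to 0. */
    while (i-- > 0)
        mock_unmap_page(dev, handles[i]);
    return -1;
}
```

A driver in the "unmap broken" category would return at the failure point without the unwind loop, leaving the earlier pages mapped.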
  • 26. Next steps
      Possible enhancements to DMA-debug API
      – enhance to check for unmap errors.
      – I am working on the following ideas – stay tuned
        • when should the unmap error check get triggered? One possible option is when the device object is released.
        • dynamic DMA-debug API?
      http://linuxdriverproject.org/mediawiki/index.php/User_talk:Shuahkhan
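
One way to picture the "check at device release" option is a per-device count of outstanding mappings that is inspected when the device goes away. The sketch below is a userspace illustration of that idea only; the slides do not describe a design, and every `leak_model_` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device counters; not part of any existing kernel API. */
struct leak_model_device {
    size_t mapped;    /* successful map calls recorded so far */
    size_t unmapped;  /* unmap calls recorded so far */
};

static void leak_model_map(struct leak_model_device *dev)
{
    dev->mapped++;
}

static void leak_model_unmap(struct leak_model_device *dev)
{
    assert(dev->unmapped < dev->mapped);  /* cannot unmap more than was mapped */
    dev->unmapped++;
}

/* At release time, report how many mappings were never unmapped. */
static size_t leak_model_release(const struct leak_model_device *dev)
{
    return dev->mapped - dev->unmapped;
}
```

A non-zero result at release would correspond to the dangling DMA buffers that the current unmap-time checks cannot see.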
  • 27. Questions?
  • 28. Thank you.
      Shuah Khan, Senior Linux Kernel Developer – Open Source Group, Samsung Research America (Silicon Valley), shuah.kh@samsung.com