1. The presenter compared the graphics rendering performance of Metal to UIImageView to learn about GPU usage.
2. Metal initially appeared 10-20x faster than UIImageView for rendering images, but proved slower once the measurement code was corrected to include the GPU processing time.
3. Two key problems were identified with the Metal implementation: processing on the CPU was blocking the GPU, and texture loading was a bottleneck.
4. Optimizations including combining operations, caching textures, and ensuring resources were in GPU memory improved the Metal performance.
3. Today’s Goal
• Learn “how to use Metal”
• Be conscious of the GPU layer through Metal
4. Agenda
• Compare the graphics rendering performance of Metal to UIImageView
→ Learn a lot about the GPU
1. UIKit is well optimized for the GPU.
2. Consider the GPU as well when measuring performance.
3. Pay attention to the processing flow between the CPU and the GPU.
4. Be careful about where the resource is located.
9. Difference Between CPU and GPU
CPU is a Sports Car
• Very fast
• Can't process many tasks in parallel
GPU is a Bus
• Not as fast as a CPU
• Can process many “same” tasks in parallel
10. • The CPU is very fast and good for any task (general-purpose processor)
- However, if it is used to process everything, it will easily reach 100% load.
→ Use the GPU as much as possible if the task suits it (= can be computed in parallel)
29. Sample App for the comparison
• Render large images in table cells.
- 5120 x 3200 (elcapitan.jpg)
- 1245 x 1245 (sierra.png)
30. Measuring Code
let time1 = CACurrentMediaTime()
if isMetal {
    // Render with Metal
    let metalCell = cell as! MetalTableViewCell
    metalCell.metalImageView.textureName = name
} else {
    // Render with UIImageView
    let uikitCell = cell as! TableViewCell
    uikitCell.uiImageView.image = UIImage(named: name)
}
let time2 = CACurrentMediaTime()
// Time interval
print("time: \(time2 - time1)")
31. Results
• Metal is 10x - 20x faster!
Time to render an image
UIImageView 0.4 - 0.6 msec
Metal 0.02 - 0.05 msec
iPhone 6s
35. 1. Load image data to memory for GPU (& CPU)
2. CPU creates GPU commands as a command buffer
3. Push it to GPU
4. GPU processes the commands
36. let time1 = CACurrentMediaTime()
if isMetal {
    let metalCell = cell as! MetalTableViewCell
    metalCell.metalImageView.textureName = name
} else {
    let uikitCell = cell as! TableViewCell
    uikitCell.uiImageView.image = UIImage(named: name)
}
let time2 = CACurrentMediaTime()
print("time: \(time2 - time1)")
37. 1. Load image data to memory for GPU (& CPU)
2. CPU creates GPU commands as a command buffer
3. Push it to GPU
4. GPU processes the commands
NOT considered!
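Why the first measurement looked so good can be illustrated without any Metal code. The sketch below is a simulation only: a serial DispatchQueue stands in for the GPU, and the names `measureSimulatedSubmission` and `fakeGPU` as well as the 50 ms work duration are made up for illustration. It shows that a timer stopped right after submitting work says almost nothing about how long the work itself takes.

```swift
import Foundation

/// Simulates "submit work, then measure" versus "submit, wait, then measure".
/// The queue below is a stand-in for the GPU; no Metal API is involved.
func measureSimulatedSubmission(workDuration: TimeInterval)
    -> (afterSubmit: TimeInterval, afterCompletion: TimeInterval) {
    let fakeGPU = DispatchQueue(label: "fake-gpu")
    let finished = DispatchSemaphore(value: 0)
    let start = Date()

    // "Commit": returns immediately, the work runs later on the other queue.
    fakeGPU.async {
        Thread.sleep(forTimeInterval: workDuration)
        finished.signal()
    }
    // This is what the first measuring code effectively captured.
    let afterSubmit = Date().timeIntervalSince(start)

    // Block until the simulated GPU work is done, then measure again.
    finished.wait()
    let afterCompletion = Date().timeIntervalSince(start)
    return (afterSubmit, afterCompletion)
}

let timings = measureSimulatedSubmission(workDuration: 0.05)
print("after submit:     \(timings.afterSubmit) s")
print("after completion: \(timings.afterCompletion) s")
```

The first number only covers the cost of handing the work over; the second includes the work itself, which is the part the original measurement left out.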
38. • Measure the time until the GPU processing is completed
Fixed measuring code
func draw(in view: MTKView) {
    // Prepare the command buffer
    ...
    // Submit the commands to the GPU
    commandBuffer.commit()
    // Wait until the GPU processing is completed
    commandBuffer.waitUntilCompleted()
    // Calculate the total time
    let endTime = CACurrentMediaTime()
    print("Time: \(endTime - startTime)")
}
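waitUntilCompleted() blocks the calling thread while the GPU works. As an alternative to the code in the slide, and not the presenter's method, the measurement can be taken in a completion handler instead. The sketch below assumes an OS version that provides the `gpuStartTime` and `gpuEndTime` properties of `MTLCommandBuffer`; the helper name `commitAndMeasure` and its `report` callback are hypothetical.

```swift
#if canImport(Metal)
import Metal
import QuartzCore

/// Commits `commandBuffer` and reports timings from a completion handler,
/// so the calling thread is not blocked while the GPU works.
/// `commitAndMeasure` and `report` are illustrative names, not Metal API.
func commitAndMeasure(
    _ commandBuffer: MTLCommandBuffer,
    report: @escaping @Sendable (_ total: CFTimeInterval, _ gpuOnly: CFTimeInterval) -> Void
) {
    let startTime = CACurrentMediaTime()

    // Completion handlers have to be registered before commit().
    commandBuffer.addCompletedHandler { buffer in
        // Wall-clock time from submission until the GPU finished.
        let total = CACurrentMediaTime() - startTime
        // Time the GPU itself spent executing this command buffer.
        let gpuOnly = buffer.gpuEndTime - buffer.gpuStartTime
        report(total, gpuOnly)
    }
    commandBuffer.commit()
}
#endif
```

Comparing the two values gives a rough idea of how much of the total is GPU execution and how much is scheduling and waiting.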
39. Results
• Metal is SLOWER!?
- Less than 30 fps even in the best case
→ My implementation must have problems
• UIImageView is fast enough anyway.
Time to render an image
UIImageView 0.4 - 0.6 msec
Metal 40 - 200 msec
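The "less than 30 fps" remark follows directly from the measured frame times. A small worked calculation, using only the 40 msec and 200 msec figures from the table above (the function name `framesPerSecond` is just for illustration):

```swift
/// Frames per second achievable if every frame takes `frameTimeMs` milliseconds.
func framesPerSecond(frameTimeMs: Double) -> Double {
    return 1000.0 / frameTimeMs
}

let bestCase = framesPerSecond(frameTimeMs: 40)    // 1000 / 40  = 25 fps
let worstCase = framesPerSecond(frameTimeMs: 200)  // 1000 / 200 = 5 fps
let budgetFor30fps = 1000.0 / 30.0                 // about 33.3 msec per frame

print("best case:  \(bestCase) fps")
print("worst case: \(worstCase) fps")
print("frame budget for 30 fps: \(budgetFor30fps) msec")
```

Even the best case of 40 msec exceeds the roughly 33.3 msec budget that 30 fps allows.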
42. • UIKit has been continually updated and is well optimized.
• Use UIKit rather than building a custom UI component with low-level APIs (e.g. Metal), unless there is a particular reason the custom component would be better.
52. Current processing flow
1. Resize with MPSImageLanczosScale
2. After 1 is completed, call setNeedsDisplay()
3. draw(in:) of MTKViewDelegate is called
4. Render to screen in the draw(in:)
Problem
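The transcript does not show the fix for this flow. One possible approach, offered here only as an assumption, is to encode the resize into the same command buffer as the render pass inside draw(in:), so the CPU never waits for step 1 to finish. The helper name `encodeResize` is hypothetical; `MPSImageLanczosScale` and its `encode(commandBuffer:sourceTexture:destinationTexture:)` method are part of Metal Performance Shaders.

```swift
#if canImport(MetalPerformanceShaders)
import Metal
import MetalPerformanceShaders

/// Encodes the Lanczos resize into `commandBuffer` without committing it,
/// so the caller can encode the render pass into the same buffer afterwards.
/// `encodeResize` is an illustrative helper, not part of any framework.
func encodeResize(scaler: MPSImageLanczosScale,
                  from source: MTLTexture,
                  to destination: MTLTexture,
                  in commandBuffer: MTLCommandBuffer) {
    // `destination` must have been created with .shaderWrite usage.
    scaler.encode(commandBuffer: commandBuffer,
                  sourceTexture: source,
                  destinationTexture: destination)
}

// Inside draw(in:), the flow would then look roughly like this:
//   encodeResize(...)                   // resize, no CPU-side wait
//   ... encode the render pass ...      // render, same command buffer
//   commandBuffer.present(drawable)
//   commandBuffer.commit()              // one submission for both steps
#endif
```

With both steps in one command buffer, the GPU runs them back to back and the CPU does not have to wait in between.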
63. Measure the time to load textures
let startTime = CACurrentMediaTime()
textureLoader.newTexture(name: name, scaleFactor: scaleFactor, bundle: nil) { (texture, error) in
    let endTime = CACurrentMediaTime()
    print("Time to load \(name): \(endTime - startTime)")
}
• Results: 20 - 500 msec
→ It’s the bottleneck!
64. Fix: Cache the loaded textures
• UIImage(named:) caches internally, too
• “Caching loaded image data” is NOT a Metal/GPU
specific idea.
65. Metal/GPU specific point:
“Where is the resource?”
Memory for GPU
(& CPU)
private var cachedTextures: [String: MTLTexture] = [:]  // OK
private var cachedImages: [String: UIImage] = [:]  // NG (no good)
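The caching idea can be sketched as a small name-keyed cache. This is a minimal sketch, not the presenter's implementation: `ResourceCache`, its `load` closure and `resource(named:)` are hypothetical names. In the Metal case the generic parameter would be `MTLTexture`, so that what is cached is already the resource the GPU reads, and the loader would wrap the texture loading call.

```swift
/// A minimal cache that loads each named resource at most once.
/// For Metal, `Resource` would be MTLTexture (not UIImage), so the cached
/// value is the GPU-readable resource itself.
final class ResourceCache<Resource> {
    private var storage: [String: Resource] = [:]
    private let load: (String) -> Resource?

    /// `load` is the expensive operation, e.g. loading a texture by name.
    init(load: @escaping (String) -> Resource?) {
        self.load = load
    }

    func resource(named name: String) -> Resource? {
        // Cache hit: skip the expensive load entirely.
        if let cached = storage[name] {
            return cached
        }
        // Cache miss: load once and remember the result.
        guard let loaded = load(name) else {
            return nil
        }
        storage[name] = loaded
        return loaded
    }
}
```

The sketch is not thread-safe and never evicts entries; with images as large as 5120 x 3200, a real cache would likely need a size limit.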
70. Today’s Goal
• Learn “how to use Metal”
• Be conscious of the GPU layer through Metal
71. • Compared the graphics rendering performance of Metal to UIImageView
→ Learned a lot about the GPU
1. UIKit is well optimized for the GPU.
2. Consider the GPU as well when measuring performance.
3. Pay attention to the processing flow between the CPU and the GPU.
4. Be careful about where the resource is located.