Debug, Analyze and Optimize Games with Intel Tools
Surviving the apocalypse on mainstream graphics
Matteo Valoriani, FifthIngenium CEO
Intel Software Innovator
2. Nice to Meet You
mvaloriani at gmail dot com
CEO of FifthIngenium
PhD at Politecnico of Milano
Speaker and Consultant
System Analyzer / HUD
• Get metrics for CPU, GPU, graphics drivers, DirectX*,
OpenGL*, or OpenGL* ES
• Experiment with override modes that quickly isolate
common performance bottlenecks
• Capture frames and traces for further analysis
• Display up to 16 performance metrics
• Monitor the current, minimum, and maximum frame rate
• Use without code modifications or special libraries
Graphics Frame Analyzer
• Use the API log to trace visual errors to specific function calls and to
review errors and warnings returned by graphics API calls
• Select a draw call and verify its contribution to the frame,
alpha channel, color, format, and depth buffers
• Quantify performance optimization opportunities with
render experiments per draw call
• Solve issues with shadowing, lighting, or color schemes by
locating misplaced objects
Script Frustum Culling and Co-routines
Use the MonoBehaviour callbacks OnBecameVisible() and
OnBecameInvisible() to disable scripts outside of the camera
frustum that do not need to update while off screen.
Unity triggers these callbacks when the object carrying the
script enters or leaves the camera frustum.
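The callback pattern above can be sketched outside the engine. The following is a minimal Python simulation, not Unity code: the class CulledScript, the helper run_frames and the per-frame visibility list are hypothetical names introduced for illustration. It only shows that an object whose update is gated by visibility callbacks does less per-frame work.

```python
class CulledScript:
    """Hypothetical stand-in for a script that only updates while on screen."""

    def __init__(self):
        self.enabled = False      # stays disabled until the object becomes visible
        self.update_count = 0

    def on_became_visible(self):
        # Mirrors the idea of re-enabling the script when it enters the frustum.
        self.enabled = True

    def on_became_invisible(self):
        # Mirrors the idea of disabling the script when it leaves the frustum.
        self.enabled = False

    def update(self):
        self.update_count += 1


def run_frames(script, visibility_per_frame):
    """Simulated main loop: fire callbacks on visibility changes, then update."""
    was_visible = False
    for visible in visibility_per_frame:
        if visible and not was_visible:
            script.on_became_visible()
        elif not visible and was_visible:
            script.on_became_invisible()
        was_visible = visible
        if script.enabled:
            script.update()
    return script.update_count
```

With a visibility sequence of off, on, on, off, on, the script updates on three of the five frames instead of all five.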
Script Frustum Culling and Co-routines (2)
Co-routines are essentially functions with the ability to
pause and resume execution.
The power of co-routines can be leveraged by
removing the original Update() function in your script
and replacing it with a co-routine.
You can then set how often you would like your
co-routine to execute using the yield statement.
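As a rough illustration of the idea, a Python generator behaves much like a co-routine: it pauses at each yield and resumes where it left off. The sketch below is an engine-agnostic simulation, not Unity code; throttled_task, run_scheduler and the period_frames parameter are hypothetical names, and the "frame" is just one loop iteration.

```python
def throttled_task(work, period_frames):
    """Generator-based co-routine: run `work` once every `period_frames` frames."""
    if period_frames < 1:
        raise ValueError("period_frames must be at least 1")
    while True:
        work()
        # Each yield hands control back to the scheduler until the next frame.
        for _ in range(period_frames):
            yield


def run_scheduler(coroutine, total_frames):
    """Simulated main loop: resume the co-routine once per frame."""
    for _ in range(total_frames):
        next(coroutine)
```

Over 10 simulated frames with a period of 3, the work function runs 4 times (frames 0, 3, 6 and 9), where an every-frame update would have run it 10 times.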
Memory Management Optimization
A great way to get an overview of how you are managing memory is to check the ‘GC
Alloc’ section of the Overview window in Unity Profiler and step through your frames until
you see a significant allocation.
• To avoid frequent heap allocations, prefer structs to classes for small, short-lived data:
struct locals are allocated on the stack rather than on the heap.
• Frequent heap allocations can lead to significant memory fragmentation and
frequent garbage collections.
Occlusion Culling
1. Go through your entire scene to multi-select
any objects that should be included in
occlusion culling calculations and mark them
as “Occluder Static” and “Occludee Static”.
2. When setting up your occlusion culling system,
set your occlusion areas carefully. By default,
Unity uses the entire scene as the occlusion
area, which can lead to frivolous computation.
3. To make sure that the entire scene isn’t used,
create an occlusion area manually and
surround only the area to be included in the
occlusion culling calculations.
Level of Detail (LOD) allows multiple meshes to be attached to a game object and lets the
object switch between them based on camera distance, so that a simpler mesh is rendered
when the object is far from the camera.
LOD L0 L1 L2
fps 160 180 220
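The distance-based mesh switch can be sketched as a simple threshold lookup. This is a minimal Python illustration under stated assumptions, not how Unity implements it: the switch distances of 10 and 30 world units and the names LOD_SWITCH_DISTANCES, LOD_MESHES and select_lod are hypothetical.

```python
import bisect

# Hypothetical switch distances in world units:
# closer than 10 uses L0, from 10 up to 30 uses L1, 30 and beyond uses L2.
LOD_SWITCH_DISTANCES = [10.0, 30.0]
LOD_MESHES = ["L0", "L1", "L2"]


def select_lod(camera_distance):
    """Return the name of the mesh to render at the given camera distance."""
    if camera_distance < 0:
        raise ValueError("camera_distance must be non-negative")
    # bisect_right counts how many switch distances have been reached.
    index = bisect.bisect_right(LOD_SWITCH_DISTANCES, camera_distance)
    return LOD_MESHES[index]
```

For example, select_lod(5.0) picks the full-detail mesh L0, while select_lod(100.0) picks the simplest mesh L2.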
• Sampler limited
• No dynamic branching
• Optimized for legacy HW where sampling was faster than computing LODs
• Implementation of dynamic branching increased perf by 2x (3 ms -> 1.5 ms)
• Using SampleGrad for dynamic LOD selection