SlideShare a Scribd company logo
1 of 58
Download to read offline
Stable SSAO in Battlefield 3 with
Selective Temporal Filtering

Louis Bavoil                    Johan Andersson
Developer Technology Engineer   Rendering Architect
NVIDIA                          DICE / EA
SSAO
●   Screen Space Ambient Occlusion (SSAO)
    ●   Has become de-facto approach for rendering AO in
        games with no precomputation
    ●   Key Idea: use depth buffer as approximation of the
        opaque geometry of the scene
    ●   Large variety of SSAO algorithms, all taking as input
        the scene depth buffer

                                                           [Kajalin 09]
                                                   [Loos and Sloan 10]
                                                    [McGuire et al. 11]
HBAO
●   Horizon-Based Ambient Occlusion (HBAO)
    ●   Considers the depth buffer as a heightfield, and
        approximates ray-tracing this heightfield
    ●   Improved implementation available in NVIDIA’s SDK11
        (SSAO11.zip / HBAO_PS.hlsl)
    ●   Used for rendering SSAO in Battlefield 3 / PC for its
        “High” and “Ultra” Graphics Quality presets

                                                 [Bavoil and Sainz 09a]
                                                         [Andersson 10]
                                         [White and Barré-Brisebois 11]
[Andersson 10]
Original HBAO
Implementation
in Frostbite 2




Frame time (GPU):
25.2 ms

Total HBAO time:
2.5 ms (10% of frame)

[1920x1200 “High” DX11
GeForce GTX 560 Ti]
Screenshot: With HBAO
Screenshot: HBAO Only




            The HBAO looked good-enough on screenshots…
Video: Flickering HBAO




        …but produced objectionable flickering in motion on thin objects
          such as alpha-tested foliage (grass and trees in particular)
Our Constraints
●   PC-only
    ●   In Frostbite 2, HBAO implemented only for DX10 & DX11


●   Low Perf Hit, High Quality
        ●   Whole HBAO was already 2.5 ms (1920x1200 / GTX 560 Ti)
        ●   HBAO used in High and Ultra presets, had to look great
Considered Workarounds
●   Full-resolution SSAO or dual-resolution SSAO (*)
    …but that more-than-doubled the cost of the SSAO, and
    some flickering could remain

●   Brighten SSAO on the problematic objects
    …but we wanted a way to keep full-strength SSAO on
    everything (in particular on foliage)


                                         (*) [Bavoil and Sainz 09b]
Temporal Filtering Approach
●   By definition, AO depends only on the scene
    geometry, not on the camera
    ●   For static (or nearly-static geometry), can re-project AO
        from previous frame
    ●   Reduce AO variance between frames by using a temporal
        filter: newAO = lerp(newAO,previousAO,x)
●   Known approach used in Gears of War 2

                                                       [Nehab et al. 07]
                                               [Smedberg and Wright 09]
Previous
Camera                    Reverse
(Frame i-1)
                          Reprojection
                          [Nehab et al. 07]




Current       (uvi, zi)
Camera
(Frame i)
Previous
Camera                                        Reverse
(Frame i-1)
                                              Reprojection
                                              [Nehab et al. 07]




                          1. Current
              (uvi, zi)   ViewProjection-1
Current
Camera
(Frame i)                                Pi
Previous
Camera                                            Reverse
(Frame i-1)   uvi-1
                                                  Reprojection
                                                  [Nehab et al. 07]




                          2. Previous
                          ViewProjection

              (uvi, zi)
Current
Camera                    1. Current
                          ViewProjection-1   Pi
(Frame i)
Previous
Camera                                                       Reverse
(Frame i-1)   (uvi-1, zi-1)
                                                             Reprojection
                                                             [Nehab et al. 07]
                              3. Fetch
                              Previous
                              View Depth


                                2. Previous           Pi-1
                                ViewProjection
              (uvi, zi)
Current
Camera                    1. Current
                          ViewProjection-1       Pi
(Frame i)
Previous
Camera                                                          Reverse
(Frame i-1)   (uvi-1, zi-1)
                                                                Reprojection
                                                                [Nehab et al. 07]
                              3. Fetch Previous
                              View Depth




                                                       Pi-1
                                 2. Previous
                                 ViewProjection
              (uvi, zi)
                                                              4. If Pi ~= Pi-1
Current                                                       re-use AO(Pi-1)
Camera                    1. Current
(Frame i)                 ViewProjection-1        Pi
Temporal Refinement                              [Mattausch et al. 11]


     (uvi-1, zi-1)
                                If Pi-1 ~= Pi
                                     AOi = (Ni-1 AOi-1 + AOi) / (Ni-1 + 1)
                                     Ni = min(Ni-1 + 1, Nmax)
                                Else
                                     AOi = AOi
                        Pi-1
                                     Ni = 1
     (uvi, zi)

                      Pi

Ni = num. frames that have been accumulated in current solution at Pi
Nmax = max num. frames (~8), to keep AOi-1 contributing to AOi
Disocclusion Test          [Mattausch et al. 11]



   Pi ~= Pi-1 
   wi     = ViewDepth(Viewi-1, Pi)
   wi-1   = ViewDepth(Viewi-1, Pi-1)
Relaxed Disocclusion Test

      Pi ~= Pi-1 

To support nearly-static objects
  ●   Such as foliage waving in the wind (grass, trees, …)
  ●   We simply relaxed the threshold (used ε = 10%)
Video: Temporal Filtering with ε=+Inf




      Flickering is fixed, but there are ghosting artifacts due to disocclusions
Video: Temporal Filtering with ε=0.1%




        Flickering on the grass (nearly static), but no ghosting artifacts
Video: Temporal Filtering with ε=10%




          No flickering, no ghosting, 1% perf hit (25.2 -> 25.5 ms)
Video: Trailing Artifacts




    New issue: trailing artifacts on static objects receiving AO from dynamic objects
Observations
1. With temporal filtering OFF
  The flickering pixels are mostly on
  foliage. The other pixels do not have any
  objectionable flickering.

2. With temporal filtering ON
  The trailing artifacts (near the
  character's feet) are not an issue on
  foliage pixels.
Selective Temporal Filtering
●   Assumption
    The set of flickering pixels and the set of trailing
    pixels are mutually exclusive

●   Our Approach:
    1. Classify the pixels as stable (potential trailing) or
    unstable (potential flickering)
    2. Disable the temporal filter for the stable pixels
Pixel Classification Approach
●   Normal reconstruction in SSAO shader
    ●   Px = ||P - Pleft|| < ||P - Pright|| ? Pleft : Pright        Pt
    ●   Py = ||P - Ptop|| < ||P - Pbottom|| ? Ptop : Pbottom   Pl   P    Pr
    ●   N = ± normalize(cross(P - Px, P - Py))
                                                                    Pb

●   Idea: If reconstructed normal is noisy,
    the SSAO will be noisy
Piecewise Continuity Test
                          1.   Select nearest
                               neighbor Px between
                               Pleft and Pright
Px = Pleft   P




                 Pright
Piecewise Continuity Test
                          1.   Select nearest
                               neighbor Px between
                               Pleft and Pright
Px = Pleft   P
                          2.   Continuous pixels 
                               || Px – P || < L
                               where L = distance threshold
                               (in view-space units)

                 Pright
Pixel Classification Examples
Two Half-Res Passes
 Pass 1: Output SSAO and continuity mask
  continuityMask = (|| Px – P || < L && || Py – P || < L)


 Pass 2: Dilate the discontinuities
  dilatedMask = all4x4(continuityMask)
Example Scene
Pass 1: Classification




                         Note: Distant pixels are all classified as “unstable” because ||P – Px/y||
                         increases with depth (perspective camera). Luckily for us, trailing
                         artifacts were not an issue in the distance so this was not an issue.
Pass 2: 4x4 Dilation

Pixel classified as unstable if it
has at least one discontinuity in
its neighborhood




                                     4 Gather4 instructions on DX11
Selective Temporal Filtering

Enable temporal filter only:
- for unstable pixels
- with (|1 – wi /wi-1| < ε)
Video: HBAO + Selective Temporal Filtering
Video: Final Result
Final Pipeline with
Selective Temporal
Filtering (STF)

STF Performance Hit
[1920x1200 “High”, GTX 560 Ti]
•   HBAO total: 2.5 ms -> 2.9 ms
•   Frame time (GPU): 25.2 -> 25.6 ms
    (1.6% performance hit)

STF Parameters
•   Reprojection Threshold: For detecting
    disocclusions (ε=10%)
•   Distance Threshold: For detecting
    discontinuities (L=0.1 meter)
•   Dilation Kernel Size (4x4 texels)
History Buffers
●   Additional GPU memory required for the history buffers
     ●   For (AOi, Zi, Ni) and (AOi-1, Zi-1, Ni)
     ●   Full-res, 1xMSAA


●   For Multi-GPU Alternate Frame Rendering (AFR) rendering
     ●   Create one set of buffers per GPU and alternate between them
     ●   Each AFR GPU has its own buffers & reprojection state


●   The history buffers are cleared on first use
     ●   Clear values: (AO,Z)=(1.f, 0.f) and N=0
SelectiveTemporalFilter(uvi, AOi)                Unoptimized
   zi = Fetch(ZBufferi, uvi)
   Pi = UnprojectToWorld(uvi, zi)
                                                 pseudo-code
   uvi-1 = ProjectToUV(Viewi-1, Pi)
   zi-1 = Fetch(ZBufferi-1, uvi-1)
                                                 For fetching zi-1, use
   Pi-1 = UnprojectToWorld(uvi-1, zi-1)
   wi = ViewDepth(Viewi-1 , Pi)
                                                 clamp-to-border to
   wi-1 = ViewDepth(Viewi-1 , Pi-1)              discard out-of-frame
   isStablePixel = Fetch(StabilityMask, uvi)     data, with borderZ=0.f
   if ( |1 – wi/wi-1 | < ε && !isStablePixel )
      AOi-1 = Fetch(AOTexturei-1, uvi-1)
      Ni-1 = Fetch(NTexturei-1, uvi-1)           For fetching AOi-1, use
      AOi = (Ni-1 AOi-1 + AOi) / (Ni-1 + 1)      bilinear filtering like in
      Ni = min(Ni-1 + 1, Nmax)                   [Nehab et al. 07]
   else
      Ni = 1
   return(AOi, wi, Ni)
Blur Optimization
Raw HBAO (Blur OFF)
Final HBAO (Blur ON)
Blur Overview
●   Full-screen pixel-shader passes
    ●   BlurX (horizontal)
    ●   BlurY (vertical)


●   BlurX takes as input
    ●   Half-res AO
    ●   Full-res linear depth (non-MSAA)
Blur Kernel
●   We use 1D Cross-Bilateral Filters (CBF)
●   Gaussian blur with depth-dependent weights


                 Sum[ AOi w(i,Zi,Z0), i=-R..R ]
      Output =
                  Sum[ w(i,Zi,Z0), i=-R..R ]

                                                       [Petschnigg et al.    04]
                                                  [Eisemann and Durand       04]
                                                             [Kopf et al.    07]
                                                            [Bavoil et al.   08]
                                                          [McGuire et al.    11]
Blur Opt: Adaptive Sampling

                      2/3 of total weights


                       1/3 of total weights

                       Can these samples be
                       approximated?
Blur Opt: Adaptive Sampling


                             Keep original samples


                             Replace pairs of samples
                             with in-between sample
     (AO,Z) fetches with
     hw bilinear filtering
Blur Opt: Adaptive Sampling
Brute-Force Sampling   BlurX cost: 0.8 ms
Adaptive Sampling   BlurX cost: 0.6 ms
Blur Opt: Speedup
GPU Time        Before           After            Speedup
Pack (AO,Z)     0.18 ms          0.18 ms          0%
BlurX           0.75 ms          0.58 ms          29%
BlurY+STF       1.00 ms          0.95 ms          5% (*)
Blur Total      1.93 ms          1.71 ms          13%

(*) Lower speedup due to the math overhead of
the Selective Temporal Filter (STF)
                                                Blur Radius: 8
                                                Resolution: 1920x1200
                                                GeForce GTX 560 Ti
Summary
Two techniques used in Battlefield 3 / PC
  1. A generic solution to fix SSAO flickering with
  a low perf hit (*) on DX10/11 GPUs
  2. An approximate cross-bilateral filter, using a
  mix of point and bilinear taps
(*) 0.4 ms in 1920x1200 on GeForce GTX 560 Ti
Questions?

Louis Bavoil
lbavoil@nvidia.com

Johan Andersson
johan.andersson@dice.se
References
[McGuire et al. 11] McGuire, Osman, Bukowski, Hennessy. The Alchemy
Screen-Space Ambient Obscurance Algorithm. Proceedings of ACM
SIGGRAPH / Eurographics High-Performance Graphics 2011 (HPG '11).
[White and Barré-Brisebois 11] White, Barré-Brisebois. More Performance!
Five Rendering Ideas from Battlefield 3 and Need For Speed: The Run.
Advances in Real-Time Rendering in Games. SIGGRAPH 2011.
[Mattausch et al. 11] Mattausch, Scherzer, Wimmer. Temporal Screen-
Space Ambient Occlusion. In GPU Pro 2. 2011.
[Loos and Sloan 10] Loos, Sloan. Volumetric Obscurance. ACM Symposium
on Interactive 3D Graphics and Games 2010.
References
[Andersson 10] Andersson. Bending the Graphics Pipeline. Beyond
Programmable Shading course, SIGGRAPH 2010.
[Bavoil and Sainz 09a] Bavoil, Sainz. Image-Space Horizon-Based Ambient
Occlusion. In ShaderX7. 2009.
[Bavoil and Sainz 09b] Bavoil, Sainz. Multi-Layer Dual-Resolution Screen-
Space Ambient Occlusion. SIGGRAPH Talk. 2009.
[Smedberg and Wright 09] Smedberg, Wright. Rendering Techniques in
Gears of War 2. GDC 2009.
[Kajalin 09] Kajalin. Screen Space Ambient Occlusion. In ShaderX7. 2009.
[Bavoil et al. 08] Bavoil, Sainz, Dimitrov. Image-Space Horizon-Based
Ambient Occlusion. SIGGRAPH Talk. 2008.
References
[Nehab et al. 07] Nehab, Sander, Lawrence, Tatarchuk, Isidoro.
Accelerating Real-Time Shading with Reverse Reprojection Caching. In ACM
SIGGRAPH/Eurographics Symposium on Graphics Hardware 2007.
[Kopf et al. 07] Kopf, Cohen, Lischinski, Uyttendaele. 2007. Joint Bilateral
Upsampling. In Proceedings of SIGGRAPH 2007.
[Petschnigg et al. 04] Petschnigg, Szeliski, Agrawala, Cohen, Hoppe.
Toyama: Digital photography with flash and no-flash image pairs. In
Proceedings of SIGGRAPH 2004.
[Eisemann and Durand 04] Eisemann, Durand. “Flash Photography
Enhancement via Intrinsic Relighting”. In Proceedings of SIGGRAPH 2004.
Bonus Slides
HLSL: Adaptive Sampling
float r = 1;
// Inner half of the kernel: step size = 1 and POINT filtering
[unroll] for (; r <= KERNEL_RADIUS/2; r += 1)
{
            float2 uv = r * deltaUV + uv0;
            float2 AOZ = mainTexture.Sample(pointClampSampler, uv).xy;
            processSample(AOZ, r, centerDepth, totalAO, totalW);
}
// Outer half of the kernel: step size = 2 and LINEAR filtering
[unroll] for (; r <= KERNEL_RADIUS; r += 2)
{
            float2 uv = (r + 0.5) * deltaUV + uv0;
            float2 AOZ = mainTexture.Sample(linearClampSampler, uv).xy;
            processSample(AOZ, r, centerDepth, totalAO, totalW);
}
HLSL: Cross-Bilateral Weights
// d and d0 = linear depths
float crossBilateralWeight(float r, float d, float d0)
{
   // precompiled by fxc
   const float BlurSigma = ((float)KERNEL_RADIUS+1.0f) * 0.5f;
   const float BlurFalloff = 1.f / (2.0f*BlurSigma*BlurSigma);

    // assuming that d and d0 are pre-scaled linear depths
    float dz = d0 - d;
    return exp2(-r*r*BlurFalloff - dz*dz);
}

More Related Content

What's hot

Performance Evaluation of SAR Image Reconstruction on CPUs and GPUs
Performance Evaluation of SAR Image Reconstruction on CPUs and GPUsPerformance Evaluation of SAR Image Reconstruction on CPUs and GPUs
Performance Evaluation of SAR Image Reconstruction on CPUs and GPUsFisnik Kraja
 
An evaluation of gnss code and phase solutions
An evaluation of gnss code and phase solutionsAn evaluation of gnss code and phase solutions
An evaluation of gnss code and phase solutionsAlexander Decker
 
Frequency domain methods
Frequency domain methods Frequency domain methods
Frequency domain methods thanhhoang2012
 
Cycle-Contrast for Self-Supervised Video Represenation Learning
Cycle-Contrast for Self-Supervised Video Represenation LearningCycle-Contrast for Self-Supervised Video Represenation Learning
Cycle-Contrast for Self-Supervised Video Represenation LearningQuan Kong
 
Tennis video shot classification based on support vector
Tennis video shot classification based on support vectorTennis video shot classification based on support vector
Tennis video shot classification based on support vectores712
 
04 1 - frequency domain filtering fundamentals
04 1 - frequency domain filtering fundamentals04 1 - frequency domain filtering fundamentals
04 1 - frequency domain filtering fundamentalscpshah01
 
Low level feature extraction - chapter 4
Low level feature extraction - chapter 4Low level feature extraction - chapter 4
Low level feature extraction - chapter 4Aalaa Khattab
 

What's hot (8)

Performance Evaluation of SAR Image Reconstruction on CPUs and GPUs
Performance Evaluation of SAR Image Reconstruction on CPUs and GPUsPerformance Evaluation of SAR Image Reconstruction on CPUs and GPUs
Performance Evaluation of SAR Image Reconstruction on CPUs and GPUs
 
An evaluation of gnss code and phase solutions
An evaluation of gnss code and phase solutionsAn evaluation of gnss code and phase solutions
An evaluation of gnss code and phase solutions
 
Frequency domain methods
Frequency domain methods Frequency domain methods
Frequency domain methods
 
Cycle-Contrast for Self-Supervised Video Represenation Learning
Cycle-Contrast for Self-Supervised Video Represenation LearningCycle-Contrast for Self-Supervised Video Represenation Learning
Cycle-Contrast for Self-Supervised Video Represenation Learning
 
BMC 2012 - Invited Talk
BMC 2012 - Invited TalkBMC 2012 - Invited Talk
BMC 2012 - Invited Talk
 
Tennis video shot classification based on support vector
Tennis video shot classification based on support vectorTennis video shot classification based on support vector
Tennis video shot classification based on support vector
 
04 1 - frequency domain filtering fundamentals
04 1 - frequency domain filtering fundamentals04 1 - frequency domain filtering fundamentals
04 1 - frequency domain filtering fundamentals
 
Low level feature extraction - chapter 4
Low level feature extraction - chapter 4Low level feature extraction - chapter 4
Low level feature extraction - chapter 4
 

More from MinGeun Park

[CSStudy] 코딩인터뷰 완전 분석 #7.pdf
[CSStudy] 코딩인터뷰 완전 분석 #7.pdf[CSStudy] 코딩인터뷰 완전 분석 #7.pdf
[CSStudy] 코딩인터뷰 완전 분석 #7.pdfMinGeun Park
 
[Cs study] 코딩인터뷰 완전 분석 #6
[Cs study] 코딩인터뷰 완전 분석 #6[Cs study] 코딩인터뷰 완전 분석 #6
[Cs study] 코딩인터뷰 완전 분석 #6MinGeun Park
 
[Cs study] 코딩인터뷰 완전 분석 #5
[Cs study] 코딩인터뷰 완전 분석 #5[Cs study] 코딩인터뷰 완전 분석 #5
[Cs study] 코딩인터뷰 완전 분석 #5MinGeun Park
 
[Cs study] 코딩인터뷰 완전 분석 #3
[Cs study] 코딩인터뷰 완전 분석 #3[Cs study] 코딩인터뷰 완전 분석 #3
[Cs study] 코딩인터뷰 완전 분석 #3MinGeun Park
 
[Cs study] 코딩인터뷰 완전 분석 #2
[Cs study] 코딩인터뷰 완전 분석 #2[Cs study] 코딩인터뷰 완전 분석 #2
[Cs study] 코딩인터뷰 완전 분석 #2MinGeun Park
 
[Cs study] 코딩인터뷰 완전 분석
[Cs study] 코딩인터뷰 완전 분석[Cs study] 코딩인터뷰 완전 분석
[Cs study] 코딩인터뷰 완전 분석MinGeun Park
 
[데브루키_언리얼스터디_0525] 애니메이션 노티파이
[데브루키_언리얼스터디_0525] 애니메이션 노티파이[데브루키_언리얼스터디_0525] 애니메이션 노티파이
[데브루키_언리얼스터디_0525] 애니메이션 노티파이MinGeun Park
 
[데브루키] 이벤트 드리븐 아키텍쳐
[데브루키] 이벤트 드리븐 아키텍쳐[데브루키] 이벤트 드리븐 아키텍쳐
[데브루키] 이벤트 드리븐 아키텍쳐MinGeun Park
 
[데브루키 언리얼 스터디] PBR
[데브루키 언리얼 스터디] PBR[데브루키 언리얼 스터디] PBR
[데브루키 언리얼 스터디] PBRMinGeun Park
 
[데브루키 언리얼 스터디] 스터디 안내 OT
[데브루키 언리얼 스터디] 스터디 안내 OT[데브루키 언리얼 스터디] 스터디 안내 OT
[데브루키 언리얼 스터디] 스터디 안내 OTMinGeun Park
 
[데브루키/페차쿠차] 유니티 프로파일링에 대해서 알아보자.
[데브루키/페차쿠차] 유니티 프로파일링에 대해서 알아보자.[데브루키/페차쿠차] 유니티 프로파일링에 대해서 알아보자.
[데브루키/페차쿠차] 유니티 프로파일링에 대해서 알아보자.MinGeun Park
 
[데브루키] Color space gamma correction
[데브루키] Color space gamma correction[데브루키] Color space gamma correction
[데브루키] Color space gamma correctionMinGeun Park
 
유니티 팁&트릭 Unity Tip & Trick
유니티 팁&트릭 Unity Tip & Trick유니티 팁&트릭 Unity Tip & Trick
유니티 팁&트릭 Unity Tip & TrickMinGeun Park
 
Live2D with Unity - 그녀들을 움직이게 하는 기술 (알콜코더 박민근)
Live2D with Unity - 그녀들을 움직이게 하는 기술 (알콜코더 박민근)Live2D with Unity - 그녀들을 움직이게 하는 기술 (알콜코더 박민근)
Live2D with Unity - 그녀들을 움직이게 하는 기술 (알콜코더 박민근)MinGeun Park
 
[RAPA/C++] 1. 수업 내용 및 진행 방법
[RAPA/C++] 1. 수업 내용 및 진행 방법[RAPA/C++] 1. 수업 내용 및 진행 방법
[RAPA/C++] 1. 수업 내용 및 진행 방법MinGeun Park
 
[Unite17] 유니티에서차세대프로그래밍을 UniRx 소개 및 활용
[Unite17] 유니티에서차세대프로그래밍을 UniRx 소개 및 활용 [Unite17] 유니티에서차세대프로그래밍을 UniRx 소개 및 활용
[Unite17] 유니티에서차세대프로그래밍을 UniRx 소개 및 활용 MinGeun Park
 
유니티의 툰셰이딩을 사용한 3D 애니메이션 표현
유니티의 툰셰이딩을 사용한 3D 애니메이션 표현유니티의 툰셰이딩을 사용한 3D 애니메이션 표현
유니티의 툰셰이딩을 사용한 3D 애니메이션 표현MinGeun Park
 
[데브루키160409 박민근] UniRx 시작하기
[데브루키160409 박민근] UniRx 시작하기[데브루키160409 박민근] UniRx 시작하기
[데브루키160409 박민근] UniRx 시작하기MinGeun Park
 
[160404] 유니티 apk 용량 줄이기
[160404] 유니티 apk 용량 줄이기[160404] 유니티 apk 용량 줄이기
[160404] 유니티 apk 용량 줄이기MinGeun Park
 
[160402_데브루키_박민근] UniRx 소개
[160402_데브루키_박민근] UniRx 소개[160402_데브루키_박민근] UniRx 소개
[160402_데브루키_박민근] UniRx 소개MinGeun Park
 

More from MinGeun Park (20)

[CSStudy] 코딩인터뷰 완전 분석 #7.pdf
[CSStudy] 코딩인터뷰 완전 분석 #7.pdf[CSStudy] 코딩인터뷰 완전 분석 #7.pdf
[CSStudy] 코딩인터뷰 완전 분석 #7.pdf
 
[Cs study] 코딩인터뷰 완전 분석 #6
[Cs study] 코딩인터뷰 완전 분석 #6[Cs study] 코딩인터뷰 완전 분석 #6
[Cs study] 코딩인터뷰 완전 분석 #6
 
[Cs study] 코딩인터뷰 완전 분석 #5
[Cs study] 코딩인터뷰 완전 분석 #5[Cs study] 코딩인터뷰 완전 분석 #5
[Cs study] 코딩인터뷰 완전 분석 #5
 
[Cs study] 코딩인터뷰 완전 분석 #3
[Cs study] 코딩인터뷰 완전 분석 #3[Cs study] 코딩인터뷰 완전 분석 #3
[Cs study] 코딩인터뷰 완전 분석 #3
 
[Cs study] 코딩인터뷰 완전 분석 #2
[Cs study] 코딩인터뷰 완전 분석 #2[Cs study] 코딩인터뷰 완전 분석 #2
[Cs study] 코딩인터뷰 완전 분석 #2
 
[Cs study] 코딩인터뷰 완전 분석
[Cs study] 코딩인터뷰 완전 분석[Cs study] 코딩인터뷰 완전 분석
[Cs study] 코딩인터뷰 완전 분석
 
[데브루키_언리얼스터디_0525] 애니메이션 노티파이
[데브루키_언리얼스터디_0525] 애니메이션 노티파이[데브루키_언리얼스터디_0525] 애니메이션 노티파이
[데브루키_언리얼스터디_0525] 애니메이션 노티파이
 
[데브루키] 이벤트 드리븐 아키텍쳐
[데브루키] 이벤트 드리븐 아키텍쳐[데브루키] 이벤트 드리븐 아키텍쳐
[데브루키] 이벤트 드리븐 아키텍쳐
 
[데브루키 언리얼 스터디] PBR
[데브루키 언리얼 스터디] PBR[데브루키 언리얼 스터디] PBR
[데브루키 언리얼 스터디] PBR
 
[데브루키 언리얼 스터디] 스터디 안내 OT
[데브루키 언리얼 스터디] 스터디 안내 OT[데브루키 언리얼 스터디] 스터디 안내 OT
[데브루키 언리얼 스터디] 스터디 안내 OT
 
[데브루키/페차쿠차] 유니티 프로파일링에 대해서 알아보자.
[데브루키/페차쿠차] 유니티 프로파일링에 대해서 알아보자.[데브루키/페차쿠차] 유니티 프로파일링에 대해서 알아보자.
[데브루키/페차쿠차] 유니티 프로파일링에 대해서 알아보자.
 
[데브루키] Color space gamma correction
[데브루키] Color space gamma correction[데브루키] Color space gamma correction
[데브루키] Color space gamma correction
 
유니티 팁&트릭 Unity Tip & Trick
유니티 팁&트릭 Unity Tip & Trick유니티 팁&트릭 Unity Tip & Trick
유니티 팁&트릭 Unity Tip & Trick
 
Live2D with Unity - 그녀들을 움직이게 하는 기술 (알콜코더 박민근)
Live2D with Unity - 그녀들을 움직이게 하는 기술 (알콜코더 박민근)Live2D with Unity - 그녀들을 움직이게 하는 기술 (알콜코더 박민근)
Live2D with Unity - 그녀들을 움직이게 하는 기술 (알콜코더 박민근)
 
[RAPA/C++] 1. 수업 내용 및 진행 방법
[RAPA/C++] 1. 수업 내용 및 진행 방법[RAPA/C++] 1. 수업 내용 및 진행 방법
[RAPA/C++] 1. 수업 내용 및 진행 방법
 
[Unite17] 유니티에서차세대프로그래밍을 UniRx 소개 및 활용
[Unite17] 유니티에서차세대프로그래밍을 UniRx 소개 및 활용 [Unite17] 유니티에서차세대프로그래밍을 UniRx 소개 및 활용
[Unite17] 유니티에서차세대프로그래밍을 UniRx 소개 및 활용
 
유니티의 툰셰이딩을 사용한 3D 애니메이션 표현
유니티의 툰셰이딩을 사용한 3D 애니메이션 표현유니티의 툰셰이딩을 사용한 3D 애니메이션 표현
유니티의 툰셰이딩을 사용한 3D 애니메이션 표현
 
[데브루키160409 박민근] UniRx 시작하기
[데브루키160409 박민근] UniRx 시작하기[데브루키160409 박민근] UniRx 시작하기
[데브루키160409 박민근] UniRx 시작하기
 
[160404] 유니티 apk 용량 줄이기
[160404] 유니티 apk 용량 줄이기[160404] 유니티 apk 용량 줄이기
[160404] 유니티 apk 용량 줄이기
 
[160402_데브루키_박민근] UniRx 소개
[160402_데브루키_박민근] UniRx 소개[160402_데브루키_박민근] UniRx 소개
[160402_데브루키_박민근] UniRx 소개
 

Recently uploaded

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Recently uploaded (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Battle field3 ssao

  • 1. Stable SSAO in Battlefield 3 with Selective Temporal Filtering Louis Bavoil Johan Andersson Developer Technology Engineer Rendering Architect NVIDIA DICE / EA
  • 2. SSAO ● Screen Space Ambient Occlusion (SSAO) ● Has become de-facto approach for rendering AO in games with no precomputation ● Key Idea: use depth buffer as approximation of the opaque geometry of the scene ● Large variety of SSAO algorithms, all taking as input the scene depth buffer [Kajalin 09] [Loos and Sloan 10] [McGuire et al. 11]
  • 3. HBAO ● Horizon-Based Ambient Occlusion (HBAO) ● Considers the depth buffer as a heightfield, and approximates ray-tracing this heightfield ● Improved implementation available in NVIDIA’s SDK11 (SSAO11.zip / HBAO_PS.hlsl) ● Used for rendering SSAO in Battlefield 3 / PC for its “High” and “Ultra” Graphics Quality presets [Bavoil and Sainz 09a] [Andersson 10] [White and Barré-Brisebois 11]
  • 5. Original HBAO Implementation in Frostbite 2 Frame time (GPU): 25.2 ms Total HBAO time: 2.5 ms (10% of frame) [1920x1200 “High” DX11 GeForce GTX 560 Ti]
  • 7. Screenshot: HBAO Only The HBAO looked good-enough on screenshots…
  • 8. Video: Flickering HBAO …but produced objectionable flickering in motion on thin objects such as alpha-tested foliage (grass and trees in particular)
  • 9. Our Constraints ● PC-only ● In Frostbite 2, HBAO implemented only for DX10 & DX11 ● Low Perf Hit, High Quality ● Whole HBAO was already 2.5 ms (1920x1200 / GTX 560 Ti) ● HBAO used in High and Ultra presets, had to look great
  • 10. Considered Workarounds ● Full-resolution SSAO or dual-resolution SSAO (*) …but that more-than-doubled the cost of the SSAO, and some flickering could remain ● Brighten SSAO on the problematic objects …but we wanted a way to keep full-strength SSAO on everything (in particular on foliage) (*) [Bavoil and Sainz 09b]
  • 11. Temporal Filtering Approach ● By definition, AO depends only on the scene geometry, not on the camera ● For static (or nearly-static geometry), can re-project AO from previous frame ● Reduce AO variance between frames by using a temporal filter: newAO = lerp(newAO,previousAO,x) ● Known approach used in Gears of War 2 [Nehab et al. 07] [Smedberg and Wright 09]
  • 12. Previous Camera Reverse (Frame i-1) Reprojection [Nehab et al. 07] Current (uvi, zi) Camera (Frame i)
  • 13. Previous Camera Reverse (Frame i-1) Reprojection [Nehab et al. 07] 1. Current (uvi, zi) ViewProjection-1 Current Camera (Frame i) Pi
  • 14. Previous Camera Reverse (Frame i-1) uvi-1 Reprojection [Nehab et al. 07] 2. Previous ViewProjection (uvi, zi) Current Camera 1. Current ViewProjection-1 Pi (Frame i)
  • 15. Previous Camera Reverse (Frame i-1) (uvi-1, zi-1) Reprojection [Nehab et al. 07] 3. Fetch Previous View Depth 2. Previous Pi-1 ViewProjection (uvi, zi) Current Camera 1. Current ViewProjection-1 Pi (Frame i)
  • 16. Previous Camera Reverse (Frame i-1) (uvi-1, zi-1) Reprojection [Nehab et al. 07] 3. Fetch Previous View Depth Pi-1 2. Previous ViewProjection (uvi, zi) 4. If Pi ~= Pi-1 Current re-use AO(Pi-1) Camera 1. Current (Frame i) ViewProjection-1 Pi
  • 17. Temporal Refinement [Mattausch et al. 11] (uvi-1, zi-1) If Pi-1 ~= Pi AOi = (Ni-1 AOi-1 + AOi) / (Ni-1 + 1) Ni = min(Ni-1 + 1, Nmax) Else AOi = AOi Pi-1 Ni = 1 (uvi, zi) Pi Ni = num. frames that have been accumulated in current solution at Pi Nmax = max num. frames (~8), to keep AOi-1 contributing to AOi
  • 18. Disocclusion Test [Mattausch et al. 11] Pi ~= Pi-1  wi = ViewDepth(Viewi-1, Pi) wi-1 = ViewDepth(Viewi-1, Pi-1)
  • 19. Relaxed Disocclusion Test Pi ~= Pi-1  To support nearly-static objects ● Such as foliage waving in the wind (grass, trees, …) ● We simply relaxed the threshold (used ε = 10%)
  • 20. Video: Temporal Filtering with ε=+Inf Flickering is fixed, but there are ghosting artifacts due to disocclusions
  • 21. Video: Temporal Filtering with ε=0.1% Flickering on the grass (nearly static), but no ghosting artifacts
  • 22. Video: Temporal Filtering with ε=10% No flickering, no ghosting, 1% perf hit (25.2 -> 25.5 ms)
  • 23. Video: Trailing Artifacts New issue: trailing artifacts on static objects receiving AO from dynamic objects
  • 24. Observations 1. With temporal filtering OFF The flickering pixels are mostly on foliage. The other pixels do not have any objectionable flickering. 2. With temporal filtering ON The trailing artifacts (near the character's feet) are not an issue on foliage pixels.
  • 25. Selective Temporal Filtering ● Assumption The set of flickering pixels and the set of trailing pixels are mutually exclusive ● Our Approach: 1. Classify the pixels as stable (potential trailing) or unstable (potential flickering) 2. Disable the temporal filter for the stable pixels
  • 26. Pixel Classification Approach ● Normal reconstruction in SSAO shader ● Px = ||P - Pleft|| < ||P - Pright|| ? Pleft : Pright Pt ● Py = ||P - Ptop|| < ||P - Pbottom|| ? Ptop : Pbottom Pl P Pr ● N = ± normalize(cross(P - Px, P - Py)) Pb ● Idea: If reconstructed normal is noisy, the SSAO will be noisy
  • 27. Piecewise Continuity Test 1. Select nearest neighbor Px between Pleft and Pright Px = Pleft P Pright
  • 28. Piecewise Continuity Test 1. Select nearest neighbor Px between Pleft and Pright Px = Pleft P 2. Continuous pixels  || Px – P || < L where L = distance threshold (in view-space units) Pright
  • 30. Two Half-Res Passes Pass 1: Output SSAO and continuity mask continuityMask = (|| Px – P || < L && || Py – P || < L) Pass 2: Dilate the discontinuities dilatedMask = all4x4(continuityMask)
  • 32. Pass 1: Classification Note: Distant pixels are all classified as “unstable” because ||P – Px/y|| increases with depth (perspective camera). Luckily for us, trailing artifacts were not an issue in the distance so this was not an issue.
  • 33. Pass 2: 4x4 Dilation Pixel classified as unstable if it has at least one discontinuity in its neighborhood 4 Gather4 instructions on DX11
  • 34. Selective Temporal Filtering Enable temporal filter only: - for unstable pixels - with (|1 – wi /wi-1| < ε)
  • 35. Video: HBAO + Selective Temporal Filtering
  • 37. Final Pipeline with Selective Temporal Filtering (STF) STF Performance Hit [1920x1200 “High”, GTX 560 Ti] • HBAO total: 2.5 ms -> 2.9 ms • Frame time (GPU): 25.2 -> 25.6 ms (1.6% performance hit) STF Parameters • Reprojection Threshold: For detecting disocclusions (ε=10%) • Distance Threshold: For detecting discontinuities (L=0.1 meter) • Dilation Kernel Size (4x4 texels)
  • 38. History Buffers ● Additional GPU memory required for the history buffers ● For (AOi, Zi, Ni) and (AOi-1, Zi-1, Ni) ● Full-res, 1xMSAA ● For Multi-GPU Alternate Frame Rendering (AFR) rendering ● Create one set of buffers per GPU and alternate between them ● Each AFR GPU has its own buffers & reprojection state ● The history buffers are cleared on first use ● Clear values: (AO,Z)=(1.f, 0.f) and N=0
  • 39. SelectiveTemporalFilter(uvi, AOi) Unoptimized zi = Fetch(ZBufferi, uvi) Pi = UnprojectToWorld(uvi, zi) pseudo-code uvi-1 = ProjectToUV(Viewi-1, Pi) zi-1 = Fetch(ZBufferi-1, uvi-1) For fetching zi-1, use Pi-1 = UnprojectToWorld(uvi-1, zi-1) wi = ViewDepth(Viewi-1 , Pi) clamp-to-border to wi-1 = ViewDepth(Viewi-1 , Pi-1) discard out-of-frame isStablePixel = Fetch(StabilityMask, uvi) data, with borderZ=0.f if ( |1 – wi/wi-1 | < ε && !isStablePixel ) AOi-1 = Fetch(AOTexturei-1, uvi-1) Ni-1 = Fetch(NTexturei-1, uvi-1) For fetching AOi-1, use AOi = (Ni-1 AOi-1 + AOi) / (Ni-1 + 1) bilinear filtering like in Ni = min(Ni-1 + 1, Nmax) [Nehab et al. 07] else Ni = 1 return(AOi, wi, Ni)
  • 43. Blur Overview ● Full-screen pixel-shader passes ● BlurX (horizontal) ● BlurY (vertical) ● BlurX takes as input ● Half-res AO ● Full-res linear depth (non-MSAA)
  • 44. Blur Kernel ● We use 1D Cross-Bilateral Filters (CBF) ● Gaussian blur with depth-dependent weights Sum[ AOi w(i,Zi,Z0), i=-R..R ] Output = Sum[ w(i,Zi,Z0), i=-R..R ] [Petschnigg et al. 04] [Eisemann and Durand 04] [Kopf et al. 07] [Bavoil et al. 08] [McGuire et al. 11]
  • 45. Blur Opt: Adaptive Sampling 2/3 of total weights 1/3 of total weights Can these samples be approximated?
  • 46. Blur Opt: Adaptive Sampling Keep original samples Replace pairs of samples with in-between sample (AO,Z) fetches with hw bilinear filtering
  • 47. Blur Opt: Adaptive Sampling
  • 48. Brute-Force Sampling BlurX cost: 0.8 ms
  • 49. Adaptive Sampling BlurX cost: 0.6 ms
  • 50. Blur Opt: Speedup GPU Time Before After Speedup Pack (AO,Z) 0.18 ms 0.18 ms 0% BlurX 0.75 ms 0.58 ms 29% BlurY+STF 1.00 ms 0.95 ms 5% (*) Blur Total 1.93 ms 1.71 ms 13% (*) Lower speedup due to the math overhead of the Selective Temporal Filter (STF) Blur Radius: 8 Resolution: 1920x1200 GeForce GTX 560 Ti
  • 51. Summary Two techniques used in Battlefield 3 / PC 1. A generic solution to fix SSAO flickering with a low perf hit (*) on DX10/11 GPUs 2. An approximate cross-bilateral filter, using a mix of point and bilinear taps (*) 0.4 ms in 1920x1200 on GeForce GTX 560 Ti
  • 53. References [McGuire et al. 11] McGuire, Osman, Bukowski, Hennessy. The Alchemy Screen-Space Ambient Obscurance Algorithm. Proceedings of ACM SIGGRAPH / Eurographics High-Performance Graphics 2011 (HPG '11). [White and Barré-Brisebois 11] White, Barré-Brisebois. More Performance! Five Rendering Ideas from Battlefield 3 and Need For Speed: The Run. Advances in Real-Time Rendering in Games. SIGGRAPH 2011. [Mattausch et al. 11] Mattausch, Scherzer, Wimmer. Temporal Screen- Space Ambient Occlusion. In GPU Pro 2. 2011. [Loos and Sloan 10] Loos, Sloan. Volumetric Obscurance. ACM Symposium on Interactive 3D Graphics and Games 2010.
  • 54. References [Andersson 10] Andersson. Bending the Graphics Pipeline. Beyond Programmable Shading course, SIGGRAPH 2010. [Bavoil and Sainz 09a] Bavoil, Sainz. Image-Space Horizon-Based Ambient Occlusion. In ShaderX7. 2009. [Bavoil and Sainz 09b] Bavoil, Sainz. Multi-Layer Dual-Resolution Screen- Space Ambient Occlusion. SIGGRAPH Talk. 2009. [Smedberg and Wright 09] Smedberg, Wright. Rendering Techniques in Gears of War 2. GDC 2009. [Kajalin 09] Kajalin. Screen Space Ambient Occlusion. In ShaderX7. 2009. [Bavoil et al. 08] Bavoil, Sainz, Dimitrov. Image-Space Horizon-Based Ambient Occlusion. SIGGRAPH Talk. 2008.
  • 55. References [Nehab et al. 07] Nehab, Sander, Lawrence, Tatarchuk, Isidoro. Accelerating Real-Time Shading with Reverse Reprojection Caching. In ACM SIGGRAPH/Eurographics Symposium on Graphics Hardware 2007. [Kopf et al. 07] Kopf, Cohen, Lischinski, Uyttendaele. 2007. Joint Bilateral Upsampling. In Proceedings of SIGGRAPH 2007. [Petschnigg et al. 04] Petschnigg, Szeliski, Agrawala, Cohen, Hoppe. Toyama: Digital photography with flash and no-flash image pairs. In Proceedings of SIGGRAPH 2004. [Eisemann and Durand 04] Eisemann, Durand. “Flash Photography Enhancement via Intrinsic Relighting”. In Proceedings of SIGGRAPH 2004.
  • 57. HLSL: Adaptive Sampling float r = 1; // Inner half of the kernel: step size = 1 and POINT filtering [unroll] for (; r <= KERNEL_RADIUS/2; r += 1) { float2 uv = r * deltaUV + uv0; float2 AOZ = mainTexture.Sample(pointClampSampler, uv).xy; processSample(AOZ, r, centerDepth, totalAO, totalW); } // Outer half of the kernel: step size = 2 and LINEAR filtering [unroll] for (; r <= KERNEL_RADIUS; r += 2) { float2 uv = (r + 0.5) * deltaUV + uv0; float2 AOZ = mainTexture.Sample(linearClampSampler, uv).xy; processSample(AOZ, r, centerDepth, totalAO, totalW); }
  • 58. HLSL: Cross-Bilateral Weights // d and d0 = linear depths float crossBilateralWeight(float r, float d, float d0) { // precompiled by fxc const float BlurSigma = ((float)KERNEL_RADIUS+1.0f) * 0.5f; const float BlurFalloff = 1.f / (2.0f*BlurSigma*BlurSigma); // assuming that d and d0 are pre-scaled linear depths float dz = d0 - d; return exp2(-r*r*BlurFalloff - dz*dz); }