SlideShare une entreprise Scribd logo
1  sur  54
Télécharger pour lire hors ligne
GeForce 8800 OpenGL Extensions

           Evan Hart
Roadmap


         What’s different
         The programmable core
         Feeding it
         New pathways
         The backend




© NVIDIA Corporation 2007
GeForce 8800 Differences


         Pipeline modifications
                   Additional geometry shader stage
                   Feedback available midstream
         Unified shading hardware
                   Same instructions and characteristics across shaders




© NVIDIA Corporation 2007
GeForce 8800 OpenGL Pipeline
                                                                    Input Assembler


         More flexible memory                                          Vertex Processor
         access model
                   Multiple ways to read and                      Primitive Assembly
                   write
         Additional pipeline stages                                  Geometry Processor




                                                                                                   Video Memory
                   Fundamentally new
                                                                                       Transform
                   capabilities                                                        Feedback


                                                              Rasterization & Interpolation



                                                                     Fragment Processor

               Fixed / Built in stream      Fixed stage
               Programmable stream
                                         Programmable stage        Raster Operations
               Video memory IO
                                              Memory
                                                                      Framebuffer
© NVIDIA Corporation 2007
Unified Shaders
                                        Pixel
                                                Geometry
                            Vertex               (new)
                                       Floating
                                        Point
                                      Processor
                                                  Future
                            Physics                Future
                                                    Future




                                        ROP


                                       Memory
© NVIDIA Corporation 2007
Programmability


         #1 design concern of all extensions
                   Efficient for input
                   Efficient for computation
                   Efficient for output
         No concessions for fixed-function
                   Only the right choices for programmability
                   Most will not work with the fixed function pipeline




© NVIDIA Corporation 2007
New Capabilities


         Unified instruction set for programs
         Integer instructions and data types
         Uniform set of structured branching constructs
         Indexable constants and temporaries
         New texture fetching instructions
         Attribute interpolation control
                   Flat shaded, perspective-incorrect or centroid sampled




© NVIDIA Corporation 2007
New OpenGL Program Extensions

         OpenGL Shading Language
                   EXT_gpu_shader4
                            GLSL extension for fourth generation shaders
         ARB_vertex_program style asm-like programs
                   NV_gpu_program4
                            NV_vertex_program4
                            NV_fragment_program4
                            NV_geometry_program4


         Cg 2.0
                   Not a GL extension
                   Will support the capabilities



© NVIDIA Corporation 2007
GL_EXT_gpu_shader4


         New integer support
                   unsigned int, uvec2, uvec3, uvec4
                   Integer varying and attributes
                   Integer texture samplers: isampler*, usampler*
                   Real integer ops
                            %, &, |, ^, >>, <<, ~




© NVIDIA Corporation 2007
GL_EXT_gpu_shader4 cnt’d


         Extends varying type qualifiers
                   Centroid – keeps the value inside the covered region
                   Flat – no interpolation, like flat shading
                   Noperspective – interpolate in screen-space




© NVIDIA Corporation 2007
Flat interpolation




© NVIDIA Corporation 2007
GL_EXT_gpu_shader4 cnt’d


         Instancing and procedural generation
                   gl_VertexID
                            Integer index derived from glDrawElements, etc
                            Only with vertex arrays (no display lists)
                            Only when using VBOs
                   gl_InstanceID
                            Integer index of the current primitive
                            Only available when using DrawElementsInstancedEXT
                   gl_PrimitiveID
                            Integer describing the primitive number




© NVIDIA Corporation 2007
GL_EXT_gpu_shader4 cnt’d


         User-defined output variables
                   Declared as “varying out”
                   Allow more flexible outputs from the fragment shader
                            Integers
                            Integers and floats simultaneously
                   Used instead of gl_FragColor or gl_FragData




© NVIDIA Corporation 2007
GL_EXT_gpu_shader cnt’d


         Exact texel fetches
                   texelFetch*( sampler, icoord, lod)
                   Treats a texture as a directly addressable array of texels
         Texture size query
                   textureSize*( sampler, lod)
                   Returns the dimensions of the texture level
         Shadow cubemaps
                   Depth is compared against the 4th component
         Texture Gradient fetches
                   texture*Grad( sampler, coord, dx, dy)
                   Allows custom lod/anisotropy control


© NVIDIA Corporation 2007
GL_EXT_gpu_shader4 cnt’d


         Offset texture fetches
                   texture*Offset( sampler, coord, offset)
                   Offset must be compile-time constant expression
                   Allow convenient shortcut for building filter kernels
                   Offset size has an implementation dependent limit




© NVIDIA Corporation 2007
GL_NV_gpu_program4


         Composed of three sub-extensions
                   GL_NV_vertex_program4
                   GL_NV_fragment_program4
                   GL_NV_geometry_program4
         Still based on 4-wide registers
                   No longer really matches HW
                   Enhances backward compatibility
         Provides same capabilities as EXT_gpu_shader4




© NVIDIA Corporation 2007
Feeding the Shader


         Instancing
                   Optimized rendering of multiple copies of an object
         New texture types
                   Texture arrays
         New texture formats
                   Integer textures
                   Additional HDR formats
         Data buffers
                   Fast flexible ways to swap blocks of constants
         Texture buffers
                   Massive data store for shaders

© NVIDIA Corporation 2007
EXT_draw_instanced


         Efficient rendering for large numbers of objects
         Vertex array only
                   glDrawArraysInstancedEXT
                   glDrawElementsInstancedEXT
         Draw calls take one additional parameter
                   # of instances to draw
         Each instance has a separate instance ID
                   Used by the shader to change behavior
                            Select transform matrix
                            Select material
         Clever shaders can even draw different objects

© NVIDIA Corporation 2007
EXT_texture_array

          Array Textures
          Generalizes 1D and 2D textures to consist of an
          array of 1D or 2D textures
                    1D array loaded using glTexImage2D
                    2D array loaded using glTexImage3D
          Array indexable from shader program
                    Access layer using r texture coordinate
          Layers must be same size and format
          No filtering between layers
          Arrays of cubemaps not currently supported
          Removes need for texture atlases
                    Useful for instancing, terrain texturing

© NVIDIA Corporation 2007
Array Textures Cont’d




             T [0-1]



                                      R [0–3]
                            S [0-1]
© NVIDIA Corporation 2007
Texture Format Extensions

         EXT_texture_integer
                   Adds integer texture formats
         EXT_packed_float
                   Space-efficient float format
                   Relatively low precision
                            More than good enough most times
         EXT_texture_shared_exponent
                   Space-efficient float format
                   Variable accuracy
                            Can be more or less accurate than packed float
                            Also more than good enough




© NVIDIA Corporation 2007
Integer texture formats

         EXT_texture_integer
                   Adds integer texture formats
                   8, 16, and 32 bit per component
                   Signed and unsigned
                   RGB, RGBA, Luminance, Alpha, Intensity, and LA
                   Lack filtering support
                   Only available with EXT_gpu_shader4 / NV_gpu_program4
         Uses
                   Bitfields
                   Color index emulation
                   Lookup tables



© NVIDIA Corporation 2007
Packed Float Textures

         EXT_packed_float
                   11/11/10 floating point format
                   5 bit exponent per component
                            Bias of -15
                   Only supports positive values (no sign bit)
                   Can be used as framebuffer format
                   Max values
                            R/G – 65024
                            B – 64512
                   Size advantage can make it much faster than float16
                   Supports filtering, blending, and MSAA




© NVIDIA Corporation 2007
Packed Float Texture Usage




© NVIDIA Corporation 2007
RGBE / Shared Exponent textures

         EXT_texture_shared_exponent
                   9/9/9/5 RGBE format
                   Similar to Radiance 8/8/8/8 RGBE format
                   Shared 5 bit exponent (bias of -15)
                   Source texture format only (not renderable to)
                   Only supports positive values (no sign bit)




© NVIDIA Corporation 2007
New compressed texture formats


         EXT_texture_compression_latc
                   Good format for greyscale w/ alpha compression
                   8-bits per texel
                   Stored in 4x4 blocks (like DXT formats)
                   Components are compressed independently


         EXT_texture_compression_rgtc
                   Same properties as latc
                   Returns (r,g,0,1) instead of (l,l,l,a)
                   Useful for normal map compression
                            (x,y) = 2 * (r,g) – 1
                            Z = sqrt( 1 – (x2 + y2) )

© NVIDIA Corporation 2007
OpenGL Data Buffer Extensions

          EXT_bindable_uniform
                    Used with EXT_gpu_shader4
                    Allows a uniform to be bound to a buffer object
                    Quickly switch all values in a large structure or array


          NV_parameter_buffer_object
                    Used with NV_gpu_program4
                    Defines new object containing banks of uniform
                    parameters
                    Enables rendering to parameter buffers
                    Useful for instancing
                    Provides a very large parameter store


© NVIDIA Corporation 2007
EXT_texture_buffer_object


         Bind buffer object as a texture
         Jumbo 1D texture
                   Larger size limit (128 Mtexels)
         Does not support filtering
         Addressed by element
                   Not normalized [0-1]
         Limited format support
                   No RGB support (RGBA is OK)
                   Minimum of 1 byte per component
                   No compressed formats



© NVIDIA Corporation 2007
Texture buffer usage


         Large constant store
                   Significantly larger than EXT_bindable_uniform
                   Jumbo bone list for skinning
         Instancing data
                   Transform matrices
                   Materials
         Custom indexing
                   Separate ‘index’ for position, normal, texture coordinate
                   Less efficient than normal vertex fetching
                            Not as coherent
                   Can be useful interactive editing due to reduced sw cost


© NVIDIA Corporation 2007
New Pathways


         Geometry Shaders
                   New pipeline stage
         Render target arrays
                   Geometry shader indexing of texture arrays
         Transform feedback
                   Export vertices mid-stream




© NVIDIA Corporation 2007
Geometry Shader Basics
         Input                                                Primitive Assembly
                   Standard primitives
                            point, line, triangle…
                   New primitive types include
                   neighboring vertices
                            Line with adjacency




                                                                                       Video Memory
                            1                           4      Geometry Shader
                                        2       3


                            Triangle with adjacency
                                            3
                                                    4
                                2


                                    1           5
                                                            Clipping & Rasterization
                                            6
         Output
                   Unique output type (independent from input type)
                   Points, line strips or triangle strips
                   Can output zero or more primitives
                   Generated primitive stream is in the same order as inputted
© NVIDIA Corporation 2007
Geometry Shader Applications

         Better point sprites
                   Rotation, non-square, motion blur
         Simple subdivision
         Single pass cube map creation
         Automatic stencil shadow polygon generation
         Fur rendering
                   Fin generation
         Curve rendering
                   2D rendering, hair/fur
         Particle systems
         GPGPU
                   Data amplification – variable number of outputs


© NVIDIA Corporation 2007
Geometry Shader Extensions


         EXT_geometry_shader4
                   GLSL geometry shaders
         NV_geometry_shader4
                   Adds to EXT_geometry_shader4
         NV_geometry_program4
                   ARB_vertex_program style
                   Part of NV_gpu_program4




© NVIDIA Corporation 2007
EXT_geometry_shader


         New link-time parameters
                   Primitive input and output type
                   Max vertices output
         New shader variables
                   gl_VerticesIn – number of input vertices
                   gl_Layer – texture array layer target
                   gl_PrimitiveID – Set primitive ID seen by the fragments
                   gl_PrimitiveIDIn – Primitive ID based on input prims




© NVIDIA Corporation 2007
EXT_geometry_shader code

varying in vec3 eyeNormal[gl_VerticesIn];

varying out vec3 oEyeNormal;

for ( int i = 0; i < gl_VerticesIn; i++) {
    oEyeNormal = eyeNormal[i];
    //causes all output varying to submit
    EmitVertex();
}

//Start a new strip
RestartPrimitive();

© NVIDIA Corporation 2007
NV_geometry_shader4


         Relax restrictions of EXT_geometry_shader4
                   Changing input/output primitive without a relink
                   Changing max output vertices without a relink


         Defines additional behavior
                   Quads and polygons at turned into triangles
                            Shader accepting triangle accepts quads




© NVIDIA Corporation 2007
NV_geometry_program4


         Same capabilities as geometry_shader4
         ARB_vertex_program style syntax
         Inputs are ‘ATTRIB’
         Outputs are ‘RESULT’
         Input primitive, output primitive, and max vertices
                   Declared in the shader




© NVIDIA Corporation 2007
Rendering to Texture Arrays

          Can select destination layer for each primitive in
          geometry program
                    Write to “gl_Layer” or “result.layer”
          Can be used for single-pass render-to-cubemap
                    Read input triangle
                    Output to 6 cube map faces, transformed by correct
                    face matrix
                    No real savings for GPU
                            Same transform and rasterization load
                            Additional geometry shader work
                    Simple culling may help




© NVIDIA Corporation 2007
EXT_transform_feedback

          Allows storing output from a vertex program or
          geometry program to buffer object
          Enables multi-pass operations on geometry, e.g.
                    Store results of animation (skinning) to buffer, reuse for
                    multiple lights
                    Recursive subdivision
          Provides queries for number of primitives generated
          by geometry program




© NVIDIA Corporation 2007
Example: Terrain Subdivision


         Takes quad as input (actually line_adj primitive)
         Geometry program subdivides into 4 new quads
         using diamond-square subdivision
         Uses two VBOs, reads from one, writes to the other
         using transform feedback
         Then swap




© NVIDIA Corporation 2007
Terrain Subdivision




© NVIDIA Corporation 2007
The backend


         Floating point depth buffers
         EXT_draw_buffers2
         sRGB Framebuffers
         Coverage sample anti-aliasing




© NVIDIA Corporation 2007
NV_depth_float


         Provides floating point depth buffers and textures
         Range extended to [–MAX_FLOAT, MAX_FLOAT]
         New functions for non-normalized values
                   glDepthRangedNV
                   glClearDepthdNV
                   glDepthBoundsdNV
         Multiple formats
                   DEPTH_COMPONENT32F_NV
                   DEPTH32F_STENCIL8_NV
         FBO only
         Care must be taken in using the extra precision

© NVIDIA Corporation 2007
Float depth precision


         Normal [0-1] mapping is ineffective
                   Precision bunches up at 0
                            Logarithmic distribution of floats
                            24 bits of precision [0.5 (2^-1) – 1.0 (2^0)]
                   Perspective projection pushes scene toward 1
                            2x near maps to roughly 0.5
                            90%+ of scene [0.5 – 1.0] range
                   Effectively a 25-bit depth buffer
         Changing to [0-256] is no better
                   [128 – 256] has the same 24-bits




© NVIDIA Corporation 2007
Depth Precision Distribution


          Integer distribution




0                                                   1




            Floating point distribution




                                          2x near


© NVIDIA Corporation 2007
The Solution


         Reverse the depth mapping
                   Far = 0.0
                   Near = 1.0 (or higher)
         Precision bunching now reversed
                   Compression and expansion line up
                   Results in essentially linear depth buffering




© NVIDIA Corporation 2007
A Caveat




         Small precision cliff near 0
                   FP numbers often use denormals to fill it
                   Graphics hardware typically avoid it
                            Often CPU’s too, denorms are slow
         Problem is minimal due to exponent range
                   Infinite far plane completely fixes it




© NVIDIA Corporation 2007
EXT_draw_buffers2


         Provides per draw buffer control of
                   Blend Enable
                   Color mask
         Does not provide independent control of
                   Blend Function
                   Blend Equation
                   Blend Color




© NVIDIA Corporation 2007
EXT_framebuffer_sRGB


         Allows framebuffer to be stored in sRGB space
                   Provides perceptual color compression
                            8 bits sRGB is similar to 10 bits linear RGB
                   Converts to/from sRGB on access to framebuffer
                   Implements a gamma of 2.2
                            This matches most monitors relatively well
         Controlled via a simple enable
                   Not available on all formats (>8 bit per component)




© NVIDIA Corporation 2007
CSAA


         Coverage Sample Anti-Aliasing
                   GL_NV_framebuffer_multisample_coverage
         Decouples primitive coverage from depth/color
                   Increases edge quality with little cost
                   Memory and bandwidth overhead scale with # depth
                   samples
                   Worst case is as good as # of depth samples




© NVIDIA Corporation 2007
CSAA usage


         Extremely simple

         MSAA
           glRenderbufferStorageMultisampleEXT(
             GL_RENDERBUFFER_EXT, 4, GL_RGBA8, width,
              height)


         CSAA
           glRenderbufferStorageMultisampleCoverageNV(
             GL_RENDERBUFFER_EXT, 16, 4, GL_RGBA8, width,
              height)


© NVIDIA Corporation 2007
CSAA Modes

Name                        Coverage Samples   Color/Depth
                                               Samples
2x                                  2                   2

4x                                  4                   4

8x                                  8                   4

8xQ                                 8                   8

16x                                 16                  4

16xQ                                16                  8


© NVIDIA Corporation 2007
Thanks


         Simon Green
         Samuel Gateau
         Henry Moreton
         NVIDIA DevTech team




© NVIDIA Corporation 2007
New Developer Tools at GDC 02007




                                           FX Composer 2
                            PerfKit 5

            SDK 10




                            ShaderPerf 2
GPU-Accelerated
 Texture Tools
© NVIDIA Corporation 2007
                                            Shader Library

Contenu connexe

Tendances

Migrating from OpenGL to Vulkan
Migrating from OpenGL to VulkanMigrating from OpenGL to Vulkan
Migrating from OpenGL to VulkanMark Kilgard
 
Borderless Per Face Texture Mapping
Borderless Per Face Texture MappingBorderless Per Face Texture Mapping
Borderless Per Face Texture Mappingbasisspace
 
Accelerating Vector Graphics Rendering using the Graphics Hardware Pipeline
Accelerating Vector Graphics Rendering using the Graphics Hardware PipelineAccelerating Vector Graphics Rendering using the Graphics Hardware Pipeline
Accelerating Vector Graphics Rendering using the Graphics Hardware PipelineMark Kilgard
 
OpenGL 4.5 Update for NVIDIA GPUs
OpenGL 4.5 Update for NVIDIA GPUsOpenGL 4.5 Update for NVIDIA GPUs
OpenGL 4.5 Update for NVIDIA GPUsMark Kilgard
 
OpenGL NVIDIA Command-List: Approaching Zero Driver Overhead
OpenGL NVIDIA Command-List: Approaching Zero Driver OverheadOpenGL NVIDIA Command-List: Approaching Zero Driver Overhead
OpenGL NVIDIA Command-List: Approaching Zero Driver OverheadTristan Lorach
 
Software Parallelisation & Platform Generation for Heterogeneous Multicore Ar...
Software Parallelisation & Platform Generation for Heterogeneous Multicore Ar...Software Parallelisation & Platform Generation for Heterogeneous Multicore Ar...
Software Parallelisation & Platform Generation for Heterogeneous Multicore Ar...chiportal
 
NVIDIA OpenGL 4.6 in 2017
NVIDIA OpenGL 4.6 in 2017NVIDIA OpenGL 4.6 in 2017
NVIDIA OpenGL 4.6 in 2017Mark Kilgard
 
NV_path_rendering Frequently Asked Questions
NV_path_rendering Frequently Asked QuestionsNV_path_rendering Frequently Asked Questions
NV_path_rendering Frequently Asked QuestionsMark Kilgard
 
GPU accelerated path rendering fastforward
GPU accelerated path rendering fastforwardGPU accelerated path rendering fastforward
GPU accelerated path rendering fastforwardMark Kilgard
 
SIGGRAPH 2012: GPU-Accelerated 2D and Web Rendering
SIGGRAPH 2012: GPU-Accelerated 2D and Web RenderingSIGGRAPH 2012: GPU-Accelerated 2D and Web Rendering
SIGGRAPH 2012: GPU-Accelerated 2D and Web RenderingMark Kilgard
 
CS 354 Programmable Shading
CS 354 Programmable ShadingCS 354 Programmable Shading
CS 354 Programmable ShadingMark Kilgard
 
GS-4108, Direct Compute in Gaming, by Bill Bilodeau
GS-4108, Direct Compute in Gaming, by Bill BilodeauGS-4108, Direct Compute in Gaming, by Bill Bilodeau
GS-4108, Direct Compute in Gaming, by Bill BilodeauAMD Developer Central
 
OSGi Applications Clustering using Distributed Shared Memory
OSGi Applications Clustering using Distributed Shared MemoryOSGi Applications Clustering using Distributed Shared Memory
OSGi Applications Clustering using Distributed Shared MemoryAnthony Gelibert
 
GPU-accelerated Path Rendering
GPU-accelerated Path RenderingGPU-accelerated Path Rendering
GPU-accelerated Path RenderingMark Kilgard
 
ICE Presentation: Integrated Component Characterization Environment
ICE Presentation: Integrated Component Characterization EnvironmentICE Presentation: Integrated Component Characterization Environment
ICE Presentation: Integrated Component Characterization EnvironmentNMDG NV
 
NVIDIA OpenGL and Vulkan Support for 2017
NVIDIA OpenGL and Vulkan Support for 2017NVIDIA OpenGL and Vulkan Support for 2017
NVIDIA OpenGL and Vulkan Support for 2017Mark Kilgard
 
Avoiding 19 Common OpenGL Pitfalls
Avoiding 19 Common OpenGL PitfallsAvoiding 19 Common OpenGL Pitfalls
Avoiding 19 Common OpenGL PitfallsMark Kilgard
 
X Server Multi-rendering for OpenGL and PEX
X Server Multi-rendering for OpenGL and PEXX Server Multi-rendering for OpenGL and PEX
X Server Multi-rendering for OpenGL and PEXMark Kilgard
 
Shader model 5 0 and compute shader
Shader model 5 0 and compute shaderShader model 5 0 and compute shader
Shader model 5 0 and compute shaderzaywalker
 
Siggraph 2016 - Vulkan and nvidia : the essentials
Siggraph 2016 - Vulkan and nvidia : the essentialsSiggraph 2016 - Vulkan and nvidia : the essentials
Siggraph 2016 - Vulkan and nvidia : the essentialsTristan Lorach
 

Tendances (20)

Migrating from OpenGL to Vulkan
Migrating from OpenGL to VulkanMigrating from OpenGL to Vulkan
Migrating from OpenGL to Vulkan
 
Borderless Per Face Texture Mapping
Borderless Per Face Texture MappingBorderless Per Face Texture Mapping
Borderless Per Face Texture Mapping
 
Accelerating Vector Graphics Rendering using the Graphics Hardware Pipeline
Accelerating Vector Graphics Rendering using the Graphics Hardware PipelineAccelerating Vector Graphics Rendering using the Graphics Hardware Pipeline
Accelerating Vector Graphics Rendering using the Graphics Hardware Pipeline
 
OpenGL 4.5 Update for NVIDIA GPUs
OpenGL 4.5 Update for NVIDIA GPUsOpenGL 4.5 Update for NVIDIA GPUs
OpenGL 4.5 Update for NVIDIA GPUs
 
OpenGL NVIDIA Command-List: Approaching Zero Driver Overhead
OpenGL NVIDIA Command-List: Approaching Zero Driver OverheadOpenGL NVIDIA Command-List: Approaching Zero Driver Overhead
OpenGL NVIDIA Command-List: Approaching Zero Driver Overhead
 
Software Parallelisation & Platform Generation for Heterogeneous Multicore Ar...
Software Parallelisation & Platform Generation for Heterogeneous Multicore Ar...Software Parallelisation & Platform Generation for Heterogeneous Multicore Ar...
Software Parallelisation & Platform Generation for Heterogeneous Multicore Ar...
 
NVIDIA OpenGL 4.6 in 2017
NVIDIA OpenGL 4.6 in 2017NVIDIA OpenGL 4.6 in 2017
NVIDIA OpenGL 4.6 in 2017
 
NV_path_rendering Frequently Asked Questions
NV_path_rendering Frequently Asked QuestionsNV_path_rendering Frequently Asked Questions
NV_path_rendering Frequently Asked Questions
 
GPU accelerated path rendering fastforward
GPU accelerated path rendering fastforwardGPU accelerated path rendering fastforward
GPU accelerated path rendering fastforward
 
SIGGRAPH 2012: GPU-Accelerated 2D and Web Rendering
SIGGRAPH 2012: GPU-Accelerated 2D and Web RenderingSIGGRAPH 2012: GPU-Accelerated 2D and Web Rendering
SIGGRAPH 2012: GPU-Accelerated 2D and Web Rendering
 
CS 354 Programmable Shading
CS 354 Programmable ShadingCS 354 Programmable Shading
CS 354 Programmable Shading
 
GS-4108, Direct Compute in Gaming, by Bill Bilodeau
GS-4108, Direct Compute in Gaming, by Bill BilodeauGS-4108, Direct Compute in Gaming, by Bill Bilodeau
GS-4108, Direct Compute in Gaming, by Bill Bilodeau
 
OSGi Applications Clustering using Distributed Shared Memory
OSGi Applications Clustering using Distributed Shared MemoryOSGi Applications Clustering using Distributed Shared Memory
OSGi Applications Clustering using Distributed Shared Memory
 
GPU-accelerated Path Rendering
GPU-accelerated Path RenderingGPU-accelerated Path Rendering
GPU-accelerated Path Rendering
 
ICE Presentation: Integrated Component Characterization Environment
ICE Presentation: Integrated Component Characterization EnvironmentICE Presentation: Integrated Component Characterization Environment
ICE Presentation: Integrated Component Characterization Environment
 
NVIDIA OpenGL and Vulkan Support for 2017
NVIDIA OpenGL and Vulkan Support for 2017NVIDIA OpenGL and Vulkan Support for 2017
NVIDIA OpenGL and Vulkan Support for 2017
 
Avoiding 19 Common OpenGL Pitfalls
Avoiding 19 Common OpenGL PitfallsAvoiding 19 Common OpenGL Pitfalls
Avoiding 19 Common OpenGL Pitfalls
 
X Server Multi-rendering for OpenGL and PEX
X Server Multi-rendering for OpenGL and PEXX Server Multi-rendering for OpenGL and PEX
X Server Multi-rendering for OpenGL and PEX
 
Shader model 5 0 and compute shader
Shader model 5 0 and compute shaderShader model 5 0 and compute shader
Shader model 5 0 and compute shader
 
Siggraph 2016 - Vulkan and nvidia : the essentials
Siggraph 2016 - Vulkan and nvidia : the essentialsSiggraph 2016 - Vulkan and nvidia : the essentials
Siggraph 2016 - Vulkan and nvidia : the essentials
 

En vedette

En vedette (15)

爱是什么(自动播放)
爱是什么(自动播放)爱是什么(自动播放)
爱是什么(自动播放)
 
The unwanted Jesus & Bethany
The unwanted Jesus & BethanyThe unwanted Jesus & Bethany
The unwanted Jesus & Bethany
 
Religion in Blue Jeans
Religion in Blue JeansReligion in Blue Jeans
Religion in Blue Jeans
 
Pecha Kucha Slideshow
Pecha Kucha SlideshowPecha Kucha Slideshow
Pecha Kucha Slideshow
 
Vidadedecasado 1 2 3 4
Vidadedecasado 1 2 3 4Vidadedecasado 1 2 3 4
Vidadedecasado 1 2 3 4
 
Personal Inquiry
Personal  InquiryPersonal  Inquiry
Personal Inquiry
 
Ria2010 keynote développeurs
Ria2010 keynote développeursRia2010 keynote développeurs
Ria2010 keynote développeurs
 
Criza
CrizaCriza
Criza
 
6. open innov conclusions
6. open innov conclusions6. open innov conclusions
6. open innov conclusions
 
1
11
1
 
PLC-Word Choice
PLC-Word ChoicePLC-Word Choice
PLC-Word Choice
 
Milieu
MilieuMilieu
Milieu
 
Polovinka Lm Prezentaciya Vchitelya
Polovinka Lm Prezentaciya VchitelyaPolovinka Lm Prezentaciya Vchitelya
Polovinka Lm Prezentaciya Vchitelya
 
дерево
дереводерево
дерево
 
De Ale Ingerilor
De Ale IngerilorDe Ale Ingerilor
De Ale Ingerilor
 

Similaire à GeForce 8800 OpenGL Extensions

SIGGRAPH 2012: NVIDIA OpenGL for 2012
SIGGRAPH 2012: NVIDIA OpenGL for 2012SIGGRAPH 2012: NVIDIA OpenGL for 2012
SIGGRAPH 2012: NVIDIA OpenGL for 2012Mark Kilgard
 
Hardware Shaders
Hardware ShadersHardware Shaders
Hardware Shadersgueste52f1b
 
GFX Part 1 - Introduction to GPU HW and OpenGL ES specifications
GFX Part 1 - Introduction to GPU HW and OpenGL ES specificationsGFX Part 1 - Introduction to GPU HW and OpenGL ES specifications
GFX Part 1 - Introduction to GPU HW and OpenGL ES specificationsPrabindh Sundareson
 
NVIDIA's OpenGL Functionality
NVIDIA's OpenGL FunctionalityNVIDIA's OpenGL Functionality
NVIDIA's OpenGL FunctionalityMark Kilgard
 
Gentek Introduce(en)
Gentek Introduce(en)Gentek Introduce(en)
Gentek Introduce(en)cloudmmog
 
[03 1][gpu용 개발자 도구 - parallel nsight 및 axe] miller axe
[03 1][gpu용 개발자 도구 - parallel nsight 및 axe] miller axe[03 1][gpu용 개발자 도구 - parallel nsight 및 axe] miller axe
[03 1][gpu용 개발자 도구 - parallel nsight 및 axe] miller axelaparuma
 
PG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrPG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrKohei KaiGai
 
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...Stefano Di Carlo
 
Adobe AIR - Mobile Performance – Tips & Tricks
Adobe AIR - Mobile Performance – Tips & TricksAdobe AIR - Mobile Performance – Tips & Tricks
Adobe AIR - Mobile Performance – Tips & TricksMihai Corlan
 
"The OpenVX Computer Vision and Neural Network Inference Library Standard for...
"The OpenVX Computer Vision and Neural Network Inference Library Standard for..."The OpenVX Computer Vision and Neural Network Inference Library Standard for...
"The OpenVX Computer Vision and Neural Network Inference Library Standard for...Edge AI and Vision Alliance
 
NVIDIA Graphics, Cg, and Transparency
NVIDIA Graphics, Cg, and TransparencyNVIDIA Graphics, Cg, and Transparency
NVIDIA Graphics, Cg, and TransparencyMark Kilgard
 
19564926 graphics-processing-unit
19564926 graphics-processing-unit19564926 graphics-processing-unit
19564926 graphics-processing-unitDayakar Siddula
 
Sig13 ce future_gfx
Sig13 ce future_gfxSig13 ce future_gfx
Sig13 ce future_gfxCass Everitt
 

Similaire à GeForce 8800 OpenGL Extensions (20)

SIGGRAPH 2012: NVIDIA OpenGL for 2012
SIGGRAPH 2012: NVIDIA OpenGL for 2012SIGGRAPH 2012: NVIDIA OpenGL for 2012
SIGGRAPH 2012: NVIDIA OpenGL for 2012
 
Hardware Shaders
Hardware ShadersHardware Shaders
Hardware Shaders
 
GFX Part 1 - Introduction to GPU HW and OpenGL ES specifications
GFX Part 1 - Introduction to GPU HW and OpenGL ES specificationsGFX Part 1 - Introduction to GPU HW and OpenGL ES specifications
GFX Part 1 - Introduction to GPU HW and OpenGL ES specifications
 
Nvidia Cuda Apps Jun27 11
Nvidia Cuda Apps Jun27 11Nvidia Cuda Apps Jun27 11
Nvidia Cuda Apps Jun27 11
 
NVIDIA's OpenGL Functionality
NVIDIA's OpenGL FunctionalityNVIDIA's OpenGL Functionality
NVIDIA's OpenGL Functionality
 
Introduction to GPU Programming
Introduction to GPU ProgrammingIntroduction to GPU Programming
Introduction to GPU Programming
 
Gentek Introduce(en)
Gentek Introduce(en)Gentek Introduce(en)
Gentek Introduce(en)
 
[03 1][gpu용 개발자 도구 - parallel nsight 및 axe] miller axe
[03 1][gpu용 개발자 도구 - parallel nsight 및 axe] miller axe[03 1][gpu용 개발자 도구 - parallel nsight 및 axe] miller axe
[03 1][gpu용 개발자 도구 - parallel nsight 및 axe] miller axe
 
OpenGL 4 for 2010
OpenGL 4 for 2010OpenGL 4 for 2010
OpenGL 4 for 2010
 
Tutorial MPEG 3D Graphics
Tutorial MPEG 3D GraphicsTutorial MPEG 3D Graphics
Tutorial MPEG 3D Graphics
 
FIR filter on GPU
FIR filter on GPUFIR filter on GPU
FIR filter on GPU
 
PG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrPG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated Asyncr
 
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...
 
Adobe AIR - Mobile Performance – Tips & Tricks
Adobe AIR - Mobile Performance – Tips & TricksAdobe AIR - Mobile Performance – Tips & Tricks
Adobe AIR - Mobile Performance – Tips & Tricks
 
"The OpenVX Computer Vision and Neural Network Inference Library Standard for...
"The OpenVX Computer Vision and Neural Network Inference Library Standard for..."The OpenVX Computer Vision and Neural Network Inference Library Standard for...
"The OpenVX Computer Vision and Neural Network Inference Library Standard for...
 
NVIDIA Graphics, Cg, and Transparency
NVIDIA Graphics, Cg, and TransparencyNVIDIA Graphics, Cg, and Transparency
NVIDIA Graphics, Cg, and Transparency
 
Real Time Video Processing in FPGA
Real Time Video Processing in FPGA Real Time Video Processing in FPGA
Real Time Video Processing in FPGA
 
19564926 graphics-processing-unit
19564926 graphics-processing-unit19564926 graphics-processing-unit
19564926 graphics-processing-unit
 
Sig13 ce future_gfx
Sig13 ce future_gfxSig13 ce future_gfx
Sig13 ce future_gfx
 
3 d to _hpc
3 d to _hpc3 d to _hpc
3 d to _hpc
 

Dernier

Independent Call Girls Andheri Nightlaila 9967584737
Independent Call Girls Andheri Nightlaila 9967584737Independent Call Girls Andheri Nightlaila 9967584737
Independent Call Girls Andheri Nightlaila 9967584737Riya Pathan
 
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdfNewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdfKhaled Al Awadi
 
Investment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy CheruiyotInvestment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy Cheruiyotictsugar
 
Pitch Deck Teardown: Geodesic.Life's $500k Pre-seed deck
Pitch Deck Teardown: Geodesic.Life's $500k Pre-seed deckPitch Deck Teardown: Geodesic.Life's $500k Pre-seed deck
Pitch Deck Teardown: Geodesic.Life's $500k Pre-seed deckHajeJanKamps
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCRashishs7044
 
Contemporary Economic Issues Facing the Filipino Entrepreneur (1).pptx
Contemporary Economic Issues Facing the Filipino Entrepreneur (1).pptxContemporary Economic Issues Facing the Filipino Entrepreneur (1).pptx
Contemporary Economic Issues Facing the Filipino Entrepreneur (1).pptxMarkAnthonyAurellano
 
Call Us 📲8800102216📞 Call Girls In DLF City Gurgaon
Call Us 📲8800102216📞 Call Girls In DLF City GurgaonCall Us 📲8800102216📞 Call Girls In DLF City Gurgaon
Call Us 📲8800102216📞 Call Girls In DLF City Gurgaoncallgirls2057
 
Intro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdfIntro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdfpollardmorgan
 
Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03DallasHaselhorst
 
Traction part 2 - EOS Model JAX Bridges.
Traction part 2 - EOS Model JAX Bridges.Traction part 2 - EOS Model JAX Bridges.
Traction part 2 - EOS Model JAX Bridges.Anamaria Contreras
 
Digital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfDigital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfJos Voskuil
 
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...ictsugar
 
Market Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMarket Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMintel Group
 
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCRashishs7044
 
Innovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfInnovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfrichard876048
 
Case study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailCase study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailAriel592675
 
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCRashishs7044
 
8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCRashishs7044
 
Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Seta Wicaksana
 
Future Of Sample Report 2024 | Redacted Version
Future Of Sample Report 2024 | Redacted VersionFuture Of Sample Report 2024 | Redacted Version
Future Of Sample Report 2024 | Redacted VersionMintel Group
 

Dernier (20)

Independent Call Girls Andheri Nightlaila 9967584737
Independent Call Girls Andheri Nightlaila 9967584737Independent Call Girls Andheri Nightlaila 9967584737
Independent Call Girls Andheri Nightlaila 9967584737
 
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdfNewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdf
 
Investment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy CheruiyotInvestment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy Cheruiyot
 
Pitch Deck Teardown: Geodesic.Life's $500k Pre-seed deck
Pitch Deck Teardown: Geodesic.Life's $500k Pre-seed deckPitch Deck Teardown: Geodesic.Life's $500k Pre-seed deck
Pitch Deck Teardown: Geodesic.Life's $500k Pre-seed deck
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR
 
Contemporary Economic Issues Facing the Filipino Entrepreneur (1).pptx
Contemporary Economic Issues Facing the Filipino Entrepreneur (1).pptxContemporary Economic Issues Facing the Filipino Entrepreneur (1).pptx
Contemporary Economic Issues Facing the Filipino Entrepreneur (1).pptx
 
Call Us 📲8800102216📞 Call Girls In DLF City Gurgaon
Call Us 📲8800102216📞 Call Girls In DLF City GurgaonCall Us 📲8800102216📞 Call Girls In DLF City Gurgaon
Call Us 📲8800102216📞 Call Girls In DLF City Gurgaon
 
Intro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdfIntro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdf
 
Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03
 
Traction part 2 - EOS Model JAX Bridges.
Traction part 2 - EOS Model JAX Bridges.Traction part 2 - EOS Model JAX Bridges.
Traction part 2 - EOS Model JAX Bridges.
 
Digital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfDigital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdf
 
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...
 
Market Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 EditionMarket Sizes Sample Report - 2024 Edition
Market Sizes Sample Report - 2024 Edition
 
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
 
Innovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfInnovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdf
 
Case study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailCase study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detail
 
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
 
8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR
 
Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...
 
Future Of Sample Report 2024 | Redacted Version
Future Of Sample Report 2024 | Redacted VersionFuture Of Sample Report 2024 | Redacted Version
Future Of Sample Report 2024 | Redacted Version
 

GeForce 8800 OpenGL Extensions

  • 1. GeForce 8800 OpenGL Extensions Evan Hart
  • 2. Roadmap What’s different The programmable core Feeding it New pathways The backend © NVIDIA Corporation 2007
  • 3. GeForce 8800 Differences Pipeline modifications Additional geometry shader stage Feedback available midstream Unified shading hardware Same instructions and characteristics across shaders © NVIDIA Corporation 2007
  • 4. GeForce 8800 OpenGL Pipeline Input Assembler More flexible memory Vertex Processor access model Multiple ways to read and Primitive Assembly write Additional pipeline stages Geometry Processor Video Memory Fundamentally new Transform capabilities Feedback Rasterization & Interpolation Fragment Processor Fixed / Built in stream Fixed stage Programmable stream Programmable stage Raster Operations Video memory IO Memory Framebuffer © NVIDIA Corporation 2007
  • 5. Unified Shaders Pixel Geometry Vertex (new) Floating Point Processor Future Physics Future Future ROP Memory © NVIDIA Corporation 2007
  • 6. Programmability #1 design concern of all extensions Efficient for input Efficient for computation Efficient for output No concessions for fixed-function Only the right choices for programmability Most will not work with the fixed function pipeline © NVIDIA Corporation 2007
  • 7. New Capabilities Unified instruction set for programs Integer instructions and data types Uniform set of structured branching constructs Indexable constants and temporaries New texture fetching instructions Attribute interpolation control Flat shaded, perspective-incorrect or centroid sampled © NVIDIA Corporation 2007
  • 8. New OpenGL Program Extensions OpenGL Shading Language EXT_gpu_shader4 GLSL extension for fourth generation shaders ARB_vertex_program style asm-like programs NV_gpu_program4 NV_vertex_program4 NV_fragment_program4 NV_geometry_program4 Cg 2.0 Not a GL extension Will support the capabilities © NVIDIA Corporation 2007
  • 9. GL_EXT_gpu_shader4 New integer support unsigned int, uvec2, uvec3, uvec4 Integer varying and attributes Integer texture samplers: isampler*, usampler* Real integer ops %, &, |, ^, >>, <<, ~ © NVIDIA Corporation 2007
  • 10. GL_EXT_gpu_shader4 cnt’d Extends varying type qualifiers Centroid – keeps the value inside the covered region Flat – no interpolation, like flat shading Noperspective – interpolate in screen-space © NVIDIA Corporation 2007
  • 11. Flat interpolation © NVIDIA Corporation 2007
  • 12. GL_EXT_gpu_shader4 cnt’d Instancing and procedural generation gl_VertexID Integer index derived from glDrawElements, etc Only with vertex arrays (no display lists) Only when using VBOs gl_InstanceID Integer index of the current primitive Only available when using DrawElementsInstancedEXT gl_PrimitiveID Integer describing the primitive number © NVIDIA Corporation 2007
  • 13. GL_EXT_gpu_shader4 cnt’d User-defined output variables Declared as “varying out” Allow more flexible outputs from the fragment shader Integers Integers and floats simultaneously Used instead of gl_FragColor or gl_FragData © NVIDIA Corporation 2007
  • 14. GL_EXT_gpu_shader cnt’d Exact texel fetches texelFetch*( sampler, icoord, lod) Treats a texture as a directly addressable array of texels Texture size query textureSize*( sampler, lod) Returns the dimensions of the texture level Shadow cubemaps Depth is compared against the 4th component Texture Gradient fetches texture*Grad( sampler, coord, dx, dy) Allows custom lod/anisotropy control © NVIDIA Corporation 2007
  • 15. GL_EXT_gpu_shader4 cnt’d Offset texture fetches texture*Offset( sampler, coord, offset) Offset must be compile-time constant expression Allow convenient shortcut for building filter kernels Offset size has an implementation dependent limit © NVIDIA Corporation 2007
  • 16. GL_NV_gpu_program4 Composed of three sub-extensions GL_NV_vertex_program4 GL_NV_fragment_program4 GL_NV_geometry_program4 Still based on 4-wide registers No longer really matches HW Enhances backward compatibility Provides same capabilities as EXT_gpu_shader4 © NVIDIA Corporation 2007
  • 17. Feeding the Shader Instancing Optimized rendering of multiple copies of an object New texture types Texture arrays New texture formats Integer textures Additional HDR formats Data buffers Fast flexible ways to swap blocks of constants Texture buffers Massive data store for shaders © NVIDIA Corporation 2007
  • 18. EXT_draw_instanced Efficient rendering for large numbers of objects Vertex array only glDrawArraysInstancedEXT glDrawElementsInstancedEXT Draw calls take one additional parameter # of instances to draw Each instance has a separate instance ID Used by the shader to change behavior Select transform matrix Select material Clever shaders can even draw different objects © NVIDIA Corporation 2007
  • 19. EXT_texture_array Array Textures Generalizes 1D and 2D textures to consist of an array of 1D or 2D textures 1D array loaded using glTexImage2D 2D array loaded using glTexImage3D Array indexable from shader program Access layer using r texture coordinate Layers must be same size and format No filtering between layers Arrays of cubemaps not currently supported Removes need for texture atlases Useful for instancing, terrain texturing © NVIDIA Corporation 2007
  • 20. Array Textures Cont’d T [0-1] R [0–3] S [0-1] © NVIDIA Corporation 2007
  • 21. Texture Format Extensions EXT_texture_integer Adds integer texture formats EXT_packed_float Space-efficient float format Relatively low precision More than good enough most times EXT_texture_shared_exponent Space-efficient float format Variable accuracy Can be more or less accurate than packed float Also more than good enough © NVIDIA Corporation 2007
  • 22. Integer texture formats EXT_texture_integer Adds integer texture formats 8, 16, and 32 bit per component Signed and unsigned RGB, RGBA, Luminance, Alpha, Intensity, and LA Lack filtering support Only available with EXT_gpu_shader4 / NV_gpu_program4 Uses Bitfields Color index emulation Lookup tables © NVIDIA Corporation 2007
  • 23. Packed Float Textures EXT_packed_float 11/11/10 floating point format 5 bit exponent per component Bias of -15 Only supports positive values (no sign bit) Can be used as framebuffer format Max values R/G – 65024 B – 64512 Size advantage can make it much faster than float16 Supports filtering, blending, and MSAA © NVIDIA Corporation 2007
  • 24. Packed Float Texture Usage © NVIDIA Corporation 2007
  • 25. RGBE / Shared Exponent textures EXT_texture_shared_exponent 9/9/9/5 RGBE format Similar to Radiance 8/8/8/8 RGBE format Shared 5 bit exponent (bias of -15) Source texture format only (not renderable to) Only supports positive values (no sign bit) © NVIDIA Corporation 2007
  • 26. New compressed texture formats EXT_texture_compression_latc Good format for greyscale w/ alpha compression 8-bits per texel Stored in 4x4 blocks (like DXT formats) Components are compressed independently EXT_texture_compression_rgtc Same properties as latc Returns (r,g,0,1) instead of (l,l,l,a) Useful for normal map compression (x,y) = 2 * (r,g) – 1 Z = sqrt( 1 – (x2 + y2) ) © NVIDIA Corporation 2007
  • 27. OpenGL Data Buffer Extensions EXT_bindable_uniform Used with EXT_gpu_shader4 Allows a uniform to be bound to a buffer object Quickly switch all values in a large structure or array NV_parameter_buffer_object Used with NV_gpu_program4 Defines new object containing banks of uniform parameters Enables rendering to parameter buffers Useful for instancing Provides a very large parameter store © NVIDIA Corporation 2007
  • 28. EXT_texture_buffer_object Bind buffer object as a texture Jumbo 1D texture Larger size limit (128 Mtexels) Does not support filtering Addressed by element Not normalized [0-1] Limited format support No RGB support (RGBA is OK) Minimum of 1 byte per component No compressed formats © NVIDIA Corporation 2007
  • 29. Texture buffer usage Large constant store Significantly larger than EXT_bindable_uniform Jumbo bone list for skinning Instancing data Transform matrices Materials Custom indexing Separate ‘index’ for position, normal, texture coordinate Less efficient than normal vertex fetching Not as coherent Can be useful interactive editing due to reduced sw cost © NVIDIA Corporation 2007
  • 30. New Pathways Geometry Shaders New pipeline stage Render target arrays Geometry shader indexing of texture arrays Transform feedback Export vertices mid-stream © NVIDIA Corporation 2007
  • 31. Geometry Shader Basics Input Primitive Assembly Standard primitives point, line, triangle… New primitive types include neighboring vertices Line with adjacency Video Memory 1 4 Geometry Shader 2 3 Triangle with adjacency 3 4 2 1 5 Clipping & Rasterization 6 Output Unique output type (independent from input type) Points, line strips or triangle strips Can output zero or more primitives Generated primitive stream is in the same order as inputted © NVIDIA Corporation 2007
  • 32. Geometry Shader Applications Better point sprites Rotation, non-square, motion blur Simple subdivision Single pass cube map creation Automatic stencil shadow polygon generation Fur rendering Fin generation Curve rendering 2D rendering, hair/fur Particle systems GPGPU Data amplification – variable number of outputs © NVIDIA Corporation 2007
  • 33. Geometry Shader Extensions EXT_geometry_shader4 GLSL geometry shaders NV_geometry_shader4 Adds to EXT_geometry_shader4 NV_geometry_program4 ARB_vertex_program style Part of NV_gpu_program4 © NVIDIA Corporation 2007
  • 34. EXT_geometry_shader New link-time parameters Primitive input and output type Max vertices output New shader variables gl_VerticesIn – number of input vertices gl_Layer – texture array layer target gl_PrimitiveID – Set primitive ID seen by the fragments gl_PrimitiveIDIn – Primitive ID based on input prims © NVIDIA Corporation 2007
  • 35. EXT_geometry_shader code varying in vec3 eyeNormal[gl_VerticesIn]; varying out vec3 oEyeNormal; for ( int i = 0; i < gl_VerticesIn; i++) { oEyeNormal = eyeNormal[i]; //causes all output varying to submit EmitVertex(); } //Start a new strip RestartPrimitive(); © NVIDIA Corporation 2007
  • 36. NV_geometry_shader4 Relax restrictions of EXT_geometry_shader4 Changing input/output primitive without a relink Changing max output vertices without a relink Defines additional behavior Quads and polygons at turned into triangles Shader accepting triangle accepts quads © NVIDIA Corporation 2007
  • 37. NV_geometry_program4 Same capabilities as geometry_shader4 ARB_vertex_program style syntax Inputs are ‘ATTRIB’ Outputs are ‘RESULT’ Input primitive, output primitive, and max vertices Declared in the shader © NVIDIA Corporation 2007
  • 38. Rendering to Texture Arrays Can select destination layer for each primitive in geometry program Write to “gl_Layer” or “result.layer” Can be used for single-pass render-to-cubemap Read input triangle Output to 6 cube map faces, transformed by correct face matrix No real savings for GPU Same transform and rasterization load Additional geometry shader work Simple culling may help © NVIDIA Corporation 2007
  • 39. EXT_transform_feedback Allows storing output from a vertex program or geometry program to buffer object Enables multi-pass operations on geometry, e.g. Store results of animation (skinning) to buffer, reuse for multiple lights Recursive subdivision Provides queries for number of primitives generated by geometry program © NVIDIA Corporation 2007
  • 40. Example: Terrain Subdivision Takes quad as input (actually line_adj primitive) Geometry program subdivides into 4 new quads using diamond-square subdivision Uses two VBOs, reads from one, writes to the other using transform feedback Then swap © NVIDIA Corporation 2007
  • 41. Terrain Subdivision © NVIDIA Corporation 2007
  • 42. The backend Floating point depth buffers EXT_draw_buffers2 sRGB Framebuffers Coverage sample anti-aliasing © NVIDIA Corporation 2007
  • 43. NV_depth_float Provides floating point depth buffers and textures Range extended to [–MAX_FLOAT, MAX_FLOAT] New functions for non-normalized values glDepthRangedNV glClearDepthdNV glDepthBoundsdNV Multiple formats DEPTH_COMPONENT32F_NV DEPTH32F_STENCIL8_NV FBO only Care must be taken in using the extra precision © NVIDIA Corporation 2007
  • 44. Float depth precision Normal [0-1] mapping is ineffective Precision bunches up at 0 Logarithmic distribution of floats 24 bits of precision [0.5 (2^-1) – 1.0 (2^0)] Perspective projection pushes scene toward 1 2x near maps to roughly 0.5 90%+ of scene [0.5 – 1.0] range Effectively a 25-bit depth buffer Changing to [0-256] is no better [128 – 256] has the same 24-bits © NVIDIA Corporation 2007
  • 45. Depth Precision Distribution Integer distribution 0 1 Floating point distribution 2x near © NVIDIA Corporation 2007
  • 46. The Solution Reverse the depth mapping Far = 0.0 Near = 1.0 (or higher) Precision bunching now reversed Compression and expansion line up Results in essentially linear depth buffering © NVIDIA Corporation 2007
  • 47. A Caveat Small precision cliff near 0 FP numbers often use denormals to fill it Graphics hardware typically avoid it Often CPU’s too, denorms are slow Problem is minimal due to exponent range Infinite far plane completely fixes it © NVIDIA Corporation 2007
  • 48. EXT_draw_buffers2 Provides per draw buffer control of Blend Enable Color mask Does not provide independent control of Blend Function Blend Equation Blend Color © NVIDIA Corporation 2007
  • 49. EXT_framebuffer_sRGB Allows framebuffer to be stored in sRGB space Provides perceptual color compression 8 bits sRGB is similar to 10 bits linear RGB Converts to/from sRGB on access to framebuffer Implements a gamma of 2.2 This matches most monitors relatively well Controlled via a simple enable Not available on all formats (>8 bit per component) © NVIDIA Corporation 2007
  • 50. CSAA Coverage Sample Anti-Aliasing GL_NV_framebuffer_multisample_coverage Decouples primitive coverage from depth/color Increases edge quality with little cost Memory and bandwidth overhead scale with # depth samples Worst case is as good as # of depth samples © NVIDIA Corporation 2007
  • 51. CSAA usage Extremely simple MSAA glRenderbufferStorageMultisampleEXT( GL_RENDERBUFFER_EXT, 4, GL_RGBA8, width, height) CSAA glRenderbufferStorageMultisampleCoverageNV( GL_RENDERBUFFER_EXT, 16, 4, GL_RGBA8, width, height) © NVIDIA Corporation 2007
  • 52. CSAA Modes Name Coverage Samples Color/Depth Samples 2x 2 2 4x 4 4 8x 8 4 8xQ 8 8 16x 16 4 16xQ 16 8 © NVIDIA Corporation 2007
  • 53. Thanks Simon Green Samuel Gateau Henry Moreton NVIDIA DevTech team © NVIDIA Corporation 2007
  • 54. New Developer Tools at GDC 02007 FX Composer 2 PerfKit 5 SDK 10 ShaderPerf 2 GPU-Accelerated Texture Tools © NVIDIA Corporation 2007 Shader Library