Performance Tuning of Spatial
Queries in SQL Server
Deep Dive into Spatial Indexing

Michael Rys (@SQLServerMike)
Principal Program Manager
Microsoft Corp.




                               October 11-14, Seattle, WA
DEMO
A spatial query……




Q: Why is my Query so Slow?
A: Usually because the index isn’t being used.
Q: How do I tell?
A: Look at the query plan of a query such as
   SELECT * FROM T WHERE g.STIntersects(@x) = 1




                               AD404-M| Spatial Performance   3
Hinting the Index
Spatial indexes can be forced if needed.

SELECT *
FROM T WITH(INDEX(T_g_idx))
WHERE g.STIntersects(@x) = 1

Use SQL Server 2008 SP1 or 2008 R2!



But Why Isn't My Index Used?
Plan choice is cost-based
• QO uses various information, including cardinality
     -- Variable
     DECLARE @x geometry = 'POINT (0 0)'
     SELECT *
     FROM T
     WHERE T.g.STIntersects(@x) = 1

     -- Literal
     SELECT *
     FROM T
     WHERE T.g.STIntersects('POINT (0 0)') = 1

     -- Parameter
     EXEC sp_executesql
          N'SELECT *
            FROM T
            WHERE T.g.STIntersects(@x) = 1',
          N'@x geometry', N'POINT (0 0)'
When can we estimate cardinality?
• Variables: never
• Literals: not for spatial since they are not literals
  under the covers
• Parameters: yes, but cached, so first call matters
Spatial Indexing Basics
[Figure: objects A, B, C, D, E; the primary filter (index lookup) keeps candidates A, B, D; the secondary filter (original predicate) keeps A, B]

In general, split predicates in two
•   Primary filter finds all candidates, possibly
    with false positives (but never false negatives)
•   Secondary filter removes false positives
The index provides our primary filter
Original predicate is our secondary filter
Some tweaks to this scheme
•   Sometimes possible to skip secondary filter
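
The two-stage scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the uniform grid, the cell size `CELL`, the sample points and the circular query window are all invented here and do not reflect how SQL Server stores or tessellates data.

```python
import math

CELL = 10.0  # grid cell size; an arbitrary choice for this sketch

def cells_for_box(xmin, ymin, xmax, ymax):
    """Return the set of grid cells that a bounding box touches."""
    return {
        (cx, cy)
        for cx in range(math.floor(xmin / CELL), math.floor(xmax / CELL) + 1)
        for cy in range(math.floor(ymin / CELL), math.floor(ymax / CELL) + 1)
    }

def build_index(points):
    """Map each grid cell to the ids of the points stored in it."""
    index = {}
    for pid, (x, y) in points.items():
        cell = (math.floor(x / CELL), math.floor(y / CELL))
        index.setdefault(cell, set()).add(pid)
    return index

def query_circle(points, index, center, radius):
    """Return (candidates, matches) for a circular query window."""
    cx, cy = center
    # Primary filter: every point in a cell touched by the window's bounding
    # box. May contain false positives, never false negatives.
    candidates = set()
    for cell in cells_for_box(cx - radius, cy - radius, cx + radius, cy + radius):
        candidates |= index.get(cell, set())
    # Secondary filter: the exact predicate removes the false positives.
    matches = {pid for pid in candidates
               if math.dist(points[pid], center) <= radius}
    return candidates, matches

points = {"A": (12, 12), "B": (15, 18), "C": (55, 70), "D": (19, 11), "E": (40, 5)}
index = build_index(points)
candidates, matches = query_circle(points, index, center=(13, 14), radius=5)
print(sorted(candidates), sorted(matches))
```

With this sample data the primary filter keeps A, B and D, and the secondary filter then drops D, which shares a cell with the window but lies outside it.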


Using B+-Trees for Spatial Index
SQL Server has B+-Trees
Spatial indexing is usually done through other
structures
• Quad tree, R-Tree
Challenge: How do we repurpose the B+-Tree
to handle spatial queries?
• Add a level of indirection!




Mapping to the B+-Tree
B+-Trees handle linearly ordered sets well
We need to somehow linearly order 2D space
• Either the plane or the globe
We want a locality-preserving mapping from
the original space to the line
• i.e., close objects should be close in the index
• Can’t be done, but we can approximate it
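
One well-known approximate locality-preserving mapping is the Hilbert curve, which the deck names later as the numbering used by the multi-level grid. The sketch below uses the standard textbook conversion from cell coordinates to curve position; the function name `hilbert_index` and the 4 x 4 grid are assumptions for illustration, not SQL Server internals.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve over an n x n grid.

    n must be a power of two; x and y are 0-based cell coordinates.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

# Visit all cells of a 4 x 4 grid in Hilbert order.
order = sorted(((x, y) for x in range(4) for y in range(4)),
               key=lambda c: hilbert_index(4, *c))
print(order)
```

Consecutive positions on the curve are always neighbouring cells, so a range of index keys covers a compact region. The converse does not hold: two neighbouring cells can be far apart on the curve, which is why the mapping is only an approximation.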




SQL Server Spatial Indexing Story
Planar Index                             Geographic Index
• Requires bounding box                  • No bounding box
• Only one grid                          • Two top-level projection grids

Phases: Indexing Phase, Primary Filter, Secondary Filter

[Figure: 4 x 4 grid with cells numbered in Hilbert order]
      1       2      15       16
      4       3      14       13
      5       8       9       12
      6       7      10       11

   1. Overlay a grid on the spatial object
   2. Identify grids for spatial object to store in index
   3. Identify grids for query object(s)
   4. Intersecting grids identifies candidates
   5. Apply actual CLR method on candidates to find matches
SQL Server Spatial Indexing Story
Multi-Level Grid
• Much more flexible than a simple grid
• Hilbert numbering
• Modified adaptable QuadTree
Grid index features
• 4 levels
• Customizable grid subdivisions
• Customizable maximum number of cells per object (default
  16)
• NEW IN SQL Server Codename “DENALI”: New Default
  tessellation with 8 levels of cell nesting



Multi-Level Grid



[Figure: multi-level grid; the root cell is / (“cell 0”), and a cell on the fourth level is addressed by a path such as /4/2/3/1]

Deepest-cell Optimization: Only keep the lowest level cell in index
Covering Optimization: Only record higher level cells when all lower
cells are completely covered by the object
Cell-per-object Optimization: User restricts max number of cells per object
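
The three optimizations can be illustrated with a toy tessellator. The sketch below makes several simplifying assumptions that are not in the deck: objects are axis-aligned rectangles, every level splits a cell 2 x 2 (a plain quadtree rather than the configurable grid densities), and the budget check is a simple breadth-first rule. The function name `tessellate` is hypothetical.

```python
from collections import deque

def tessellate(obj, bbox, max_levels=4, max_cells=16):
    """Tessellate an axis-aligned rectangle into quadtree cells.

    obj and bbox are (xmin, ymin, xmax, ymax). Returns a list of
    (path, covered) pairs, where path looks like "/4/2/3/1".
    """
    def overlaps(a, b):
        return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

    def covers(a, b):
        return a[0] <= b[0] and a[1] <= b[1] and a[2] >= b[2] and a[3] >= b[3]

    def children(c):
        x0, y0, x1, y1 = c
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        return [(x0, y0, xm, ym), (xm, y0, x1, ym),
                (x0, ym, xm, y1), (xm, ym, x1, y1)]

    cells = []
    queue = deque([((), bbox)])  # breadth-first, so shallow cells split first
    while queue:
        path, cell = queue.popleft()
        if covers(obj, cell):
            # Covering optimization: record the covered cell, do not descend.
            cells.append((path, True))
            continue
        kids = [(path + (i,), c) for i, c in enumerate(children(cell), 1)
                if overlaps(obj, c)]
        # Cells-per-object optimization: stop splitting once the budget is hit.
        over_budget = len(cells) + len(queue) + len(kids) > max_cells
        if len(path) == max_levels or over_budget:
            # Deepest-cell optimization: keep this cell only, not its ancestors.
            cells.append((path, False))
            continue
        queue.extend(kids)
    return [("/" + "/".join(map(str, p)), covered) for p, covered in cells]

print(tessellate((0, 0, 4, 4), (0, 0, 8, 8)))
print(tessellate((1, 1, 7, 7), (0, 0, 8, 8), max_cells=4))
```

A lower `max_cells` yields fewer, coarser cells per object, which keeps the index small at the price of more false positives in the primary filter.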
Implementation of the Index
 Persist a table-valued function
 • Internally rewrite queries to use the table

 Columns of the internal table:
 • Prim_key: reference to the base table (15 columns and 895 byte limitation)
 • cell_id: Varbinary(5) encoding of grid cell id
 • srid: Spatial Reference ID; have to be the same to produce match
 • cell_attr:
     0 – cell at least touches the object (but not 1 or 2)
     1 – guarantee that object partially covers cell
     2 – object covers cell

 Base Table T
     Prim_key   geography
     1          g1
     2          g2
     3          g3

 CREATE SPATIAL INDEX sixd
 ON T(geography)

 Internal Table for sixd
     Prim_key   cell_id    srid   cell_attr
     1          0x00007    42     0
     3          0x00007    42     1
     3          0x0000A    42     2
     3          0x0000B    42     0
     3          0x0000C    42     1
     1          0x0000D    42     0
     2          0x00014    42     1
New AUTO GRID Index
• NEW IN SQL Server Codename “DENALI”
• Has 8 levels of cell nesting
• No manual grid density selection:
  •   Fixed at HLLLLLLL
• default number of cells per object:
  •   8 for geometry
  •   12 for geography

• More stable performance
      • for windows of different size
      • for data with different spatial density
• For default values:
  •   Up to 2x faster for longer queries > 500 ms
      •   More efficient primary filter
      •   Fewer rows returned
  •   10ms slower for very fast queries < 50 ms
      •   Increased tessellation time which is constant

Spatial Index Performance




New grid gives much more stable performance for query windows of different sizes
Better grid coverage gives fewer high peaks
Index Creation and Maintenance
     Create index example GEOMETRY:
        CREATE SPATIAL INDEX sixd ON spatial_table(geom_column)
        WITH (
             BOUNDING_BOX = (0, 0, 500, 500),
             GRIDS = (LOW, LOW, MEDIUM, HIGH),
             CELLS_PER_OBJECT = 20)

     Create index example GEOGRAPHY:
        CREATE SPATIAL INDEX sixd ON spatial_table(geogr_column)
        USING GEOGRAPHY_GRID
        WITH (
             GRIDS = (LOW, LOW, MEDIUM, HIGH),
             CELLS_PER_OBJECT = 20)

     NEW IN SQL Server “DENALI” (equivalent to default creation):
        CREATE SPATIAL INDEX sixd ON spatial_table(geogr_column)
        USING GEOGRAPHY_AUTO_GRID
        WITH (CELLS_PER_OBJECT = 20)

     Use ALTER and DROP INDEX for maintenance.
DEMO
Indexing and Performance




Spatial Methods supported by Index

Geometry:
• STIntersects() = 1
• STOverlaps() = 1
• STEquals() = 1
• STTouches() = 1
• STWithin() = 1
• STContains() = 1
• STDistance() < val
• STDistance() <= val
• Nearest Neighbor (new in Denali)
• Filter() = 1

Geography:
• STIntersects() = 1
• STOverlaps() = 1
• STEquals() = 1
• STWithin() = 1
• STContains() = 1
• STDistance() < val
• STDistance() <= val
• Nearest Neighbor (new in Denali)
• Filter() = 1
How Costing is Done
• The stats on the index contain a trie constructed on
  the string form of the packed binary(5) typed Cell ID.
• When a window query is compiled with a sniffable
  window object, the tessellation function on the
  window object is run at compile time. The results are
  used to construct a trie for use during compilation.
  •   May lead to wrong compilation for later objects
• No costing on:
  • Local variables, constants, results of expressions
• Use different indices and different stored procs to
  account for different query characteristics
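
To make the costing idea concrete, here is a rough sketch of how prefix counts over cell IDs could turn a tessellated window into a row estimate. It is an assumption-laden illustration: cell IDs are modelled as tuples, the trie is flattened into two counters, and the names `build_stats` and `estimate_rows` are invented. The real statistics format is not described in the deck beyond the bullets above.

```python
from collections import Counter

def build_stats(cell_ids):
    """Summarise index rows as counts per exact cell and per cell prefix."""
    exact = Counter(cell_ids)
    prefix = Counter()
    for cid in cell_ids:
        for depth in range(1, len(cid) + 1):
            prefix[cid[:depth]] += 1
    return exact, prefix

def estimate_rows(window_cells, exact, prefix):
    """Rough upper bound on index rows touched by a tessellated window."""
    total = 0
    for cell in window_cells:
        # Rows stored for the cell itself and for its descendants.
        total += prefix[cell]
        # Rows stored for each ancestor of the cell.
        total += sum(exact[cell[:d]] for d in range(1, len(cell)))
    return total

index_rows = [(7,), (7, 2), (7, 2, 4), (7, 2, 4, 1), (7, 2, 5), (7, 3), (8,)]
exact, prefix = build_stats(index_rows)
small_window = [(7, 2, 4)]
large_window = [(7,), (8,)]
print(estimate_rows(small_window, exact, prefix))
print(estimate_rows(large_window, exact, prefix))
```

The estimate depends entirely on the window's cells, which is why a window that cannot be sniffed at compile time (a local variable, constant or expression result) gets no such costing.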

Seeking into a Spatial Index
Minimize I/O and random I/O
Intuition: small windows should touch small portions of the index
A cell 7.2.4 matches
•   Itself
•   Ancestors
•   Descendants
[Figure: Spatial Index S with the entries for cells 7, 7.2 and 7.2.4 marked]
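
The matching rule for a cell can be sketched against a sorted list standing in for the B+-tree. This is a minimal illustration, assuming cell paths are stored as tuples such as `(7, 2, 4)`; the function name `matching_rows` is hypothetical. Ancestors need one point lookup each, while the cell and all of its descendants form a single contiguous key range.

```python
from bisect import bisect_left

def matching_rows(index, cell):
    """Return rows of a sorted index that match a query cell.

    index is a sorted list of cell paths (tuples); cell is a non-empty
    tuple such as (7, 2, 4).
    """
    rows = []
    # Ancestors: one point lookup per shorter prefix.
    for depth in range(1, len(cell)):
        key = cell[:depth]
        pos = bisect_left(index, key)
        while pos < len(index) and index[pos] == key:
            rows.append(index[pos])
            pos += 1
    # The cell itself and its descendants: one contiguous range scan.
    lo = bisect_left(index, cell)
    hi = bisect_left(index, cell[:-1] + (cell[-1] + 1,))
    rows.extend(index[lo:hi])
    return rows

index = sorted([(7,), (7, 2), (7, 2, 4), (7, 2, 4, 1), (7, 2, 5), (7, 3), (8,)])
print(matching_rows(index, (7, 2, 4)))
```

A small window tessellates into a few deep cells, so it touches a few short ranges plus a handful of ancestor lookups, which matches the intuition stated on the slide.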




Understanding the Index Query Plan
[Figure: query plan with the operators T(@g), Optional Sort, Remove dup ranges, Ranges, and Spatial Index Seek]
Other Query Processing Support
• Index intersection
  • Enables efficient mixing of spatial and non-spatial
    predicates
• Matching
  •   New in SQL Server “Denali”: Nearest Neighbor query
  •   Distance queries: convert to STIntersects
  •   Commutativity: a.STIntersects(b) = b.STIntersects(a)
  •   Dual: a.STContains(b) = b.STWithin(a)
  •   Multiple spatial indexes on the same column
      • Various bounding boxes, granularities
  • Outer references as window objects
      • Enables spatial join to use one index

Other Spatial Performance Improvements
in SQL Server Codename “Denali”
• Spatial index build time for point data can be as
  much as four to five times faster
• Optimized spatial query plan for STDistance and
  STIntersects like queries
• Faster point data queries
• Optimized STBuffer, lower memory footprint




Spatial Nearest Neighbor (Denali)
Main scenario
  • Give me the closest 5 Italian restaurants
Execution plan
  • SQL Server 2008/2008 R2: table scan
  • SQL Server Codename “Denali”: uses spatial index

Specific query pattern required
• SELECT TOP(5) *
  FROM Restaurants r
  WHERE r.type = 'Italian'
    AND r.pos.STDistance(@me) IS NOT NULL
  ORDER BY r.pos.STDistance(@me)
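
Why an index helps here can be shown with a toy nearest-neighbor search over a uniform grid. This is not SQL Server's algorithm, which the deck does not describe; the grid, the ring-by-ring expansion and the names `build_grid` and `nearest` are assumptions chosen to show that the closest rows can be found without scanning and sorting the whole table.

```python
import heapq
import math
from collections import defaultdict

def build_grid(points, cell):
    """Bucket points into square grid cells of side length `cell`."""
    grid = defaultdict(list)
    for p in points:
        grid[(math.floor(p[0] / cell), math.floor(p[1] / cell))].append(p)
    return grid

def nearest(grid, cell, q, k):
    """Return the k points closest to q, visiting grid cells ring by ring."""
    total = sum(len(bucket) for bucket in grid.values())
    k = min(k, total)
    if k == 0:
        return []
    cx, cy = math.floor(q[0] / cell), math.floor(q[1] / cell)
    best = []
    r = 0
    while True:
        # Visit the cells at ring distance r around the query cell.
        for x in range(cx - r, cx + r + 1):
            for y in range(cy - r, cy + r + 1):
                if max(abs(x - cx), abs(y - cy)) == r:
                    for p in grid.get((x, y), ()):
                        best.append((math.dist(p, q), p))
        best = heapq.nsmallest(k, best)
        # Every unvisited point is more than r * cell away from q, so the
        # search can stop once the k-th best distance is within that bound.
        if len(best) == k and best[-1][0] <= r * cell:
            return [p for _, p in best]
        r += 1

points = [(0, 0), (1, 1), (5, 5), (10, 10), (2, 2), (-3, 4)]
grid = build_grid(points, cell=2.0)
closest = nearest(grid, 2.0, q=(0.2, 0.1), k=3)
print(closest)
```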

DEMO
Nearest Neighbor performance




Nearest Neighbor Performance
Find the closest 50 business points (22 million in total)




NN query vs best current workaround (sort all points in 10km radius)
*Average time for NN query is ~236 ms
Limitations of Spatial Plan Selection
• Off whenever window object is not a
  parameter:
 • Spatial join (window is an outer reference)
 • Local variable, string constant, or complex expression
• Has the classic SQL Server parameter-
  sensitivity problem
 • SQL compiles once for one parameter value and reuses the
   plan for all parameter values
 • Different plans for different sizes of window require
   application logic to bucketize the windows
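
The bucketizing logic mentioned in the last bullet could look like the following on the application side. Everything specific here is hypothetical: the area thresholds, the procedure names and the use of a bounding box to measure the window are placeholders to be replaced by values that fit the actual data.

```python
import bisect

# Hypothetical area thresholds and procedure names, for illustration only.
BUCKET_LIMITS = [1.0, 100.0, 10_000.0]
PROCEDURES = ["usp_search_tiny", "usp_search_small",
              "usp_search_medium", "usp_search_large"]

def procedure_for_window(xmin, ymin, xmax, ymax):
    """Pick the stored procedure whose cached plan fits the window size."""
    area = (xmax - xmin) * (ymax - ymin)
    return PROCEDURES[bisect.bisect_left(BUCKET_LIMITS, area)]

print(procedure_for_window(0, 0, 0.5, 0.5))
print(procedure_for_window(0, 0, 50, 50))
```

Because each procedure is only ever called with windows of similar size, the plan compiled for its first call is likely to remain appropriate for later calls.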


Index Support
•   Can be built in parallel
•   Can be hinted
•   File groups/Partitioning
    •   Aligned to base table or Separate file group
    •   Full rebuild only
•   New catalog views, DDL Events
•   DBCC Checks
•   Supportability stored procedures
•   New in SQL Server “Denali”: Index Page and Row Compression
    •   Ca. 50% smaller indices, 0-15% slower queries
•   Not supported
    •   Online rebuild
    •   Database Tuning Advisor
SET Options
Spatial indexes require:
•   ANSI_NULLS: ON
•   ANSI_PADDING: ON
•   ANSI_WARNINGS: ON
•   CONCAT_NULL_YIELDS_NULL: ON
•   NUMERIC_ROUNDABORT: OFF
•   QUOTED_IDENTIFIER: ON




Index Hinting
FROM T WITH (INDEX (<Spatial_idxname>))
• Spatial index is treated the same way a
  non-clustered index is
 • the order of the hint is reflected in the order of the indexes
   in the plan
 • multiple index hints are concatenated
 • no duplicates are allowed
• The following restrictions exist:
 • The spatial index must be either first in the first index hint or
   last in the last index hint for a given table.
 • Only one spatial index can be specified in any index hint for
   a given table.
Query Window Hinting (Denali)
SELECT * FROM table t
with(SPATIAL_WINDOW_MAX_CELLS=1024)
WHERE t.geom.STIntersects(@window)=1
•       Used if an index is chosen (does not force an index)
•       Overrides the default (512 for geometry, 768 for
        geography)
•       Rule of thumb:
    •     Higher value makes primary filter phase longer but reduces
          work in secondary filter phase
    •     Set higher for dense spatial data
    •     Set lower for sparse spatial data
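
The trade-off behind the rule of thumb can be seen in a small experiment: the more cells are used to approximate a query window, the less area outside the window is covered, so fewer false positives reach the secondary filter. The sketch below is an assumption for illustration only (a circular window on a uniform n x n grid over the unit square); it does not model how SPATIAL_WINDOW_MAX_CELLS is applied internally.

```python
import math

def cells_touching_circle(n, center=(0.5, 0.5), radius=0.3):
    """Count cells of an n x n grid over the unit square that touch a circle."""
    cx, cy = center
    count = 0
    for i in range(n):
        for j in range(n):
            # Closest point of cell (i, j) to the circle centre.
            px = min(max(cx, i / n), (i + 1) / n)
            py = min(max(cy, j / n), (j + 1) / n)
            if math.hypot(px - cx, py - cy) <= radius:
                count += 1
    return count

def overcoverage(n, radius=0.3):
    """Area of the touching cells divided by the true circle area (>= 1)."""
    covered = cells_touching_circle(n, radius=radius) / (n * n)
    return covered / (math.pi * radius ** 2)

for n in (4, 8, 32):
    print(n, cells_touching_circle(n), round(overcoverage(n), 2))
```

The cell count grows quickly as the grid gets finer, which corresponds to the longer primary filter phase, while the overcoverage ratio approaches 1, which corresponds to less work for the secondary filter.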
DEMO
Query hinting




Spatial Catalog Views
• sys.spatial_indexes catalog view
• sys.spatial_index_tessellations catalog view
• Entries in sys.indexes for a spatial index:
 • A clustered index on the internal table of the spatial index
 • An entry with type = 4 for the spatial index itself
• An entry in sys.internal_tables
• An entry in sys.index_columns



New Spatial Histogram Helpers (Denali)
  sp_spatial_help_geometry_histogram
  sp_spatial_help_geography_histogram
  Used for spatial data and index analysis




Histogram of 22 million business points over US
Left: SSMS view of a histogram
Right: Custom drawing on top of Bing Maps
Indexing Support Procedures
sys.sp_help_spatial_geometry_index
sys.sp_help_spatial_geometry_index_xml
sys.sp_help_spatial_geography_index
sys.sp_help_spatial_geography_index_xml

Provide information about index:
64 properties
10 of which are considered core

sys.sp_help_spatial_geometry_index
 Arguments
   Parameter        Type            Description
   @tabname         nvarchar(776)   the name of the table for which the index
                                    has been specified
   @indexname       sysname         the index name to be investigated
   @verboseoutput   tinyint         0 core set of properties is reported
                                    1 all properties are being reported
   @query_sample    geometry        A representative query sample that will be
                                    used to test the usefulness of the index. It
                                    may be a representative object or a query
                                    window.

 Results in property name/value pair table of the format:

   PropName: nvarchar(256)           PropValue: sql_variant
Some of the returned Properties
Property (type, core)
    Description

Number_Of_Rows_Selected_By_Primary_Filter (bigint, core)
    P = Number of rows selected by the primary filter.
Number_Of_Rows_Selected_By_Internal_Filter (bigint, core)
    S = Number of rows selected by the internal filter. For these rows, the
    secondary filter is not called.
Number_Of_Times_Secondary_Filter_Is_Called (bigint, core)
    Number of times the secondary filter is called.
Percentage_Of_Rows_NotSelected_By_Primary_Filter (float, core)
    Suppose there are N rows in the base table and P are selected by the
    primary filter. This is (N-P)/N as a percentage.
Percentage_Of_Primary_Filter_Rows_Selected_By_Internal_Filter (float, core)
    This is S/P as a percentage. The higher the percentage, the better the
    index is at avoiding the more expensive secondary filter.
Number_Of_Rows_Output (bigint, core)
    O = Number of rows output by the query.
Internal_Filter_Efficiency (float, core)
    This is S/O as a percentage.
Primary_Filter_Efficiency (float, core)
    This is O/P as a percentage. The higher the efficiency, the fewer false
    positives have to be processed by the secondary filter.
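
The percentage properties follow directly from N, P, S and O as defined above. A short sketch makes the arithmetic explicit; the function name `filter_stats` and the sample numbers are made up for illustration, and only the formulas come from the property descriptions.

```python
def filter_stats(n, p, s, o):
    """Compute the percentage properties from the four row counts.

    n: rows in the base table, p: rows selected by the primary filter,
    s: rows selected by the internal filter, o: rows output by the query.
    """
    return {
        "Percentage_Of_Rows_NotSelected_By_Primary_Filter": 100.0 * (n - p) / n,
        "Percentage_Of_Primary_Filter_Rows_Selected_By_Internal_Filter": 100.0 * s / p,
        "Internal_Filter_Efficiency": 100.0 * s / o,
        "Primary_Filter_Efficiency": 100.0 * o / p,
    }

# Made-up counts: 1000 rows, 200 pass the primary filter, 50 are settled by
# the internal filter, and 100 rows are finally output.
stats = filter_stats(n=1000, p=200, s=50, o=100)
for name, value in stats.items():
    print(f"{name}: {value:.1f}%")
```

In this example Primary_Filter_Efficiency is 50%, meaning half of the candidates produced by the index were false positives.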
DEMO
Indexing Supportability




Spatial Tips on index settings
Some best practice recommendations (YMMV):
• Start out with new default tessellation
• Point data: always use HIGH for all 4 levels.
  CELLS_PER_OBJECT is not relevant in this case.
• Simple, relatively consistent polygons: set all levels to
  LOW or MEDIUM, MEDIUM, LOW, LOW
• Very complex LineString or Polygon instances:
    •    High number of CELLS_PER_OBJECT (often 8192 is best)
    •    Setting all 4 levels to HIGH may be beneficial
• Polygons or line strings which have highly variable
  sizes: experimentation is needed.
• Rule of thumb for GEOGRAPHY: if MMMM is not
  working, try HHMM
What to do if my Spatial Query is slow?
• Make sure you are running SQL Server 2008 SP1, 2008 R2 or
  “Denali”
• Check query plan for use of index
• Make sure it is a supported operation
• Hint the index (and/or a different join type)
• Do not use a spatial index when there is a highly selective non-
  spatial predicate
• Run above index support procedure:
  •   Assess effectiveness of primary filter (Primary_Filter_Efficiency)
  •   Assess effectiveness of internal filter (Internal_Filter_Efficiency)
  •   Redefine or define a new index with better characteristics
      • More appropriate bounding box for GEOMETRY
      • Better grid densities

Related Content
Weblog
•   http://blogs.msdn.com/isaac
•   http://blogs.msdn.com/edkatibah
•   http://johanneskebeck.spaces.live.com/
•   http://sqlblog.com/blogs/michael_rys/

Forum: http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=1629&SiteID=1
Whitepapers, Websites & Code
•   Denali CTP3: http://sqlcat.com/sqlcat/b/whitepapers/archive/2011/08/08/new-spatial-features-in-sql-server-code-named-denali-community-technology-preview-3.aspx
•   Spatial Wiki: http://social.technet.microsoft.com/wiki/contents/articles/4136.aspx
•   SQL Server 2008 Spatial Site: http://www.microsoft.com/sqlserver/2008/en/us/spatial-data.aspx
•   SQL Spatial Codeplex: http://www.codeplex.com/sqlspatialtools
•   http://www.sharpgis.net/page/SQL-Server-2008-Spatial-Tools.aspx
•   http://www.codeplex.com/ProjNET
•   http://www.geoquery2008.com/
•   SIGMOD 2008 Paper: Spatial Indexing in Microsoft SQL Server 2008
•   And of course Books Online!

Thank you
for attending this session and the
2011 PASS Summit in Seattle





 
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...Michael Rys
 
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...Michael Rys
 
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...Michael Rys
 
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)Michael Rys
 
U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...
U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...
U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...Michael Rys
 
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)Michael Rys
 
Introducing U-SQL (SQLPASS 2016)
Introducing U-SQL (SQLPASS 2016)Introducing U-SQL (SQLPASS 2016)
Introducing U-SQL (SQLPASS 2016)Michael Rys
 
Tuning and Optimizing U-SQL Queries (SQLPASS 2016)
Tuning and Optimizing U-SQL Queries (SQLPASS 2016)Tuning and Optimizing U-SQL Queries (SQLPASS 2016)
Tuning and Optimizing U-SQL Queries (SQLPASS 2016)Michael Rys
 
Taming the Data Science Monster with A New ‘Sword’ – U-SQL
Taming the Data Science Monster with A New ‘Sword’ – U-SQLTaming the Data Science Monster with A New ‘Sword’ – U-SQL
Taming the Data Science Monster with A New ‘Sword’ – U-SQLMichael Rys
 
Killer Scenarios with Data Lake in Azure with U-SQL
Killer Scenarios with Data Lake in Azure with U-SQLKiller Scenarios with Data Lake in Azure with U-SQL
Killer Scenarios with Data Lake in Azure with U-SQLMichael Rys
 
ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)Michael Rys
 

More from Michael Rys (20)

Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
 
Big Data Processing with .NET and Spark (SQLBits 2020)
Big Data Processing with .NET and Spark (SQLBits 2020)Big Data Processing with .NET and Spark (SQLBits 2020)
Big Data Processing with .NET and Spark (SQLBits 2020)
 
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...
 
Running cost effective big data workloads with Azure Synapse and Azure Data L...
Running cost effective big data workloads with Azure Synapse and Azure Data L...Running cost effective big data workloads with Azure Synapse and Azure Data L...
Running cost effective big data workloads with Azure Synapse and Azure Data L...
 
Big Data Processing with Spark and .NET - Microsoft Ignite 2019
Big Data Processing with Spark and .NET - Microsoft Ignite 2019Big Data Processing with Spark and .NET - Microsoft Ignite 2019
Big Data Processing with Spark and .NET - Microsoft Ignite 2019
 
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
 
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
 
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
 
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
 
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
 
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
 
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
 
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
 
U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...
U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...
U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...
 
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
 
Introducing U-SQL (SQLPASS 2016)
Introducing U-SQL (SQLPASS 2016)Introducing U-SQL (SQLPASS 2016)
Introducing U-SQL (SQLPASS 2016)
 
Tuning and Optimizing U-SQL Queries (SQLPASS 2016)
Tuning and Optimizing U-SQL Queries (SQLPASS 2016)Tuning and Optimizing U-SQL Queries (SQLPASS 2016)
Tuning and Optimizing U-SQL Queries (SQLPASS 2016)
 
Taming the Data Science Monster with A New ‘Sword’ – U-SQL
Taming the Data Science Monster with A New ‘Sword’ – U-SQLTaming the Data Science Monster with A New ‘Sword’ – U-SQL
Taming the Data Science Monster with A New ‘Sword’ – U-SQL
 
Killer Scenarios with Data Lake in Azure with U-SQL
Killer Scenarios with Data Lake in Azure with U-SQLKiller Scenarios with Data Lake in Azure with U-SQL
Killer Scenarios with Data Lake in Azure with U-SQL
 
ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)
 

Recently uploaded

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 

Recently uploaded (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 

Performance Tuning of Spatial Queries in SQL Server

  • 1. Performance Tuning of Spatial Queries in SQL Server Deep Dive into Spatial Indexing Michael Rys (@SQLServerMike) Principal Program Manager Microsoft Corp. October 11-14, Seattle, WA
  • 2. DEMO A spatial query…
  • 3. Q: Why is my Query so Slow? A: Usually because the index isn’t being used. Q: How do I tell? A: SELECT * FROM T WHERE g.STIntersects(@x) = 1 AD404-M| Spatial Performance 3
  • 4. Hinting the Index Spatial indexes can be forced if needed. SELECT * FROM T WITH(INDEX(T_g_idx)) WHERE g.STIntersects(@x) = 1 Use SQL Server 2008 SP1 or 2008 R2!
  • 5. But Why Isn't My Index Used? Plan choice is cost-based • QO uses various information, including cardinality • Literal: SELECT * FROM T WHERE T.g.STIntersects('POINT (0 0)') = 1 • Variable: DECLARE @x geometry = 'POINT (0 0)' SELECT * FROM T WHERE T.g.STIntersects(@x) = 1 • Parameter: EXEC sp_executesql N'SELECT * FROM T WHERE T.g.STIntersects(@x) = 1', N'@x geometry', N'POINT (0 0)' When can we estimate cardinality? • Variables: never • Literals: not for spatial since they are not literals under the covers • Parameters: yes, but cached, so first call matters
  • 6. Spatial Indexing Basics [Figure: objects A to E; the primary filter (index lookup) keeps D, A, B; the secondary filter (original predicate) keeps A, B] In general, split predicates in two • Primary filter finds all candidates, possibly with false positives (but never false negatives) • Secondary filter removes false positives The index provides our primary filter Original predicate is our secondary filter Some tweaks to this scheme • Sometimes possible to skip secondary filter
  • 7. Using B+-Trees for Spatial Index SQL Server has B+-Trees Spatial indexing is usually done through other structures • Quad tree, R-Tree Challenge: How do we repurpose the B+-Tree to handle spatial queries? • Add a level of indirection!
  • 8. Mapping to the B+-Tree B+-Trees handle linearly ordered sets well We need to somehow linearly order 2D space • Either the plane or the globe We want a locality-preserving mapping from the original space to the line • i.e., close objects should be close in the index • Can’t be done, but we can approximate it
  • 9. SQL Server Spatial Indexing Story Planar Index • Requires bounding box • Only one grid Geographic Index • No bounding box • Two top-level projection grids [Figure: 4x4 grid with cells numbered 1 2 15 16 / 4 3 14 13 / 5 8 9 12 / 6 7 10 11] Indexing Phase 1. Overlay a grid on the spatial object 2. Identify grids for spatial object to store in index Primary Filter 3. Identify grids for query object(s) 4. Intersecting grids identifies candidates Secondary Filter 5. Apply actual CLR method on candidates to find matches
  • 10. SQL Server Spatial Indexing Story Multi-Level Grid • Much more flexible than a simple grid • Hilbert numbering • Modified adaptable QuadTree Grid index features • 4 levels • Customizable grid subdivisions • Customizable maximum number of cells per object (default 16) • NEW IN SQL Server Codename “DENALI”: New Default tessellation with 8 levels of cell nesting
  • 11. Multi-Level Grid /4/2/3/1 / (“cell 0”) Deepest-cell Optimization: Only keep the lowest level cell in index Covering Optimization: Only record higher level cells when all lower cells are completely covered by the object Cell-per-object Optimization: User restricts max number of cells per object
  • 12. Implementation of the Index Persist a table-valued function • Internally rewrite queries to use the table Base Table T (Prim_key, geography): (1, g1), (2, g2), (3, g3) CREATE SPATIAL INDEX sixd ON T(geography) Internal Table for sixd (Prim_key, cell_id, srid, cell_attr): (1, 0x00007, 42, 0), (3, 0x00007, 42, 1), (3, 0x0000A, 42, 2), (3, 0x0000B, 42, 0), (3, 0x0000C, 42, 1), (1, 0x0000D, 42, 0), (2, 0x00014, 42, 1) • cell_id: Varbinary(5) encoding of grid cell id • srid: Spatial Reference ID; has to be the same to produce a match • cell_attr: 0 – cell at least touches the object (but not 1 or 2); 1 – guarantee that object partially covers cell; 2 – object covers cell • 15 columns and 895 byte limitation
  • 13. New AUTO GRID Index • NEW IN SQL Server Codename “DENALI” • Has 8 levels of cell nesting • No manual grid density selection: • Fixed at HLLLLLLL • default number of cells per object: • 8 for geometry • 12 for geography • More stable performance • for windows of different size • for data with different spatial density • For default values: • Up to 2x faster for longer queries > 500 ms • More efficient primary filter • Fewer rows returned • 10ms slower for very fast queries < 50 ms • Increased tessellation time which is constant
  • 14. Spatial Index Performance New grid gives much more stable performance for query windows of different sizes Better grid coverage gives fewer high peaks
  • 15. Index Creation and Maintenance Create index example GEOMETRY: CREATE SPATIAL INDEX sixd ON spatial_table(geom_column) WITH ( BOUNDING_BOX = (0, 0, 500, 500), GRIDS = (LOW, LOW, MEDIUM, HIGH), CELLS_PER_OBJECT = 20) Create index example GEOGRAPHY: CREATE SPATIAL INDEX sixd ON spatial_table(geogr_column) USING GEOGRAPHY_GRID WITH ( GRIDS = (LOW, LOW, MEDIUM, HIGH), CELLS_PER_OBJECT = 20) NEW IN SQL Server “DENALI” (equivalent to default creation): CREATE SPATIAL INDEX sixd ON spatial_table(geogr_column) USING GEOGRAPHY_AUTO_GRID WITH (CELLS_PER_OBJECT = 20) Use ALTER INDEX and DROP INDEX for maintenance.
  • 16. DEMO Indexing and Performance
  • 17. Spatial Methods supported by Index Geometry: • STIntersects() = 1 • STOverlaps() = 1 • STEquals() = 1 • STTouches() = 1 • STWithin() = 1 • STContains() = 1 • STDistance() < val • STDistance() <= val • Nearest Neighbor (new in Denali) • Filter() = 1 Geography: • STIntersects() = 1 • STOverlaps() = 1 • STEquals() = 1 • STWithin() = 1 • STContains() = 1 • STDistance() < val • STDistance() <= val • Nearest Neighbor (new in Denali) • Filter() = 1
  • 18. How Costing is Done • The stats on the index contain a trie constructed on the string form of the packed binary(5) typed Cell ID. • When a window query is compiled with a sniffable window object, the tessellation function on the window object is run at compile time. The results are used to construct a trie for use during compilation. • May lead to wrong compilation for later objects • No costing on: • Local variables, constants, results of expressions • Use different indices and different stored procs to account for different query characteristics
  • 19. Understanding the Index Query Plan
  • 20. Seeking into a Spatial Index Minimize I/O and random I/O Intuition: small windows should touch small portions of the index A cell 7.2.4 matches • Itself • Ancestors • Descendants [Figure: Spatial Index S with cells 7, 7.2, 7.2.4]
  • 21. Understanding the Index Query Plan [Figure: plan operators labeled T(@g), Optional Sort, Remove dup ranges, Ranges, Spatial Index Seek]
  • 22. Other Query Processing Support • Index intersection • Enables efficient mixing of spatial and non-spatial predicates • Matching • New in SQL Server “Denali”: Nearest Neighbor query • Distance queries: convert to STIntersects • Commutativity: a.STIntersects(b) = b.STIntersects(a) • Dual: a.STContains(b) = b.STWithin(a) • Multiple spatial indexes on the same column • Various bounding boxes, granularities • Outer references as window objects • Enables spatial join to use one index
  • 23. Other Spatial Performance Improvements in SQL Server Codename “Denali” • Spatial index build time for point data can be as much as four to five times faster • Optimized spatial query plan for STDistance- and STIntersects-like queries • Faster point data queries • Optimized STBuffer, lower memory footprint
  • 24. Spatial Nearest Neighbor (Denali) Main scenario • Give me the closest 5 Italian restaurants Execution plan • SQL Server 2008/2008 R2: table scan • SQL Server Codename “Denali”: uses spatial index Specific query pattern required • SELECT TOP(5) * FROM Restaurants r WHERE r.type = 'Italian' AND r.pos.STDistance(@me) IS NOT NULL ORDER BY r.pos.STDistance(@me)
  • 25. DEMO Nearest Neighbor performance
  • 26. Nearest Neighbor Performance Find the closest 50 business points (22 million in total) NN query vs best current workaround (sort all points in 10km radius) *Average time for NN query is ~236ms
  • 27. Limitations of Spatial Plan Selection • Off whenever window object is not a parameter: • Spatial join (window is an outer reference) • Local variable, string constant, or complex expression • Has the classic SQL Server parameter-sensitivity problem • SQL compiles once for one parameter value and reuses the plan for all parameter values • Different plans for different sizes of window require application logic to bucketize the windows
  • 28. Index Support • Can be built in parallel • Can be hinted • File groups/Partitioning • Aligned to base table or Separate file group • Full rebuild only • New catalog views, DDL Events • DBCC Checks • Supportability stored procedures • New in SQL Server “Denali”: Index Page and Row Compression • Ca. 50% smaller indices, 0-15% slower queries • Not supported • Online rebuild • Database Tuning advisor
  • 29. SET Options Spatial indexes require: • ANSI_NULLS: ON • ANSI_PADDING: ON • ANSI_WARNINGS: ON • CONCAT_NULL_YIELDS_NULL: ON • NUMERIC_ROUNDABORT: OFF • QUOTED_IDENTIFIER: ON
  • 30. Index Hinting FROM T WITH (INDEX (<Spatial_idxname>)) • Spatial index is treated the same way a non-clustered index is • the order of the hint is reflected in the order of the indexes in the plan • multiple index hints are concatenated • no duplicates are allowed • The following restrictions exist: • The spatial index must be either first in the first index hint or last in the last index hint for a given table. • Only one spatial index can be specified in any index hint for a given table.
  • 31. Query Window Hinting (Denali) SELECT * FROM table t WITH (SPATIAL_WINDOW_MAX_CELLS=1024) WHERE t.geom.STIntersects(@window)=1 • Used if an index is chosen (does not force an index) • Overrides the default (512 for geometry, 768 for geography) • Rule of thumb: • Higher value makes primary filter phase longer but reduces work in secondary filter phase • Set higher for dense spatial data • Set lower for sparse spatial data
  • 32. DEMO Query hinting
  • 33. Spatial Catalog Views • sys.spatial_indexes catalog view • sys.spatial_index_tessellations catalog view • Entries in sys.indexes for a spatial index: • A clustered index on the internal table of the spatial index • A spatial index (type = 4) for the spatial index itself • An entry in sys.internal_tables • An entry in sys.index_columns
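
A query along these lines could be used to inspect the two catalog views named on this slide. It is a sketch that assumes a table dbo.T (a hypothetical name) with at least one spatial index; the selected columns are a subset of what the views expose.

```sql
-- List spatial indexes on a table together with their tessellation settings.
SELECT
    si.name                    AS index_name,
    si.type                    AS index_type,        -- 4 for a spatial index
    si.spatial_index_type_desc,                      -- GEOMETRY or GEOGRAPHY
    si.tessellation_scheme,
    st.bounding_box_xmin,
    st.bounding_box_ymin,
    st.bounding_box_xmax,
    st.bounding_box_ymax,
    st.level_1_grid_desc,
    st.level_2_grid_desc,
    st.level_3_grid_desc,
    st.level_4_grid_desc,
    st.cells_per_object
FROM sys.spatial_indexes AS si
JOIN sys.spatial_index_tessellations AS st
    ON  st.object_id = si.object_id
    AND st.index_id  = si.index_id
WHERE si.object_id = OBJECT_ID(N'dbo.T');   -- hypothetical table name
```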
  • 34. New Spatial Histogram Helpers (Denali) • sp_spatial_help_geometry_histogram • sp_spatial_help_geography_histogram • Used for spatial data and index analysis • Example: histogram of 22 million business points over the US • Left: SSMS view of a histogram • Right: custom drawing on top of Bing Maps
  • 36. sys.sp_help_spatial_geometry_index • Arguments (parameter, type: description): • @tabname, nvarchar(776): the name of the table for which the index has been specified • @indexname, sysname: the name of the index to be investigated • @verboseoutput, tinyint: 0 reports the core set of properties; 1 reports all properties • @query_sample, geometry: a representative query sample that will be used to test the usefulness of the index; it may be a representative object or a query window • Results are returned as a table of property name/value pairs: PropName nvarchar(256), PropValue sql_variant
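
A call using the four arguments listed on this slide might look like the sketch below. The table name T, the index name T_g_idx (borrowed from the earlier hinting slide), and the sample window are assumptions chosen for illustration.

```sql
-- Representative query window used to evaluate the index (hypothetical values).
DECLARE @sample geometry =
    geometry::STGeomFromText('POLYGON((0 0, 10 0, 10 10, 0 10, 0 0))', 0);

-- Report the core set of properties for the index against that window.
EXEC sys.sp_help_spatial_geometry_index
    @tabname       = N'T',
    @indexname     = N'T_g_idx',
    @verboseoutput = 0,          -- 1 would report all properties
    @query_sample  = @sample;
```

Running the procedure with several windows that are typical of the workload gives a better picture than a single sample.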
  • 37. Some of the returned properties (property, type, verbosity: description) • Number_Of_Rows_Selected_By_Primary_Filter, bigint, Core: P = number of rows selected by the primary filter. • Number_Of_Rows_Selected_By_Internal_Filter, bigint, Core: S = number of rows selected by the internal filter. For these rows, the secondary filter is not called. • Number_Of_Times_Secondary_Filter_Is_Called, bigint, Core: number of times the secondary filter is called. • Percentage_Of_Rows_NotSelected_By_Primary_Filter, float, Core: suppose there are N rows in the base table and P are selected by the primary filter; this is (N-P)/N as a percentage. • Percentage_Of_Primary_Filter_Rows_Selected_By_Internal_Filter, float, Core: this is S/P as a percentage. The higher the percentage, the better the index is at avoiding the more expensive secondary filter. • Number_Of_Rows_Output, bigint, Core: O = number of rows output by the query. • Internal_Filter_Efficiency, float, Core: this is S/O as a percentage. • Primary_Filter_Efficiency, float, Core: this is O/P as a percentage. The higher the efficiency, the fewer false positives have to be processed by the secondary filter.
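
The ratios defined on this slide can be worked through with made-up numbers. In the sketch below the values of N, P, S, and O are invented for illustration, and the last column assumes the secondary filter runs once for every primary-filter candidate that the internal filter did not already accept.

```sql
-- Invented example: 1,000,000 rows in the table, 2,000 pass the primary filter,
-- 1,200 of those are accepted by the internal filter, 1,600 rows are output.
DECLARE @N float = 1000000, @P float = 2000, @S float = 1200, @O float = 1600;

SELECT
    100.0 * (@N - @P) / @N AS Percentage_Of_Rows_NotSelected_By_Primary_Filter,              -- 99.8
    100.0 * @S / @P        AS Percentage_Of_Primary_Filter_Rows_Selected_By_Internal_Filter, -- 60
    100.0 * @S / @O        AS Internal_Filter_Efficiency,                                    -- 75
    100.0 * @O / @P        AS Primary_Filter_Efficiency,                                     -- 80
    @P - @S                AS Assumed_Secondary_Filter_Calls;                                -- 800
```

In this example 400 of the 2,000 candidates are false positives, which is what the 80% primary filter efficiency expresses.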
  • 38. DEMO Indexing Supportability
  • 39. Spatial Tips on index settings Some best practice recommendations (YMMV): • Start out with the new default tessellation • Point data: always use HIGH for all 4 levels. CELLS_PER_OBJECT is not relevant in this case. • Simple, relatively consistent polygons: set all levels to LOW or MEDIUM, MEDIUM, LOW, LOW • Very complex LineString or Polygon instances: • High number of CELLS_PER_OBJECT (often 8192 is best) • Setting all 4 levels to HIGH may be beneficial • Polygons or line strings which have highly variable sizes: experimentation is needed. • Rule of thumb for GEOGRAPHY: if MMMM is not working, try HHMM
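
Two of these recommendations could be expressed as index definitions like the following sketch. The table T, the columns g (geometry) and geog (geography), the index names, the bounding box, and the CELLS_PER_OBJECT value are assumptions for illustration; the table also needs a clustered primary key before a spatial index can be created.

```sql
-- Point data in a geometry column: HIGH on all four levels.
CREATE SPATIAL INDEX T_g_points_idx
ON T (g)
USING GEOMETRY_GRID
WITH (
    BOUNDING_BOX = (xmin = 0, ymin = 0, xmax = 1000, ymax = 1000),  -- hypothetical extent
    GRIDS = (LEVEL_1 = HIGH, LEVEL_2 = HIGH, LEVEL_3 = HIGH, LEVEL_4 = HIGH)
);

-- Geography column: the HHMM variant to try when MMMM is not working.
CREATE SPATIAL INDEX T_geog_hhmm_idx
ON T (geog)
USING GEOGRAPHY_GRID
WITH (
    GRIDS = (LEVEL_1 = HIGH, LEVEL_2 = HIGH, LEVEL_3 = MEDIUM, LEVEL_4 = MEDIUM),
    CELLS_PER_OBJECT = 16
);
```

After each change, rerunning the index support procedure against representative windows shows whether the new settings actually improved the filter efficiencies.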
  • 40. What to do if my Spatial Query is slow? • Make sure you are running SQL Server 2008 SP1, 2008 R2 or “Denali” • Check the query plan for use of the index • Make sure it is a supported operation • Hint the index (and/or a different join type) • Do not use a spatial index when there is a highly selective non-spatial predicate • Run the index support procedure described above: • Assess effectiveness of primary filter (Primary_Filter_Efficiency) • Assess effectiveness of internal filter (Internal_Filter_Efficiency) • Redefine or define a new index with better characteristics • More appropriate bounding box for GEOMETRY • Better grid densities
  • 41. Related Content Weblogs: • http://blogs.msdn.com/isaac • http://blogs.msdn.com/edkatibah • http://johanneskebeck.spaces.live.com/ • http://sqlblog.com/blogs/michael_rys/ Forum: http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=1629&SiteID=1 Whitepapers, Websites & Code: • Denali CTP3: http://sqlcat.com/sqlcat/b/whitepapers/archive/2011/08/08/new-spatial-features-in-sql-server-code-named-denali-community-technology-preview-3.aspx • Spatial Wiki: http://social.technet.microsoft.com/wiki/contents/articles/4136.aspx • SQL Server 2008 Spatial Site: http://www.microsoft.com/sqlserver/2008/en/us/spatial-data.aspx • SQL Spatial Codeplex: http://www.codeplex.com/sqlspatialtools • http://www.sharpgis.net/page/SQL-Server-2008-Spatial-Tools.aspx • http://www.codeplex.com/ProjNET • http://www.geoquery2008.com/ • SIGMOD 2008 Paper: Spatial Indexing in Microsoft SQL Server 2008 • And of course Books Online!
  • 43. Thank you for attending this session and the 2011 PASS Summit in Seattle October 11-14, Seattle, WA

Editor's Notes

  1. Add USING syntax to show the new tessellation scheme
  2. Procedure: construct 4 points/ranges for each cell in T; remove duplicates; sort (optionally); seek
  3. Clustering imposes ordering on index
  4. Procedure: construct 4 points/ranges for each cell in T; remove duplicates; sort (optionally); seek
  5. TBD
  6. Add tessellation
  7. Experimentation: For instance, consider this dataset: US Highways. In this dataset some of the LineStrings are quite long (over 2000 miles) and others are quite short (400 meters or less). For optimal performance, the following two indexes were roughly equivalent: Geography index: MEDIUM, MEDIUM, MEDIUM, MEDIUM, 1024; Geometry index: LOW, LOW, LOW, LOW, 1024