Hekaton is a large piece of kit; this session will focus on the internals of how in-memory tables and native stored procedures work and interact: database structure, use of FILESTREAM, backup/restore considerations for HA and DR, database durability, in-memory table make-up (hash and range indexes, row chains, Multi-Version Concurrency Control (MVCC)), plus design considerations and gotchas to watch out for.
The session will be demo led.
Note: the session will assume the basics of Hekaton are known, so it is recommended you attend the Basics session.
1. SQL Server 2014 Memory Optimised Tables – Advanced
2. SQL Server 2014 In-Memory Tables
(Extreme Transaction Processing)
Advanced
Tony Rogerson, SQL Server MVP
@tonyrogerson
tonyrogerson@torver.net
http://www.sql-server.co.uk
5. Who am I?
• Freelance SQL Server professional and Data Specialist
• Fellow BCS, MSc in BI, PGCert in Data Science
• Started out in 1986 – VSAM, System W, Application System, DB2,
Oracle, SQL Server since 4.21a
• Awarded SQL Server MVP yearly since ’97
• Founded UK SQL Server User Group back in ’99, founder member
of DDD, SQL Bits, SQL Relay, SQL Santa and Project Mildred
• Interested in commodity based distributed processing of Data.
• I have an allotment where I grow vegetables - I do not do any
analysis on how they grow, where they grow etc. My allotment is
where I escape technology.
6. Agenda
• Define concept “In-Memory”
• Test Harness
• Storage
• Row Chains
• Indexing
– Hash
– Bw Tree and B+ Tree
– Multi-Version Concurrency Control (MVCC)
• Monitoring
8. What is IMDB / DBIM?
• Entire database resides in Main memory
• Hybrid - selective tables reside entirely in memory
• Row / Column store
– Oracle keeps two copies of the data – in row AND column store –
optimiser chooses
– SQL Server 2014 you choose –
• Columnstore (Column Store Indexing) – traditionally for OLAP
• Rowstore (memory optimised table with hash or range index) – traditionally
for OLTP
• Non-Volatile Memory changes everything – it’s here now!
– SQL Server 2014 Buffer Pool Extensions
9. SQL Server “in-memory”
• Column Store (designed for OLAP)
– Column compression to reduce memory required
– SQL 2012:
• One index per table, nonclustered only, table read-only
– SQL 2014:
• Only index on table – clustered, table is now updateable
• Memory Optimised Tables (designed for OLTP)
– Standard CREATE TABLE with restrictions on FK’s and other
constraints
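The "standard CREATE TABLE with restrictions" above can be sketched as follows; table, column and index names are hypothetical, and a MEMORY_OPTIMIZED_DATA filegroup is assumed to already exist on the database:

```sql
-- Hypothetical example: a memory optimised table with both index types.
CREATE TABLE dbo.OrderEntry
(
    OrderID    INT IDENTITY(1,1) NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1048576),
    CustomerID INT           NOT NULL,
    EntryDate  DATETIME2     NOT NULL,
    Amount     DECIMAL(10,2) NOT NULL,

    -- Range (Bw tree) index for ordered access on EntryDate.
    INDEX ix_EntryDate NONCLUSTERED (EntryDate)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
```

The SQL Server 2014 restrictions apply: no FOREIGN KEY or CHECK constraints, and IDENTITY only with a seed and increment of 1.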
10. Traditional Tables: Buffer Pool usage (Read)
[Diagram: a 500GB table on stable storage (disk/SSD); a 60GB buffer pool in server memory]
• 500GB does not fit into 60GB!
• Buffer pool (the cache) allows us to read tables larger
than available memory
11. Traditional Tables: Buffer Pool usage (Write)
[Diagram: a 500GB table on stable storage (disk/SSD); a 60GB buffer pool in server memory]
• 500GB does not fit into 60GB!
• Buffer pool (the cache) + Transaction Log allow us
to write tables larger than available memory
• Penalty for that ability!
13. Test Harness
[Diagram: a logical table (VIEW) over three partitions – In-mem P01 and In-mem P02 hold data from the last 5 minutes, On-store P01 holds data older than 5 minutes; an INSTEAD OF trigger consults a routing table (values 0/1) to decide which in-memory table each insert/update/delete goes to]
14. Test harness operations
• 10 connections each inserting 500 rows in a single
transaction, running every 15 seconds with a random
delay of 0 – 10 seconds.
• Manual Runs:
– Move rows older than 5 minutes from in-memory into on-storage tables
– Update rows
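The routing in the test harness could be sketched like this; all object names are hypothetical and the three partition tables are assumed to share the view's column list:

```sql
-- Logical table: a view over the two in-memory partitions and the
-- on-storage history table.
CREATE VIEW dbo.OrderEntryAll
AS
SELECT OrderID, EntryDate, Amount FROM dbo.OrderEntry_InMem_P01
UNION ALL
SELECT OrderID, EntryDate, Amount FROM dbo.OrderEntry_InMem_P02
UNION ALL
SELECT OrderID, EntryDate, Amount FROM dbo.OrderEntry_OnStore_P01;
GO

-- INSTEAD OF trigger: a routing table (0 or 1) controls which
-- in-memory partition receives new rows.
CREATE TRIGGER dbo.trg_OrderEntryAll_Insert
ON dbo.OrderEntryAll
INSTEAD OF INSERT
AS
BEGIN
    IF (SELECT ActivePartition FROM dbo.RoutingControl) = 0
        INSERT dbo.OrderEntry_InMem_P01 (OrderID, EntryDate, Amount)
        SELECT OrderID, EntryDate, Amount FROM inserted;
    ELSE
        INSERT dbo.OrderEntry_InMem_P02 (OrderID, EntryDate, Amount)
        SELECT OrderID, EntryDate, Amount FROM inserted;
END;
```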
18. Transaction Logging
• All data is “in-memory” – WAL not required
• On COMMIT TRAN
– Checks made depending on Isolation
– Data made Durable (to Checkpoint File Pair (CFP))
• Versions of rows kept in memory (see MVCC)
• COMMIT is dramatically quicker because
– Less data logged
• no UNDO
• no INDEX records
• multiple log records are grouped together to reduce log header overhead
– Bulked IO
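Where the data does not need to survive a restart, the logging above can be avoided entirely with a non-durable table; a minimal sketch (names hypothetical):

```sql
-- SCHEMA_ONLY: the table definition is durable, the rows are not --
-- nothing is written to the log or to Checkpoint File Pairs on COMMIT.
-- Suited to staging tables and session state; rows are lost on restart.
CREATE TABLE dbo.StagingRows
(
    RowID   INT NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 65536),
    Payload NVARCHAR(1000) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);
```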
20. How MOTs use FILESTREAM
• Multiple Files
– Data files (16MiB or 128MiB) store row data
– Delta files (1MiB or 8MiB) store tombstone information
– Data/Delta files may be larger if there are long running transactions
• All reads and writes are sequential
• Offline Checkpoint Worker
– Reads log
– Appends to Data / Delta files
– Random IO is not required!
21. Life of a Row Version
[Diagram: row versions flow from memory into Checkpoint File Pairs (Data/Delta) via CHECKPOINT, then MERGE, then GARBAGE COLLECT once no active rows remain]
1. Write to storage: the offline checkpoint worker reads the LDF and writes to the CFP
2. Close the CFP and mark it ACTIVE (durable checkpoint established)
3. ACTIVE CFPs with >= 50% free space can be merged
4. Files with no active rows can be deleted
22. CHECKPOINT
• After 512MiB data written to log or Manual CHECKPOINT
• CFP state: UNDER CONSTRUCTION (on recovery, data
taken from transaction log)
• On CHECKPOINT
– The UNDER CONSTRUCTION CFP is closed and becomes ACTIVE
– Now have a durable checkpoint
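One way to watch the CFP lifecycle is the checkpoint-files DMV; a sketch (column names as documented for SQL Server 2014):

```sql
-- Summarise Checkpoint File Pairs by state (PRECREATED,
-- UNDER CONSTRUCTION, ACTIVE, MERGE TARGET, ...) and type (DATA/DELTA).
SELECT state_desc,
       file_type_desc,
       COUNT(*) AS file_count,
       SUM(file_size_in_bytes) / 1048576 AS total_size_mib
FROM sys.dm_db_xtp_checkpoint_files
GROUP BY state_desc, file_type_desc
ORDER BY state_desc, file_type_desc;

CHECKPOINT;  -- a manual checkpoint closes the UNDER CONSTRUCTION CFP
```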
24. Life of a Row Version (repeat of slide 21)
25. Row Updates
• Actually DELETE and INSERT
• Delete marker (not the row) is stored in DELTA file
• INSERT is stored in DATA file
• Versions hang around until no other sessions require
them
26. Garbage Collection
• Before GC can be performed:
– Rows get Durabilised to storage on COMMIT
– Checkpoint File Pairs grow, new ones created
– Various CFP will contain the “current” rows we want in case of a
failure [remember – log gets backed up and truncated]
– Merge CFP into a new one – pull out current rows into a new CFP
• CFP end up containing just old row versions no longer
required
• GC removes old CFP’s (FILESTREAM GC)
27. SQL Server Memory Pools
[Diagram: the SQL Server memory space is FINITE and split between the buffer pool (table data paged in/out of MDF/NDF files on stable storage as required), internal structures (proc/log cache etc.) and memory optimised tables; the pools themselves are ELASTIC]
30. The hash is the position in the memory map table
[Diagram: the hash of the key gives the bucket position in the mapping table]
31. Hash Collisions
• Example: 1 million unique values hashed into 5,000 hash values?
– Multiple unique values must hash to the same hash value
• Multiple key (real data) values hang off the same hash bucket
• Termed: Row Chain
– A chain of connected rows
• The SQL Server 2014 hash function is balanced and follows a
Poisson distribution – roughly 1/3 of buckets empty, 1/3 with one
index key and 1/3 with two or more index keys
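Bucket occupancy and row-chain lengths can be checked with the hash-index stats DMV; a sketch (DMV and columns as documented for SQL Server 2014):

```sql
-- A high avg_chain_length relative to 1 suggests BUCKET_COUNT is too
-- low (many collisions) or that duplicates/versions share buckets.
SELECT OBJECT_NAME(h.object_id) AS table_name,
       i.name                   AS index_name,
       h.total_bucket_count,
       h.empty_bucket_count,
       h.avg_chain_length,
       h.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats AS h
JOIN sys.indexes AS i
  ON i.object_id = h.object_id
 AND i.index_id  = h.index_id;
```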
33. Memory Optimised Table Row Header
[Diagram: row header with up to 8 index memory pointers; a 3-row row chain where all three rows hash to bucket #3; the most recent row is first in the chain]
• Rows in stable storage are accessed using files, pages and extents
• Rows in memory optimised tables are accessed using memory pointers
35. Row Header – Multiple Index Pointers
[Diagram: each hash bucket holds a memory pointer to the first row; the row chain is defined through the row header – the 'Index #x' pointer is the next memory pointer in that index's row chain]
36. Hash Index Scan
• Scan follows the hash index (8 bytes per bucket)
• Jump to the first row in each row chain
• Read the row chain (lots of bytes per row – header + data)
38. What makes a Row Chain grow?
• Hash collisions
• Row versions from updates
– An update is a {delete, insert}
• Late garbage collection because of long running
transactions
• Garbage collection can’t keep up because of box load and
amount to do {active threads do garbage collection on
completion of work}
39. Range Index (Bw Tree ≈ B+Tree)
[Diagram: B+ Tree – 8KiB pages; one row per level tracks a page in the level below; root down to leaf; one leaf row per table row: {index-col}{pointer}]
40. Range Index (Bw Tree ≈ B+Tree) (repeat of slide 39)
41. Range Index (Bw Tree ≈ B+Tree)
[Diagram: Bw Tree – 8KiB pages; root down to leaf; one leaf row per unique value: {index-col}{memory-pointer} into the row chain]
42. Range Index (Bw Tree ≈ B+Tree)
[Diagram: B+ Tree and Bw Tree side by side – both use 8KiB pages, root down to leaf; the B+ Tree has one leaf row per table row ({index-col}{pointer}), the Bw Tree has one leaf row per unique value ({index-col}{memory-pointer}) pointing into the row chain]
43. Multi-Version Concurrency Control (MVCC)
• Compare And Swap replaces Latching
• Update is actually INSERT and Tombstone
• Row Versions [kept in memory]
• Garbage collection task cleans up versions no longer required
(stale data rows)
• Version not required determined by active transactions and
snapshot the connection has of the data
• Your in-memory table can double, triple, or grow 100x in size
depending on access patterns (data modifications + query durations)
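The snapshot behaviour above can be made explicit with the SNAPSHOT table hint; a sketch (table and column names hypothetical):

```sql
-- Each reader sees the row version current at its snapshot; concurrent
-- updates create new versions (an update = delete marker + insert)
-- rather than blocking the reader with locks or latches.
BEGIN TRAN;
    SELECT Amount
    FROM dbo.OrderEntry WITH (SNAPSHOT)
    WHERE OrderID = 42;
    -- The version read here stays visible to this transaction even if
    -- another session updates OrderID 42 before we COMMIT.
COMMIT;
```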
44. Versions in the Row Chain
[Diagram: row chain ordered newest to oldest]
TS_START | TS_END | Connection
       2 | NULL   | 55
       7 | NULL   | 60
      50 | NULL   | 80
46. Versions in the Row Chain
[Diagram: the same row chain – the current version at the head, the oldest version now marked as a tombstone]
TS_START | TS_END | Connection
       2 | NULL   | 55
       7 | NULL   | 60
      50 | NULL   | 80
47. The problem with Versions
• Available memory is finite
• All data must fit in memory
• You may well run out of memory!!
• Removal of versions from memory is not immediate
49. Summary
• Memory is finite
• Access patterns
– Watch your long running transactions
– COMMIT is now your friend (batch up)
– Version creation is your enemy
• Consider your index strategy very carefully and monitor it
Editor's notes
Far from Hekaton being an extension of DBCC PINTABLE, it’s a huge new piece of functionality that can significantly improve the scalability of various data-based scenarios – not just OLTP but also ETL and real-time BI.
This session will introduce Hekaton features and how and when to use them; it will be demo led, covering Hekaton end-to-end: enabling it, creating tables, index design, query considerations, native stored procedures, durability [or not], and methods of identifying what to put in memory or not.