5. What is a Full Table Scan?
• SQL Execution Access Method
– Reads data from a table while applying filters
• Same mechanics apply to:
– Full (sub)Partition Scan
– Index Fast Full Scan
• Always available regardless of SQL construct
6. How does it work?
• Whole segment is read
– From segment header up to (L)HWM
– Blocks are read whether empty or not
• Several blocks read at once
– This is key to understanding the full potential of FTS
• Data goes into SGA => db file scattered read
• Data goes into session PGA => direct path read
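The payoff of reading several blocks at once can be sketched with back-of-the-envelope arithmetic (the 5 ms per IO request, 8 KB blocks, and 1 GB segment below are illustrative assumptions, and transfer time is ignored):

```python
# Time to read a 1 GB segment, single-block IO vs multiblock IO.
IO_LATENCY_MS = 5                 # assumed seek + latency per IO request
BLOCK_KB = 8                      # assumed 8 KB block size
SEGMENT_KB = 1024 * 1024          # 1 GB segment
blocks = SEGMENT_KB // BLOCK_KB   # 131072 blocks to read

def scan_seconds(blocks_per_io):
    ios = blocks // blocks_per_io          # fewer, larger IOs when reading many blocks at once
    return ios * IO_LATENCY_MS / 1000

print(scan_seconds(1))    # 655.36 s: one IO per block
print(scan_seconds(128))  # 5.12 s: 1 MB multiblock reads
```

Same data, 128x fewer IO requests: this is why multiblock reads are the heart of FTS performance.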
7. Why FTS rocks?
• Can crunch A LOT of data efficiently (*)
– A couple of orders of magnitude more data than index scans, per disk
• Full Scan (~200MB/s per disk)
– Wait IO seek + latency per (large) chunk
– Parallelizes well, increasing bandwidth (GB/s)
• Index Scan (~1.5MB/s per disk)
– Wait IO seek + latency per block
https://vimeo.com/160371916
8. Why FTS doesn’t rock?
• FTS reads ~100x faster than an index scan
• Needs to read the whole segment
– GBs read to return just a few rows
• Concurrent users share bandwidth
– More users means less bandwidth for each
• Index faster if filters less than ~1% of data
– Assuming data comes all from disk
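A rough break-even calculation using the per-disk throughput figures from the previous slide (the 10 GB table is a made-up example, and caching is ignored):

```python
# Break-even fraction between FTS and index scan, per disk.
fts_mb_s = 200.0     # full scan throughput (figure from the slides)
idx_mb_s = 1.5       # index scan throughput (figure from the slides)
table_mb = 10_000    # hypothetical 10 GB table

fts_seconds = table_mb / fts_mb_s            # FTS always reads everything

def idx_seconds(fraction):
    return (table_mb * fraction) / idx_mb_s  # index reads only the needed fraction

break_even = idx_mb_s / fts_mb_s             # fraction where both take equal time
print(f"{break_even:.2%}")                   # 0.75%: close to the ~1% rule of thumb
print(fts_seconds)                           # 50.0 s for the full scan
print(round(idx_seconds(0.01), 1))           # 66.7 s for the index at 1% of the data
```

Below ~0.75% of the data the index wins, above it FTS wins: hence the ~1% rule of thumb, valid only when everything comes from disk.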
9. What’s the challenge then?
• If < 1% use the index, otherwise FTS? Easy, right?
• NO!!!!
• Buffer Cache complicates things a lot
– It saves a large % of disk reads
• Especially for index blocks, touched often
– Buffer Cache is transitory in nature
• No guarantee block X will be there when needed
– Complex for CBO to consider caching
• Algorithms assume each read is a physical one (kind of)
10. Demo table
create table t_fts as
select * from dba_objects;
insert into t_fts select * from t_fts;
/
<<a few times>>
exec dbms_stats.gather_table_stats(user,'T_FTS');
TABLE_NAME NUM_ROWS BLOCKS PAR
----------- ---------- ---------- ---
T_FTS 827878 15327 NO
11. FTS CBO costing
• Amount of work is xxx_TABLES.BLOCKS
• How much can be read at once?
– Adjusted by how much longer mread takes vs sread
– Considering block size and IOTFRSPEED
• db_file_multiblock_read_count
– If set, then obeyed, if not set then use 8
– If MBRC from System Stats, then obeyed
• Total Cost is dominated by IO cost
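The MBRC decision order in the bullets above can be captured in a tiny sketch (a simplified, hypothetical helper, not an Oracle API; real parameter precedence has more nuances):

```python
# Simplified sketch of which MBRC value the CBO uses for costing.
def optimizer_mbrc(dbfmbrc=None, system_stats_mbrc=None):
    if dbfmbrc is not None:            # db_file_multiblock_read_count explicitly set
        return dbfmbrc
    if system_stats_mbrc is not None:  # MBRC gathered in System Stats
        return system_stats_mbrc
    return 8                           # default used by the optimizer otherwise

print(optimizer_mbrc())          # 8
print(optimizer_mbrc(128))       # 128
print(optimizer_mbrc(None, 16))  # 16
```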
12. When is it a good idea to use FTS?
• No hardcoded threshold on % of data selected
• Main parameters to consider
– Table size & scan “capacity”
– Indexes available and their “quality”
– Caching not considered (*)
• CBO costs it, selects when cheaper
• Costing formula simple and solid
– Enhanced to support In-Memory (not Exadata)
13. FTS CBO costing – demo table
No system stats (IOSEEKTIME 10ms and IOTFRSPEED 4096)
No db_file_multiblock_read_count set
Using 8 as MBRC
Table Stats::
Table: T_FTS #Rows: 827878 #Blks: 15327
Access Path: TableScan
Cost: 4198.59
Cost_io: 4153.00 Cost_cpu: 465137851
Costing Formula = 1 + (#Blocks/MBRC * mread/sread)
Doing the math: 1 + (15327/8 * (10 + 8*8/4)/12) ≈ 4152
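The same arithmetic as a sketch (8 KB block size and the noworkload defaults above are the assumptions):

```python
# Reproduce the noworkload FTS IO costing shown above.
BLOCK_KB = 8          # assumed 8 KB block size
IOSEEKTIME = 10       # ms, noworkload default
# IOTFRSPEED is 4096 bytes/ms, i.e. 4 KB/ms, hence the "/ 4" below

def fts_io_cost(blocks, mbrc=8):
    sread = IOSEEKTIME + BLOCK_KB / 4         # single-block read: 10 + 8/4 = 12 ms
    mread = IOSEEKTIME + mbrc * BLOCK_KB / 4  # multiblock read:  10 + 16  = 26 ms
    return 1 + (blocks / mbrc) * (mread / sread)

print(round(fts_io_cost(15327)))  # 4152, matching the Cost_io above
```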
14. FTS execution
• Oracle tries to maximize IO Size
• Usually translates into a 1MB read size
• db_file_multiblock_read_count
– If set, then obeyed
– Usually a bad idea
• Direct path read vs db file scattered read
– Decision NOT made by the CBO
– Can increase pressure on the IO subsystem
16. How did the myth start?
• Many problems lead to FTS when not a good idea
– FTS is the symptom / consequence of the issue, not cause
– FTS gets blamed and steals attention from the root cause
• Some examples that affect CBO decision:
– Lack of necessary index
• With 1-2% of data, an index is usually faster thanks to the Buffer Cache
– Poor quality stats, for example lack of histogram
• CBO unable to determine that specific values are very selective
17. And if the CBO was right?
• Legit FTS can still underperform
• Often not due to the Oracle database itself
• Storage underperforming
– Unable to provide data fast enough, small pipe
– High latency due to high load
• File system / ASM misunderstanding
– Optimizations causing unstable (and puzzling) results
– Caching making TEST vs PROD comparison incorrect
– File System vs ASM comparison assuming they work the same
18. Use FTS the right way
• DBAs have generally little control outside Oracle
– Make sure FTS works at its best “from the DB”
• Provide proper stats to CBO
– In order to make FTS selected when needed
– No need to actively discourage FTS
– Usually no need to gather System Stats (*)
• Make sure IO size is maximized
• Let’s play Trivia!
22. Trivia Summary
• Make sure FTS can go full speed on paper
– Proper Extent Size
– No db_file_multiblock_read_count
– Helping buffered vs unbuffered decision, if needed
• Be familiar with common causes of slowdowns
– Chained / Migrated rows
– Heavy access to UNDO when doing CR
23. Mixed workloads
• FTS works well with (relatively) large scans
• Common in analytics, less in OLTP
– Not necessarily a bad idea, just more uncommon
• "Classic OLTP" behaviors negatively affect FTS
– Heavy concurrency, many reads from UNDO
– Warm buffer cache, "fill the gaps" reads
• FTS is more popular for batch-like SQLs
• Direct path reads help a bit, at the expense of storage
25. Real-life case when FTS was desired
Index scan is chosen by cost, looking into stats
Formula = blevel + (ix_sel * leaf_blocks) + (ix_sel_with_filters * cluf)
select table_name, num_rows, blocks, avg_row_len from dba_tables
TABLE_NAME NUM_ROWS BLOCKS AVG_ROW_LEN
-------------------------- ---------- ---------- -----------
... 1547082 202847 149
select index_name, clustering_factor, leaf_blocks from dba_indexes
INDEX_NAME CLUSTERING_FACTOR LEAF_BLOCKS
-------------- ----------------- -----------
..._IDX1 1508791 19508
..._IDX2 774974 5579
..._IDX3 13345 3241 <--
..._PK 16462 3097
..._UK1 1662405 7707
..._IDX4 26671 2814
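For illustration only: blevel and the two selectivities are not shown above, so the values below are assumptions; only leaf_blocks and clustering_factor come from the ..._IDX3 row:

```python
import math

# Index scan costing formula from the slide, with hypothetical inputs.
blevel = 2                # assumed index height (not in the stats above)
ix_sel = 0.8              # assumed index selectivity (not in the stats above)
ix_sel_with_filters = 0.8 # assumed, same as ix_sel here
leaf_blocks = 3241        # from dba_indexes above (..._IDX3)
cluf = 13345              # clustering_factor from dba_indexes above (..._IDX3)

cost = blevel + math.ceil(ix_sel * leaf_blocks) + math.ceil(ix_sel_with_filters * cluf)
print(cost)  # 13271
```

With a high selectivity the clustering_factor term dominates, which is why the ~13k index cost mentioned on the next slide barely moves when the table shrinks.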
26. Real-life case when FTS was desired
• Shrinking table size dropped to 20k blocks
• FTS cost dropped from 60k to 6k
– Index scan cost stayed ~the same, 13k
• Plan switched from IRS to FTS
• Elapsed time dropped to 7s (from 60s)
– SQL performed 20x fewer consistent reads (mostly thanks to the shrink)
• Lesson learned? FTS can be VERY efficient
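The ~10x cost drop tracks the block count. A sketch using the same noworkload costing as the earlier demo (MBRC 8 and the 12/26 ms read times are assumptions; the real system may have had different stats):

```python
# Noworkload FTS IO cost, as in the earlier costing demo.
def fts_io_cost(blocks, mbrc=8, sread=12.0, mread=26.0):
    return 1 + (blocks / mbrc) * (mread / sread)

print(round(fts_io_cost(202847)))  # 54939: in the region of the ~60k cost before the shrink
print(round(fts_io_cost(20000)))   # 5418: close to the ~6k cost after the shrink
```

Once the FTS cost fell below the ~13k index cost, the plan switched to FTS on its own.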
27. FTS improvements
• FTS made more efficient by engineering
• Goal is to avoid reading parts of the segment
– Reduce IO and improve response time
– Can be row-based or column-based (columnar)
• Skip region with no data of interest
– Partition pruning, based on object definition
– Zonemaps, based on data location
– HCC, based on columns of interest
28. Where is Oracle going?
• New products built on power of FTS
• Big shift compared to old mentality
• Exadata
– Needs FTS with direct path read to offload processing
– Skip regions thanks to Storage Indexes (enhanced in 12.2)
• In-memory
– No index defined on IMCU, only FTS
– Skip regions thanks to IM Storage Indexes
29. Where is industry going?
• Data is growing exponentially
• Lots of interest in Hadoop / BigData
– There is no index in Hadoop
– Every scan is a FTS with pruning
• Storage is faster now, more powerful scans
– NVMe provides GB/s per card
– Powerful scans moving processing to data
30. Summary
• FTS is a very efficient way of scanning data
• CBO determines when to use it
– Suboptimal uses have other root causes
• Few things negatively affect FTS
– Need to know them to alleviate effect
• Large scans are getting more popular
– As software improves to handle more data
– As hardware improves to scan more and faster
(*) assuming there is no bottleneck in the config, for example storage oversaturated or not enough "pipe" in between
More verbose explanation at https://vimeo.com/160371916 at minute 4:30
The internal parameters are:
_db_file_optimizer_read_count
_db_file_exec_read_count
(*) at least not well, there is a param to make the CBO “more aware” of caching but not very used
Scan capacity is the brute force capacity of reading data
Index quality is meant as how efficient (for example low CLUF and high selectivity) that index would be
Important to notice only a few reads are consecutive; important to make the extent size large enough to maximize the potential of the scan
Also the read size increased, confirming SYSTEM extent allocation was used
(*) larger topic, but many clients don’t use System Stats anyway
Warm buffer cache, buffered reads, aka filling up the gaps
Direct path reads can help here, assuming the storage can handle that
Extent Size or MBRC, can’t tell from the wait events
Real case, client complaining SQL is taking too long
Symptoms are: SQL takes 60 secs, used to take 10 secs
Focusing on where the time is going we find out 90% is spent on accessing a single table
Looking into the stats we noticed the filter at step 51 was selecting ~99% of the data
From the stats the CLUF is really low, about 15x smaller than #blocks, which is strange: it is normally at least close to #blocks
Also avg row len is 150, so that #blocks sounds too high
Checking space most of the blocks are almost empty
unformatted_blocks : 2416      unformatted_bytes : 19791872
fs1_blocks         : 3         fs1_bytes         : 24576
fs2_blocks         : 2         fs2_bytes         : 16384
fs3_blocks         : 3         fs3_bytes         : 24576
fs4_blocks         : 177881    fs4_bytes         : 1457201152
full_blocks        : 22530     full_bytes        : 184565760
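The "#blocks sounds too high" observation can be sanity-checked with quick arithmetic (8 KB blocks and ~90% usable space per block are assumptions):

```python
# How many blocks should this table actually need?
num_rows = 1547082    # from dba_tables above
avg_row_len = 149     # bytes, from dba_tables above
block_size = 8192     # assumed 8 KB blocks
usable = 0.9          # assumed ~90% of each block usable (PCTFREE, overhead)

rows_per_block = (block_size * usable) // avg_row_len   # 49 rows fit per block
expected_blocks = num_rows / rows_per_block
print(round(expected_blocks))  # 31573 expected vs 202847 actual: mostly empty blocks
```

The gap matches the space report: ~178k blocks are in the mostly-empty fs4 bucket, which is exactly what the shrink reclaimed.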