Record Manipulation & Indexing
• records/fields
• index placement; index management
• manipulating fixed-length record files
• re-using space in fixed-length files
• varying-length records [VLR]: additions, deletions, modifications
• free lists for VLR: placement strategies (first, best, worst)
• varying-length record maintenance

© Katrin Becker
All Rights Reserved

Records and Indexing

14-Sep-03 1
Records in General
A record is:
• An identifiable, describable data set
• Often contains a sub-structure
• Typically part of a larger structure
This definition also works for: files; fields;
…
Records and Fields
FILE SYSTEM containing files
FILE containing records
RECORD containing fields
FIELD containing elements

Record Manipulation
• Operations on Records:
– Searches
– Additions
– Deletions
– Modifications

Record Manipulation - Search
Sequential Search
• While NOT done:
– Position file pointer
– Read record
– Examine record to see if it’s the one
• Yes DONE
• No CONTINUE
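The loop above can be sketched in Python. This is only an illustration under assumptions that are not in the slides: records are 16 bytes long, the key is the first 4 bytes, and an in-memory `io.BytesIO` stands in for the file. The names `make_record` and `sequential_search` are hypothetical.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length in bytes
KEY_SIZE = 4      # assumed: the key occupies the first 4 bytes


def make_record(key: str, data: str) -> bytes:
    """Pack a key and its data into one space-padded, fixed-length record."""
    text = (key.ljust(KEY_SIZE) + data).ljust(RECORD_SIZE)
    return text.encode("ascii")[:RECORD_SIZE]


def sequential_search(f, wanted_key: str):
    """Return the relative record number of the match, or None."""
    rrn = 0
    while True:
        f.seek(rrn * RECORD_SIZE)        # position file pointer
        record = f.read(RECORD_SIZE)     # read record
        if len(record) < RECORD_SIZE:    # ran off the end: not found
            return None
        if record[:KEY_SIZE].decode("ascii").strip() == wanted_key:
            return rrn                   # yes: done
        rrn += 1                         # no: continue


records_file = io.BytesIO(b"".join(
    make_record(k, d) for k, d in
    [("K07", "gamma"), ("K02", "alpha"), ("K05", "beta")]))
found_at = sequential_search(records_file, "K05")
missing = sequential_search(records_file, "K99")
```

In the worst case every record is read once, which is why the later slides look for ways to avoid touching the record file during a search.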
Other Searches
• What changes?
– Binary search:
• We position the file pointer in a different
fashion (the rest is the same)

– Search with an index
• We apply the search to the index and retrieve
the record only when located in the index
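A hedged sketch of both variants follows. It assumes 16-byte records with a 4-byte key, a record file that is sorted by key for the binary search, and an index held as two parallel in-memory lists for the indexed search. None of these choices come from the slides, and the function names are made up for the example.

```python
import io
from bisect import bisect_left

RECORD_SIZE = 16  # assumed fixed record length
KEY_SIZE = 4      # assumed key width


def make_record(key: str, data: str) -> bytes:
    text = (key.ljust(KEY_SIZE) + data).ljust(RECORD_SIZE)
    return text.encode("ascii")[:RECORD_SIZE]


def read_key(f, rrn: int) -> str:
    f.seek(rrn * RECORD_SIZE)
    return f.read(KEY_SIZE).decode("ascii").strip()


def binary_search(f, record_count: int, wanted_key: str):
    """Only the positioning differs from a sequential search."""
    low, high = 0, record_count - 1
    while low <= high:
        mid = (low + high) // 2
        key = read_key(f, mid)
        if key == wanted_key:
            return mid
        if key < wanted_key:
            low = mid + 1
        else:
            high = mid - 1
    return None


def index_search(index_keys, index_rrns, wanted_key: str):
    """Search the sorted index; the record file is not touched here."""
    pos = bisect_left(index_keys, wanted_key)
    if pos < len(index_keys) and index_keys[pos] == wanted_key:
        return index_rrns[pos]
    return None


sorted_file = io.BytesIO(b"".join(
    make_record(k, d) for k, d in
    [("K02", "alpha"), ("K05", "beta"), ("K07", "gamma")]))
hit = binary_search(sorted_file, 3, "K07")
miss = binary_search(sorted_file, 3, "K01")

# Index for a file stored in arrival order K07, K02, K05.
index_keys = ["K02", "K05", "K07"]
index_rrns = [1, 2, 0]
via_index = index_search(index_keys, index_rrns, "K05")
```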

Record Manipulation - Addition
New record gets added to the end.

• Insertion into the middle of the file is impractical.
• If there is an index, then we also perform
an addition to the index (adding to the
end of the index is infeasible. Why?).
Addition with an Index
[Diagram: INDEX alongside RECORDS]
1. New record gets added to the end.
2. Locate place where index entry needs to go.
3. Insert new index entry (it’s a record too).

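The three steps of adding a record when there is an index might look like the sketch below. It assumes 16-byte records, a 4-byte key, and an index kept in memory as a sorted list of `(key, rrn)` tuples. These are illustrative choices, not something the slides prescribe.

```python
import io
from bisect import bisect_left

RECORD_SIZE = 16  # assumed fixed record length
KEY_SIZE = 4      # assumed key width


def make_record(key: str, data: str) -> bytes:
    text = (key.ljust(KEY_SIZE) + data).ljust(RECORD_SIZE)
    return text.encode("ascii")[:RECORD_SIZE]


def add_record(f, index, key: str, data: str) -> int:
    """index is a list of (key, rrn) tuples kept sorted by key."""
    f.seek(0, io.SEEK_END)
    rrn = f.tell() // RECORD_SIZE
    f.write(make_record(key, data))          # 1. record goes at the end
    pos = bisect_left(index, (key, rrn))     # 2. locate place in the index
    index.insert(pos, (key, rrn))            # 3. insert the new index entry
    return rrn


records_file = io.BytesIO()
index = []
for key, data in [("K07", "gamma"), ("K02", "alpha"), ("K05", "beta")]:
    add_record(records_file, index, key, data)
```

The record file stays in arrival order while only the small index entries are shifted to keep the index sorted.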
Records vs. Index: Assertions & Questions
• Moving file records is more expensive
than moving index records.
• Should the index be IN the record file or in its
own file? (How do we maintain it?)
• If IN the file: should it be at the beginning,
end, middle, or distributed?
• What if we are able to hold the index in
memory?
• What if we can’t?
Record Manipulation - Deletion
• Locate record (Search)
• Mark space as deleted
• Remove index entry? (why or why not)

Deletion with an index
[Diagram: INDEX alongside RECORDS]
1. Locate index entry
2. Locate record
3. Delete (mark) record
4. Delete (mark?) index entry

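One possible reading of the four deletion steps is sketched below. The tombstone byte `*`, the 16-byte record size, and the decision to remove the index entry outright (rather than mark it) are all assumptions made for the example. The slides leave that last choice open.

```python
import io
from bisect import bisect_left

RECORD_SIZE = 16       # assumed fixed record length
DELETED_MARK = b"*"    # assumed tombstone written over the first byte


def delete_record(f, index, key: str) -> bool:
    """index is a sorted list of (key, rrn) tuples. Returns True if deleted."""
    pos = bisect_left(index, (key, -1))      # 1. locate index entry
    if pos == len(index) or index[pos][0] != key:
        return False
    rrn = index[pos][1]                      # 2. locate record
    f.seek(rrn * RECORD_SIZE)
    f.write(DELETED_MARK)                    # 3. mark the record as deleted
    del index[pos]                           # 4. here: remove the index entry
    return True


raw = [b"K07 gamma", b"K02 alpha", b"K05 beta"]
records_file = io.BytesIO(b"".join(r.ljust(RECORD_SIZE) for r in raw))
index = [("K02", 1), ("K05", 2), ("K07", 0)]
deleted = delete_record(records_file, index, "K02")
not_found = delete_record(records_file, index, "K99")
```

Note that the file keeps its length: the slot is only marked, which is what leads to the growth discussed in the following slides.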
Record Manipulation - Modification
• Locate record
• Read record
• Modify record
• Re-write record (assuming fixed-size
records – what if the record is now a
different size? [see later])

File Behaviour
1. start: record count = 9
2. add record: record count = 10
3. add record: record count = 11
4. delete: record count = 10
5. delete: record count = 9
6. add: record count = 10
7. add: record count = 11
8. add: record count = 12
9. delete: record count = 11
10. delete: record count = 10
And so on…

What’s happening to the file?
• File grows – does not shrink (we get
fragmentation)
• We end up covering more ground to do the
same job
Q: If we are doing random access, why does it matter?

• The file system has less space to use (the
fragmentation is internal from the
perspective of the file system).
• Worst case = EVERY record access ends
up costing us a seek.
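The growth can be made concrete with a small simulation of the add/delete sequence from the File Behaviour slides. It assumes additions always go at the end and deletions only mark a slot. The function name `simulate` is made up for this sketch.

```python
def simulate(start_count: int, operations):
    """Return (live records, slots occupied in the file) after the operations."""
    live = start_count     # records still in use
    slots = start_count    # slots the file occupies; this never shrinks
    for op in operations:
        if op == "add":
            live += 1
            slots += 1     # new record is appended at the end
        elif op == "delete":
            live -= 1      # slot is only marked, the file keeps its size
    return live, slots


ops = ["add", "add", "delete", "delete",
       "add", "add", "add", "delete", "delete"]
live, slots = simulate(9, ops)
wasted = slots - live
```

Under these assumptions the file ends up holding 14 slots for 10 live records, so 4 slots are dead space that the file system cannot reuse.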
Re-Using Space in the File [FLR]
• When there is a deletion, locate the
last record in the file and move it to the
free slot
– Costs:
• Additional file access to locate (where will we
remember where the last record is?) and
retrieve the last record.
• Records will lose locality faster than if we
simply mark the slot. (Why do we care?)
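A minimal sketch of this approach, assuming 16-byte records and that the caller tracks the record count (one possible answer to "where will we remember where the last record is?"). The function name is hypothetical.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length


def delete_by_moving_last(f, record_count: int, rrn: int) -> int:
    """Fill slot rrn with the last record, shrink the file, return new count."""
    last = record_count - 1
    if rrn != last:
        f.seek(last * RECORD_SIZE)
        last_record = f.read(RECORD_SIZE)    # extra access: fetch last record
        f.seek(rrn * RECORD_SIZE)
        f.write(last_record)                 # move it into the free slot
    f.truncate(last * RECORD_SIZE)           # file shrinks by one slot
    return last


a, b, c = (name.ljust(RECORD_SIZE) for name in (b"A", b"B", b"C"))
records_file = io.BytesIO(a + b + c)
count = delete_by_moving_last(records_file, 3, 0)
```

If an index exists, the entry for the moved record would also have to be updated to its new location, which is a further cost of this scheme.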

Re-Using Space – Way 2
• Make a list of places where records
have been deleted.
• When doing addition, check for empty
‘slot’ before placing new record at end.
Q: What about the index?

• When doing deletion, add location of
deleted record to ‘free-list’

What does the Free-List look like?
[Diagram: INDEX alongside RECORDS]
All we need is the location.
Order is unimportant.
How to decide which ‘slot’ to re-use?
• In FLR every slot will fit a new record.
• We can just take the first one – the Free-List
can then be maintained as a stack (which is easy).
• Do we keep Free-List information in
the file?
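A stack-based Free-List for fixed-length records could be sketched as follows. The class name `FixedLengthFile`, the 16-byte record size, and keeping the Free-List in memory rather than in the file are assumptions for illustration only.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length


class FixedLengthFile:
    """Hypothetical FLR file whose Free-List is a stack of slot numbers."""

    def __init__(self):
        self.f = io.BytesIO()
        self.free_slots = []                 # stack of deleted slots (RRNs)

    def add(self, record: bytes) -> int:
        if self.free_slots:
            rrn = self.free_slots.pop()      # any slot fits: take the top one
        else:
            self.f.seek(0, io.SEEK_END)      # no free slot: place at end
            rrn = self.f.tell() // RECORD_SIZE
        self.f.seek(rrn * RECORD_SIZE)
        self.f.write(record.ljust(RECORD_SIZE)[:RECORD_SIZE])
        return rrn

    def delete(self, rrn: int) -> None:
        self.free_slots.append(rrn)          # only the location is needed


store = FixedLengthFile()
first = [store.add(r) for r in (b"a", b"b", b"c")]
store.delete(1)
store.delete(0)
reused = [store.add(b"d"), store.add(b"e"), store.add(b"g")]
```

The most recently freed slot is reused first, and the file only grows once the stack is empty.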

Indexing – What is it?
• Table-of-contents for a file (directory)
• Uses keys
• Byte Offset (BO) vs Relative Record
Number (RRN)
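For fixed-length records the two addressing schemes are interchangeable, as this small sketch shows. It assumes 16-byte records and no file header. The function names are illustrative.

```python
RECORD_SIZE = 16  # assumed fixed record length, no file header


def rrn_to_byte_offset(rrn: int) -> int:
    """Byte offset of the record with the given relative record number."""
    return rrn * RECORD_SIZE


def byte_offset_to_rrn(byte_offset: int) -> int:
    """Inverse mapping; only valid on a record boundary."""
    rrn, remainder = divmod(byte_offset, RECORD_SIZE)
    if remainder:
        raise ValueError("byte offset is not on a record boundary")
    return rrn


offset_of_fourth = rrn_to_byte_offset(3)
back_again = byte_offset_to_rrn(offset_of_fourth)
```

With varying-length records there is no such formula, so an index has to store byte offsets directly.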

Primary Key Properties:
• Unique
• Canonical
• Data-less
• Unchanging

Indexing – How does it Look?
• Must have:
– Key
– Way to locate record

• It is itself a structure containing
‘records’ (each index entry is a
record)
• It may be separate from the main
data or in the same file.
• It may be copied into memory for
manipulation and only updated
infrequently; or the file copy may be
maintained as well.
Indexing – File Ops?
• Tied to records:
– If records added – new/update index entry
– If record deleted – ‘delete’ index entry
– If record modified – maybe no change to
index; maybe update BO [byte offset]

Fixed-length vs Varying Length
• VLR provides greater flexibility.
• VLR increases maintenance overhead.
• VLR decreases wasted space. *
• VLR makes index virtually essential.
• VLR complicates Free-List
maintenance.

* may simply waste space in a different place or a different way.
VLR Index
[Diagram: INDEX alongside RECORDS]
• Requires:
– Key
– Byte offset
– Record size? [optional]
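A sketch of such an index, under the assumption that it is a Python dict mapping each key to a `(byte offset, size)` pair and that records are packed back to back with no delimiters. Storing the size is what allows a record to be read without a length prefix. The helper names are hypothetical.

```python
import io


def append_vlr(f, index, key: str, payload: bytes) -> int:
    """Append a varying-length record and note where it went."""
    f.seek(0, io.SEEK_END)
    offset = f.tell()
    f.write(payload)
    index[key] = (offset, len(payload))      # key -> (byte offset, size)
    return offset


def fetch(f, index, key: str) -> bytes:
    """Use the index entry to read exactly one record."""
    offset, size = index[key]
    f.seek(offset)
    return f.read(size)


records_file = io.BytesIO()
index = {}
append_vlr(records_file, index, "K07", b"gamma ray")
append_vlr(records_file, index, "K02", b"alpha")
append_vlr(records_file, index, "K05", b"beta-carotene")
record = fetch(records_file, index, "K05")
```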
VLR Search Operation
[Diagram: INDEX alongside RECORDS]
• Same as for FLR:
1. Locate key in index
2. Locate record in file
• Binary search still possible
on index, but NOT on
records alone.
VLR Deletion Operation
[Diagram: INDEX, Free-List and RECORDS]
1. Locate key
2. Locate record
3. Delete record
4. Free-List:
• Remember location of ‘slot’
• Remember size of slot.
5. Mark index entry

VLR Addition Operation – 1
[Diagram: INDEX, Free-List, RECORDS and the New Record]
1. Search Free-List
– Too big for first place
– Too big for second place
– Too big for third place
– Place at end of file

VLR Addition Operation – 2
[Diagram: INDEX, Free-List, RECORDS and the New Record]
Search Free-List
• Fits in first place… BUT…
• We will end up with left-over unused (and probably unusable) space.
We call this “First-Fit” (because we are using the first slot that we find that fits).
• If instead we keep looking… we find the second entry is a better fit…
• The third slot does not fit, so…
• We decide to use the second slot. It is the Best-Fit.
1. Insert record.
2. Delete Free-List entry.
3. Update Index
Notice the index entry is sorted differently.
What’s the advantage to leaving ‘spaces’ in the index?

VLR Modification Operation - 1
• 2 kinds:
– 1. Mod results in record remaining same
size
– 2. Mod results in record growing or
shrinking.

VLR Modification Operation - 2
• Mod results in record remaining same
size
– Same as for FLR

VLR Modification Operation - 3
• Mod results in record growing or
shrinking.
– Treat Mod as a deletion followed by an
addition.
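Both kinds of modification can be sketched with a small in-memory store. Everything here is an assumption for illustration: the class name `VLRFile`, a dict index of `(byte offset, size)` pairs, a Free-List of `(byte offset, size)` pairs, and first-fit placement. Space left over when a record is smaller than the slot it reuses is simply lost in this sketch.

```python
import io


class VLRFile:
    """Hypothetical varying-length record store with index and Free-List."""

    def __init__(self):
        self.f = io.BytesIO()
        self.index = {}        # key -> (byte offset, size)
        self.free_list = []    # (byte offset, size) of deleted slots

    def add(self, key: str, payload: bytes) -> None:
        for i, (offset, size) in enumerate(self.free_list):
            if size >= len(payload):         # first slot that fits
                del self.free_list[i]
                break
        else:
            self.f.seek(0, io.SEEK_END)      # nothing fits: place at end
            offset = self.f.tell()
        self.f.seek(offset)
        self.f.write(payload)
        self.index[key] = (offset, len(payload))

    def delete(self, key: str) -> None:
        offset, size = self.index.pop(key)
        self.free_list.append((offset, size))  # remember location and size

    def modify(self, key: str, payload: bytes) -> None:
        offset, size = self.index[key]
        if len(payload) == size:
            self.f.seek(offset)              # same size: rewrite in place
            self.f.write(payload)
        else:
            self.delete(key)                 # size changed: deletion...
            self.add(key, payload)           # ...followed by an addition

    def read(self, key: str) -> bytes:
        offset, size = self.index[key]
        self.f.seek(offset)
        return self.f.read(size)


store = VLRFile()
store.add("K02", b"alpha")
store.add("K05", b"beta")
store.modify("K02", b"ALPHA")                # same size
store.modify("K05", b"beta-carotene")        # grows, so it moves to the end
store.add("K07", b"pi")                      # reuses the slot K05 left behind
```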

Free-Lists
• May want to keep Free-List sorted.
• If the List is short it may not matter.
• Placement Strategies:
– First Fit
– Best Fit
– Worst Fit

• It could be its own list or we could make the
regular index serve double-duty.
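The three placement strategies can be compared side by side. The sketch assumes an unsorted Free-List of `(byte offset, size)` pairs and functions that return the position of the chosen entry, or `None` when the record has to go at the end of the file. The names and the sample numbers are made up for the example.

```python
def first_fit(free_list, needed: int):
    """Position of the first slot that is large enough."""
    for i, (_offset, size) in enumerate(free_list):
        if size >= needed:
            return i
    return None


def best_fit(free_list, needed: int):
    """Position of the smallest slot that is still large enough."""
    candidates = [(size, i) for i, (_offset, size) in enumerate(free_list)
                  if size >= needed]
    return min(candidates)[1] if candidates else None


def worst_fit(free_list, needed: int):
    """Position of the largest slot, leaving the biggest left-over piece."""
    candidates = [(size, i) for i, (_offset, size) in enumerate(free_list)
                  if size >= needed]
    return max(candidates)[1] if candidates else None


# Three free slots: 40, 25 and 10 bytes. The new record needs 20 bytes.
free_list = [(100, 40), (300, 25), (500, 10)]
choices = (first_fit(free_list, 20),
           best_fit(free_list, 20),
           worst_fit(free_list, 20))
too_big = (first_fit(free_list, 50),
           best_fit(free_list, 50),
           worst_fit(free_list, 50))
```

First fit can stop at the first match, while best fit and worst fit must examine the whole list unless the Free-List is kept sorted by size, which is one reason sorting it may be worthwhile.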

Summary
• Managing space inside the file is our
business.
• We must choose:
– FLR / VLR?
– Index? (what kind?)
– Secondary indices?
– Re-claim free space? How?

 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

CS: Introduction to Record Manipulation & Indexing

  • 1. Record Manipulation & Indexing
    • records/fields
    • index placement; index management
    • manipulating fixed-length record files
    • re-using space in fixed-length files
    • varying length records [VLR]: adds; dels; mods
    • free lists for VLR: placement strategies (first, best, worst)
    • varying length record maintenance
    © Katrin Becker, All Rights Reserved. Records and Indexing, 14-Sep-03.
  • 2. Records in General
    A record is:
    • An identifiable, describable data set
    • Often contains a sub-structure
    • Typically part of a larger structure
    This definition also works for: files; fields; …
  • 3. Records and Fields
    • FILE SYSTEM containing files
    • FILE containing records
    • RECORD containing fields
    • FIELD containing elements
  • 4. Record Manipulation
    • Operations on Records:
      – Searches
      – Additions
      – Deletions
      – Modifications
  • 5. Record Manipulation - Search
    Sequential Search
    • While NOT done:
      – Position file pointer
      – Read record
      – Examine record to see if it’s the one
        • Yes: DONE
        • No: CONTINUE
  • 6. Other Searches
    • What changes?
      – Binary search:
        • We position the file pointer in a different fashion (the rest is the same)
      – Search with an index:
        • We apply the search to the index and retrieve the record only when located in the index
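The two search loops on slides 5 and 6 can be sketched as follows. This is a minimal illustration, not the deck's own code: the 16-byte record size, the 4-byte key at the front of each record, and the helper names are all assumptions made for the example. The only difference between the two functions is how the file pointer is positioned.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length in bytes
KEY_SIZE = 4      # assumed: the key occupies the first 4 bytes of a record


def make_record(key: str, data: str) -> bytes:
    """Pack a key and payload into one fixed-length record (hypothetical layout)."""
    return key.encode().ljust(KEY_SIZE) + data.encode().ljust(RECORD_SIZE - KEY_SIZE)


def sequential_search(f, key: str):
    """Return the relative record number (RRN) of the match, or None."""
    target = key.encode().ljust(KEY_SIZE)
    rrn = 0
    while True:
        f.seek(rrn * RECORD_SIZE)          # position file pointer
        record = f.read(RECORD_SIZE)       # read record
        if len(record) < RECORD_SIZE:      # ran off the end of the file
            return None
        if record[:KEY_SIZE] == target:    # examine record
            return rrn
        rrn += 1


def binary_search(f, key: str, record_count: int):
    """Same read/examine steps, but the pointer jumps to the midpoint.

    Assumes the records are stored in ascending key order.
    """
    target = key.encode().ljust(KEY_SIZE)
    lo, hi = 0, record_count - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        f.seek(mid * RECORD_SIZE)
        found = f.read(RECORD_SIZE)[:KEY_SIZE]
        if found == target:
            return mid
        if found < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None
```

An in-memory `io.BytesIO` object stands in for the file here; any seekable binary file object would work the same way.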
  • 7. Record Manipulation – Addition
    New record gets added to the end.
    • Insertion into middle of file is impractical.
    • If there is an index, then we also perform an addition to the index (addition to the end of this list is infeasible – WHY?).
  • 8. Addition with an Index - 1 (diagram: INDEX, RECORDS)
    1. New record gets added to the end.
  • 9. Addition with an Index - 2 (diagram: INDEX, RECORDS)
    2. Locate place where index entry needs to go.
  • 10. Addition with an Index - 3 (diagram: INDEX, RECORDS)
    3. Insert new index entry (it’s a record too).
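The three addition steps on slides 8 to 10 might look like this in code. It is a sketch under stated assumptions: the index is held in memory as a sorted list of `(key, rrn)` tuples, records are 16 bytes, and `add_record` is a hypothetical name. The record goes to the end of the file, while the index entry goes wherever the sort order requires.

```python
import bisect
import io

RECORD_SIZE = 16  # assumed fixed record length in bytes


def add_record(f, index, key: str, payload: bytes) -> int:
    """Append a record and insert its index entry in key order.

    `index` is assumed to be a list of (key, rrn) tuples kept sorted by key.
    """
    # Step 1: the new record gets added to the end of the record file.
    f.seek(0, io.SEEK_END)
    rrn = f.tell() // RECORD_SIZE
    f.write(payload.ljust(RECORD_SIZE)[:RECORD_SIZE])

    # Step 2: locate the place where the index entry needs to go.
    pos = bisect.bisect_left(index, (key,))

    # Step 3: insert the new index entry (it is a small record too).
    index.insert(pos, (key, rrn))
    return rrn
```

Keeping the index sorted is what makes binary search on it possible, which is why simply appending the entry to the end of the index does not work.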
  • 11. Records vs. Index: Assertions & Questions
    • Moving file records is more expensive than moving index records.
    • Should index be IN record file or its own file? (How do we maintain it?)
    • If IN file: should it be at the beginning, end, middle, distributed?
    • What if we are able to hold the index in memory?
    • What if we can’t?
  • 12. Record Manipulation - Deletion
    • Locate record (Search)
    • Mark space as deleted
    • Remove index entry? (why or why not)
  • 13. Deletion with an index - 1 (diagram: INDEX, RECORDS)
    1. Locate index entry
  • 14. Deletion with an index - 2 (diagram: INDEX, RECORDS)
    1. Locate index entry
    2. Locate record
  • 15. Deletion with an index - 3 (diagram: INDEX, RECORDS)
    3. Delete (mark) record
  • 16. Deletion with an index - 4 (diagram: INDEX, RECORDS)
    4. Delete (mark?) index entry
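The four deletion steps on slides 13 to 16 can be sketched as below. The tombstone byte `*`, the 16-byte record size, and the in-memory sorted list of `(key, rrn)` tuples are assumptions for illustration. The slides leave open whether the index entry is removed or only marked; this sketch removes it.

```python
import bisect
import io

RECORD_SIZE = 16   # assumed fixed record length in bytes
TOMBSTONE = b"*"   # assumed marker written over the first byte of a deleted record


def delete_record(f, index, key: str) -> bool:
    """Mark the record as deleted and drop its index entry.

    `index` is assumed to be a list of (key, rrn) tuples kept sorted by key.
    Returns False when the key is not in the index.
    """
    # Step 1: locate the index entry.
    pos = bisect.bisect_left(index, (key,))
    if pos == len(index) or index[pos][0] != key:
        return False

    # Step 2: locate the record through the index entry.
    rrn = index[pos][1]

    # Step 3: delete (mark) the record in place; nothing is moved.
    f.seek(rrn * RECORD_SIZE)
    f.write(TOMBSTONE)

    # Step 4: delete the index entry (marking it instead is the alternative).
    del index[pos]
    return True
```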
  • 17. Record Manipulation - Modification
    • Locate record
    • Read record
    • Modify record
    • Re-write record (assuming fixed-size records – what if the record is now a different size? [see later])
  • 18. File Behaviour – 1: start. Record count = 9
  • 19. File Behaviour – 2: add record. Record count = 10
  • 20. File Behaviour – 3: add record. Record count = 11
  • 21. File Behaviour – 4: delete. Record count = 10
  • 22. File Behaviour – 5: delete. Record count = 9
  • 23. File Behaviour – 6: add. Record count = 10
  • 24. File Behaviour – 7: add. Record count = 11
  • 25. File Behaviour – 8: add. Record count = 12
  • 26. File Behaviour – 9: delete. Record count = 11
  • 27. File Behaviour – 10: delete. Record count = 10. And so on…
  • 28. What’s happening to the file?
    • File grows – does not shrink (we get fragmentation)
    • We end up covering more ground to do the same job
    • Q: If we are doing random access, why does it matter?
    • The file system has less space to use (the fragmentation is internal from the perspective of the file system).
    • Worst case = EVERY record access ends up costing us a seek.
  • 29. Re-Using Space in the File [FLR]
    • When there is a deletion, locate the last record in the file and move it to the free slot
      – Costs:
        • Additional file access to locate (where will we remember where the last record is?) and retrieve last record.
        • Records will lose locality faster than if we simply mark the slot. (Why do we care?)
  • 30. Re-Using Space – Way 2
    • Make a list of places where records have been deleted.
    • When doing addition, check for empty ‘slot’ before placing new record at end. Q: What about the index?
    • When doing deletion, add location of deleted record to ‘free-list’
  • 31. What does the Free-List look like? (diagram: INDEX, RECORDS)
    All we need is the location. Order is unimportant.
  • 32. How to decide which ‘slot’ to re-use?
    • In FLR every slot will fit a new record.
    • We can just take the first one – Free-List can then be maintained as a stack (which is easy).
    • Do we keep Free-List information in the file?
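Slides 30 to 32 describe a free list for fixed-length records that behaves like a stack. A minimal sketch follows, assuming the free list lives in memory as a Python list of relative record numbers; the class name `FixedFile` and its methods are hypothetical and do not come from the deck.

```python
import io


class FixedFile:
    """Fixed-length record file with a stack-style free list (illustrative only)."""

    def __init__(self, record_size: int):
        self.record_size = record_size
        self.f = io.BytesIO()   # stands in for the record file
        self.free = []          # stack of RRNs whose slots were deleted

    def delete(self, rrn: int) -> None:
        # On deletion, remember the location of the freed slot.
        self.free.append(rrn)

    def add(self, payload: bytes) -> int:
        if self.free:
            # Every slot fits every record, so take the most recent one.
            rrn = self.free.pop()
        else:
            # No free slot: place the new record at the end of the file.
            self.f.seek(0, io.SEEK_END)
            rrn = self.f.tell() // self.record_size
        self.f.seek(rrn * self.record_size)
        self.f.write(payload.ljust(self.record_size)[: self.record_size])
        return rrn
```

Because order is unimportant, push and pop at one end are enough, and the file only grows when the stack is empty.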
  • 33. Indexing – What is it?
    • Table-of-contents for a file (directory)
    • Uses keys
    • Byte Offset (BO) vs Relative Record Number (RRN)
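For fixed-length records, the two ways of locating a record named on slide 33 are related by simple arithmetic. The sketch below assumes an optional file header of `header_size` bytes, which is an illustrative parameter rather than something the slides specify.

```python
def byte_offset(rrn: int, record_size: int, header_size: int = 0) -> int:
    """Convert a relative record number into a byte offset in the file."""
    return header_size + rrn * record_size


def rrn_of(offset: int, record_size: int, header_size: int = 0) -> int:
    """Convert a byte offset back into a relative record number."""
    return (offset - header_size) // record_size
```

This conversion only works when every record has the same size; with varying length records the index has to store the byte offset itself.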
  • 34. Primary Key Properties:
    • Unique
    • Canonical
    • Data-less
    • Unchanging
  • 35. Indexing – How does it Look? (diagram: INDEX)
    • Must have:
      – Key
      – Way to locate record
    • It is itself a structure containing ‘records’ (each index entry is a record)
    • It may be separate from the main data or in the same file.
    • It may be copied into memory for manipulation and only updated infrequently; or the file copy may be maintained as well.
  • 36. Indexing – File Ops?
    • Tied to records:
      – If record added – new/update index entry
      – If record deleted – ‘delete’ index entry
      – If record modified – maybe no change to index; maybe update BO [byte offset]
  • 37. Fixed-length vs Varying Length
    • VLR provides greater flexibility.
    • VLR increases maintenance overhead.
    • VLR decreases wasted space. *
    • VLR makes index virtually essential.
    • VLR complicates Free-List maintenance.
    * may simply waste space in a different place or a different way.
  • 38. VLR Index (diagram: INDEX, RECORDS)
    • Requires:
      – Key
      – Byte offset
      – Record size? [optional]
  • 39. VLR Search Operation (diagram: INDEX, RECORDS)
    • Same as for FLR:
      1. Locate key in index
      2. Locate record in file
    • Binary search still possible on index, but NOT on records alone.
  • 40. VLR Deletion Operation - 1 (diagram: INDEX, RECORDS)
    1. Locate key
  • 41. VLR Deletion Operation - 2 (diagram: INDEX, RECORDS)
    2. Locate record
  • 42. VLR Deletion Operation - 3 (diagram: INDEX, RECORDS)
    3. Delete record
  • 43. VLR Deletion Operation - 4 (diagram: INDEX, Free-List, RECORDS)
    4. Add the slot to the Free-List:
      • Remember location of ‘slot’
      • Remember size of slot.
  • 44. VLR Deletion Operation - 5 (diagram: INDEX, Free-List, RECORDS)
    5. Mark index entry
  • 45. VLR Addition Operation – 1a (diagram: INDEX, Free-List, New Record, RECORDS)
    1. Search Free-List
  • 46. VLR Addition Operation – 1b (diagram: INDEX, Free-List, New Record, RECORDS)
    Too big for first place
  • 47. VLR Addition Operation – 1c (diagram: INDEX, Free-List, New Record, RECORDS)
    Too big for second place
  • 48. VLR Addition Operation – 1d (diagram: INDEX, Free-List, New Record, RECORDS)
    Too big for third place
  • 49. VLR Addition Operation – 1e (diagram: INDEX, Free-List, New Record, RECORDS)
    Place at end of file
  • 50. VLR Addition Operation – 2a (diagram: INDEX, Free-List, New Record, RECORDS)
    Search Free-List
  • 51. VLR Addition Operation – 2b (diagram: INDEX, Free-List, New Record, RECORDS)
    Fits in first place… BUT…
  • 52. VLR Addition Operation – 2c (diagram: INDEX, Free-List, New Record, RECORDS)
    We will end up with left-over unused (and probably unusable) space. We call this “First-Fit” (because we are using the first slot that we find that fits).
  • 53. VLR Addition Operation – 2d (diagram: INDEX, Free-List, New Record, RECORDS)
    If instead we keep looking… we find the second entry is a better fit…
  • 54. VLR Addition Operation – 2e (diagram: INDEX, Free-List, New Record, RECORDS)
    The third slot does not fit, so…
  • 55. VLR Addition Operation – 2f (diagram: INDEX, Free-List, New Record, RECORDS)
    We decide to use the second slot. It is the Best-Fit.
  • 56. VLR Addition Operation – 2g (diagram: INDEX, Free-List, New Record, RECORDS)
    1. Insert record.
    2. Delete Free-List entry.
    3. Update Index.
    Notice the index entry is sorted differently. What’s the advantage to leaving ‘spaces’ in the index?
  • 57. VLR Modification Operation - 1
    • 2 kinds:
      – 1. Mod results in record remaining same size
      – 2. Mod results in record growing or shrinking.
  • 58. VLR Modification Operation - 2
    • Mod results in record remaining same size
      – Same as for FLR
  • 59. VLR Modification Operation - 3
    • Mod results in record growing or shrinking.
      – Treat Mod as a deletion followed by an addition.
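The two modification cases on slides 57 to 59 can be illustrated with a toy varying-length record store. Everything here is an assumption made for the sketch: the class name `VLRFile`, the in-memory `bytearray` standing in for the file, the dictionary index mapping each key to `(byte offset, size)`, the first-fit search, and the choice to return any left-over bytes of a reused slot to the free list.

```python
class VLRFile:
    """Toy varying-length record store (illustrative only)."""

    def __init__(self):
        self.data = bytearray()  # stands in for the record file
        self.index = {}          # key -> (byte offset, size)
        self.free = []           # free list of (byte offset, size)

    def add(self, key: str, payload: bytes) -> None:
        need = len(payload)
        for i, (off, size) in enumerate(self.free):   # first fit
            if size >= need:
                del self.free[i]
                if size > need:                       # keep the remainder as a smaller slot
                    self.free.append((off + need, size - need))
                break
        else:
            off = len(self.data)                      # nothing fits: place at end of file
            self.data.extend(bytes(need))
        self.data[off:off + need] = payload
        self.index[key] = (off, need)

    def delete(self, key: str) -> None:
        # Remember the location and size of the freed slot.
        self.free.append(self.index.pop(key))

    def modify(self, key: str, payload: bytes) -> None:
        off, size = self.index[key]
        if len(payload) == size:
            self.data[off:off + size] = payload       # same size: rewrite in place, as for FLR
        else:
            self.delete(key)                          # size changed: deletion...
            self.add(key, payload)                    # ...followed by an addition
```

A record that shrinks may land back in its own old slot, while one that grows usually moves, so its byte offset in the index changes.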
  • 60. Free-Lists
    • May want to keep Free-List sorted.
    • If the List is short it may not matter.
    • Placement Strategies:
      – First Fit
      – Best Fit
      – Worst Fit
    • It could be its own list or we could make the regular index serve double-duty.
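The three placement strategies can be compared side by side. In this sketch the free list is assumed to be a list of `(byte offset, size)` tuples, and each function returns the position of the chosen entry in that list, or `None` when no slot is large enough (in which case the record would go at the end of the file). The function names are illustrative.

```python
def first_fit(free_list, need: int):
    """Pick the first slot that is large enough."""
    for i, (_offset, size) in enumerate(free_list):
        if size >= need:
            return i
    return None


def best_fit(free_list, need: int):
    """Pick the smallest slot that is still large enough (least left-over space)."""
    candidates = [(size, i) for i, (_offset, size) in enumerate(free_list) if size >= need]
    return min(candidates, key=lambda c: c[0])[1] if candidates else None


def worst_fit(free_list, need: int):
    """Pick the largest slot, so the left-over piece is as reusable as possible."""
    candidates = [(size, i) for i, (_offset, size) in enumerate(free_list) if size >= need]
    return max(candidates, key=lambda c: c[0])[1] if candidates else None
```

First fit can stop at the first match, whereas best fit and worst fit scan the whole list unless it is kept sorted by size, which is one reason a sorted Free-List may be worth maintaining.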
  • 61. Summary
    • Managing space inside the file is our business.
    • We must choose:
      – FLR / VLR?
      – Index? (what kind?)
      – Secondary indices?
      – Re-claim free space? How?