SlideShare une entreprise Scribd logo
1  sur  51
Télécharger pour lire hors ligne
Linked Lists In PerlWhat?Why bother?Steven Lembark  Workhorse Computinglembark@wrkhors.com©2009,2013 Steven Lembark
What About Arrays?● Perl arrays are fast and flexible.● Autovivication makes them generally easy to use.● Built-in Perl functions for managing them.● But there are a few limitations:● Memory manglement.● Iterating multiple lists.● Maintaining state within the lists.● Difficult to manage links within the list.
Memory Issues● Pushing a single value onto an array candouble its size.● Copying array contents is particularly expensivefor long-lived processes which have problemswith heap fragmentation.● Heavily forked processes also run into problemswith copy-on-write due to whole-array copies.● There is no way to reduce the array structuresize once it grows -- also painful for long-livedprocesses.
Comparing Multiple Lists● for( @array ) will iterate one list at a time.● Indexes work but have their own problems:● Single index requires identical offsets in all of thearrays.● Varying offsets require indexes for each array,with separate bookeeping for each.● Passing all of the arrays and objects becomes a lotof work, even if they are wrapped on objects –which have their own performance overhead.● Indexes used for external references have to beupdated with every shift, push, pop...
Linked Lists: The Other Guys● Perly code for linked lists is simple, simplifiesmemory management, and list iteration.● Trade indexed access for skip-chains andexternal references.● List operators like push, pop, and splice aresimple.● You also get more granular locking in threadedapplications.
Examples● Insertion sorts are stable, but splicing an iteminto the middle of an array is expensive;adding a new node to a linked list is cheap.● Threading can require locking an entire arrayto update it; linked lists can safely be lockedby node and frequently dont even requirelocking.● Comparing lists of nodes can be simpler thandealing with multiple arrays – especially if theoffsets change.
Implementing Perly Linked Lists● Welcome back to arrayrefs.● Arrays can hold any type of data.● Use a “next” ref and arbitrary data.● The list itself requires a static “head” and anode variable to walk down the list.● Doubly-linked lists are also manageable withthe addition of weak links.● For the sake of time Ill only discuss singly-linked lists here.
List Structure ● Singly-linked listsare usually drawnas some datafollowed by apointer to the nextlink.● In Perl it helps todraw the pointerfirst, because thatis where it isstored.
Adding a Node● New nodes do notneed to becontiguous inmemory.● Also doesnt requirelocking the entirelist.
Dropping A Node● Dropping a node releasesits memory – at last backto Perl.● Only real effect is on theprior nodes next ref.● This is the only piece thatneeds to be locked forthreading.
Perl Code● A link followed by data looks like:$node = [ $next, @data ]● Walking the list copies the node:( $node, my @data ) = @$node;●Adding a new node recycles the “next” ref:$node->[0] = [ $node->[0], @data ]●Removing recycles the nexts next:($node->[0], my @data) = @{$node->[0]};
A Reverse-Order List● Just update the head nodes next reference.● Fast because it moves the minimum of data.my $list = [ [] ];my $node = $list->[0];for my $val ( @_ ){$node->[0] = [ $node->[0], $val ]}# list is populated w/ empty tail.
In-order List● Just move the node, looks like a push.● Could be a one-liner, Ive shown it here as twooperations.my $list = [ [] ];my $node = $list->[0];for my $val ( @_ ){@$node = ( [], $val );$node = $node->[0];}
Viewing The List● Structure is recursive from Perls point of view.● Uses the one-line version (golf anyone)?DB<1> $list = [ [], head node ];DB<2> $node = $list->[0];DB<3> for( 1 .. 5 ) { ($node) = @$node = ( [], “node-$_” ) }DB<14> x $list0 ARRAY(0x8390608) $list0 ARRAY(0x83ee698) $list->[0]0 ARRAY(0x8411f88) $list->[0][0]0 ARRAY(0x83907c8) $list->[0][0][0]0 ARRAY(0x83f9a10) $list->[0][0][0][0]0 ARRAY(0x83f9a20) $list->[0][0][0][0][0]0 ARRAY(0x83f9a50) $list->[0][0][0][0][0][0]empty array empty tail node1 node-5 $list->[0][0][0][0][0][1]1 node-4 $list->[0][0][0][0][1]1 node-3 $list->[0][0][0][1]1 node-2 $list->[0][0][1]1 node-1 $list->[0][1]1 head node $list->[1]
Destroying A Linked List● Prior to 5.12, Perls memory de-allocator isrecursive.● Without a DESTROY the lists blow up after 100nodes when perl blows is stack.● The fix was an iterative destructor:● This is no longer required.DESTROY{my $list = shift;$list = $list->[0] while $list;}
Simple Linked List Class● Bless an arrayref with the head (placeholder)node and any data for tracking the list.sub new{my $proto = shift;bless [ [], @_ ], blessed $proto || $proto}# iterative < 5.12, else no-op.DESTROY {}
Building the list: unshift● One reason for the head node: it provides aplace to insert the data nodes after.● The new first node has the old first nodes“next” ref and the new data.sub unshift{my $list = shift;$list->[0] = [ $list->[0], @_ ];$list}
Taking one off: shift● This starts directly from the head node also:just replace the head nodes next with the firstnodes next.sub shift{my $list = shift;( $list->[0], my @data )= @{ $list >[0] };‑wantarray ? @data : @data}
Push Is A Little Harder● One approach is an unshift before the tail.● Another is populating the tail node:sub push{my $list = shift;my $node = $list->[0];$node = $node->[0] while $node->[0];# populate the empty tail node@{ $node } = [ [], @_ ];$list}
The Bane Of Single Links: pop● You need the node-before-the-tail to pop thetail.● By the time youve found the tail it is too lateto pop it off.● Storing the node-before-tail takes extrabookeeping.● The trick is to use two nodes, one trailing theother: when the small roast is burned the bigone is just right.
sub node_pop{my $list = shift;my $prior = $list->head;my $node = $prior->[0];while( $node->[0] ){$prior = $node;$node = $node->[0];}( $prior->[0], my @data ) = @$node;wantarray ? @data : @data}● Lexical $prior is more efficient than examining$node->[0][0] at multiple points in the loop.
Mixing OO & Procedural Code● Most of what follows could be done entirelywith method calls.● Catch: They are slower than directly accessingthe nodes.● One advantage to singly-linked lists is that thestructure is simple.● Mixing procedural and OO code without gettingtangled is easy enough.● This is why I use it for genetics code: the code isfast and simple.
Walking A List● By itself the node has enough state to tracklocation – no separate index required.● Putting the link first allows advancing andextraction of data in one access of the node:my $node = $list->[0];while( $node ){( $node, my %info ) = @$node;# process %info...}
Comparing Multiple Lists● Same code, just more assignments:my $n0 = $list0->[0];my $n1 = $list1->[0];while( $n0 && $n1 ){( $n0, my @data0 ) = @$n0;( $n1, my @data1 ) = @$n1;# deal with @data1, @data2...}
Syncopated Lists● Adjusting the offsets requires minimumbookkeeping, doesnt affect the parent list.while( @$n0, @$n1 ){$_ = $_->[0] for $n0, $n1;aligned $n0, $n1or ( $n0, $n1 ) = realign $n0, $n1or last;$score += compare $n0, $n1;}
Using The Head Node● $head->[0] is the first node, there are a fewuseful things to add into @head[1...].● Tracking the length or keeping a ref to the tail tailsimplifys push, pop; requires extra bookkeeping.● The head node can also store user-supplieddata describing the list.● I use this for tracking length and species names inresults of DNA sequences.
Close, but no cigar...● The class shown only works at the head.● Be nice to insert things in the middle withoutresorting to $node variables.● Or call methods on the internal nodes.● A really useful class would use inside-out datato track the head, for example.● Cant assign $list = $list->[0], however.● Looses the inside-out data.● We need a structure that walks the list withoutmodifying its own refaddr.
The fix: ref-to-arrayref● Scalar refs are the ultimate container struct.● They can reference anything, in this case an array.● $list stays in one place, $$list walks up the list.● “head” or “next” modify $$list to repositionthe location.● Saves blessing every node on the list.● Simplifies having a separate class for nodes.● Also helps when resorting to procedural code forspeed.
Basics Dont Change Muchsub new{my $proto = shift;my $head = [ [], @_ ];my $list = $head;$headz{ refaddr $list } = $head;bless $list, blessed $proto || $proto}DESTROY # add iteration for < v5.12.{my $list = shift;delete $headz{ refaddr $list };}● List updates assign to $$list.● DESTROY cleans up inside-out data.
Walking The Listsub head{my $list = shift$$list = $headz{ refaddr $list };$list}sub next{my $list = shift;( $$list, my @data ) = @$$listor returnwantarray ? @data : @data}$list->head;while( my @data = $list->next ) { ... }
Reverse-order revisited:● Unshift isnt much different.● Note that $list is not updated.sub unshift{my $list = shift;my $head = $list->head;$head->[0] = [ $head->[0], @_ ];$list}my $list = List::Class->new( ... );$list->unshift( $_ ) for @data;
Useful shortcut: add_after● Unlike arrays, adding into the middle of a list isefficient and common.● “push” adds to the tail, need something else.● add_after() puts a node after the current one.● unshift() is really “$list->head->add_after”.● Use with “next_node” that ignores the data.● In-order addition (head, middle, or end):$list->add_after( $_ )->next for @data;● Helps if next() avoids walking off the list.
sub next{my $list = shift;my $next = $$list->[0] or return;@$next or return;$$list = $next;$list}sub add_after{my $list = shift;my $node = $$list;$node->[0] = [ $node->[0], @_ ]$list}
Off-by-one gotchas● The head node does not have any user data.● Common mistake: $list->head->data.● This gets you the lists data, not users.● Fix is to pre-increment the list:$list->head->next or last;while( @data = $list->next ) {...}
Overloading a list: bool, offset.● while( $list ), if( $list ) would be nice.● Basically, this leaves the list “true” if it hasdata; false if it is at the tail node.use overloadq{bool} =>sub{my $list = shift;$$list},
Offsets would be nice alsoq{++} => sub{my $list = shift;my $node = $$list;@$node and $$list= $node->[0];$list},q{+} => sub{my ( $list, $offset ) = $_[2] ? ...my $node = $$list;for ( 1 .. $offset ){@$node && $node = $node->[0]or last;}$node},
Updating the list becomes trivial● An offset from the list is a node.● That leaves += simply assigning $list + $off.q{+=} =>sub{my ( $list, $offset ) = …$$list = $list + $offset;$listh};
Backdoor: node operations● Be nice to extract a node without having tocreep around inside of the object.● Handing back the node ref saves derivedclasses from having to unwrap the object.● Also save having to update the list objectslocation to peek into the next or head node.sub curr_node { ${ $_[0] } }sub next_node { ${ $_[0] }->[0] }sub root_node { $headz{ refaddr $_[0] } }sub head_node { $headz{ refaddr $_[0] }->[0] }
Skip chains: bookmarks for lists● Separate list of interesting nodes.● Alphabetical sort would have skip-chain of firstletters or prefixes.● In-list might have ref to next “interesting” node.● Placeholders simplify bookkeeping.● For alphabetic, pre-load A .. Z into the list.● Saves updating the skip chain for inserts prior tothe currently referenced node.
Applying Perly Linked Lists● I use them in the W-curve code for comparingDNA sequences.● The comparison has to deal with different sizes,local gaps between the curves.● Comparison requires fast lookahead for the nextinteresting node on the list.● Nodes and skip chains do the job nicely.● List structure allows efficient node updateswithout disturbing the object.
W-curve is derived from LL::S● Nodes have three spatial values and a skip-chaininitialized after the list is initialized.sub initialize{my ( $wc, $dna ) = @$_;my $pt = [ 0, 0, 0 ];$wc->head->truncate;while( my $a = substr $dna, 0, 1,  ){$pt = $wc->next_point( $a, $pt );$wc->add_after( @$pt,  )->next;}$wc}
Skip-chain looks for “peaks”● The alignment algorithm looks for nearbypoints ignoring their Z value.● Comparing the sparse list of radii > 0.50 speedsup the alignment.● Skip-chains for each node point to the next nodewith a large-enough radius.● Building the skip chain uses an “inch worm”.● The head walks up to the next useful node.● The tail fills ones between with a node refrence.
Skip Chain:“interesting”nodes.sub add_skip{my $list = shift;my $node = $list->head_node;my $skip = $node->[0];for( 1 .. $list->size ){$skip->[1] > $cutoff or next;while( $node != $skip ){$node->[4] = $skip; # replace “”$node = $node->[0]; # next node}}continue{$skip = $skip->[0];}}
Use nodes to compare lists.● DNA sequences can become offset due to gapson either sequence.● This prevents using a single index to comparelists stored as arrays.● A linked list can be re-aligned passing only thenodes.● Comparison can be re-started with only the nodes.● Makes for a useful mix of OO and proceduralcode.
● Re-aligningthe listssimplyrequiresassigning thelocal values.● Updating thenode varsdoes notaffect the listlocations ifcomparefails.sub compare_lists{...while( @$node0 && @$node1 ){( $node0, $node1 )= realign_nodes $node0, $node1or last;( $dist, $node0, $node1 )= compare_aligned $node0, $node1;$score += $dist;}if( defined $score ){$list0->node( $node0 );$list1->node( $node1 );}# caller gets back unused portion.( $score, $list0, $list1 )}
sub compare_aligned{my ( $node0, $node1 ) = @_;my $sum = 0;my $dist = 0;while( @$node0 && @$node1 ){$dist = distance $node0, $node1// last;$sum += $dist;$_ = $_->[0] for $node0, $node1;}( $sum, $node0, $node1 )}● Comparealignedhands backthe unusedportion.● Caller getsback thenodes to re-align ifthere is agap.
Other Uses for Linked Lists● Convenient trees.● Each level of tree is a list.● The data is a list of children.● Balancing trees only updates a couple of next refs.● Arrays work but get expensive.● Need to copy entire sub-lists for modification.
Two-dimensional lists● If the data at each node is a list you get two-dimensional lists.● Four-way linked lists are the guts ofspreadsheets.● Inserting a column or row does not require re-allocating the existing list.● Deleting a row or column returns data to the heap.● A multiply-linked array allows immediate jumpsto neighboring cells.● Three-dimensional lists add “sheets”.
Work Queues● Pushing pairs of nodes onto an array makes forsimple queued analysis of the lists.● Adding to the list doesnt invalidate the queue.● Circular lists last node points back to the head.● Used for queues where one thread inserts newlinks, others remove them for processing.● Minimal locking reduces overhead.● New tasks get inserted after the last one.● Worker tasks just keep walking the list fromwhere they last slept.
Construct a circular linked list:DB<1> $list = [];DB<2> @$list = ( $list, head node );DB<3> $node = $list;DB<4> for( 1 .. 5 ) { $node->[0] = [ $node->[0], "node $_" ] }DB<5> x $list0 ARRAY(0xe87d00)0 ARRAY(0xe8a3a8)0 ARRAY(0xc79758)0 ARRAY(0xe888e0)0 ARRAY(0xea31b0)0 ARRAY(0xea31c8)0 ARRAY(0xe87d00)-> REUSED_ADDRESS1 node 11 node 21 node 31 node 41 node 51 head node● No end, use $node != $list as sentinel value.● weaken( $list->[0] ) if list goes out of scope.
Summary● Linked lists can be quite lazy for a variety ofuses in Perl.● Singly-linked lists are simple to implement,efficient for “walking” the list.● Tradeoff for random access:● Memory allocation for large, varying lists.● Simpler comparison of multiple lists.● Skip-chains.● Reduced locking in threads.

Contenu connexe

Tendances

Logic synthesis with synopsys design compiler
Logic synthesis with synopsys design compilerLogic synthesis with synopsys design compiler
Logic synthesis with synopsys design compiler
naeemtayyab
 
System On Chip (SOC)
System On Chip (SOC)System On Chip (SOC)
System On Chip (SOC)
Shivam Gupta
 
crosstalk minimisation using vlsi
crosstalk minimisation using vlsicrosstalk minimisation using vlsi
crosstalk minimisation using vlsi
subhradeep mitra
 
SCSI Protocol
SCSI ProtocolSCSI Protocol
SCSI Protocol
Rakesh T
 

Tendances (20)

Windows Crash Dump Analysis
Windows Crash Dump AnalysisWindows Crash Dump Analysis
Windows Crash Dump Analysis
 
Asic design flow
Asic design flowAsic design flow
Asic design flow
 
Pci express technology 3.0
Pci express technology 3.0Pci express technology 3.0
Pci express technology 3.0
 
Logic synthesis with synopsys design compiler
Logic synthesis with synopsys design compilerLogic synthesis with synopsys design compiler
Logic synthesis with synopsys design compiler
 
SOC design
SOC design SOC design
SOC design
 
Cryptographic Algorithms: DES and RSA
Cryptographic Algorithms: DES and RSACryptographic Algorithms: DES and RSA
Cryptographic Algorithms: DES and RSA
 
System On Chip (SOC)
System On Chip (SOC)System On Chip (SOC)
System On Chip (SOC)
 
IPv6 next generation protocol
IPv6 next generation protocolIPv6 next generation protocol
IPv6 next generation protocol
 
Direct Memory Access (DMA)-Working and Implementation
Direct Memory Access (DMA)-Working and ImplementationDirect Memory Access (DMA)-Working and Implementation
Direct Memory Access (DMA)-Working and Implementation
 
crosstalk minimisation using vlsi
crosstalk minimisation using vlsicrosstalk minimisation using vlsi
crosstalk minimisation using vlsi
 
PE File Format
PE File FormatPE File Format
PE File Format
 
system verilog
system verilogsystem verilog
system verilog
 
Assignment on different types of addressing modes
Assignment on different types of addressing modesAssignment on different types of addressing modes
Assignment on different types of addressing modes
 
PCI express
PCI expressPCI express
PCI express
 
Verilog hdl
Verilog hdlVerilog hdl
Verilog hdl
 
Introduction to System verilog
Introduction to System verilog Introduction to System verilog
Introduction to System verilog
 
SCSI Protocol
SCSI ProtocolSCSI Protocol
SCSI Protocol
 
Verilog Lecture2 thhts
Verilog Lecture2 thhtsVerilog Lecture2 thhts
Verilog Lecture2 thhts
 
Single instruction multiple data
Single instruction multiple dataSingle instruction multiple data
Single instruction multiple data
 
Injection on Steroids: Codeless code injection and 0-day techniques
Injection on Steroids: Codeless code injection and 0-day techniquesInjection on Steroids: Codeless code injection and 0-day techniques
Injection on Steroids: Codeless code injection and 0-day techniques
 

Similaire à Linked Lists With Perl: Why bother? (6)

Intro to Table-Grouping™ technology
Intro to Table-Grouping™ technologyIntro to Table-Grouping™ technology
Intro to Table-Grouping™ technology
 
From java to rails
From java to railsFrom java to rails
From java to rails
 
Heap property
Heap propertyHeap property
Heap property
 
Heap property
Heap propertyHeap property
Heap property
 
Stacks fundamentals
Stacks fundamentalsStacks fundamentals
Stacks fundamentals
 
Our Friends the Utils: A highway traveled by wheels we didn't re-invent.
Our Friends the Utils: A highway traveled by wheels we didn't re-invent. Our Friends the Utils: A highway traveled by wheels we didn't re-invent.
Our Friends the Utils: A highway traveled by wheels we didn't re-invent.
 

Plus de Workhorse Computing

Plus de Workhorse Computing (20)

Wheels we didn't re-invent: Perl's Utility Modules
Wheels we didn't re-invent: Perl's Utility ModulesWheels we didn't re-invent: Perl's Utility Modules
Wheels we didn't re-invent: Perl's Utility Modules
 
mro-every.pdf
mro-every.pdfmro-every.pdf
mro-every.pdf
 
Paranormal statistics: Counting What Doesn't Add Up
Paranormal statistics: Counting What Doesn't Add UpParanormal statistics: Counting What Doesn't Add Up
Paranormal statistics: Counting What Doesn't Add Up
 
The $path to knowledge: What little it take to unit-test Perl.
The $path to knowledge: What little it take to unit-test Perl.The $path to knowledge: What little it take to unit-test Perl.
The $path to knowledge: What little it take to unit-test Perl.
 
Unit Testing Lots of Perl
Unit Testing Lots of PerlUnit Testing Lots of Perl
Unit Testing Lots of Perl
 
Generating & Querying Calendar Tables in Posgresql
Generating & Querying Calendar Tables in PosgresqlGenerating & Querying Calendar Tables in Posgresql
Generating & Querying Calendar Tables in Posgresql
 
Hypers and Gathers and Takes! Oh my!
Hypers and Gathers and Takes! Oh my!Hypers and Gathers and Takes! Oh my!
Hypers and Gathers and Takes! Oh my!
 
BSDM with BASH: Command Interpolation
BSDM with BASH: Command InterpolationBSDM with BASH: Command Interpolation
BSDM with BASH: Command Interpolation
 
Findbin libs
Findbin libsFindbin libs
Findbin libs
 
Memory Manglement in Raku
Memory Manglement in RakuMemory Manglement in Raku
Memory Manglement in Raku
 
BASH Variables Part 1: Basic Interpolation
BASH Variables Part 1: Basic InterpolationBASH Variables Part 1: Basic Interpolation
BASH Variables Part 1: Basic Interpolation
 
Effective Benchmarks
Effective BenchmarksEffective Benchmarks
Effective Benchmarks
 
Metadata-driven Testing
Metadata-driven TestingMetadata-driven Testing
Metadata-driven Testing
 
The W-curve and its application.
The W-curve and its application.The W-curve and its application.
The W-curve and its application.
 
Keeping objects healthy with Object::Exercise.
Keeping objects healthy with Object::Exercise.Keeping objects healthy with Object::Exercise.
Keeping objects healthy with Object::Exercise.
 
Perl6 Regexen: Reduce the line noise in your code.
Perl6 Regexen: Reduce the line noise in your code.Perl6 Regexen: Reduce the line noise in your code.
Perl6 Regexen: Reduce the line noise in your code.
 
Smoking docker
Smoking dockerSmoking docker
Smoking docker
 
Getting Testy With Perl6
Getting Testy With Perl6Getting Testy With Perl6
Getting Testy With Perl6
 
Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6
Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6
Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6
 
Neatly folding-a-tree
Neatly folding-a-treeNeatly folding-a-tree
Neatly folding-a-tree
 

Dernier

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Dernier (20)

Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

Linked Lists With Perl: Why bother?

  • 1. Linked Lists In PerlWhat?Why bother?Steven Lembark  Workhorse Computinglembark@wrkhors.com©2009,2013 Steven Lembark
  • 2. What About Arrays?● Perl arrays are fast and flexible.● Autovivication makes them generally easy to use.● Built-in Perl functions for managing them.● But there are a few limitations:● Memory manglement.● Iterating multiple lists.● Maintaining state within the lists.● Difficult to manage links within the list.
  • 3. Memory Issues● Pushing a single value onto an array candouble its size.● Copying array contents is particularly expensivefor long-lived processes which have problemswith heap fragmentation.● Heavily forked processes also run into problemswith copy-on-write due to whole-array copies.● There is no way to reduce the array structuresize once it grows -- also painful for long-livedprocesses.
  • 4. Comparing Multiple Lists● for( @array ) will iterate one list at a time.● Indexes work but have their own problems:● Single index requires identical offsets in all of thearrays.● Varying offsets require indexes for each array,with separate bookeeping for each.● Passing all of the arrays and objects becomes a lotof work, even if they are wrapped on objects –which have their own performance overhead.● Indexes used for external references have to beupdated with every shift, push, pop...
  • 5. Linked Lists: The Other Guys● Perly code for linked lists is simple, simplifiesmemory management, and list iteration.● Trade indexed access for skip-chains andexternal references.● List operators like push, pop, and splice aresimple.● You also get more granular locking in threadedapplications.
  • 6. Examples● Insertion sorts are stable, but splicing an iteminto the middle of an array is expensive;adding a new node to a linked list is cheap.● Threading can require locking an entire arrayto update it; linked lists can safely be lockedby node and frequently dont even requirelocking.● Comparing lists of nodes can be simpler thandealing with multiple arrays – especially if theoffsets change.
  • 7. Implementing Perly Linked Lists● Welcome back to arrayrefs.● Arrays can hold any type of data.● Use a “next” ref and arbitrary data.● The list itself requires a static “head” and anode variable to walk down the list.● Doubly-linked lists are also manageable withthe addition of weak links.● For the sake of time Ill only discuss singly-linked lists here.
  • 8. List Structure ● Singly-linked listsare usually drawnas some datafollowed by apointer to the nextlink.● In Perl it helps todraw the pointerfirst, because thatis where it isstored.
  • 9. Adding a Node● New nodes do notneed to becontiguous inmemory.● Also doesnt requirelocking the entirelist.
  • 10. Dropping A Node● Dropping a node releasesits memory – at last backto Perl.● Only real effect is on theprior nodes next ref.● This is the only piece thatneeds to be locked forthreading.
  • 11. Perl Code● A link followed by data looks like:$node = [ $next, @data ]● Walking the list copies the node:( $node, my @data ) = @$node;●Adding a new node recycles the “next” ref:$node->[0] = [ $node->[0], @data ]●Removing recycles the nexts next:($node->[0], my @data) = @{$node->[0]};
  • 12. A Reverse-Order List● Just update the head nodes next reference.● Fast because it moves the minimum of data.my $list = [ [] ];my $node = $list->[0];for my $val ( @_ ){$node->[0] = [ $node->[0], $val ]}# list is populated w/ empty tail.
  • 13. In-order List● Just move the node, looks like a push.● Could be a one-liner, Ive shown it here as twooperations.my $list = [ [] ];my $node = $list->[0];for my $val ( @_ ){@$node = ( [], $val );$node = $node->[0];}
  • 14. Viewing The List● Structure is recursive from Perls point of view.● Uses the one-line version (golf anyone)?DB<1> $list = [ [], head node ];DB<2> $node = $list->[0];DB<3> for( 1 .. 5 ) { ($node) = @$node = ( [], “node-$_” ) }DB<14> x $list0 ARRAY(0x8390608) $list0 ARRAY(0x83ee698) $list->[0]0 ARRAY(0x8411f88) $list->[0][0]0 ARRAY(0x83907c8) $list->[0][0][0]0 ARRAY(0x83f9a10) $list->[0][0][0][0]0 ARRAY(0x83f9a20) $list->[0][0][0][0][0]0 ARRAY(0x83f9a50) $list->[0][0][0][0][0][0]empty array empty tail node1 node-5 $list->[0][0][0][0][0][1]1 node-4 $list->[0][0][0][0][1]1 node-3 $list->[0][0][0][1]1 node-2 $list->[0][0][1]1 node-1 $list->[0][1]1 head node $list->[1]
  • 15. Destroying A Linked List● Prior to 5.12, Perls memory de-allocator isrecursive.● Without a DESTROY the lists blow up after 100nodes when perl blows is stack.● The fix was an iterative destructor:● This is no longer required.DESTROY{my $list = shift;$list = $list->[0] while $list;}
  • 16. Simple Linked List Class● Bless an arrayref with the head (placeholder)node and any data for tracking the list.sub new{my $proto = shift;bless [ [], @_ ], blessed $proto || $proto}# iterative < 5.12, else no-op.DESTROY {}
  • 17. Building the list: unshift● One reason for the head node: it provides aplace to insert the data nodes after.● The new first node has the old first nodes“next” ref and the new data.sub unshift{my $list = shift;$list->[0] = [ $list->[0], @_ ];$list}
  • 18. Taking one off: shift● This starts directly from the head node also:just replace the head nodes next with the firstnodes next.sub shift{my $list = shift;( $list->[0], my @data )= @{ $list >[0] };‑wantarray ? @data : @data}
  • 19. Push Is A Little Harder● One approach is an unshift before the tail.● Another is populating the tail node:sub push{my $list = shift;my $node = $list->[0];$node = $node->[0] while $node->[0];# populate the empty tail node@{ $node } = [ [], @_ ];$list}
  • 20. The Bane Of Single Links: pop● You need the node-before-the-tail to pop thetail.● By the time youve found the tail it is too lateto pop it off.● Storing the node-before-tail takes extrabookeeping.● The trick is to use two nodes, one trailing theother: when the small roast is burned the bigone is just right.
  • 21. sub node_pop{my $list = shift;my $prior = $list->head;my $node = $prior->[0];while( $node->[0] ){$prior = $node;$node = $node->[0];}( $prior->[0], my @data ) = @$node;wantarray ? @data : @data}● Lexical $prior is more efficient than examining$node->[0][0] at multiple points in the loop.
  • 22. Mixing OO & Procedural Code● Most of what follows could be done entirelywith method calls.● Catch: They are slower than directly accessingthe nodes.● One advantage to singly-linked lists is that thestructure is simple.● Mixing procedural and OO code without gettingtangled is easy enough.● This is why I use it for genetics code: the code isfast and simple.
  • 23. Walking A List● By itself the node has enough state to tracklocation – no separate index required.● Putting the link first allows advancing andextraction of data in one access of the node:my $node = $list->[0];while( $node ){( $node, my %info ) = @$node;# process %info...}
  • 24. Comparing Multiple Lists● Same code, just more assignments:my $n0 = $list0->[0];my $n1 = $list1->[0];while( $n0 && $n1 ){( $n0, my @data0 ) = @$n0;( $n1, my @data1 ) = @$n1;# deal with @data1, @data2...}
  • 25. Syncopated Lists● Adjusting the offsets requires minimumbookkeeping, doesnt affect the parent list.while( @$n0, @$n1 ){$_ = $_->[0] for $n0, $n1;aligned $n0, $n1or ( $n0, $n1 ) = realign $n0, $n1or last;$score += compare $n0, $n1;}
  • 26. Using The Head Node● $head->[0] is the first node, there are a fewuseful things to add into @head[1...].● Tracking the length or keeping a ref to the tail tailsimplifys push, pop; requires extra bookkeeping.● The head node can also store user-supplieddata describing the list.● I use this for tracking length and species names inresults of DNA sequences.
  • 27. Close, but no cigar...● The class shown only works at the head.● Be nice to insert things in the middle withoutresorting to $node variables.● Or call methods on the internal nodes.● A really useful class would use inside-out datato track the head, for example.● Cant assign $list = $list->[0], however.● Looses the inside-out data.● We need a structure that walks the list withoutmodifying its own refaddr.
  • 28. The fix: ref-to-arrayref● Scalar refs are the ultimate container struct.● They can reference anything, in this case an array.● $list stays in one place, $$list walks up the list.● “head” or “next” modify $$list to repositionthe location.● Saves blessing every node on the list.● Simplifies having a separate class for nodes.● Also helps when resorting to procedural code forspeed.
  • 29. Basics Dont Change Muchsub new{my $proto = shift;my $head = [ [], @_ ];my $list = $head;$headz{ refaddr $list } = $head;bless $list, blessed $proto || $proto}DESTROY # add iteration for < v5.12.{my $list = shift;delete $headz{ refaddr $list };}● List updates assign to $$list.● DESTROY cleans up inside-out data.
  • 30. Walking The Listsub head{my $list = shift$$list = $headz{ refaddr $list };$list}sub next{my $list = shift;( $$list, my @data ) = @$$listor returnwantarray ? @data : @data}$list->head;while( my @data = $list->next ) { ... }
  • 31. Reverse-order revisited:● Unshift isnt much different.● Note that $list is not updated.sub unshift{my $list = shift;my $head = $list->head;$head->[0] = [ $head->[0], @_ ];$list}my $list = List::Class->new( ... );$list->unshift( $_ ) for @data;
  • 32. Useful shortcut: add_after● Unlike arrays, adding into the middle of a list isefficient and common.● “push” adds to the tail, need something else.● add_after() puts a node after the current one.● unshift() is really “$list->head->add_after”.● Use with “next_node” that ignores the data.● In-order addition (head, middle, or end):$list->add_after( $_ )->next for @data;● Helps if next() avoids walking off the list.
  • 33. sub next{my $list = shift;my $next = $$list->[0] or return;@$next or return;$$list = $next;$list}sub add_after{my $list = shift;my $node = $$list;$node->[0] = [ $node->[0], @_ ]$list}
  • 34. Off-by-one gotchas● The head node does not have any user data.● Common mistake: $list->head->data.● This gets you the lists data, not users.● Fix is to pre-increment the list:$list->head->next or last;while( @data = $list->next ) {...}
  • 35. Overloading a list: bool, offset.● while( $list ), if( $list ) would be nice.● Basically, this leaves the list “true” if it hasdata; false if it is at the tail node.use overloadq{bool} =>sub{my $list = shift;$$list},
  • 36. Offsets would be nice alsoq{++} => sub{my $list = shift;my $node = $$list;@$node and $$list= $node->[0];$list},q{+} => sub{my ( $list, $offset ) = $_[2] ? ...my $node = $$list;for ( 1 .. $offset ){@$node && $node = $node->[0]or last;}$node},
  • 37. Updating the list becomes trivial● An offset from the list is a node.● That leaves += simply assigning $list + $off.q{+=} =>sub{my ( $list, $offset ) = …$$list = $list + $offset;$listh};
  • 38. Backdoor: node operations● Be nice to extract a node without having tocreep around inside of the object.● Handing back the node ref saves derivedclasses from having to unwrap the object.● Also save having to update the list objectslocation to peek into the next or head node.sub curr_node { ${ $_[0] } }sub next_node { ${ $_[0] }->[0] }sub root_node { $headz{ refaddr $_[0] } }sub head_node { $headz{ refaddr $_[0] }->[0] }
  • 39. Skip chains: bookmarks for lists● Separate list of interesting nodes.● Alphabetical sort would have skip-chain of firstletters or prefixes.● In-list might have ref to next “interesting” node.● Placeholders simplify bookkeeping.● For alphabetic, pre-load A .. Z into the list.● Saves updating the skip chain for inserts prior tothe currently referenced node.
  • 40. Applying Perly Linked Lists● I use them in the W-curve code for comparingDNA sequences.● The comparison has to deal with different sizes,local gaps between the curves.● Comparison requires fast lookahead for the nextinteresting node on the list.● Nodes and skip chains do the job nicely.● List structure allows efficient node updateswithout disturbing the object.
  • 41. W-curve is derived from LL::S● Nodes have three spatial values and a skip-chaininitialized after the list is initialized.sub initialize{my ( $wc, $dna ) = @$_;my $pt = [ 0, 0, 0 ];$wc->head->truncate;while( my $a = substr $dna, 0, 1, ){$pt = $wc->next_point( $a, $pt );$wc->add_after( @$pt, )->next;}$wc}
  • 42. Skip-chain looks for “peaks”● The alignment algorithm looks for nearbypoints ignoring their Z value.● Comparing the sparse list of radii > 0.50 speedsup the alignment.● Skip-chains for each node point to the next nodewith a large-enough radius.● Building the skip chain uses an “inch worm”.● The head walks up to the next useful node.● The tail fills ones between with a node refrence.
  • 43. Skip Chain:“interesting”nodes.sub add_skip{my $list = shift;my $node = $list->head_node;my $skip = $node->[0];for( 1 .. $list->size ){$skip->[1] > $cutoff or next;while( $node != $skip ){$node->[4] = $skip; # replace “”$node = $node->[0]; # next node}}continue{$skip = $skip->[0];}}
  • 44. Use nodes to compare lists.● DNA sequences can become offset due to gapson either sequence.● This prevents using a single index to comparelists stored as arrays.● A linked list can be re-aligned passing only thenodes.● Comparison can be re-started with only the nodes.● Makes for a useful mix of OO and proceduralcode.
  • 45. ● Re-aligningthe listssimplyrequiresassigning thelocal values.● Updating thenode varsdoes notaffect the listlocations ifcomparefails.sub compare_lists{...while( @$node0 && @$node1 ){( $node0, $node1 )= realign_nodes $node0, $node1or last;( $dist, $node0, $node1 )= compare_aligned $node0, $node1;$score += $dist;}if( defined $score ){$list0->node( $node0 );$list1->node( $node1 );}# caller gets back unused portion.( $score, $list0, $list1 )}
  • 46. sub compare_aligned{my ( $node0, $node1 ) = @_;my $sum = 0;my $dist = 0;while( @$node0 && @$node1 ){$dist = distance $node0, $node1// last;$sum += $dist;$_ = $_->[0] for $node0, $node1;}( $sum, $node0, $node1 )}● Comparealignedhands backthe unusedportion.● Caller getsback thenodes to re-align ifthere is agap.
  • 47. Other Uses for Linked Lists● Convenient trees.● Each level of tree is a list.● The data is a list of children.● Balancing trees only updates a couple of next refs.● Arrays work but get expensive.● Need to copy entire sub-lists for modification.
  • 48. Two-dimensional lists● If the data at each node is a list you get two-dimensional lists.● Four-way linked lists are the guts ofspreadsheets.● Inserting a column or row does not require re-allocating the existing list.● Deleting a row or column returns data to the heap.● A multiply-linked array allows immediate jumpsto neighboring cells.● Three-dimensional lists add “sheets”.
  • 49. Work Queues● Pushing pairs of nodes onto an array makes forsimple queued analysis of the lists.● Adding to the list doesnt invalidate the queue.● Circular lists last node points back to the head.● Used for queues where one thread inserts newlinks, others remove them for processing.● Minimal locking reduces overhead.● New tasks get inserted after the last one.● Worker tasks just keep walking the list fromwhere they last slept.
  • 50. Construct a circular linked list:DB<1> $list = [];DB<2> @$list = ( $list, head node );DB<3> $node = $list;DB<4> for( 1 .. 5 ) { $node->[0] = [ $node->[0], "node $_" ] }DB<5> x $list0 ARRAY(0xe87d00)0 ARRAY(0xe8a3a8)0 ARRAY(0xc79758)0 ARRAY(0xe888e0)0 ARRAY(0xea31b0)0 ARRAY(0xea31c8)0 ARRAY(0xe87d00)-> REUSED_ADDRESS1 node 11 node 21 node 31 node 41 node 51 head node● No end, use $node != $list as sentinel value.● weaken( $list->[0] ) if list goes out of scope.
  • 51. Summary● Linked lists can be quite lazy for a variety ofuses in Perl.● Singly-linked lists are simple to implement,efficient for “walking” the list.● Tradeoff for random access:● Memory allocation for large, varying lists.● Simpler comparison of multiple lists.● Skip-chains.● Reduced locking in threads.