Steven Lembark
Workhorse Computing
lembark@wrkhors.com
There was Spaghetti Code.
And it was bad.
So we invented Objects.
Now we have Spaghetti Objects.
Based on Lambda Calculus.
Few basic ideas:
Transparency.
Consistency.
Constant data.
Transparent transforms.
Functions require input.
Output determined fully by inputs.
Avoid internal state & side effects.
time()
random()
readline()
fetchrow_array()
Result: State matters!
Fix: Apply reality.
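A minimal sketch of one way to "apply reality", assuming the usual approach of reading state once at the edge of the program and passing it inward as data. The names stamp_impure and stamp_pure are hypothetical and do not come from the talk.

```perl
use strict;
use warnings;

# Impure: the result depends on hidden state (the system clock).
sub stamp_impure
{
    my ( $msg ) = @_;

    time() . " $msg"
}

# Pure: the caller supplies the time, so equal inputs give equal output.
sub stamp_pure
{
    my ( $epoch, $msg ) = @_;

    "$epoch $msg"
}

# State is read once, at the edge, and handed to the pure code.
my $now = time;

print stamp_pure( $now, 'upload started' ), "\n";
```

The pure version can be tested with fixed arguments; the impure one cannot.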
Used with AWS “Glacier” service.
$0.01/GiB/Month.
Large, cold data (discounts for EiB, PiB).
Uploads require lots of sha256 values.
Uploads chunked in multiples of 1 MiB.
Digest for each chunk & entire upload.
Result: tree-hash.
Image from Amazon Developer Guide (API Version 2012-06-01)
http://docs.aws.amazon.com/amazonglacier/latest/dev/checksum-calculations.html
sub calc_tree
{
    my ($self) = @_;
    my $prev_level = 0;
    while (scalar @{ $self->{tree}->[$prev_level] } > 1) {
        my $curr_level = $prev_level+1;
        $self->{tree}->[$curr_level] = [];
        my $prev_tree = $self->{tree}->[$prev_level];
        my $curr_tree = $self->{tree}->[$curr_level];
        my $len = scalar @$prev_tree;
        for (my $i = 0; $i < $len; $i += 2) {
            if ($len - $i > 1) {
                my $a = $prev_tree->[$i];
                my $b = $prev_tree->[$i+1];
                push @$curr_tree, { hash => sha256( $a->{hash}.$b->{hash} ),
                    start => $a->{start}, finish => $b->{finish}, joined => 0 };
            } else {
                push @$curr_tree, $prev_tree->[$i];
            }
        }
        $prev_level = $curr_level;
    }
}
Trees are naturally recursive.
Two-step generation:
Split the buffer.
Reduce the hashes.
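The slides show the reduce step but not the split step. A minimal sketch of the split, assuming the whole upload is already in memory as one buffer; leaf_hashes and its $size parameter are hypothetical names, and unpack is just one convenient way to cut fixed-size pieces.

```perl
use strict;
use warnings;
use Digest::SHA qw( sha256 );

# Hypothetical helper: cut the buffer into fixed-size chunks
# (the last one may be short) and digest each chunk.
sub leaf_hashes
{
    my ( $buffer, $size ) = @_;

    $size ||= 2 ** 20;      # default: 1 MiB chunks

    map { sha256( $_ ) } unpack "(a$size)*", $buffer
}

# One byte over 1 MiB yields a full chunk plus a one-byte chunk.
my @leaves = leaf_hashes( 'x' x ( 2 ** 20 + 1 ) );

print scalar( @leaves ), " leaf digests\n";
```

The list of leaf digests is what the reduce step consumes.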
Reduce pairs until one value remains.

sub reduce_hash
{
    # undef for empty list
    @_ > 1 or return $_[0];

    my $count = @_ / 2 + @_ % 2;

    reduce_hash
    map
    {
        @_ > 1
        ? sha256 splice @_, 0, 2
        : shift
    }
    ( 1 .. $count )
}
Catch: Eats Stack.
Tail recursion is common.
“Tail call elimination” recycles stack.
“Fold” is a feature of FP languages.
Reduces the stack to a scalar.
Reset the stack.
Restart the sub.

my $foo =
sub
{
    @_ > 1 or return $_[0];

    @_ = … ;

    # new in v5.16
    goto __SUB__
};
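A runnable sketch of the same reset-and-restart pattern on a toy problem: summing adjacent pairs until one number is left. The $pair_sum name and the summing step are illustrative assumptions, not part of the talk; "use v5.16" is what enables __SUB__.

```perl
use v5.16;      # enables __SUB__ (feature 'current_sub') and say
use warnings;

# Hypothetical example: fold a list by adding adjacent pairs.
my $pair_sum =
sub
{
    @_ > 1 or return $_[0];

    my @next;

    while( @_ )
    {
        push @next, @_ > 1
            ? shift( @_ ) + shift( @_ )
            : shift( @_ );
    }

    # reset the stack, then restart the sub in place
    @_ = @next;

    goto __SUB__
};

say $pair_sum->( 1 .. 5 );
```

Each pass replaces the argument list with a shorter one, so the call depth stays flat however long the input is.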
Voila! Stack shrinks.

sub reduce_hash
{
    @_ > 1 or return $_[0];

    my $count = @_ / 2 + @_ % 2;

    @_
    = map
    {
        @_ > 1
        ? sha256 splice @_, 0, 2
        : @_
    }
    ( 1 .. $count );

    goto __SUB__
}
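A self-contained check of the stack-recycling version against a hand-expanded three-leaf tree. It assumes Digest::SHA, whose sha256 joins its arguments before hashing; the leaf values and the "match" message are made up for illustration, and the sub uses parentheses and shift where the slide does not.

```perl
use v5.16;
use warnings;
use Digest::SHA qw( sha256 );

sub reduce_hash
{
    @_ > 1 or return $_[0];

    my $count = int( @_ / 2 ) + @_ % 2;

    @_
    = map
    {
        @_ > 1
        ? sha256( splice @_, 0, 2 )
        : shift @_
    }
    ( 1 .. $count );

    goto __SUB__
}

# Three leaves: the first two pair up, the third is carried forward.
my @leaves = map { sha256( $_ ) } qw( a b c );

my $expect = sha256( sha256( @leaves[ 0, 1 ] ), $leaves[2] );
my $root   = reduce_hash( @leaves );

say $root eq $expect ? 'match' : 'mismatch';
```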
"@_ =" and "goto" scare people.
See the Keyword::Declare (K::D) POD for {{{…}}} to avoid "@_".

use Keyword::Declare;

keyword tree_fold ( Ident $name, Block $new_list )
{
    qq # this is source code, not a subref!
    {
        sub $name
        {
            # escaped: qq would otherwise interpolate @_ and $_[0]
            \@_ > 1 or return \$_[0];
            \@_ = do $new_list;
            goto __SUB__
        }
    }
}
User supplies generator, a.k.a. $new_list.

tree_fold reduce_hash
{
    my $count = @_ / 2 + @_ % 2;

    map
    {
        @_ > 1
        ? sha256 splice @_, 0, 2
        : @_
    }
    ( 1 .. $count )
}
NQFP: Hacks the stack.
Replace splice with offsets.

tree_fold reduce_hash
{
    my $last = @_ / 2 + @_ % 2 - 1;

    map
    {
        $_[ $_ + 1 ]
        ? sha256 @_[ $_, $_ + 1 ]
        : $_[ $_ ]
    }
    map
    {
        2 * $_
    }
    ( 0 .. $last )
}
Still messy: @_, stacked map.
Declare fold_hash with parameters.
Caller uses lexical vars.

keyword tree_fold
(
    Ident $name,
    List  $argz,
    Block $stack_op
)
{
    ...
}
Extract lexical variables.
See also: PPI::Token

my @varz # ( '$foo', '$bar' )
= map
{
    $_->isa( 'PPI::Token::Symbol' )
    ? $_->{ content }
    : ()
}
map
{
    $_->isa( 'PPI::Statement::Expression' )
    ? @{ $_->{ children } }
    : ()
}
@{ $argz->{ children } };
Count & offset used to extract stack.

my $lexical = join ',' => @varz;
my $count   = @varz;
my $offset  = $count - 1;

sub $name
{
    @_ > 1 or return $_[0];

    my $last
    = @_ % $count
    ? int( @_ / $count )
    : int( @_ / $count ) - 1
    ;
    ...
Interpolate lexicals, count, offset, stack op.

    @_
    = map
    {
        my ( $lexical )
        = @_[ $_ .. $_ + $offset ];

        do $stack_op
    }
    map
    {
        $_ * $count
    }
    ( 0 .. $last );

    goto __SUB__
Not much body left:

tree_fold reduce_hash($left, $rite)
{
    $rite
    ? sha256 $left, $rite
    : $left
}
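For readers without Keyword::Declare installed, the same count-and-offset idea can be approximated with a closure. This is a sketch under stated assumptions, not the keyword from the talk: here tree_fold is a hypothetical plain sub that takes the number of values consumed per call ($count, which must be at least 2) and a code reference, and returns the folding sub.

```perl
use v5.16;
use warnings;
use Digest::SHA qw( sha256 );

# Hypothetical closure-based stand-in for the keyword.
# $count: values handed to $block per call (must be >= 2).
sub tree_fold
{
    my ( $count, $block ) = @_;

    return sub
    {
        @_ > 1 or return $_[0];

        # index of the last group, same result as the slide's $last
        my $last = int( ( @_ - 1 ) / $count );

        @_
        = map
        {
            my $i    = $_ * $count;
            my @args = @_[ $i .. $i + $count - 1 ];  # short group: undef fill

            $block->( @args )
        }
        ( 0 .. $last );

        goto __SUB__
    };
}

my $reduce_hash
= tree_fold( 2, sub
{
    my ( $left, $rite ) = @_;

    defined $rite
    ? sha256( $left, $rite )
    : $left
});

my @leaves = map { sha256( $_ ) } qw( a b c );

say unpack 'H*', $reduce_hash->( @leaves );
```

The caller's block sees ordinary lexicals, as with the keyword; the cost is a sub call per group, which the generated-source approach avoids.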
Explicit map, keyword with and without lexicals.
4-32 MiB are good chunk sizes.

 MiB   Explicit   Implicit   Keyword
   1       0.02       0.01      0.02
   2       0.03       0.03      0.04
   4       0.07       0.07      0.07
   8       0.14       0.13      0.10
  16       0.19       0.18      0.17
  32       0.31       0.30      0.26
  64       0.50       0.51      0.49
 128       1.00       1.02      1.01
 256       2.03       2.03      2.03
 512       4.05       4.10      4.06
1024       8.10       8.10      8.11
Don’t need Haskell or Scala.
Efficient and elegant functional code.
In Perl 5.
Don’t need Haskell or Scala.
Efficient and elegant functional code.
In Perl 6?
Doubt if even Damian could do it better.
use v6;

sub tree_hash (Str $data, Int :$chunk_size = 1024²) {
    reduce_hash
        map &sha256,
            comb / . ** {1..$chunk_size} /,
                $data
}

multi sub reduce_hash ( @nodes) { reduce_hash redigest @nodes }
multi sub reduce_hash ([$node]) { $node }

sub redigest (@list) {
    map -> $a, $b? { $b ?? sha256 $a~$b !! $a }, @list;
}

use v6;

sub tree_hash (Str $data, Int :$chunk_size = 1024²) {
    reduce_hash
        map &sha256,
            comb / . ** {1..$chunk_size} /,
                $data
}

multi sub reduce_hash ( @nodes) { samewith redigest @nodes }
multi sub reduce_hash ([$node]) { $node }

sub redigest (@list) {
    map -> $a, $b? { $b ?? sha256 $a~$b !! $a }, @list;
}

use v6;

sub tree_hash (Str $data, Int :$chunk_size = 1024²) {
    reduce_hash
        map &sha256,
            comb / . ** {1..$chunk_size} /,
                $data
}

multi sub reduce_hash (@nodes) {
    samewith map -> $a, $b? { $b ?? sha256 $a~$b !! $a }, @nodes
}
multi sub reduce_hash ([$node]) {
    $node
}

use v6;

sub tree_hash (Str $data, Int :$chunk_size = 1024²) {
    reduce_hash
        map &sha256,
            comb / . ** {1..$chunk_size} /,
                $data
}

sub reduce_hash (@nodes) {
    treefold -> $a, $b? { $b ?? sha256 $a~$b !! $a }, @nodes
}

sub treefold (&block, *@data) {
    @data > 1 ?? samewith &block, map &block, @data
              !! @data[0]
}

use v6;

sub tree_hash (Str $data, Int :$chunk_size = 1024²) {
    reduce_hash
        map &sha256,
            comb / . ** {1..$chunk_size} /,
                $data
}

sub reduce_hash (@nodes) {
    treefold { sha256 $^a~$^b }, @nodes
}

multi treefold (&block, @data) { |@data }
multi treefold (&block, @data where * >= &block.arity) {
    given @data - @data % &block.arity -> $last {
        samewith &block, [|map(&block, @data[^$last]), |@data[$last..*]]
    }
}

use v6; use Treefold;

sub tree_hash (Str $data, Int :$chunk_size = 1024²) {
    reduce_hash
        map &sha256,
            comb / . ** {1..$chunk_size} /,
                $data
}

sub reduce_hash (@nodes) {
    treefold { sha256 $^a~$^b }, @nodes
}

use v6; use Treefold;

sub tree_hash (Str $data, Int :$chunk_size = 1024²) {
    treefold { sha256 $^a~$^b },
        map &sha256,
            comb / . ** {1..$chunk_size} /,
                $data
}
Don’t need Haskell or Scala.
Efficient and elegant functional code.
In Perl 5 or Perl 6.
Easy to write (once you get the knack).
Easy to optimize (with some syntactic sugar).
Surprisingly efficient.
Give it a try.
Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6
  • 11.
      sub calc_tree
      {
          my ($self) = @_;
          my $prev_level = 0;
          while (scalar @{ $self->{tree}->[$prev_level] } > 1) {
              my $curr_level = $prev_level+1;
              $self->{tree}->[$curr_level] = [];
              my $prev_tree = $self->{tree}->[$prev_level];
              my $curr_tree = $self->{tree}->[$curr_level];
              my $len = scalar @$prev_tree;
              for (my $i = 0; $i < $len; $i += 2) {
                  if ($len - $i > 1) {
                      my $a = $prev_tree->[$i];
                      my $b = $prev_tree->[$i+1];
                      push @$curr_tree, { hash => sha256( $a->{hash}.$b->{hash} ),
                          start => $a->{start}, finish => $b->{finish}, joined => 0 };
                  } else {
                      push @$curr_tree, $prev_tree->[$i];
                  }
              }
              $prev_level = $curr_level;
          }
      }
  • 12. Trees are naturally recursive. Two-step generation: Split the buffer. Reduce the hashes.
  • 13. Reduce pairs. Until one value remains.
      sub reduce_hash
      {
          # undef for empty list
          @_ > 1 or return $_[0];
          my $count = @_ / 2 + @_ % 2;
          reduce_hash
          map
          {
              @_ > 1
              ? sha256 splice @_, 0, 2
              : shift
          }
          ( 1 .. $count )
      }
  • 14. Reduce pairs. Until one value remains. Catch: Eats Stack. (Same reduce_hash code as slide 13.)
  • 15. Tail recursion is common. “Tail call elimination” recycles stack. “Fold” is a feature of FP languages. Reduces the stack to a scalar.
  • 16. Reset the stack. Restart the sub.
      my $foo
      = sub
      {
          @_ > 1 or return $_[0];
          @_ = … ;
          # new in v5.16
          goto __SUB__
      };
  • 17. Voila! Stack shrinks.
      sub reduce_hash
      {
          @_ > 1 or return $_[0];
          my $count = @_ / 2 + @_ % 2;
          @_
          = map
          {
              @_ > 1
              ? sha256 splice @_, 0, 2
              : @_
          }
          ( 1 .. $count );
          goto __SUB__
      }
  • 18. Voila! Stack shrinks. "@_ =" and "goto" scare people. (Same reduce_hash code as slide 17.)
  • 19. See K::D POD for {{{…}}} to avoid "@_".
      use Keyword::Declare;
      keyword tree_fold ( Ident $name, Block $new_list )
      {
          qq  # this is source code, not a subref!
          {
              sub $name
              {
                  @_ > 1 or return $_[0];
                  @_ = do $new_list;
                  goto __SUB__
              }
          }
      }
  • 20. User supplies generator, a.k.a. $new_list.
      tree_fold reduce_hash
      {
          my $count = @_ / 2 + @_ % 2;
          map
          {
              @_ > 1
              ? sha256 splice @_, 0, 2
              : @_
          }
          ( 1 .. $count )
      }
  • 21. User supplies generator. NQFP: Hacks the stack. (Same tree_fold code as slide 20.)
  • 22. Replace splice with offsets.
      tree_fold reduce_hash
      {
          my $last = @_ / 2 + @_ % 2 - 1;
          map
          {
              $_[ $_ + 1 ]
              ? sha256 @_[ $_, $_ + 1 ]
              : $_[ $_ ]
          }
          map
          {
              2 * $_
          }
          ( 0 .. $last )
      }
  • 23. Replace splice with offsets. Still messy: @_, stacked map. (Same tree_fold code as slide 22.)
  • 24. Declare fold_hash with parameters. Caller uses lexical vars.
      keyword tree_fold ( Ident $name, List $argz, Block $stack_op )
      {
          ...
      }
  • 25. Extract lexical variables. See also: PPI::Token
      my @varz    # ( '$foo', '$bar' )
      = map
      {
          $_->isa( 'PPI::Token::Symbol' )
          ? $_->{ content }
          : ()
      }
      map
      {
          $_->isa( 'PPI::Statement::Expression' )
          ? @{ $_->{ children } }
          : ()
      }
      @{ $argz->{ children } };
  • 26. Count & offset used to extract stack.
      my $lexical = join ',' => @varz;
      my $count   = @varz;
      my $offset  = $count - 1;
      sub $name
      {
          @_ > 1 or return $_[0];
          my $last
          = @_ % $count
          ? int( @_ / $count )
          : int( @_ / $count ) - 1
          ;
          ...
  • 27. Interpolate lexicals, count, offset, stack op.
          @_
          = map
          {
              my ( $lexical ) = @_[ $_ .. $_ + $offset ];
              do $stack_op
          }
          map
          {
              $_ * $count
          }
          ( 0 .. $last );
          goto __SUB__
  • 28. Not much body left:
      tree_fold reduce_hash($left, $rite)
      {
          $rite
          ? sha256 $left, $rite
          : $left
      }
  • 29. Explicit map, keyword with and without lexicals. 4-32MiB are good chunk sizes.
       MiB   Explicit   Implicit   Keyword
         1       0.02       0.01      0.02
         2       0.03       0.03      0.04
         4       0.07       0.07      0.07
         8       0.14       0.13      0.10
        16       0.19       0.18      0.17
        32       0.31       0.30      0.26
        64       0.50       0.51      0.49
       128       1.00       1.02      1.01
       256       2.03       2.03      2.03
       512       4.05       4.10      4.06
      1024       8.10       8.10      8.11
  • 30. Don’t need Haskell or Scala. Efficient and elegant functional code. In Perl 5.
  • 31. Don’t need Haskell or Scala. Efficient and elegant functional code. In Perl 6?
  • 32. Don’t need Haskell or Scala. Efficient and elegant functional code. In Perl 6? Doubt if even Damian could do it better.
  • 33.
      use v6;
      sub tree_hash (Str $data, Int :$chunk_size = 1024²)
      {
          reduce_hash map &sha256, comb / . ** {1..$chunk_size} /, $data
      }
      multi sub reduce_hash ( @nodes) { reduce_hash redigest @nodes }
      multi sub reduce_hash ([$node]) { $node }
      sub redigest (@list)
      {
          map -> $a, $b? { $b ?? sha256 $a~$b !! $a }, @list;
      }
  • 34.
      use v6;
      sub tree_hash (Str $data, Int :$chunk_size = 1024²)
      {
          reduce_hash map &sha256, comb / . ** {1..$chunk_size} /, $data
      }
      multi sub reduce_hash ( @nodes) { samewith redigest @nodes }
      multi sub reduce_hash ([$node]) { $node }
      sub redigest (@list)
      {
          map -> $a, $b? { $b ?? sha256 $a~$b !! $a }, @list;
      }
  • 35.
      use v6;
      sub tree_hash (Str $data, Int :$chunk_size = 1024²)
      {
          reduce_hash map &sha256, comb / . ** {1..$chunk_size} /, $data
      }
      multi sub reduce_hash (@nodes)
      {
          samewith map -> $a, $b? { $b ?? sha256 $a~$b !! $a }, @nodes
      }
      multi sub reduce_hash ([$node]) { $node }
  • 36.
      use v6;
      sub tree_hash (Str $data, Int :$chunk_size = 1024²)
      {
          reduce_hash map &sha256, comb / . ** {1..$chunk_size} /, $data
      }
      sub reduce_hash (@nodes)
      {
          treefold -> $a, $b? { $b ?? sha256 $a~$b !! $a }, @nodes
      }
      sub treefold (&block, *@data)
      {
          @data > 1
          ?? samewith &block, map &block, @data
          !! @data[0]
      }
  • 37.
      use v6;
      sub tree_hash (Str $data, Int :$chunk_size = 1024²)
      {
          reduce_hash map &sha256, comb / . ** {1..$chunk_size} /, $data
      }
      sub reduce_hash (@nodes)
      {
          treefold { sha256 $^a~$^b }, @nodes
      }
      multi treefold (&block, @data) { |@data }
      multi treefold (&block, @data where * >= &block.arity)
      {
          given @data - @data % &block.arity -> $last {
              samewith &block, [|map(&block, @data[^$last]), |@data[$last..*]]
          }
      }
  • 38.
      use v6;
      use Treefold;
      sub tree_hash (Str $data, Int :$chunk_size = 1024²)
      {
          reduce_hash map &sha256, comb / . ** {1..$chunk_size} /, $data
      }
      sub reduce_hash (@nodes)
      {
          treefold { sha256 $^a~$^b }, @nodes
      }
  • 39.
      use v6;
      use Treefold;
      sub tree_hash (Str $data, Int :$chunk_size = 1024²)
      {
          treefold { sha256 $^a~$^b },
          map &sha256, comb / . ** {1..$chunk_size} /, $data
      }
  • 40. Don’t need Haskell or Scala. Efficient and elegant functional code. In Perl 5 or Perl 6.
  • 41. Easy to write (once you get the knack). Easy to optimize (with some syntactic sugar). Surprisingly efficient.
  • 42. Give it a try.
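
For anyone who wants to try it, here is a minimal Perl 5 sketch of the two steps from the slides: split the buffer, then reduce the hashes. It is an illustration under stated assumptions, not the author's module. `reduce_hash` follows the offset-based version from the slides, but tests the right-hand digest with `defined` and parenthesizes the `sha256` call; `chunk_hashes` and `tree_hash` are hypothetical names introduced here, as is the 1 MiB default chunk size. The result has not been checked against Glacier itself.

```perl
use v5.16;      # enables strict and __SUB__ (the current_sub feature)
use warnings;

use Digest::SHA qw( sha256 );   # core module; returns the raw 32-byte digest

# Hypothetical helper: digest each fixed-size chunk of the buffer.
# The last chunk may be shorter; an empty buffer yields an empty list.
sub chunk_hashes
{
    my ( $buffer, $size ) = @_;

    map { sha256( $_ ) } unpack "(a$size)*", $buffer
}

# Pairwise fold that restarts the sub via goto, so the call stack does
# not grow. An odd trailing digest is carried up a level unchanged.
sub reduce_hash
{
    # undef for empty list
    @_ > 1 or return $_[0];

    my $last = int( ( @_ - 1 ) / 2 );

    @_
    = map
    {
        my ( $left, $rite ) = @_[ $_, $_ + 1 ];

        defined $rite
        ? sha256( $left, $rite )    # same digest as sha256( $left . $rite )
        : $left
    }
    map { 2 * $_ } ( 0 .. $last );

    goto __SUB__;
}

# Hypothetical wrapper: 1 MiB chunks unless the caller passes a size.
sub tree_hash
{
    my ( $buffer, $size ) = @_;
    $size //= 2 ** 20;

    reduce_hash( chunk_hashes( $buffer, $size ) )
}
```

Calling `say unpack 'H*', tree_hash( $data )` prints the root digest in hex. Passing a small chunk size such as 4 makes it easy to check the pairing by hand on short strings.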