SlideShare une entreprise Scribd logo
1  sur  53
Télécharger pour lire hors ligne
Video
The nitty gritty of data serialisation
Maxim Zaks
@iceX33
_
One bit
2^1
true / false
_ _ _ _ _ _ _ _
One byte
2^8 = 256
{0..255}
{-128..0..127}
{ASCII / UTF-8}
_ _ _ _ _ _ _ _ | _ _ _ _ _ _ _ _
___ | ___
Little endian
Big endian
005 | 000
000 | 005
Two bytes
2^16 = 65,536
{0 .. 65,535}
{-32,768 .. 0 .. 32,767}
{UTF-16}
___ | ___ | ___ | ___
2^32 = 4,294,967,296
{0 .. 4,294,967,296}
{-2,147,483,648 .. 0 .. 2,147,483,647 }
{0, -0, NaN, Inf, -Inf, SinglePrecision}
{UTF-32}
{Pointer on arch-32}
___ | ___ | ___ | ___ | ___ | ___ | ___ | ___
2^64 = 1.844674407E19
{0 .. 1.844674407E19}
{-9.223372037E18 .. 0 .. 9.223372037E18}
{0, -0, NaN, Inf, -Inf, DoublePrecision}
{Pointer on arch-64}
Real numbers / IEEE 754
https://en.wikipedia.org/wiki/Single-precision_floating-
point_format
https://en.wikipedia.org/wiki/Double-precision_floating-
point_format
https://en.wikipedia.org/wiki/Half-precision_floating-
point_format
String / Text
https://en.wikipedia.org/wiki/ASCII
Unicode
• Code
• UTF-8
• UTF-16
• UTF-32
• etc...
https://en.wikipedia.org/wiki/Unicode
https://en.wikipedia.org/wiki/UTF-8
https://en.wikipedia.org/wiki/UTF-16
Human readable / text based serialisation
• CSV
• EDIFACT
• XML
• JSON
• YAML
• TOML
• Java property file
Second level encoding
Int as text
• Base 10
• Worst case 3x (* UTF-?) size
• Parsing overhead
Real number as text
• Min 3 bytes to ...
• What you see is not what you get *
• Parsing overhead
Bytes as Text
• Base 16 (2^4) - 2x
• Base 64 (2^6) - 1.3x
Form of structured data representation
• Table (row / column based)
• Tree / hierarchy
• Graph
Bit packing techniques
Variable length quantity
https://en.wikipedia.org/wiki/Variable-length_quantity
VLQ + ZigZag
Pack Multiple values in one byte
• Array of booleans
• Multiple enum values
Deduplication
• Lookup table as in GIF
• Most frequent values only
Storing Deltas
Memory alignment
• Fast/direct access
• Padding
Good Talk on Parque and Arrow
https://www.youtube.com/watch?v=dPb2ZXnt2_U
Binary serialisation
• Format specification
• Code
• Schema
• Tooling for human readable
Format specification
• Support for endianness and archs
• Bit-packed or memory aligned
• Need for temporary memory allocation
• Form of data representation (see schema)
• Evolution strategies
Code
• Code generation / library
• Supported languages
• Computational efficiency
• Memory vs. speed
Schema
• Schema based, or self described
• Language and tooling
• High level concepts in regards to data representation and
evolution
• Capabilities for documentation and meta data
Tooling for human readable
• Convert to and from a text based serialisation format
• GUI
• Data importer (e.g. DB integration)
Benefits of binary formats
• Compact size
• Efficient read and write
• Data representation
• Human readable
Environment
• Data centres consume 4% of world wide electricity
Thank you!
Maxim Zaks @iceX33
Questions?

Contenu connexe

Similaire à Nitty Gritty of Data Serialisation

Gdc03 ericson memory_optimization
Gdc03 ericson memory_optimizationGdc03 ericson memory_optimization
Gdc03 ericson memory_optimization
brettlevin
 
Memory Optimization
Memory OptimizationMemory Optimization
Memory Optimization
guest3eed30
 
Memory Optimization
Memory OptimizationMemory Optimization
Memory Optimization
Wei Lin
 
ImplementingCryptoSecurityARMCortex_Doin
ImplementingCryptoSecurityARMCortex_DoinImplementingCryptoSecurityARMCortex_Doin
ImplementingCryptoSecurityARMCortex_Doin
Jonny Doin
 

Similaire à Nitty Gritty of Data Serialisation (20)

Jvm goes big_data_sfjava
Jvm goes big_data_sfjavaJvm goes big_data_sfjava
Jvm goes big_data_sfjava
 
Gdc03 ericson memory_optimization
Gdc03 ericson memory_optimizationGdc03 ericson memory_optimization
Gdc03 ericson memory_optimization
 
MariaDB ColumnStore
MariaDB ColumnStoreMariaDB ColumnStore
MariaDB ColumnStore
 
Js2517181724
Js2517181724Js2517181724
Js2517181724
 
Js2517181724
Js2517181724Js2517181724
Js2517181724
 
2019-Db2-From_ASCII_to_UTF-8.pdf
2019-Db2-From_ASCII_to_UTF-8.pdf2019-Db2-From_ASCII_to_UTF-8.pdf
2019-Db2-From_ASCII_to_UTF-8.pdf
 
Memory Optimization
Memory OptimizationMemory Optimization
Memory Optimization
 
Memory Optimization
Memory OptimizationMemory Optimization
Memory Optimization
 
Less is More: 2X Storage Efficiency with HDFS Erasure Coding
Less is More: 2X Storage Efficiency with HDFS Erasure CodingLess is More: 2X Storage Efficiency with HDFS Erasure Coding
Less is More: 2X Storage Efficiency with HDFS Erasure Coding
 
Big Data LDN 2017: Big Data Analytics with MariaDB ColumnStore
Big Data LDN 2017: Big Data Analytics with MariaDB ColumnStoreBig Data LDN 2017: Big Data Analytics with MariaDB ColumnStore
Big Data LDN 2017: Big Data Analytics with MariaDB ColumnStore
 
SIMD Processing Using Compiler Intrinsics
SIMD Processing Using Compiler IntrinsicsSIMD Processing Using Compiler Intrinsics
SIMD Processing Using Compiler Intrinsics
 
Making (Almost) Any Database Faster and Cheaper with Caching
Making (Almost) Any Database Faster and Cheaper with CachingMaking (Almost) Any Database Faster and Cheaper with Caching
Making (Almost) Any Database Faster and Cheaper with Caching
 
Debunking the Myths of HDFS Erasure Coding Performance
Debunking the Myths of HDFS Erasure Coding Performance Debunking the Myths of HDFS Erasure Coding Performance
Debunking the Myths of HDFS Erasure Coding Performance
 
MariaDB ColumnStore
MariaDB ColumnStoreMariaDB ColumnStore
MariaDB ColumnStore
 
ImplementingCryptoSecurityARMCortex_Doin
ImplementingCryptoSecurityARMCortex_DoinImplementingCryptoSecurityARMCortex_Doin
ImplementingCryptoSecurityARMCortex_Doin
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDB
 
04 cache memory.ppt 1
04 cache memory.ppt 104 cache memory.ppt 1
04 cache memory.ppt 1
 
Cache Memory.ppt
Cache Memory.pptCache Memory.ppt
Cache Memory.ppt
 
04_Cache Memory.ppt
04_Cache Memory.ppt04_Cache Memory.ppt
04_Cache Memory.ppt
 
04_Cache Memory.ppt
04_Cache Memory.ppt04_Cache Memory.ppt
04_Cache Memory.ppt
 

Plus de Maxim Zaks

Swift the implicit parts
Swift the implicit partsSwift the implicit parts
Swift the implicit parts
Maxim Zaks
 
96% macoun 2013
96% macoun 201396% macoun 2013
96% macoun 2013
Maxim Zaks
 
Diagnose of Agile @ Wooga 04.2013
Diagnose of Agile @ Wooga 04.2013Diagnose of Agile @ Wooga 04.2013
Diagnose of Agile @ Wooga 04.2013
Maxim Zaks
 
Start playing @ mobile.cologne 2013
Start playing @ mobile.cologne 2013Start playing @ mobile.cologne 2013
Start playing @ mobile.cologne 2013
Maxim Zaks
 
Under Cocos2D Tree @mdvecon 2013
Under Cocos2D Tree @mdvecon 2013Under Cocos2D Tree @mdvecon 2013
Under Cocos2D Tree @mdvecon 2013
Maxim Zaks
 
Don’t do Agile, be Agile @NSConf 2013
Don’t do Agile, be Agile @NSConf 2013Don’t do Agile, be Agile @NSConf 2013
Don’t do Agile, be Agile @NSConf 2013
Maxim Zaks
 

Plus de Maxim Zaks (20)

Entity Component System - a different approach to game and app development
Entity Component System - a different approach to game and app developmentEntity Component System - a different approach to game and app development
Entity Component System - a different approach to game and app development
 
Wind of change
Wind of changeWind of change
Wind of change
 
Data model mal anders
Data model mal andersData model mal anders
Data model mal anders
 
Talk Binary to Me
Talk Binary to MeTalk Binary to Me
Talk Binary to Me
 
Entity Component System - for App developers
Entity Component System - for App developersEntity Component System - for App developers
Entity Component System - for App developers
 
Beyond JSON - An Introduction to FlatBuffers
Beyond JSON - An Introduction to FlatBuffersBeyond JSON - An Introduction to FlatBuffers
Beyond JSON - An Introduction to FlatBuffers
 
Beyond JSON @ Mobile.Warsaw
Beyond JSON @ Mobile.WarsawBeyond JSON @ Mobile.Warsaw
Beyond JSON @ Mobile.Warsaw
 
Beyond JSON @ dot swift 2016
Beyond JSON @ dot swift 2016Beyond JSON @ dot swift 2016
Beyond JSON @ dot swift 2016
 
Beyond JSON with FlatBuffers
Beyond JSON with FlatBuffersBeyond JSON with FlatBuffers
Beyond JSON with FlatBuffers
 
Basics of Computer Science
Basics of Computer ScienceBasics of Computer Science
Basics of Computer Science
 
Entity system architecture with Unity @Unite Europe 2015
Entity system architecture with Unity @Unite Europe 2015 Entity system architecture with Unity @Unite Europe 2015
Entity system architecture with Unity @Unite Europe 2015
 
UIKonf App & Data Driven Design @swift.berlin
UIKonf App & Data Driven Design @swift.berlinUIKonf App & Data Driven Design @swift.berlin
UIKonf App & Data Driven Design @swift.berlin
 
Swift the implicit parts
Swift the implicit partsSwift the implicit parts
Swift the implicit parts
 
Currying in Swift
Currying in SwiftCurrying in Swift
Currying in Swift
 
Promise of an API
Promise of an APIPromise of an API
Promise of an API
 
96% macoun 2013
96% macoun 201396% macoun 2013
96% macoun 2013
 
Diagnose of Agile @ Wooga 04.2013
Diagnose of Agile @ Wooga 04.2013Diagnose of Agile @ Wooga 04.2013
Diagnose of Agile @ Wooga 04.2013
 
Start playing @ mobile.cologne 2013
Start playing @ mobile.cologne 2013Start playing @ mobile.cologne 2013
Start playing @ mobile.cologne 2013
 
Under Cocos2D Tree @mdvecon 2013
Under Cocos2D Tree @mdvecon 2013Under Cocos2D Tree @mdvecon 2013
Under Cocos2D Tree @mdvecon 2013
 
Don’t do Agile, be Agile @NSConf 2013
Don’t do Agile, be Agile @NSConf 2013Don’t do Agile, be Agile @NSConf 2013
Don’t do Agile, be Agile @NSConf 2013
 

Dernier

AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
VictorSzoltysek
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Dernier (20)

The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 

Nitty Gritty of Data Serialisation