Apache Avro

    Zafar Gilani
Muhammad Adnan Khan
     Hui Shang
Outline
•   Overview
•   Comparison
•   Specification
•   SASL profile and usage
•   References
Overview
•   A data serialization system.
•   An RPC framework.
•   For: storage and communication.
•   Purpose:
    – Provide rich data structures.
    – A compact and fast binary data format.
    – Simple integration with dynamic languages.
Overview
• Avro uses JSON as its Interface Description
  Language (IDL).
  – To specify data types.
  – To specify protocols.
• Review: JavaScript Object Notation (JSON) is a
  lightweight, text-based standard for data
  interchange.
Why the need for Avro?
• Primary usage is in Hadoop, where it provides a standard:
  1. Serialization format for persistent data.
  2. Wire format for communication:
    •   among Hadoop nodes.
    •   from client programs to Hadoop services.
Overview
• Avro relies on schemas.
  – The schema is stored with the data.
  – Each datum is written with no per-value overhead.
     • Thus serialization is fast and the output is small.
• Avro in RPC:
  – Client and server exchange schemas during the handshake.
  – Correspondence between same-named, missing and extra
    fields can be easily resolved.
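To make "no per-value overhead" concrete, here is a minimal sketch in Python of how a record could be written in Avro's binary encoding: ints and longs as variable-length zig-zag integers, strings as a length followed by UTF-8 bytes, and a record as its field values concatenated in schema order. The function names are hypothetical and only three primitive types are handled; this is not the Avro library's API.

```python
def zigzag(n: int) -> int:
    """Map a signed 64-bit integer to an unsigned one (zig-zag coding)."""
    return (n << 1) ^ (n >> 63)


def encode_long(n: int) -> bytes:
    """Encode an int/long as a variable-length zig-zag integer."""
    z = zigzag(n)
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode a string as its byte length (a long) followed by UTF-8 data."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


# Hypothetical dispatch table covering only the types used in this sketch.
ENCODERS = {"int": encode_long, "long": encode_long, "string": encode_string}


def encode_record(fields, datum) -> bytes:
    """Concatenate field values in schema order; no names or tags are written."""
    return b"".join(ENCODERS[f["type"]](datum[f["name"]]) for f in fields)


greeting_fields = [{"name": "message", "type": "string"}]
encoded = encode_record(greeting_fields, {"message": "hi"})
```

Only the values reach the output, so a reader needs the writer's schema to know where each field starts and ends.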
APIs
• APIs are available for:
  – Java
  – C
  – C++
  – C#
  – Python
  – Ruby
Comparison with other systems
• Avro vs. Protobuf and Thrift.
• A quick note about Thrift:
  – Initially developed at Facebook by a Google intern.
  – Closer to Google’s protobuf.
Comparison with other systems
                 Avro                Google protobuf       Thrift

Implementation   Hmm..               Cleaner               Hmm..

Error handling   Complex             Simple                OK

Extensibility    Hmm..               Richer                OK

Compatibility    Java, C, C++, C#,   That and much         About the same as
                 Python and Ruby     more, such as         protobuf
                                     Adobe ActionScript,
                                     Microsoft
                                     Silverlight, etc.
Specification
• A schema is represented in JSON by one of:
   – A JSON string, naming a defined type.
   – A JSON object of the form:
      • {"type": "typeName" ...attributes...}
   – A JSON array, representing a union of types.
• Primitive types: null, boolean, int, long, float,
  double, bytes, string
   – {"type": "string"}
• Complex types: records, enums, arrays, maps,
  unions, fixed
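The three schema forms can be told apart once the JSON is parsed. The sketch below uses only Python's standard `json` module; `schema_form` is a hypothetical helper, not part of any Avro library, and it only classifies the outer form without validating the schema.

```python
import json


def schema_form(schema_json: str) -> str:
    """Classify a schema by which of the three JSON forms it uses."""
    schema = json.loads(schema_json)
    if isinstance(schema, str):
        return "name"      # JSON string naming a defined type
    if isinstance(schema, dict):
        return "object"    # {"type": "typeName", ...attributes...}
    if isinstance(schema, list):
        return "union"     # JSON array of alternative schemas
    raise ValueError("a schema must be a JSON string, object or array")


examples = [
    '"string"',                            # primitive type by name
    '{"type": "string"}',                  # the same type in object form
    '{"type": "array", "items": "long"}',  # complex type with attributes
    '["null", "string"]',                  # union: either null or a string
]
forms = [schema_form(e) for e in examples]
```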
Specification, example protocol
{
    "namespace": "com.acme",
    "protocol": "HelloWorld",
    "doc": "Protocol Greetings",

    "types": [
      {"name": "Greeting", "type": "record", "fields": [
        {"name": "message", "type": "string"}]},
      {"name": "Curse", "type": "error", "fields": [
        {"name": "message", "type": "string"}]}
    ],

    "messages": {
      "hello": {
        "doc": "Say hello.",
        "request": [{"name": "greeting", "type": "Greeting" }],
        "response": "Greeting",
        "errors": ["Curse"]
      }
    }
}
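Because the protocol is plain JSON, it can be inspected with any JSON parser. The following Python sketch loads the example above and prints a readable signature for each message. `full_name` and `message_signatures` are hypothetical helpers written for illustration, and they assume every parameter type is given as a name, as it is here.

```python
import json

PROTOCOL_JSON = """
{
  "namespace": "com.acme",
  "protocol": "HelloWorld",
  "doc": "Protocol Greetings",
  "types": [
    {"name": "Greeting", "type": "record", "fields": [
      {"name": "message", "type": "string"}]},
    {"name": "Curse", "type": "error", "fields": [
      {"name": "message", "type": "string"}]}
  ],
  "messages": {
    "hello": {
      "doc": "Say hello.",
      "request": [{"name": "greeting", "type": "Greeting"}],
      "response": "Greeting",
      "errors": ["Curse"]
    }
  }
}
"""


def full_name(protocol: dict) -> str:
    """Join namespace and protocol name into a dotted full name."""
    return f'{protocol["namespace"]}.{protocol["protocol"]}'


def message_signatures(protocol: dict) -> list:
    """Render each message as 'name(params) -> response throws errors'."""
    signatures = []
    for name, msg in protocol["messages"].items():
        params = ", ".join(f'{p["name"]}: {p["type"]}' for p in msg["request"])
        sig = f'{name}({params}) -> {msg["response"]}'
        if msg.get("errors"):
            sig += " throws " + ", ".join(msg["errors"])
        signatures.append(sig)
    return signatures


protocol = json.loads(PROTOCOL_JSON)
print(full_name(protocol))
for line in message_signatures(protocol):
    print(line)
```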
SASL profile
• Simple Authentication and Security Layer.
• Provides a framework for
  – Authentication.
  – Security of network protocols.
SASL usage
• Status codes in the negotiation that sets up
  connection-oriented Avro RPC:
  – 0: START Used in a client's initial message.
  – 1: CONTINUE Used while negotiation is ongoing.
  – 2: FAIL Terminates negotiation unsuccessfully.
  – 3: COMPLETE Terminates negotiation successfully.
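The four status codes can be read as a small state machine. The Python sketch below is a simplified model under stated assumptions: it looks only at the sequence of status codes, ignores payloads, mechanism names and which side sends each message, and the names `Status` and `negotiate` are hypothetical.

```python
from enum import IntEnum


class Status(IntEnum):
    START = 0     # client's initial message
    CONTINUE = 1  # negotiation is ongoing
    FAIL = 2      # terminates negotiation unsuccessfully
    COMPLETE = 3  # terminates negotiation successfully


def negotiate(statuses) -> bool:
    """Walk a sequence of status codes; True on COMPLETE, False on FAIL.

    Raises ValueError if the sequence does not begin with START, repeats
    START, contains an unknown code, or ends without terminating.
    """
    it = iter(statuses)
    first = next(it, None)
    if first is None or Status(first) is not Status.START:
        raise ValueError("negotiation must begin with START")
    for raw in it:
        status = Status(raw)  # unknown codes raise ValueError here
        if status is Status.COMPLETE:
            return True
        if status is Status.FAIL:
            return False
        if status is Status.START:
            raise ValueError("START is only valid in the initial message")
    raise ValueError("negotiation did not terminate")
```

For example, `negotiate([0, 1, 1, 3])` models a successful exchange with two CONTINUE rounds, while `negotiate([0, 2])` models an immediate failure.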
References
1. Apache Avro,
   http://avro.apache.org/docs/current/
2. Google protocol buffers vs Apache Avro,
   http://www.sammur.com/?p=36
3. Avro vs Thrift,
   http://tech.puredanger.com/2011/05/27/serialization-comparison/
4. SASL,
   http://avro.apache.org/docs/current/sasl.html
