Soumettre la recherche
Mettre en ligne
H cat berlinbuzzwords2012
•
10 j'aime
•
1,229 vues
Hortonworks
Suivre
Slides from Alan Gates' HCatalog talk at Berlin Buzzwords 2012
Lire moins
Lire la suite
Technologie
Signaler
Partager
Signaler
Partager
1 sur 13
Télécharger maintenant
Télécharger pour lire hors ligne
Recommandé
Future of HCatalog - Hadoop Summit 2012
Future of HCatalog - Hadoop Summit 2012
Hortonworks
May 2013 HUG: HCatalog/Hive Data Out
May 2013 HUG: HCatalog/Hive Data Out
Yahoo Developer Network
HCatalog Hadoop Summit 2011
HCatalog Hadoop Summit 2011
Hortonworks
Hive hcatalog
Hive hcatalog
Alexandre Poletto
Web Services Hadoop Summit 2012
Web Services Hadoop Summit 2012
Hortonworks
HCatalog: Table Management for Hadoop - CHUG - 20120917
HCatalog: Table Management for Hadoop - CHUG - 20120917
Chicago Hadoop Users Group
YARN - Strata 2014
YARN - Strata 2014
Hortonworks
HBaseCon 2015: Analyzing HBase Data with Apache Hive
HBaseCon 2015: Analyzing HBase Data with Apache Hive
HBaseCon
Contenu connexe
Tendances
Hive Anatomy
Hive Anatomy
nzhang
Hive Quick Start Tutorial
Hive Quick Start Tutorial
Carl Steinbach
Introduction to Hadoop
Introduction to Hadoop
Ovidiu Dimulescu
Session 01 - Into to Hadoop
Session 01 - Into to Hadoop
AnandMHadoop
Apache Drill
Apache Drill
Big Data User Group Karlsruhe/Stuttgart
HBase Data Modeling and Access Patterns with Kite SDK
HBase Data Modeling and Access Patterns with Kite SDK
HBaseCon
Session 03 - Hadoop Installation and Basic Commands
Session 03 - Hadoop Installation and Basic Commands
AnandMHadoop
Batch is Back: Critical for Agile Application Adoption
Batch is Back: Critical for Agile Application Adoption
DataWorks Summit/Hadoop Summit
Introduction to Apache Hive
Introduction to Apache Hive
Avkash Chauhan
Apache Drill with Oracle, Hive and HBase
Apache Drill with Oracle, Hive and HBase
Nag Arvind Gudiseva
Five major tips to maximize performance on a 200+ SQL HBase/Phoenix cluster
Five major tips to maximize performance on a 200+ SQL HBase/Phoenix cluster
mas4share
Hadoop
Hadoop
Peerapat Asoktummarungsri
Drilling into Data with Apache Drill
Drilling into Data with Apache Drill
MapR Technologies
03 pig intro
03 pig intro
Subhas Kumar Ghosh
Beginning hive and_apache_pig
Beginning hive and_apache_pig
Mohamed Ali Mahmoud khouder
Sql saturday pig session (wes floyd) v2
Sql saturday pig session (wes floyd) v2
Wes Floyd
Leveraging Hadoop in your PostgreSQL Environment
Leveraging Hadoop in your PostgreSQL Environment
Jim Mlodgenski
How HarperDB Works
How HarperDB Works
HarperDB
02 Hadoop deployment and configuration
02 Hadoop deployment and configuration
Subhas Kumar Ghosh
Yahoo! Hack Europe Workshop
Yahoo! Hack Europe Workshop
Hortonworks
Tendances
(20)
Hive Anatomy
Hive Anatomy
Hive Quick Start Tutorial
Hive Quick Start Tutorial
Introduction to Hadoop
Introduction to Hadoop
Session 01 - Into to Hadoop
Session 01 - Into to Hadoop
Apache Drill
Apache Drill
HBase Data Modeling and Access Patterns with Kite SDK
HBase Data Modeling and Access Patterns with Kite SDK
Session 03 - Hadoop Installation and Basic Commands
Session 03 - Hadoop Installation and Basic Commands
Batch is Back: Critical for Agile Application Adoption
Batch is Back: Critical for Agile Application Adoption
Introduction to Apache Hive
Introduction to Apache Hive
Apache Drill with Oracle, Hive and HBase
Apache Drill with Oracle, Hive and HBase
Five major tips to maximize performance on a 200+ SQL HBase/Phoenix cluster
Five major tips to maximize performance on a 200+ SQL HBase/Phoenix cluster
Hadoop
Hadoop
Drilling into Data with Apache Drill
Drilling into Data with Apache Drill
03 pig intro
03 pig intro
Beginning hive and_apache_pig
Beginning hive and_apache_pig
Sql saturday pig session (wes floyd) v2
Sql saturday pig session (wes floyd) v2
Leveraging Hadoop in your PostgreSQL Environment
Leveraging Hadoop in your PostgreSQL Environment
How HarperDB Works
How HarperDB Works
02 Hadoop deployment and configuration
02 Hadoop deployment and configuration
Yahoo! Hack Europe Workshop
Yahoo! Hack Europe Workshop
Similaire à H cat berlinbuzzwords2012
2013 feb 20_thug_h_catalog
2013 feb 20_thug_h_catalog
Adam Muise
Future of HCatalog
Future of HCatalog
DataWorks Summit
TriHUG November HCatalog Talk by Alan Gates
TriHUG November HCatalog Talk by Alan Gates
trihug
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...
Big Data Spain
Dancing with the Elephant
Dancing with the Elephant
DataWorks Summit
HCatalog
HCatalog
GetInData
Tez: Accelerating Data Pipelines - fifthel
Tez: Accelerating Data Pipelines - fifthel
t3rmin4t0r
Unit V.pdf
Unit V.pdf
KennyPratheepKumar
Hadoop past, present and future
Hadoop past, present and future
Codemotion
Jan 2012 HUG: HCatalog
Jan 2012 HUG: HCatalog
Yahoo Developer Network
Strata feb2013
Strata feb2013
alanfgates
Hadoop data access layer v4.0
Hadoop data access layer v4.0
SpringPeople
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...
The Hive
Mar 2012 HUG: Hive with HBase
Mar 2012 HUG: Hive with HBase
Yahoo Developer Network
A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0
DataWorks Summit
Hadoop development series(1)
Hadoop development series(1)
Amar kumar
HUG Meetup 2013: HCatalog / Hive Data Out
HUG Meetup 2013: HCatalog / Hive Data Out
Sumeet Singh
Process and Visualize Your Data with Revolution R, Hadoop and GoogleVis
Process and Visualize Your Data with Revolution R, Hadoop and GoogleVis
Hortonworks
Hdp r-google charttools-webinar-3-5-2013 (2)
Hdp r-google charttools-webinar-3-5-2013 (2)
Hortonworks
43_Sameer_Kumar_Das2
43_Sameer_Kumar_Das2
Mr.Sameer Kumar Das
Similaire à H cat berlinbuzzwords2012
(20)
2013 feb 20_thug_h_catalog
2013 feb 20_thug_h_catalog
Future of HCatalog
Future of HCatalog
TriHUG November HCatalog Talk by Alan Gates
TriHUG November HCatalog Talk by Alan Gates
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...
Dancing with the Elephant
Dancing with the Elephant
HCatalog
HCatalog
Tez: Accelerating Data Pipelines - fifthel
Tez: Accelerating Data Pipelines - fifthel
Unit V.pdf
Unit V.pdf
Hadoop past, present and future
Hadoop past, present and future
Jan 2012 HUG: HCatalog
Jan 2012 HUG: HCatalog
Strata feb2013
Strata feb2013
Hadoop data access layer v4.0
Hadoop data access layer v4.0
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...
Mar 2012 HUG: Hive with HBase
Mar 2012 HUG: Hive with HBase
A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0
Hadoop development series(1)
Hadoop development series(1)
HUG Meetup 2013: HCatalog / Hive Data Out
HUG Meetup 2013: HCatalog / Hive Data Out
Process and Visualize Your Data with Revolution R, Hadoop and GoogleVis
Process and Visualize Your Data with Revolution R, Hadoop and GoogleVis
Hdp r-google charttools-webinar-3-5-2013 (2)
Hdp r-google charttools-webinar-3-5-2013 (2)
43_Sameer_Kumar_Das2
43_Sameer_Kumar_Das2
Plus de Hortonworks
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
Hortonworks
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Hortonworks
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log Events
Hortonworks
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Hortonworks
HDF 3.2 - What's New
HDF 3.2 - What's New
Hortonworks
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Hortonworks
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Hortonworks
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data Landscape
Hortonworks
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
Hortonworks
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at Scale
Hortonworks
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
Hortonworks
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Hortonworks
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Hortonworks
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
Hortonworks
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Hortonworks
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
Hortonworks
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
Hortonworks
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Hortonworks
Plus de Hortonworks
(20)
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log Events
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
HDF 3.2 - What's New
HDF 3.2 - What's New
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data Landscape
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at Scale
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Dernier
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Aijun Zhang
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
Aggregage
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
DianaGray10
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
bruanjhuli
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
Matt Ray
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
Jamie (Taka) Wang
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
Eric D. Schabell
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
Tarek Kalaji
201610817 - edge part1
201610817 - edge part1
Jamie (Taka) Wang
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
Mahmoud Rabie
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
Adam Moalla
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
infogdgmi
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
Matsuo Lab
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
DianaGray10
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
Jamie (Taka) Wang
Nanopower In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
Pedro Manuel
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
IES VE
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
UiPathCommunity
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
Brian Pichman
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
YounusS2
Dernier
(20)
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
201610817 - edge part1
201610817 - edge part1
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
Nanopower In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
H cat berlinbuzzwords2012
1.
HCatalog Table Management For
Hadoop Alan F. Gates @alanfgates Page 1
2.
Who Am I? •
HCatalog committer and mentor • Co-founder of Hortonworks • Lead for Pig, Hive, and HCatalog at Hortonworks • Pig committer and PMC Member • Member of Apache Software Foundation and Incubator PMC • Author of Programming Pig from O’Reilly © Hortonworks Inc. 2012 Page 2
3.
Many Data Tools
MapReduce • Early adopters • Non-relational algorithms • Performance sensitive applications Pig • ETL • Data modeling • Iterative algorithms Hive • Analysis • Connectors to BI tools Strength: Pick the right tool for your application Weakness: Hard for users to share their data © Hortonworks 2012 Page 3
4.
Tool Comparison Feature
MapReduce Pig Hive Record format Key value pairs Tuple Record Data model User defined int, float, string, int, float, string, bytes, maps, maps, structs, lists tuples, bags Schema Encoded in app Declared in script Read from or read by loader metadata Data location Encoded in app Declared in script Read from metadata Data format Encoded in app Declared in script Read from metadata • Pig and MR users need to know a lot to write their apps • When data schema, location, or format change Pig and MR apps must be rewritten, retested, and redeployed • Hive users have to load data from Pig/MR users to have access to it © Hortonworks 2012 Page 4
5.
Hadoop Ecosystem MapReduce
Hive Pig SerDe InputFormat/ InputFormat/ Load/ Metastore Client OuputFormat OuputFormat Store HDFS Metastore © Hortonworks 2012 Page 5
6.
Opening up Metadata
to MR & Pig MapReduce Hive Pig HCaInputFormat/ HCatLoader/ HCatOuputFormat HCatStorer SerDe InputFormat/ Metastore Client OuputFormat HDFS Metastore © Hortonworks 2012 Page 6
7.
Tools With HCatalog Feature
MapReduce + Pig + HCatalog Hive HCatalog Record format Record Tuple Record Data model int, float, string, int, float, string, int, float, string, maps, structs, lists bytes, maps, maps, structs, lists tuples, bags Schema Read from Read from Read from metadata metadata metadata Data location Read from Read from Read from metadata metadata metadata Data format Read from Read from Read from metadata metadata metadata • Pig/MR users can read schema from metadata • Pig/MR users are insulated from schema, location, and format changes • All users have access to other users’ data as soon as it is committed © Hortonworks 2012 Page 7
8.
Pig Example Assume you
want to count how many time each of your users went to each of your URLs raw = load '/data/rawevents/20120530' as (url, user); botless = filter raw by myudfs.NotABot(user); grpd = group botless by (url, user); cntd = foreach grpd generate flatten(url, user), COUNT(botless); store cntd into '/data/counted/20120530'; No need to know No need to Using HCatalog: file location declare schema raw = load 'rawevents' using HCatLoader(); botless = filter raw by myudfs.NotABot(user) and ds == '20120530'; grpd = group botless by (url, user); Partition filter cntd = foreach grpd generate flatten(url, user), COUNT(botless); store cntd into 'counted' using HCatStorer(); © Hortonworks 2012 Page 8
9.
Working with HCatalog
in MapReduce Setting input: table to read from HCatInputFormat.setInput(job, InputJobInfo.create(dbname, tableName, filter)); database to read specify which Setting output: from partitions to read HCatOutputFormat.setOutput(job, OutputJobInfo.create(dbname, tableName, partitionSpec)); specify which Obtaining schema: partition to write schema = HCatInputFormat.getOutputSchema(); access fields by Key is unused, Value is HCatRecord: name String url = value.get("url", schema); output.set("cnt", schema, cnt); © Hortonworks 2012 Page 9
10.
Managing Metadata • If
you are a Hive user, you can use your Hive metastore with no modifications • If not, you can use the HCatalog command line tool to issue Hive DDL (Data Definition Language) commands: > /usr/bin/hcat -e ”create table rawevents (url string, user string) partitioned by (ds string);"; • Starting in Pig 0.11, you will be able to issue DDL commands from Pig © Hortonworks 2012 Page 10
11.
Templeton - REST
API • REST endpoints: databases, tables, partitions, columns, table properties • PUT to create/update, GET to list or describe, DELETE to drop Create new table “rawevents” Describe table “rawevents” Get a list of all tables in the default database: PUT {"columns": [{ "name": "url", "type": "string" }, { "name": "user", "type": "string"}], "partitionedBy": [{ "name": "ds", "type": "string" }]} GET GET http://…/v1/ddl/database/default/table/rawevents http://…/v1/ddl/database/default/table/rawevents http://…/v1/ddl/database/default/table Hadoop/ HCatalog { "columns": [{"name": "url","type": "string"}, "tables":"rawevents", "table": ["counted","processed",], {"name": "user","type": "string"}], "database": "default" "default” } "database": "default", "table": "rawevents" } • Included in HDP • Not yet checked in, but you can find the code on Apache’s JIRA HCATALOG-182 © Hortonworks 2012 Page 11
12.
HCatalog Project • HCatalog
is an Apache Incubator project • Version 0.4.0-incubating released May 2012 –Hive/Pig/MapReduce integration –Support for any data format with a SerDe (Text, Sequence, RCFile, JSON SerDes included) –Notification via JMS –Initial HBase integration © Hortonworks Inc. 2012 Page 12
13.
Questions
© Hortonworks 2012 Page 13
Télécharger maintenant