Apache Tajo: A Big Data Warehouse System on Hadoop
Jaehwa Jung, Research Director
Gruter TECHDAY 2014
©2014 Gruter. All rights reserved. 
About me 
• Big Data Platform, Gruter Inc (http://www.gruter.com)
• Apache Tajo Committer 
• jhjung@gruter.com 
• http://blrunner.com 
• Author of the book 시작하세요! 하둡 프로그래밍 (Beginning Hadoop Programming)
Agenda 
• Introduction to SQL-on-Hadoop 
• Introduction to Apache Tajo 
• What can you do with Tajo?
• Why should you use Tajo?
• Current Tajo Status 
• Use Cases 
• Demonstration
Introduction to SQL-on-Hadoop
Hadoop Overview 
• MapReduce (distributed computation)
• HDFS (distributed storage)
Image source: http://www.quuxlabs.com/wp-content/uploads/2010/08/Yahoo-hadoop-cluster_OSCON_2007.jpg
SQL-on-Hadoop Overview
• Systems that process data stored in HDFS with SQL
• A move away from the MapReduce model
• Varied design goals: data warehouse vs. query engine
Introduction to Apache Tajo
Apache Tajo Overview 
• A big data warehouse system on Hadoop 
• Apache Top-level project since March 2014 
• Supports SQL standards 
• Features 
– Powerful distributed processing architecture (not MapReduce)
– Advanced query optimization algorithms and techniques
– Long-running queries: lasting many hours
– Interactive analysis queries: from 100 milliseconds
• Most recent release: 0.9.0
Tajo Architecture 
(Architecture diagram)
• Client: JDBC, TSql, Web UI
• Master Server: TajoMaster
• CatalogStore: DBMS, HCatalog
• Slave Server (three shown), each running a TajoWorker:
– QueryMaster
– Local Query Engine
– StorageManager
– Local FileSystem, HDFS
• Arrows: Submit a query (Client to TajoMaster), Manage metadata (TajoMaster and CatalogStore), Allocate a query (TajoMaster to QueryMaster), Run & monitor a query (QueryMaster to TajoWorkers)
What Can You Do with Tajo?
Commercial Data Warehouse 
(Diagram)
• Source Data: OLTP, CRM, ERP, ecommerce, Other
• Data Warehouse: ODS (Operational Data Store), Data Warehouse, Data Mart, linked by ETL steps
• Front-End Analytics: OLAP, Visualization, Reports, Data Mining
Hadoop based Data Warehouse with Tajo 
• We can do ETL and Interactive Analytics!
(Diagram)
• Source Data: OLTP, CRM, ERP, ecommerce, Other
• Data Warehouse: ODS (Operational Data Store), Data Warehouse, Data Mart, linked by ETL steps
• Front-End Analytics: Reports, OLAP, Visualization, Data Mining
Why Should You Use Tajo?
Mature SQL Feature Set 
• Fully distributed query execution
– Inner join, and left/right/full outer join
– Group by, sort, multiple distinct aggregation
– Window functions
• SQL data types
– CHAR, BOOL, INT, DOUBLE, TEXT, DATE, etc.
• Various file formats
– Text file (CSV), SequenceFile, RCFile, Parquet, Avro
• SQL Standards
– Non-standard features: PgSQL and Oracle
Performance 
• 1.5 to 10 times faster than Hive 0.10: http://slidesha.re/1yTBTaa
• Data Set: TPC-H Scale 100 or 1000
• H/W: 1 master + 6 data nodes
– CPU: 24 cores (Xeon 2.5GHz, HT)
– Memory: 64GB
– Disk: 3TB * 6 SATA/HDD (7200 RPM)
– Network: 10Gb
• S/W
– Hadoop: cdh-4.3.0
– Hive: 0.10.0-cdh4.3.0
– Impala: impalad_version_1.1.1_RELEASE
– Tajo: 0.2-SNAPSHOT
Performance: Q1 – filter scan 
select l_returnflag, l_linestatus,
  sum(l_quantity) as sum_qty,
  sum(l_extendedprice) as sum_base_price,
  sum(l_extendedprice*(1-l_discount)) as sum_disc_price,
  sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge,
  avg(l_quantity) as avg_qty,
  avg(l_extendedprice) as avg_price,
  avg(l_discount) as avg_disc,
  count(*) as count_order
from lineitem
where l_shipdate <= '1998-09-01'
group by l_returnflag, l_linestatus
order by l_returnflag, l_linestatus
(Chart) Q1: scan using about 20 text pattern matching filters. Values: 1445.69, 895.96, 789.09. Legend: Hive, Impala, Tajo.
Performance: Q2 – unions and joins 
create table nation_region as
  select n_regionkey, r_regionkey, n_nationkey, n_name, r_name
  from region join nation on n_regionkey = r_regionkey
  where r_name = 'EUROPE';
create table r2_1 as
  select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address,
    s_phone, s_comment, ps_supplycost
  from nation_region
    join supplier on s_nationkey = n_nationkey
    join partsupp on s_suppkey = ps_suppkey
    join part on p_partkey = ps_partkey
  where p_size = 15 and p_type like '%BRASS';
create table r2_2 as
  select p_partkey, min(ps_supplycost) as min_ps_supplycost
  from r2_1
  group by p_partkey;
select s_acctbal, s_name, n_name, r2_1.p_partkey, p_mfgr, s_address, s_phone, s_comment
from r2_1 join r2_2 on r2_1.p_partkey = r2_2.p_partkey
where ps_supplycost = min_ps_supplycost
order by s_acctbal, n_name, s_name, r2_1.p_partkey;
(Chart) Q2: 7 unions with joins. Values: 63.64, 9.11, 38.64. Legend: Hive, Impala, Tajo.
Performance: Q3 - join 
select l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue,
  o_orderdate, o_shippriority
from customer as c
  join orders as o on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey
  join lineitem as l on l.l_orderkey = o.o_orderkey
where o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15'
group by l_orderkey, o_orderdate, o_shippriority
order by revenue desc, o_orderdate;
(Chart) Q3: join. Values: 101.45, 36.81, 31.92. Legend: Hive, Impala, Tajo.
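To relate the three charts to the "faster than Hive" claim, the ratios can be worked out directly. The sketch below is only an illustration under two assumptions that the slides do not state: each chart lists its values in legend order (Hive, Impala, Tajo), and the values are elapsed times, so lower is better. The names `RESULTS` and `speedup_over` are made up for this example.

```python
# Values copied from the Q1-Q3 charts.
# ASSUMPTION: values appear in legend order (Hive, Impala, Tajo) and are
# elapsed times (lower is better); the slides do not state the unit.
RESULTS = {
    "Q1": {"Hive": 1445.69, "Impala": 895.96, "Tajo": 789.09},
    "Q2": {"Hive": 63.64, "Impala": 9.11, "Tajo": 38.64},
    "Q3": {"Hive": 101.45, "Impala": 36.81, "Tajo": 31.92},
}


def speedup_over(baseline, engine="Tajo", results=RESULTS):
    """Return {query: baseline_time / engine_time}; > 1 means engine is faster."""
    return {
        query: times[baseline] / times[engine]
        for query, times in results.items()
    }


if __name__ == "__main__":
    for query, ratio in speedup_over("Hive").items():
        print(f"{query}: Tajo vs Hive = {ratio:.2f}x")
    for query, ratio in speedup_over("Impala").items():
        print(f"{query}: Tajo vs Impala = {ratio:.2f}x")
```

Under those assumptions the Tajo-to-Hive ratio is above 1 for all three queries, while Q2 is the one case where Impala comes out ahead of Tajo.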
Simple Operation and Software Stack 
• Simple Installation and Operation
– http://tajo.apache.org/docs/current/getting_started.html
• Simple Software Stack Requirement
– No MapReduce and no Tez
– YARN is supported but not required
– Tajo + Linux system for a single-node cluster
– Tajo + HDFS for a distributed cluster
Simple Integration 
• Integration with Hadoop Ecosystem
– Supports Hadoop 2.2.0 – 2.5.1
– Can connect to the Hive Metastore
– Directly processes tables managed by Hive
• YARN support (backport)
– Enables Tajo to deploy and run on a YARN cluster
– Allows users to add or remove nodes from a Tajo cluster at runtime
Active Open Source Community 
• Fully community-driven open source 
• Stable development team 
– 17 committers + many contributors
Current Tajo Status
Join 
• Join 
– NATURAL, INNER, OUTER (LEFT, RIGHT, FULL) 
– SEMI, ANTI Join (planned for v0.9) 
• Join Predicates 
– WHERE and ON predicates 
– De facto standard outer join behavior with both predicates
SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num
WHERE t2.value = 'xxx';
SELECT * FROM t1 LEFT JOIN t2 WHERE t1.num = t2.num
AND t2.value = 'xxx';
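Where a predicate sits matters for outer joins: a filter in WHERE is applied after the join and drops the NULL-extended rows, while the same filter inside ON only restricts which right-hand rows match. The sketch below illustrates that standard behavior with Python's built-in `sqlite3` module rather than Tajo itself. The tables `t1` and `t2` follow the slide, but the sample rows and the function name `left_join_demo` are invented for this example.

```python
import sqlite3


def left_join_demo():
    """Compare a filter placed in WHERE with the same filter placed in ON."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data; only the table and column names come from the slide.
    conn.executescript(
        """
        CREATE TABLE t1 (num INTEGER);
        CREATE TABLE t2 (num INTEGER, value TEXT);
        INSERT INTO t1 VALUES (1), (2), (3);
        INSERT INTO t2 VALUES (1, 'xxx'), (2, 'yyy');
        """
    )
    # Filter in WHERE: evaluated after the join, so unmatched rows are removed.
    where_rows = conn.execute(
        "SELECT t1.num, t2.value FROM t1 LEFT JOIN t2 ON t1.num = t2.num "
        "WHERE t2.value = 'xxx' ORDER BY t1.num"
    ).fetchall()
    # Filter in ON: every t1 row survives; non-matching t2 columns become NULL.
    on_rows = conn.execute(
        "SELECT t1.num, t2.value FROM t1 LEFT JOIN t2 "
        "ON t1.num = t2.num AND t2.value = 'xxx' ORDER BY t1.num"
    ).fetchall()
    conn.close()
    return where_rows, on_rows


if __name__ == "__main__":
    where_rows, on_rows = left_join_demo()
    print("filter in WHERE:", where_rows)
    print("filter in ON:   ", on_rows)
```

The WHERE variant keeps only the row for `num = 1`, whereas the ON variant keeps all three `t1` rows.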
Window Function 
• OVER clause 
– row_number() and rank() 
– Aggregation function support 
– PARTITION and ORDER BY clause 
SELECT depname, empno, salary, enroll_date
FROM (
  SELECT depname, empno, salary, enroll_date,
    rank() OVER (PARTITION BY depname
                 ORDER BY salary DESC, empno) AS pos
  FROM empsalary
) AS ss
WHERE pos < 3;
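What the query above computes can be spelled out in plain Python: partition by `depname`, order each partition by salary descending and then `empno`, assign `rank()`, and keep rows with `pos < 3`. This is a sketch of the semantics only, not of how Tajo executes window functions. The function name `top_ranked` and the sample rows are made up for illustration.

```python
from itertools import groupby


def top_ranked(rows, max_pos=3):
    """Emulate rank() OVER (PARTITION BY depname ORDER BY salary DESC, empno)
    followed by WHERE pos < max_pos.

    rows: iterable of (depname, empno, salary, enroll_date) tuples.
    """
    def order_key(row):
        return (-row[2], row[1])  # salary DESC, then empno ASC

    ordered = sorted(rows, key=lambda r: (r[0],) + order_key(r))
    result = []
    for _depname, partition in groupby(ordered, key=lambda r: r[0]):
        pos, previous = 0, None
        for index, row in enumerate(partition, start=1):
            current = order_key(row)
            if current != previous:
                # rank() gives tied rows the same position and then skips ahead.
                pos, previous = index, current
            if pos < max_pos:
                result.append(row)
    return result


if __name__ == "__main__":
    # Hypothetical rows standing in for the empsalary table.
    empsalary = [
        ("develop", 7, 4200, "2008-01-01"),
        ("develop", 8, 6000, "2006-10-01"),
        ("develop", 10, 5200, "2007-08-01"),
        ("sales", 1, 5000, "2006-10-01"),
        ("sales", 3, 4800, "2007-08-01"),
        ("sales", 4, 4800, "2007-08-08"),
    ]
    for row in top_ranked(empsalary):
        print(row)
```

Because `empno` is part of the ordering, two employees with the same salary still get different positions, so each department yields at most two rows here.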
Table Partitions
• Column Value Partition
– Hive Compatible Partition
CREATE TABLE T1 (C1 INT, C2 TEXT)
USING PARQUET
WITH ('parquet.compression' = 'SNAPPY')
PARTITION BY COLUMN (C3 INT, C4 TEXT);
• Range Partition (planned for 1.0)
– Table will be partitioned by disjoint ranges.
– Will remove the partition granularity problem of Hive Partition
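Hive lays out column-value partitions as nested `column=value` directories under the table path, one directory per distinct combination of partition values. The sketch below shows how rows would be grouped under that convention; that a Hive-compatible Tajo table produces exactly these paths is an assumption here, and the helper names, the `/warehouse/t1` path, and the sample rows are all hypothetical.

```python
import posixpath
from collections import defaultdict


def partition_dir(table_path, partition_values):
    """Build a Hive-style partition directory such as <table>/c3=1/c4=kr.

    partition_values: ordered list of (column, value) pairs.
    """
    parts = [f"{column}={value}" for column, value in partition_values]
    return posixpath.join(table_path, *parts)


def group_by_partition(table_path, rows, partition_cols):
    """Group row dicts by target directory; partition columns are encoded in
    the path, so they are left out of the stored row data."""
    groups = defaultdict(list)
    for row in rows:
        values = [(col, row[col]) for col in partition_cols]
        data = {k: v for k, v in row.items() if k not in partition_cols}
        groups[partition_dir(table_path, values)].append(data)
    return dict(groups)


if __name__ == "__main__":
    rows = [
        {"c1": 1, "c2": "a", "c3": 1, "c4": "kr"},
        {"c1": 2, "c2": "b", "c3": 1, "c4": "kr"},
        {"c1": 3, "c2": "c", "c3": 2, "c4": "us"},
    ]
    layout = group_by_partition("/warehouse/t1", rows, ["c3", "c4"])
    for path, data in sorted(layout.items()):
        print(path, data)
```

Each distinct value combination becomes its own directory, which is the granularity issue the planned range partitioning is meant to address.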
Comparison with other platforms (1/2)
Function | Tajo | Hive | Impala | Spark
Computing | Built-in | MapReduce or Tez | Built-in | Built-in
Resource Management | Built-in or YARN | YARN | Built-in | Built-in or YARN
Scheduler | FIFO, Fair | FIFO, Fair, Capacity | FIFO, Fair | FIFO, Fair
Storage | HDFS, S3, HBase | HDFS, HBase, S3 | HDFS, HBase | Built-in RDD (HDFS, etc.)
File Format | CSV, RC, Parquet, Avro, etc. | CSV, RC, ORC, Parquet, Avro, etc. | CSV, RC, Parquet, Avro, etc. | CSV, RC, Parquet, Avro, etc.
Data Model | Relational | Relational | Relational | Relational
Query | ANSI-SQL | HiveQL | HiveQL | HiveQL
Comparison with other platforms (2/2)
Function | Tajo | Hive | Impala | Spark
Implementation Language | Java | Java | C++ | Scala
Client | Java API, JDBC, CLI | CLI, JDBC, ODBC, Thrift Server API | CLI, JDBC, ODBC | Shark JDBC/ODBC, Scala, Java, Python API
Query Latency | Long run, Interactive | Long run, (Interactive-Tez) | Interactive | Interactive
Computing Characteristics | Data on disk; intermediate data uses both memory and disk | Data on disk; intermediate data uses both memory and disk | Intermediate data in memory (on-disk recently supported) | Data to be analyzed is loaded into memory
License | Apache | Apache | Apache | Apache
Main Sponsor | Gruter | Hortonworks | Cloudera | Databricks
Future Works 
• 2014 4Q 
– HBase integration
– In/Exists SubQuery 
– User defined function 
– Multi-tenant Scheduler 
• 2015 1Q 
– Authentication and Standard Access Control 
– Scalar SubQuery 
– ROLLUP, CUBE 
• 2015 2Q 
– Vectorized Engine(C++ Operator) 
– TajoR
Use Cases
Replace Commercial Data Warehouse (SKT) 
• ETL Processing: 120+ queries, ~4TB read/day
• OLAP Processing: 500+ queries
(Diagram) Layers: Operational Systems, Integration Layer, Data Warehouse, Data Mart. Components shown: Marketing, Sales, ERP, SCM, ODS, Staging Area, Strategic Marts, Data Vault.
Tajo-as-a-Service on AWS
Demonstration
TSql & Web UI 
Watch this video for Apache Tajo: 
http://www.youtube.com/watch?v=bFGjMLPEDq0
Get Involved! 
• We are recruiting contributors! 
• General 
– http://tajo.apache.org 
• Korean Tajo User Group
– https://groups.google.com/forum/?hl=ko#!forum/tajo-user-kr
• Getting Started 
– http://tajo.apache.org/docs/0.9.0/getting_started.html 
• Downloads 
– http://tajo.apache.org/docs/0.9.0/getting_started/downloading_source.html
• Jira – Issue Tracker 
– https://issues.apache.org/jira/browse/TAJO 
• Join the mailing list 
– dev-subscribe@tajo.apache.org 
– issues-subscribe@tajo.apache.org
GRUTER: YOUR PARTNER 
IN THE BIG DATA REVOLUTION 
Phone +82-70-8129-2950 
Fax +82-70-8129-2952 
E-mail contact@gruter.com 
Web www.gruter.com 
Phone +1-415-841-3345

More Related Content

What's hot

Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...Gruter
 
Tajo: A Distributed Data Warehouse System for Hadoop
Tajo: A Distributed Data Warehouse System for HadoopTajo: A Distributed Data Warehouse System for Hadoop
Tajo: A Distributed Data Warehouse System for HadoopHyunsik Choi
 
Large-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestLarge-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestHBaseCon
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteDataWorks Summit
 
Building data pipelines with kite
Building data pipelines with kiteBuilding data pipelines with kite
Building data pipelines with kiteJoey Echeverria
 
Evolving HDFS to a Generalized Distributed Storage Subsystem
Evolving HDFS to a Generalized Distributed Storage SubsystemEvolving HDFS to a Generalized Distributed Storage Subsystem
Evolving HDFS to a Generalized Distributed Storage SubsystemDataWorks Summit/Hadoop Summit
 
Tez Shuffle Handler: Shuffling at Scale with Apache Hadoop
Tez Shuffle Handler: Shuffling at Scale with Apache HadoopTez Shuffle Handler: Shuffling at Scale with Apache Hadoop
Tez Shuffle Handler: Shuffling at Scale with Apache HadoopDataWorks Summit
 
Ingesting hdfs intosolrusingsparktrimmed
Ingesting hdfs intosolrusingsparktrimmedIngesting hdfs intosolrusingsparktrimmed
Ingesting hdfs intosolrusingsparktrimmedwhoschek
 
Apache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at CernerApache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at CernerHBaseCon
 
HBaseCon 2015: HBase and Spark
HBaseCon 2015: HBase and SparkHBaseCon 2015: HBase and Spark
HBaseCon 2015: HBase and SparkHBaseCon
 
HBase Data Modeling and Access Patterns with Kite SDK
HBase Data Modeling and Access Patterns with Kite SDKHBase Data Modeling and Access Patterns with Kite SDK
HBase Data Modeling and Access Patterns with Kite SDKHBaseCon
 
From docker to kubernetes: running Apache Hadoop in a cloud native way
From docker to kubernetes: running Apache Hadoop in a cloud native wayFrom docker to kubernetes: running Apache Hadoop in a cloud native way
From docker to kubernetes: running Apache Hadoop in a cloud native wayDataWorks Summit
 
HBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestHBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestCloudera, Inc.
 
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWSHBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWSHBaseCon
 
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of GruterBig Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of GruterData Con LA
 
Building robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumBuilding robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumTathastu.ai
 
A Survey of HBase Application Archetypes
A Survey of HBase Application ArchetypesA Survey of HBase Application Archetypes
A Survey of HBase Application ArchetypesHBaseCon
 
Hudi: Large-Scale, Near Real-Time Pipelines at Uber with Nishith Agarwal and ...
Hudi: Large-Scale, Near Real-Time Pipelines at Uber with Nishith Agarwal and ...Hudi: Large-Scale, Near Real-Time Pipelines at Uber with Nishith Agarwal and ...
Hudi: Large-Scale, Near Real-Time Pipelines at Uber with Nishith Agarwal and ...Databricks
 

What's hot (20)

Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
 
Tajo: A Distributed Data Warehouse System for Hadoop
Tajo: A Distributed Data Warehouse System for HadoopTajo: A Distributed Data Warehouse System for Hadoop
Tajo: A Distributed Data Warehouse System for Hadoop
 
Large-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestLarge-scale Web Apps @ Pinterest
Large-scale Web Apps @ Pinterest
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great Taste
 
Building data pipelines with kite
Building data pipelines with kiteBuilding data pipelines with kite
Building data pipelines with kite
 
Evolving HDFS to a Generalized Distributed Storage Subsystem
Evolving HDFS to a Generalized Distributed Storage SubsystemEvolving HDFS to a Generalized Distributed Storage Subsystem
Evolving HDFS to a Generalized Distributed Storage Subsystem
 
Tez Shuffle Handler: Shuffling at Scale with Apache Hadoop
Tez Shuffle Handler: Shuffling at Scale with Apache HadoopTez Shuffle Handler: Shuffling at Scale with Apache Hadoop
Tez Shuffle Handler: Shuffling at Scale with Apache Hadoop
 
Ingesting hdfs intosolrusingsparktrimmed
Ingesting hdfs intosolrusingsparktrimmedIngesting hdfs intosolrusingsparktrimmed
Ingesting hdfs intosolrusingsparktrimmed
 
Apache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at CernerApache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at Cerner
 
HDFS Tiered Storage: Mounting Object Stores in HDFS
HDFS Tiered Storage: Mounting Object Stores in HDFSHDFS Tiered Storage: Mounting Object Stores in HDFS
HDFS Tiered Storage: Mounting Object Stores in HDFS
 
HBaseCon 2015: HBase and Spark
HBaseCon 2015: HBase and SparkHBaseCon 2015: HBase and Spark
HBaseCon 2015: HBase and Spark
 
Apache Kite
Apache KiteApache Kite
Apache Kite
 
HBase Data Modeling and Access Patterns with Kite SDK
HBase Data Modeling and Access Patterns with Kite SDKHBase Data Modeling and Access Patterns with Kite SDK
HBase Data Modeling and Access Patterns with Kite SDK
 
From docker to kubernetes: running Apache Hadoop in a cloud native way
From docker to kubernetes: running Apache Hadoop in a cloud native wayFrom docker to kubernetes: running Apache Hadoop in a cloud native way
From docker to kubernetes: running Apache Hadoop in a cloud native way
 
HBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestHBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at Pinterest
 
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWSHBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
 
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of GruterBig Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
 
Building robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumBuilding robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and Debezium
 
A Survey of HBase Application Archetypes
A Survey of HBase Application ArchetypesA Survey of HBase Application Archetypes
A Survey of HBase Application Archetypes
 
Hudi: Large-Scale, Near Real-Time Pipelines at Uber with Nishith Agarwal and ...
Hudi: Large-Scale, Near Real-Time Pipelines at Uber with Nishith Agarwal and ...Hudi: Large-Scale, Near Real-Time Pipelines at Uber with Nishith Agarwal and ...
Hudi: Large-Scale, Near Real-Time Pipelines at Uber with Nishith Agarwal and ...
 

Viewers also liked

Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)
Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)
Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)Gruter
 
SQL-on-Hadoop with Apache Tajo, and application case of SK Telecom
SQL-on-Hadoop with Apache Tajo,  and application case of SK TelecomSQL-on-Hadoop with Apache Tajo,  and application case of SK Telecom
SQL-on-Hadoop with Apache Tajo, and application case of SK TelecomGruter
 
Gruter_TECHDAY_2014_01_SearchEngine (in Korean)
Gruter_TECHDAY_2014_01_SearchEngine (in Korean)Gruter_TECHDAY_2014_01_SearchEngine (in Korean)
Gruter_TECHDAY_2014_01_SearchEngine (in Korean)Gruter
 
프로그래머를 꿈꾸는 학부 후배들에게
프로그래머를 꿈꾸는 학부 후배들에게프로그래머를 꿈꾸는 학부 후배들에게
프로그래머를 꿈꾸는 학부 후배들에게Matthew (정재화)
 
201210 그루터 빅데이터_플랫폼_아키텍쳐_및_솔루션_소개
201210 그루터 빅데이터_플랫폼_아키텍쳐_및_솔루션_소개201210 그루터 빅데이터_플랫폼_아키텍쳐_및_솔루션_소개
201210 그루터 빅데이터_플랫폼_아키텍쳐_및_솔루션_소개Gruter
 
Gruter TECHDAY 2014 MelOn BigData
Gruter TECHDAY 2014 MelOn BigDataGruter TECHDAY 2014 MelOn BigData
Gruter TECHDAY 2014 MelOn BigDataGruter
 
GRUTER가 들려주는 Big Data Platform 구축 전략과 적용 사례: GRUTER의 빅데이터 플랫폼 및 전략 소개
GRUTER가 들려주는 Big Data Platform 구축 전략과 적용 사례: GRUTER의 빅데이터 플랫폼 및 전략 소개GRUTER가 들려주는 Big Data Platform 구축 전략과 적용 사례: GRUTER의 빅데이터 플랫폼 및 전략 소개
GRUTER가 들려주는 Big Data Platform 구축 전략과 적용 사례: GRUTER의 빅데이터 플랫폼 및 전략 소개Gruter
 
대용량 분산 아키텍쳐 설계 #1 아키텍쳐 설계 방법론
대용량 분산 아키텍쳐 설계 #1 아키텍쳐 설계 방법론대용량 분산 아키텍쳐 설계 #1 아키텍쳐 설계 방법론
대용량 분산 아키텍쳐 설계 #1 아키텍쳐 설계 방법론Terry Cho
 

Viewers also liked (8)

Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)
Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)
Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)
 
SQL-on-Hadoop with Apache Tajo, and application case of SK Telecom
SQL-on-Hadoop with Apache Tajo,  and application case of SK TelecomSQL-on-Hadoop with Apache Tajo,  and application case of SK Telecom
SQL-on-Hadoop with Apache Tajo, and application case of SK Telecom
 
Gruter_TECHDAY_2014_01_SearchEngine (in Korean)
Gruter_TECHDAY_2014_01_SearchEngine (in Korean)Gruter_TECHDAY_2014_01_SearchEngine (in Korean)
Gruter_TECHDAY_2014_01_SearchEngine (in Korean)
 
프로그래머를 꿈꾸는 학부 후배들에게
프로그래머를 꿈꾸는 학부 후배들에게프로그래머를 꿈꾸는 학부 후배들에게
프로그래머를 꿈꾸는 학부 후배들에게
 
201210 그루터 빅데이터_플랫폼_아키텍쳐_및_솔루션_소개
201210 그루터 빅데이터_플랫폼_아키텍쳐_및_솔루션_소개201210 그루터 빅데이터_플랫폼_아키텍쳐_및_솔루션_소개
201210 그루터 빅데이터_플랫폼_아키텍쳐_및_솔루션_소개
 
Gruter TECHDAY 2014 MelOn BigData
Gruter TECHDAY 2014 MelOn BigDataGruter TECHDAY 2014 MelOn BigData
Gruter TECHDAY 2014 MelOn BigData
 
GRUTER가 들려주는 Big Data Platform 구축 전략과 적용 사례: GRUTER의 빅데이터 플랫폼 및 전략 소개
GRUTER가 들려주는 Big Data Platform 구축 전략과 적용 사례: GRUTER의 빅데이터 플랫폼 및 전략 소개GRUTER가 들려주는 Big Data Platform 구축 전략과 적용 사례: GRUTER의 빅데이터 플랫폼 및 전략 소개
GRUTER가 들려주는 Big Data Platform 구축 전략과 적용 사례: GRUTER의 빅데이터 플랫폼 및 전략 소개
 
대용량 분산 아키텍쳐 설계 #1 아키텍쳐 설계 방법론
대용량 분산 아키텍쳐 설계 #1 아키텍쳐 설계 방법론대용량 분산 아키텍쳐 설계 #1 아키텍쳐 설계 방법론
대용량 분산 아키텍쳐 설계 #1 아키텍쳐 설계 방법론
 

Similar to Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)

Tajo_Meetup_20141120
Tajo_Meetup_20141120Tajo_Meetup_20141120
Tajo_Meetup_20141120Hyoungjun Kim
 
Apache Tajo - BWC 2014
Apache Tajo - BWC 2014Apache Tajo - BWC 2014
Apache Tajo - BWC 2014Gruter
 
Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016StampedeCon
 
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdf
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdfDataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdf
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdfMiguel Angel Fajardo
 
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...Hadoop / Spark Conference Japan
 
Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: What's New in HDP 2.3Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: What's New in HDP 2.3Hortonworks
 
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data AnalyticsSupersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analyticsmason_s
 
Impala tech-talk by Dimitris Tsirogiannis
Impala tech-talk by Dimitris TsirogiannisImpala tech-talk by Dimitris Tsirogiannis
Impala tech-talk by Dimitris TsirogiannisFelicia Haggarty
 
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...Cloudera, Inc.
 
Simplifying Hadoop: A Secure and Unified Data Access Path for Computer Framew...
Simplifying Hadoop: A Secure and Unified Data Access Path for Computer Framew...Simplifying Hadoop: A Secure and Unified Data Access Path for Computer Framew...
Simplifying Hadoop: A Secure and Unified Data Access Path for Computer Framew...Dataconomy Media
 
Apache Hive for modern DBAs
Apache Hive for modern DBAsApache Hive for modern DBAs
Apache Hive for modern DBAsLuis Marques
 
Big data, just an introduction to Hadoop and Scripting Languages
Big data, just an introduction to Hadoop and Scripting LanguagesBig data, just an introduction to Hadoop and Scripting Languages
Big data, just an introduction to Hadoop and Scripting LanguagesCorley S.r.l.
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaCloudera, Inc.
 
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...Jen Aman
 
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...Sumeet Singh
 
Tez: Accelerating Data Pipelines - fifthel
Tez: Accelerating Data Pipelines - fifthelTez: Accelerating Data Pipelines - fifthel
Tez: Accelerating Data Pipelines - fifthelt3rmin4t0r
 
What it takes to run Hadoop at Scale: Yahoo! Perspectives
What it takes to run Hadoop at Scale: Yahoo! PerspectivesWhat it takes to run Hadoop at Scale: Yahoo! Perspectives
What it takes to run Hadoop at Scale: Yahoo! PerspectivesDataWorks Summit
 
20150704 benchmark and user experience in sahara weiting
20150704 benchmark and user experience in sahara weiting20150704 benchmark and user experience in sahara weiting
20150704 benchmark and user experience in sahara weitingWei Ting Chen
 
Hybrid architecture integrateduserviewdata-peyman_mohajerian
Hybrid architecture integrateduserviewdata-peyman_mohajerianHybrid architecture integrateduserviewdata-peyman_mohajerian
Hybrid architecture integrateduserviewdata-peyman_mohajerianData Con LA
 

Similar to Gruter_TECHDAY_2014_03_ApacheTajo (in Korean) (20)

Tajo_Meetup_20141120
Tajo_Meetup_20141120Tajo_Meetup_20141120
Tajo_Meetup_20141120
 
Apache Tajo - BWC 2014
Apache Tajo - BWC 2014Apache Tajo - BWC 2014
Apache Tajo - BWC 2014
 
Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016
 
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdf
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdfDataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdf
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdf
 
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
 
Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: What's New in HDP 2.3Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: What's New in HDP 2.3
 
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data AnalyticsSupersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
 
Apache Eagle - Monitor Hadoop in Real Time
Apache Eagle - Monitor Hadoop in Real TimeApache Eagle - Monitor Hadoop in Real Time
Apache Eagle - Monitor Hadoop in Real Time
 
Impala tech-talk by Dimitris Tsirogiannis
Impala tech-talk by Dimitris TsirogiannisImpala tech-talk by Dimitris Tsirogiannis
Impala tech-talk by Dimitris Tsirogiannis
 
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
 
Simplifying Hadoop: A Secure and Unified Data Access Path for Computer Framew...
Simplifying Hadoop: A Secure and Unified Data Access Path for Computer Framew...Simplifying Hadoop: A Secure and Unified Data Access Path for Computer Framew...
Simplifying Hadoop: A Secure and Unified Data Access Path for Computer Framew...
 
Apache Hive for modern DBAs
Apache Hive for modern DBAsApache Hive for modern DBAs
Apache Hive for modern DBAs
 
Big data, just an introduction to Hadoop and Scripting Languages
Big data, just an introduction to Hadoop and Scripting LanguagesBig data, just an introduction to Hadoop and Scripting Languages
Big data, just an introduction to Hadoop and Scripting Languages
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
 
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
Hadoop Summit San Jose 2015: What it Takes to Run Hadoop at Scale Yahoo Persp...
 
Tez: Accelerating Data Pipelines - fifthel
Tez: Accelerating Data Pipelines - fifthelTez: Accelerating Data Pipelines - fifthel
Tez: Accelerating Data Pipelines - fifthel
 
What it takes to run Hadoop at Scale: Yahoo! Perspectives
What it takes to run Hadoop at Scale: Yahoo! PerspectivesWhat it takes to run Hadoop at Scale: Yahoo! Perspectives
What it takes to run Hadoop at Scale: Yahoo! Perspectives
 
20150704 benchmark and user experience in sahara weiting
20150704 benchmark and user experience in sahara weiting20150704 benchmark and user experience in sahara weiting
20150704 benchmark and user experience in sahara weiting
 
Hybrid architecture integrateduserviewdata-peyman_mohajerian
Hybrid architecture integrateduserviewdata-peyman_mohajerianHybrid architecture integrateduserviewdata-peyman_mohajerian
Hybrid architecture integrateduserviewdata-peyman_mohajerian
 

Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)

  • 1. Apache Tajo : A Big Data Warehouse System on Hadoop Jaehwa Jung Research Director Gruter TECHDAY 2014
  • 2. ©2014 Gruter. All rights reserved. About me
      • Bigdata Platform, Gruter Inc (http://www.gruter.com)
      • Apache Tajo Committer
      • jhjung@gruter.com
      • http://blrunner.com
      • Author of: 시작하세요! 하둡 프로그래밍 ("Getting Started with Hadoop Programming")
  • 3. Agenda
      • Introduction to SQL-on-Hadoop
      • Introduction to Apache Tajo
      • What you can do with Tajo?
      • Why you should use Tajo?
      • Current Tajo Status
      • Use Cases
      • Demonstration
  • 5. Hadoop Overview
      • MapReduce (distributed computation)
      • HDFS (distributed storage)
      • Source: http://www.quuxlabs.com/wp-content/uploads/2010/08/Yahoo-hadoop-cluster_OSCON_2007.jpg
  • 6. SQL-on-Hadoop Overview
      • Systems that use SQL to process data stored in HDFS
      • A move away from the MapReduce model
      • Diverse design goals: data warehouse vs. query engine
  • 8.
  • 9.
  • 10. Apache Tajo Overview
      • A big data warehouse system on Hadoop
      • Apache Top-level project since March 2014
      • Supports SQL standards
      • Features
        – Powerful distributed processing architecture (not MapReduce)
        – Advanced query optimization algorithms and techniques
        – Long-running queries: for many hours
        – Interactive analysis queries: from 100 milliseconds
      • Recent 0.9.0 release
  • 11. Tajo Architecture [diagram]
      • Client: JDBC, TSql, Web UI
      • Master Server: TajoMaster, with a CatalogStore (DBMS, HCatalog)
      • Slave Servers (three shown): each runs a TajoWorker with a QueryMaster, Local Query Engine, and StorageManager over the local file system and HDFS
      • Arrow labels: Submit a query, Manage metadata, Allocate a query, Run & monitor a query
  • 12. What You Can Do with Tajo?
  • 13. Commercial Data Warehouse [diagram]
      • Source Data: OLTP, CRM, ERP, e-commerce, other
      • Data Warehouse: ODS (Operational Data Store), Data Warehouse, Data Mart, connected by ETL steps
      • Front-End Analytics: Reports, OLAP, Visualization, Data Mining
  • 14. Hadoop-based Data Warehouse with Tajo [diagram]
      • We can do ETL and Interactive Analytics!
      • Source Data: OLTP, CRM, ERP, e-commerce, other
      • Data Warehouse: ODS (Operational Data Store), Data Warehouse, Data Mart, connected by ETL steps
      • Front-End Analytics: Reports, OLAP, Visualization, Data Mining
  • 15. Why You Should Use Tajo?
  • 16. Mature SQL Feature Set
      • Fully distributed query execution
        – Inner join and left/right/full outer join
        – Group by, sort, multiple distinct aggregation
        – Window functions
      • SQL data types
        – CHAR, BOOL, INT, DOUBLE, TEXT, DATE, etc.
      • Various file formats
        – Text file (CSV), SequenceFile, RCFile, Parquet, Avro
      • SQL standards
        – Non-standard features: PgSQL and Oracle
  • 17. Performance
      • Faster than Hive 0.10 (1.5 – 10 times): http://slidesha.re/1yTBTaa
      • Data set: TPC-H scale 100 or 1000
      • H/W: 1 master + 6 data nodes
        – CPU: 24 cores (Xeon 2.5GHz, HT)
        – Memory: 64GB
        – Disk: 3TB * 6 SATA/HDD (7200 RPM)
        – Network: 10Gb
      • S/W
        – Hadoop: cdh-4.3.0
        – Hive: 0.10.0-cdh4.3.0
        – Impala: impalad_version_1.1.1_RELEASE
        – Tajo: 0.2-SNAPSHOT
  • 18. Performance: Q1 – filter scan
      select l_returnflag, l_linestatus,
             sum(l_quantity) as sum_qty,
             sum(l_extendedprice) as sum_base_price,
             sum(l_extendedprice*(1-l_discount)) as sum_disc_price,
             sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge,
             avg(l_quantity) as avg_qty,
             avg(l_extendedprice) as avg_price,
             avg(l_discount) as avg_disc,
             count(*) as count_order
      from lineitem
      where l_shipdate <= '1998-09-01'
      group by l_returnflag, l_linestatus
      order by l_returnflag, l_linestatus
      [Chart] Q1: scan using about 20 text pattern matching filters. Values in legend order (Hive, Impala, Tajo): 1445.69, 895.96, 789.09
  • 19. Performance: Q2 – unions and joins
      create table nation_region as
        select n_regionkey, r_regionkey, n_nationkey, n_name, r_name
        from region join nation on n_regionkey = r_regionkey
        where r_name = 'EUROPE';
      create table r2_1 as
        select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment, ps_supplycost
        from nation_region
          join supplier on s_nationkey = n_nationkey
          join partsupp on s_suppkey = ps_suppkey
          join part on p_partkey = ps_partkey
        where p_size = 15 and p_type like '%BRASS';
      create table r2_2 as
        select p_partkey, min(ps_supplycost) as min_ps_supplycost
        from r2_1
        group by p_partkey;
      select s_acctbal, s_name, n_name, r2_1.p_partkey, p_mfgr, s_address, s_phone, s_comment
      from r2_1 join r2_2 on r2_1.p_partkey = r2_2.p_partkey
      where ps_supplycost = min_ps_supplycost
      order by s_acctbal, n_name, s_name, r2_1.p_partkey;
      [Chart] Q2: 7 unions with joins. Values in legend order (Hive, Impala, Tajo): 63.64, 9.11, 38.64
  • 20. Performance: Q3 – join
      select l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, o_shippriority
      from customer as c
        join orders as o on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey
        join lineitem as l on l.l_orderkey = o.o_orderkey
      where o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15'
      group by l_orderkey, o_orderdate, o_shippriority
      order by revenue desc, o_orderdate;
      [Chart] Q3: join. Values in legend order (Hive, Impala, Tajo): 101.45, 36.81, 31.92
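As a rough reading of the three benchmark charts, the ratios between engines can be computed directly. This is only a sketch: it assumes that the three values on each chart appear in legend order (Hive, Impala, Tajo) and that they are elapsed times in the same unit, which the slides do not state. The helper name `speedup_over` is illustrative.

```python
# Chart values copied from the Q1-Q3 slides. The (Hive, Impala, Tajo)
# assignment is an assumption based on the legend order; units are not stated.
timings = {
    "Q1": {"Hive": 1445.69, "Impala": 895.96, "Tajo": 789.09},
    "Q2": {"Hive": 63.64, "Impala": 9.11, "Tajo": 38.64},
    "Q3": {"Hive": 101.45, "Impala": 36.81, "Tajo": 31.92},
}


def speedup_over(baseline, engine="Tajo"):
    """Return {query: baseline_time / engine_time}, rounded to 2 decimals."""
    return {
        query: round(times[baseline] / times[engine], 2)
        for query, times in timings.items()
    }


if __name__ == "__main__":
    print("Tajo vs Hive:  ", speedup_over("Hive"))
    print("Tajo vs Impala:", speedup_over("Impala"))
```

Under that reading, Tajo comes out between roughly 1.6 and 3.2 times faster than Hive on these three queries, while Impala is faster than Tajo on Q2.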
  • 21. Simple Operation and Software Stack
      • Simple installation and operation
        – http://tajo.apache.org/docs/current/getting_started.html
      • Simple software stack requirement
        – No MapReduce and no Tez
        – YARN is supported but not mandatory
        – Tajo + Linux system for a single-node cluster
        – Tajo + HDFS for a distributed cluster
  • 22. Simple Integration
      • Integration with the Hadoop ecosystem
        – Supports Hadoop 2.2.0 – 2.5.1
        – Can connect to the Hive Metastore
        – Directly processes tables managed by Hive
      • YARN support (backport)
        – Enables Tajo to deploy and run on a YARN cluster
        – Allows users to add nodes to or remove nodes from a Tajo cluster at runtime
  • 23. Active Open Source Community
      • Fully community-driven open source
      • Stable development team
        – 17 committers + many contributors
  • 25. Join
      • Join
        – NATURAL, INNER, OUTER (LEFT, RIGHT, FULL)
        – SEMI, ANTI join (planned for v0.9)
      • Join predicates
        – WHERE and ON predicates
        – De facto standard outer join behavior with both predicates
      SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num WHERE t2.value = 'xxx';
      SELECT * FROM t1 LEFT JOIN t2 WHERE t1.num = t2.num and t2.value = 'xxx';
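The difference between ON and WHERE predicates in an outer join can be illustrated with a small, self-contained example. This sketch uses Python's built-in SQLite as a stand-in engine, not Tajo, and the sample rows are invented for illustration; it shows the standard SQL semantics that the slide says Tajo follows.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this does not exercise Tajo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (num INTEGER);
    CREATE TABLE t2 (num INTEGER, value TEXT);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (1, 'xxx'), (2, 'yyy');
""")

# Filter in WHERE: evaluated after the outer join, so rows whose t2.value is
# NULL or different from 'xxx' are removed from the result.
where_rows = conn.execute("""
    SELECT t1.num, t2.value
    FROM t1 LEFT JOIN t2 ON t1.num = t2.num
    WHERE t2.value = 'xxx'
    ORDER BY t1.num
""").fetchall()

# Filter in ON: part of the join condition, so every t1 row is preserved and
# non-matching rows are padded with NULL.
on_rows = conn.execute("""
    SELECT t1.num, t2.value
    FROM t1 LEFT JOIN t2 ON t1.num = t2.num AND t2.value = 'xxx'
    ORDER BY t1.num
""").fetchall()

if __name__ == "__main__":
    print("WHERE filter:", where_rows)
    print("ON filter:   ", on_rows)
```

With the filter in WHERE, the left outer join effectively behaves like an inner join; with the filter in ON, all three rows of `t1` survive.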
  • 26. Window Function
      • OVER clause
        – row_number() and rank()
        – Aggregation function support
        – PARTITION BY and ORDER BY clauses
      SELECT depname, empno, salary, enroll_date
      FROM (
        SELECT depname, empno, salary, enroll_date,
               rank() OVER (PARTITION BY depname ORDER BY salary DESC, empno) AS pos
        FROM empsalary
      ) AS ss
      WHERE pos < 3;
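To see what the window query on this slide returns, it can be run against a few sample rows. The sketch below again uses SQLite (version 3.25 or later is needed for window functions) rather than Tajo; the `empsalary` rows are invented, the `enroll_date` column is omitted for brevity, and an outer ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# SQLite stands in for Tajo here; window functions require SQLite >= 3.25.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE empsalary (depname TEXT, empno INTEGER, salary INTEGER);
    INSERT INTO empsalary VALUES
        ('develop', 7, 4200), ('develop', 8, 6000), ('develop', 9, 4500),
        ('develop', 10, 5200), ('develop', 11, 5200),
        ('sales', 1, 5000), ('sales', 3, 4800), ('sales', 4, 4800);
""")

# Top two earners per department: rank() numbers rows within each depname
# partition by salary (highest first), with empno as the tie-breaker.
top_two = conn.execute("""
    SELECT depname, empno, salary
    FROM (
        SELECT depname, empno, salary,
               rank() OVER (PARTITION BY depname
                            ORDER BY salary DESC, empno) AS pos
        FROM empsalary
    ) AS ss
    WHERE pos < 3
    ORDER BY depname, pos
""").fetchall()

if __name__ == "__main__":
    for row in top_two:
        print(row)
```

Because `empno` breaks salary ties, `rank()` never assigns the same position twice in this data, so exactly two rows per department are returned.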
  • 27. Table Partitions
      • Column value partition
        – Hive-compatible partition
      CREATE TABLE T1 (C1 INT, C2 TEXT) USING PARQUET
        WITH ('parquet.compression' = 'SNAPPY')
        PARTITION BY COLUMN (C3 INT, C4 TEXT);
      • Range partition (planned for 1.0)
        – Table will be partitioned by disjoint ranges.
        – Will remove the partition granularity problem of Hive partitions
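A Hive-compatible column value partition stores each distinct combination of partition column values in its own `column=value` directory. The following is a minimal sketch of that layout, not Tajo's implementation: the function names and the warehouse path are hypothetical, and partition values are assumed to be path-safe (no escaping is applied).

```python
from collections import defaultdict


def partition_path(table_dir, partition_columns, row):
    """Build a Hive-style path such as <table_dir>/c3=10/c4=kr for one row."""
    parts = [f"{col}={row[col]}" for col in partition_columns]
    return "/".join([table_dir.rstrip("/")] + parts)


def group_by_partition(table_dir, partition_columns, rows):
    """Group rows by partition directory.

    Partition column values are encoded in the directory name, so only the
    remaining columns are kept as file data.
    """
    groups = defaultdict(list)
    for row in rows:
        path = partition_path(table_dir, partition_columns, row)
        data = {k: v for k, v in row.items() if k not in partition_columns}
        groups[path].append(data)
    return dict(groups)


if __name__ == "__main__":
    rows = [
        {"c1": 1, "c2": "a", "c3": 10, "c4": "kr"},
        {"c1": 2, "c2": "b", "c3": 10, "c4": "kr"},
        {"c1": 3, "c2": "c", "c3": 20, "c4": "us"},
    ]
    # "/tajo/warehouse/t1" is a made-up table location.
    for path, data in group_by_partition("/tajo/warehouse/t1", ["c3", "c4"], rows).items():
        print(path, data)
```

A query that filters on `c3` or `c4` then only needs to read the matching directories.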
  • 28. Comparison with other platforms (1/2)
      Function            | Tajo                           | Hive                                | Impala                         | Spark
      Computing           | Native                         | MapReduce or Tez                    | Native                         | Native
      Resource Management | Native or YARN                 | YARN                                | Native                         | Native or YARN
      Scheduler           | FIFO, Fair                     | FIFO, Fair, Capacity                | FIFO, Fair                     | FIFO, Fair
      Storage             | HDFS, S3, HBase                | HDFS, HBase, S3                     | HDFS, HBase                    | Native RDD (HDFS, etc.)
      File Format         | CSV, RC, Parquet, Avro, etc.   | CSV, RC, ORC, Parquet, Avro, etc.   | CSV, RC, Parquet, Avro, etc.   | CSV, RC, Parquet, Avro, etc.
      Data Model          | Relational                     | Relational                          | Relational                     | Relational
      Query               | ANSI-SQL                       | HiveQL                              | HiveQL                         | HiveQL
  • 29. Comparison with other platforms (2/2)
      Function                  | Tajo | Hive | Impala | Spark
      Implementation Language   | Java | Java | C++ | Scala
      Client                    | Java API, JDBC, CLI | CLI, JDBC, ODBC, Thrift Server API | CLI, JDBC, ODBC | Shark JDBC/ODBC, Scala, Java, Python API
      Query Latency             | Long run, Interactive | Long run, (Interactive-Tez) | Interactive | Interactive
      Computing Characteristics | Data on disk; intermediate data uses both memory and disk | Data on disk; intermediate data uses both memory and disk | Intermediate data in memory (on-disk recently supported) | Data to be analyzed is loaded into memory
      License                   | Apache | Apache | Apache | Apache
      Main Sponsor              | Gruter | Hortonworks | Cloudera | Databricks
  • 30. Future Works
      • 2014 4Q
        – HBase integration
        – IN/EXISTS subquery
        – User-defined functions
        – Multi-tenant scheduler
      • 2015 1Q
        – Authentication and standard access control
        – Scalar subquery
        – ROLLUP, CUBE
      • 2015 2Q
        – Vectorized engine (C++ operator)
        – TajoR
  • 32. Replace Commercial Data Warehouse (SKT)
      • ETL processing: 120+ queries, ~4TB read/day
      • OLAP processing: 500+ queries
      • [Diagram] Stages: Operational Systems, Integration Layer, Data Warehouse, Data Mart. Components: Marketing, Sales, ERP, SCM, ODS, Staging Area, Strategic Marts, Data Vault
  • 33. Tajo-as-a-Service on AWS
  • 35. TSql & Web UI
      • Watch this video for Apache Tajo: http://www.youtube.com/watch?v=bFGjMLPEDq0
  • 36. Get Involved!
      • We are recruiting contributors!
      • General
        – http://tajo.apache.org
      • Korean Tajo User Group
        – https://groups.google.com/forum/?hl=ko#!forum/tajo-user-kr
      • Getting Started
        – http://tajo.apache.org/docs/0.9.0/getting_started.html
      • Downloads
        – http://tajo.apache.org/docs/0.9.0/getting_started/downloading_source.html
      • Jira (issue tracker)
        – https://issues.apache.org/jira/browse/TAJO
      • Join the mailing lists
        – dev-subscribe@tajo.apache.org
        – issues-subscribe@tajo.apache.org
  • 37. GRUTER: YOUR PARTNER IN THE BIG DATA REVOLUTION
      • Phone: +82-70-8129-2950
      • Fax: +82-70-8129-2952
      • E-mail: contact@gruter.com
      • Web: www.gruter.com
      • Phone: +1-415-841-3345