db tech showcase 2014 A14: The True BI Performance You Get with Actian Vector

Copyright © 2014 Insight Technology, Inc. All Rights Reserved.
Insight Technology, Inc.
Koji Shinkubo (新久保浩二)
The True BI Performance You Get with Actian Vector
- It is not too late. Isn't it about time we put an end to slow BI? -

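Today's Talk
Analytic databases have become much faster in recent years, and demand for analytic processing has grown with them. In BI especially, large volumes of data can now be processed at speeds that used to be unthinkable.
But please do not lump all of these so-called "analytic" databases together.
What problems does Actian Vector, the analytic database introduced today, set out to solve, and how does it solve them?
We look at the key technology behind it and at recent trends in other databases.
The second half looks at solutions that build further on fast analytic database processing, such as high-speed integration with Hadoop and other data sources, and real-time integration with existing core business databases.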

Who is Actian?
- 1970s: Started as Ingres
- 2010: Vectorwise (Actian Vector™) released
- 2011: Company renamed to Actian
- 2012: Acquired Versant
- 2013: Acquired Pervasive and ParAccel
- 2014: Actian Analytics Platform™ released

What is Actian Vector?
[Diagram: Actian Analytics Platform™]
- Stages: Connect, Analyze, Act
- Actian Analytics Accelerators: Accelerate Hadoop, Accelerate Analytics, Accelerate BI
- Data sources: Enterprise Applications, Data Warehouse, Social, Internet of Things, SaaS, WWW, Machine Data, Mobile, NoSQL, Traditional
- Outcomes: World-Class Risk Management, Competitive Advantage, Customer Delight, Disruptive New Business Models
- Engines: DataFlow, Matrix, Vector

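History of Actian Vector
Timeline: 2002, 2008, 2010, 2011, 2014
- MonetDB: an analytic RDBMS (MonetDB/SQL)
  - Structure close to in-memory
  - Columnar
- MonetDB/X100: a research project aiming to run 100x faster than a conventional database
  - Based on MonetDB
  - Vector Processing
  - Smarter Compression
  - CPU cache optimization
- Integration of the Vectorwise x100 engine with the Ingres Database
- 2011: Took an overwhelming first place in the TPC-H benchmark on its debut
- Today: the latest version is 3.5. The next version, Vector in Hadoop (Vortex), with native Hadoop support, was announced at the recent Hadoop Summit.
http://homepages.cwi.nl/~boncz/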

Third-Party Ecosystem
[Diagram: partner categories around Vector: DataFlow, BI, Replication, ETL]

What Problems Do Databases Face?

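Inside the Computer
Low CPU "efficiency"
- Traditional database designs execute too many CPU instructions. Now that CPU clock speeds are no longer rising much, this has become a problem (in short, a CPU bottleneck).
How it looks in practice
- Parallelized processes/threads appear to use 100% of the CPU, but the instructions actually executed may be dominated by inefficient (non-modern) code, by speculative execution (for example, branch hazards caused by many branches), and by wasted prefetches.
Relatively slow I/O
- I/O to memory and to disk (SSD, HDD) is far slower than the CPU (even memory is tens to hundreds of times slower than the CPU clock). With the arrival of in-memory databases, the slowness of memory I/O is drawing attention as well (in short, the CPU sits idle).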

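Challenge Areas for Databases
Goals
- Fewer CPU instructions and more efficient use of the CPU
- After disk, faster access to memory
Approaches
- Reduce the instruction count with different algorithms and with hardware acceleration
- Change the storage structure to reduce instructions
- Reduce branches along with instructions (software that is friendly to CPU branch prediction)
- Access memory efficiently when using hardware acceleration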
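
To make the "fewer branches" idea concrete, here is a minimal sketch in Python. It is not Actian Vector's implementation; the function names and the tiny `VECTOR_SIZE` are hypothetical and chosen only for illustration. The first function evaluates a filter one row at a time with a data-dependent `if`. The second processes the columns in small batches and turns the predicate into a 0/1 mask that is multiplied into the sum, so the accumulation step itself has no per-row branch. Python's interpreter still branches internally, so this shows the structure of the technique, not its speed.

```python
VECTOR_SIZE = 4  # hypothetical batch size; real engines use much larger vectors


def sum_tuple_at_a_time(prices, quantities, limit):
    """Row-at-a-time: one data-dependent branch per row."""
    total = 0
    for price, quantity in zip(prices, quantities):
        if quantity < limit:  # branch outcome depends on the data
            total += price
    return total


def sum_vector_at_a_time(prices, quantities, limit, vector_size=VECTOR_SIZE):
    """Batch-at-a-time: the predicate becomes a 0/1 mask, not a branch."""
    total = 0
    for start in range(0, len(prices), vector_size):
        price_vec = prices[start:start + vector_size]
        qty_vec = quantities[start:start + vector_size]
        # Comparison result used as arithmetic (True -> 1, False -> 0).
        mask = [int(q < limit) for q in qty_vec]
        total += sum(p * m for p, m in zip(price_vec, mask))
    return total


prices = [100, 200, 300, 400, 500, 600]
quantities = [10, 30, 23, 24, 5, 50]
print(sum_tuple_at_a_time(prices, quantities, 24))
print(sum_vector_at_a_time(prices, quantities, 24))
```

Both functions return the same result; the difference is only in how the work is organized, which is what matters for branch prediction and for the SIMD-style execution discussed later.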

Challenge Areas for Databases (Oracle)
Jonathan Lewis – Oracle Scratchpad
“12c In-memory
I wrote a note about the 12c “In-Memory” option some time ago on the OTN Database forum and thought I’d posted a link to it
from the blog. If I have I can’t find it now so, to avoid losing it, here’s a copy of the comments I made:
Juan Loaiza’s presentation is probably available on the Oracle site by now, but in outline: the in-memory component duplicates
data (specified tables – perhaps with a restriction to a subset of columns) in columnar format in a dedicated area of the SGA.
The data is kept up to date in real time, but Oracle doesn’t use undo or redo to maintain this copy of the data because it’s never
persisted to disc in this form, it’s recreated in-memory (by a background process) if the instance restarts. The optimizer can
then decide whether it would be faster to use a columnar or row-based approach to address a query.
The intent is to help systems which are mixed OLTP and DSS – which sometimes have many “extra” indexes to optimise DSS
queries that affect the performance of the OLTP updates. With the in-memory columnar copy you should be able to drop many
“DSS indexes”, thus improving OLTP response times – in effect the in-memory stuff behaves a bit like non-persistent bitmap
indexing.
Updated 18th Oct:
I’ve been reminded that I think the presentation also included some comments about the way that the code also takes
advantage of “vector” (SIMD) instructions at the CPU level to allow the code to evaluate predicates on multiple rows (extracted
from the column store, not the row store) simultaneously, and this contributes to the very high rates of data scanning that
Oracle Corp. claims.
The presentation from Juan Loaiza was still unavailable at the time of publishing this blog note (3rd Nov 2013). If it does
become available as part of the Open World set of presentations it should be at this URL.
”
http://jonathanlewis.wordpress.com/2013/11/06/12c-in-memory/

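Challenge Areas for Databases (SQL Server)
SQL Server – Column Store Index
http://msdn.microsoft.com/ja-jp/library/gg492088.aspx
A new query execution mechanism called batch-mode execution has been added to SQL Server, and it greatly reduces CPU usage. Batch-mode execution is tightly integrated with the columnstore storage format and is optimized for it. Batch-mode execution is sometimes also called vector-based or vectorized execution.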

Challenge Areas for Databases (DB2)
IBM DB2 10.5 with BLU Acceleration
http://public.dhe.ibm.com/common/ssi/ecm/en/imd14435usen/IMD14435USEN.PDF

Challenge Areas for Databases (HANA)
The SAP HANA Database – An Architecture Overview
“4 Analytical Query Processing
As generally agreed, column-stores are well suited for analytical queries on massive amounts of data [1]. For high read performance the SAP HANA DB’s column-store uses efficient compression schemes in combination with cache-aware and parallel algorithms. Every column is compressed with the help of a sorted dictionary, i.e., each value is mapped to an integer value (the valueID). These valueIDs are further bit-packed and compressed.
By resorting the rows in a table, the most beneficial compression (e.g., run-length encoding (RLE), sparse coding, or cluster coding) for the columns of this table can be used [11, 12]. Compressing data does not only allow to keep more data on a single node, but it also allows for faster query processing, e.g., by exploiting the RLE to compute aggregates. Scans are accelerated by excessively using SIMD algorithms working directly on the compressed data [16].
”
http://sites.computer.org/debull/A12mar/hana.pdf
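
The quoted passage describes sorted-dictionary encoding followed by run-length encoding, and computing aggregates directly on the RLE form. The following is a minimal sketch of that idea in Python, not SAP HANA's implementation: the function names are hypothetical, and bit-packing of the valueIDs is left out.

```python
def dictionary_encode(values):
    """Map each value to an integer valueID via a sorted dictionary."""
    dictionary = sorted(set(values))
    value_id = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [value_id[value] for value in values]


def run_length_encode(value_ids):
    """Collapse consecutive equal valueIDs into (valueID, run length) pairs."""
    runs = []
    for vid in value_ids:
        if runs and runs[-1][0] == vid:
            runs[-1][1] += 1
        else:
            runs.append([vid, 1])
    return [(vid, length) for vid, length in runs]


def sum_from_rle(dictionary, runs):
    """Aggregate on the compressed form: one multiply per run, not per row."""
    return sum(dictionary[vid] * length for vid, length in runs)


# Sorting (re-sorting the rows) makes equal values adjacent, so runs get long.
column = sorted([30, 10, 20, 10, 30, 30, 10, 20, 30, 10])
dictionary, ids = dictionary_encode(column)
runs = run_length_encode(ids)
print(dictionary, runs, sum_from_rle(dictionary, runs))
```

With ten rows and three distinct values, the sum needs three multiplications instead of ten additions, which is the sense in which compression can also reduce the instruction count.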

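What Is SIMD?
SIMD (Single Instruction Multiple Data)
Introduced with the Pentium III as SSE (Streaming SIMD Extensions), and extended to Intel AVX (Advanced Vector Extensions) from Sandy Bridge onward.
[Figure: a single instruction applied to multiple data elements at once, producing multiple outputs. Labels: Instruction, Data, Output]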
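
As a rough illustration of why SIMD reduces the instruction count, the sketch below simulates it in Python. Python does not expose SIMD registers, so this is only a model: `LANES` and `simd_multiply` are hypothetical names, and the assumption is a 256-bit AVX register holding eight 32-bit values. Each loop iteration stands for one SIMD instruction that operates on up to eight elements.

```python
LANES = 8  # assumption: one 256-bit AVX register holding eight 32-bit values


def simd_multiply(a, b, lanes=LANES):
    """Multiply two equal-length columns, `lanes` elements per simulated instruction.

    Returns the result column and the number of simulated SIMD instructions.
    """
    out = []
    instructions = 0
    for i in range(0, len(a), lanes):
        # One simulated instruction: the same operation on every lane.
        out.extend(x * y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
        instructions += 1
    return out, instructions


values = list(range(20))
factors = [2] * 20
result, simd_instructions = simd_multiply(values, factors)
scalar_instructions = len(values)  # one multiply per element without SIMD
print(result)
print(scalar_instructions, simd_instructions)
```

For 20 elements the scalar version needs 20 multiplies, while the simulated SIMD version issues 3 (two full groups of eight and one partial group of four).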

Database Instruction Counts
Databases compared:
- Row: Row Database A, Row Database B
- Column: Column Database A, Column Database B
- In-memory: In-Memory Database A

Database Instruction Counts
OS:               RHEL 6.5 x86_64
CPU:              16 cores (Xeon E5-2680 2.7GHz, 8 cores x 2)
Memory:           512 GB
Disk:             1.3 TB (120 GB x 22, RAID10)
Benchmark data:   TPC-H @ 100 GB
Benchmark query:
    select
        sum(l_extendedprice * l_discount) as revenue
    from
        lineitem
    where
        l_shipdate >= date '1996-01-01'
        and l_shipdate < date '1996-01-01' + interval '1' year
        and l_discount between 0.02 - 0.01 and 0.02 + 0.01
        and l_quantity < 24
The query scans about 80 GB, 600 million rows of data.
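
The benchmark query has the shape of TPC-H Q6: it touches only four columns of lineitem and applies three filters before one aggregate. The sketch below shows how a column-oriented engine might evaluate it, narrowing a list of qualifying row positions (a selection vector) one predicate at a time. This is an assumption-laden illustration in Python, not how Vector or any benchmarked database executes the query; `q6_revenue` and the sample rows are invented for the example.

```python
from datetime import date
from decimal import Decimal


def q6_revenue(l_shipdate, l_discount, l_quantity, l_extendedprice):
    """Evaluate the benchmark query over four column arrays."""
    date_lo, date_hi = date(1996, 1, 1), date(1997, 1, 1)
    disc_lo = Decimal("0.02") - Decimal("0.01")
    disc_hi = Decimal("0.02") + Decimal("0.01")

    # Selection vector: row positions that survive each predicate in turn.
    sel = [i for i in range(len(l_shipdate)) if date_lo <= l_shipdate[i] < date_hi]
    sel = [i for i in sel if disc_lo <= l_discount[i] <= disc_hi]
    sel = [i for i in sel if l_quantity[i] < 24]

    # Only now touch l_extendedprice, and only for qualifying rows.
    return sum((l_extendedprice[i] * l_discount[i] for i in sel), Decimal("0"))


# Invented sample rows; only rows 0 and 3 satisfy all three predicates.
l_shipdate = [date(1996, 3, 1), date(1995, 12, 31), date(1996, 6, 15),
              date(1996, 12, 31), date(1996, 7, 1), date(1997, 1, 1)]
l_discount = [Decimal("0.02"), Decimal("0.02"), Decimal("0.05"),
              Decimal("0.03"), Decimal("0.01"), Decimal("0.02")]
l_quantity = [10, 10, 5, 23, 24, 1]
l_extendedprice = [Decimal("1000.00"), Decimal("500.00"), Decimal("800.00"),
                   Decimal("200.00"), Decimal("300.00"), Decimal("100.00")]

print(q6_revenue(l_shipdate, l_discount, l_quantity, l_extendedprice))
```

A row store has to read every column of every row to answer the same question, which is one reason the instruction counts differ so much between database types.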

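Response Time, Simplified
Rt = Instructions / (IPC * Hz * Parallelism)
* Rt : query response time
* Instructions : number of CPU instructions the query executes
* IPC (Instructions Per Cycle) : CPU execution efficiency
* Hz : CPU clock speed
* Parallelism : degree of parallelism of query execution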
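
A short worked example of the formula, written as a Python function. The input values are illustrative assumptions only (an IPC of 2.0 and all 16 cores of a 2.7 GHz machine used in parallel); they are not measurements of any particular database.

```python
def response_time(instructions, ipc, hz, parallelism):
    """Rt = Instructions / (IPC * Hz * Parallelism), in seconds."""
    return instructions / (ipc * hz * parallelism)


# Illustrative inputs: 2.7e10 instructions, IPC 2.0, 2.7 GHz, 16-way parallel.
base = response_time(2.7e10, ipc=2.0, hz=2.7e9, parallelism=16)

# With IPC, clock, and parallelism fixed, 100x the instructions means 100x the time.
heavy = response_time(2.7e12, ipc=2.0, hz=2.7e9, parallelism=16)

print(base, heavy)
```

Because clock speed is roughly fixed and IPC varies only within a narrow range, the instruction count and the achievable parallelism are the terms a database design can actually move.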

Instruction Count
[Chart: CPU instructions per database, with the ratio relative to Vector. Legend: Columnar DB A, Columnar DB B, In Memory DB A, Row Store DB A, Row Store DB B]
CPU instructions, first series (chart order): 2.7E+10, 2.4E+11, 2.0E+11, 7.8E+11, 1.9E+12, 1.9E+12
CPU instructions, second series (chart order): 2.8E+10, 3.8E+11, 4.8E+11, 8.3E+11, 2.8E+12, 1.9E+12
Ratio to Vector (x): 1, 9, 7, 29, 102, 68

Branch Misses
[Chart: CPU branch misses per database, with the ratio relative to Vector. Legend: Columnar DB A, Columnar DB B, In Memory DB A, Row Store DB A, Row Store DB B]
Branch misses, first series (chart order): 1.8E+07, 1.1E+09, 3.0E+08, 1.1E+09, 1.6E+09, 7.7E+08
Branch misses, second series (chart order): 2.1E+07, 1.4E+09, 1.2E+09, 1.1E+09, 1.7E+09, 7.7E+08
Ratio to Vector (x): 1, 64, 17, 62, 88, 43

IPC (Instructions Per Cycle)
[Chart: instructions per cycle per database]
IPC, first series (chart order): 2.19, 1.70, 2.05, 1.94, 1.83, 2.08
IPC, second series (chart order): 1.74, 1.58, 1.40, 1.85, 1.58, 2.08

The Resulting Response Time (Seconds)
[Chart: query elapsed time per database, with the ratio relative to Vector]
Query elapsed time in seconds (chart order): 0.48, 3.44, 35.58, 209.45, 467.36, 332.56
Ratio to Vector (x): 1, 7, 74, 434, 968, 689
Parallelism depends on each database's edition and on the state of the data, so treat these figures as reference values only.

BI Is More Than Analytic Database Performance
Many BI systems have their hands full just getting data into the analytic database.

Flexible, High-Performance ETL (Today)
[Diagram: data from RDBMS, legacy systems, and S3 / (S)FTP(S) is loaded into Vector through ETL]

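Flexible, High-Performance ETL (Going Forward)
[Diagram: data from RDBMS and S3 / (S)FTP(S) is loaded into Vector through the DataFlow Engine]
DataFlow Engine
- A parallel, distributed execution engine that does not use MapReduce
- When the load target is Vector, data files are created in parallel on the Hadoop side
- Essentially no coding required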

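Flexible, High-Performance ETL (Demo)
[Diagram: DataFlow Engine to Vector]
TPC-H @ 100 GB data is prepared on HDFS.
The LINEITEM data, which accounts for 80 GB of it, is transferred to Vector.
A simple filter is applied to the LINEITEM data along the way.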

Demo (Actian DataFlow)

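BI That Keeps Data Fresh
[Diagram: RDBMS to Vector]
- Replication reads the RDBMS transaction log and applies those transactions to Vector.
- Full loads use Vector's native loader (vwload). During replication to Vector, changes are grouped and applied in batch mode every n seconds.
- Source load: OLTP (TPC-C equivalent)
- Analysis: checking customers' order regions in real time (data volume: 150 million login records, 50 million customer records, 80 million order records)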
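
The "group changes and apply them every n seconds" step can be sketched as follows. This is a hypothetical illustration in Python, not the replication product's actual mechanism or API: the change-record layout `(timestamp, operation, key, row)`, the function names, and the dict standing in for the target table are all assumptions. Changes are assumed to arrive in timestamp order.

```python
def group_into_batches(changes, interval_seconds):
    """Group ordered change records into windows of `interval_seconds`."""
    batches = {}
    for timestamp, operation, key, row in changes:
        window = int(timestamp // interval_seconds)
        batches.setdefault(window, []).append((operation, key, row))
    # dicts keep insertion order, so batches come out in time order.
    return list(batches.values())


def apply_batch(target, batch):
    """Apply one batch of changes to the target table (a dict keyed by primary key)."""
    for operation, key, row in batch:
        if operation == "delete":
            target.pop(key, None)
        else:  # "insert" or "update"
            target[key] = row


# Invented change log: (seconds since start, operation, key, row).
changes = [
    (0.5, "insert", 1, {"region": "Tokyo"}),
    (1.2, "insert", 2, {"region": "Osaka"}),
    (5.1, "update", 1, {"region": "Nagoya"}),
    (6.0, "delete", 2, None),
]

target = {}
batches = group_into_batches(changes, interval_seconds=5)
for batch in batches:
    apply_batch(target, batch)
print(len(batches), target)
```

Batching trades a few seconds of latency for far fewer write operations on the analytic side, which suits a column store better than applying each OLTP transaction individually.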


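Conclusion
The heart of BI is a "fast database"
- Do not lump Actian Vector in under a single label such as "columnar"
  - The point is whether the implementation can actually process large volumes of data
  - And whether it makes good use of modern hardware acceleration
- With a database that can process queries fast
  - Pre-aggregation such as cubes is unnecessary
  - Pre-aggregation reduces analytic flexibility (only static analysis is possible)
  - Pre-aggregation forces complex operations, with teams chasing batch windows and maintenance for large batch jobs
A "fast and flexible" BI system that includes "data integration"
- The data sources to analyze are stored in many different places
  - They should be accessible simply, flexibly, and fast
- The freshness of the data being analyzed will determine the value of BI from now on
  - Can complex data preprocessing in ETL be run fast and frequently?
  - Real-time analysis through replication
  - Plus a combination of both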

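In Closing
Related sessions

Actian
D15: Actian Matrix (16:00 – 16:50)
"The most scalable columnar DB: team up with Hadoop and head for the big data horizon!"
Daisuke Hirama (平間大輔) (Insight Technology)
A35: Actian Analytics Platform
"[Case study] Using big data in marketing"
Giuseppe Kobayashi (ジュセッペ小林) (Actian)

Attunity
A23: Oracle
"Making Oracle migration easy: mastering replication technology"
宮地敬史 (Insight Technology)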
