https://www.elastic.co/elasticon/tour/2019/seoul/devsisters-game-service-integration-logging-platform-using-elastic-stack
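Logs generated by every game that Devsisters operates are collected into a single integrated logging platform and used for data analysis, server operations and troubleshooting, customer support, and more. This talk covers how the Elastic Stack is used in that platform. Specifically, it looks at log collection with Filebeat in Kubernetes and AWS EC2 environments and at building a log query service with Elasticsearch, then discusses how issues that arose while building and operating the service were resolved, and what lies ahead.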
Customer Story: An Integrated Logging Platform for Game Services Using the Elastic Stack
1. 1
Choice of ‘Elasticsearch’ for online e-commerce big-data analysis based on high performance and high availability
Bosoon, Kim
CTO (Builton Co., Ltd.)
February 22, 2018
http://www.builton.co.kr/en
2. 2
BuiltOn
• The scale of e-commerce worldwide grows day by day.
• In e-commerce, data analysis is essential for companies to decide what to do and how to do it.
• We analyze various aspects of retailers, sellers and consumers in the e-commerce industry.
• Many companies in South Korea, including global companies, are using BuiltOn’s data to analyze e-commerce big-data.
• We also collaborate with global data analysis partners.
4. 4
Necessity of e-commerce analysis
• Why is my product not selling?
• What are the competitors' selling strategies?
• What do consumers who bought our products or services think?
• What are the best-selling products?
• What is the most effective kind of advertising to boost sales?
• Who is selling our products?
• How much of our product is sold?
• Where are our products sold?
• In addition, there are still many more questions in e-commerce.
People who work in the e-commerce environment are curious.
5. 5
The e-commerce big-data analysis process diagram
Same flow as typical big-data analytics.
(1) E-commerce big-data warehouse configuration
(2) Data collection, data refining and data quality control
(3) Configure aggregate data marts
(4) Visualization
(5) Derivation of KPIs (key performance indicators)
6. 6
Analysis based on digital-shelf.
• Collects the search results for categories and keywords at the target online retailer.
• Analyzes the digital shelf share of manufacturers and brands.
• Shows the market penetration rate of my products and competitors' products.
• Also shows the share of advertising by manufacturer, brand and product.
• Search results show that consumers are more likely to choose products that are exposed at the top.
Brand analysis in TV category digital shelf for target retailer
[Figure: three pie charts of brand share, by listing type]
Total (132): A Electronics 43.18% (57), B Electronics 40.91% (54), C 2.27% (3), D 2.27% (3), E 2.27% (3), Etc 7.58% (10)
Normal (120): A Electronics 47.50% (57), B Electronics 45.00% (54), D 2.50% (3), Etc 5.00% (6)
Advertisement (12): C Electronics 25.00% (3), E 25.00% (3), F 16.67% (2), G 16.67% (2), H 16.67% (2)
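The share figures in the chart follow directly from listing counts. A minimal sketch, assuming each search-result listing has already been labeled with a brand; the function name `shelf_share` is hypothetical, and the counts are the ones shown for the Normal (120) chart.

```python
from collections import Counter

def shelf_share(brands):
    """Return {brand: (count, percent)} for a list of brand labels, one per listing."""
    counts = Counter(brands)
    total = sum(counts.values())
    return {b: (n, round(100 * n / total, 2)) for b, n in counts.most_common()}

# Counts from the Normal (120) chart: A 57, B 54, D 3, Etc 6.
listings = ["A Electronics"] * 57 + ["B Electronics"] * 54 + ["D"] * 3 + ["Etc"] * 6
share = shelf_share(listings)
print(share["A Electronics"])  # (57, 47.5)
```

The same function applied to only the advertised listings would give the advertising share by brand.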
8. 8
The analysis based on price.
• Analyzes product prices by seller and by online retailer as a time series.
• For the same product, consumers are more likely to purchase the lowest-priced offer.
• If the prices of goods sold abroad are much lower, consumers are unwilling to buy them locally.
• The lower the product price, the less profit the seller makes.
Minimum Advertised Price (MAP) violations by resellers.
[Chart: CHANEL SUBLIMAGE LA CR. TS; price axis from 420,000 to 580,000; series: Retailer A, Retailer B, Retailer C, Retailer D, Retailer E]
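Detecting MAP violations amounts to comparing each collected price with the minimum advertised price. A minimal sketch, assuming prices have already been collected per retailer and day; the class `PriceObservation`, the function `find_map_violations`, the MAP value and the sample prices are all hypothetical and are not taken from the chart.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class PriceObservation:
    retailer: str
    day: date
    price: int  # advertised price in the retailer's currency

def find_map_violations(observations, map_price):
    """Return observations whose advertised price is below the MAP, oldest first."""
    violations = [o for o in observations if o.price < map_price]
    return sorted(violations, key=lambda o: (o.day, o.retailer))

# Illustrative data only; these are not figures from the slide.
observations = [
    PriceObservation("Retailer A", date(2018, 1, 1), 560_000),
    PriceObservation("Retailer B", date(2018, 1, 1), 495_000),
    PriceObservation("Retailer C", date(2018, 1, 2), 520_000),
    PriceObservation("Retailer B", date(2018, 1, 2), 480_000),
]
violations = find_map_violations(observations, map_price=500_000)
for v in violations:
    print(v.day, v.retailer, v.price)
```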
10. 10
The analysis based on customer review
• Analyzes customer reviews of the product.
• Analyzes customer reactions (positive and negative) to product characteristics through comments.
• Identifies problems with your products and competitors' products.
• Discovers the sales trend of your products by totaling the number of purchases on e-commerce websites.
Product review trend
[Chart: Instant rice 210g x 1, reviews satisfaction rate]
12. 12
Analysis based on consumer behavior
• Provides real-time inflow status for online product pages.
• Tracks consumer behavior for each product.
‒ # of purchase button clicks
‒ # of cart button clicks
‒ Sales success rate
• Provides a tracking report that consists of analysis by platform, keyword and ad.
[Chart: 100% stacked chart of platform share (PC, Mobile, App) based on time series; vertical axis from 0 to 100]
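A 100% stacked chart plots, for each time bucket, every platform's count divided by the bucket total. A minimal sketch, assuming inflow counts are already grouped by hour and platform; the function name `platform_share` and all counts are made up for illustration.

```python
def platform_share(counts):
    """Convert raw visit counts per platform into percentages that sum to 100."""
    total = sum(counts.values())
    if total == 0:
        return {platform: 0.0 for platform in counts}
    return {platform: 100 * n / total for platform, n in counts.items()}

# Hypothetical hourly inflow counts; the numbers are made up for illustration.
hourly = {
    "09:00": {"PC": 50, "Mobile": 30, "App": 20},
    "10:00": {"PC": 20, "Mobile": 60, "App": 120},
}
series = {hour: platform_share(c) for hour, c in hourly.items()}
print(series["09:00"])
```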
15. 15
Starting Architecture
Retailer Information
• Product title
• Price and discount ratio
• Card promotion
• Digital shelves
• Reviews
• Seller
• Etc…
Components, each on its own Nodes (X):
• Network gateway
• Network controller
• Data Collection Engine
• RDBMS (with Replication)
• Batch process
• Text search engine (based on RDBMS)
• Business Server (Web Service, Data-mart, Visualization)
16. 16
Reason for starting architecture configuration
• Familiar development environment.
‒ C/C++
‒ Lua script engine.
‒ Column-based RDBMS such as MySQL, SQL Server, PostgreSQL…
• Runs a separate data collection engine instance for each user.
• Cloud platforms such as Amazon Web Services.
‒ Cloud platforms are very expensive.
‒ BuiltOn manages its own hardware infrastructure to provide an efficient architecture service for partners.
• Self-developed visualization.
17. 17
Developed almost all of the architecture components
• Full-text search engine.
‒ A search engine is required to find the products that you want in big-data.
• Monitoring system.
‒ CPU, memory, disk, network traffic, etc.
• Data replication into customer storage.
‒ Interprets and replicates the RDBMS event log.
‒ Customers want to replicate refined data to their data center.
19. 19
As the company grows…
• The limits of the RDBMS were exposed.
‒ System slows down.
‒ Difficult to reflect customer customization.
‒ Added columns that other customers do not need.
‒ Too much time wasted adding columns.
‒ Increased indexing time.
‒ Frequent replication synchronization issues.
‒ Full-text search takes a long time.
‒ The RDBMS cluster does not get much faster even when nodes are added.
• Storage scale-up is too expensive.
‒ Initially, HDD
‒ Next, SSD
‒ High-performance NVMe SSD in the end
‒ It’s too expensive
There have been many technical issues.
[Chart: storage cost & maintenance cost against storage performance]
20. 20
As the company grows…
• Too much time spent developing visualization.
• Difficulty analyzing OS logs.
• Long downtime after hardware failures.
• Repeated development work to solve issues.
There have been many technical issues.
22. 22
Why & What happened?
• Excessive desire for development and testing.
• Enormous amount of stored data.
• The belief that hardware scale-up will solve everything.
• Lack of understanding of the latest analytics trends.
24. 24
What should be changed?
• At a minimum, performance has to be much faster than now.
‒ Without expensive NVMe SSD.
• Schema-free storage for flexible data management.
• Minimize downtime due to hardware replacement.
• A storage engine that supports full-text search without a separate search engine.
• Automatic archiving of old data to low-cost storage.
An excessive desire for development and testing is a waste of time and money.
26. 26
Our own evaluation of existing storage engines
• RDBMS Cluster
‒ As the number of nodes increased, more storage capacity became available, but performance was not satisfactory.
• Couchbase NoSQL database
‒ Random access was good, but sequential access was bad. The system died as the data grew. This may have changed since then.
• HDFS
‒ Reliable, high-capacity storage is a plus, but everything else must be built by the developer.
27. 27
Suddenly, the worst situation happened.
• There was an analytic report that took 3 minutes to aggregate and process.
• Because many of the input parameters are changed by the user, pre-calculation is not possible.
• The customer asked to get the output as soon as they clicked on it.
• It was an unreasonable and excessive demand, and it could not be met in our environment.
One day…
28. 28
Elasticsearch
• An unstable or unreliable storage engine could not be used.
• We came across Elasticsearch while trying to solve these problems.
• We moved all the data from the RDBMS to Elasticsearch and were able to deliver the reports within the time the customer required.
First encounter.
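A report whose parameters change on every click cannot be pre-calculated, so the aggregation has to run at query time. A minimal sketch of such a request body in the Elasticsearch query DSL; the slides do not show the actual queries or mappings, so the function `build_report_query` and the fields `retailer`, `brand`, `price` and `collected_at` are assumptions (`retailer` and `brand` are assumed to be keyword fields).

```python
import json

def build_report_query(retailer, start, end, top_n=10):
    """Build an Elasticsearch search body that aggregates at query time."""
    return {
        "size": 0,  # only aggregation results are needed, not the hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"retailer": retailer}},
                    {"range": {"collected_at": {"gte": start, "lt": end}}},
                ]
            }
        },
        "aggs": {
            "by_brand": {
                "terms": {"field": "brand", "size": top_n},
                "aggs": {"avg_price": {"avg": {"field": "price"}}},
            }
        },
    }

# Each user-selected parameter simply changes the request body.
body = build_report_query("Retailer A", "2018-01-01", "2018-02-01", top_n=5)
print(json.dumps(body, indent=2))
```

The body would be sent to the `_search` endpoint of the relevant index; filters are used instead of scored queries because a report does not need relevance ranking.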
29. 29
RDBMS: based on high-performance NVMe SSD / 420000 IOPS / 1 node / 180 seconds (3m) response time
Elasticsearch: based on normal SSD / 96000 IOPS / 2 nodes / 3 seconds (3s) response time
Response time = x60 faster
32. 32
New Architecture
Retailer Information
• Product title, price, card promotion
• Digital shelves
• Shopper reviews
Components:
• Network gateway: Nodes (X)
• Network controller: Nodes (X)
• Job Worker (Node.js): Nodes (X)
• Central Scheduler (Node.js): Nodes (X)
• Refinement: Instances (X)
• ETL & ELT
• Elasticsearch with X-Pack: Nodes (X), made up of Master Nodes (3), Ingest Nodes (X), Data Nodes - Hot (X), Data Nodes - Warm (X)
• Server (Metricbeat, X-Pack): Nodes (X)
• Elasticsearch: Nodes (X)
• RDBMS data-mart
• Visualization based on Business Intelligence
33. 33
What have we changed?
• Replaced the storage engine, from RDBMS to Elasticsearch.
• Perform full-text search directly in Elasticsearch.
• Changed the system monitoring to Metricbeat.
• Use Hot-Warm nodes instead of backing up old data separately.
‒ Old data is kept on low-cost hardware such as HDD.
• No longer operate RDBMS data replication.
‒ We trust the sharding and replication of Elasticsearch.
• If capacity runs short, just add a new node.
‒ Elasticsearch is fast and easy to scale out.
We’ve changed everything that can be replaced by Elasticsearch.
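Hot-Warm separation is typically done with shard allocation filtering: data nodes carry a custom attribute, and each index is pinned to one attribute value. A minimal sketch, assuming the attribute is named `box_type` and that indices are daily; the attribute name, the helper names and the 30-day cutoff are assumptions, since the slides do not describe the cluster configuration.

```python
from datetime import date

# Assumption: data nodes are started with a custom attribute, for example
# node.attr.box_type: hot (SSD) or node.attr.box_type: warm (HDD).
NODE_ATTRIBUTE = "box_type"

def allocation_settings(tier):
    """Index settings body that pins an index's shards to nodes of one tier."""
    if tier not in ("hot", "warm"):
        raise ValueError(f"unknown tier: {tier}")
    return {f"index.routing.allocation.require.{NODE_ATTRIBUTE}": tier}

def tier_for(index_day, today, hot_days=30):
    """Pick the tier for a daily index; the 30-day cutoff is an arbitrary example."""
    return "hot" if (today - index_day).days < hot_days else "warm"

today = date(2018, 2, 22)
recent = allocation_settings(tier_for(date(2018, 2, 20), today))
old = allocation_settings(tier_for(date(2017, 11, 1), today))
print(recent)
print(old)
```

Applying the `warm` body to an existing index through its `_settings` endpoint makes Elasticsearch relocate that index's shards to the warm nodes, which is why no separate backup job is needed to move old data.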
34. 34
Changed architecture comparison
Item | Old - RDBMS | New - Elasticsearch
Data type | Column-based | Document
Schema-free support | N/A | Yes
Real-time analysis response time | Slow | Very fast
Downtime | Long | Almost none
Storage extension policy | Scale-up | Scale-out
Storage cost | Expensive | Cheap
SSD type | Server-side high-performance NVMe | Server-side normal SSD
CPU | Xeon E5-2620 v4 2.10GHz / x2 | Xeon E5-2620 v4 2.10GHz
Memory | 512GB per node | 64GB per node
Data distribution | N/A | Shard
Backup | Replication | Replication
Full-text search | In-house development | Basic support
Archiving | Individual backup to HDD | Hot-Warm
System monitoring | In-house development | Metricbeat
Visualization | In-house development | Kibana, Tableau, etc.
35. 35
Before: RDBMS
Expensive CPU / 6 nodes based on server-side NVMe SSD / 512GB memory per node / Replication-based backup policies / Sometimes slow response time
Daily data throughput: 30GB
After: Elasticsearch
Cheap CPU / 17 nodes based on normal server-side SSD / 64GB memory per node / Multi-shard based cluster / Very fast response time
Daily data throughput: 500GB
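The node counts and per-node memory on this slide can be turned into cluster totals with simple arithmetic. A minimal sketch, assuming that 30GB is the daily throughput before the change and 500GB the throughput after it, which is the most plausible reading of the slide.

```python
# Figures from the slide; mapping 30GB to "before" and 500GB to "after" is an assumption.
before = {"nodes": 6, "memory_gb_per_node": 512, "daily_gb": 30}
after = {"nodes": 17, "memory_gb_per_node": 64, "daily_gb": 500}

def total_memory_gb(cluster):
    """Total RAM across the cluster in GB."""
    return cluster["nodes"] * cluster["memory_gb_per_node"]

print(total_memory_gb(before))  # 3072
print(total_memory_gb(after))   # 1088
print(round(after["daily_gb"] / before["daily_gb"], 1))  # 16.7
```

Under that reading, the new cluster handles roughly 16.7 times the daily volume with about a third of the total memory.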
36. 36
Technical Support
• Rapid advanced technical support.
‒ We restarted some nodes.
‒ The problem was that the primary shard data was not redistributed.
‒ In the worst case, data loss could have occurred.
‒ We asked for technical support and were able to solve the problem quickly.
‒ We found that the problem was caused by the index recovery setting being turned off.
‒ We still rely on technical support whenever we have questions.
X-Pack
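The slide does not name the setting involved. One cluster setting that matches the description is `cluster.routing.allocation.enable`, which is often restricted before a node restart and has to be set back to `all` afterwards; treating it as the setting in this incident is an assumption. A minimal sketch of the request bodies, with a hypothetical helper name:

```python
VALID_MODES = ("all", "primaries", "new_primaries", "none")

def allocation_enable_body(mode="all", persistent=True):
    """Cluster settings body that sets cluster.routing.allocation.enable."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    scope = "persistent" if persistent else "transient"
    return {scope: {"cluster.routing.allocation.enable": mode}}

# Before restarting nodes, allocation is commonly restricted...
disable = allocation_enable_body("none")
# ...and afterwards it has to be re-enabled, or shards stay unassigned.
enable = allocation_enable_body("all")
print(disable)
print(enable)
```

Either body would be sent to the `_cluster/settings` endpoint.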
38. 38
Future work
• Virtualization of Elasticsearch with Docker.
• Infographics using Canvas.
• Buzz analysis using Nori.
• Network monitoring with Packetbeat.
• Monitoring of e-commerce big-data property information using Kibana.
• Logstash will be applied to ETL and ELT.
We can do more with Elasticsearch.