Apache Pulsar at Tencent Game: Adoption, Operational Quality Optimization Experience, and Future - Pulsar Summit NA 2021

After nearly 10 years of development, Tencent Game's big data platform handles a daily data transmission volume of up to 1.7 trillion. As a key component of the platform, the MQ system is critical to real-time service operational quality assurance: it must support applications such as real-time game operational services, real-time index data analysis, and real-time personalized recommendation. As the gaming business grows quickly and data volume keeps expanding, assuring operational quality for real-time services becomes increasingly challenging.

In this presentation, we introduce the development history of Tencent Game big data technology and share our practical experience optimizing operational service quality for Apache Pulsar in Tencent Game real-time service scenarios.

  1. Pulsar Virtual Summit North America 2021. Apache Pulsar at Tencent Game: Adoption, Operational Quality Optimization Experience, and Future. Liping ZHAO, Senior Software Engineer @ Tencent
  2. Liping ZHAO, Senior Software Engineer @ Tencent. Liping Zhao is the architect and MQ team leader of the Tencent Game real-time big data service, responsible for upgrading and optimizing the real-time computing service architecture, which handles a vast amount of service traffic. Before joining Tencent, she received her master's degree from Renmin University of China.
  3. Agenda: 01 About Tencent Game Big Data; 02 Why Pulsar?; 03 Operational Quality Optimization Experience; 04 Future Work
  4. Tencent Game Big Data. Daily transmission: 1.8+ trillion, 300+ TB; total storage: 100 PB+. Tables: 150k+, data: 400bn+; tables: 40k+, data: 60bn+; tables: 300k+, data: 1300+bn. Multi-dimensional analysis diagram labels: Bitmap Filter Builder, Dynamic Bitmap Index Cache, Bitmap Index Generator, Execute Engine, Data Mapper (Col-1, Col-1, Col…), Aggregate Merger. TDW: 100P+. Streaming: 90+ mil./s; batch: 1800+ bn. End Games: 100+; Page Games: 90+; Mobile Games: 300+. Streaming clusters: Kafka/Pulsar, 60+ clusters, 700+ nodes; Storm/Flink, 90+ clusters, 1000+ nodes.
  5. Tencent Game Big Data: four stages of development.
     • 1. Analysis Report (batch): MapReduce, Hive
     • 2. iData (candidate real-time)
     • 3. DataMore (real-time): Kafka, Storm, Spark
     • 4. ODP (one-stop operation and development platform): Pulsar, Flink, Druid
     Timeline markers: 2012, 2013, 2016, 2020 (intelligent era).
  6. Tencent Game Big Data: real-time big data computing architecture. Diagram labels, in slide order: platform components; user self-service center; data, dev, test, api, dev; visual report, effectiveness analysis, activity marketing, personal center, news push; Data Application; Data Source; Xone (integrated development platform); Analysis Application; User reach; Channel management; GDAM/CMDB; real-time data service operation support system; Operating Platform; TGLog Agent, TDBank, Storm/Flink, Kafka/Pulsar, TGLog, TDW; Access; Transmission & Calculation; Tredis, Tspider; Storage; Monitor; Data/App Bloodline; release; in-game, out-game; real-time rule, user portrait, personalized recommendation. Architecture upgrade: in real-time activities, users notice faults quickly, so operational quality requirements are high. Examples: LOL Mission System, King of Glory (王者荣耀) LBS Honor Battle Zone, …
  7. Agenda: 01 About Tencent Game Big Data; 02 Why Pulsar?; 03 Operational Quality Optimization Experience; 04 Future Work
  8. Why Pulsar? Pain points of online Kafka operations (six items, numbered 01 to 06 on the slide):
     • Data loss in some scenarios
     • No delayed messages or dead letter topic
     • When game activity traffic surges, expansion is not timely and affects service quality
     • High cost of decommissioning machines for operators
     • High cost of lag monitoring
     • Write availability hazards
  9. Why Pulsar? First look
  10. Why Pulsar? Kafka Optimization-1. Diagram: partitions P0 to P5 spread over Broker 0 and Broker 1; after Broker 2 is added, partition reassignment requires data migration; annotation "P2 P5 not working!". Problems with Kafka expansion and availability:
      • Expansion cost is high
      • Not timely
      • Affects service quality
      • No auto-recovery
      • Write availability hazards: when replicas < min.insync.replicas, writes are unavailable
      Optimization: a logical Topic A = T1 + T1', where T1 lives in Cluster 0 (P0 on Broker 0, P1 on Broker 1, …) and T1' lives in Cluster 1 (P0' on Broker 0, P1' on Broker 1, …); clients use the logical service on top of the physical services. Costs of this optimization:
      • High maintenance cost
      • Resource management challenges
      • Easily leads to wasted resources and low-load expansion
  11. Why Pulsar? Broker and bookie expansion without any data migration; auto recovery; higher availability when replicas are lost; hour-level expansion is upgraded to second-level.
  12. Why Pulsar? Pulsar vs Kafka: comparison of key consumption patterns and services.

      | Item | Kafka | Pulsar |
      |---|---|---|
      | Subscription modes | Stream only | Stream: Exclusive and Failover subscriptions; Queue: Shared and Key-Shared subscriptions |
      | Sequential consumption | Yes, partially ordered | Yes, partially ordered |
      | Number of consumer threads vs. number of partitions | Consumer threads <= partitions | Shared and Key-Shared subscriptions: consumer threads can exceed the number of partitions |
      | Message persistence and cleanup | Yes, not flexible enough | Yes, more flexible: retention + TTL |
      | Delayed message | No | Yes, optimizing |
      | Dead letter topic | No | Yes |
      | Transaction message | Yes | Yes, optimizing |
      | Message backtracking | Yes, by offset | Yes, by Reader API and message ID |
      | Idempotency | Yes | Yes |
      | Effectively-once | Yes | Yes |
      | Geo-replication | No | Yes |
  13. Why Pulsar? Advantages of Pulsar: delayed message, instant expansion, cloud native, auto recovery, metrics, multi-tenant, geo-replication.

      | Category | Item | Kafka | Pulsar |
      |---|---|---|---|
      | Performance | Latency | Low | Low |
      | | TPS | High, 10W+/s (100k+/s) | High, 14W+/s (140k+/s) |
      | Service capability | Subscription modes | Stream only | Stream and queue |
      | | Data reliability | High | High |
      | | Write availability | Write availability hazards when replicas < min.insync.replicas | High |
      | Operational capability | Multi-tenant | No | Yes |
      | | Expansion | Hour-level | Second-level |
      | | Auto recovery | No | Yes |
      | | Geo-replication | No | Yes |
      | | Language | Scala | Java |
      | | Ease of use | Yes | Yes |
      | | Metrics | Officially provides only JMX metrics; collecting lag metrics is costly | Officially provides backlog and other key metrics |
      | … | | | |
  14. Agenda: 01 About Tencent Game Big Data; 02 Why Pulsar?; 03 Operational Quality Optimization Experience; 04 Future Work
  15. Pulsar in Tencent Game. Layers: Access; Transmission & Calculation; Storage; Application. Component labels, in slide order: GameDB, GameDB, DataServer, GameX, TDbank; Pulsar, Kafka, Flink, Storm; Tendis, TSpider, DBMS, …; operations products; Other; DataMore (real-time data service); Turing (recommendation system); Pandora (game marketing); iData (game data analysis); AMS & Prop City; more; KPAgent; TDW.
  16. Hierarchical Management and Cluster Deployment.
      • Tenant: business group, e.g. IEG
      • Namespace: game business, e.g. PubgMobile
      • Topic: business table
      Policies shown on the slide: authorization management, storage policy, cleanup policy, storage quota, retention/cleanup policy, throttling policy.
      Clusters:
      • Public cluster
      • Several dedicated clusters for a few specific games, e.g. a dedicated cluster for King of Glory, a dedicated cluster for LOL, …
  17. Producer/Consumer Client Architecture. Diagram: Pulsar client (Kpconf SDK, HTTP requests), Broker1 to Broker3, Bookie1 to Bookie4, KP-Conf, KP-Manager (config publish, get broker list). Process:
      • 1. Brokers register with KP-Conf
      • 2. KP-Manager publishes the conf-policy to KP-Conf
      • 3. The Pulsar client requests the latest broker list from KP-Conf
      • 4. The client gets the service URL, then performs topic lookup and consumes/produces
      Advantages:
      • The producer/consumer client always gets the live broker list
      • Cluster changes are transparent to the client
      • Higher client availability
  18. Pulsar Manager in Tencent Game. Users: developers and operators. Access management: a producer applies for a new topic and a consumer applies for a subscription; the request is approved, permissions are assigned, and authentication information is returned. KP-Manager ("Pulsar manager++") = Pulsar Manager (community) + cluster manager and more. Diagram labels: Security Center (Ranger) with security check and security policy management; CMDB with machine info; Pulsar cluster with topic/subscription management and cluster management. Benefits: secure access and use; authentication approval process; capacity assessment; meets the different needs of developers and operators.
  19. Operational Quality Service System. KP-Monitor covers Pulsar, Flink, Kafka, and others. Monitoring = metrics + logging + tracing, used to quickly analyze the root cause of a problem and to assess full-link impact. Component labels: cluster manager, metric manager, monitor manager, tracing manager, bloodline-based impact assessment, CMDB/GDAM, alert; Pulsar Collector, Flink Collector, Kafka Collector, …; KP-Agent (Flink/Storm, Pulsar/Kafka); persistence (storage: TDW); visualization (Grafana/EPR, LogX Search). Bloodline-based impact assessment: relations shown are game-table, table-topic, topic-topo, topo-index, and index-interface, alongside the labels activity, interface, game activity, Dataserver interface, and Tredis key template. Alarms can be automatically associated with the game activities they affect and with their indicators.
  20. High Availability Solution: cross-data-center service disaster recovery. What if the data center goes down? The Pulsar client uses a FailoverProducer SDK across pulsar-cluster-1 and pulsar-cluster-2:
      • Check and retry
      • A sliding window chooses a healthier cluster for producing
      • Failure status is refreshed dynamically
  21. Agenda: 01 About Tencent Game Big Data; 02 Why Pulsar?; 03 Operational Quality Optimization Experience; 04 Future Work
  22. Future work
      • KoP: 6K+ historical tasks to be switched
        • Kafka 2.2: used to support smooth switching of historical tasks
        • Kafka 0.9.x: no longer supported by the community; this version needs to be rolled back, and testing showed poor performance
      • Pulsar on K8s: cloud native; a test environment has been deployed on Tencent Cloud TKE
      • Delayed message delivery: performance optimization
      • Pulsar Manager: bookie health checks, audit management, and more features
      • More use cases based on Pulsar + Flink (diagram labels: Recommendation Friend, Recommendation warband, Recall Friend)
  23. Thanks. The end.
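
The hierarchy on slide 16 (business group as tenant, game business as namespace, business table as topic) lines up with Pulsar's standard topic name format, `persistent://tenant/namespace/topic`. The following is a minimal sketch of that mapping, not Tencent's tooling: the helper `topic_name` and the table name `login_events` are hypothetical, while `IEG` and `PubgMobile` are the examples given on the slide.

```python
def topic_name(business_group: str, game_business: str, business_table: str,
               persistent: bool = True) -> str:
    """Build a fully qualified Pulsar topic name from the three-level hierarchy.

    business_group -> tenant, game_business -> namespace, business_table -> topic.
    This helper is illustrative only; it is not part of the Pulsar client API.
    """
    for part in (business_group, game_business, business_table):
        # A slash inside a component would change the hierarchy level.
        if not part or "/" in part:
            raise ValueError(f"invalid name component: {part!r}")
    domain = "persistent" if persistent else "non-persistent"
    return f"{domain}://{business_group}/{game_business}/{business_table}"


# "login_events" is a made-up business table used only for this example.
print(topic_name("IEG", "PubgMobile", "login_events"))
```

Because policies such as quotas and retention are attached at the tenant and namespace levels in Pulsar, a naming scheme like this lets one game business be governed independently of the others.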

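
Slide 20 describes a FailoverProducer SDK that checks and retries, and uses a sliding window to pick the healthier cluster. The deck does not show the implementation, so the following is only a sketch of one way such a selector could work. The class, its methods, and the window size are assumptions, and the per-cluster send functions are plain callables standing in for real Pulsar producers.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class FailoverProducer:
    """Hypothetical sketch: prefer the cluster with the fewest recent failures."""

    def __init__(self, senders: Dict[str, Callable[[bytes], None]], window: int = 20):
        self.senders = senders
        # One sliding window of recent send outcomes per cluster (True = success).
        self.history: Dict[str, Deque[bool]] = {
            name: deque(maxlen=window) for name in senders
        }

    def failure_rate(self, cluster: str) -> float:
        outcomes = self.history[cluster]
        if not outcomes:
            return 0.0  # no data yet: treat the cluster as healthy
        return outcomes.count(False) / len(outcomes)

    def ranked_clusters(self) -> List[str]:
        # sorted() is stable, so clusters with equal rates keep their configured order.
        return sorted(self.senders, key=self.failure_rate)

    def send(self, payload: bytes) -> str:
        """Try clusters from healthiest to least healthy; return the one that accepted."""
        last_error = None
        for cluster in self.ranked_clusters():
            try:
                self.senders[cluster](payload)
            except Exception as exc:  # any send error counts as a failure
                self.history[cluster].append(False)
                last_error = exc
                continue
            self.history[cluster].append(True)
            return cluster
        raise RuntimeError("all clusters failed") from last_error


def send_ok(payload: bytes) -> None:
    """Stand-in for a producer on a healthy cluster."""


def send_down(payload: bytes) -> None:
    """Stand-in for a producer whose data center is unreachable."""
    raise ConnectionError("data center unreachable")


producer = FailoverProducer({"pulsar-cluster-1": send_down, "pulsar-cluster-2": send_ok})
print(producer.send(b"event"))
```

In this sketch a cluster that has failed is only retried once healthier clusters fail too; an SDK that "dynamically refreshes failure status", as the slide puts it, would presumably also probe failed clusters in the background so they can recover their ranking.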
