- Manimuthu Ayyannan
Self-Service, Metadata-Driven Data Loader Framework
About Us
Manimuthu Ayyannan
manimuthu.ayyannan@walmart.com
LinkedIn: @manimuthuayyannan
• Senior Manager II, Software Engineering @Walmart Global Tech
• Data Enthusiast
• Data Platform Services (Big Data | Spark | Cloud | AI-ML)
Agenda
• Personalization @Walmart
• Challenges
• Solution Approaches
• High Level System Architecture
• Metadata Design and Connectors
• Orchestrator
• Schedule Optimizer
• Telemetry
Personalization @Walmart
• Our customers are becoming increasingly omnichannel
• ~220M customers and members visit ~10,500 stores and clubs under 46 banners in 24 countries, plus eCommerce websites, every week
• Billions of product impressions are served every week, generating petabytes of events
• We at the FE team run thousands of data applications to generate the features that power personalized recommendations for our customers
[Figure: Walmart General Merchandise + Walmart Grocery, Store Pickup & Delivery + Walmart Stores]
Personalization | Data Landscape
User Experience & Access Control
Security
Logging
Alerting
Telemetry
Data Engineers | Data Scientists | Data Analysts
Data Apps | Data Loader Platform
Multi-DC and Public Cloud
Streaming | In Memory | NoSQL | Analytical
Challenges
• Onboarding a data application requires a lot of manual hand coding, and developers need time to develop, integrate, and test code that solves the underlying complexities
• Building a functionality-rich application requires integration with various big data technologies and a wide array of data sources, sinks, and data processors
• Resource allocation and usage are difficult to control and to review in retrospect
• Competing high- and low-priority applications introduce latency into the serving layers
Challenges | New App Onboarding | Cumbersome & Fragile
[Diagram: every data app (Data App 1 through Data App N) repeats the same manual steps: Integrate (Source System), Integrate (Target System), Develop (Processor), Implement (Security), Enable (Telemetry), Allocate (Resource), then Test and Deploy.]
Data Loader Simplifies the Onboarding
[Diagram: every data app (Data App 1 through Data App N) only goes through Configure, then Test and Deploy. The Data Loader Platform, an abstraction layer equipped with standard parsers and connectors, covers Source System, Target System, Processor, Security, Telemetry, and Resource.]
Solution Approach
• A centralized, metadata-driven data loading platform with plug-and-play onboarding capability
• An abstraction layer for building workflow orchestration, which simplifies complex service integrations and shortens time to deployment
• A UI that increases developer productivity by providing ready-to-use connectors for configuring the business logic
• An intelligent system that provides optimized recommendations based on previous runs
• A smart run schedule pool that enqueues and dequeues run instances based on priority
High Level System Architecture
Metadata Under the hood
Connectors
• The framework is equipped to parse and handle data formats such as JSON, Avro, Parquet, and CSV
• Users can pick existing connectors that support different source and target systems, such as Kafka, Cassandra, and BQ
• The metadata stores system-specific and application-specific resource configuration to optimize resource allocation
• The abstraction layer is bundled with custom UDFs that give users the flexibility to query systems like Kafka and Cassandra with SQL
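The metadata-and-connector idea can be sketched in plain Scala. This is only an illustration under stated assumptions: `ConnectorRef`, `AppConfig`, `ConnectorRegistry`, and every field name are hypothetical and are not taken from the framework itself; only the system and format names come from the slide.

```scala
// Hypothetical sketch of a metadata entry and a connector check.
// None of these names come from the actual Data Loader framework.
final case class ConnectorRef(system: String, format: String, options: Map[String, String])

final case class AppConfig(
    name: String,
    source: ConnectorRef,
    target: ConnectorRef,
    transformSql: String,   // business logic expressed as SQL
    executorCores: Int,     // application-specific resource configuration
    executorMemoryGb: Int
)

object ConnectorRegistry {
  // Systems and formats named on the slide.
  val supportedSystems: Set[String] = Set("kafka", "cassandra", "bq")
  val supportedFormats: Set[String] = Set("json", "avro", "parquet", "csv")

  // Right(normalized ref) when the connector can be served, Left(reason) otherwise.
  def validate(ref: ConnectorRef): Either[String, ConnectorRef] = {
    val system = ref.system.toLowerCase
    val format = ref.format.toLowerCase
    if (!supportedSystems.contains(system)) Left(s"unsupported system: ${ref.system}")
    else if (!supportedFormats.contains(format)) Left(s"unsupported format: ${ref.format}")
    else Right(ref.copy(system = system, format = format))
  }

  // An app can be onboarded when both of its connectors validate.
  def validate(app: AppConfig): Either[String, AppConfig] =
    for {
      src <- validate(app.source)
      tgt <- validate(app.target)
    } yield app.copy(source = src, target = tgt)
}
```

With this shape, onboarding a new app amounts to adding one `AppConfig` record rather than writing new integration code, which is the point the slide makes.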
Sample Domain API Call in a SQL UDF
• Accessing a new domain API requires a lot of engineering effort to integrate it into a data application
• Instead, create UDFs for the domain APIs and call them from a parallel computation engine like Spark, which accepts UDFs in SQL
spark.sql("select getAccountStatus('cust_id:xxxxxxxxx') as is_active from table limit 1").show(false)
+------------------------------+
|is_active                     |
+------------------------------+
|Y                             |
+------------------------------+
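How such a UDF might be registered is not shown on the slide. The following is a minimal sketch, assuming Spark SQL is on the classpath; the body of `getAccountStatus` is a stand-in (the real domain API call is not part of the deck), and the `customers` view and `key` column are hypothetical names used only so the query has something to read.

```scala
import org.apache.spark.sql.SparkSession

object AccountStatusUdf {
  // Stand-in for the real domain API call, which the slides do not show.
  // Hypothetical rule: any non-empty customer id is reported as active.
  def getAccountStatus(key: String): String = {
    val custId = key.stripPrefix("cust_id:")
    if (custId.nonEmpty) "Y" else "N"
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("udf-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Register the Scala function under the name used in the slide's query.
    spark.udf.register("getAccountStatus", (key: String) => getAccountStatus(key))

    // A one-row stand-in view (hypothetical name and column).
    Seq("cust_id:123").toDF("key").createOrReplaceTempView("customers")

    spark.sql("select getAccountStatus(key) as is_active from customers limit 1").show(false)
    spark.stop()
  }
}
```

Once registered, the function is available to any SQL statement in that session, so users who know only SQL can reach the domain API without writing integration code.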
Orchestrator
• Builds the optimized execution plan based on the application configs from the metadata store
• Generates the run instances based on app priority and source systems
• Executors pick up the optimized execution plan at run time
[Diagram: Metadata Store → Orchestrator (Read App Config, Job Optimizer, Generate Run Instance, Run Scheduler) → Executors]
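The run-instance generation step can be sketched as follows. This is an assumption-laden illustration, not the framework's code: the types `AppConfig` and `RunInstance`, the rule "one run instance per source system", and the convention that a lower number means higher priority are all hypothetical.

```scala
object OrchestratorSketch {
  // Hypothetical shapes; lower priority number = more critical (assumption).
  final case class AppConfig(name: String, priority: Int, sourceSystems: List[String])
  final case class RunInstance(app: String, sourceSystem: String, priority: Int, instanceNo: Int)

  // Assumed rule: one run instance per (app, source system),
  // emitted in priority order so the scheduler sees critical apps first.
  def generateRunInstances(apps: Seq[AppConfig]): Seq[RunInstance] =
    apps
      .sortBy(_.priority) // stable sort: apps with equal priority keep their input order
      .flatMap { app =>
        app.sourceSystems.zipWithIndex.map { case (src, i) =>
          RunInstance(app.name, src, app.priority, i + 1)
        }
      }
}
```

A caller would read the `AppConfig` records from the metadata store and hand the resulting sequence to the run scheduler.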
Schedule Optimizer
• Smart priority groups are assigned to each loader for all applications, based on criticality
• Top-priority jobs take precedence over already scheduled lower-priority jobs, which are dequeued
• Lower-priority jobs resume automatically once all top-priority and SLA-bound jobs are complete
Schedule Optimizer Illustration
Current Schedule Pool
10:00 | non-core app | instance 1 | Done
10:00 | non-core app | instance 2 | Done
10:00 | non-core app | instance 3 | In Progress
10:00 | non-core app | instance 4 | In Progress
10:00 | non-core app | instance 5 | Waiting
10:00 | non-core app | instance 6 | Waiting
Incoming Schedule Pool
10:30 | core app | instance 1 | Waiting
10:30 | core app | instance 2 | Waiting
10:30 | core app | instance 3 | Waiting
10:30 | core app | instance 4 | Waiting
Updated Schedule Pool
10:00 | non-core app | instance 1 | Done
10:00 | non-core app | instance 2 | Done
10:00 | non-core app | instance 3 | Done
10:00 | non-core app | instance 4 | Done
10:00 | non-core app | instance 5 | Waiting
10:00 | non-core app | instance 6 | Waiting
10:30 | core app | instance 1 | In Progress
10:30 | core app | instance 2 | In Progress
10:30 | core app | instance 3 | Waiting
10:30 | core app | instance 4 | Waiting
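The behavior in the illustration can be approximated with a small priority-based dispatcher. This is a simplified sketch, not the framework's scheduler: the two-slot capacity is inferred from the two "In Progress" rows, the numeric priority (lower = more critical) is an assumption, and all names are hypothetical.

```scala
object SchedulePoolSketch {
  sealed trait Status
  case object Waiting extends Status
  case object InProgress extends Status
  case object Done extends Status

  // priority: lower number = more critical (assumption). scheduledAt uses "HH:mm".
  final case class RunInstance(
      scheduledAt: String,
      app: String,
      instance: Int,
      priority: Int,
      status: Status
  )

  // Fill free slots with the most critical waiting instances.
  // Running instances are left untouched; lower-priority waiting ones stay queued
  // and are picked up on a later call once nothing more critical is waiting.
  def dispatch(pool: Seq[RunInstance], slots: Int): Seq[RunInstance] = {
    val running = pool.count(_.status == InProgress)
    val free = math.max(0, slots - running)
    val toStart = pool
      .filter(_.status == Waiting)
      .sortBy(r => (r.priority, r.scheduledAt, r.instance))
      .take(free)
      .toSet
    pool.map(r => if (toStart.contains(r)) r.copy(status = InProgress) else r)
  }
}
```

Calling `dispatch` on the merged pool after non-core instances 3 and 4 finish starts core instances 1 and 2 and leaves non-core instances 5 and 6 waiting, which matches the updated pool in the illustration.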
Telemetry
• Real-time dashboards that provide run-time statistics for each application
• Drill-down views for deep dives into various metrics
• An alerting and notification mechanism that informs app owners about errors and fault scenarios
• A consolidated view of all applications with their success/failure ratios
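The consolidated success/failure view could be derived from per-run records along these lines. This is a hedged sketch: `RunRecord`, `AppStats`, and the alert threshold are hypothetical, and the deck does not say how the real dashboards compute their metrics.

```scala
object TelemetrySketch {
  // Hypothetical record emitted once per finished run.
  final case class RunRecord(app: String, succeeded: Boolean, durationSec: Long)

  final case class AppStats(
      runs: Int,
      successes: Int,
      failures: Int,
      successRatio: Double,
      avgDurationSec: Double
  )

  // Per-application summary; groupBy never yields empty groups, so no division by zero.
  def summarize(records: Seq[RunRecord]): Map[String, AppStats] =
    records.groupBy(_.app).map { case (app, rs) =>
      val ok = rs.count(_.succeeded)
      val avg = rs.map(_.durationSec).sum.toDouble / rs.size
      app -> AppStats(rs.size, ok, rs.size - ok, ok.toDouble / rs.size, avg)
    }

  // Apps whose failure share exceeds the given threshold and should notify their owners.
  def appsToAlert(stats: Map[String, AppStats], maxFailureRatio: Double): Set[String] =
    stats.collect { case (app, s) if 1.0 - s.successRatio > maxFailureRatio => app }.toSet
}
```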
Putting the Pieces Together
• Self Service
• Metadata Store
• Multiple Execution Engines
• E2E App Life Cycle Management
• Multiple Source & Target Systems
• Telemetry
• Version Control & CI/CD
• Cloud Native
• Plug & Play
• Low or No Code
Outcome
• Turnaround time reduced from a few months to weeks
• Developer productivity is expected to increase severalfold
• Non-engineering teams can also use this framework to build functional applications with basic knowledge of SQL
• Intelligent app execution based on app priority, so SLA-bound applications run ahead of non-SLA applications
Thank You

IDEAS Global A.I. Conference 2022.pdf
