WHT/082311
1 | | ©2013, Cognizant1 | ©2017, Cognizant 1
Hammer and beyond – An ensembling journey
Who am I?
Sunil Babu Peethambaram
Architect, Cognizant Technology Solutions, CTSH (NASDAQ)
Total IT experience – 13+ years
Consulting with LexisNexis since 2013 (Chennai, Dayton, Buford, Alpharetta)
Experience in HPCC Systems – more than 3 years
Domains worked on:
• Supply Chain Management
• Logistics
• Retail:
  • Merchandise and Store operations
  • Order Management
  • Warehouse Management Systems
• Insurance
• Healthcare
• Aviation
Problem statement: how did it all start?
Build valid flight connections (VFC) from direct flight schedules (DFS)
DFS arrive in a proprietary encoded format
DFS span 1,000 carriers and over 4 million records
DFS extend a year or more into the future
DFS change every day, so VFC (potentially) need to be versioned every day
Building VFC requires evaluating the feasibility of over 16 trillion potential connections
Valid connections are identified by applying:
• Circuity
• Cabotage
• BIETA and LCC
• Schedule conflicts
• Over 100,000 MCT rules, applied in sequence
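To put the 16 trillion figure in perspective: one plausible reading (an assumption, since the slide does not state the derivation) is that it counts every pairing of roughly 4 million schedule records. The sketch below checks that arithmetic and shows, on toy data, what applying validity rules in sequence can look like. The `Flight` fields, the flat 45-minute connecting time and both rule functions are hypothetical stand-ins, not the actual business rules.

```python
from dataclasses import dataclass
from itertools import permutations

# Assumption: ~4 million schedule records, each pairable with every other one.
RECORDS = 4_000_000
potential_pairs = RECORDS * RECORDS  # 16_000_000_000_000, i.e. 16 trillion


@dataclass(frozen=True)
class Flight:
    # Hypothetical, simplified schedule record (times in minutes from midnight).
    number: str
    origin: str
    destination: str
    departs: int
    arrives: int


def airports_match(first: Flight, second: Flight) -> bool:
    # The second leg must leave from where the first leg lands,
    # and must not simply return to the starting airport.
    return first.destination == second.origin and second.destination != first.origin


def meets_min_connect_time(first: Flight, second: Flight, minimum: int = 45) -> bool:
    # Hypothetical flat 45-minute minimum; real rules vary by airport and carrier.
    return second.departs - first.arrives >= minimum


# Rules are evaluated in order; a pair is dropped at the first rule it fails.
RULES = [airports_match, meets_min_connect_time]


def valid_connections(flights):
    for first, second in permutations(flights, 2):
        if all(rule(first, second) for rule in RULES):
            yield first.number, second.number


flights = [
    Flight("AA1", "MAA", "LHR", 60, 600),
    Flight("BB2", "LHR", "JFK", 700, 1100),
    Flight("CC3", "LHR", "JFK", 620, 1020),  # leaves only 20 minutes after AA1 lands
    Flight("DD4", "LHR", "MAA", 800, 1300),  # would go straight back to MAA
]
print(potential_pairs)
print(list(valid_connections(flights)))
```

Ordering cheap rules first means most candidate pairs are discarded before the expensive rules ever run, which matters when the candidate set is this large.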
The Legacy Setup
• Complex Business Logic
• Data intensive
• .NET/SQL Server
• Local datacenter
• Scaled-up architecture
• Ageing hardware
• Sequential processing
• Low fault tolerance
• Stale data delivery
• 24 X 7 life support
The ask
SOS!
The ask
Not really!
The ask
Relevant data delivery – faster processing, parallelize independent tasks
Don’t marry the hardware
(just friends with benefits)
Performance as a configuration
(take your time, hurry up, choice is yours, don't be late)
Fail fast, recover faster
Onboard new customers quickly
Automated data delivery pipeline
Better maintainability – support and enhance the complex business logic
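The "parallelize independent tasks" and "performance as a configuration" asks can be illustrated with a minimal sketch, assuming a job made of tasks that do not depend on each other. `JobConfig`, `build_for_carrier` and the carrier codes are hypothetical names used only for illustration; the point is that the degree of parallelism is a setting rather than something baked into the code.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class JobConfig:
    # Hypothetical knob: raise it to "hurry up", lower it to "take your time".
    workers: int = 1


def build_for_carrier(carrier: str) -> str:
    # Stand-in for an independent unit of work (no shared state between carriers).
    return f"{carrier}: done"


def run_job(carriers, config: JobConfig):
    # The same code runs sequentially (workers=1) or in parallel (workers>1);
    # executor.map returns results in input order either way.
    with ThreadPoolExecutor(max_workers=config.workers) as executor:
        return list(executor.map(build_for_carrier, carriers))


carriers = ["AA", "BA", "LH", "SQ"]
slow = run_job(carriers, JobConfig(workers=1))
fast = run_job(carriers, JobConfig(workers=4))
print(slow == fast)
```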
So What?
Every project has complex business logic
But we have to generate hundreds of millions of records…
…which means we have a “big data” problem
And we are going to do whatever it takes…
OK Google... What is big data?
This is what we got!
Our problem was different
We have a big “data problem”
and the answers are a whole lot bigger!!!
So, why HPCC Systems?
Why not?
So, why HPCC Systems?
Our use case was data intensive and batch oriented
Embarrassingly parallel
ECL was built specifically for distributed data processing and gave us the fine control we needed
Been there, done that: a lot of real-world experience to tap into
Access to the HPCC Systems development team
It’s performant and maintainable
We did a proof of concept and validated the fit anyway
• A 45-minute job ran in 1 second
• A 4-hour job ran in 90 seconds
• The proof of concept, planned for 4 weeks, was completed in 4 days
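As a quick check on what those proof-of-concept timings imply, the short calculation below converts both jobs to seconds and derives the speedup factors. Only the figures quoted on the slide are used; the variable names are illustrative.

```python
# Figures taken from the proof-of-concept results above, converted to seconds.
jobs = {
    "45-minute job": (45 * 60, 1),
    "4-hour job": (4 * 60 * 60, 90),
}

speedups = {name: before / after for name, (before, after) in jobs.items()}
for name, factor in speedups.items():
    print(f"{name}: {factor:.0f}x faster")
```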
What did Bill have to say about it?
Why AWS?
Bring a multi-node HPCC Systems cluster up or down at a click of a button
Scale up or down with zero upfront cost
Validate multiple configurations for performance and choose the best
And…
• No need for data centers
• Pay as you use
• Go global
• Speed of computing
High level flow
Inside HPCC Systems
Data warehouse as Source of Truth
The data warehouse is the base on which our solution was built.
It follows a push-pull architecture:
• Push: raw data from different data sources is cleansed and transformed into data cubes.
• Pull: the cubes act as views that are used by downstream applications, e.g. the connection builder.
The data warehouse is the only way by which data enters the distributed data processing system.
All views follow a common interface through which data can be accessed.
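The push-pull idea with a common view interface can be sketched roughly as follows. This is a minimal illustration in Python rather than ECL, and every name in it (`View`, `ScheduleView`, `DataWarehouse`, the record fields and the cleansing rule) is a hypothetical stand-in, not the actual Hammer design.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class View(ABC):
    """Common interface that every view exposes to downstream applications."""

    @abstractmethod
    def rows(self) -> list[dict]:
        ...


class ScheduleView(View):
    # Hypothetical view holding cleansed schedule records.
    def __init__(self, cleansed: list[dict]):
        self._cleansed = cleansed

    def rows(self) -> list[dict]:
        return list(self._cleansed)


class DataWarehouse:
    def __init__(self):
        self._views: dict[str, View] = {}

    def push(self, name: str, raw: list[dict]) -> None:
        # Push side: cleanse and transform raw data, then publish it as a view.
        cleansed = [
            {key: value.strip().upper() for key, value in record.items()}
            for record in raw
            if all(value.strip() for value in record.values())  # drop incomplete records
        ]
        self._views[name] = ScheduleView(cleansed)

    def pull(self, name: str) -> View:
        # Pull side: downstream applications only ever see the View interface.
        return self._views[name]


warehouse = DataWarehouse()
warehouse.push("schedules", [
    {"carrier": " aa ", "origin": "maa", "destination": "lhr"},
    {"carrier": "ba", "origin": "", "destination": "jfk"},  # incomplete, dropped
])
print(warehouse.pull("schedules").rows())
```

Because consumers depend only on `View`, a new data source can be added on the push side without touching any downstream application.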
Lifecycle of a view in DW
How did we fare?
| Metrics | Measure (Legacy – UTG) | Measure (HPCC Systems) |
|---|---|---|
| Building connections (Singles) | 40 hours | 1 hour |
| Lines of Code | 26535 (Not including SQL) | 3973 |
| Delivery Frequency | Weekly | Daily (Possible) |
| Hardware | 24 GB and 12 cores for Batch Server; 384 GB and 24 cores for SQL Server | Thor Master + Middleware – 16 GB; Thor Slaves 64 GB – 16 cores across 4 nodes |
AWS
4.4 million
100 million
13.5 million
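The headline numbers in the comparison can be turned into ratios with a few lines of arithmetic. The sketch uses only the build times and line counts quoted above; since the legacy line count excludes SQL, the real reduction in code is presumably larger than the figure computed here.

```python
legacy_hours, hpcc_hours = 40, 1
legacy_loc, hpcc_loc = 26535, 3973  # legacy count excludes SQL

build_speedup = legacy_hours / hpcc_hours
loc_reduction = 1 - hpcc_loc / legacy_loc
print(f"{build_speedup:.0f}x faster, {loc_reduction:.0%} less code")
```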
Happy Side Effects
Data Warehouse as a framework for new data sources
Data Warehouse as an interface for downstream applications
Plug and play by design
File builder template – Blueprint for all data delivery jobs
Unit testing framework for HPCC Systems
Regression testing suite – Runs all tests in the code base and produces a report
We integrated a comparison testing tool from LNR into Hammer
An HPCC Systems cluster can now be built in AWS at the click of a button (Puppet)
Seamless sync between external FTP location and landing zone through S3
What next?
Questions?
Thank you
Reach out to me: Sunil.Babu@flightglobal.com
Useful links
Cognizant: http://www.cognizant.com
FlightGlobal http://www.flightglobal.com
HPCC Systems Portal: http://hpccsystems.com
Machine Learning: http://hpccsystems.com/ml
Online Training: http://learn.lexisnexis.com/hpcc
HPCC Systems Wiki & Red Book: https://wiki.hpccsystems.com
Our GitHub portal: https://github.com/hpcc-systems
Community Forums: http://hpccsystems.com/bb
Documentation: https://hpccsystems.com/download/documentation

Meetup: Case Study - HPCC Systems implementation for an Aviation company

  • 1. Hammer and beyond – An ensembling journey (WHT/082311 | ©2017, Cognizant)
  • 2. Who am I?
      Sunil Babu Peethambaram
      Architect, Cognizant Technology Solutions, CTSH (NASDAQ)
      Total IT experience – 13+ years
      Consulting with LexisNexis since 2013 (Chennai, Dayton, Buford, Alpharetta)
      Experience in HPCC Systems – more than 3 years
      Domains worked on:
      • Supply Chain Management
      • Logistics
      • Retail
        • Merchandise and Store operations
        • Order Management
        • Warehouse Management Systems
      • Insurance
      • Healthcare
      • Aviation
  • 3. Problem statement - How did it all start
      Build valid flight connections (VFC) based on direct flight schedules (DFS)
      DFS come in a proprietary encoded format
      DFS span 1000 carriers and over 4 million records
      DFS cover a year or more into the future
      DFS change every day, and VFC (potentially) need to be versioned every day
      Building VFC requires evaluating the feasibility of over 16 trillion potential connections
      Valid connections are identified by applying:
      • Circuitry
      • Cabotage
      • BIETA and LCC
      • Schedule conflicts
      • Over 100,000 MCT rules, applied in sequence
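The problem statement above can be illustrated with a small sketch. This is not the presenter's implementation (the later slides say the solution was built on HPCC Systems); it is a hypothetical Python example that assumes MCT means minimum connecting time, that rules are tried in order with the first match deciding, and that all times share one clock. Every field name and rule value is invented for illustration, and only the MCT check is modelled. The last line checks the slide's scale claim: pairing each of 4 million direct flights with every flight gives 4,000,000 squared, or 16 trillion, candidate pairs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Sequence


@dataclass(frozen=True)
class Flight:
    carrier: str
    origin: str
    destination: str
    departs: datetime  # assumed to be on one shared clock (e.g. UTC)
    arrives: datetime


@dataclass(frozen=True)
class MctRule:
    # None means "matches any value" (hypothetical wildcard convention).
    airport: Optional[str]
    carrier: Optional[str]
    minutes: int

    def matches(self, inbound: Flight) -> bool:
        return (self.airport in (None, inbound.destination)
                and self.carrier in (None, inbound.carrier))


def minimum_connect_time(rules: Sequence[MctRule], inbound: Flight) -> timedelta:
    """Try rules in sequence; the first matching rule decides (an assumption)."""
    for rule in rules:
        if rule.matches(inbound):
            return timedelta(minutes=rule.minutes)
    raise LookupError("no MCT rule matched; put a catch-all rule last")


def build_connections(flights: Sequence[Flight], rules: Sequence[MctRule]):
    """Naive pairwise join: every flight is tested against every flight."""
    connections = []
    for first in flights:
        for second in flights:
            if first.destination != second.origin:
                continue  # the two legs do not meet at an airport
            layover = second.departs - first.arrives
            if layover >= minimum_connect_time(rules, first):
                connections.append((first, second))
    return connections


flights = [
    Flight("AA", "JFK", "LHR", datetime(2017, 6, 1, 8, 0), datetime(2017, 6, 1, 15, 0)),
    Flight("BA", "LHR", "DEL", datetime(2017, 6, 1, 15, 30), datetime(2017, 6, 2, 0, 0)),
    Flight("BA", "LHR", "MAA", datetime(2017, 6, 1, 17, 0), datetime(2017, 6, 2, 2, 0)),
]
rules = [
    MctRule(airport="LHR", carrier=None, minutes=60),  # specific rule first
    MctRule(airport=None, carrier=None, minutes=45),   # catch-all last
]
valid = build_connections(flights, rules)
candidate_pairs_at_scale = 4_000_000 ** 2  # 16 trillion ordered pairs
```

With these toy values only JFK-LHR followed by LHR-MAA survives: the LHR-DEL departure leaves a 30-minute layover, below the assumed 60-minute rule for LHR.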
  • 4. The Legacy Setup
      • Complex Business Logic
      • Data intensive
      • .NET/SQL Server
      • Local datacenter
      • Scaled-up architecture
      • Ageing hardware
      • Sequential processing
      • Low fault tolerance
      • Stale data delivery
      • 24 X 7 life support
  • 5. The ask: SOS!
  • 6. The ask: Not really!
  • 7. The ask
      Relevant data delivery – faster processing, parallelize independent tasks
      Don’t marry the hardware (just friends with benefits)
      Performance as a configuration (take your time, hurry up, choice is yours, don't be late)
      Fail fast, recover faster
      Onboard new customers quickly
      Automated data delivery pipeline
      Better maintainability – support and enhance the complex business logic
  • 8. So What?
  • 9. Every project has complex business logic
  • 10. But we have to generate hundreds of millions of records…
  • 11. …which means we have a “big data” problem
  • 12. And we are going to do whatever it takes…
  • 13. OK Google... What is big data?
  • 14. This is what we got!
  • 15. Our problem was different
  • 16. We have a big “data problem” and the answers are a whole lot bigger!!!
  • 17. So, why HPCC Systems? Why not?
  • 18. So, why HPCC Systems?
      Our use case was data intensive and batch oriented
      Embarrassingly parallel
      ECL was built specifically for distributed data processing and gave us the fine control we needed
      Been there, done that: lots of real experience to tap into
      Access to the HPCC Systems development team
      It performs well and is maintainable
      We did a proof of concept and validated the fit anyway
      • A 45-minute job ran in 1 second
      • A 4-hour job ran in 90 seconds
      • A proof of concept planned for 4 weeks was completed in 4 days
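"Embarrassingly parallel" here can be read as: the work splits into pieces that need nothing from each other. The sketch below is one hypothetical way to picture that for connection building, and is not taken from the deck. It assumes flights can be grouped by connecting airport and that a flat 60-minute minimum layover applies; the schedule, the constant, and the function names are all invented. Threads stand in for cluster nodes purely to show the independence of the partitions, not to demonstrate a speedup.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# (origin, destination, departure_minute, arrival_minute): hypothetical toy schedule.
FLIGHTS = [
    ("JFK", "LHR", 480, 900),
    ("LHR", "DEL", 930, 1440),
    ("LHR", "MAA", 1020, 1560),
    ("BOS", "CDG", 500, 910),
    ("CDG", "MAA", 1000, 1500),
]
MIN_LAYOVER = 60  # assumed flat minimum layover, in minutes


def partition_by_hub(flights):
    """Group flights by connecting airport: arrivals into it, departures out of it."""
    hubs = defaultdict(lambda: {"inbound": [], "outbound": []})
    for flight in flights:
        origin, destination, _, _ = flight
        hubs[destination]["inbound"].append(flight)
        hubs[origin]["outbound"].append(flight)
    return hubs


def connect_at_hub(item):
    """Build connections for one hub; needs no data from any other hub."""
    _hub, legs = item
    return [
        (inbound, outbound)
        for inbound in legs["inbound"]
        for outbound in legs["outbound"]
        if outbound[2] - inbound[3] >= MIN_LAYOVER
    ]


def build_all(flights, workers=4):
    """Process every hub partition independently, then merge the results."""
    hubs = partition_by_hub(flights)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_hub = list(pool.map(connect_at_hub, hubs.items()))
    return sorted(pair for pairs in per_hub for pair in pairs)


connections = build_all(FLIGHTS)
```

Because no partition reads another partition's data, the number of workers changes only how long the job takes, not what it produces: `build_all(FLIGHTS, workers=1)` returns the same list.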
  • 19. What did Bill have to say about it?
  • 20. Why AWS?
      Bring a multi-node HPCC Systems cluster up or down at the click of a button
      Scale up or down with zero upfront cost
      Validate multiple configurations for performance and choose the best
      And…
      • No need for data centers
      • Pay as you use
      • Go global
      • Speed of computing
  • 21. High level flow
  • 22. Inside HPCC Systems - Data warehouse as Source of Truth
      The data warehouse is the base on which our solution was built.
      It follows a push-pull architecture:
      • Push: raw data from different data sources is cleansed and transformed into data cubes.
      • Pull: the cubes act as views that are used by downstream applications, e.g. the connection builder.
      The data warehouse is the only way data enters the distributed data processing system.
      All views follow a common interface through which data can be accessed.
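A minimal sketch of the push-pull idea, assuming a view is an object with a build step (push) and a read step (pull) behind one shared interface. The deck does not show the actual interface, so the class names, method names, and record fields below are hypothetical, and Python is used only for illustration.

```python
from abc import ABC, abstractmethod


class View(ABC):
    """Common interface that every data-warehouse view exposes (hypothetical)."""

    name: str

    @abstractmethod
    def build(self, raw_records):
        """Push side: cleanse and transform raw records into the stored cube."""

    @abstractmethod
    def read(self):
        """Pull side: return the rows that downstream applications consume."""


class ScheduleView(View):
    name = "direct_flight_schedules"

    def __init__(self):
        self._rows = []

    def build(self, raw_records):
        cleaned = []
        for record in raw_records:
            origin = record.get("origin", "").strip().upper()
            destination = record.get("destination", "").strip().upper()
            if origin and destination:  # drop records missing an endpoint
                cleaned.append({"origin": origin, "destination": destination})
        self._rows = cleaned

    def read(self):
        return list(self._rows)  # return a copy so consumers cannot alter the cube


class DataWarehouse:
    """Single entry point: data reaches consumers only through registered views."""

    def __init__(self):
        self._views = {}

    def register(self, view):
        self._views[view.name] = view

    def push(self, view_name, raw_records):
        self._views[view_name].build(raw_records)

    def pull(self, view_name):
        return self._views[view_name].read()


warehouse = DataWarehouse()
warehouse.register(ScheduleView())
warehouse.push("direct_flight_schedules", [
    {"origin": " jfk ", "destination": "lhr"},
    {"origin": "", "destination": "del"},  # rejected during cleansing
])
rows = warehouse.pull("direct_flight_schedules")
```

A downstream consumer such as a connection builder would call only `pull`, so a new data source can be added by registering another view without touching the consumer.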
  • 23. Lifecycle of a view in DW
  • 24. How did we fare?
      Metrics | Measure (Legacy – UTG) | Measure (HPCC Systems)
      Building connections (Singles) | 40 hours | 1 hour
      Lines of Code | 26535 (not including SQL) | 3973
      Delivery Frequency | Weekly | Daily (possible)
      Hardware | Batch Server: 24 GB and 12 cores; SQL Server: 384 GB and 24 cores | Thor Master + Middleware: 16 GB; Thor Slaves: 64 GB, 16 cores across 4 nodes; AWS
      4.4 million, 100 million, 13.5 million (slide figures; labels not captured)
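The ratios implied by the table can be worked out directly. The short calculation below uses only the figures quoted on the slide; the variable names are illustrative.

```python
# Figures quoted on the slide.
legacy_build_hours = 40
hpcc_build_hours = 1
legacy_lines_of_code = 26535  # the slide notes this excludes SQL
hpcc_lines_of_code = 3973

# How many times faster the connection build became.
build_speedup = legacy_build_hours / hpcc_build_hours

# How many times smaller the code base became, and the percentage reduction.
code_ratio = legacy_lines_of_code / hpcc_lines_of_code
code_reduction_pct = (1 - hpcc_lines_of_code / legacy_lines_of_code) * 100

print(f"Build time: {build_speedup:.0f}x faster")
print(f"Code size: {code_ratio:.1f}x smaller ({code_reduction_pct:.0f}% fewer lines)")
```

The build is 40 times faster and the code base is roughly 6.7 times smaller, about 85% fewer lines. Since the legacy count leaves out SQL, the real reduction in code is at least that large.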
  • 25. Happy Side Effects
      Data Warehouse as a framework for new data sources
      Data Warehouse as an interface for downstream applications
      Plug and play by design
      File builder template – blueprint for all data delivery jobs
      Unit testing framework for HPCC Systems
      Regression testing suite – can run all tests in the code base and provide a report
      We integrated a comparison testing tool from LNR into Hammer
      HPCC Systems cluster can now be built in AWS at the click of a button (Puppet)
      Seamless sync between the external FTP location and the landing zone through S3
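The deck does not describe how the comparison testing tool works. As a rough illustration of what comparison testing between a legacy output and a rebuilt output usually involves, here is a hypothetical sketch: records are matched on a key, then reported as missing, unexpected, or changed. The function name, record layout, and sample data are all assumptions.

```python
def compare_outputs(expected, actual, key):
    """Report records missing from, unexpected in, or changed between two outputs."""
    expected_by_key = {key(record): record for record in expected}
    actual_by_key = {key(record): record for record in actual}
    missing = sorted(expected_by_key.keys() - actual_by_key.keys())
    unexpected = sorted(actual_by_key.keys() - expected_by_key.keys())
    changed = sorted(
        k for k in expected_by_key.keys() & actual_by_key.keys()
        if expected_by_key[k] != actual_by_key[k]
    )
    return {"missing": missing, "unexpected": unexpected, "changed": changed}


# Hypothetical connection records from the old and the new pipeline.
legacy = [
    {"id": "JFK-LHR-MAA", "layover": 120},
    {"id": "BOS-CDG-MAA", "layover": 90},
    {"id": "JFK-LHR-DEL", "layover": 30},
]
rebuilt = [
    {"id": "JFK-LHR-MAA", "layover": 120},
    {"id": "BOS-CDG-MAA", "layover": 95},
    {"id": "SFO-NRT-MAA", "layover": 75},
]
report = compare_outputs(legacy, rebuilt, key=lambda record: record["id"])
```

An empty report in all three categories would mean the rebuilt pipeline reproduces the legacy output exactly for the compared run.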
  • 26. What next?
  • 27. What next?
  • 28. What next?
  • 29. What next?
  • 30. What next?
  • 31. What next?
  • 32. What next?
  • 33. What next?
  • 34. What next?
  • 35. Questions?
  • 36. Thank you
      Reach out to me: Sunil.Babu@flightglobal.com
      Useful links
      • Cognizant: http://www.cognizant.com
      • FlightGlobal: http://www.flightglobal.com
      • HPCC Systems Portal: http://hpccsystems.com
      • Machine Learning: http://hpccsystems.com/ml
      • Online Training: http://learn.lexisnexis.com/hpcc
      • HPCC Systems Wiki & Red Book: https://wiki.hpccsystems.com
      • Our GitHub portal: https://github.com/hpcc-systems
      • Community Forums: http://hpccsystems.com/bb
      • Documentation: https://hpccsystems.com/download/documentation