Meetup: Case Study - HPCC Systems implementation for an Aviation company
WHT/082311
©2017, Cognizant
Hammer and beyond – An ensembling journey
Who am I?
Sunil Babu Peethambaram
Architect, Cognizant Technology Solutions (NASDAQ: CTSH)
Total IT experience – 13+ years
Consulting with LexisNexis since 2013 (Chennai, Dayton, Buford, Alpharetta)
Experience in HPCC Systems – more than 3 years
Domains worked on:
• Supply Chain Management
• Logistics
• Retail: Merchandise and Store Operations, Order Management, and Warehouse Management Systems
• Insurance
• Healthcare
• Aviation
Problem statement - How did it all start
Build valid flight connections (VFC) based on direct flight schedules (DFS)
DFS come in a proprietary encoded format
DFS span 1,000 carriers and over 4 million records
DFS extend a year or more into the future
DFS change every day, so VFC may need to be versioned daily
Building VFC requires evaluating the feasibility of over 16 trillion potential connections
Valid connections are identified by applying:
• Circuitry
• Cabotage
• BIETA and LCC
• Schedule conflicts
• Over 100,000 MCT (minimum connecting time) rules, applied in sequence
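The connection-building step described above can be sketched in a few lines of Python. This is a minimal illustration, not the production logic: the `Flight` record, the MCT values, the maximum layover, and the "no immediate return to origin" check (a crude stand-in for the routing rules) are all assumptions made for the example. The actual rule set (cabotage, BIETA, LCC, and the 100,000+ MCT rules) is far richer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import permutations


@dataclass(frozen=True)
class Flight:
    number: str
    origin: str
    destination: str
    departs: datetime  # all times assumed to be UTC
    arrives: datetime


# Hypothetical rule values, for illustration only.
DEFAULT_MCT = timedelta(minutes=45)
MCT_BY_AIRPORT = {"LHR": timedelta(minutes=60)}
MAX_LAYOVER = timedelta(hours=6)


def is_valid_connection(first: Flight, second: Flight) -> bool:
    """Apply a few simplified validity rules to an ordered flight pair."""
    if first.destination != second.origin:
        return False  # the flights do not meet at a common airport
    if second.destination == first.origin:
        return False  # simplified routing check: no immediate return
    layover = second.departs - first.arrives
    mct = MCT_BY_AIRPORT.get(first.destination, DEFAULT_MCT)
    return mct <= layover <= MAX_LAYOVER


def build_connections(flights):
    """Brute force: test every ordered pair of distinct flights."""
    return [(a, b) for a, b in permutations(flights, 2)
            if is_valid_connection(a, b)]


flights = [
    Flight("AA100", "JFK", "LHR", datetime(2017, 6, 1, 8, 0), datetime(2017, 6, 1, 20, 0)),
    Flight("BA200", "LHR", "DEL", datetime(2017, 6, 1, 21, 30), datetime(2017, 6, 2, 10, 0)),
    Flight("BA201", "LHR", "DEL", datetime(2017, 6, 1, 20, 30), datetime(2017, 6, 2, 9, 0)),
    Flight("BA300", "LHR", "JFK", datetime(2017, 6, 1, 22, 0), datetime(2017, 6, 2, 6, 0)),
]
connections = build_connections(flights)
```

Only AA100 to BA200 survives here: BA201 leaves before the assumed 60-minute MCT at LHR, and BA300 returns to the origin. With about 4 million schedule records, the number of ordered pairs is roughly 4 million × 4 million, or 16 trillion, which appears to be where the figure above comes from. A real build would join on the connecting airport rather than enumerate every pair.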
The Legacy Setup
• Complex Business Logic
• Data intensive
• .NET/SQL Server
• Local datacenter
• Scaled-up architecture
• Ageing hardware
• Sequential processing
• Low fault tolerance
• Stale data delivery
• 24 X 7 life support
The ask
Relevant data delivery – faster processing, parallelize independent tasks
Don’t marry the hardware (just friends with benefits)
Performance as a configuration (take your time, hurry up, choice is yours, don't be late)
Fail fast, recover faster
Onboard new customers quickly
Automated data delivery pipeline
Better maintainability – support and enhance the complex business logic
So, why HPCC Systems?
Our use case was data intensive and batch oriented
Embarrassingly parallel
ECL was built specifically for distributed data processing and gave us the fine control we needed
Been there, done that: a lot of real-world experience to tap into
Access to the HPCC Systems development team
It’s performant and maintainable
We did a proof of concept and validated the fit anyway
• A 45-minute job ran in 1 second
• A 4-hour job ran in 90 seconds
• The proof of concept, planned for 4 weeks, was completed in 4 days
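To make "embarrassingly parallel" and "performance as a configuration" concrete, here is a small Python sketch. It is an analogy only: threads stand in for cluster nodes, and `partition`, `process_partition`, and `run` are hypothetical names invented for this example, not HPCC Systems or ECL APIs. The point is that when every partition can be processed independently, the degree of parallelism becomes a setting rather than a code change.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, parts):
    """Split records into `parts` round-robin slices."""
    return [records[i::parts] for i in range(parts)]


def process_partition(records):
    """Placeholder for per-record work that needs no other partition."""
    return [r * r for r in records]


def run(records, workers):
    """Process all partitions independently; `workers` is the only tuning knob."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = list(pool.map(process_partition, partition(records, workers)))
    return sorted(value for chunk in chunks for value in chunk)


# The result is the same regardless of how many workers are configured.
slow_and_cheap = run(list(range(10)), workers=1)
fast_and_wide = run(list(range(10)), workers=4)
```

In CPython, threads do not speed up CPU-bound work, so this shows only the shape of the idea. On a cluster, the partitions would sit on separate nodes and the worker count would correspond to the cluster size chosen at deployment.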
Why AWS?
Bring a multi-node HPCC Systems cluster up or down at a click of a button
Scale up or down with zero upfront cost
Validate multiple configurations for performance and choose the best
And…
• No need for data centers
• Pay as you use
• Go global
• Speed of computing
Inside HPCC Systems
Data warehouse as Source of Truth
The data warehouse is the base on which our solution was built.
It follows a push-pull architecture:
• Push: raw data from different data sources is cleansed and transformed into data cubes.
• Pull: the cubes act as views that are used by downstream applications, e.g. the Connection builder.
The data warehouse is the only way data enters the distributed data processing system.
All views follow a common interface through which data can be accessed.
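The push-pull idea can be sketched as follows. This is a minimal Python illustration under stated assumptions: the class names `View`, `CubeView`, and `Warehouse`, the `rows()` method, and the cleansing function are all hypothetical and chosen for the example. The actual solution was written in ECL and its interface is not shown in the slides.

```python
from abc import ABC, abstractmethod


class View(ABC):
    """Common interface that every view exposes to downstream applications."""

    @abstractmethod
    def rows(self):
        """Return an iterator over the cleansed rows of this view."""


class CubeView(View):
    def __init__(self, cube):
        self._cube = cube

    def rows(self):
        return iter(self._cube)


class Warehouse:
    """Single entry point for data: sources push in, consumers pull views out."""

    def __init__(self):
        self._cubes = {}

    def push(self, name, raw_rows, cleanse):
        # Drop empty records, then cleanse and store the rest as a cube.
        self._cubes[name] = [cleanse(row) for row in raw_rows if row]

    def view(self, name) -> View:
        return CubeView(self._cubes[name])


def cleanse_route(line):
    """Hypothetical cleansing step: trim and upper-case airport codes."""
    return tuple(part.strip().upper() for part in line.split(","))


warehouse = Warehouse()
warehouse.push("schedules", [" jfk,lhr ", "", "LHR,DEL"], cleanse_route)
schedule_rows = list(warehouse.view("schedules").rows())
```

A downstream consumer such as the connection builder depends only on `View.rows()`, so a new data source can be added by pushing another cube without changing the consumers.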
How did we fare?
Metrics | Measure (Legacy - UTG) | Measure (HPCC Systems)
Building connections (Singles) | 40 hours | 1 hour
Lines of Code | 26535 (not including SQL) | 3973
Delivery Frequency | Weekly | Daily (possible)
Hardware | Batch Server: 24 GB and 12 cores; SQL Server: 384 GB and 24 cores | Thor Master + Middleware: 16 GB; Thor Slaves: 64 GB, 16 cores across 4 nodes (AWS)
4.4 million
100 million
13.5 million
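The headline ratios implied by the table can be checked with a little arithmetic. The sketch below uses only figures from the table; the helper names `speedup` and `reduction_pct` are made up for this example.

```python
def speedup(before, after):
    """How many times faster the new measure is."""
    return before / after


def reduction_pct(before, after):
    """Percentage reduction from the old measure to the new one."""
    return 100 * (before - after) / before


# Building connections (Singles): 40 hours down to 1 hour.
build_speedup = speedup(40, 1)

# Lines of code: 26535 (legacy, excluding SQL) down to 3973.
loc_reduction = reduction_pct(26535, 3973)

print(f"Connection build: {build_speedup:.0f}x faster")
print(f"Lines of code: {loc_reduction:.1f}% fewer")
```

The build is 40 times faster, and the code base shrinks by about 85%. Because the legacy count excludes SQL, the real reduction in code is likely larger.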
Happy Side Effects
Data Warehouse as a framework for new data sources
Data Warehouse as an interface for downstream applications
Plug and play by design
File builder template: a blueprint for all data delivery jobs
Unit testing framework for HPCC Systems
Regression testing suite: can run all tests in the code base and provide a report
We integrated a comparison testing tool from LNR into Hammer
An HPCC Systems cluster can now be built in AWS at the click of a button (Puppet)
Seamless sync between external FTP location and landing zone through S3
Thank you
Reach out to me: Sunil.Babu@flightglobal.com
Useful links
Cognizant: http://www.cognizant.com
FlightGlobal: http://www.flightglobal.com
HPCC Systems Portal: http://hpccsystems.com
Machine Learning: http://hpccsystems.com/ml
Online Training: http://learn.lexisnexis.com/hpcc
HPCC Systems Wiki & Red Book: https://wiki.hpccsystems.com
Our GitHub portal: https://github.com/hpcc-systems
Community Forums: http://hpccsystems.com/bb
Documentation: https://hpccsystems.com/download/documentation