Future of Data meetup#5
With Cognizant
Abdelkrim Hadjidj
Solution Engineer - Hortonworks
@ahadjidj
© Hortonworks Inc. 2011 – 2015. All Rights Reserved
Agenda
• Thanks to Cognizant for hosting us and for the refreshments
• We are 1,085 members today!
• Feel free to contact us if you would like to take part in upcoming editions (presentation, demo, hosting, etc.)
• Agenda
– Real life use cases from across Europe (Walid Aoudi - Cognizant)
– Lessons learnt from running an enterprise data lake for 70 teams (Matthias Kluba - Société Générale)
– BI on Hadoop: which tool for which use case (Matthieu Lamaraisse - Hortonworks)
The Industry’s Premier Big Data Community Event
DataWorks Summit Berlin
April 16–19, 2018
Estrel Hotel, Sonnenallee, Berlin, Germany
© 2018 Cognizant
March 8th, 2018
Real life use cases from across Europe
Cognizant’s Presentation
Walid Aoudi – Ph.D, Big Data Architect & Data Scientist
Corporate Overview
• ~244,300 employees (as of Jun 30, 2016)
• 100+ global delivery centers
• Revenue: $12.42B in 2015 (21% YoY); $3.37B in Q2 2016
• Market capitalization: ~$35B
• Fortune’s Most Admired Companies: 8 years in a row
• Financial Times Global 500: 281
• Forbes Fast Tech 25: 18
• Newsweek’s 2016 World Green Rankings: 101
• Fortune 500: 230
• Forbes Global 2000: 529
AIM: The global leader in data and analytics
• Leader in Data Management and BI Service Providers; rated #2 on Strategy
• Global Top 4 in Business Analytics Services & Leaders in BA and SI service providers
• Top 5 Service Provider for Enterprise Data Management & Business Analytics Services
• Leader in Healthcare Payer, Global Banking and Global Insurance Big Data and Analytics IT Services
• Top 3 Service Provider in Enterprise Analytics Services
• Top Analytics IT Service Provider of Analytics IT Consulting Providers
• Leader in US Healthcare Payer BI Consulting and Outsourcing Services
• Top 3 among MDM System Integrators
• 20,000+* Consultants: one of the largest Analytics Practices in the industry
• 650+* Active customers, including several Fortune 500 companies
• 150+* IP based Assets (Platforms, Solutions, Tools and Frameworks)
• 2000+* Specialists: Domain Experts, Masters Degree & PhD holders
• 800+* Trained Data Scientists
Big Data and AI Capability
70+ Clients | 150+ Engagements | 1600+ Consultants | 150+ Use Cases Repository
Awards
• Won Chairman’s Award @ Leading Card & Travel Services Firm
• Innovation & Business Value Award @ Leading Managed Healthcare Company
• Informatica 2016 partner award
• MarkLogic 2016 – Partner Excellence Award in US & EMEA
Impacting Businesses through Big Data
• ~ $30 M churn reduction through machine learning
• 75% improved and real-time fraud prediction
• ~ $1 M YoY savings through Big Data offloading
• $2 M increased cross-sell and up-sell opportunities
Innovation
Data Ingestion Workbench | BAVA | iSMART | BRAVO
Analyst recognition
• “Leader” in Gartner Magic Quadrant for Business Analytics Services, Worldwide 2014
• Rated as a Leader by Everest’s PEAK matrix in their “Big Data and Analytics assessment” for HC, Banking and Insurance domains
• Winners Circle – Top 7 Service Providers, BFS Analytics Services 2016, HfS Blueprint Report
Partnerships
SightPrisim
Data Lake on Hadoop for Downstream Systems @ Leading Energy and Home Services provider (1/3)
Key Highlights
• 1 data repository for enterprise wide to customer data
• Data ingestion from 1300+ SAP tables, 10K tables in total
• 250 users, 150 applications, 100 node cluster
• Total data volume: 2 Petabytes on Prod
Business Drivers
• Change data capture from SAP – Oracle based source system
• Business SLA for delivering data for reporting and analytics was not being met
• Separate storage and processing for downstream tools like SAS ETL and reporting
Solution Highlights
• Sources such as SAP, CRM and SMART meter logs were integrated on the data lake
• Implemented SCD on a file system; existing SAS code was executed against Hadoop with zero rework
• Used the ODI journal table mechanism to capture changes in batches and met the business SLA with the power of Hadoop parallelism
Business Outcome
• 25% decrease in query processing latency
• 20% lower cost for storage, plus a performance gain in data processing
• With the use of the SCD concept, zero rework downstream
Technology Stack: HDP | Pig | Hive | Sqoop | SAS | ODI for CDC | Qlikview
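The slowly changing dimension (SCD) handling and batched change capture mentioned under Solution Highlights can be illustrated with a minimal Python sketch. It is not the project's code: it assumes a Type 2 SCD (the slide does not say which type was used), and the record layout (`key`, `attrs`, `valid_from`, `valid_to`) and the `apply_change_batch` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Version:
    """One historical version of a source row (hypothetical layout)."""
    key: str
    attrs: Dict[str, str]
    valid_from: int                  # batch number in which this version appeared
    valid_to: Optional[int] = None   # None means "still current"


def apply_change_batch(history: List[Version],
                       changes: Dict[str, Dict[str, str]],
                       batch_no: int) -> List[Version]:
    """Apply one batch of captured changes as an SCD Type 2 update.

    Existing versions are never overwritten: a changed key gets its current
    version closed and a new version appended, so earlier states stay readable.
    """
    result: List[Version] = []
    for version in history:
        is_current = version.valid_to is None
        changed = version.key in changes and changes[version.key] != version.attrs
        if is_current and changed:
            result.append(replace(version, valid_to=batch_no))  # close old version
        else:
            result.append(version)
    current = {v.key: v.attrs for v in history if v.valid_to is None}
    for key, attrs in changes.items():
        if current.get(key) != attrs:   # new key, or attributes differ
            result.append(Version(key, attrs, valid_from=batch_no))
    return result


# Illustrative data only: one existing customer, then a batch with a
# change to C1 and a brand-new key C2.
history = [Version("C1", {"tariff": "standard"}, valid_from=1)]
history = apply_change_batch(
    history,
    {"C1": {"tariff": "green"}, "C2": {"tariff": "standard"}},
    batch_no=2,
)
```

Because old versions are closed rather than rewritten, code that reads the history keeps working when new batches arrive, which is one plausible reading of the "zero rework downstream" outcome.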
Data Lake on Hadoop for Downstream Systems @ Leading Energy and Home Services provider (2/3)
Solution Approach
• Pushdown of all the SAS ETL code and reporting code directly onto Hadoop
• Hadoop architecture worked seamlessly on commodity hardware, saving the cost of expensive Teradata upgrades
• Use of Hadoop ecosystem products saved the cost of expensive ETL licensing
• POC with a 10 node HDP 1.0 cluster (hosted in an HP data center) with credit risk data
• Added new data (Headend (xml), pulse (mobile app field engineer), website)
• Added new use cases (campaign management, record matching)
Solution Architecture
[Architecture diagram. Labels: Data Sources (SAP, CRM, Telephony, DYNO, QAMS, ISSAC, Salesforce, File Feeds, Other Feeds); Landing; History Capture; Transformation Layer; Reporting (Excel Reports); HiveQL]
• Tables required by the application are brought into the Data Lake using the Ingestion Framework
• Only tables that receive updates and deletes are added to History Capture (insert-only tables are added to Landing only)
• Views are created from the Landing and History Capture layers
• Any complex transformation and business logic is created as new Hive tables
• The Transformation Layer is a single interface for any reporting / data extracts
• Excel reports connect to the Data Lake using the ODBC driver for Hive
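The view built from the Landing and History Capture layers can be sketched in plain Python. This is an assumption about how such a view might behave, not the project's Hive code: the record shape (key, sequence number, operation, payload) and the `current_view` function are hypothetical, and the rule "latest record per key wins, deletes drop out" is one common convention.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shape: (key, sequence number, operation, payload),
# where operation is "I" (insert), "U" (update) or "D" (delete).
Record = Tuple[str, int, str, Dict[str, str]]


def current_view(landing: Iterable[Record],
                 history_capture: Iterable[Record]) -> List[Record]:
    """Return the latest non-deleted record per key across both layers."""
    latest: Dict[str, Record] = {}
    for record in list(landing) + list(history_capture):
        key, seq = record[0], record[1]
        if key not in latest or seq > latest[key][1]:
            latest[key] = record
    # A key whose most recent operation is a delete disappears from the view.
    live = [r for r in latest.values() if r[2] != "D"]
    return sorted(live, key=lambda r: r[0])


# Illustrative data only.
landing = [("A", 1, "I", {"status": "new"}), ("B", 1, "I", {"status": "new"})]
history_capture = [("A", 2, "U", {"status": "active"}), ("B", 3, "D", {})]
view = current_view(landing, history_capture)
```

In Hive, the same idea is usually expressed as a view that ranks rows per key with a window function such as `ROW_NUMBER()` and keeps the top-ranked row.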
Data Lake on Hadoop for Downstream Systems (Smart Meter Analytics Implementation) @ Leading Energy and Home Services provider (3/3)
Hortonworks Hadoop Data Lake Based Smart Meter Analytics
Project Highlights:
• Reduce TCO
• Operational Efficiencies
• Predictive Maintenance
• Drive Load Reduction Programs
• Early Settlements
• Customer Top-Up Prediction
• Error Free Billing and ½ hourly consumption breakdowns
Solution:
• Integrated 60,000 smart meters and 3 million half-hourly records on a daily basis
• Spark / R based statistical models
• Qlik based consumption dashboards
Business Drivers: Cognizant Solution:
• Tiered Pricing – Free Weekends
• Customer segmentation based on usage
• Proactively identify malfunctioning smart meters and repair them in time
• Improved forecasting for all smart meters, marketing and customer behavioral insight
• Prioritization of workforce planning for field staff that repair/replace smart meters at 0.014% from 1% of 1.2 million meters in a 45 day period
• Improved visibility and transparency of home energy consumption for end customers, by providing usage breakdown, similar homes comparison and insight
Benefits Delivered:
• 10% decrease in network audit costs
• Prediction of last mile consumption spikes with an accuracy of 89%
• Increase of up to 80% in prediction of consumption leakage
• Computer Weekly Best big data, BI and analytics project of the year
• Business dashboards processed files with over 4 billion records and fetch a staggering 45 million records daily
• 9 billion records available to the customer
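The daily volume quoted above follows from the half-hourly cadence: 60,000 meters times 48 half-hour slots is about 2.9 million records per day, which rounds to the 3 million on the slide. The short Python sketch below works through that arithmetic and adds a simple completeness check for spotting meters with missing readings. The check is an illustrative assumption, not the Spark / R models used in the project, and `meters_with_gaps` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable, List, Tuple

HALF_HOURS_PER_DAY = 24 * 2     # 48 readings per meter per day
METERS = 60_000                 # figure quoted on the slide

# 60,000 * 48 = 2,880,000 records per day, roughly 3 million.
expected_daily_records = METERS * HALF_HOURS_PER_DAY


def meters_with_gaps(readings: Iterable[Tuple[str, int]],
                     expected: int = HALF_HOURS_PER_DAY) -> List[str]:
    """Return IDs of meters that reported fewer than `expected` distinct slots.

    `readings` holds (meter_id, slot) pairs for one day, slot in 0..47.
    Duplicate pairs are counted once. Meters with no readings at all do not
    appear in the input, so they need a separate check against a meter list.
    """
    counts = Counter(meter_id for meter_id, _slot in set(readings))
    return sorted(m for m, n in counts.items() if n < expected)


# Illustrative data only: M1 reports a full day, M2 stops after 40 slots.
day = [("M1", s) for s in range(48)] + [("M2", s) for s in range(40)]
suspect = meters_with_gaps(day)
```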
Key Highlights
• 100+ TB and 20 SoRs
• 3 countries and all lines of business
• Single entry ingestion through Kafka
Business Drivers
• Three existing country-level legacy DW systems on disparate technologies, with multiple views of information, data quality issues and manual dependencies
• Challenges to a unified, centralized data platform for deriving business insights
• Increase in business demand for data-driven applications that leverage the overwhelming growth of data and advances in technology
Solution Highlights
• Cloud-based environment for all processes in the software lifecycle (development environment, Git, cluster, data marts, etc.)
• Common framework for all reusable components to be used in the cloud
• Unlimited data storage in Azure Data Lake for a variety of data (structured, unstructured, semi-structured)
• Kafka and Spark Streaming for real-time data ingestion
• Confluent repository for all schema management
• ARM-based environment creation with all security, authentication and authorization using GitLab and Chef
• Technology Stack: Microsoft Azure HDInsight | StreamSets | Kafka | Spark | Apache Beam | Chef | GitLab
Business Outcomes
• Integrated ecosystem addressing the full breadth of enterprise applications, analytics, users and use cases
• Expected positive shift in customer engagement, products, pricing and offers, claims management, etc.
• Faster and more accurate business insights to aid rapid and precise decision making
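The combination of a single ingestion entry point and centrally managed schemas can be sketched as follows. This is a plain-Python illustration under stated assumptions, not Kafka, Spark Streaming or Confluent API code: the in-memory `SCHEMAS` dictionary stands in for a schema registry, and the topic name, field names, `conforms` and `ingest` are all hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical in-memory stand-in for a schema registry: topic -> field types.
SCHEMAS: Dict[str, Dict[str, type]] = {
    "policies": {"policy_id": str, "country": str, "premium": float},
}


def conforms(record: Dict[str, Any], schema: Dict[str, type]) -> bool:
    """True when the record has exactly the schema's fields with matching types."""
    return (record.keys() == schema.keys()
            and all(isinstance(record[name], kind)
                    for name, kind in schema.items()))


def ingest(topic: str, batch: List[Dict[str, Any]]
           ) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Split one micro-batch into (accepted, rejected) records for a topic."""
    schema = SCHEMAS[topic]
    accepted = [r for r in batch if conforms(r, schema)]
    rejected = [r for r in batch if not conforms(r, schema)]
    return accepted, rejected


# Illustrative data only: the second record is missing the "premium" field.
batch = [
    {"policy_id": "P1", "country": "FR", "premium": 120.0},
    {"policy_id": "P2", "country": "DE"},
]
accepted, rejected = ingest("policies", batch)
```

Validating every record at the one entry point keeps malformed data from the three country systems out of the shared lake, at the cost of having to maintain a schema for each topic.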
Network Architecture
Automation (DevOps)
Financial DataLake Implementation @ Global Custodian
Key Highlights
• 100+ source systems
• 4 key domains – Fund Distribution, Financial Reporting, MIS KPI, Social Media
• Total data volume: 150+ TB and growing
• Internal and external data
Business Drivers
• Enable fund managers to gain efficiency and focus on their core business
• Help customers respond to questions in 4 key areas (Fund Distribution, Financial Reporting, MIS KPI, Social Media)
• Enable analysis of investor behavior
Solution Highlights
• Designed & developed a business-specific Data Management Platform on Hadoop
• Enabled end-to-end data acquisition / storage / processing / visualization for end users in a fully secured mode, at the lowest possible granularity level for data access (cell level)
• Improved performance and robustness of the platform, enabling real-time in-memory analytics
Business Outcome
• A wide range of real-time data on all funds, integrating several years of history
• A complete dashboard allowing customers to zoom in on a specific element such as a particular investor, a fund, a date, a geographical area or a currency
• A customization service allowing each client to adapt the tool to its own specific needs
Technology Stack: HDFS | HBase | Scaled Risk | Tableau | Kerberos | Ranger
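Cell-level access control, the granularity mentioned under Solution Highlights, means that visibility is decided per value rather than per table or row. The Python sketch below shows the idea with label sets. It is a hypothetical model, not the HBase, Ranger or Scaled Risk API, and the column names, labels and `visible_cells` function are invented for illustration.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical model: a cell is (value, labels); a reader must hold every
# label attached to a cell in order to see its value.
Cell = Tuple[str, FrozenSet[str]]
Row = Dict[str, Cell]


def visible_cells(row: Row, authorizations: FrozenSet[str]) -> Dict[str, str]:
    """Return only the columns of `row` whose labels the reader fully holds."""
    return {column: value
            for column, (value, labels) in row.items()
            if labels <= authorizations}


# Illustrative data only.
fund_row: Row = {
    "fund_name": ("Fund A", frozenset()),
    "nav": ("102.4", frozenset({"reporting"})),
    "investor": ("Investor X", frozenset({"reporting", "distribution"})),
}
reporting_view = visible_cells(fund_row, frozenset({"reporting"}))
```

Two users querying the same row can therefore receive different columns, which is what lets one platform serve several business domains without duplicating the data.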
Financial DataLake Implementation @ Global Custodian
[Architecture diagram. Labels: Internal Data (MF, MP, IC, Fusion, AAA, gpms, CRD, URD, amadeus, RBC); External Data sources (Bisam, RMX); Data Integration (HDFS NFS Gateway, custom); Central Data Repository on HDFS (Landing Zone, Raw Data Reservoir, Certified Data Layer); Security Layer (SPNEGO, SSL); Data Consumption Business Layer (Financial Reporting: Standard Report, VAR Report, Accounting Report, Fund Factsheet, Solvency 2 View; Business Views: BU_1 View … BU_N View; Self Service BI; Analytics); Downstream (Web Portal / Intranet, File Transfer, Email, API / Data as a service)]
We’re Hiring!

Paris FOD Meetup #5 Cognizant Presentation

  • 1. Future of Data meetup#5 With Cognizant Future of Data meetup#5 With Cognizant Abdelkrim HadjidjAbdelkrim Hadjidj Solution Engineer - Hortonworks @ahadjidj © Hortonworks Inc. 2011 – 2015. All Rights Reserved
  • 2. Agenda  Merci à Cognizant pour leur accueil et pour la collation  On est 1085 membres aujourd’hui !!  N’hésitez pas à nous contacter si vous souhaitez participer au prochains numéros (présentation, démo, hébergement, etc)  Agenda – Real life use cases from across Europe (Walid Aoudi - Cognizant) – Lessons learnt from running an enterprise data lake for 70 teams (Matthias Kluba - Société Générale) – BI on Hadoop: which tool for which use case (Matthieu Lamaraisse - Hortonworks)
  • 3. The Industry’s Premier Big Data Community Event DataWorks Summit Berlin April 16–19, 2018 Estrel Hotel, Sonnenallee, Berlin, Germany
  • 4. © 2018 Cognizant © 2018 Cognizant March 8th, 2018 Real life use cases from across Europe Cognizant’s Presentation Walid Aoudi – Ph.D, Architecte Big Data & DataScientist
  • 5. © 2018 Cognizant Corporate Overview ~244,300 Employees (as of Jun 30, 2016) ……………………….. 100+ Global Delivery Centers ……………………. . . . .………………….. ……………………… . Revenue $12.42B in 2015 (21% YoY) $3.37B in Q2 2016 Market Capitalization ~ $35B Fortune’s Most Admired Companies Years in a Row 8 Financial Times Global 500 281 Forbes Fast Tech 25 18 Newsweek’s 2016 World Green Rankings 101 Fortune 500 230 Forbes Global 2000 529
  • 6. © 2018 Cognizant Leader In Data Management and BI Service Providers Rated #2 on Strategy AIM : The global leader in data and analytics Global Top 4 in Business Analytics Services & Leaders in BA and SI service providers Top 5 Service Provider for Enterprise Data Management & Business Analytics Services Leader in Healthcare Payer Global Banking and Global Insurance Big Data and Analytics IT Services 20,000+* Consultants One of the largest Analytics Practice in the Industry 650+* Active customers including several Fortune 500 companies 150+* IP based Assets (Platforms, Solutions, Tools and Frameworks) Top 3 Service Provider in Enterprise Analytics Services Top Analytics IT Service Provider of Analytics IT Consulting Providers Leader in US Healthcare Payer BI Consulting and Outsourcing Services Top 3 among MDM System Integrators 2000+* Specialists Domain Experts, Masters Degree & PhD holders 800+* Trained Data Scientists
  • 7. © 2018 Cognizant7 Big Data and AI Capability 70+ Clients 150+ Engagements 1600+ Consultants 150+ Use Cases Repository  Won Chairman’s Award @ Leading Card & Travel Services Firm  Innovation & Business Value Award @ Leading Managed Healthcare Company  Informatica 2016 partner award  MarkLogic 2016 – Partner Excellence Award in US & EMEA Analystrecognition Awards Impacting Businesses through Big Data  ~ $ 30 M churn reduction through machine learning  75% improved and real-time fraud prediction  ~ $ 1 M YoY savings through Big data offloading  $ 2 M increased cross-sell and up sell opportunities Innovation Data Ingestion Workbench BAVA iSMART BRAVO “Leader” Gartner Magic Quadrant for Business Analytics Services, Worldwide 2014 Rated as a Leader by Everest’s PEAK matrix in their “Big Data and Analytics assessment” for HC, Banking and Insurance domains Innovation & Business Value Award @ Leading Managed Healthcare Company Winners Circle – Top 7 Service Providers BFS Analytics Services 2016 HfS Blueprint Report Partnerships SightPrisim
  • 8. © 2018 Cognizant8 Data Lake on Hadoop for Downstream Systems @ Leading Energy and Home Services provider (1/3) Key Highlights Business Drivers  Change data capture from SAP –Oracle based source system  Not meeting business SLA in delivering the data for reporting and analytics  Separate storage and processing for downstream tools like SAS ETL and reporting 1 data repository for enterprise wide to customer data Data ingestion from 1300+ SAP tables, 10K tables in total Solution Highlights Sources such as SAP, CRM and SMART meter logs were integrated on data lake Implemented SCD on a file system and existing SAS code was executed against Hadoop with no or zero rework Used ODI journal table mechanism for capturing changes in batches & met business SLA with the power of Hadoop parallelism Business Outcome  25% Decrease in query processing latency  20% Lower cost for storage and performance gain in data processing  With the use of SCD concept, no or zero rework downstream Technology Stack: HDP | Pig | Hive | Sqoop | SAS | ODI for CDC | Qlikview 250 users, 150 applications, 100 node cluster Total data volume 2 Petabytes on Prod
  • 9. Data Lake on Hadoop for Downstream Systems @ Leading Energy and Home Services provider (2/3)
      Solution Approach
      – Push down all SAS ETL and reporting code directly onto Hadoop
      – The Hadoop architecture ran seamlessly on commodity hardware, avoiding the cost of expensive Teradata upgrades
      – Using Hadoop ecosystem products avoided the cost of expensive ETL licenses
      – POC on a 10-node HDP 1.0 cluster (hosted in an HP data center) with credit risk data
      – Added new data: Headend (XML), pulse (field engineer mobile app), website
      – Added new use cases: campaign management, record matching
      Solution Architecture
      [Diagram: Data Sources (SAP, CRM, Telephony, DYNO, QAMS, ISSAC, Salesforce, file feeds, other feeds) → Landing → History Capture → Transformation Layer (HiveQL) → Reporting (Excel reports)]
      – Tables required by an application are brought into the data lake through the ingestion framework
      – Only tables that receive updates and deletes are added to History Capture; insert-only tables are added to Landing only
      – A view is created over the Landing and History Capture layers
      – Complex transformations and business logic are created as new Hive tables
      – The Transformation Layer is the single interface for all reporting and data extracts
      – Excel reports connect to the data lake through the ODBC driver for Hive
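The layering rules on this slide (insert-only tables go to Landing, tables with updates and deletes also go to History Capture, and a view sits on top of both) can be illustrated with a small sketch. The slide does not say how the view combines the two layers, so the rule used here, the latest non-deleted row per key, is an assumption, and the row fields `id`, `seq` and `deleted` are hypothetical.

```python
from typing import Dict, List


def layers_for(has_updates_or_deletes: bool) -> List[str]:
    """Insert-only tables land in Landing only; other tables also get History Capture."""
    if has_updates_or_deletes:
        return ["landing", "history_capture"]
    return ["landing"]


def current_view(landing: List[dict], history_capture: List[dict]) -> List[dict]:
    """Return the latest non-deleted row per key across both layers.

    Rows are dicts with hypothetical fields: `id` (business key),
    `seq` (load order) and an optional `deleted` flag.
    """
    latest: Dict[str, dict] = {}
    for row in sorted(landing + history_capture, key=lambda r: r["seq"]):
        latest[row["id"]] = row          # rows loaded later override earlier ones
    return [r for r in latest.values() if not r.get("deleted", False)]
```

In the project itself this logic would live in a Hive view rather than in Python; the sketch only shows the row-selection rule such a view might encode.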
  • 10. Data Lake on Hadoop for Downstream Systems (Smart Meter Analytics Implementation) @ Leading Energy and Home Services provider (3/3)
      Project Highlights
      – Reduce TCO
      – Operational efficiencies
      – Predictive maintenance
      – Drive load reduction programs
      – Early settlements
      – Customer top-up prediction
      – Error-free billing and half-hourly consumption breakdowns
      Hortonworks Hadoop Data Lake Based Smart Meter Analytics Solution
      – Integrated 60,000 smart meters and 3 million half-hourly records per day
      – Spark / R based statistical models
      – Qlik-based consumption dashboards
      Business Drivers / Cognizant Solution
      – Tiered pricing (free weekends)
      – Customer segmentation based on usage
      – Proactive identification and timely repair of malfunctioning smart meters
      – Improved forecasting for all smart meters, plus marketing and customer behavior insight
      – Prioritized workforce planning for the field staff who repair or replace smart meters, at 0.014%, down from 1%, of 1.2 million meters in a 45-day period
      – Improved visibility and transparency of home energy consumption for end customers, through usage breakdowns, comparison with similar homes and insights
      Benefits Delivered
      – 10% decrease in network audit costs
      – Prediction of last-mile consumption spikes with 89% accuracy
      – Up to 80% increase in prediction of consumption leakage
      – Computer Weekly award: best big data, BI and analytics project of the year
      – Business dashboards process files with over 4 billion records and fetch 45 million records daily
      – 9 billion records available to the customer
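A quick sanity check of the volumes on this slide: 60,000 meters reporting every half hour give 48 readings per meter per day, which is consistent with the "3 million half-hourly records" figure. The second part of the sketch converts the 1% and 0.014% figures into meter counts, assuming they are shares of the 1.2 million meter fleet; the slide's wording is ambiguous, so that reading is an assumption.

```python
METERS_INTEGRATED = 60_000
HALF_HOURS_PER_DAY = 24 * 2            # one reading every 30 minutes

# Daily half-hourly readings for the integrated meters.
daily_readings = METERS_INTEGRATED * HALF_HOURS_PER_DAY   # 2,880,000, roughly 3 million

FLEET = 1_200_000


def meters_at(rate_per_100k: int) -> int:
    """Number of meters for a rate expressed per 100,000 meters (integer math)."""
    return FLEET * rate_per_100k // 100_000


before = meters_at(1_000)   # 1% is 1,000 per 100,000, i.e. 12,000 meters
after = meters_at(14)       # 0.014% is 14 per 100,000, i.e. 168 meters

print(daily_readings, before, after)
```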
  • 11. Key Highlights
      – 100+ TB and 20 SoRs
      – 3 countries and all lines of business
      – Single-entry ingestion through Kafka
      Business Drivers
      – Three existing country-level legacy DW systems on disparate technologies, with multiple views of the same information, data quality issues and manual dependencies
      – Obstacles to a single, unified, centralized data platform for deriving business insights
      – Growing business demand for data-driven applications that take advantage of the rapid growth of data and advances in technology
      Solution Highlights
      – Cloud-based environment for every process in the software lifecycle (development environment, Git, cluster, data marts, etc.)
      – Common framework for all reusable components used in the cloud
      – Unlimited storage in Azure Data Lake for all varieties of data (structured, unstructured, semi-structured)
      – Kafka and Spark Streaming for real-time data ingestion
      – Confluent repository for all schema management
      – ARM-based environment creation, including security, authentication and authorization, using GitLab and Chef
      – Technology Stack: Microsoft Azure HDInsight | StreamSets | Kafka | Spark | Apache Beam | Chef | GitLab
      Business Outcomes
      – Integrated ecosystem addressing the full breadth of enterprise applications, analytics, users and use cases
      – Expected positive shift in customer engagement, products, pricing and offers, claims management, etc.
      – Faster and more accurate business insights to support rapid, precise decision making
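The combination of "single-entry ingestion through Kafka" and a central schema repository suggests a pattern in which every record passes one gate that checks it against a registered schema before it is published. The sketch below illustrates that idea with in-memory stand-ins only: the `SCHEMAS` dictionary, the topic naming (`<subject>.raw`, `<subject>.rejected`) and the `publish` callback are all hypothetical and are not the Kafka or Confluent APIs.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a schema repository: subject -> required fields and types.
SCHEMAS: Dict[str, Dict[str, type]] = {
    "claims": {"claim_id": str, "country": str, "amount": float},
}


def validate(subject: str, record: dict) -> List[str]:
    """Return a list of problems; an empty list means the record matches the schema."""
    schema = SCHEMAS.get(subject)
    if schema is None:
        return [f"unknown subject {subject!r}"]
    errors = []
    for name, expected in schema.items():
        if name not in record:
            errors.append(f"missing field {name!r}")
        elif not isinstance(record[name], expected):
            errors.append(f"field {name!r} is not {expected.__name__}")
    return errors


def ingest(subject: str, record: dict, publish: Callable[[str, dict], None]) -> bool:
    """Single entry point: valid records go to a raw topic, others to a rejected topic."""
    errors = validate(subject, record)
    if errors:
        publish(f"{subject}.rejected", {"record": record, "errors": errors})
        return False
    publish(f"{subject}.raw", record)
    return True
```

In a real deployment `publish` would wrap a Kafka producer and the schema lookup would go to the schema repository; keeping both behind small functions is what makes the entry point single.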
  • 15. Financial DataLake Implementation @ Global Custodian
      Key Highlights
      – 100+ source systems
      – 4 key domains: Fund Distribution, Financial Reporting, MIS KPI, Social Media
      – Total data volume 150+ TB and growing
      – Internal and external data
      Business Drivers
      – Enable fund managers to work more efficiently and to focus on their core business
      – Help customers answer questions in 4 key areas (Fund Distribution, Financial Reporting, MIS KPI, Social Media)
      – Enable analysis of investor behavior
      Solution Highlights
      – Designed and developed a business-specific data management platform on Hadoop
      – Enabled end-to-end data acquisition, storage, processing and visualization for end users, fully secured, with data access controlled at the lowest possible granularity (cell level)
      – Improved the performance and robustness of the platform, enabling real-time in-memory analytics
      Business Outcome
      – A wide range of real-time data on all funds, with several years of history
      – A complete dashboard that lets customers zoom in on a specific element, such as an investor, a fund, a date, a geographical area or a currency
      – A customization service that lets each client adapt the tool to its own specific needs
      Technology Stack: HDFS | HBase | Scaled Risk | Tableau | Kerberos | Ranger
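Cell-level access control, as mentioned in the solution highlights, means that two users reading the same row can see different columns. The following sketch shows the idea with plain Python data structures; the label names and the "user must hold every label on the cell" rule are assumptions for illustration, and this is not the HBase, Ranger or Scaled Risk API.

```python
from typing import Dict, FrozenSet, Tuple

# A cell is a value plus the set of labels a reader must hold to see it (assumed model).
Cell = Tuple[object, FrozenSet[str]]


def visible_row(row: Dict[str, Cell], user_labels: FrozenSet[str]) -> Dict[str, object]:
    """Return only the cells whose required labels are all held by the user."""
    return {
        column: value
        for column, (value, required) in row.items()
        if required <= user_labels          # subset test: every required label is held
    }


# Hypothetical fund row: the fund name is public, NAV needs "finance",
# and the investor identity needs both "finance" and "pii".
row = {
    "fund": ("F1", frozenset()),
    "nav": (101.5, frozenset({"finance"})),
    "investor": ("ACME", frozenset({"finance", "pii"})),
}
analyst_view = visible_row(row, frozenset({"finance"}))
```

Here `analyst_view` contains the fund name and NAV but not the investor, which is the kind of per-cell filtering the slide describes.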
  • 16. Financial DataLake Implementation @ Global Custodian
      [Architecture diagram: Central Data Repository. Labels recovered from the slide:]
      – Source systems: MF, MP, IC, Fusion, AAA, gpms, CRD, URD, amadeus, RBC, plus external data sources Bisam and RMX (the diagram distinguishes Internal Data and External Data)
      – Data Integration: HDFS NFS Gateway, API / Data as a service, custom
      – Storage layers on HDFS: Landing Zone, Raw Data Reservoir, Certified Data Layer
      – Business Layer: Business Views (BU_1 View … BU_N View)
      – Data Consumption: Financial Reporting (Standard Report, VAR Report, Accounting Report, Fund Factsheet, Solvency 2 View), Self Service BI, Analytics
      – Delivery channels: Web Portal / Intranet, File Transfer, Downstream, Email
      – Security Layer: SPNEGO, SSL