SlideShare une entreprise Scribd logo
1  sur  24
Télécharger pour lire hors ligne
TECHNIQUES OFFERED BY BIG DATA AVAILABLE
FOR USE IN THE BANKING SECTOR
RADOSŁAW KITA
Why me?
Before we embark on the journey
§  For over 12 years I was responsible for reporting systems in Onet.pl.
I was responsible for the choice of data analysis tools. In this the
transition from commercial to open source solutions.
§  I took an active part in one of the most interesting (and somewhat
controversial) Big Data project at Alior Bank.
§  I worked on the implementation of the recommendation system in
TVN content.
§  Expert Data Scientist at Allegro.
§  Actively dealing with the analysis of large data sets and analytical
tools selection from the beginning of this century J
What I do not want to talk
Before we embark on the journey
I do not want to talk about what I read on the Internet just about
something that I did in practice.
Big Data is like…
Before we embark on the journey
“Teen sex:
everyone talks about it,
nobody really knows how to do it,
everyone thinks everyone else is doing it,
so everyone claims they are doing it...”
Dan Ariely
If I were to be malicious:
Children of a certain age do not believe that their parents have sex,
despite fairly obvious proof.
When you are responsible for the choice of technology
Before we embark on the journey
“I do not really have anything against so that our
competitors has issued 3,000,000 for technology
that they will not help.”
When you are responsible for the choice of technology
Before we embark on the journey
“The truth is out there”
But there is a world where no one has ever lied.
World where no one has ever lied
Before we embark on the journey
in my humble opinion
All that jazz is about
The ability to create a coherent picture of customer needs.
§  On the basis of a number of different data sources.
§  Suitable for practical use.
§  Within a reasonable period of time.
§  At a reasonable cost.
Why?
“Data driven company”
more detailed
All that jazz is about
We want to merge data about the behavior of our customers:
§  Internal transaction systems
§  What are they doing on our website
§  What they write in social media
§  What kind of content are looking for on the Internet
§  In what case would call for call centers
§  How they use a mobile phones…
Definition Big Data (Gartner, 2001): Volume, Velocity, Variety
+ 2012: Variability and Complexity
Translating technical data
We have:
we need:
Client id	
   day	
   Target account	
   amount	
  
1	
   2009-01-01	
  0812345345678990	
   315 zł	
  
1	
   2009-01-01	
  0967899045543215 124 zł	
  
…	
   …	
  …	
   …	
  
Client id
(cookie)	
   day	
   url	
   useragent	
   IP	
  
23456781 2009-01-01www.wp.pl
Mozilla/5.0 (Windows
NT 5.1; rv:26.0) Gecko/
20100101 Firefox/26.0 217.67.211.123
…	
   …	
   …	
   …	
   …	
  
Client id	
  
Value od
the gas bill
M-6	
  
Value od
the gas bill
M-5
Gas bill #
days before/
after M-6	
  
Airlane
tickets M-6	
  
Gasoline
M-5
Food shoping high-
end shops M-5
Tiime	
  spend	
  on	
  
business	
  web	
  
pages	
  M-­‐6	
  
Expenditure on
education of the
child M-­‐5	
  
Class	
  of	
  a	
  held	
  
phone	
   …	
  
1	
   315 zł	
   320 zł	
   10	
   0 zł	
   300 zł	
   20 zł	
   3:40	
   140	
  zł	
   High	
   …	
  
2	
   420 zł	
   350 zł	
   5	
   800 zł	
   500 zł	
   550 zł	
   0:15	
   0	
  zł	
   Medium	
   …	
  
…	
   …	
   …	
   …	
   …	
   …	
   …	
   …	
   …	
  
Event_date	
   event	
  
2015-10-11 12:01:14
{"PageView": {"value": "start",
"popup": {
"menuitem": [
{"value": "Open", "onclick": "OpenDoc()"},
{"value": "Close", "onclick": "CloseDoc()"}
]}}}
…	
   …	
  
First and last
Technical issue
§  We need to merge data coming from our web pages with data from
our transactional system
§  We do not have an united user id
§  We can merge user id’s base on IP address and userAgent parameter
coexistence close together in time
Computing power!!!!
Strong points
Big data
can collect and process vast amounts of data at a reasonable cost
https://hadoop.apache.org/
http://tez.apache.org/
http://mesos.apache.org/
Strong points
Big data
realtime computation system - a million tuples processed per second per
node
https://flink.apache.org/http://spark.apache.org/ https://storm.apache.org/
https://www.elastic.co/
Strong points
Big data
Graph databases
http://giraph.apache.org/
http://neo4j.com/
Strong points
Big data
Linear scalability
http://cassandra.apache.org/
Strong points
Big data
Schema-Free
https://drill.apache.org/ https://www.mongodb.org/
Weak points
Big data
Data Governance
§  Just on the beginning of this road
Weak points
Big data
Security
§  Definitely backend (not frontend) systems
Weak points
Big data
surprising gaps
table a
JOIN table b
ON a.index=b.index
table a
JOIN table b
ON a.day <= b.day
Weak points
Big data
we are living in a clusters world
Weak points
Big data
classical systems
well organized cities
Big Data systems
markets
Weak points
Big data
https://hive.apache.org/ https://zookeeper.apache.org/ http://kafka.apache.org/
https://pig.apache.org/
http://sqoop.apache.org/
http://falcon.apache.org/
http://atlas.incubator.apache.org/
possibilities of cooperation
Big data vs small data
§  Big Data: efficient quality data control.
§  Big Data: the ability to integrate data from different sources.
§  Big Data: fast access to data from any long period of time and for
any size group of customers (data lakes concept).
§  Advanced techniques of text analysis.
§  Analysis of the data graph.
§  Cheap and efficient power.
§  And all this to change Big Data in Small Data.
Thank you!
radoslaw.kita@allegrogroup.com

Contenu connexe

Tendances

Why distributed ledgers
Why distributed ledgersWhy distributed ledgers
Why distributed ledgersChris Skinner
 
Bitcoin Sharing session @ Stanford CEO
Bitcoin Sharing session @ Stanford CEOBitcoin Sharing session @ Stanford CEO
Bitcoin Sharing session @ Stanford CEOTom Ding
 
Digital Leaders NE: Let the blockchain take the strain
Digital Leaders NE: Let the blockchain take the strainDigital Leaders NE: Let the blockchain take the strain
Digital Leaders NE: Let the blockchain take the strainKate Baucherel
 
Blockchain- Digital digest
Blockchain- Digital digestBlockchain- Digital digest
Blockchain- Digital digestDivij Bajaj
 
GraphDay Stockholm - Fraud Prevention
GraphDay Stockholm - Fraud PreventionGraphDay Stockholm - Fraud Prevention
GraphDay Stockholm - Fraud PreventionNeo4j
 
Blockchain Opportunities Introduction
Blockchain Opportunities IntroductionBlockchain Opportunities Introduction
Blockchain Opportunities IntroductionIkhlaq Sidhu
 
Blockchain explained - Brunswick Review Spotlight on Cybersecurity
Blockchain explained - Brunswick Review Spotlight on CybersecurityBlockchain explained - Brunswick Review Spotlight on Cybersecurity
Blockchain explained - Brunswick Review Spotlight on CybersecurityBrunswick Group
 
Webjet Blockchain DDD 2017
Webjet Blockchain DDD 2017Webjet Blockchain DDD 2017
Webjet Blockchain DDD 2017Shashank Kaul
 
Introduction to Bitcoin, Blockchain, and Ethereum by Justin Wu
Introduction to Bitcoin, Blockchain, and Ethereum by Justin WuIntroduction to Bitcoin, Blockchain, and Ethereum by Justin Wu
Introduction to Bitcoin, Blockchain, and Ethereum by Justin WuJustin Wu
 
Blockchain Financial Networks
Blockchain Financial NetworksBlockchain Financial Networks
Blockchain Financial NetworksMelanie Swan
 
Blockchain Economics: Payment Channels
Blockchain Economics: Payment ChannelsBlockchain Economics: Payment Channels
Blockchain Economics: Payment ChannelsMelanie Swan
 
Blockchain Consulting Services
Blockchain Consulting ServicesBlockchain Consulting Services
Blockchain Consulting ServicesVishvendra Saini
 
Making Lemonade out of Lemons: Squeezing utility from a proof-of-work experiment
Making Lemonade out of Lemons: Squeezing utility from a proof-of-work experimentMaking Lemonade out of Lemons: Squeezing utility from a proof-of-work experiment
Making Lemonade out of Lemons: Squeezing utility from a proof-of-work experimentTim Swanson
 
GOLDMAN SACHS: BLOCKCHAIN IS READY FOR CENTRE STAGE
GOLDMAN SACHS: BLOCKCHAIN IS READY FOR CENTRE STAGEGOLDMAN SACHS: BLOCKCHAIN IS READY FOR CENTRE STAGE
GOLDMAN SACHS: BLOCKCHAIN IS READY FOR CENTRE STAGESteven Rhyner
 

Tendances (17)

Why distributed ledgers
Why distributed ledgersWhy distributed ledgers
Why distributed ledgers
 
Bitcoin Sharing session @ Stanford CEO
Bitcoin Sharing session @ Stanford CEOBitcoin Sharing session @ Stanford CEO
Bitcoin Sharing session @ Stanford CEO
 
Digital Leaders NE: Let the blockchain take the strain
Digital Leaders NE: Let the blockchain take the strainDigital Leaders NE: Let the blockchain take the strain
Digital Leaders NE: Let the blockchain take the strain
 
Blockchain- Digital digest
Blockchain- Digital digestBlockchain- Digital digest
Blockchain- Digital digest
 
Blockchain
Blockchain Blockchain
Blockchain
 
Chiu paper
Chiu paperChiu paper
Chiu paper
 
GraphDay Stockholm - Fraud Prevention
GraphDay Stockholm - Fraud PreventionGraphDay Stockholm - Fraud Prevention
GraphDay Stockholm - Fraud Prevention
 
Agile digital marketing
Agile digital marketingAgile digital marketing
Agile digital marketing
 
Blockchain Opportunities Introduction
Blockchain Opportunities IntroductionBlockchain Opportunities Introduction
Blockchain Opportunities Introduction
 
Blockchain explained - Brunswick Review Spotlight on Cybersecurity
Blockchain explained - Brunswick Review Spotlight on CybersecurityBlockchain explained - Brunswick Review Spotlight on Cybersecurity
Blockchain explained - Brunswick Review Spotlight on Cybersecurity
 
Webjet Blockchain DDD 2017
Webjet Blockchain DDD 2017Webjet Blockchain DDD 2017
Webjet Blockchain DDD 2017
 
Introduction to Bitcoin, Blockchain, and Ethereum by Justin Wu
Introduction to Bitcoin, Blockchain, and Ethereum by Justin WuIntroduction to Bitcoin, Blockchain, and Ethereum by Justin Wu
Introduction to Bitcoin, Blockchain, and Ethereum by Justin Wu
 
Blockchain Financial Networks
Blockchain Financial NetworksBlockchain Financial Networks
Blockchain Financial Networks
 
Blockchain Economics: Payment Channels
Blockchain Economics: Payment ChannelsBlockchain Economics: Payment Channels
Blockchain Economics: Payment Channels
 
Blockchain Consulting Services
Blockchain Consulting ServicesBlockchain Consulting Services
Blockchain Consulting Services
 
Making Lemonade out of Lemons: Squeezing utility from a proof-of-work experiment
Making Lemonade out of Lemons: Squeezing utility from a proof-of-work experimentMaking Lemonade out of Lemons: Squeezing utility from a proof-of-work experiment
Making Lemonade out of Lemons: Squeezing utility from a proof-of-work experiment
 
GOLDMAN SACHS: BLOCKCHAIN IS READY FOR CENTRE STAGE
GOLDMAN SACHS: BLOCKCHAIN IS READY FOR CENTRE STAGEGOLDMAN SACHS: BLOCKCHAIN IS READY FOR CENTRE STAGE
GOLDMAN SACHS: BLOCKCHAIN IS READY FOR CENTRE STAGE
 

En vedette

Bett presentation 2016
Bett presentation 2016Bett presentation 2016
Bett presentation 2016Matthew Rogers
 
PR393 introduction session sem 2 2014
PR393 introduction session sem 2 2014PR393 introduction session sem 2 2014
PR393 introduction session sem 2 2014Lydia Gallant
 
Friendships
FriendshipsFriendships
Friendshipspalanca7
 
Wyklad inauguracyjny
Wyklad inauguracyjnyWyklad inauguracyjny
Wyklad inauguracyjnyRadoslaw Kita
 
Meetup 1 eksperymentujemy_na_duza_skale_rkita
Meetup 1 eksperymentujemy_na_duza_skale_rkitaMeetup 1 eksperymentujemy_na_duza_skale_rkita
Meetup 1 eksperymentujemy_na_duza_skale_rkitaRadoslaw Kita
 
Ansn ind ins_dasar_fisika_radiasi
Ansn ind ins_dasar_fisika_radiasiAnsn ind ins_dasar_fisika_radiasi
Ansn ind ins_dasar_fisika_radiasiIntan Nsp
 

En vedette (8)

Bett presentation 2016
Bett presentation 2016Bett presentation 2016
Bett presentation 2016
 
PR393 introduction session sem 2 2014
PR393 introduction session sem 2 2014PR393 introduction session sem 2 2014
PR393 introduction session sem 2 2014
 
Internetsegura
InternetseguraInternetsegura
Internetsegura
 
Friendships
FriendshipsFriendships
Friendships
 
Wyklad inauguracyjny
Wyklad inauguracyjnyWyklad inauguracyjny
Wyklad inauguracyjny
 
Meetup 1 eksperymentujemy_na_duza_skale_rkita
Meetup 1 eksperymentujemy_na_duza_skale_rkitaMeetup 1 eksperymentujemy_na_duza_skale_rkita
Meetup 1 eksperymentujemy_na_duza_skale_rkita
 
Ansn ind ins_dasar_fisika_radiasi
Ansn ind ins_dasar_fisika_radiasiAnsn ind ins_dasar_fisika_radiasi
Ansn ind ins_dasar_fisika_radiasi
 
Ppt molekul
Ppt molekulPpt molekul
Ppt molekul
 

Similaire à BIG DATA TECHNIQUES FOR BANKING

EMFcamp2022 - What if apps logged into you, instead of you logging into apps?
EMFcamp2022 - What if apps logged into you, instead of you logging into apps?EMFcamp2022 - What if apps logged into you, instead of you logging into apps?
EMFcamp2022 - What if apps logged into you, instead of you logging into apps?Chris Swan
 
In 2022, top 08 trending technology.docx
In 2022, top 08 trending technology.docxIn 2022, top 08 trending technology.docx
In 2022, top 08 trending technology.docxAdvance Tech
 
Next-Generation BPM - How to create intelligent Business Processes thanks to ...
Next-Generation BPM - How to create intelligent Business Processes thanks to ...Next-Generation BPM - How to create intelligent Business Processes thanks to ...
Next-Generation BPM - How to create intelligent Business Processes thanks to ...Kai Wähner
 
Technology building blocks for innovation
Technology building blocks for innovationTechnology building blocks for innovation
Technology building blocks for innovationMahmoud Jalajel
 
Tamr | Making enterprise elephants dance @ boston data festival
Tamr | Making enterprise elephants dance @ boston data festival Tamr | Making enterprise elephants dance @ boston data festival
Tamr | Making enterprise elephants dance @ boston data festival Tamr_Inc
 
top 10 Digital transformation Technologies in 2022.docx
top 10 Digital transformation Technologies in 2022.docxtop 10 Digital transformation Technologies in 2022.docx
top 10 Digital transformation Technologies in 2022.docxAdvance Tech
 
From IoT to IoTA
From IoT to IoTAFrom IoT to IoTA
From IoT to IoTAStriim
 
Industry of Things World - Berlin 19-09-16
Industry of Things World - Berlin 19-09-16Industry of Things World - Berlin 19-09-16
Industry of Things World - Berlin 19-09-16Boris Adryan
 
IT trends – 2013 & beyond
IT trends – 2013 & beyondIT trends – 2013 & beyond
IT trends – 2013 & beyondNeha Mehta
 
Gartner: Top 10 Technology Trends 2015
Gartner: Top 10 Technology Trends 2015Gartner: Top 10 Technology Trends 2015
Gartner: Top 10 Technology Trends 2015Den Reymer
 
Multiplica.Webanalyticstrends
Multiplica.WebanalyticstrendsMultiplica.Webanalyticstrends
Multiplica.WebanalyticstrendsDavid Boronat
 
Data science and its potential to change business as we know it. The Roadmap ...
Data science and its potential to change business as we know it. The Roadmap ...Data science and its potential to change business as we know it. The Roadmap ...
Data science and its potential to change business as we know it. The Roadmap ...InnoTech
 
Keynote Sales Kickoff Interoute
Keynote Sales Kickoff InterouteKeynote Sales Kickoff Interoute
Keynote Sales Kickoff Interoute247 Invest
 
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...Kai Wähner
 
TDWI 17 Munich - Are enterprises ready for the 4th industrial revolution? - S...
TDWI 17 Munich - Are enterprises ready for the 4th industrial revolution? - S...TDWI 17 Munich - Are enterprises ready for the 4th industrial revolution? - S...
TDWI 17 Munich - Are enterprises ready for the 4th industrial revolution? - S...Santiago Cabrera-Naranjo
 
Action from Insight - Joining the 2 Percent Who are Getting Big Data Right
Action from Insight - Joining the 2 Percent Who are Getting Big Data RightAction from Insight - Joining the 2 Percent Who are Getting Big Data Right
Action from Insight - Joining the 2 Percent Who are Getting Big Data RightStampedeCon
 
Digitalization: A Challenge and An Opportunity for Banks
Digitalization: A Challenge and An Opportunity for BanksDigitalization: A Challenge and An Opportunity for Banks
Digitalization: A Challenge and An Opportunity for BanksJérôme Kehrli
 
Collaborative Data Management: How Crowdsourcing Can Help To Manage Data
Collaborative Data Management: How Crowdsourcing Can Help To Manage DataCollaborative Data Management: How Crowdsourcing Can Help To Manage Data
Collaborative Data Management: How Crowdsourcing Can Help To Manage DataEdward Curry
 

Similaire à BIG DATA TECHNIQUES FOR BANKING (20)

EMFcamp2022 - What if apps logged into you, instead of you logging into apps?
EMFcamp2022 - What if apps logged into you, instead of you logging into apps?EMFcamp2022 - What if apps logged into you, instead of you logging into apps?
EMFcamp2022 - What if apps logged into you, instead of you logging into apps?
 
In 2022, top 08 trending technology.docx
In 2022, top 08 trending technology.docxIn 2022, top 08 trending technology.docx
In 2022, top 08 trending technology.docx
 
Next-Generation BPM - How to create intelligent Business Processes thanks to ...
Next-Generation BPM - How to create intelligent Business Processes thanks to ...Next-Generation BPM - How to create intelligent Business Processes thanks to ...
Next-Generation BPM - How to create intelligent Business Processes thanks to ...
 
Executive Summit for ISV & Application builders - January 2015
Executive Summit for ISV & Application builders - January 2015Executive Summit for ISV & Application builders - January 2015
Executive Summit for ISV & Application builders - January 2015
 
Technology building blocks for innovation
Technology building blocks for innovationTechnology building blocks for innovation
Technology building blocks for innovation
 
Tamr | Making enterprise elephants dance @ boston data festival
Tamr | Making enterprise elephants dance @ boston data festival Tamr | Making enterprise elephants dance @ boston data festival
Tamr | Making enterprise elephants dance @ boston data festival
 
top 10 Digital transformation Technologies in 2022.docx
top 10 Digital transformation Technologies in 2022.docxtop 10 Digital transformation Technologies in 2022.docx
top 10 Digital transformation Technologies in 2022.docx
 
From IoT to IoTA
From IoT to IoTAFrom IoT to IoTA
From IoT to IoTA
 
Big Data in FinTech
Big Data in FinTechBig Data in FinTech
Big Data in FinTech
 
Industry of Things World - Berlin 19-09-16
Industry of Things World - Berlin 19-09-16Industry of Things World - Berlin 19-09-16
Industry of Things World - Berlin 19-09-16
 
IT trends – 2013 & beyond
IT trends – 2013 & beyondIT trends – 2013 & beyond
IT trends – 2013 & beyond
 
Gartner: Top 10 Technology Trends 2015
Gartner: Top 10 Technology Trends 2015Gartner: Top 10 Technology Trends 2015
Gartner: Top 10 Technology Trends 2015
 
Multiplica.Webanalyticstrends
Multiplica.WebanalyticstrendsMultiplica.Webanalyticstrends
Multiplica.Webanalyticstrends
 
Data science and its potential to change business as we know it. The Roadmap ...
Data science and its potential to change business as we know it. The Roadmap ...Data science and its potential to change business as we know it. The Roadmap ...
Data science and its potential to change business as we know it. The Roadmap ...
 
Keynote Sales Kickoff Interoute
Keynote Sales Kickoff InterouteKeynote Sales Kickoff Interoute
Keynote Sales Kickoff Interoute
 
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
 
TDWI 17 Munich - Are enterprises ready for the 4th industrial revolution? - S...
TDWI 17 Munich - Are enterprises ready for the 4th industrial revolution? - S...TDWI 17 Munich - Are enterprises ready for the 4th industrial revolution? - S...
TDWI 17 Munich - Are enterprises ready for the 4th industrial revolution? - S...
 
Action from Insight - Joining the 2 Percent Who are Getting Big Data Right
Action from Insight - Joining the 2 Percent Who are Getting Big Data RightAction from Insight - Joining the 2 Percent Who are Getting Big Data Right
Action from Insight - Joining the 2 Percent Who are Getting Big Data Right
 
Digitalization: A Challenge and An Opportunity for Banks
Digitalization: A Challenge and An Opportunity for BanksDigitalization: A Challenge and An Opportunity for Banks
Digitalization: A Challenge and An Opportunity for Banks
 
Collaborative Data Management: How Crowdsourcing Can Help To Manage Data
Collaborative Data Management: How Crowdsourcing Can Help To Manage DataCollaborative Data Management: How Crowdsourcing Can Help To Manage Data
Collaborative Data Management: How Crowdsourcing Can Help To Manage Data
 

Dernier

English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfPratikPatil591646
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformationAnnie Melnic
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are successPratikSingh115843
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaManalVerma4
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfnikeshsingh56
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfNicoChristianSunaryo
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etclalithasri22
 

Dernier (17)

English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdf
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformation
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are success
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdf
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdf
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etc
 
2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 

BIG DATA TECHNIQUES FOR BANKING

  • 1. TECHNIQUES OFFERED BY BIG DATA AVAILABLE FOR USE IN THE BANKING SECTOR RADOSŁAW KITA
  • 2. Why me? Before we embark on the journey §  For over 12 years I was responsible for reporting systems in Onet.pl. I was responsible for the choice of data analysis tools. In this the transition from commercial to open source solutions. §  I took an active part in one of the most interesting (and somewhat controversial) Big Data project at Alior Bank. §  I worked on the implementation of the recommendation system in TVN content. §  Expert Data Scientist at Allegro. §  Actively dealing with the analysis of large data sets and analytical tools selection from the beginning of this century J
  • 3. What I do not want to talk Before we embark on the journey I do not want to talk about what I read on the Internet just about something that I did in practice.
  • 4. Big Data is like… Before we embark on the journey “Teen sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it...” Dan Ariely If I were to be malicious: Children of a certain age do not believe that their parents have sex, despite fairly obvious proof.
  • 5. When you are responsible for the choice of technology Before we embark on the journey “I do not really have anything against so that our competitors has issued 3,000,000 for technology that they will not help.”
  • 6. When you are responsible for the choice of technology Before we embark on the journey “The truth is out there” But there is a world where no one has ever lied.
  • 7. World where no one has ever lied Before we embark on the journey
  • 8. in my humble opinion All that jazz is about The ability to create a coherent picture of customer needs. §  On the basis of a number of different data sources. §  Suitable for practical use. §  Within a reasonable period of time. §  At a reasonable cost. Why? “Data driven company”
  • 9. more detailed All that jazz is about We want to merge data about the behavior of our customers: §  Internal transaction systems §  What are they doing on our website §  What they write in social media §  What kind of content are looking for on the Internet §  In what case would call for call centers §  How they use a mobile phones… Definition Big Data (Gartner, 2001): Volume, Velocity, Variety + 2012: Variability and Complexity
  • 10. Translating technical data We have: we need: Client id   day   Target account   amount   1   2009-01-01  0812345345678990   315 zł   1   2009-01-01  0967899045543215 124 zł   …   …  …   …   Client id (cookie)   day   url   useragent   IP   23456781 2009-01-01www.wp.pl Mozilla/5.0 (Windows NT 5.1; rv:26.0) Gecko/ 20100101 Firefox/26.0 217.67.211.123 …   …   …   …   …   Client id   Value od the gas bill M-6   Value od the gas bill M-5 Gas bill # days before/ after M-6   Airlane tickets M-6   Gasoline M-5 Food shoping high- end shops M-5 Tiime  spend  on   business  web   pages  M-­‐6   Expenditure on education of the child M-­‐5   Class  of  a  held   phone   …   1   315 zł   320 zł   10   0 zł   300 zł   20 zł   3:40   140  zł   High   …   2   420 zł   350 zł   5   800 zł   500 zł   550 zł   0:15   0  zł   Medium   …   …   …   …   …   …   …   …   …   …   Event_date   event   2015-10-11 12:01:14 {"PageView": {"value": "start", "popup": { "menuitem": [ {"value": "Open", "onclick": "OpenDoc()"}, {"value": "Close", "onclick": "CloseDoc()"} ]}}} …   …  
  • 11. First and last Technical issue §  We need to merge data coming from our web pages with data from our transactional system §  We do not have an united user id §  We can merge user id’s base on IP address and userAgent parameter coexistence close together in time Computing power!!!!
  • 12. Strong points Big data can collect and process vast amounts of data at a reasonable cost https://hadoop.apache.org/ http://tez.apache.org/ http://mesos.apache.org/
  • 13. Strong points Big data realtime computation system - a million tuples processed per second per node https://flink.apache.org/http://spark.apache.org/ https://storm.apache.org/ https://www.elastic.co/
  • 14. Strong points Big data Graph databases http://giraph.apache.org/ http://neo4j.com/
  • 15. Strong points Big data Linear scalability http://cassandra.apache.org/
  • 17. Weak points Big data Data Governance §  Just on the beginning of this road
  • 18. Weak points Big data Security §  Definitely backend (not frontend) systems
  • 19. Weak points Big data surprising gaps table a JOIN table b ON a.index=b.index table a JOIN table b ON a.day <= b.day
  • 20. Weak points Big data we are living in a clusters world
  • 21. Weak points Big data classical systems well organized cities Big Data systems markets
  • 22. Weak points Big data https://hive.apache.org/ https://zookeeper.apache.org/ http://kafka.apache.org/ https://pig.apache.org/ http://sqoop.apache.org/ http://falcon.apache.org/ http://atlas.incubator.apache.org/
  • 23. possibilities of cooperation Big data vs small data §  Big Data: efficient quality data control. §  Big Data: the ability to integrate data from different sources. §  Big Data: fast access to data from any long period of time and for any size group of customers (data lakes concept). §  Advanced techniques of text analysis. §  Analysis of the data graph. §  Cheap and efficient power. §  And all this to change Big Data in Small Data.