
Big data in Private Banking


Big data in Private Banking - Opportunities and how to get them

Published in: Technology


  1. © Jerome Kehrli @ niceideas.ch. Big Data in Private Banking: opportunities and how to get them?
  2. Agenda: 1. What is Big Data? 2. Opportunities in Wealth Management (real-time performance and risk metrics; portfolio optimization and simulation; leveraging customer data). 3. How to get there?
  3. 1. What is Big Data?
  4. An evolution of society (Source: http://www.businessinsider.fr/us/vatican-square-2005-and-2013-2013-3)
  5. The era of power
  6. Today: in 2018, over 4 billion people are connected and share data continuously, anytime and everywhere. Mobile first!
  7. Tomorrow: "The Internet of Things is the next Big Thing" (The Economist)
  8. Data deluge! 5 exabytes of data (5 billion gigabytes) were generated from the first measurements up to 2003. In 2011, this quantity was generated in 2 days; in 2018, in 2 minutes. Source: https://www.emc.com/collateral/analyst-reports/idc-the-digital-universe-in-2020.pdf
  9. All the time!
  10. Everywhere!
  11. One minute on the Internet
  12. Technical capability evolution: for the last 40 years, the capabilities of IT components have grown exponentially. CPU, RAM, disk, … all follow Moore's law. Source: http://radar.oreilly.com/2011/08/building-data-startups.html
  13. Storage cost evolution: unit capacity keeps increasing while unit cost keeps decreasing, so is vertical scalability really always the easiest solution? (Chart: cost per GB of hard drives and RAM, 1975 to 2015, falling from roughly 5 M$/GB to 5 $/GB.) Source: http://www.mkomo.com/cost-per-gigabyte
  14. Disk throughput evolution. Issue: throughput always evolves more slowly than capacity. Over 10 years, capacity gained roughly x10,000 while throughput gained only x50. How do we read and write more and more data through a comparatively ever-narrower pipe?
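As a back-of-the-envelope illustration of this gap (the drive figures below are assumed, rough orders of magnitude, not the slide's exact numbers): even though drives grew enormously, the time to scan a full drive sequentially got much worse, which is exactly why distributed reads became necessary.

```python
# Illustrative sketch with assumed, rough orders of magnitude
# (not measured figures): time to sequentially scan a full drive.

def full_scan_hours(capacity_gb, throughput_mb_s):
    """Hours needed to read the whole drive once, sequentially."""
    return capacity_gb * 1024 / throughput_mb_s / 3600

# ~2003-era drive: 100 GB at 50 MB/s; ~2013-era drive: 4 TB at 150 MB/s.
old_drive = full_scan_hours(100, 50)      # well under an hour
new_drive = full_scan_hours(4000, 150)    # several hours

print(round(old_drive, 2), round(new_drive, 2))
```

In this toy example capacity grew x40 while throughput grew only x3, so a full scan takes about 13 times longer: the "thicker pipe" problem the slide describes.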
  15. Origins of Big Data: the web giants!
  16. New architectures and paradigms: the web giants were the first to face the limits of traditional architectures. RAM: how to process data that no longer fits in memory? I/O: how to read thousands of files of billions of rows in a limited time? CPU: how to process trillions of operations in a limited time? These massive computations are the reason we need new architectures and paradigms such as Hadoop or No/New SQL. Idea #1: run transactions and computations in parallel. Idea #2: scale the grid of CPU, DRAM and disk at the multi-datacenter level. Idea #3: move the code to the data, not the data to the code (a revolution in the tier layers).
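A toy sketch of these ideas (assumptions: plain Python standard library, with a thread pool standing in for a cluster of nodes): each worker computes a partial result on its own shard of the data, and only the small partial results are combined, which is the map/reduce pattern behind Hadoop.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum_of_squares(shard):
    # "Map" step: each worker processes only its local shard of the data.
    return sum(x * x for x in shard)

def parallel_sum_of_squares(data, workers=4):
    # Partition the data so each worker owns one shard (data locality:
    # the code goes to the shard, the shard never moves).
    shards = [data[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(partial_sum_of_squares, shards)
    # "Reduce" step: only the tiny partial results travel to the combiner.
    return sum(partials)

print(parallel_sum_of_squares(list(range(10))))  # 285
```

On a real cluster the shards live on different machines and the combiner only ever sees a handful of numbers, however large the raw data is.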
  17. So what is Big Data? Defining Big Data actually goes beyond the formal definition: it is all at once a technology evolution, anticipated by the big consulting companies, and a business opportunity! Big Data = more and different data + the evolution of data science + more computing capacity + new technologies and architectures.
  18. Big Data: definition. "Big data represents the information assets characterized by such a high volume, velocity and variety to require specific technology and analytical methods for its transformation into value." (https://en.wikipedia.org/wiki/Big_data#Definition)
  19. Limits of existing architectures. IO limit: heavily storage-oriented applications. CPU limit: heavily computation-oriented applications. TPS limit (transactions per second): high-throughput transactional applications and operational information systems. EPS limit (events per second): ultra-low-latency event-flow applications.
  20. Limits of existing architectures (cont'd): above 10 TB, "classical" architectures require huge software and hardware adaptations; likewise above 1,000 transactions per second and above 1,000 events per second. Above 10 threads per CPU core, sequential programming reaches its limits (IO).
  21. Traditional and specialized architectures: traditional architectures (RDBMS, application servers, ETLs, ESBs, etc.) are complemented, against the four limits above, by storage grids (distributed storage, shared-nothing grids), computation grids (parallel processing), transaction grids (XTP) and stream grids (event stream processing).
  22. New architectures, against the same four limits: Hadoop and the usual Hadoop ecosystem, NoSQL / NewSQL, in-memory analytics and other Big Data processing engines, streaming solutions, and CPU/GPU grids.
  23. Some examples, positioned against the same four limits: RabbitMQ, ZeroMQ, Apache Kafka, Quartet ActivePivot, Sqoop, Exalytics, HDFS, Exadata, EMC, Teradata, SQLFire, GigaSpaces, HBase, Cassandra, MongoDB, CouchDB, Voldemort, Hana, Redis, MapR, Esper, Hama, Igraph, Spark / Spark Streaming, Hive and Pig.
  24. 2. Opportunities in Wealth Management?
  25. Big Data in Wealth Management: state-of-the-art use cases. Investment research: data discovery and market research, development of investment ideas, testing of investment strategies, portfolio management, trading. Risk management: aggregation of position data, position monitoring, risk dashboards, portfolio management. Customer knowledge: unified / consolidated customer view, customer profiling and analysis, Know Your Customer and external data, analysis of unstructured data, client knowledge, investment advisory. Compliance and monitoring: pre/post-trade checks, fraud detection and prevention, anti-money-laundering, communication-channel monitoring. Three of these areas are detailed next.
  26. Focus on some use cases. 1. Real-time performance and risk metrics: intraday positions and trades, risk dashboards, solvency ratios for credit approval, fraud prevention and anti-money-laundering. 2. Portfolio optimization and simulation: what if we had taken this or that investment decision? What about this investment strategy? Large-scale portfolio optimization. 3. Leveraging customer data: customer profiling and classification, personalized investment advice, better and deeper fraud detection, marketing campaigns.
  27. 2.1 Real-time performance and risk metrics
  28. Portfolio metrics: performance (contributions, exposure) and risk metrics (variance, Sharpe ratio, volatility, Value at Risk), broken down by region / country / currency / … and by sector / industry / … Global metrics: the same as portfolio metrics, broken down by office, LoB, contract, customer, etc., and also by country, currency, sector, industry, etc. In private banking institutions today: valuation and performance come from nightly computation batches, portfolio risk metrics from weekend batches, and global metrics from quarterly batches.
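For concreteness, a minimal sketch of a few of these metrics computed from a series of daily portfolio returns (standard library only; the function name and the 95% historical-VaR convention are illustrative assumptions, not any bank's actual methodology):

```python
import statistics

def portfolio_metrics(daily_returns, risk_free_daily=0.0):
    """Volatility, Sharpe ratio and historical 95% VaR from daily returns
    (returns expressed as fractions, e.g. 0.01 for +1%)."""
    mean = statistics.mean(daily_returns)
    vol = statistics.pstdev(daily_returns)
    sharpe = (mean - risk_free_daily) / vol if vol else float("nan")
    # Historical VaR: the loss not exceeded on 95% of the observed days.
    var_95 = -sorted(daily_returns)[int(0.05 * len(daily_returns))]
    return {"mean": mean, "volatility": vol,
            "sharpe": sharpe, "var_95": var_95}
```

The breakdowns by region, sector, office and so on are then just this same computation repeated per grouping key, which is precisely the embarrassingly parallel workload the batch (and later, distributed) systems run.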
  29. Make it real-time: the disastrous global financial crisis put a spotlight on the need for rapid feedback after market events. Banks are trying to obtain an array of risk metrics closer to real time, released multiple times during the day. Reduce time-to-result as much as possible; take intraday positions and trades into account; get immediate feedback on intraday market events, using the latest quotes and other metrics.
  30. The good ol' way. 1. Night, weekend or quarterly computation batches: intraday positions and live quotes are missing; far from real time. 2. Intraday calculation within the operational IS: everything in the RDBMS; load, reload, compute and recompute again; heavy load on the operational IS and slow computation; very long response times (several minutes), crashes, … 3. Efficient computation off the operational IS (rare): distributed caches (JBoss Cache) with long reload times and huge operating costs, or computing grids (Teradata, …) with huge licensing costs.
  31. The Big Data way. Reduce TCO: commodity hardware, scale-out, open-source software stack (no licensing cost), ease of operation (an approach initiated by the web giants). From pull to push computing: computing a portfolio's (or the global) performance or VaR on demand in real time is difficult; processing market events and updating the metrics incrementally in real time is straightforward. We have the technology.
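A minimal sketch of the push idea (the class name and event shape are hypothetical): instead of re-pricing the whole book on demand, the valuation is kept up to date incrementally as each quote event arrives.

```python
class LivePortfolioValue:
    """Push-style valuation: update on each quote event instead of
    re-pricing the whole portfolio on demand (the pull approach)."""

    def __init__(self, positions):
        self.positions = positions   # {instrument: quantity}
        self.prices = {}             # last known quote per instrument
        self.value = 0.0

    def on_quote(self, instrument, price):
        qty = self.positions.get(instrument, 0)
        old = self.prices.get(instrument, 0.0)
        # Incremental delta: only the leg whose quote changed is recomputed.
        self.value += qty * (price - old)
        self.prices[instrument] = price
        return self.value
```

Each update costs O(1) work per event, whatever the portfolio size, which is what makes intraday dashboards tractable; real-world engines (Storm, Esper, …) apply the same principle at scale.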
  32. 2.2 Portfolio optimization and simulation
  33. Portfolio optimization and simulation. Portfolio optimization: Markowitz mean-variance (MPT), mean-CVaR, custom investment constraints, etc. Portfolio simulation: what if we had taken this or that investment decision? What about this investment strategy? What if we tried such a policy or regulation approach? Most PMS software supports simulation and some optimization models out of the box. What about yours? Do you use these features? What about large-scale portfolio optimization?
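As a sketch of the Markowitz building block (numpy assumed; this is the textbook unconstrained global minimum-variance solution, not a full optimizer with custom investment constraints):

```python
import numpy as np

def min_variance_weights(cov):
    """Global minimum-variance portfolio, fully invested, long/short
    allowed: w = inv(Cov)·1 / (1'·inv(Cov)·1), the unconstrained
    closed-form Markowitz solution."""
    ones = np.ones(cov.shape[0])
    inv_cov_ones = np.linalg.solve(cov, ones)  # inv(Cov)·1 without inverting
    return inv_cov_ones / inv_cov_ones.sum()
```

With two uncorrelated assets of variances 0.04 and 0.01, the solver weights the quieter asset four times more heavily (0.2 / 0.8). Adding real-world constraints (long-only, sector caps) turns this into a quadratic program, and running it across thousands of portfolios is where the distributed approach pays off.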
  34. Portfolio optimization: backtesting and stress testing. Backtest over past periods (this is Markowitz), test the optimization parameters, stress-test over market events: a lot of computations… Stress testing and backtesting are a little less common. What do you have? What about large-scale backtesting?
  35. Even further: compute the portfolio's efficient frontier. Sharpe ratio, heuristic computation (lots of computations), with respect to constraints (including dynamic constraints), and compute the resulting weights. Very rare in most financial companies. What do you have? What about large-scale Sharpe-ratio optimization?
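A sketch of the kind of heuristic computation the slide alludes to (numpy assumed; the sampling scheme is illustrative, not the deck's actual method): sample random long-only, fully-invested weight vectors and keep the one with the best Sharpe ratio, a crude way to probe the efficient frontier.

```python
import numpy as np

def max_sharpe_random_search(mu, cov, n_trials=20000, seed=0):
    """Heuristic: sample random portfolios on the simplex (long-only,
    weights summing to 1) and keep the best Sharpe ratio found."""
    rng = np.random.default_rng(seed)
    best_w, best_sharpe = None, -np.inf
    for _ in range(n_trials):
        w = rng.dirichlet(np.ones(len(mu)))   # respects w >= 0, sum(w) = 1
        ret = w @ mu
        risk = np.sqrt(w @ cov @ w)
        sharpe = ret / risk
        if sharpe > best_sharpe:
            best_sharpe, best_w = sharpe, w
    return best_w, best_sharpe
```

The trials are independent, so this embarrassingly parallel search is exactly the sort of workload that scales linearly on a CPU grid: which is why "lots of computations" stops being a blocker.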
  36. The good ol' way. 1. Excel (quantitative analysts): extracts from the operational IS loaded into Excel; simulations and rebalancings run from Excel; slow, ill-adapted, buggy. 2. Calculation programs (quantitative analysts): extracts from the operational IS loaded into Matlab; simulations and rebalancings run from Matlab; steep learning curve, poor user interface. 3. Proprietary or specialized software (Reuters, BB, …): to be integrated with the operational IS; tricky, risky, expensive, inflexible.
  37. The Big Data way. Findings: quantitative-research tooling is sub-optimal in most private banking institutions; some needs are more or less covered; yet we are far from a large-scale, systematic approach to portfolio optimization and backtesting. The answer: increase computing capacity with extreme parallelization and distribution, large-scale distributed systems, and moving the computing code to the data; reduce TCO with open-source software stacks (Hadoop, Cassandra, Infinispan, Python scikit-learn, R, …) and commodity hardware.
  38. 2.3 Leveraging customer data
  39. Leveraging customer data: private banking institutions keep track of a lot of data. Customer profiles: customer personal data and their changes, investment profiles. CRM activity: all customer and prospect contacts, documents, phone calls, emails, … Transactions and activity: financial transactions, account activity, orders and confirmations. Online activity: the bank's online systems and e-banking, sometimes even firewall traces. External data: what if we add external data, such as Twitter and Facebook activity? This huge amount of data is kept but rarely used. Can we make something of it?
  40. Customer data: new opportunities? Opportunities can be examined from two perspectives: Big Data as a "cost killer" or enhancer, and Big Data as a way to "widen the field of possibilities".
  41. What can we do with it anyway? Fraud detection and AML: real-time transaction monitoring (identify patterns and outliers), communication-channel monitoring (mining of emails, calls, logs, …), website and firewall monitoring, matching firewall or website breach attempts to account activity, better and deeper fraud detection, anti-money-laundering. Customer analysis: customer segmentation (growth potential, risk metrics, …), customer profiling (investment profiles, peer groups), prospection analysis, marketing campaigns, investing in the most promising customers, personalized investment advice, benchmarking customers.
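As a toy illustration of the "patterns and outliers" item (standard library only; real AML scoring is far richer than a single statistic): flag transactions whose amount deviates strongly from the customer's usual behaviour.

```python
import statistics

def flag_outlier_transactions(amounts, z_threshold=3.0):
    """Return the indices of transactions whose amount lies more than
    z_threshold standard deviations from the customer's mean: a crude
    stand-in for real transaction-monitoring rules."""
    mean = statistics.mean(amounts)
    stdev = statistics.pstdev(amounts)
    if stdev == 0:
        return []  # perfectly regular history: nothing stands out
    return [i for i, a in enumerate(amounts)
            if abs(a - mean) / stdev > z_threshold]
```

For example, twenty transactions of 100 followed by one of 10,000 flags only the last one. Production systems replace the z-score with per-peer-group models and learned patterns, but the shape of the computation (a score per event, in a stream) is the same.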
  42. What can we do with it anyway? (cont'd) Online banking software: advanced searches on transactions (multi-criteria: date, shop name, amounts, full-text search), annotating transactions and shops, comparing, getting advice, etc., backed by long-term logs of all transactions: a user-experience revolution, marketing and digital experience, user engineering. Know Your Customer: a customer 360 view (everything about contacts, transactions, performance, risk metrics, …), a customer identification program (matching with the web and social networks), fast access to all customer data, a unified view of the customer, real-time construction and display.
  43. The Big Data way. Manipulating, consolidating and mining all the data related to customers and user activity, coming from heterogeneous sources, is difficult. Some initiatives already exist in most institutions, but they are most of the time limited to small and specific sets of data, or implemented on expensive technologies such as Teradata. The answer: manipulate and consolidate very large volumes of data efficiently with HDFS and other No/NewSQL storages and highly distributed computing; reduce TCO with open-source software stacks (Hadoop, Cassandra, Infinispan, Python scikit-learn, R, …) and commodity hardware.
  44. 3. How to get there?
  45. Take away. Improve the analytics system: reduce TCO (cost killer!). Extend the field of possibilities: consider use cases so far reserved to investment banking institutions, such as large-scale portfolio optimization, simulation, rebalancing, backtesting, etc., and real-time metrics. Widen the field of possibilities: a user-experience revolution, marketing and digital experience, user engineering, a customer 360 view, KYC, a deeper depth of analysis.
  46. Architecture: Big Data deployment in private banking institutions. Data acquisition: live market data (instruments, quotes, index values), historical market data (instruments, quotes, index values), external data (web APIs, web searches, social networks, in pull mode), and operational data (transactions, account activity, mails and calls, logs, PMS portfolios and positions, trades, accounts and structure, reference data; largely unstructured data). Collection and analysis: Infinispan (data grid), Storm (real time), Cassandra (DB), Hadoop / HDFS / HBase (data storage and analysis), plus an expert system for portfolios (simulation, optimization, rebalancing, backtesting, stress testing). Restitution: real-time and batch metrics (portfolio performance, risk metrics, real-time dashboards), analyses (customer knowledge, fraud detection), with Tulip / Hive / Pig for querying, data visualization and results reporting.
  47. How to get there? In an incremental, iterative way: 1. Identify your potential use cases. 2. Prioritize the use cases with business experts and representatives. 3. Identify technological leads and opportunities. 4. Implement proofs of concept. For each iteration: import the data into the target technology (one-time extracts, then scheduled extractors); discover the data and the technology (Pig / Hive, data viz, Storm, Infinispan); then automate and analyze (new analyses, reporting, automation).
  48. Thanks for listening
