Project "Babelfish"
A data warehouse to attack complexity

Prof. Dr. Christoph Denzler & Daniel Kröni
{christoph.denzler, daniel.kroeni}@fhnw.ch
  
Starting Position

•  Finnova is a software house developing a bankware solution for universal banks.
•  About 300 employees, 200 of them in development, engineering, application management and customer care
•  Banking System
   – more than 7 million lines of code
   – controlled by 15'000 parameters
   – around 2000 UI screens
  
Inception

•  Software grew over the past 15 years
   –  approx. 13 person years of development per month
•  Architectural challenges
   –  new business models
   –  new regulations
   –  international customers
   –  bigger customers
   –  new technologies

→ How to keep track of
   –  architecture
   –  code
   –  tests
   –  customers' parametrization
   –  bug reports
   –  change requests
   –  developers' output
?
  
Product	
  
Concrete Problems

•  The business logic is changed. In which GUIs will this be visible?
•  A customer reports a bug on screen XY. Which parts of the code handle this screen and its data? Which developer is responsible for this code?
•  Does a new function break architectural guidelines? E.g. does it introduce dependency loops?
•  Which modules of the software do not have to be taken offline during a system upgrade?
•  Which tests need to be rerun after a change in code?
  
Expectations

•  Improve quality of bankware solution by
   – earlier detection of architecture violations
•  Improve issue handling
   – faster locality determination of bugs
•  Improve testing by
   – testing only what has changed
•  Improve stability by
   – reliable dependency information during deployment and production
  
System Overview

[Diagram: Import, WebAPI, Core System]

Core System

[Diagram: Versioning, Schema, DSL]
Core System

•  Neo4j Graph Database
•  Model: Directed Property Graph
•  Nodes
•  Typed Edges
•  Properties

•  Quantities
   #Nodes ~ 6'300'000
   #Edges > 15'000'000

[Diagram: two nodes, name: "Credit" and name: "Log", connected by a "calls" edge]
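The directed property graph model above can be sketched as a small in-memory data structure. This is only an illustration, not the Neo4j or Babelfish API: the names `Node`, `Edge` and `PropertyGraph` are hypothetical, and the direction of the `calls` edge (Credit calls Log) is an assumption read from the diagram.

```scala
// Hypothetical in-memory model of a directed property graph.
// Node, Edge and PropertyGraph are illustrative names, not a real API.
object PropertyGraphSketch {
  final case class Node(id: Long, properties: Map[String, Any])

  // A typed, directed edge that may carry its own properties.
  final case class Edge(from: Long, to: Long, edgeType: String,
                        properties: Map[String, Any] = Map.empty)

  final case class PropertyGraph(nodes: Map[Long, Node], edges: Seq[Edge]) {
    // Nodes reached by following outgoing edges of the given type.
    def out(id: Long, edgeType: String): Seq[Node] =
      edges.filter(e => e.from == id && e.edgeType == edgeType).map(e => nodes(e.to))

    // Nodes reached by following incoming edges of the given type backwards.
    def in(id: Long, edgeType: String): Seq[Node] =
      edges.filter(e => e.to == id && e.edgeType == edgeType).map(e => nodes(e.from))
  }

  val credit = Node(1L, Map("name" -> "Credit"))
  val log    = Node(2L, Map("name" -> "Log"))

  // Assumed direction: Credit calls Log.
  val graph = PropertyGraph(
    nodes = Map(credit.id -> credit, log.id -> log),
    edges = Seq(Edge(credit.id, log.id, "calls"))
  )
}
```

With this sketch, `PropertyGraphSketch.graph.in(2L, "calls")` yields the node named "Credit", i.e. the direct callers of "Log".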
  	
  
Core System

•  Version aware API
•  Access graph as of a specific version
•  Allows querying what changed
•  when, most often, together, ...
•  Mapping of versioned nodes to DB nodes

Example: the logical node (name: "Credit", LOC: 832) is stored as
   name: "Credit"
   LOC: 832   from: 13   to: _
   LOC: 750   from: 1    to: 12

Logical Quantities
   #Nodes 2'046'128
   #Edges 4'292'867

Storage Quantities
   #Nodes ~ 6'300'000
   #Edges > 15'000'000
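One way to read the mapping of versioned nodes to DB nodes is that each changing property value is stored with a validity range. The sketch below assumes exactly that; `Versioned`, `LogicalNode`, `locAt` and `changedAt` are hypothetical names, and only the LOC values and version ranges are taken from the slide.

```scala
object VersioningSketch {
  // One stored value, valid for versions from..to (inclusive).
  // to = None models the open end written as "_" on the slide.
  final case class Versioned[A](value: A, from: Int, to: Option[Int]) {
    def validAt(version: Int): Boolean =
      from <= version && to.forall(version <= _)
  }

  // A logical node: a constant name plus a version-dependent LOC property.
  final case class LogicalNode(name: String, loc: Seq[Versioned[Long]]) {
    // "Access graph as of a specific version": pick the value valid at that version.
    def locAt(version: Int): Option[Long] =
      loc.find(_.validAt(version)).map(_.value)

    // "Query what changed, when": start of every stored range except the first.
    def changedAt: Seq[Int] = loc.map(_.from).sorted.drop(1)
  }

  val credit = LogicalNode("Credit", Seq(
    Versioned(750L, from = 1, to = Some(12)),
    Versioned(832L, from = 13, to = None)
  ))
}
```

Here `VersioningSketch.credit.locAt(12)` resolves to the older value 750, while any version from 13 onwards resolves to 832. Storing one record per validity range would also be consistent with the storage quantities exceeding the logical ones.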
  
Core System

•  Domain model
•  Common vocabulary with the partner
•  Index
•  Query language

Schema example:
   Node types:  Package (name: String, LOC: Long), Release (id: Long, name: String)
   Edge types:  Calls, Contains
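A schema like the one above lets queries refer to properties through typed keys instead of raw strings. The following is a minimal sketch of that idea, assuming a hypothetical `Property[T]` key type and `NodeRecord` class; it is not the actual Babelfish schema API.

```scala
object SchemaSketch {
  sealed trait NodeType { def name: String }

  // A typed property key that belongs to one node type.
  final case class Property[T](owner: NodeType, key: String)

  object Package extends NodeType {
    val name = "Package"
    val Name: Property[String] = Property[String](this, "name")
    val LOC: Property[Long]    = Property[Long](this, "LOC")
  }

  object Release extends NodeType {
    val name = "Release"
    val Id: Property[Long]     = Property[Long](this, "id")
    val Name: Property[String] = Property[String](this, "name")
  }

  // A stored node: its type plus untyped property values.
  final case class NodeRecord(nodeType: NodeType, values: Map[String, Any])

  // The expected value must have the property's declared type T,
  // so where(Package.LOC)("Log") would be rejected by the compiler.
  def where[T](p: Property[T])(expected: T): NodeRecord => Boolean =
    r => r.nodeType == p.owner && r.values.get(p.key).contains(expected)

  val logPackage = NodeRecord(Package, Map("name" -> "Log", "LOC" -> 120L))
}
```

The LOC value 120 is made-up sample data. The point of the design is that a mistyped property or a value of the wrong type fails at compile time rather than at query time.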
  
Core System

•  Custom Query Language
•  Schema aware
•  Version aware
•  Fast graph traversal
•  Describing the structure of paths, as with a formal grammar
•  Collecting properties on the way
•  SQL postprocessing
•  Implemented as an internal Scala DSL
•  Easy to extend
  
Query Language: Basics

•  Schema aware
   –  Refer to nodes / edges / properties
•  Graph navigation primitives
   –  V, E, inE, outV, outE, inV
•  Grammar style combinators
   –  ~, |, ?, *, +

[Diagram: "out" corresponds to outE followed by inV; "in" corresponds to inE followed by outV]

V(Package) ~ where(Package.Name)("Log") ~ in(_Calls_).+
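The grammar style combinators can be illustrated with a toy internal DSL in which a step maps a set of nodes to the set of nodes it reaches. This is a sketch under stated assumptions, not the Babelfish implementation: `Step`, `where`, `in` and `out` are hypothetical, nodes are plain package names, and the call edges are invented sample data.

```scala
import scala.annotation.tailrec

object CombinatorSketch {
  type Graph = Map[String, Set[String]] // caller -> callees (hypothetical data)

  // A step maps the current node set to the set of nodes reached.
  final case class Step(run: Set[String] => Set[String]) {
    def ~(next: Step): Step = Step(s => next.run(run(s)))        // sequence
    def |(other: Step): Step = Step(s => run(s) ++ other.run(s)) // alternative
    def ? : Step = Step(s => s ++ run(s))                        // zero or one

    // One or more repetitions: expand until no new node is found.
    def oneOrMore: Step = Step { s =>
      @tailrec
      def loop(frontier: Set[String], seen: Set[String]): Set[String] = {
        val next = run(frontier) -- seen
        if (next.isEmpty) seen else loop(next, seen ++ next)
      }
      loop(s, Set.empty)
    }
    def + : Step = oneOrMore
    def * : Step = Step(s => s ++ oneOrMore.run(s))              // zero or more
  }

  def where(p: String => Boolean): Step = Step(_.filter(p))
  def out(g: Graph): Step = Step(_.flatMap(n => g.getOrElse(n, Set.empty[String])))
  def in(g: Graph): Step =
    Step(s => g.collect { case (caller, callees) if callees.exists(s) => caller }.toSet)

  val calls: Graph = Map(
    "Credit"    -> Set("Log"),
    "ZV"        -> Set("Log"),
    "Customer"  -> Set("Credit"),
    "Portfolio" -> Set("FX")
  )
  val packages: Set[String] = calls.keySet ++ calls.values.flatten

  // Mirrors: V(Package) ~ where(Package.Name)("Log") ~ in(_Calls_).+
  val callersOfLog: Set[String] = (where(_ == "Log") ~ in(calls).+).run(packages)
}
```

On the sample edges, `callersOfLog` contains Credit and ZV (direct callers) plus Customer (which calls Credit), but neither Portfolio nor FX.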
  
Query Language: Basics

[Diagram: call graph over the packages Log, Credit, ZV, Customer, FX, Portfolio]

V(Package) ~ where(Package.Name)("Log") ~ in(_Calls_).+
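How `in(_Calls_).+` expands step by step can be sketched as a breadth-first walk against the direction of the call edges. The slide shows only the package names, so the edges below are hypothetical, as are the helper names `callersOf` and `levels`.

```scala
import scala.annotation.tailrec

object TraversalSketch {
  // Hypothetical call edges (caller -> callees); not taken from the slides.
  val calls: Map[String, Set[String]] = Map(
    "Credit"    -> Set("Log", "Customer"),
    "ZV"        -> Set("Log"),
    "Portfolio" -> Set("Credit", "FX"),
    "FX"        -> Set("ZV")
  )

  // Packages that call at least one of the target packages directly.
  def callersOf(targets: Set[String]): Set[String] =
    calls.collect { case (caller, callees) if callees.exists(targets) => caller }.toSet

  // Newly discovered callers after each expansion, until nothing new is found.
  def levels(start: String): List[Set[String]] = {
    @tailrec
    def loop(frontier: Set[String], seen: Set[String],
             acc: List[Set[String]]): List[Set[String]] = {
      val next = callersOf(frontier) -- seen
      if (next.isEmpty) acc.reverse else loop(next, seen ++ next, next :: acc)
    }
    loop(Set(start), Set.empty, Nil)
  }
}
```

With these sample edges, `TraversalSketch.levels("Log")` first finds Credit and ZV, then Portfolio and FX; Customer is never reached because it calls nothing that leads to Log.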
  
  
Query Language: Extensions

•  Labeling
   –  Name values for later processing
•  Extraction
   –  Select what you want in your table
•  SQL Postprocessing
   –  SQL is nice for aggregation

from {
  V(Package) ~ in(_Calls_).+ ~ get(Package.Name).as("n")
} extract { "n" } sql {
  "SELECT n FROM t1 ORDER BY n DESC"
}
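The three extensions form a pipeline: labeled values become rows, extraction selects columns, and the resulting table is post-processed. The sketch below imitates that pipeline with plain Scala collections standing in for the SQL engine; the `Row` type, the sample rows, the extra "start" label and the helper names are all assumptions made for illustration.

```scala
object ExtractionSketch {
  type Row = Map[String, String]

  // Labeled values collected along each matched path (hypothetical sample data).
  val labeled: Seq[Row] = Seq(
    Map("n" -> "Credit",    "start" -> "Log"),
    Map("n" -> "ZV",        "start" -> "Log"),
    Map("n" -> "Portfolio", "start" -> "Log")
  )

  // extract { "n" }: keep only the named labels as table columns.
  def extract(rows: Seq[Row], columns: String*): Seq[Row] =
    rows.map(row => columns.map(c => c -> row(c)).toMap)

  // Stand-in for: SELECT n FROM t1 ORDER BY n DESC
  def orderByDesc(table: Seq[Row], column: String): Seq[Row] =
    table.sortBy(_(column))(Ordering[String].reverse)

  val result: Seq[String] = orderByDesc(extract(labeled, "n"), "n").map(_("n"))
}
```

Keeping the traversal and the tabular post-processing separate means sorting and aggregation can be expressed in a language already suited to them, which matches the slide's remark that SQL is nice for aggregation.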
  
Questions / Remarks

?/!
  
