SlideShare une entreprise Scribd logo
1  sur  23
Télécharger pour lire hors ligne
2© 2016 Pivotal Software, Inc. All rights reserved. 2© 2016 Pivotal Software, Inc. All rights reserved.
Large Scale Fraud Analytics
GemFire Greenplum Connector (G2C)
3© 2016 Pivotal Software, Inc. All rights reserved.
Background
Ÿ  Government fraud revenue retention program
Ÿ  Detecting & retaining ~$5B annually
–  Primary focus on identity theft
–  Processes up to 8 million cases per day
–  Current & historic data size ~60 TB (compressed)
Ÿ  Modifying architecture to integrate GemFire for scalable
Java-based business logic, web service integration, and
event driven design
4© 2016 Pivotal Software, Inc. All rights reserved.
Fraud Systems Simplified
Prepare
•  Ingest
•  Restructure (ETL)
Score
•  Model Evaluation
Disposition
•  Business Logic
•  Prioritization
Respond
•  Investigation
•  Stop Payments
Business Logic Engine
ETL
Reporting
In-db Analytics
Application Services
5© 2016 Pivotal Software, Inc. All rights reserved.
Case Study Architecture – Scaling Up
GemFire
Greenplum
Spring Boot App Services
Informatica w/ PWX (ETL)
Business Objects
(Reporting)
Legacy Logic
Implementation
Logic Engine
In-db Analytics
Greenplum
Prepare
•  Ingest
•  Restructure (ETL)
Score
•  Model Evaluation
Disposition
•  Business Logic
•  Prioritization
Respond
•  Investigation
•  Stop Payments
6© 2016 Pivotal Software, Inc. All rights reserved.
Pivotal Greenplum (GPDB)
Ÿ  Postgres Community OSS
–  Original fork of 8.2.15
–  Massively parallel processing
database
Ÿ  Master coordinates queries
across segments databases
Ÿ  Supports in-database model
evaluation
–  MadLib, PL/R, SAS
GPDB
Logical
GPDB
Physical
GPDB
Software
Master
Segments
7© 2016 Pivotal Software, Inc. All rights reserved.
Initial Implementation
Ÿ  Fraud model results evaluated
by business logic engine
Ÿ  Flat file data extraction
–  Significant custom code to
construct required object model
–  Table à CSV à POJO
Ÿ  Shared element in an otherwise
distributed system
–  Performance considerations
GPDB
Legacy Logic
Implementation
8© 2016 Pivotal Software, Inc. All rights reserved.
Architecture Adjustments
Ÿ  New requirements introduced
external integrations
–  Drives desire for web-services
Ÿ  Desire to improve performance
& simplify codebase
Ÿ  Expanding business logic
–  Logic engine run as a GemFire
function
GemFire
GPDB
Legacy Logic
Implementation
Spring Boot (App Services)
9© 2016 Pivotal Software, Inc. All rights reserved. 9© 2016 Pivotal Software, Inc. All rights reserved.
GemFire Greenplum Connector
10© 2016 Pivotal Software, Inc. All rights reserved.
Context
Greenplum!
ANSI
SQL
Analytical
Parallel
Configurable Data
Load
GemFire!App 1App 1App 1
App 1App 1App 2
Native API
Rest /
HTTP
Transactional
Custom Apps
Transactional
data write
behind
Data Science,
Analytics & ML
11© 2016 Pivotal Software, Inc. All rights reserved.
GemFire Greenplum Connector (G2C)
Ÿ  Extension package for GemFire
Ÿ  Provides simple import and export of data between GemFire
regions & Greenplum tables
–  Parallel data motion leveraging Greenplum’s external table interface
Ÿ  Simple mapping between table rows and PdxInstance
–  Flat object relational mapping
–  Set of predefined type conversions
–  Configurable GemFire data collocation
12© 2016 Pivotal Software, Inc. All rights reserved.
Greenplum
Master
Segments GemFire
G2C Data Interfaces
JDBC /
ODBC
Data
Node
Data
Node
Control Logic
13© 2016 Pivotal Software, Inc. All rights reserved.
GpdbService is the primary entry
point for explicitly invoked data
motion
1.  Import - loads the full table
contents from Greenplum
2.  Export - sends region
contents to Greenplum
Sample Data Import / Export
Cache cache = CacheFactory.getAnyInstance();
GpdbService gpdb = GpdbService.getInstance(cache);
long count;
count = gpdb.importRegion(region);
count = gpdb.exportRegion(region);
1
2
14© 2016 Pivotal Software, Inc. All rights reserved.
Basic Cache Configuration
Configured via GemFire extension
framework
•  1) Each region maps to a jndi data
source back by Greenplum
•  2) Link an entity type and table
•  3) Declare a field to be used as the key
•  Compound keys supported
•  4) Define a mapping between the table
columns
•  Default auto-configuration
•  Optional name and column attributes for
naming convention changes
•  Class used to control type conversion
•  Set of built in types
<region name="Parent">
<region-attributes refid="PARTITION">
<partition-attributes/>
</region-attributes>
<gpdb:store datasource="datasource">
<gpdb:types>
<gpdb:pdx name="io.pivotal...entity.Parent"
table="parent">
<gpdb:id field="id" />
<gpdb:fields>
<gpdb:field name="name" />
<gpdb:field name="id" column="id" />
<gpdb:field name="income"
class="java.math.BigDecimal" />
</gpdb:fields>
</gpdb:pdx>
</gpdb:types>
</gpdb:store>
</region>
2
1
3
4
15© 2016 Pivotal Software, Inc. All rights reserved.
Configuring Collocation
Parent-child foreign key relationships
supported through collocation
1.  Compound keys configurations
result in a HashMap based key in
GemFire
2.  Provided partition resolver works
with compound keys
<region name="Child">
<...>
<partition-resolver>
<class-name>
io.pivotal.gemfire.gpdb.IdPartitionResolver
</class-name>
<parameter name="field">
<string>parentId</string>
</parameter>
</...>
<gpdb:id>
<gpdb:field ref="parentId" />
<gpdb:field ref="id" />
</gpdb:id>
<gpdb:fields>
<gpdb:field name="parentId"/>
<gpdb:field name="id" />
</...>
1
2
16© 2016 Pivotal Software, Inc. All rights reserved.
Configuring Automatic Synchronization
●  Data exported to Greenplum via
asynchronous eventing
○  Time and batch size triggers
available
●  Causes each GemFire member to
independently interact with Greenplum
○  Configure GPDB resource queues
accordingly
<region name="Child">
<...>
<gpdb:store datasource="datasource">
<gpdb:synchronize mode="automatic"
time-interval="3000"
persistent="false" />
<gpdb:types>
<...>
17© 2016 Pivotal Software, Inc. All rights reserved.
Case Study G2C Configuration Details
Ÿ  Existing required domain objects
–  Multiple many-to-one groupings
Ÿ  Wide tables / objects (500+ fields)
Ÿ  Data Collocation configured on
caseId
Ÿ  Source tables wrapped in views
CaseWrapper
-  caseId
-  …
ModelScores
-  caseId
-  …
Documents
-  caseId
-  …
PriorHistory
-  caseId
-  …
OtherData…
-  caseId
-  …
* *
* *
1
LogicResults
-  caseId
-  …
18© 2016 Pivotal Software, Inc. All rights reserved.
Simple Loading – Single Table per Object
:LoadTrigger :GPDBService :Region :AsyncEventLister :LogicEngine results:Region
Import()
put()
processEvents()
process()
put()
19© 2016 Pivotal Software, Inc. All rights reserved.
Complex Loading – Multiple Tables per Object
:MergeLoader :GPDBService :Region :LogicEngine results:Region
Import()
put()
process()
put()
par
assemble()
:LoadTrigger
executeFunction()
20© 2016 Pivotal Software, Inc. All rights reserved.
Impacts & Results
Ÿ  Simplified implementation & code reduction
Ÿ  Maintained or improved data motion rates
–  Case study CPU bound
–  Additional improvements in the backlog
Ÿ  Improved system throughput
21© 2016 Pivotal Software, Inc. All rights reserved. 21© 2016 Pivotal Software, Inc. All rights reserved.
Questions?
Join the Apache Geode Community!
•  Check out: http://geode.incubator.apache.org
•  Subscribe: user-subscribe@geode.incubator.apache.org
•  Download: http://geode.incubator.apache.org/releases/
#GeodeSummit - Large Scale Fraud Detection using GemFire Integrated with Greenplum

Contenu connexe

Tendances

Sherlock: an anomaly detection service on top of Druid
Sherlock: an anomaly detection service on top of Druid Sherlock: an anomaly detection service on top of Druid
Sherlock: an anomaly detection service on top of Druid DataWorks Summit
 
Bootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkDataWorks Summit
 
Build your first Internet of Things app today with Open Source
Build your first Internet of Things app today with Open SourceBuild your first Internet of Things app today with Open Source
Build your first Internet of Things app today with Open SourceApache Geode
 
In memory computing principles by Mac Moore of GridGain
In memory computing principles by Mac Moore of GridGainIn memory computing principles by Mac Moore of GridGain
In memory computing principles by Mac Moore of GridGainData Con LA
 
Lessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsLessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsClaudiu Barbura
 
Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement
Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement
Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement VMware Tanzu
 
Spark, Tachyon and Mesos internals
Spark, Tachyon and Mesos internalsSpark, Tachyon and Mesos internals
Spark, Tachyon and Mesos internalsClaudiu Barbura
 
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...Spark Summit
 
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on KubernetesApache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on KubernetesDataWorks Summit
 
What's new in apache hive
What's new in apache hive What's new in apache hive
What's new in apache hive DataWorks Summit
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...GetInData
 
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...Databricks
 
Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...
Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...
Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...Flink Forward
 
Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...
Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...
Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...Stephen Darlington
 
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...Spark Summit
 
Architecture at Scale
Architecture at ScaleArchitecture at Scale
Architecture at ScaleElasticsearch
 
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)Spark Summit
 
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...DataWorks Summit
 

Tendances (20)

Sherlock: an anomaly detection service on top of Druid
Sherlock: an anomaly detection service on top of Druid Sherlock: an anomaly detection service on top of Druid
Sherlock: an anomaly detection service on top of Druid
 
Bootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache Flink
 
ApexMeetup Geode - Talk1 2016-03-17
ApexMeetup Geode - Talk1 2016-03-17ApexMeetup Geode - Talk1 2016-03-17
ApexMeetup Geode - Talk1 2016-03-17
 
Build your first Internet of Things app today with Open Source
Build your first Internet of Things app today with Open SourceBuild your first Internet of Things app today with Open Source
Build your first Internet of Things app today with Open Source
 
In memory computing principles by Mac Moore of GridGain
In memory computing principles by Mac Moore of GridGainIn memory computing principles by Mac Moore of GridGain
In memory computing principles by Mac Moore of GridGain
 
Lessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsLessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatterns
 
Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement
Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement
Slides for the Apache Geode Hands-on Meetup and Hackathon Announcement
 
Geode Meetup Apachecon
Geode Meetup ApacheconGeode Meetup Apachecon
Geode Meetup Apachecon
 
Spark, Tachyon and Mesos internals
Spark, Tachyon and Mesos internalsSpark, Tachyon and Mesos internals
Spark, Tachyon and Mesos internals
 
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
 
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on KubernetesApache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
 
What's new in apache hive
What's new in apache hive What's new in apache hive
What's new in apache hive
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
 
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...
 
Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...
Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...
Flink Forward Berlin 2017: Aris Kyriakos Koliopoulos - Drivetribe's Kappa Arc...
 
Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...
Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...
Deploying Distributed Databases and In-Memory Computing Platforms with Kubern...
 
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
 
Architecture at Scale
Architecture at ScaleArchitecture at Scale
Architecture at Scale
 
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
 
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
 

En vedette

#GeodeSummit - Off-Heap Storage Current and Future Design
#GeodeSummit - Off-Heap Storage Current and Future Design#GeodeSummit - Off-Heap Storage Current and Future Design
#GeodeSummit - Off-Heap Storage Current and Future DesignPivotalOpenSourceHub
 
#GeodeSummit - Redis to Geode Adaptor
#GeodeSummit - Redis to Geode Adaptor#GeodeSummit - Redis to Geode Adaptor
#GeodeSummit - Redis to Geode AdaptorPivotalOpenSourceHub
 
#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...
#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...
#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...PivotalOpenSourceHub
 
#GeodeSummit - Modern manufacturing powered by Spring XD and Geode
#GeodeSummit - Modern manufacturing powered by Spring XD and Geode#GeodeSummit - Modern manufacturing powered by Spring XD and Geode
#GeodeSummit - Modern manufacturing powered by Spring XD and GeodePivotalOpenSourceHub
 
#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode
#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode
#GeodeSummit - Wall St. Derivative Risk Solutions Using GeodePivotalOpenSourceHub
 
#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...
#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...
#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...PivotalOpenSourceHub
 
#GeodeSummit: Easy Ways to Become a Contributor to Apache Geode
#GeodeSummit: Easy Ways to Become a Contributor to Apache Geode#GeodeSummit: Easy Ways to Become a Contributor to Apache Geode
#GeodeSummit: Easy Ways to Become a Contributor to Apache GeodePivotalOpenSourceHub
 
#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)
#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)
#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)PivotalOpenSourceHub
 
#GeodeSummit - Where Does Geode Fit in Modern System Architectures
#GeodeSummit - Where Does Geode Fit in Modern System Architectures#GeodeSummit - Where Does Geode Fit in Modern System Architectures
#GeodeSummit - Where Does Geode Fit in Modern System ArchitecturesPivotalOpenSourceHub
 
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analyticsPivotalOpenSourceHub
 
#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...
#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...
#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...PivotalOpenSourceHub
 
#GeodeSummit - Spring Data GemFire API Current and Future
#GeodeSummit - Spring Data GemFire API Current and Future#GeodeSummit - Spring Data GemFire API Current and Future
#GeodeSummit - Spring Data GemFire API Current and FuturePivotalOpenSourceHub
 
#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode
#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode
#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & GeodePivotalOpenSourceHub
 
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"PivotalOpenSourceHub
 
#GeodeSummit - Design Tradeoffs in Distributed Systems
#GeodeSummit - Design Tradeoffs in Distributed Systems#GeodeSummit - Design Tradeoffs in Distributed Systems
#GeodeSummit - Design Tradeoffs in Distributed SystemsPivotalOpenSourceHub
 
Get Results, Build Your Own Big Data Beast : Greenplum + Dell
Get Results, Build Your Own Big Data Beast : Greenplum + DellGet Results, Build Your Own Big Data Beast : Greenplum + Dell
Get Results, Build Your Own Big Data Beast : Greenplum + Dellskahler
 
Build Your Own Data Beast : Greenplum + Dell
Build Your Own Data Beast : Greenplum + DellBuild Your Own Data Beast : Greenplum + Dell
Build Your Own Data Beast : Greenplum + Dellskahler
 
IMCSummit 2015 - 1 IT Business - The Evolution of Pivotal Gemfire
IMCSummit 2015 - 1 IT Business  - The Evolution of Pivotal GemfireIMCSummit 2015 - 1 IT Business  - The Evolution of Pivotal Gemfire
IMCSummit 2015 - 1 IT Business - The Evolution of Pivotal GemfireIn-Memory Computing Summit
 
Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...
Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...
Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...Christian Tzolov
 
Introduction to Apache Calcite
Introduction to Apache CalciteIntroduction to Apache Calcite
Introduction to Apache CalciteJordan Halterman
 

En vedette (20)

#GeodeSummit - Off-Heap Storage Current and Future Design
#GeodeSummit - Off-Heap Storage Current and Future Design#GeodeSummit - Off-Heap Storage Current and Future Design
#GeodeSummit - Off-Heap Storage Current and Future Design
 
#GeodeSummit - Redis to Geode Adaptor
#GeodeSummit - Redis to Geode Adaptor#GeodeSummit - Redis to Geode Adaptor
#GeodeSummit - Redis to Geode Adaptor
 
#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...
#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...
#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R...
 
#GeodeSummit - Modern manufacturing powered by Spring XD and Geode
#GeodeSummit - Modern manufacturing powered by Spring XD and Geode#GeodeSummit - Modern manufacturing powered by Spring XD and Geode
#GeodeSummit - Modern manufacturing powered by Spring XD and Geode
 
#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode
#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode
#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode
 
#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...
#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...
#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T...
 
#GeodeSummit: Easy Ways to Become a Contributor to Apache Geode
#GeodeSummit: Easy Ways to Become a Contributor to Apache Geode#GeodeSummit: Easy Ways to Become a Contributor to Apache Geode
#GeodeSummit: Easy Ways to Become a Contributor to Apache Geode
 
#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)
#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)
#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)
 
#GeodeSummit - Where Does Geode Fit in Modern System Architectures
#GeodeSummit - Where Does Geode Fit in Modern System Architectures#GeodeSummit - Where Does Geode Fit in Modern System Architectures
#GeodeSummit - Where Does Geode Fit in Modern System Architectures
 
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
 
#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...
#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...
#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ...
 
#GeodeSummit - Spring Data GemFire API Current and Future
#GeodeSummit - Spring Data GemFire API Current and Future#GeodeSummit - Spring Data GemFire API Current and Future
#GeodeSummit - Spring Data GemFire API Current and Future
 
#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode
#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode
#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode
 
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
 
#GeodeSummit - Design Tradeoffs in Distributed Systems
#GeodeSummit - Design Tradeoffs in Distributed Systems#GeodeSummit - Design Tradeoffs in Distributed Systems
#GeodeSummit - Design Tradeoffs in Distributed Systems
 
Get Results, Build Your Own Big Data Beast : Greenplum + Dell
Get Results, Build Your Own Big Data Beast : Greenplum + DellGet Results, Build Your Own Big Data Beast : Greenplum + Dell
Get Results, Build Your Own Big Data Beast : Greenplum + Dell
 
Build Your Own Data Beast : Greenplum + Dell
Build Your Own Data Beast : Greenplum + DellBuild Your Own Data Beast : Greenplum + Dell
Build Your Own Data Beast : Greenplum + Dell
 
IMCSummit 2015 - 1 IT Business - The Evolution of Pivotal Gemfire
IMCSummit 2015 - 1 IT Business  - The Evolution of Pivotal GemfireIMCSummit 2015 - 1 IT Business  - The Evolution of Pivotal Gemfire
IMCSummit 2015 - 1 IT Business - The Evolution of Pivotal Gemfire
 
Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...
Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...
Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth...
 
Introduction to Apache Calcite
Introduction to Apache CalciteIntroduction to Apache Calcite
Introduction to Apache Calcite
 

Similaire à #GeodeSummit - Large Scale Fraud Detection using GemFire Integrated with Greenplum

CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryMárton Kodok
 
QCon 2018 | Gimel | PayPal's Analytic Platform
QCon 2018 | Gimel | PayPal's Analytic PlatformQCon 2018 | Gimel | PayPal's Analytic Platform
QCon 2018 | Gimel | PayPal's Analytic PlatformDeepak Chandramouli
 
Spring Data and In-Memory Data Management in Action
Spring Data and In-Memory Data Management in ActionSpring Data and In-Memory Data Management in Action
Spring Data and In-Memory Data Management in ActionJohn Blum
 
Greenplum for Kubernetes PGConf india 2019
Greenplum for Kubernetes PGConf india 2019Greenplum for Kubernetes PGConf india 2019
Greenplum for Kubernetes PGConf india 2019Goutam Tadi
 
Introduction to Greenplum
Introduction to GreenplumIntroduction to Greenplum
Introduction to GreenplumDave Cramer
 
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 DatasetGraph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 DatasetTigerGraph
 
Giga Spaces Data Grid / Data Caching Overview
Giga Spaces Data Grid / Data Caching OverviewGiga Spaces Data Grid / Data Caching Overview
Giga Spaces Data Grid / Data Caching Overviewjimliddle
 
Online Upgrade Using Logical Replication
 Online Upgrade Using Logical Replication Online Upgrade Using Logical Replication
Online Upgrade Using Logical ReplicationEDB
 
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryVoxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryMárton Kodok
 
Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...
Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...
Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...Provectus
 
Pivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewPivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewVMware Tanzu
 
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019VMware Tanzu
 
Gimel at Dataworks Summit San Jose 2018
Gimel at Dataworks Summit San Jose 2018Gimel at Dataworks Summit San Jose 2018
Gimel at Dataworks Summit San Jose 2018Romit Mehta
 
Dataworks | 2018-06-20 | Gimel data platform
Dataworks | 2018-06-20 | Gimel data platformDataworks | 2018-06-20 | Gimel data platform
Dataworks | 2018-06-20 | Gimel data platformDeepak Chandramouli
 
How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...HostedbyConfluent
 
SpringCamp 2016 - Apache Geode 와 Spring Data Gemfire
SpringCamp 2016 - Apache Geode 와 Spring Data GemfireSpringCamp 2016 - Apache Geode 와 Spring Data Gemfire
SpringCamp 2016 - Apache Geode 와 Spring Data GemfireJay Lee
 
Apache Ignite: In-Memory Hammer for Your Data Science Toolkit
Apache Ignite: In-Memory Hammer for Your Data Science ToolkitApache Ignite: In-Memory Hammer for Your Data Science Toolkit
Apache Ignite: In-Memory Hammer for Your Data Science ToolkitDenis Magda
 
OSDC 2017 - Christos Erotocritou - Apache ignite in-memory data fabric
OSDC 2017 - Christos Erotocritou - Apache ignite in-memory data fabricOSDC 2017 - Christos Erotocritou - Apache ignite in-memory data fabric
OSDC 2017 - Christos Erotocritou - Apache ignite in-memory data fabricNETWAYS
 
Spark Summit EU talk by Christos Erotocritou
Spark Summit EU talk by Christos ErotocritouSpark Summit EU talk by Christos Erotocritou
Spark Summit EU talk by Christos ErotocritouSpark Summit
 

Similaire à #GeodeSummit - Large Scale Fraud Detection using GemFire Integrated with Greenplum (20)

CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
 
QCon 2018 | Gimel | PayPal's Analytic Platform
QCon 2018 | Gimel | PayPal's Analytic PlatformQCon 2018 | Gimel | PayPal's Analytic Platform
QCon 2018 | Gimel | PayPal's Analytic Platform
 
Spring Data and In-Memory Data Management in Action
Spring Data and In-Memory Data Management in ActionSpring Data and In-Memory Data Management in Action
Spring Data and In-Memory Data Management in Action
 
Greenplum for Kubernetes PGConf india 2019
Greenplum for Kubernetes PGConf india 2019Greenplum for Kubernetes PGConf india 2019
Greenplum for Kubernetes PGConf india 2019
 
Introduction to Greenplum
Introduction to GreenplumIntroduction to Greenplum
Introduction to Greenplum
 
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 DatasetGraph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
 
Giga Spaces Data Grid / Data Caching Overview
Giga Spaces Data Grid / Data Caching OverviewGiga Spaces Data Grid / Data Caching Overview
Giga Spaces Data Grid / Data Caching Overview
 
Online Upgrade Using Logical Replication
 Online Upgrade Using Logical Replication Online Upgrade Using Logical Replication
Online Upgrade Using Logical Replication
 
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryVoxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
 
Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...
Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...
Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...
 
Pivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewPivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical Overview
 
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
 
Gimel at Dataworks Summit San Jose 2018
Gimel at Dataworks Summit San Jose 2018Gimel at Dataworks Summit San Jose 2018
Gimel at Dataworks Summit San Jose 2018
 
Dataworks | 2018-06-20 | Gimel data platform
Dataworks | 2018-06-20 | Gimel data platformDataworks | 2018-06-20 | Gimel data platform
Dataworks | 2018-06-20 | Gimel data platform
 
How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...
 
SpringCamp 2016 - Apache Geode 와 Spring Data Gemfire
SpringCamp 2016 - Apache Geode 와 Spring Data GemfireSpringCamp 2016 - Apache Geode 와 Spring Data Gemfire
SpringCamp 2016 - Apache Geode 와 Spring Data Gemfire
 
Apache Ignite: In-Memory Hammer for Your Data Science Toolkit
Apache Ignite: In-Memory Hammer for Your Data Science ToolkitApache Ignite: In-Memory Hammer for Your Data Science Toolkit
Apache Ignite: In-Memory Hammer for Your Data Science Toolkit
 
BigData_Krishna Kumar Sharma
BigData_Krishna Kumar SharmaBigData_Krishna Kumar Sharma
BigData_Krishna Kumar Sharma
 
OSDC 2017 - Christos Erotocritou - Apache ignite in-memory data fabric
OSDC 2017 - Christos Erotocritou - Apache ignite in-memory data fabricOSDC 2017 - Christos Erotocritou - Apache ignite in-memory data fabric
OSDC 2017 - Christos Erotocritou - Apache ignite in-memory data fabric
 
Spark Summit EU talk by Christos Erotocritou
Spark Summit EU talk by Christos ErotocritouSpark Summit EU talk by Christos Erotocritou
Spark Summit EU talk by Christos Erotocritou
 

Plus de PivotalOpenSourceHub

Zettaset Elastic Big Data Security for Greenplum Database
Zettaset Elastic Big Data Security for Greenplum DatabaseZettaset Elastic Big Data Security for Greenplum Database
Zettaset Elastic Big Data Security for Greenplum DatabasePivotalOpenSourceHub
 
New Security Framework in Apache Geode
New Security Framework in Apache GeodeNew Security Framework in Apache Geode
New Security Framework in Apache GeodePivotalOpenSourceHub
 
Apache Geode Clubhouse - WAN-based Replication
Apache Geode Clubhouse - WAN-based ReplicationApache Geode Clubhouse - WAN-based Replication
Apache Geode Clubhouse - WAN-based ReplicationPivotalOpenSourceHub
 
Building Apps with Distributed In-Memory Computing Using Apache Geode
Building Apps with Distributed In-Memory Computing Using Apache GeodeBuilding Apps with Distributed In-Memory Computing Using Apache Geode
Building Apps with Distributed In-Memory Computing Using Apache GeodePivotalOpenSourceHub
 
GPORCA: Query Optimization as a Service
GPORCA: Query Optimization as a ServiceGPORCA: Query Optimization as a Service
GPORCA: Query Optimization as a ServicePivotalOpenSourceHub
 
Pivoting Spring XD to Spring Cloud Data Flow with Sabby Anandan
Pivoting Spring XD to Spring Cloud Data Flow with Sabby AnandanPivoting Spring XD to Spring Cloud Data Flow with Sabby Anandan
Pivoting Spring XD to Spring Cloud Data Flow with Sabby AnandanPivotalOpenSourceHub
 
Apache Zeppelin Meetup Christian Tzolov 1/21/16
Apache Zeppelin Meetup Christian Tzolov 1/21/16 Apache Zeppelin Meetup Christian Tzolov 1/21/16
Apache Zeppelin Meetup Christian Tzolov 1/21/16 PivotalOpenSourceHub
 
Postgre sql linuxcontainers by Jignesh Shah
Postgre sql linuxcontainers by Jignesh ShahPostgre sql linuxcontainers by Jignesh Shah
Postgre sql linuxcontainers by Jignesh ShahPivotalOpenSourceHub
 
Geode Transactions by Swapnil Bawaskar
Geode Transactions by Swapnil BawaskarGeode Transactions by Swapnil Bawaskar
Geode Transactions by Swapnil BawaskarPivotalOpenSourceHub
 
Greenplum Database Open Source December 2015
Greenplum Database Open Source December 2015Greenplum Database Open Source December 2015
Greenplum Database Open Source December 2015PivotalOpenSourceHub
 
MADlib Architecture and Functional Demo on How to Use MADlib/PivotalR
MADlib Architecture and Functional Demo on How to Use MADlib/PivotalRMADlib Architecture and Functional Demo on How to Use MADlib/PivotalR
MADlib Architecture and Functional Demo on How to Use MADlib/PivotalRPivotalOpenSourceHub
 
Data Science Perspective and DS demo
Data Science Perspective and DS demo Data Science Perspective and DS demo
Data Science Perspective and DS demo PivotalOpenSourceHub
 

Plus de PivotalOpenSourceHub (15)

Zettaset Elastic Big Data Security for Greenplum Database
Zettaset Elastic Big Data Security for Greenplum DatabaseZettaset Elastic Big Data Security for Greenplum Database
Zettaset Elastic Big Data Security for Greenplum Database
 
New Security Framework in Apache Geode
New Security Framework in Apache GeodeNew Security Framework in Apache Geode
New Security Framework in Apache Geode
 
Apache Geode Clubhouse - WAN-based Replication
Apache Geode Clubhouse - WAN-based ReplicationApache Geode Clubhouse - WAN-based Replication
Apache Geode Clubhouse - WAN-based Replication
 
Building Apps with Distributed In-Memory Computing Using Apache Geode
Building Apps with Distributed In-Memory Computing Using Apache GeodeBuilding Apps with Distributed In-Memory Computing Using Apache Geode
Building Apps with Distributed In-Memory Computing Using Apache Geode
 
GPORCA: Query Optimization as a Service
GPORCA: Query Optimization as a ServiceGPORCA: Query Optimization as a Service
GPORCA: Query Optimization as a Service
 
Pivoting Spring XD to Spring Cloud Data Flow with Sabby Anandan
Pivoting Spring XD to Spring Cloud Data Flow with Sabby AnandanPivoting Spring XD to Spring Cloud Data Flow with Sabby Anandan
Pivoting Spring XD to Spring Cloud Data Flow with Sabby Anandan
 
Apache Geode Offheap Storage
Apache Geode Offheap StorageApache Geode Offheap Storage
Apache Geode Offheap Storage
 
Apache Zeppelin Meetup Christian Tzolov 1/21/16
Apache Zeppelin Meetup Christian Tzolov 1/21/16 Apache Zeppelin Meetup Christian Tzolov 1/21/16
Apache Zeppelin Meetup Christian Tzolov 1/21/16
 
Build & test Apache Hawq
Build & test Apache Hawq Build & test Apache Hawq
Build & test Apache Hawq
 
Postgre sql linuxcontainers by Jignesh Shah
Postgre sql linuxcontainers by Jignesh ShahPostgre sql linuxcontainers by Jignesh Shah
Postgre sql linuxcontainers by Jignesh Shah
 
kafka for db as postgres
kafka for db as postgreskafka for db as postgres
kafka for db as postgres
 
Geode Transactions by Swapnil Bawaskar
Geode Transactions by Swapnil BawaskarGeode Transactions by Swapnil Bawaskar
Geode Transactions by Swapnil Bawaskar
 
Greenplum Database Open Source December 2015
Greenplum Database Open Source December 2015Greenplum Database Open Source December 2015
Greenplum Database Open Source December 2015
 
MADlib Architecture and Functional Demo on How to Use MADlib/PivotalR
MADlib Architecture and Functional Demo on How to Use MADlib/PivotalRMADlib Architecture and Functional Demo on How to Use MADlib/PivotalR
MADlib Architecture and Functional Demo on How to Use MADlib/PivotalR
 
Data Science Perspective and DS demo
Data Science Perspective and DS demo Data Science Perspective and DS demo
Data Science Perspective and DS demo
 

Dernier

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 

Dernier (20)

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 

#GeodeSummit - Large Scale Fraud Detection using GemFire Integrated with Greenplum

  • 1.
  • 2. 2© 2016 Pivotal Software, Inc. All rights reserved. 2© 2016 Pivotal Software, Inc. All rights reserved. Large Scale Fraud Analytics GemFire Greenplum Connector (G2C)
  • 3. 3© 2016 Pivotal Software, Inc. All rights reserved. Background Ÿ  Government fraud revenue retention program Ÿ  Detecting & retaining ~$5B annually –  Primary focus on identity theft –  Processes up to 8 million cases per day –  Current & historic data size ~60 TB (compressed) Ÿ  Modifying architecture to integrate GemFire for scalable Java-based business logic, web service integration, and event driven design
  • 4. 4© 2016 Pivotal Software, Inc. All rights reserved. Fraud Systems Simplified Prepare •  Ingest •  Restructure (ETL) Score •  Model Evaluation Disposition •  Business Logic •  Prioritization Respond •  Investigation •  Stop Payments Business Logic Engine ETL Reporting In-db Analytics Application Services
  • 5. 5© 2016 Pivotal Software, Inc. All rights reserved. Case Study Architecture – Scaling Up GemFire Greenplum Spring Boot App Services Informatica w/ PWX (ETL) Business Objects (Reporting) Legacy Logic Implementation Logic Engine In-db Analytics Greenplum Prepare •  Ingest •  Restructure (ETL) Score •  Model Evaluation Disposition •  Business Logic •  Prioritization Respond •  Investigation •  Stop Payments
  • 6. 6© 2016 Pivotal Software, Inc. All rights reserved. Pivotal Greenplum (GPDB) Ÿ  Postgres Community OSS –  Original fork of 8.2.15 –  Massively parallel processing database Ÿ  Master coordinates queries across segments databases Ÿ  Supports in-database model evaluation –  MadLib, PL/R, SAS GPDB Logical GPDB Physical GPDB Software Master Segments
  • 7. 7© 2016 Pivotal Software, Inc. All rights reserved. Initial Implementation Ÿ  Fraud model results evaluated by business logic engine Ÿ  Flat file data extraction –  Significant custom code to construct required object model –  Table à CSV à POJO Ÿ  Shared element in an otherwise distributed system –  Performance considerations GPDB Legacy Logic Implementation
  • 8. 8© 2016 Pivotal Software, Inc. All rights reserved. Architecture Adjustments Ÿ  New requirements introduced external integrations –  Drives desire for web-services Ÿ  Desire to improve performance & simplify codebase Ÿ  Expanding business logic –  Logic engine run as a GemFire function GemFire GPDB Legacy Logic Implementation Spring Boot (App Services)
  • 9. 9© 2016 Pivotal Software, Inc. All rights reserved. 9© 2016 Pivotal Software, Inc. All rights reserved. GemFire Greenplum Connector
  • 10. 10© 2016 Pivotal Software, Inc. All rights reserved. Context Greenplum! ANSI SQL Analytical Parallel Configurable Data Load GemFire!App 1App 1App 1 App 1App 1App 2 Native API Rest / HTTP Transactional Custom Apps Transactional data write behind Data Science, Analytics & ML
  • 11. 11© 2016 Pivotal Software, Inc. All rights reserved. GemFire Greenplum Connector (G2C) Ÿ  Extension package for GemFire Ÿ  Provides simple import and export of data between GemFire regions & Greenplum tables –  Parallel data motion leveraging Greenplum’s external table interface Ÿ  Simple mapping between table rows and PdxInstance –  Flat object relational mapping –  Set of predefined type conversions –  Configurable GemFire data collocation
  • 12. 12© 2016 Pivotal Software, Inc. All rights reserved. Greenplum Master Segments GemFire G2C Data Interfaces JDBC / ODBC Data Node Data Node Control Logic
  • 13. 13© 2016 Pivotal Software, Inc. All rights reserved. GpdbService is the primary entry point for explicitly invoked data motion 1.  Import - loads the full table contents from Greenplum 2.  Export - sends region contents to Greenplum Sample Data Import / Export Cache cache = CacheFactory.getAnyInstance(); GpdbService gpdb = GpdbService.getInstance(cache); long count; count = gpdb.importRegion(region); count = gpdb.exportRegion(region); 1 2
  • 14. 14© 2016 Pivotal Software, Inc. All rights reserved. Basic Cache Configuration Configured via GemFire extension framework •  1) Each region maps to a jndi data source back by Greenplum •  2) Link an entity type and table •  3) Declare a field to be used as the key •  Compound keys supported •  4) Define a mapping between the table columns •  Default auto-configuration •  Optional name and column attributes for naming convention changes •  Class used to control type conversion •  Set of built in types <region name="Parent"> <region-attributes refid="PARTITION"> <partition-attributes/> </region-attributes> <gpdb:store datasource="datasource"> <gpdb:types> <gpdb:pdx name="io.pivotal...entity.Parent" table="parent"> <gpdb:id field="id" /> <gpdb:fields> <gpdb:field name="name" /> <gpdb:field name="id" column="id" /> <gpdb:field name="income" class="java.math.BigDecimal" /> </gpdb:fields> </gpdb:pdx> </gpdb:types> </gpdb:store> </region> 2 1 3 4
  • 15. 15© 2016 Pivotal Software, Inc. All rights reserved. Configuring Collocation Parent-child foreign key relationships supported through collocation 1.  Compound keys configurations result in a HashMap based key in GemFire 2.  Provided partition resolver works with compound keys <region name="Child"> <...> <partition-resolver> <class-name> io.pivotal.gemfire.gpdb.IdPartitionResolver </class-name> <parameter name="field"> <string>parentId</string> </parameter> </...> <gpdb:id> <gpdb:field ref="parentId" /> <gpdb:field ref="id" /> </gpdb:id> <gpdb:fields> <gpdb:field name="parentId"/> <gpdb:field name="id" /> </...> 1 2
  • 16. 16© 2016 Pivotal Software, Inc. All rights reserved. Configuring Automatic Synchronization ●  Data exported to Greenplum via asynchronous eventing ○  Time and batch size triggers available ●  Causes each GemFire member to independently interact with Greenplum ○  Configure GPDB resource queues accordingly <region name="Child"> <...> <gpdb:store datasource="datasource"> <gpdb:synchronize mode="automatic" time-interval="3000" persistent="false" /> <gpdb:types> <...>
  • 17. 17© 2016 Pivotal Software, Inc. All rights reserved. Case Study G2C Configuration Details Ÿ  Existing required domain objects –  Multiple many-to-one groupings Ÿ  Wide tables / objects (500+ fields) Ÿ  Data Collocation configured on caseId Ÿ  Source tables wrapped in views CaseWrapper -  caseId -  … ModelScores -  caseId -  … Documents -  caseId -  … PriorHistory -  caseId -  … OtherData… -  caseId -  … * * * * 1 LogicResults -  caseId -  …
  • 18. 18© 2016 Pivotal Software, Inc. All rights reserved. Simple Loading – Single Table per Object :LoadTrigger :GPDBService :Region :AsyncEventLister :LogicEngine results:Region Import() put() processEvents() process() put()
  • 19. 19© 2016 Pivotal Software, Inc. All rights reserved. Complex Loading – Multiple Tables per Object :MergeLoader :GPDBService :Region :LogicEngine results:Region Import() put() process() put() par assemble() :LoadTrigger executeFunction()
  • 20. 20© 2016 Pivotal Software, Inc. All rights reserved. Impacts & Results Ÿ  Simplified implementation & code reduction Ÿ  Maintained or improved data motion rates –  Case study CPU bound –  Additional improvements in the backlog Ÿ  Improved system throughput
  • 21. 21© 2016 Pivotal Software, Inc. All rights reserved. 21© 2016 Pivotal Software, Inc. All rights reserved. Questions?
  • 22. Join the Apache Geode Community! •  Check out: http://geode.incubator.apache.org •  Subscribe: user-subscribe@geode.incubator.apache.org •  Download: http://geode.incubator.apache.org/releases/