SlideShare a Scribd company logo
1 of 23
FOR FINANCE
CEO Jan-Kees Buenen
CTO Jorik Blaas
FOR FINANCE
“to Know your Datalake”
+ +
Dealing with the
complexity of
data at scale
Volume, variety, velocity and
veracity
Text
a,b,c,…
Sensors
IoT
Wearables
Etc.
Digital Images
Video
Pictures
Scans
Network
Numbers
1,2,3,…
SYNERSCOPE’S CHOICE FOR A HIGH-PERFORMANCE
CONVERGED SYSTEM
o Scalable
o Flexible
o Attacks the right bottlenecks
o Unique balance of compute, memory and
storage
SYNERSCOPE’S PRIMARY USE CASES FOR IBM
POWER
o High Volume transactions
o On-premise image analytics
o Sensor data (IoT)
o Large scale networks
IBM Power
Systems
COMPUTER AUTOMATION PLUS HUMAN INTERACTION
Machine
Learning + Visual
Sensemaking
o Speed and Flexibility
o Double cognitive system
Four main components
End to End
Analytics
Connect & Correlate
at Scale
State of the art
Access Control
AI
Deep Learning
TIME BURNERS WHEN WORKING WITH DATA
o Getting the infrastructure ready
o Data Science tooling deployment
o Loading data into the platform
o Data quality, provenance or cleaning
o Searching the data
o Selecting the relevant data
o Finding which data holds information
value
o Operationalize the new findings
o Getting to data driven ways of working
End-to-End
Analytics
Productivity with Raw
and Complex Data
BULK PROCESSING FOR ALL THE HEAVY LIFTING
o Ingest from unlimited data sources
o Content Scanning at field level
o Data-driven Correlation
o Enterprise Search
Connect and
correlate data at
Scale
Fully Automated, zero interaction by Ixiwa IximeerSped up by Ixiwa
Traditional ETL data science approach
Re-arrange the analytic workflow,
Bulk
Ingest
Scan &
Organize
Correlate &
Enrich
Find Extract Analyze
Find Ingest Organize Enrich Extract Analyze
AUTOMATED BULK PROCESSES DO THE HEAVY LIFTING
o Ingest
o Content Scanning
o Data-driven Correlation
o Enterprise Search
Replace upfront
human efforts
with targeted
efforts
FINE TOUCH INTEGRATION WITH HDP AND SPARK
o Scalable performance
IXIWA BUILD-IN INTELLIGENCE TO
o Detect input format and encodings
o Extract text from PDF, DOCX, XLSX, etc. with
Tika
o Index all data with SOLR for datalake-wide
search
o Automatically tag the data
Ixiwa on
IBM Power
Systems with
POWER8 for
heavy data lifting
PATENTED MANY-TO-MANY AUTO-CORRELATOR AT SCALE
Data
Similarity &
Linking
Data
Fingerprinting Auto-correlator
Similar data
redundant
Linkable data
• Augment
• Enrich
ACCESS CONTROL
o Role based access made easy
o First do bulk ingest then find data-driven
access
o Field level data scanning for precision
o Compartmentalize the data
o Roles and field content together define access
o Continuous audit trails
GDPR READY
o Know where all sensitive data is
o Know which applications and users access
what
Enterprise Grade
Security for your
Datalake
Ixiwa + Ranger + Atlas
IXIWA ON APACHE RANGER AND APACHE ATLAS
o Automated data tagging
o Easy setting of access permissions
o Bulk ingested data made easy to share safely
o Full provenance trails kept in Ranger
o Transparent, traceable and auditable
SECURE HIGH-LEVEL TASK AND WORKFLOW SYSTEM
o Reduce the need for one-off data science
activities
Smart data
scanning and
data
management
BOTTLENECKS
o How data is presented to humans
o Collaboration in and between groups
o Complex decision making
o Data driven decisions
AI BECOMES A HUMANS BEST COMPANION
o Changes how data is presented
o Allows humans to do what humans do best
o ASSOCIATE DATA AND INFORMATION
Productivity
Productivity
Productivity
with data at Scale
AI and
Deep Learning
Quality of the People
and Quality of the Data
Analyst
Analyze Emerging
Patterns & Context
Historical Data
Test found rules, return
high-quality hits
Streaming Flows
Implement found rules,
alert in near real-time
COMBINE IBM- HORTONWORKS- GOOGLE
SynerScope’s
fine touch
integration of
HDP, GPU &
Tensorflow
o Mixed cluster with multiple node types
o Leverage the Hybrid nature of Tensorflow (CPU+GPU)
o Leverage YARN node labels to launch on GPU nodes
o Models stored on HDFS
HDP
YARN
CPU Nodes
TensorFlow
PySpark
GPU Nodes
CUDA
TensorFlow
PySpark
SYNERSCOPE & TENSORFLOW ON IBM POWER SYSTEMS
Warp power for
Storage,
Compute and
Learning
o PowerAI Deep Learning distribution includes pre-compiled
TensorFlow binaries for fast deployment
o Direct support for NVIDIA Tesla and Pascal series
o NVLink support on Pascal
Spectrum
Scale
Storage
IBM
POWER8
HDP
PySpark
IBM POWER8
&
NVIDIA Pascal
HDP
PowerAI
TensorFlow
PySpark
Scalable ComputeScalable Storage Scalable Learning
EXAMPLE OF A PERFECT TASK FOR IBM POWER SYSTEMS
o Image classification and grading
o Batch-wise training on ever expanding data
o Field level data scanning for precision
On-premise
Image
Classification
when a public cloud is off-limits
SynerScope the
more flexible link
Practical Live Modelling with SAP, IBM Power Systems and ML
Live Order
Data
Stream
Full
Historical
Data
Labels &
Predictio
ns
Async
Stream
Processin
g
Trained and
Dynamically
Updated Models
Virtualized on IBM
POWER8
IBM POWER8 (lab) IBM POWER8 NVIDIA
CONTACTS
INFO@SYNERSCOPE.COM
CTO jorik.blaas@synerscope.com
CEO jan-kees.Buenen@synerscope.com
Learn more about Hortonworks on Power:
ibm.biz/hortonworksOnPower
Questions?

More Related Content

What's hot

Empowering you with Democratized Data Access, Data Science and Machine Learning
Empowering you with Democratized Data Access, Data Science and Machine LearningEmpowering you with Democratized Data Access, Data Science and Machine Learning
Empowering you with Democratized Data Access, Data Science and Machine LearningDataWorks Summit
 
Hadoop Journey at Walgreens
Hadoop Journey at WalgreensHadoop Journey at Walgreens
Hadoop Journey at WalgreensDataWorks Summit
 
Depositing Value from Transactional Data at Danske Bank
Depositing Value from Transactional Data at Danske BankDepositing Value from Transactional Data at Danske Bank
Depositing Value from Transactional Data at Danske BankDataWorks Summit/Hadoop Summit
 
Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHortonworks
 
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors DataWorks Summit/Hadoop Summit
 
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the CloudBring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the CloudDataWorks Summit
 
The key to unlocking the Value in the IoT? Managing the Data!
The key to unlocking the Value in the IoT? Managing the Data!The key to unlocking the Value in the IoT? Managing the Data!
The key to unlocking the Value in the IoT? Managing the Data!DataWorks Summit/Hadoop Summit
 
Big SQL: Powerful SQL Optimization - Re-Imagined for open source
Big SQL: Powerful SQL Optimization - Re-Imagined for open sourceBig SQL: Powerful SQL Optimization - Re-Imagined for open source
Big SQL: Powerful SQL Optimization - Re-Imagined for open sourceDataWorks Summit
 
Building a Big Data platform with the Hadoop ecosystem
Building a Big Data platform with the Hadoop ecosystemBuilding a Big Data platform with the Hadoop ecosystem
Building a Big Data platform with the Hadoop ecosystemGregg Barrett
 
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...DataWorks Summit
 
Hadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop SummitHadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop SummitDataWorks Summit
 
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...DataWorks Summit
 
Big Data Architecture and Deployment
Big Data Architecture and DeploymentBig Data Architecture and Deployment
Big Data Architecture and DeploymentCisco Canada
 

What's hot (20)

Log I am your father
Log I am your fatherLog I am your father
Log I am your father
 
Empowering you with Democratized Data Access, Data Science and Machine Learning
Empowering you with Democratized Data Access, Data Science and Machine LearningEmpowering you with Democratized Data Access, Data Science and Machine Learning
Empowering you with Democratized Data Access, Data Science and Machine Learning
 
Hadoop Journey at Walgreens
Hadoop Journey at WalgreensHadoop Journey at Walgreens
Hadoop Journey at Walgreens
 
Depositing Value from Transactional Data at Danske Bank
Depositing Value from Transactional Data at Danske BankDepositing Value from Transactional Data at Danske Bank
Depositing Value from Transactional Data at Danske Bank
 
Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - Jaspersoft
 
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
 
Filling the Data Lake
Filling the Data LakeFilling the Data Lake
Filling the Data Lake
 
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the CloudBring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
 
Big Data at your Desk with KNIME
Big Data at your Desk with KNIMEBig Data at your Desk with KNIME
Big Data at your Desk with KNIME
 
The key to unlocking the Value in the IoT? Managing the Data!
The key to unlocking the Value in the IoT? Managing the Data!The key to unlocking the Value in the IoT? Managing the Data!
The key to unlocking the Value in the IoT? Managing the Data!
 
Benefits of Hadoop as Platform as a Service
Benefits of Hadoop as Platform as a ServiceBenefits of Hadoop as Platform as a Service
Benefits of Hadoop as Platform as a Service
 
Keys for Success from Streams to Queries
Keys for Success from Streams to QueriesKeys for Success from Streams to Queries
Keys for Success from Streams to Queries
 
Big SQL: Powerful SQL Optimization - Re-Imagined for open source
Big SQL: Powerful SQL Optimization - Re-Imagined for open sourceBig SQL: Powerful SQL Optimization - Re-Imagined for open source
Big SQL: Powerful SQL Optimization - Re-Imagined for open source
 
Building a Big Data platform with the Hadoop ecosystem
Building a Big Data platform with the Hadoop ecosystemBuilding a Big Data platform with the Hadoop ecosystem
Building a Big Data platform with the Hadoop ecosystem
 
Beyond TCO
Beyond TCOBeyond TCO
Beyond TCO
 
Hadoop for the Masses
Hadoop for the MassesHadoop for the Masses
Hadoop for the Masses
 
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
 
Hadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop SummitHadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop Summit
 
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
 
Big Data Architecture and Deployment
Big Data Architecture and DeploymentBig Data Architecture and Deployment
Big Data Architecture and Deployment
 

Similar to Hortonworks Data Platform and IBM Systems - A Complete Solution for Cognitive Business

Scaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data ScienceScaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data ScienceeRic Choo
 
Big Data LDN 2017: Big Impact with Big Data
Big Data LDN 2017: Big Impact with Big DataBig Data LDN 2017: Big Impact with Big Data
Big Data LDN 2017: Big Impact with Big DataMatt Stubbs
 
Big Data Modeling Challenges and Machine Learning with No Code
Big Data Modeling Challenges and Machine Learning with No CodeBig Data Modeling Challenges and Machine Learning with No Code
Big Data Modeling Challenges and Machine Learning with No CodeLiana Ye
 
Initiate Edinburgh 2019 - Big Data Meets AI
Initiate Edinburgh 2019 - Big Data Meets AIInitiate Edinburgh 2019 - Big Data Meets AI
Initiate Edinburgh 2019 - Big Data Meets AIAmazon Web Services
 
Accelerating Big Data Insights
Accelerating Big Data InsightsAccelerating Big Data Insights
Accelerating Big Data InsightsDataWorks Summit
 
Ανδρέας Τσαγκάρης, 5th Digital Banking Forum
Ανδρέας Τσαγκάρης, 5th Digital Banking ForumΑνδρέας Τσαγκάρης, 5th Digital Banking Forum
Ανδρέας Τσαγκάρης, 5th Digital Banking ForumStarttech Ventures
 
Big Data made easy in the era of the Cloud - Demi Ben-Ari
Big Data made easy in the era of the Cloud - Demi Ben-AriBig Data made easy in the era of the Cloud - Demi Ben-Ari
Big Data made easy in the era of the Cloud - Demi Ben-AriDemi Ben-Ari
 
HPE Hadoop Solutions - From use cases to proposal
HPE Hadoop Solutions - From use cases to proposalHPE Hadoop Solutions - From use cases to proposal
HPE Hadoop Solutions - From use cases to proposalDataWorks Summit
 
Structuring Data from Unstructured Things. Sean Lorenz
Structuring Data from Unstructured Things. Sean LorenzStructuring Data from Unstructured Things. Sean Lorenz
Structuring Data from Unstructured Things. Sean LorenzFuture Insights
 
Design for X: Exploring Product Design with Apache Spark and GraphLab
Design for X: Exploring Product Design with Apache Spark and GraphLabDesign for X: Exploring Product Design with Apache Spark and GraphLab
Design for X: Exploring Product Design with Apache Spark and GraphLabAmanda Casari
 
Big Data and Health Care
Big Data and Health CareBig Data and Health Care
Big Data and Health CareJeffrey Funk
 
End-to-End Big Data AI with Analytics Zoo
End-to-End Big Data AI with Analytics ZooEnd-to-End Big Data AI with Analytics Zoo
End-to-End Big Data AI with Analytics ZooJason Dai
 
Modernizing upstream workflows with aws storage - john mallory
Modernizing upstream workflows with aws storage -  john malloryModernizing upstream workflows with aws storage -  john mallory
Modernizing upstream workflows with aws storage - john malloryAmazon Web Services
 
XDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics EcosystemXDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics EcosystemDan Eaton
 
Track B-1 建構新世代的智慧數據平台
Track B-1 建構新世代的智慧數據平台Track B-1 建構新世代的智慧數據平台
Track B-1 建構新世代的智慧數據平台Etu Solution
 
Denodo DataFest 2016: Big Data Virtualization in the Cloud
Denodo DataFest 2016: Big Data Virtualization in the CloudDenodo DataFest 2016: Big Data Virtualization in the Cloud
Denodo DataFest 2016: Big Data Virtualization in the CloudDenodo
 
Derfor skal du bruge en DataLake
Derfor skal du bruge en DataLakeDerfor skal du bruge en DataLake
Derfor skal du bruge en DataLakeMicrosoft
 
Trivadis - Microsoft Transform your data estate with cloud, data and AI
Trivadis - Microsoft Transform your data estate with cloud, data and AITrivadis - Microsoft Transform your data estate with cloud, data and AI
Trivadis - Microsoft Transform your data estate with cloud, data and AITrivadis
 

Similar to Hortonworks Data Platform and IBM Systems - A Complete Solution for Cognitive Business (20)

Scaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data ScienceScaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data Science
 
Big Data LDN 2017: Big Impact with Big Data
Big Data LDN 2017: Big Impact with Big DataBig Data LDN 2017: Big Impact with Big Data
Big Data LDN 2017: Big Impact with Big Data
 
Big Data Modeling Challenges and Machine Learning with No Code
Big Data Modeling Challenges and Machine Learning with No CodeBig Data Modeling Challenges and Machine Learning with No Code
Big Data Modeling Challenges and Machine Learning with No Code
 
Initiate Edinburgh 2019 - Big Data Meets AI
Initiate Edinburgh 2019 - Big Data Meets AIInitiate Edinburgh 2019 - Big Data Meets AI
Initiate Edinburgh 2019 - Big Data Meets AI
 
Accelerating Big Data Insights
Accelerating Big Data InsightsAccelerating Big Data Insights
Accelerating Big Data Insights
 
Ανδρέας Τσαγκάρης, 5th Digital Banking Forum
Ανδρέας Τσαγκάρης, 5th Digital Banking ForumΑνδρέας Τσαγκάρης, 5th Digital Banking Forum
Ανδρέας Τσαγκάρης, 5th Digital Banking Forum
 
Big Data made easy in the era of the Cloud - Demi Ben-Ari
Big Data made easy in the era of the Cloud - Demi Ben-AriBig Data made easy in the era of the Cloud - Demi Ben-Ari
Big Data made easy in the era of the Cloud - Demi Ben-Ari
 
HPE Hadoop Solutions - From use cases to proposal
HPE Hadoop Solutions - From use cases to proposalHPE Hadoop Solutions - From use cases to proposal
HPE Hadoop Solutions - From use cases to proposal
 
Structuring Data from Unstructured Things. Sean Lorenz
Structuring Data from Unstructured Things. Sean LorenzStructuring Data from Unstructured Things. Sean Lorenz
Structuring Data from Unstructured Things. Sean Lorenz
 
Design for X: Exploring Product Design with Apache Spark and GraphLab
Design for X: Exploring Product Design with Apache Spark and GraphLabDesign for X: Exploring Product Design with Apache Spark and GraphLab
Design for X: Exploring Product Design with Apache Spark and GraphLab
 
Big Data and Health Care
Big Data and Health CareBig Data and Health Care
Big Data and Health Care
 
BDA ( haoop ).pptx
BDA ( haoop ).pptxBDA ( haoop ).pptx
BDA ( haoop ).pptx
 
Big Data with Azure
Big Data with AzureBig Data with Azure
Big Data with Azure
 
End-to-End Big Data AI with Analytics Zoo
End-to-End Big Data AI with Analytics ZooEnd-to-End Big Data AI with Analytics Zoo
End-to-End Big Data AI with Analytics Zoo
 
Modernizing upstream workflows with aws storage - john mallory
Modernizing upstream workflows with aws storage -  john malloryModernizing upstream workflows with aws storage -  john mallory
Modernizing upstream workflows with aws storage - john mallory
 
XDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics EcosystemXDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
 
Track B-1 建構新世代的智慧數據平台
Track B-1 建構新世代的智慧數據平台Track B-1 建構新世代的智慧數據平台
Track B-1 建構新世代的智慧數據平台
 
Denodo DataFest 2016: Big Data Virtualization in the Cloud
Denodo DataFest 2016: Big Data Virtualization in the CloudDenodo DataFest 2016: Big Data Virtualization in the Cloud
Denodo DataFest 2016: Big Data Virtualization in the Cloud
 
Derfor skal du bruge en DataLake
Derfor skal du bruge en DataLakeDerfor skal du bruge en DataLake
Derfor skal du bruge en DataLake
 
Trivadis - Microsoft Transform your data estate with cloud, data and AI
Trivadis - Microsoft Transform your data estate with cloud, data and AITrivadis - Microsoft Transform your data estate with cloud, data and AI
Trivadis - Microsoft Transform your data estate with cloud, data and AI
 

More from DataWorks Summit/Hadoop Summit

Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerDataWorks Summit/Hadoop Summit
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformDataWorks Summit/Hadoop Summit
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDataWorks Summit/Hadoop Summit
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...DataWorks Summit/Hadoop Summit
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLDataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)DataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...DataWorks Summit/Hadoop Summit
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesDataWorks Summit/Hadoop Summit
 

More from DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
 

Recently uploaded

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Recently uploaded (20)

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Hortonworks Data Platform and IBM Systems - A Complete Solution for Cognitive Business

  • 1. FOR FINANCE CEO Jan-Kees Buenen CTO Jorik Blaas
  • 2. FOR FINANCE “to Know your Datalake” + +
  • 3. Dealing with the complexity of data at scale Volume, variety, velocity and veracity Text a,b,c,… Sensors IoT Wearables Etc. Digital Images Video Pictures Scans Network Numbers 1,2,3,…
  • 4. SYNERSCOPE’S CHOICE FOR A HIGH-PERFORMANCE CONVERGED SYSTEM o Scalable o Flexible o Attacks the right bottlenecks o Unique balance of compute, memory and storage SYNERSCOPE’S PRIMARY USE CASES FOR IBM POWER o High Volume transactions o On-premise image analytics o Sensor data (IoT) o Large scale networks IBM Power Systems
  • 5. COMPUTER AUTOMATION PLUS HUMAN INTERACTION Machine Learning + Visual Sensemaking o Speed and Flexibility o Double cognitive system
  • 6. Four main components End to End Analytics Connect & Correlate at Scale State of the art Access Control AI Deep Learning
  • 7. TIME BURNERS WHEN WORKING WITH DATA o Getting the infrastructure ready o Data Science tooling deployment o Loading data into the platform o Data quality, provenance or cleaning o Searching the data o Selecting the relevant data o Finding which data holds information value o Operationalize the new findings o Getting to data driven ways of working End-to-End Analytics Productivity with Raw and Complex Data
  • 8. BULK PROCESSING FOR ALL THE HEAVY LIFTING o Ingest from unlimited data sources o Content Scanning at field level o Data-driven Correlation o Enterprise Search Connect and correlate data at Scale
  • 9. Fully Automated, zero interaction by Ixiwa IximeerSped up by Ixiwa Traditional ETL data science approach Re-arrange the analytic workflow, Bulk Ingest Scan & Organize Correlate & Enrich Find Extract Analyze Find Ingest Organize Enrich Extract Analyze
  • 10. AUTOMATED BULK PROCESSES DO THE HEAVY LIFTING o Ingest o Content Scanning o Data-driven Correlation o Enterprise Search Replace upfront human efforts with targeted efforts
  • 11. FINE TOUCH INTEGRATION WITH HDP AND SPARK o Scalable performance IXIWA BUILD-IN INTELLIGENCE TO o Detect input format and encodings o Extract text from PDF, DOCX, XLSX, etc. with Tika o Index all data with SOLR for datalake-wide search o Automatically tag the data Ixiwa on IBM Power Systems with POWER8 for heavy data lifting
  • 12. PATENTED MANY-TO-MANY AUTO-CORRELATOR AT SCALE Data Similarity & Linking Data Fingerprinting Auto-correlator Similar data redundant Linkable data • Augment • Enrich
  • 13. ACCESS CONTROL o Role based access made easy o First do bulk ingest then find data-driven access o Field level data scanning for precision o Compartmentalize the data o Roles and field content together define access o Continuous audit trails GDPR READY o Know where all sensitive data is o Know which applications and users access what Enterprise Grade Security for your Datalake Ixiwa + Ranger + Atlas
  • 14. IXIWA ON APACHE RANGER AND APACHE ATLAS o Automated data tagging o Easy setting of access permissions o Bulk ingested data made easy to share safely o Full provenance trails kept in Ranger o Transparent, traceable and auditable SECURE HIGH-LEVEL TASK AND WORKFLOW SYSTEM o Reduce the need for one-off data science activities Smart data scanning and data management
  • 15. BOTTLENECKS o How data is presented to humans o Collaboration in and between groups o Complex decision making o Data driven decisions AI BECOMES A HUMANS BEST COMPANION o Changes how data is presented o Allows humans to do what humans do best o ASSOCIATE DATA AND INFORMATION Productivity Productivity Productivity with data at Scale
  • 16. AI and Deep Learning Quality of the People and Quality of the Data
  • 17. Analyst Analyze Emerging Patterns & Context Historical Data Test found rules, return high-quality hits Streaming Flows Implement found rules, alert in near real-time
  • 18. COMBINE IBM- HORTONWORKS- GOOGLE SynerScope’s fine touch integration of HDP, GPU & Tensorflow o Mixed cluster with multiple node types o Leverage the Hybrid nature of Tensorflow (CPU+GPU) o Leverage YARN node labels to launch on GPU nodes o Models stored on HDFS HDP YARN CPU Nodes TensorFlow PySpark GPU Nodes CUDA TensorFlow PySpark
  • 19. SYNERSCOPE & TENSORFLOW ON IBM POWER SYSTEMS Warp power for Storage, Compute and Learning o PowerAI Deep Learning distribution includes pre-compiled TensorFlow binaries for fast deployment o Direct support for NVIDIA Tesla and Pascal series o NVLink support on Pascal Spectrum Scale Storage IBM POWER8 HDP PySpark IBM POWER8 & NVIDIA Pascal HDP PowerAI TensorFlow PySpark Scalable ComputeScalable Storage Scalable Learning
  • 20. EXAMPLE OF A PERFECT TASK FOR IBM POWER SYSTEMS o Image classification and grading o Batch-wise training on ever expanding data o Field level data scanning for precision On-premise Image Classification when a public cloud is off-limits
  • 22. Practical Live Modelling with SAP, IBM Power Systems and ML Live Order Data Stream Full Historical Data Labels & Predictio ns Async Stream Processin g Trained and Dynamically Updated Models Virtualized on IBM POWER8 IBM POWER8 (lab) IBM POWER8 NVIDIA
  • 23. CONTACTS INFO@SYNERSCOPE.COM CTO jorik.blaas@synerscope.com CEO jan-kees.Buenen@synerscope.com Learn more about Hortonworks on Power: ibm.biz/hortonworksOnPower Questions?