Internal Use - Confidential
DataWorks Summit
Shawn Smith – Big Data Specialist
shawn.smith@dell.com
Accelerating Big Data Insights
Transforming The Business
We help organizations reinvent themselves and realize their digital future
• Digital Transformation
• Security Transformation
• Workforce Transformation
• IT Transformation
BUSINESS TRANSFORMATION
Ready for Whatever Comes Next:
AI, Augmented Reality, Machine Learning . . .
Emerging Challenges
What is Unstructured Data?
• More than 80% of data created globally is unstructured
• File data is growing VERY fast. Most customers see 30% to 50% unstructured data growth year over year
• Dell EMC is #1 in Scale Out File & Object storage according to IDC and Gartner because of SIMPLICITY!
• Simple – Single Volume
• Efficient – Best Storage Utilization
• Scale-Out – Scale and grow without pain
• NO MIGRATIONS!
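To put the 30% to 50% year-over-year growth figure in concrete terms, the short Python sketch below compounds it over a few years. The 100 TB starting footprint and the function names are illustrative assumptions, not figures from the deck.

```python
import math


def projected_capacity(start_tb: float, annual_growth: float, years: int) -> float:
    """Capacity after compounding `annual_growth` (0.30 means 30%) for `years` years."""
    return start_tb * (1.0 + annual_growth) ** years


def years_to_double(annual_growth: float) -> float:
    """Years needed for capacity to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_growth)


if __name__ == "__main__":
    start_tb = 100.0  # hypothetical starting footprint, not from the slides
    for rate in (0.30, 0.50):  # growth range quoted on the slide
        after_3 = projected_capacity(start_tb, rate, 3)
        doubling = years_to_double(rate)
        print(f"{rate:.0%} growth: {after_3:.1f} TB after 3 years, "
              f"doubles in about {doubling:.1f} years")
```

At either end of the quoted range, the footprint doubles in under three years, which is why the slide stresses scaling without migrations.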
Unstructured Data Requires
• Unconstrained Scale
• Optimized TCO/ROI
• Longevity
• Flash to Cloud Flexibility
• Enterprise Features
• Massive Performance
SIMPLICITY At Any Scale
Unstructured Analytics Use Cases
• Fraud Detection & Risk Analytics
• Trading / Tick Data Analytics
• IoT
• Customer 360 Analytics
Data Driven Business Transformation
Enabling enterprises to improve operational efficiencies and monetize new revenue streams
Organizations need to deliver analytics on more than
just their traditional structured data
Evolving spectrum of data analytics
Requires infrastructure that enables multiple applications and varied use cases
• Predictive Analytics
• Business Intelligence
• Analytics of Things
• Cyber security Analytics
• Real-time Analytics
• Machine Learning
Enables analytics for ALL of your data
Dell EMC Unstructured Analytics Portfolio
• Performance Centric
• Storage Centric
• Archive Centric
Predictive Analytics, Business Intelligence, Analytics of Things, Cyber security Analytics, Real-time Analytics, Machine Learning
Proven solutions for unstructured analytics
Dell EMC Unstructured Analytics Portfolio
Solution accelerators
• Hadoop Ready Bundle
• QuickStart for Hadoop
• EDW Optimization Solutions
• Hadoop Backup Solutions
• SAS-Grid Solution with Isilon
• Streaming Analytics Solutions
• Splunk Ready System
Right Solution Configuration for the use case
ENTERPRISE REQUIREMENTS drive CONFIGURATION
Performance-centric (one or more):
• High Performance
• 100% Compliance to Hadoop features
• Ability to scale down at cost
Storage-centric (one or more):
• Storage scaling faster than compute
• Enterprise Grade File Mgmt.
• Consolidation of IT Workloads
• Aggregate capacity > 100 TB
Archive-centric:
• Geo-distributed single namespace
• 40% to 60% less than public cloud
Configurations shown: Compute + Data with Direct Attached Storage; separate Compute and Data with Shared Storage
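One way to read the "Right Solution Configuration" slide is as a lookup: a workload that has one or more of the listed requirements points to the matching category. The Python sketch below encodes that reading. The grouping of requirements under each category is an interpretation of the slide layout, the requirement strings are paraphrased, and the names `CATEGORY_REQUIREMENTS` and `matching_categories` are illustrative.

```python
from typing import Dict, List, Set

# Requirement groups as read from the slide (an interpretation of the
# flattened slide text, not an official Dell EMC sizing rule).
CATEGORY_REQUIREMENTS: Dict[str, Set[str]] = {
    "performance-centric": {
        "high performance",
        "100% compliance to hadoop features",
        "ability to scale down at cost",
    },
    "storage-centric": {
        "storage scaling faster than compute",
        "enterprise grade file management",
        "consolidation of it workloads",
        "aggregate capacity > 100 TB",
    },
    "archive-centric": {
        "geo-distributed single namespace",
        "40% to 60% less than public cloud",
    },
}


def matching_categories(needs: Set[str]) -> List[str]:
    """Return every category for which at least one listed requirement applies."""
    return [
        category
        for category, requirements in CATEGORY_REQUIREMENTS.items()
        if needs & requirements  # "one or more" requirements match
    ]


if __name__ == "__main__":
    workload = {"aggregate capacity > 100 TB", "storage scaling faster than compute"}
    print(matching_categories(workload))
```

A workload can match more than one category, in which case the function returns all of them in slide order and the final choice stays with the architect.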
THE BEDROCK OF THE MODERN DATA CENTER
PowerEdge R740xd
High performance server
• Performance and Scale: Expanded GPU & storage capacity boost workload performance
• Innovative Design: Up to 24 NVMe with up to 18 x 3.5” drives
• Integrated Security: Cyber resilient architecture, security is integrated into full server lifecycle – from design to retirement
• Intelligent automation: New OpenManage™ Enterprise console delivers crystal clear reporting & full lifecycle automation
Market Leader Hadoop
Shared Storage
Customers running
Analytics / Hadoop
PBs of Analytics / Hadoop
• World’s #1 Courier Company
• 3 of the largest telecommunications companies in the Americas
• One of the largest online retailers
• Multiple leading financial institutions
WHO IS USING ISILON FOR ANALYTICS?
385
Isilon Analytics Momentum
21 Industry Verticals
Hadoop Architecture - Traditional
[Diagram: R (RHIPE), Mahout, Hive, HBase and Pig over the Job Tracker, Task Tracker, DataNode, NameNode and 2nd NameNode services; six combined Data Node + Compute Node servers connected by Ethernet]
Hadoop Architecture with Isilon
[Diagram: R (RHIPE), Pig, Mahout, Hive and HBase over the Job Tracker, Task Tracker, DataNode and NameNode services; six Compute Nodes connected by Ethernet to shared storage labeled with name node and datanode roles]
ISILON DATA LAKE
DATA PROTECTION
DATA SECURITY
PERFORMANCE MANAGEMENT
DATA MANAGEMENT
HOW IT LOOKS
[Animated diagram: Compute nodes running MAP and Reduce tasks exchange "node info" and "node reply" messages with the name node and datanode roles on Isilon OneFS, which holds the Data (labeled 1X). Files are reachable over HDFS, SMB, NFS, HTTP and FTP. Caption: Workload Consolidation and streaming analytics / Sharepoint]
Phased Approach to Hadoop Tiered Storage with Isilon
Phase 0: Archival Cluster
• Hadoop Cluster with DAS for interactive and batch queries
• Queryable “active archive” in Isilon / ECS configured as a separate Hadoop cluster
• Archival policy implemented using scripts executed manually
Phase 1: Tiering with Location Aware queries
• Hot data in Hadoop Cluster with DAS
• Cold data in Isilon configured as an HDFS Target
• Hive, MapReduce and Spark jobs can run across the 2 clusters
• URIs indicate whether data is in the DAS cluster or the Isilon cluster
• Tiering policy implemented using scripts executed manually
Phase 2: Tiering with Location transparent queries
Same as Phase 1, with additional capability:
• Data location handled transparently for Hive, MapReduce and Spark jobs: URIs don’t need to indicate whether data is in the DAS cluster or the Isilon cluster
Phase 3: Automated tiering
Same as Phase 2, with additional capability:
• Tiering policy implemented using automated data movement mechanisms
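The manually executed tiering scripts and location-aware URIs of Phase 1 could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the two cluster endpoints, the 90-day threshold in the example and the dictionary of last-access times are hypothetical, and the script only builds `hadoop distcp` command lines without running them.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Hypothetical cluster endpoints; real host names and ports will differ.
DAS_URI = "hdfs://das-cluster.example.com:8020"
ISILON_URI = "hdfs://isilon.example.com:8020"


def select_cold_paths(last_access: Dict[str, datetime], now: datetime,
                      max_age_days: int) -> List[str]:
    """Return paths last accessed before the cutoff, sorted for stable output."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(path for path, accessed in last_access.items() if accessed < cutoff)


def tiering_commands(cold_paths: List[str]) -> List[List[str]]:
    """Build (but do not run) one `hadoop distcp` copy per cold path, DAS to Isilon."""
    return [["hadoop", "distcp", DAS_URI + path, ISILON_URI + path]
            for path in cold_paths]


def location_aware_uri(path: str, cold_paths: List[str]) -> str:
    """Phase 1 style lookup: the URI itself says which cluster holds the data."""
    return (ISILON_URI if path in cold_paths else DAS_URI) + path


if __name__ == "__main__":
    # Illustrative inventory; a real script would gather this from the cluster.
    inventory = {
        "/warehouse/sales/2016": datetime(2017, 1, 15),
        "/warehouse/sales/2018": datetime(2018, 5, 30),
    }
    cold = select_cold_paths(inventory, now=datetime(2018, 6, 1), max_age_days=90)
    for command in tiering_commands(cold):
        print(" ".join(command))
    print(location_aware_uri("/warehouse/sales/2016", cold))
```

In Phase 2 the `location_aware_uri` step would no longer be needed in job code, and in Phase 3 the selection and copy steps would be replaced by automated data movement.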
What is the Internet of Things?
It is an ecosystem where sensors, devices and equipment are connected to a network and can transmit and receive data for tracking, analysis and action.
Operational Technology
• Industrial automation
• Fleet telematics
• Material handling
Information Technology
• Assets
• Inventory
• People
IoT
It’s not new and not new to Dell. It is the integration and extension of OT and IT technologies that have been around for decades.
It’s a great big IoT world out there
Smart Connected Business – from gateways to informed decisions
Transport Connector
Private and public networks
10’s of billions of connected things
Things Sensors
High-performance computer infrastructure
Application layer
SAP HANA
In-Memory database layer
Libraries
Manufacturing
Energy and Natural Resources
Transportation
Building & Industrial Automation
Multiple Partners and Blueprints for OT / IT
SAP HANA®, Software AG Apama®, Kepware KEPServerEX®
Dell Edge Gateway 5000, Dell EMC Data Center
Structured Data, Real-Time Data, Unstructured Data
Visualizations, Stream Analytics, Machine Learning, Reporting, Analytics, Protocol Translation
Our Vision for Unstructured Storage
• FILE: ISILON
• OBJECT: ECS
• STREAM: PROJECT NAUTILUS
Software-Defined, In The Cloud, Common Experience, Common Hardware
Project “Nautilus”
Streaming Storage + Analytics Engine
Turbocharge Isilon and ECS for Streaming
Today’s IoT Analytics “Accidental Architecture”
[Diagram labels: Producers (Sensors, Mobile Devices, App Logs); Streaming IoT data; Real-Time; Batch; Batch Storage tier; MirrorMaker; DR Site; Real-time intelligence at the NOC; Interactive exploration by Data Scientists; Surface / Act]
Project Nautilus: A Unified Data Pipeline
Strongly Consistent Storage, Exactly Once Processing, Unified Analytics
[Diagram labels: Producers (Sensors, Mobile Devices, App Logs); Pravega Streams (Ingest Buffer, Pub/Sub, Search, Persistent Data Structures); Unified Storage (Isilon / ECS); Unified Analytics (Real-Time, Batch, Interactive); Real-time intelligence at the NOC; Interactive exploration by Data Scientists; Surface / Act]
Project Nautilus: A Unified Data Pipeline
Strongly Consistent Storage, Exactly Once Processing, Unified Analytics
[Diagram labels: Producers (Sensors, Mobile Devices, App Logs); Pravega Streams (Ingest, Pub/Sub, Search, S3); Unified Storage (Isilon / ECS) with HDFS, NFS and SMB; Unified Analytics (Real-Time, Batch, Interactive); Real-time intelligence at the NOC; Interactive exploration by Data Scientists; Surface / Act]
pravega.io
Dell EMC Unstructured Analytics Solutions

Contenu connexe

Tendances

Insights into Real-world Data Management Challenges
Insights into Real-world Data Management ChallengesInsights into Real-world Data Management Challenges
Insights into Real-world Data Management ChallengesDataWorks Summit
 
Build Big Data Enterprise solutions faster on Azure HDInsight
Build Big Data Enterprise solutions faster on Azure HDInsightBuild Big Data Enterprise solutions faster on Azure HDInsight
Build Big Data Enterprise solutions faster on Azure HDInsightDataWorks Summit
 
Realizing the Promise of Portable Data Processing with Apache Beam
Realizing the Promise of Portable Data Processing with Apache BeamRealizing the Promise of Portable Data Processing with Apache Beam
Realizing the Promise of Portable Data Processing with Apache BeamDataWorks Summit
 
Big Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsBig Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsDataWorks Summit/Hadoop Summit
 
Built-In Security for the Cloud
Built-In Security for the CloudBuilt-In Security for the Cloud
Built-In Security for the CloudDataWorks Summit
 
Insights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesInsights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesDataWorks Summit
 
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the CloudBring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the CloudDataWorks Summit
 
Enabling Modern Application Architecture using Data.gov open government data
Enabling Modern Application Architecture using Data.gov open government dataEnabling Modern Application Architecture using Data.gov open government data
Enabling Modern Application Architecture using Data.gov open government dataDataWorks Summit
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...DataWorks Summit
 
Hadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsHadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsDataWorks Summit/Hadoop Summit
 
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...DataWorks Summit/Hadoop Summit
 
HPE Hadoop Solutions - From use cases to proposal
HPE Hadoop Solutions - From use cases to proposalHPE Hadoop Solutions - From use cases to proposal
HPE Hadoop Solutions - From use cases to proposalDataWorks Summit
 
Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...
Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...
Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...DataWorks Summit
 
A New "Sparkitecture" for modernizing your data warehouse
A New "Sparkitecture" for modernizing your data warehouseA New "Sparkitecture" for modernizing your data warehouse
A New "Sparkitecture" for modernizing your data warehouseDataWorks Summit/Hadoop Summit
 
Protecting your Critical Hadoop Clusters Against Disasters
Protecting your Critical Hadoop Clusters Against DisastersProtecting your Critical Hadoop Clusters Against Disasters
Protecting your Critical Hadoop Clusters Against DisastersDataWorks Summit
 
Apache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateApache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateDataWorks Summit
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureDataWorks Summit
 
Improving Hadoop Resiliency and Operational Efficiency with EMC Isilon
Improving Hadoop Resiliency and Operational Efficiency with EMC IsilonImproving Hadoop Resiliency and Operational Efficiency with EMC Isilon
Improving Hadoop Resiliency and Operational Efficiency with EMC IsilonDataWorks Summit/Hadoop Summit
 

Tendances (20)

Insights into Real-world Data Management Challenges
Insights into Real-world Data Management ChallengesInsights into Real-world Data Management Challenges
Insights into Real-world Data Management Challenges
 
Build Big Data Enterprise solutions faster on Azure HDInsight
Build Big Data Enterprise solutions faster on Azure HDInsightBuild Big Data Enterprise solutions faster on Azure HDInsight
Build Big Data Enterprise solutions faster on Azure HDInsight
 
Realizing the Promise of Portable Data Processing with Apache Beam
Realizing the Promise of Portable Data Processing with Apache BeamRealizing the Promise of Portable Data Processing with Apache Beam
Realizing the Promise of Portable Data Processing with Apache Beam
 
Big Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsBig Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the Experts
 
Built-In Security for the Cloud
Built-In Security for the CloudBuilt-In Security for the Cloud
Built-In Security for the Cloud
 
Insights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesInsights into Real World Data Management Challenges
Insights into Real World Data Management Challenges
 
HPE Keynote Hadoop Summit San Jose 2016
HPE Keynote Hadoop Summit San Jose 2016HPE Keynote Hadoop Summit San Jose 2016
HPE Keynote Hadoop Summit San Jose 2016
 
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the CloudBring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
 
Deep Learning using Spark and DL4J for fun and profit
Deep Learning using Spark and DL4J for fun and profitDeep Learning using Spark and DL4J for fun and profit
Deep Learning using Spark and DL4J for fun and profit
 
Enabling Modern Application Architecture using Data.gov open government data
Enabling Modern Application Architecture using Data.gov open government dataEnabling Modern Application Architecture using Data.gov open government data
Enabling Modern Application Architecture using Data.gov open government data
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...
 
Hadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsHadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the experts
 
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
 
HPE Hadoop Solutions - From use cases to proposal
HPE Hadoop Solutions - From use cases to proposalHPE Hadoop Solutions - From use cases to proposal
HPE Hadoop Solutions - From use cases to proposal
 
Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...
Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...
Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...
 
A New "Sparkitecture" for modernizing your data warehouse
A New "Sparkitecture" for modernizing your data warehouseA New "Sparkitecture" for modernizing your data warehouse
A New "Sparkitecture" for modernizing your data warehouse
 
Protecting your Critical Hadoop Clusters Against Disasters
Protecting your Critical Hadoop Clusters Against DisastersProtecting your Critical Hadoop Clusters Against Disasters
Protecting your Critical Hadoop Clusters Against Disasters
 
Apache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateApache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community Update
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and Future
 
Improving Hadoop Resiliency and Operational Efficiency with EMC Isilon
Improving Hadoop Resiliency and Operational Efficiency with EMC IsilonImproving Hadoop Resiliency and Operational Efficiency with EMC Isilon
Improving Hadoop Resiliency and Operational Efficiency with EMC Isilon
 

Similaire à Dell EMC Unstructured Analytics Solutions

MT129 Isilon Data Lake Overview
MT129 Isilon Data Lake OverviewMT129 Isilon Data Lake Overview
MT129 Isilon Data Lake OverviewDell EMC World
 
Exploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis KapsalisExploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis KapsalisNetAppUK
 
The Transformation of your Data in modern IT (Presented by DellEMC)
The Transformation of your Data in modern IT (Presented by DellEMC)The Transformation of your Data in modern IT (Presented by DellEMC)
The Transformation of your Data in modern IT (Presented by DellEMC)Cloudera, Inc.
 
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016StampedeCon
 
Exploring the Wider World of Big Data
Exploring the Wider World of Big DataExploring the Wider World of Big Data
Exploring the Wider World of Big DataNetApp
 
2016 August POWER Up Your Insights - IBM System Summit Mumbai
2016 August POWER Up Your Insights - IBM System Summit Mumbai2016 August POWER Up Your Insights - IBM System Summit Mumbai
2016 August POWER Up Your Insights - IBM System Summit MumbaiAnand Haridass
 
Ibm integrated analytics system
Ibm integrated analytics systemIbm integrated analytics system
Ibm integrated analytics systemModusOptimum
 
Breaking the Silos: Storage for Analytics & AI
Breaking the Silos: Storage for Analytics & AIBreaking the Silos: Storage for Analytics & AI
Breaking the Silos: Storage for Analytics & AIDataWorks Summit
 
dell-emc-powerscale-for-ngs.pptx
dell-emc-powerscale-for-ngs.pptxdell-emc-powerscale-for-ngs.pptx
dell-emc-powerscale-for-ngs.pptxSriramFreelance
 
Alluxio - Virtual Unified File System
Alluxio - Virtual Unified File System Alluxio - Virtual Unified File System
Alluxio - Virtual Unified File System Alluxio, Inc.
 
MT47 Modernize infrastructure for a modern data center
MT47 Modernize infrastructure for a modern data centerMT47 Modernize infrastructure for a modern data center
MT47 Modernize infrastructure for a modern data centerDell EMC World
 
Converged Everything, Converged Infrastructure delivering business value and ...
Converged Everything, Converged Infrastructure delivering business value and ...Converged Everything, Converged Infrastructure delivering business value and ...
Converged Everything, Converged Infrastructure delivering business value and ...NetAppUK
 
MT25 Server technology trends, workload impacts, and the Dell Point of View
MT25 Server technology trends, workload impacts, and the Dell Point of ViewMT25 Server technology trends, workload impacts, and the Dell Point of View
MT25 Server technology trends, workload impacts, and the Dell Point of ViewDell EMC World
 
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...Denodo
 
Alluxio @ Uber Seattle Meetup
Alluxio @ Uber Seattle MeetupAlluxio @ Uber Seattle Meetup
Alluxio @ Uber Seattle MeetupAlluxio, Inc.
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization Denodo
 
Verizon Centralizes Data into a Data Lake in Real Time for Analytics
Verizon Centralizes Data into a Data Lake in Real Time for AnalyticsVerizon Centralizes Data into a Data Lake in Real Time for Analytics
Verizon Centralizes Data into a Data Lake in Real Time for AnalyticsDataWorks Summit
 
Presentation architecting virtualized infrastructure for big data
Presentation   architecting virtualized infrastructure for big dataPresentation   architecting virtualized infrastructure for big data
Presentation architecting virtualized infrastructure for big datasolarisyourep
 

Similaire à Dell EMC Unstructured Analytics Solutions (20)

MT129 Isilon Data Lake Overview
MT129 Isilon Data Lake OverviewMT129 Isilon Data Lake Overview
MT129 Isilon Data Lake Overview
 
Exploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis KapsalisExploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis Kapsalis
 
The Transformation of your Data in modern IT (Presented by DellEMC)
The Transformation of your Data in modern IT (Presented by DellEMC)The Transformation of your Data in modern IT (Presented by DellEMC)
The Transformation of your Data in modern IT (Presented by DellEMC)
 
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
 
Exploring the Wider World of Big Data
Exploring the Wider World of Big DataExploring the Wider World of Big Data
Exploring the Wider World of Big Data
 
2016 August POWER Up Your Insights - IBM System Summit Mumbai
2016 August POWER Up Your Insights - IBM System Summit Mumbai2016 August POWER Up Your Insights - IBM System Summit Mumbai
2016 August POWER Up Your Insights - IBM System Summit Mumbai
 
Ibm integrated analytics system
Ibm integrated analytics systemIbm integrated analytics system
Ibm integrated analytics system
 
Breaking the Silos: Storage for Analytics & AI
Breaking the Silos: Storage for Analytics & AIBreaking the Silos: Storage for Analytics & AI
Breaking the Silos: Storage for Analytics & AI
 
dell-emc-powerscale-for-ngs.pptx
dell-emc-powerscale-for-ngs.pptxdell-emc-powerscale-for-ngs.pptx
dell-emc-powerscale-for-ngs.pptx
 
Alluxio - Virtual Unified File System
Alluxio - Virtual Unified File System Alluxio - Virtual Unified File System
Alluxio - Virtual Unified File System
 
MT47 Modernize infrastructure for a modern data center
MT47 Modernize infrastructure for a modern data centerMT47 Modernize infrastructure for a modern data center
MT47 Modernize infrastructure for a modern data center
 
Sgi hadoop
Sgi hadoopSgi hadoop
Sgi hadoop
 
Converged Everything, Converged Infrastructure delivering business value and ...
Converged Everything, Converged Infrastructure delivering business value and ...Converged Everything, Converged Infrastructure delivering business value and ...
Converged Everything, Converged Infrastructure delivering business value and ...
 
MT25 Server technology trends, workload impacts, and the Dell Point of View
MT25 Server technology trends, workload impacts, and the Dell Point of ViewMT25 Server technology trends, workload impacts, and the Dell Point of View
MT25 Server technology trends, workload impacts, and the Dell Point of View
 
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
How to Swiftly Operationalize the Data Lake for Advanced Analytics Using a Lo...
 
Alluxio @ Uber Seattle Meetup
Alluxio @ Uber Seattle MeetupAlluxio @ Uber Seattle Meetup
Alluxio @ Uber Seattle Meetup
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
 
Verizon Centralizes Data into a Data Lake in Real Time for Analytics
Verizon Centralizes Data into a Data Lake in Real Time for AnalyticsVerizon Centralizes Data into a Data Lake in Real Time for Analytics
Verizon Centralizes Data into a Data Lake in Real Time for Analytics
 
Big Data in Azure
Big Data in AzureBig Data in Azure
Big Data in Azure
 
Presentation architecting virtualized infrastructure for big data
Presentation   architecting virtualized infrastructure for big dataPresentation   architecting virtualized infrastructure for big data
Presentation architecting virtualized infrastructure for big data
 

Plus de DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Dell EMC Unstructured Analytics Solutions

  • 1. Internal Use - Confidential. DataWorks Summit: Accelerating Big Data Insights. Shawn Smith, Big Data Specialist (shawn.smith@dell.com)
  • 2. Transforming the Business: We help organizations reinvent themselves and realize their digital future. Digital Transformation; Security Transformation; Workforce Transformation; IT Transformation
  • 3. Business Transformation: Ready for Whatever Comes Next (AI, Augmented Reality, Machine Learning, ...). Emerging Challenges
  • 4. What Is Unstructured Data? More than 80% of data created globally is unstructured. File data is growing very fast: most customers see 30% to 50% unstructured data growth year over year. Dell EMC is #1 in scale-out file and object storage according to IDC and Gartner because of simplicity. Simple: single volume. Efficient: best storage utilization. Scale-out: scale and grow without pain. No migrations.
  • 5. Unstructured Data Requires: Unconstrained Scale; Optimized TCO/ROI; Longevity; Flash to Cloud Flexibility; Enterprise Features; Massive Performance. Simplicity at Any Scale
  • 6. Unstructured Analytics Use Cases: Fraud Detection & Risk Analytics; Trading / Tick Data Analytics; IoT; Customer 360 Analytics. Data-Driven Business Transformation: enabling enterprises to improve operational efficiencies and monetize new revenue streams
  • 7. Evolving Spectrum of Data Analytics: Organizations need to deliver analytics on more than just their traditional structured data. This requires infrastructure that enables multiple applications and varied use cases: Predictive Analytics; Business Intelligence; Analytics of Things; Cybersecurity Analytics; Real-Time Analytics; Machine Learning
  • 8. Dell EMC Unstructured Analytics Portfolio: Enables analytics for ALL of your data. Performance-Centric; Storage-Centric; Archive-Centric. Predictive Analytics; Business Intelligence; Analytics of Things; Cybersecurity Analytics; Real-Time Analytics; Machine Learning
  • 9. Dell EMC Unstructured Analytics Portfolio: Proven solutions for unstructured analytics. Solution accelerators: Hadoop Ready Bundle; QuickStart for Hadoop; EDW Optimization Solutions; Hadoop Backup Solutions; SAS-Grid Solution with Isilon; Streaming Analytics Solutions; Splunk Ready System
  • 10. Right Solution Configuration for the Use Case. Enterprise requirements drive configuration. Performance-centric (one or more of): high performance; 100% compliance with Hadoop features; ability to scale down at cost. Storage-centric (one or more of): storage scaling faster than compute; enterprise-grade file management; consolidation of IT workloads; aggregate capacity > 100 TB. Archive-centric: geo-distributed single namespace; 40% to 60% less than public cloud. Configurations shown: Compute + Data (direct-attached storage); Compute and Data separated (shared storage)
  • 11. The Bedrock of the Modern Data Center: PowerEdge R740xd, high-performance server. Performance and Scale: expanded GPU and storage capacity boost workload performance. Innovative Design: up to 24 NVMe with up to 18 x 3.5” drives. Integrated Security: cyber-resilient architecture; security is integrated into the full server lifecycle, from design to retirement. Intelligent Automation: new OpenManage™ Enterprise console delivers crystal-clear reporting and full lifecycle automation
  • 12. Who Is Using Isilon for Analytics? Market leader in Hadoop shared storage. Isilon Analytics Momentum: 385; 21 industry verticals; customers running Analytics / Hadoop; PBs of Analytics / Hadoop. Users include: the world's #1 courier company; 3 of the largest telecommunications companies in the Americas; one of the largest online retailers; multiple leading financial institutions
  • 13. Hadoop Architecture - Traditional (diagram): applications R (RHIPE), Mahout, Hive, HBase, Pig; services Job Tracker, Task Tracker, DataNode, NameNode, 2nd NameNode; six combined Data Node + Compute Node servers connected over Ethernet
  • 14. Hadoop Architecture with Isilon (diagram): applications R (RHIPE), Pig, Mahout, Hive, HBase; services Job Tracker, Task Tracker, DataNode, NameNode; six Compute Nodes connected over Ethernet; Isilon nodes labeled name node (four) and datanode
  • 15. Isilon Data Lake: Data Protection; Data Security; Performance Management; Data Management
  • 16. How It Looks (diagram): Isilon OneFS exposes SMB, NFS, HTTP, FTP and HDFS; MapReduce compute nodes exchange node info and node replies with the name node and datanode services on OneFS; the same files are reachable over NFS and SMB. Labels: Compute; Data; Name node; 1X
  • 17. Workload Consolidation and Streaming Analytics / SharePoint
  • 18. Phased Approach to Hadoop Tiered Storage with Isilon. Phase 0, Archival Cluster: Hadoop cluster with DAS for interactive and batch queries; queryable “active archive” in Isilon / ECS configured as a separate Hadoop cluster; archival policy implemented using manually executed scripts. Phase 1, Tiering with Location-Aware Queries: hot data in the Hadoop cluster with DAS; cold data in Isilon configured as an HDFS target; Hive, MapReduce and Spark jobs can run across the two clusters; URIs indicate whether data is in the DAS cluster or the Isilon cluster; tiering policy implemented using manually executed scripts. Phase 2, Tiering with Location-Transparent Queries: same as Phase 1, with one additional capability: data location is handled transparently for Hive, MapReduce and Spark jobs, so URIs don't need to indicate whether data is in the DAS cluster or the Isilon cluster. Phase 3, Automated Tiering: same as Phase 2, with one additional capability: tiering policy implemented using automated data movement mechanisms.
  • 19. What Is the Internet of Things? It is an ecosystem where sensors, devices and equipment are connected to a network and can transmit and receive data for tracking, analysis and action. Operational Technology: industrial automation; fleet telematics; material handling. Information Technology: assets; inventory; people. IoT is not new, and not new to Dell: it is the integration and extension of OT and IT technologies that have been around for decades.
  • 20. It's a Great Big IoT World Out There. Smart Connected Business: from gateways to informed decisions. Layers shown: Things (tens of billions of connected things); Sensors; Connector; Transport (private and public networks); high-performance computer infrastructure; in-memory database layer (SAP HANA); libraries; application layer. Industries: Manufacturing; Energy and Natural Resources; Transportation; Building & Industrial Automation
  • 21. Multiple Partners and Blueprints for OT / IT: Kepware KEPServerEX®; Dell Edge Gateway 5000; Software AG Apama®; SAP HANA®; Dell EMC Data Center. Data: Structured Data; Real-Time Data; Unstructured Data. Functions: Protocol Translation; Stream Analytics; Machine Learning; Analytics; Reporting; Visualizations
  • 22. Our Vision for Unstructured Storage: File (Isilon); Object (ECS); Stream (Project Nautilus). Common Hardware; Common Experience; Software-Defined; In the Cloud
  • 23. Project “Nautilus”: Streaming Storage + Analytics Engine. Turbocharge Isilon and ECS for streaming. Labels: Batch; Storage tier; Streaming IoT data
  • 24. Today’s IoT Analytics “Accidental Architecture”. Producers: sensors; mobile devices; app logs. Paths: Batch; Real-Time; MirrorMaker to DR site. Surface / Act: interactive exploration by data scientists; real-time intelligence at the NOC
  • 25. Project Nautilus: A Unified Data Pipeline. Strongly consistent storage; exactly-once processing; unified analytics. Producers: sensors; mobile devices; app logs. Pravega Streams: ingest buffer; pub/sub; search; persistent data structures. Unified Storage: Isilon / ECS. Unified Analytics: real-time, batch, interactive. Surface / Act: real-time intelligence at the NOC; interactive exploration by data scientists
  • 26. Project Nautilus: A Unified Data Pipeline. Strongly consistent storage; exactly-once processing; unified analytics. Producers: sensors; mobile devices; app logs. Pravega Streams: ingest; pub/sub; search. Unified Storage: Isilon / ECS, with S3, HDFS, NFS and SMB access. Unified Analytics: real-time, batch, interactive. Surface / Act: real-time intelligence at the NOC; interactive exploration by data scientists
  • 27. pravega.io
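
Phase 1 on slide 18 relies on URIs that say which cluster holds each dataset. The Python sketch below shows one way a job could build such location-aware URIs. It is an illustration under stated assumptions, not the deck's implementation: the two hostnames, the `/warehouse` path layout, the 90-day cold threshold, and the function names are all hypothetical.

```python
from datetime import date, timedelta

# Hypothetical NameNode endpoints; neither address comes from the slides.
DAS_CLUSTER = "hdfs://das-cluster.example.com:8020"
ISILON_CLUSTER = "hdfs://isilon.example.com:8020"

# Assumed tiering policy: a partition older than 90 days counts as cold.
COLD_AFTER = timedelta(days=90)


def partition_uri(table: str, partition_day: date, today: date) -> str:
    """Return the full URI of a daily partition: hot on DAS, cold on Isilon."""
    is_cold = (today - partition_day) > COLD_AFTER
    base = ISILON_CLUSTER if is_cold else DAS_CLUSTER
    return f"{base}/warehouse/{table}/dt={partition_day.isoformat()}"


def uris_for_range(table: str, start: date, end: date, today: date) -> list[str]:
    """List the URIs of every daily partition in [start, end].

    The result may mix both clusters, which is what a Phase 1 job that
    spans hot and cold data would pass as its input paths.
    """
    days = (end - start).days + 1
    return [partition_uri(table, start + timedelta(days=i), today) for i in range(days)]
```

A job reading a date range that crosses the threshold receives input paths on both clusters. In Phase 2 this mapping would move out of job code, since URIs no longer need to name the cluster.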