MongoDB and IBM: How Customer and Developer
Challenges in a heterogeneous data world have driven a more
holistic data management strategy and approach to consumable
solutions, team execution, and AI.
Michael Connor
Program Director, Offering Management, Open Source Databases
IBM Analytics
November 8, 2018
74% of respondents say their data landscape is so complex that it
limits agility
85% struggle with data from a variety of locations, and 72% say
that their data landscape is complex because of the variety and
number of data sources
By 2019, 75% of analytics solutions will incorporate 10 or more exogenous
data sources from 2nd-party partners or 3rd-party providers
(Gartner)
Data Management is complex…
It’s not the team
with the best
players that wins.
It’s the players
with the best team
that wins!
“We must have had 99
percent of the game. It
was the other three
percent that cost us
the match.”
Ruud Gullit
168 databases on average in the enterprise
• IBM’s commitment to:
• Open Source
• Freedom of choice
• Fusing Open Source with a modern, high performance data architecture
• Cloud flexibility – for data federation, advanced analytics and AI
• IBM Cloud Private for Data
…change drives response
More change…
43% ‘striving to develop high quality,
high performance and secure code.’
25% view benefit of AI integration
74% of developers are primarily using AI or ML to improve existing apps rather
than to create new native AI apps
…and response…Ladder to AI
Multi-Cloud
COLLECT
ORGANIZE
ANALYZE
AUTOMATE
Data of every type, regardless of where it lives
MODERNIZE
TRUST
AI
The Hybrid Data
Management Solution
Set expands access
by leveraging the
Common SQL Engine
and Virtualization,
improving visibility of
disparate data sources
across the enterprise
Digital transformation journey with hybrid data management
Hybrid Data
Management
COLLECT
Governance and
Integration
ORGANIZE
Data Science and
Business Analytics
ANALYZE
Write Once, Run Anywhere, with a Common SQL Engine
Hybrid Data Management Solutions - Unified application and user experience
Anchored by a Common SQL Engine enabling true, highly scalable hybrid data warehousing solutions with portable analytics
– Application compatibility
Write once, run anywhere
– Operational compatibility
Reuse operational and housekeeping procedures
– Licensing
Single entitlement for flexible consumption
enabling business agility and cost-optimization
– Integration
Data virtualization capabilities for query
federation and data movement
– Standardized analytics
Common programming model for in-DB analytics
– Ecosystem
One ISV product certification for all platforms
Managed public
Cloud DBaaS
Db2 on Cloud
Db2 Warehouse
on Cloud
Compose
Software
defined warehouse
on-premises
or in cloud
Db2 Warehouse
Dedicated analytics
appliance
Integrated Analytics
System
Custom deployable
database
Db2
Open source
MongoDB
PostgreSQL
Big SQL
Hadoop w/Hortonworks
So what is Data
Virtualization?
The ability to view, access,
manipulate and analyze data
without the need to know or
understand its physical
format or location.
Bringing Data
Virtualization to bear
on real problems
Where it applies …
Optimizing the analytics over different lines of
business.
Unifying data from multiple independent sources
without copying the data
Staying in compliance with privacy and security
legislation.
Combining IoT and enterprise data.
Your applications can provide transparent access to other data sources via built-in data virtualization
[Diagram: your applications connect through IBM Hybrid Data Management to the data sources]
Data virtualization enables IT provisioning for the business
Select details for MongoDB:
• JSON support for NoSQL data stores
  – Federate to a MongoDB collection
  – Ability to parse and query the collection
  – Initial phase supports local processing
  – Next phase supports pushdown
- DDL to create Federation objects
create server mongotest type jdbc version 2.54 wrapper JAVA
  options(host '9.30.252.5', port '28017', dbname 'test');
create nickname students
(
  name           char(32) options(jpath '$.name'),
  exam_score     double   options(jpath '$.scores[0].score'),
  quiz_score     double   options(jpath '$.scores[1].score'),
  homework_score double   options(jpath '$.scores[2].score')
)
for server mongotest options(collection 'students');
- SQL for federated query from MongoDB
select * from students where exam_score > 60;
NAME              EXAM_SCORE              QUIZ_SCORE              HOMEWORK_SCORE
----------------  ----------------------  ----------------------  ----------------------
Salena Olmos      +8.03782650915718E+001  +4.24878066695681E+001  +9.65298617163333E+001
Sanda Ryba        +7.70050995365469E+001  +8.78044963253892E+001  +2.52736853243295E+001
Aurelia Menendez  +6.50604507103096E+001  +5.27979069190387E+001  +7.17613343916554E+001
{
"_id":1,
"name":"Aurelia Menendez",
"scores":[
{
"score":65.06045071030959,
"type":"exam"
},
{
"score":52.79790691903873,
"type":"quiz"
},
{
"score":71.76133439165544,
"type":"homework"
}
]
}
Example JSON document (above) from database test, which holds the collections students, companies, …
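The nickname mapping above can be illustrated outside Db2. The following is a minimal Python sketch, not IBM's implementation: it assumes the jpath options behave like simple '$.field[index]' paths, and it mimics the "local processing" phase, where documents are fetched first and the predicate is applied afterwards. The helper names (resolve_jpath, to_row, select_students) are hypothetical; the sample document is the one shown on the slide.

```python
import re

# Column-to-path mapping copied from the nickname definition on the slide.
NICKNAME_COLUMNS = {
    "NAME": "$.name",
    "EXAM_SCORE": "$.scores[0].score",
    "QUIZ_SCORE": "$.scores[1].score",
    "HOMEWORK_SCORE": "$.scores[2].score",
}

# Matches either a ".field" step or a "[index]" step in a simple path.
_STEP = re.compile(r"\.([A-Za-z_]\w*)|\[(\d+)\]")


def resolve_jpath(document, jpath):
    """Follow a simple '$.a[0].b' path through nested dicts and lists."""
    if not jpath.startswith("$"):
        raise ValueError(f"unsupported path: {jpath}")
    value = document
    for key, index in _STEP.findall(jpath[1:]):
        value = value[key] if key else value[int(index)]
    return value


def to_row(document):
    """Flatten one JSON document into a relational row."""
    return {col: resolve_jpath(document, path)
            for col, path in NICKNAME_COLUMNS.items()}


def select_students(documents, min_exam_score):
    """Local equivalent of: select * from students where exam_score > ?"""
    rows = (to_row(doc) for doc in documents)
    return [row for row in rows if row["EXAM_SCORE"] > min_exam_score]


# The example document from the slide.
SAMPLE_DOCUMENT = {
    "_id": 1,
    "name": "Aurelia Menendez",
    "scores": [
        {"score": 65.06045071030959, "type": "exam"},
        {"score": 52.79790691903873, "type": "quiz"},
        {"score": 71.76133439165544, "type": "homework"},
    ],
}

if __name__ == "__main__":
    for row in select_students([SAMPLE_DOCUMENT], 60):
        print(row)
```

Because every document is flattened before the filter runs, this approach moves the whole collection to the federation layer, which is the cost that the later pushdown phase is meant to avoid.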
Federated access to MongoDB
[Diagram: SQL enters the federation layer, which reaches the data source through a server object and its nicknames; data flows back. Labeled capabilities: Predicate Pushdown, JSON Data Parsing]
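Predicate pushdown means the filter travels to MongoDB instead of being applied after the documents arrive. The slides do not describe how the federation server builds its MongoDB query, so the following Python sketch is only an assumption of what such a translation could look like: it turns a single SQL comparison on a nickname column into a MongoDB filter document using dot notation and standard comparison operators. The function names are hypothetical.

```python
import re

# Column-to-path mapping taken from the nickname definition on the slide.
COLUMN_PATHS = {
    "NAME": "$.name",
    "EXAM_SCORE": "$.scores[0].score",
    "QUIZ_SCORE": "$.scores[1].score",
    "HOMEWORK_SCORE": "$.scores[2].score",
}

# SQL comparison operators and their MongoDB query operator counterparts.
SQL_TO_MONGO = {
    ">": "$gt",
    ">=": "$gte",
    "<": "$lt",
    "<=": "$lte",
    "=": "$eq",
    "<>": "$ne",
}


def jpath_to_field(jpath):
    """Convert '$.scores[0].score' to MongoDB dot notation 'scores.0.score'."""
    if not jpath.startswith("$."):
        raise ValueError(f"unsupported path: {jpath}")
    return re.sub(r"\[(\d+)\]", r".\1", jpath[2:])


def pushdown_filter(column, operator, value):
    """Build a MongoDB filter document for one '<column> <op> <value>' predicate."""
    field = jpath_to_field(COLUMN_PATHS[column.upper()])
    return {field: {SQL_TO_MONGO[operator]: value}}


if __name__ == "__main__":
    # where exam_score > 60
    print(pushdown_filter("exam_score", ">", 60))
```

The resulting dictionary could be passed as the filter argument to a MongoDB find call, so that only matching documents leave the data source.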
Moving forward, an extended approach is required
Dynamic multipath routing avoids
bottlenecks and slow systems
Each node instead simultaneously
sends the relevant portions of the
query to both its connected data
source(s) and its peers in the
network.
It combines and processes the results
as they are received.
This implicitly results in balanced
processing of the query through
the constellation.
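The fan-out behavior described above can be sketched with Python's asyncio. This is an illustration under stated assumptions, not the product's routing algorithm: the target names, delays, and rows are invented, and query_source is a hypothetical stand-in for a real data source or peer node. The point it shows is that results are merged in arrival order, so a slow system does not hold up the faster ones.

```python
import asyncio


async def query_source(name, delay, rows):
    """Stand-in for a data source or peer node; delay simulates its latency."""
    await asyncio.sleep(delay)
    return name, rows


async def fan_out(targets):
    """Send the query to every target at once and merge results as they arrive."""
    tasks = [asyncio.create_task(query_source(*target)) for target in targets]
    arrival_order, merged = [], []
    for finished in asyncio.as_completed(tasks):
        name, rows = await finished
        arrival_order.append(name)
        merged.extend(rows)
    return arrival_order, merged


def run_demo():
    # Hypothetical targets: (name, simulated latency in seconds, rows returned).
    targets = [
        ("slow-source", 0.05, [1, 2]),
        ("fast-peer", 0.0, [3]),
        ("medium-source", 0.02, [4, 5]),
    ]
    return asyncio.run(fan_out(targets))


if __name__ == "__main__":
    order, rows = run_demo()
    print("arrival order:", order)
    print("merged rows:", rows)
```

A sequential loop over the same targets would take the sum of the latencies; the concurrent version is bounded by the slowest target only.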
Simplifying data
consumption by
automating
provisioning, access
control and governance
including IBM Cloud
Private and IBM Cloud
Private for Data
Collaborative
Teaming across
various Roles
Data Engineer
Architects data pipelines & ensures operability.
Data Steward
Governs data & ensures regulatory compliance.
Data Scientist
Gets deep into the data to draw insights for the business.
Business Analyst
Works with data to apply insights to business strategy.
App Developer
Plugs into analysis and code to build apps.
Collect Data
– Fast provisioning of
Databases
– Data Warehousing
– Fast data ingest
– Data Virtualizing for
internal and remote
sources
– Structured and
Unstructured data
Organize Data
– Data integration & shaping
– Data curation
– Governance and privacy
policies
– Data asset lifecycle
management
Analyze Data
– Self-service analytics
tooling and productivity
– Data visualization &
exploration
– Machine learning
– Model management and
deployment
– Dashboards and business
reporting
What is IBM Cloud Private for Data?
Cloud-native Micro Services
Instant Provisioning of Infrastructure & Experiences
• Data Science & ML
• Data Preparation
• Dashboards
New
Enterprise Data Catalog
IBM Cloud Private
• Data integration
• Data profiling
• Policy management
• Databases & warehousing
• Fast data event store
• Data virtualization
A Self-Service, cloud native experience
Enterprise Dbs
Community Dbs
• Redis Community
• MariaDB Community
• MongoDB Community
• PostgreSQL Community w/Elite Support
• MongoDB Enterprise Advanced
*Coming soon… more to follow
MongoDB Enterprise Tile
Accessible in ICP Catalog
User Requests MongoDB
Enterprise Instance(s)
Kubernetes VM Environment
Provisioned
Deploy MongoDB Instances
and Ops Manager
Secure Access, Manage, and
Grow
App Dev - Data In Motion
Analytics Visualizations
Warehouse Disaster Recovery
Data Preparation / Wrangling
Data Repositories
Technology Consulting
Persistent Container Storage
Storage as a Service
A growing ecosystem
Data Science,
Machine Learning
and greater data
understanding
including IBM Watson
https://github.com/IBM/watson-training-from-on-prem-data
IBM Watson is a vast umbrella of
technologies and solutions, one of
which is Watson Studio, a PAML
solution
Watson Studio blends workflow
capabilities with open source
machine learning libraries and
notebook-based interfaces
It is designed for all the
collaborators who are key to
moving machine learning models
into production
applications
Watson offers easy integrated
access to IBM Cloud pretrained
machine learning models such as
Visual Recognition, Watson
Natural Language Classifier, and
many others.
What is Watson?
Open Source tools – Jupyter and RStudio
Watson Visual Recognition – retrain Watson
Elastic and customizable compute environments
Create ML flows and design Neural Networks visually
Tooling and API to make building apps easy, with the
ability to create and manage custom models with Watson
Studio.
Support for CoreML to leverage models on iOS devices.
Privacy and Security ensured by IBM.
An image recognition service that enables users
to quickly and accurately tag, classify, and train
visual content using machine learning.
Example tags returned for an image: BASIL, LEAF, HERB, PLANT STEM, GREEN
Visual Recognition
Watson Visual Recognition is trained on:
• General: Quickly understand the contents, scenes, and actions within an image.
• Faces: Locate faces within an image and receive age and gender estimates. GA: Face Detection. 4Q Beta: Face Matching.
• Explicit: Determine if an image contains inappropriate content that may be unsuitable for general audiences.
• Custom: Train Watson to understand and classify your own custom content.
• Food: Recognize foods and meals with enhanced accuracy.
• Text: Extract full words from natural scene images (e.g., billboards, street signs).
Visual Inspection: An Insurance company
builds an image recognition solution to
automate visual inspections for damage,
defects, and quality assurance.
Aerial Inspection: A drone can use a custom
image model to survey and quickly identify
burned or flood damaged homes.
Social Media Listening: An Advertising
agency analyzes visual content in social
media posts to understand content,
sentiment, and trends.
Demographics: A Retailer uses face
detection capabilities to gather age and
gender estimates of its shoppers.
Resource Identification: A Mining &
Minerals company uses image recognition to
automatically identify assets and sites in
satellite imagery.
Content Enrichment: A Media company
uses image recognition to automatically
append metadata to visual content, turning
dark data into searchable content.
Object Detection: Identify multiple objects in images to better
understand an image as a whole.
4Q Roadmap: Closed Beta
What is Visual Recognition used for?
* Two critical concepts: Classifiers/classes and Scores
https://pbs.twimg.com/media/CdJavKoUAAAFLBG.jpg
Watson Visual Recognition – What is produced?
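The two concepts named on this slide, classifiers/classes and scores, can be made concrete with a small Python sketch. The response layout below (images, then classifiers, then classes with a score each) is an assumption about the general shape of a classify-style result, not a documented contract, and the scores are invented for illustration; only the class labels come from the earlier slide. The function name top_classes is hypothetical.

```python
def top_classes(response, threshold=0.5):
    """Return (classifier, class, score) tuples at or above threshold, best first."""
    hits = []
    for image in response.get("images", []):
        for classifier in image.get("classifiers", []):
            for entry in classifier.get("classes", []):
                if entry["score"] >= threshold:
                    hits.append((classifier["name"], entry["class"], entry["score"]))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Hypothetical response: labels from the slide, scores invented for illustration.
SAMPLE_RESPONSE = {
    "images": [
        {
            "classifiers": [
                {
                    "name": "default",
                    "classes": [
                        {"class": "leaf", "score": 0.9},
                        {"class": "basil", "score": 0.95},
                        {"class": "plant stem", "score": 0.6},
                        {"class": "green", "score": 0.4},
                    ],
                }
            ]
        }
    ]
}

if __name__ == "__main__":
    for classifier, label, score in top_classes(SAMPLE_RESPONSE, threshold=0.5):
        print(f"{classifier}: {label} ({score:.2f})")
```

A score is a confidence value per class rather than a probability that sums to one across classes, which is why several classes can pass the threshold for the same image.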
Enhancing the data
experience leveraging
IBM Power and IBM Z
platform capabilities
Secure and Scale Your Data through IBM Z® / Hyper Protect Services / DBaaS
PowerAI + IBM POWER9™ + GPUs with MongoDB on Private and Hybrid Cloud
Advantage with IBM Platforms
• 4x faster model training on best GPU
server for AI
• 43% lower solution costs saving up to
$2M per rack
• 2x faster on fewer systems with less
cost
• Industry-leading data confidentiality
through built-in workload isolation,
restricted administrator access, tamper
protection against internal threats
• High availability and reliability
• Supports industry compliance and
certifications – GDPR
• Provides standard APIs to provision,
manage, maintain and monitor multiple
database types
• Integrates with IBM Cloud services
Summary: Why IBM and Hybrid Data
• IBM has always been a major sponsor of Open Source, including in the areas of Hadoop, data science, NoSQL, and Java
• IBM Hybrid Data Management now includes MongoDB Enterprise, with integration points including Db2 family Federation, Governance, and Analytics
• Cloud Private speeds consumption of databases across organizations and roles, and the fit with Watson helps move organizations from analytics to AI
• IBM Platforms can further extend the benefits of HDM
IBM is focused on getting developers to AI faster

Data Management is a Team Sport - IBM

  • 1. MongoDB and IBM How Customer and Developer Challenges in a heterogenous data world have driven a more holistic data management strategy and approach to consumable solutions, team execution, and AI. Michael Connor Program Director, Offering Management, Open Source Databases IBM Analytics November 8, 2018
  • 2. Data Management is complex… 74% of respondents say their data landscape is so complex that it limits agility. 85% struggle with data from a variety of locations, and 72% say that their data landscape is complex given the variety and number of data sources. Gartner states that by 2019, 75% of analytics solutions will incorporate 10 or more exogenous data sources from 2nd-party partners or 3rd-party providers.
  • 3. It’s not the team with the best players that wins. It’s the players with the best team that wins!
  • 4. “We must have had 99 percent of the game. It was the other three percent that cost us the match.” Ruud Gullit
  • 5. 168 databases on average in the enterprise
  • 6. …change drives response. IBM’s commitment to:
      – Open Source
      – Freedom of choice
      – Fusing Open Source with a modern, high-performance data architecture
      – Cloud flexibility for data federation, advanced analytics and AI
      – IBM Cloud Private for Data
  • 8. 43% are ‘striving to develop high quality, high performance and secure code.’ 25% see a benefit in AI integration. 74% of developers are primarily using AI or ML to improve existing apps rather than to create new native AI apps.
  • 9. …and response… Ladder to AI (diagram labels): Multi-Cloud; COLLECT; ORGANIZE; ANALYZE; AUTOMATE; Data of every type, regardless of where it lives; MODERNIZE; TRUST; AI.
  • 10. The Hybrid Data Management Solution Set expands access by leveraging the Common SQL Engine and Virtualization, improving visibility of disparate data sources across the enterprise.
  • 11. Digital transformation journey with hybrid data management. Hybrid Data Management: COLLECT. Governance and Integration: ORGANIZE. Data Science and Business Analytics: ANALYZE.
  • 12. Write Once, Run Anywhere, with a Common SQL Engine. Hybrid Data Management Solutions: unified application and user experience. Anchored by a Common SQL Engine enabling true, highly scalable hybrid data warehousing solutions with portable analytics:
      – Application compatibility: Write once, run anywhere
      – Operational compatibility: Reuse operational and housekeeping procedures
      – Licensing: Single entitlement for flexible consumption enabling business agility and cost-optimization
      – Integration: Data virtualization capabilities for query federation and data movement
      – Standardized analytics: Common programming model for in-DB analytics
      – Ecosystem: One ISV product certification for all platforms
      – Managed public Cloud DBaaS: Db2 on Cloud, Db2 Warehouse on Cloud, Compose
      – Software defined warehouse on-premises or in cloud: Db2 Warehouse
      – Dedicated analytics appliance: Integrated Analytics System
      – Custom deployable database: Db2
      – Open source: MongoDB, PostgreSQL
      – Big SQL (Hadoop w/Hortonworks)
  • 13. So what is Data Virtualization? The ability to view, access, manipulate and analyze data without the need to know or understand its physical format or location.
  • 14. Bringing Data Virtualization to bear on real problems. Where it applies:
      – Optimizing the analytics over different lines of business.
      – Unifying data from multiple independent sources without copying the data.
      – Staying in compliance with privacy and security legislation.
      – Combining IoT and enterprise data.
  • 15. Your applications can provide transparent access to other data sources via built-in data virtualization. Diagram labels: Your applications; IBM Hybrid Data Management; Data Sources. Data virtualization enables IT provisioning for the business. Select details for MongoDB:
      – JSON support for NoSQL data stores:
          – Federate to MongoDB collection
          – Ability to parse and query collection
          – Initial phase supports local processing
          – Next phase supports pushdown
  • 16. Federated access to MongoDB
      DDL to create Federation objects:
        create server mongotest type jdbc version 2.54 wrapper JAVA
          options(host '9.30.252.5', port '28017', dbname 'test');
        create nickname students (
          name char(32) options(jpath '$.name'),
          exam_score double options(jpath '$.scores[0].score'),
          quiz_score double options(jpath '$.scores[1].score'),
          homework_score double options(jpath '$.scores[2].score')
        ) for server mongotest options(collection 'students');
      SQL for federated query from MongoDB:
        select * from students where exam_score > 60;
        NAME              EXAM_SCORE              QUIZ_SCORE              HOMEWORK_SCORE
        ----------------  ----------------------  ----------------------  ----------------------
        Salena Olmos      +8.03782650915718E+001  +4.24878066695681E+001  +9.65298617163333E+001
        Sanda Ryba        +7.70050995365469E+001  +8.78044963253892E+001  +2.52736853243295E+001
        Aurelia Menendez  +6.50604507103096E+001  +5.27979069190387E+001  +7.17613343916554E+001
      Example json document (database: test; collections: students, companies, …):
        {
          "_id": 1,
          "name": "Aurelia Menendez",
          "scores": [
            { "score": 65.06045071030959, "type": "exam" },
            { "score": 52.79790691903873, "type": "quiz" },
            { "score": 71.76133439165544, "type": "homework" }
          ]
        }
      Diagram labels: Federation; Data Source; SQL; Data Predicate Pushdown; JSON Data Parsing; Server; Nickname.
  • 17. Moving forward, an extended approach is required. Dynamic multipath routing avoids bottlenecks and slow systems. Each node instead simultaneously sends the relevant portions of the query to both its connected data source(s) and its peers in the network. It combines and processes the results as they are received. This implicitly results in balanced processing of the query through the constellation.
  • 18. Simplifying data consumption by automating provisioning, access control and governance, including IBM Cloud Private and IBM Cloud Private for Data
  • 19. Collaborative Teaming across various Roles:
      – Data Engineer: Architects data pipelines & ensures operability.
      – Data Steward: Governs data & ensures regulatory compliance.
      – Data Scientist: Gets deep into the data to draw insights for the business.
      – Business Analyst: Works with data to apply insights to business strategy.
      – App Developer: Plugs into analysis and code to build apps.
  • 20. What is IBM Cloud Private for Data?
      Collect Data:
      – Fast provisioning of Databases
      – Data Warehousing
      – Fast data ingest
      – Data Virtualizing for internal and remote sources
      – Structured and Unstructured data
      Organize Data:
      – Data integration & shaping
      – Data curation
      – Governance and privacy policies
      – Data asset lifecycle management
      Analyze Data:
      – Self-service analytics tooling and productivity
      – Data visualization & exploration
      – Machine learning
      – Model management and deployment
      – Dashboards and business reporting
  • 21. A Self-Service, cloud native experience. Cloud-native Micro Services. Instant Provisioning of Infrastructure & Experiences: Data Science & ML; Data Preparation; Dashboards. New Enterprise Data Catalog. IBM Cloud Private: Data integration; Data profiling; Policy management; Databases & warehousing; Fast data event store; Data virtualization. Enterprise Dbs and Community Dbs: Redis Community; MariaDB Community; MongoDB Community; PostgreSQL Community w/Elite Support; MongoDB Enterprise Advanced (*Coming Soon… more to follow). MongoDB Enterprise Tile Accessible in ICP Catalog; User Requests MongoDB Enterprise Instance(s); Kubernetes VM Environment Provisioned; Deploy MongoDB Instances and Ops Manager; Secure Access, Manage, and Grow.
  • 22. A growing ecosystem: App Dev - Data In Motion, Analytics, Visualizations, Warehouse, Disaster Recovery, Data Preparation / Wrangling, Data Repositories, Technology Consulting, Persistent Container Storage, Storage as a Service
  • 23. Data Science, Machine Learning and greater data understanding including IBM Watson
  • 24. What is Watson? IBM Watson is a vast umbrella of technologies and solutions, one of which is Watson Studio, a PAML solution. Watson Studio blends workflow capabilities with open source machine learning libraries and notebook-based interfaces. It is designed for all collaborators who are key to making machine learning models surface into production applications. Watson offers easy integrated access to IBM Cloud pretrained machine learning models such as Visual Recognition, Watson Natural Language Classifier, and many others.
      – Open Source tools: Jupyter and RStudio
      – Watson Visual Recognition: retrain Watson
      – Elastic and customizable compute environments
      – Create ML flows and design Neural Networks visually
      https://github.com/IBM/watson-training-from-on-prem-data
  • 25. Visual Recognition: An image recognition service that enables users to quickly and accurately tag, classify, and train visual content using machine learning. Tooling and API to make building apps easy, with the ability to create and manage custom models with Watson Studio. Support for CoreML to leverage models on iOS devices. Privacy and Security ensured by IBM. Example tags: BASIL, LEAF, HERB, PLANT, STEM, GREEN.
  • 26. What is Visual Recognition used for?
      Watson Visual Recognition is trained on:
      – General: Quickly understand the contents, scenes, and actions within an image.
      – Faces: Locate faces within an image and receive age and gender estimates. GA: Face Detection. 4Q Beta: Face Matching.
      – Explicit: Determine if an image contains inappropriate content that may be unsuitable for general audiences.
      – Custom: Train Watson to understand and classify your own custom content.
      – Food: Recognize foods and meals with enhanced accuracy.
      – Text: Extract full words from natural scene images (e.g. billboards, street signs).
      – Object Detection: Identify multiple objects in images to better understand an image as a whole. 4Q Roadmap: Closed Beta.
      Use cases:
      – Visual Inspection: An Insurance company builds an image recognition solution to automate visual inspections for damage, defects, and quality assurance.
      – Aerial Inspection: A drone can use a custom image model to survey and quickly identify burned or flood damaged homes.
      – Social Media Listening: An Advertising agency analyzes visual content in social media posts to understand content, sentiment, and trends.
      – Demographics: A Retailer uses face detection capabilities to gather age and gender estimates of its shoppers.
      – Resource Identification: A Mining & Minerals company uses image recognition to automatically identify assets and sites in satellite imagery.
      – Content Enrichment: A Media company uses image recognition to automatically append metadata to visual content, turning dark data into searchable content.
  • 27. Watson Visual Recognition – What is produced? Two critical concepts: Classifiers/classes and Scores. Image: https://pbs.twimg.com/media/CdJavKoUAAAFLBG.jpg
  • 28. Enhancing the data experience leveraging IBM Power and IBM Z platform capabilities
  • 29. Advantage with IBM Platforms
      PowerAI + IBM POWER9™ + GPUs with MongoDB on Private and Hybrid Cloud:
      – 4x faster model training on best GPU server for AI
      – 43% lower solution costs saving up to $2M per rack
      – 2x faster on fewer systems with less cost
      Secure and Scale Your Data through IBM Z® / Hyper Protect Services / DBaaS:
      – Industry-leading data confidentiality through built-in workload isolation, restricted administrator access, tamper protection against internal threats
      – High availability and reliability
      – Supports industry compliance and certifications – GDPR
      – Provides standard APIs to provision, manage, maintain and monitor multiple database types
      – Integrates with IBM Cloud services
  • 30. Summary: Why IBM and Hybrid Data
      – IBM has always been a major sponsor of Open Source, including in the areas of Hadoop, data science, NoSQL, and Java
      – IBM Hybrid Data Management now includes MongoDB Enterprise, with integration points including Db2 family Federation, Governance, and Analytics
      – Cloud Private speeds consumption of databases across organizations and roles, and its fit with Watson helps move organizations from analytics to AI
      – IBM Platforms can further extend the benefits of HDM
      IBM is focused on getting developers to AI faster.
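
As a rough illustration of the federated access on slide 16, the sketch below shows how the nickname's jpath options could turn a MongoDB-style JSON document into a relational row, with the predicate applied locally after parsing (the "initial phase" described on slide 15, not pushdown). This is not how the Db2 federation wrapper is implemented: `resolve_jpath`, `to_row`, and `federated_select` are hypothetical helpers, returning None for a missing path is an assumption, and the second student document is invented for the example.

```python
import re

# Column -> JSON path, copied from the nickname definition on slide 16.
NICKNAME_COLUMNS = {
    "name": "$.name",
    "exam_score": "$.scores[0].score",
    "quiz_score": "$.scores[1].score",
    "homework_score": "$.scores[2].score",
}

# Matches either a ".key" step or an "[index]" step of a simple path.
_STEP = re.compile(r"\.([A-Za-z_]\w*)|\[(\d+)\]")


def resolve_jpath(document, jpath):
    """Resolve a simple path such as '$.scores[0].score' (hypothetical helper)."""
    if not jpath.startswith("$"):
        raise ValueError(f"path must start with '$': {jpath!r}")
    value = document
    try:
        for key, index in _STEP.findall(jpath[1:]):
            value = value[key] if key else value[int(index)]
    except (KeyError, IndexError, TypeError):
        return None  # assumption: a missing path behaves like SQL NULL
    return value


def to_row(document):
    """Flatten one JSON document into the nickname's columns."""
    return {col: resolve_jpath(document, path) for col, path in NICKNAME_COLUMNS.items()}


def federated_select(collection, predicate):
    """Parse every document, then filter locally (no pushdown)."""
    return [row for row in map(to_row, collection) if predicate(row)]


students = [
    {  # document shown on slide 16
        "_id": 1,
        "name": "Aurelia Menendez",
        "scores": [
            {"score": 65.06045071030959, "type": "exam"},
            {"score": 52.79790691903873, "type": "quiz"},
            {"score": 71.76133439165544, "type": "homework"},
        ],
    },
    {  # hypothetical document, added only to show the filter rejecting a row
        "_id": 2,
        "name": "Example Student",
        "scores": [
            {"score": 41.5, "type": "exam"},
            {"score": 90.0, "type": "quiz"},
            {"score": 77.0, "type": "homework"},
        ],
    },
]

# Equivalent in spirit to: select * from students where exam_score > 60;
rows = federated_select(
    students,
    lambda row: row["exam_score"] is not None and row["exam_score"] > 60,
)
```

With pushdown (the "next phase" on slide 15), the predicate would instead be sent to MongoDB so that only matching documents cross the network.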

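The multipath idea on slide 17 can be sketched in a few lines, under one possible reading of the slide: a node queries its own data source and forwards the query to its peers at the same time, then merges results in whatever order they arrive. Everything here is an assumption for illustration, including the `Node` class, the shared `visited` set used to stop a query from looping between peers, and the simulated delays.

```python
import asyncio


class Node:
    """A hypothetical node in the constellation with one data source and some peers."""

    def __init__(self, name, rows, delay=0.0):
        self.name = name
        self.rows = rows      # stand-in for the connected data source
        self.delay = delay    # simulated source latency in seconds
        self.peers = []

    async def query_source(self, predicate):
        """Run the query against this node's own data source."""
        await asyncio.sleep(self.delay)
        return [row for row in self.rows if predicate(row)]

    async def query(self, predicate, visited=None):
        """Query the local source and unvisited peers concurrently."""
        visited = set() if visited is None else visited
        visited.add(self.name)
        # Reserve peers before any await so no node is queried twice.
        targets = [peer for peer in self.peers if peer.name not in visited]
        visited.update(peer.name for peer in targets)
        tasks = [asyncio.create_task(self.query_source(predicate))]
        tasks += [asyncio.create_task(peer.query(predicate, visited)) for peer in targets]
        results = []
        # Combine partial results as they are received, fastest first.
        for finished in asyncio.as_completed(tasks):
            results.extend(await finished)
        return results


async def main():
    a = Node("a", [{"id": 1, "score": 80}], delay=0.02)
    b = Node("b", [{"id": 2, "score": 40}, {"id": 3, "score": 90}])
    c = Node("c", [{"id": 4, "score": 70}], delay=0.01)
    a.peers, b.peers, c.peers = [b, c], [a, c], [a, b]
    return await a.query(lambda row: row["score"] > 60)


rows = asyncio.run(main())
```

Because results are merged in arrival order, a slow source delays only its own rows rather than the whole query; the order of `rows` therefore depends on timing.
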
Editor's notes

  1. If you are watching the soccer World Cup that’s happening right now, one thing becomes clear: You can have the best individual player, but to win consistently, or to win the championship, you need to have the best team.