2. Watson Data Platform Architecture - Overview
Layers: Common Processes; Common Data
Personas: Business Analyst (BI); Developer (API); Data Scientist (DS Tools); Data Engineer (DE)
Data & Analytics Processing
Protected Data Access (Governance)
Data Sources: Streams, Public, External, Apps, Cloud, On Prem
Data Flows, Models, Machine Learning
Security, Governance, Auditing, etc.
Productive use experiences geared to specific personas
Broad set of connectivity
3. We are unleashing the power of data with Watson Data Platform
Data Engineering, Data Science, Business Analysis, App Development
Data Sources
• On-premises / cloud
• Structured / unstructured [and content repositories]
• In-motion / at-rest
• Internal / external
Hadoop
NoSQL / SQL
Object store
Discovery / Exploration
Machine learning
Model development
Reports / Dashboards
Applications
APIs
Integration
Matching / Quality
Streaming
Persist
Analyze
Ingest Deploy
Iterate
Govern
Data Assessment
Metadata / Policies
Find Share Collaborate
Data Fabric
common data, pipelines and projects
1 Composable data & analytics cloud services
2 Tailored user experiences for data professionals
3 Foundational elements that provide a common catalog, projects, and community capabilities across the platform
4. We have the breadth of capabilities and offerings required
Fit-for-purpose User Experiences
• Data Connect for the Data Engineer
• Data Science Experience for the Data Scientist
• Watson Analytics for the Business Analyst
• Bluemix for the Developer
Ingest
• Access and prepare data with Data Connect
• Migrate data with Bluemix Lift
• Capture data-in-motion with Streaming Analytics
Analyze
• Rapid, in-memory processing with Apache Spark
• Continuous intelligence with Watson Machine Learning
• Real-time analysis on data-in-motion with Streaming Analytics
Persist
• Easily store JSON data with Cloudant
• Optimize for analytic workloads and warehouse data with dashDB for Analytics
• Optimize for online transactional processing workloads with dashDB for Transactions
• Answer questions about complex networks of inter-related data with Graph
• Choose from the best open-source databases with Compose
5. Watson Data Platform Fabric
We are unifying Watson Data Platform through a new “Fabric”
Console, Catalog, Community, Projects, Connectors, Tools, Orchestration, …
• A console to manage activities and monitor usage
• A catalog to store and unify metadata across multiple sources
• A single orchestration pipeline that brings together data from distinct sources and flows that data to multiple runtime engines
• Common tools including shapers and data visualization capabilities
• Enable team collaboration through projects and a community
• Enable simple access to disparate data sources through connectors
6. Tailored Experiences for Users Collaborating Together
• Data Engineer (IBM Bluemix Data Connect): Architects how data is organized & ensures operability
• Data Scientist (Data Science Experience): Gets deep into the data to draw hidden insights for the business
• Business Analyst (Watson Analytics): Works with data to apply insights to the business strategy
• App Developer (Bluemix): Plugs into data and models & writes code to build apps
Workflow stages: INPUT, ANALYSIS, OUTPUT
Workflow steps shown in the diagram:
• Ingest data
• Transform: clean
• Create and build model
• Evaluate
• Deliver and deploy model
• Communicate results
• Understand problem and domain
• Explore and understand data
• Transform: shape
7. Data Science Experience
• Learn: Built-in learning to get started or go the distance with advanced tutorials
• Create: The best of open source and IBM value-add to create state-of-the-art data products
• Collaborate: Community and social features that provide meaningful collaboration
URL: http://datascience.ibm.com
8. Watson Machine Learning
• ML models are first-class entities in the WDP asset Catalog
• Model Builder assistant simplifies data preparation and offers an “Automatic Path” to help with model selection
9. HOW TO CAPTURE VALUE IN THE NEXT FIVE YEARS
1 Shift to a platform mindset
10. HOW TO CAPTURE VALUE IN THE NEXT FIVE YEARS
1 Shift to a platform mindset
2 Meaning Making Owner
11. HOW TO CAPTURE VALUE IN THE NEXT FIVE YEARS
1 Shift to a platform mindset
2 Meaning Making Owner
3 Embrace the Vision of change
Editor's notes
WDP is made up of layers:
• Integrated experiences designed for the core personas
• A common processing layer of data flows, models, machine learning and analytics, which is how we get to the insights
• A foundation of common data which enables self-service access
The goal of the platform is to bring these pieces together. By itself, DSx is a great tool for data science. Watson Analytics is a great tool for BI. But they don’t enable transformation until they work together across broader use cases.
The “Experiences” are product offerings that consist of a set of services and an interface tailored for each role. In addition to the tailored interface, collaboration ties the team and the organization together by allowing them to share projects, code and ideas. Imagine a Data Engineer builds out a new data source and shares that asset with the Data Scientist and the Business Analyst. The Business Analyst immediately builds the reports and dashboards they need. The Data Scientist experiments with the data and ultimately builds a model that passes all the tests and is worthy of promotion to new applications. They can immediately share that model with the Application Developer, who deploys the new application using the model. Along this journey the team members keep each other updated on their status, ask questions, and share ideas or requirements. This is where Data and Analytics Development becomes a team sport. It no longer needs to be done in silos. Additionally, because these assets can now be published, other departments can reuse them, making the entire organization more agile.
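The sharing journey described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Asset` and `Catalog` classes, the `publish`, `find` and `lineage` methods, and the asset names are all hypothetical and are not part of any Watson Data Platform API. The sketch shows the idea that each role publishes an asset to a shared catalog and that later assets record which shared assets they were built from.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Asset:
    """A shareable item such as a data source or a model (hypothetical type)."""
    name: str
    kind: str          # e.g. "data_source", "report", "model", "application"
    owner_role: str    # persona that published the asset
    derived_from: list[str] = field(default_factory=list)


class Catalog:
    """Toy in-memory stand-in for a shared asset catalog; not a WDP API."""

    def __init__(self) -> None:
        self._assets: dict[str, Asset] = {}

    def publish(self, asset: Asset) -> None:
        # Refuse to publish an asset whose inputs are not already shared.
        missing = [d for d in asset.derived_from if d not in self._assets]
        if missing:
            raise KeyError(f"unpublished dependencies: {missing}")
        self._assets[asset.name] = asset

    def find(self, kind: str) -> list[Asset]:
        # Return every published asset of the requested kind.
        return [a for a in self._assets.values() if a.kind == kind]

    def lineage(self, name: str) -> list[str]:
        # Walk back through derived_from links to the original sources.
        chain = [name]
        for parent in self._assets[name].derived_from:
            chain.extend(self.lineage(parent))
        return chain


# Hypothetical assets, one per persona in the scenario above.
catalog = Catalog()
catalog.publish(Asset("sales_feed", "data_source", "Data Engineer"))
catalog.publish(Asset("sales_dashboard", "report", "Business Analyst", ["sales_feed"]))
catalog.publish(Asset("churn_model", "model", "Data Scientist", ["sales_feed"]))
catalog.publish(Asset("retention_app", "application", "App Developer", ["churn_model"]))

print(catalog.lineage("retention_app"))
# prints ['retention_app', 'churn_model', 'sales_feed']
```

Because every asset lives in one catalog, another department could call `find("model")` to discover the published model and reuse it, which is the kind of reuse the notes describe.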
Refining data into valuable business results involves a number of roles, each with a specific purpose.
Data Engineer: Drives data integration, connections, and quality
Data Scientist: Finds new trends and converts them into models
Business Analyst: Performs general analysis and prepares visualizations
App (Application) Developer: Uses many data services and integrates models
The Business Analyst may get assistance from the emerging Data Scientist role, who can do a more sophisticated analysis, find a root cause to a problem, and develop a solution based on an insight that he or she discovers. The Data Scientist role covers a broad spectrum of individuals, ranging from the traditional user of SPSS, SAS, etc. to the “new” Data Scientist who manipulates large amounts of data using open-source tools and programming languages (versus a GUI).
The Data Engineer is another critical role. They focus on enabling data integration, connections (plumbing) and data quality. They do the underlying enablement that the Data Scientist and Business Analyst depend on.
The App Developer also needs data services to store and manage the applications being built. For example, a Data Scientist may develop a model or algorithm that then gets instantiated in a reactive application.
They all have similar requirements:
They want self-service; they often take the do-it-yourself approach which makes it challenging to collaborate and move the result to production. It also may result in them not having access to all data.
They often need and want access to many different tools and capabilities, many of which are open-source based.
They want and need to collaborate with each other. Getting all of these groups working together more easily can speed time to insight and results.