At DAMA Day NYC, WhereScape's CTO Neil Barton spoke about automating data infrastructure as a necessary step toward enabling the citizen data scientist and augmented analytics.
Neil also discussed how AI/ML can be used to recommend data ingestion pipelines and models in either supervised or unsupervised paradigms.
Augmented Analytics and Automation in the Age of the Data Scientist
1. Augmented Analytics and Automation in the Age of the Data Scientist
Neil Barton, CTO, WhereScape
June 2019
2. Data Infrastructure Framework Approach
Conventional approach: involves slow and manual processes requiring a multitude of tools, time and resources
WhereScape: automates these tasks to deliver data projects faster and more efficiently with a single tool
3. Current Landscape
Data Sources: IoT / Sensors, Social / Apps, Database, Files, …
Ingestion: Batch, CDC, Stream
Data Stores (cloud & on-premise): RDBMS, Hadoop, Object Stores, NoSQL
Methodology: Dimensional, Data Lake, Data Store, Data Vault, 3NF
Ecosystem Integration: Data Science & Exploration, Data Catalog, BI Reporting, Data Virtualization, Data Governance
WhereScape Metadata: Design, Develop, Deploy, Document, Operate
4. The Challenge for IT
• Do more, without more
• Learning curve and skills shortage
• Evolving technologies, changing landscape
• Added data landscape complexity with an established data infrastructure in place
6. Cloud
Ease of adoption
Elastic compute
Lower (zero) management
Pay only for what you use
YOU still need to build it!!
Source: Snowflake
7. Case Study
Scenario
• From SQL Server to Snowflake, using WhereScape® automation and Data Vault 2.0
Results
• Created first data vault design in WhereScape within 3 days
• First production data vault within Snowflake in 3 months, fully documented
8. Streaming: A New Breed of Data and Analytics
Sensor Data, Social Media, Machine Data
12. Augmented Analytics
Source: Eckerson Group
Timeline: 1990s, 2000s, 2010, 2015, 2020
Purpose-Built Tools: Production Reports, Ad Hoc Reports, OLAP
Business Intelligence Suites: All-In-One Packages
Visual Discovery Tools: Self-service tools
Analytic Platforms: Open, extensible, embeddable analytic environments
Augmented Intelligence: AI-enabled analytics and analytics-enabled AI
1st Generation: BI (IT Generated)
2nd Generation: Self-Service BI (Business Generated)
3rd Generation: AI (Statistics Generated)
13. Despite some dystopian predictions, machines, including third-generation BI tools, will not replace humans; they will augment them. Like any automation technology, AI will liberate people from manual tasks and the drudgery of routine, repetitive work. AI will free people to focus on more value-added activities, making them much more productive and effective at what they do.
– Wayne W. Eckerson, Eckerson Group
14. AI-enabled Automation
Data Sources: IoT / Sensors, Social / Apps, Database, Files, …
Ingestion: Batch, CDC, Stream
Data Stores (cloud & on-premise): RDBMS, Hadoop, Object Stores, NoSQL
Methodology: Dimensional, Data Lake, Data Store, Data Vault, 3NF
Ecosystem Integration: Data Science & Exploration, Data Catalog, BI Reporting, Data Virtualization, Data Governance
WhereScape Metadata: Design, Develop, Deploy, Document, Operate
WhereScape Active Metadata: Design, Develop, Deploy, Document, Operate
Human Driven / AI-enabled
Modernization driven by:
TIME TO VALUE
Evolving business needs (data science, IoT, real-time analytics)
Automation-first approach to building modern data infrastructure.
Cloud solves half the battle: the sys admin / infrastructure side of the problem.
It DOES NOT solve the problem of building a well-structured and manageable data warehouse environment.