Data Insights and Analytics: Simplifying Data Lake and Modern BI Architecture
1. The First Step in Information Management
MONTHLY SERIES
Simplifying Data Lake and Modern BI Architecture
February 1, 2018
Sponsored by: looker.com
18. Let’s Keep Things Simple Today …
What “Simplifying” Really Means
Processes for Modern Business Intelligence (BI) Architecture
Deployment Process Requirements
Bridging the Gap: Traditional BI to Contemporary BI and Data Lakes
Key Takeaways
Q&A
pg 2First San Francisco Partners www.firstsanfranciscopartners.com© 2018
20. “Architecture” Defined
The art and discipline of designing buildings and structures, from the
macro-level of urban planning to the micro-level of creating furniture
and machine parts.
The design of any complex object or system. It may refer to the implied architecture of
abstract things such as music or mathematics, the apparent architecture of natural
things such as geological formations or living things, or explicitly planned architecture
of human-made things such as buildings, machines, organizations, processes, software
and databases.
The organized arrangement of component elements to optimize the function,
performance, feasibility, cost and/or aesthetics of an overall structure.
From The DAMA Guide to the Data Management Body of Knowledge
21. Simplifying Doesn’t Mean It’s Simple
Simply put, it’s about the architecture.
“Simplified” architecture:
Easier to use
More flexible
More consensus on its structure
Easier to support “realistic” self-service capabilities
Fewer modifications are required
Easier to manage and govern
Isn’t easily disrupted (broken)
Simplicity can come from lessons learned
22. Processes Needed to Derive Modern BI Architecture
The architect should strive continually to simplify;
the ensemble of the rooms should then be carefully considered
that comfort and utility may go hand in hand with beauty.
– Architect Frank Lloyd Wright
23. Two Lenses to Derive an Effective Architecture
Form: Developing the architecture so all stakeholders can actually understand and develop it
Progression: Develop architectures that are best fit for purpose and effective, no matter how simple or complex
Simple can be harder than complex.
You have to work hard to get your thinking clean to make it simple.
But it's worth it in the end, because once you get there you can move mountains.
– Steve Jobs
24. Understand Architecture Characteristics
Business needs
Organizational culture
Data characteristics (latency, volumes and quality)
Understand the data landscape
Understand what you have now
Simply put, use your own data.
Granularity
Fact Volatility
Dimensional Complexity
Dimensional Volatility
“Historicity”
Latency
Cross Functionality/Distribution
Size
Source Complexity
Frequency
Response Time
Follow-up Time
Data Quality
Availability
Persistency
Access type
Algorithm Complexity
Content Variety
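One way to act on "use your own data" is to record these characteristics for each data source and track which ones have not been assessed yet. The following is a minimal sketch, not part of the original deck: the characteristic names come from the slide, while the 1-to-5 rating scale, the `missing_characteristics` helper and the `sales_orders` profile are hypothetical.

```python
# Characteristic names are copied from the slide; everything else is illustrative.
CHARACTERISTICS = [
    "Granularity", "Fact Volatility", "Dimensional Complexity",
    "Dimensional Volatility", "Historicity", "Latency",
    "Cross Functionality/Distribution", "Size", "Source Complexity",
    "Frequency", "Response Time", "Follow-up Time", "Data Quality",
    "Availability", "Persistency", "Access type",
    "Algorithm Complexity", "Content Variety",
]


def missing_characteristics(profile: dict) -> list:
    """Return the characteristics not yet assessed, in slide order."""
    return [name for name in CHARACTERISTICS if name not in profile]


# Hypothetical, partially completed profile (assumed scale: 1 = low, 5 = high).
sales_orders = {"Granularity": 4, "Latency": 2, "Data Quality": 3}

remaining = missing_characteristics(sales_orders)
print(len(remaining), "characteristics still to assess")
```

A profile like this is only a starting point; the ratings themselves should come from your own data landscape rather than from a reference architecture.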
25. Use the Required Characteristics to Stay Simple
Input funnel/what comes out is what you need
Define the decision-making process
Possible scenarios:
− What you have to use (existing department database and tools, even Excel)
− What you could use (gap analysis)
Set policies on what’s allowed and not allowed
Manage what goes in the Data Lake
Simply put, be aligned with your business.
[Diagram: input funnel. Labels: Landscape, Current State, Business Needs, Your architecture]
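The advice to set policies and manage what goes in the Data Lake can be sketched as a simple admission check. This is an illustrative sketch under stated assumptions, not the presenters' method: the `Dataset` fields, the `DISALLOWED` classifications and the three rules are all hypothetical and would be replaced by your organization's actual policy.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    owner: str           # accountable person or team; empty if unknown
    business_need: str   # the decision or question this data supports
    classification: str  # e.g. "public", "internal", "restricted"


# Hypothetical policy: classifications that may not enter the lake.
DISALLOWED = {"restricted"}


def admission_issues(ds: Dataset) -> list:
    """Return the reasons a dataset fails the policy (empty list means admit)."""
    issues = []
    if not ds.owner:
        issues.append("no owner")
    if not ds.business_need:
        issues.append("no stated business need")
    if ds.classification in DISALLOWED:
        issues.append("classification not allowed: " + ds.classification)
    return issues


ok = Dataset("web_clicks", "marketing", "campaign attribution", "internal")
bad = Dataset("hr_export", "", "", "restricted")

print(admission_issues(ok))
print(admission_issues(bad))
```

Returning a list of reasons, rather than a yes/no answer, keeps the policy explainable to the business stakeholders it is meant to align with.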
26. Reality of the Data Lake
The Data Lake has changed due to storage availability, data management tools
and the ease with which data can be managed.
Today’s Data Lake comprises:
‒ Landing Zone
‒ Standardization Zone
‒ Analytics Sandbox
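The three zones can be made explicit in how storage paths are laid out. The sketch below is an assumption-laden illustration, not taken from the deck: the zone names come from the slide, but the path convention, the `zone_path` and `next_zone` helpers, and the idea that data is promoted through the zones in the listed order are all hypothetical.

```python
from enum import Enum


class Zone(Enum):
    LANDING = "landing"
    STANDARDIZATION = "standardization"
    ANALYTICS_SANDBOX = "analytics_sandbox"


# Assumption: data is promoted through the zones in this order.
ORDER = [Zone.LANDING, Zone.STANDARDIZATION, Zone.ANALYTICS_SANDBOX]


def zone_path(root: str, zone: Zone, source: str, dataset: str) -> str:
    """Build a storage path of the form <root>/<zone>/<source>/<dataset>."""
    return "/".join([root.rstrip("/"), zone.value, source, dataset])


def next_zone(zone: Zone):
    """Return the zone that follows `zone`, or None for the last zone."""
    i = ORDER.index(zone)
    return ORDER[i + 1] if i + 1 < len(ORDER) else None


print(zone_path("/lake", Zone.LANDING, "crm", "accounts"))
print(next_zone(Zone.LANDING))
```

Keeping the zone in the path makes it easy to see, and to govern, how far a given dataset has progressed.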
27. Reality of the Data Lake
[Diagram: Landing Zone, Standardization Zone and Analytics Sandbox. Surrounding labels: Data Governance, Data Consumers, Data Operations, Data Sources, Data Scientists, Data Management]
28. Data Consumers
[Diagram labels: Data Access Layer; Portals; Report, BI, Query; Workbenches; Labs; Data Services, Data Virtualization, ETL; Mobile; Data Logistics]
30. Have a Methodology
Establish (but with a defined architecture) a sandbox or proof of concept
Define the vision of value and return
Perform alignment
Assess culture and organizational readiness
Define long-term requirements for use
Define operating models
Design the BI/analytics architecture
Develop a realistic roadmap
Transition to a sustainable architecture
Copyright: First San Francisco Partners, 2017
[Diagram: methodology. Labels: Requirements, Roadmap, Operating Model, Measurement and Sustainment, Architecture and Design, Implementation and Operation, Strategize, Act, Envision and Align, Assess, Discover, Initiate]
31. Deployment Process
Simply put, you have to execute.
[Diagram: Data-Centric Development Life Cycle (High Level). Labels: Data Operations, Data Design, Data Requirements and Discovery, Data Capability Development, Source Data Discovery, Target Data Architecture, Target Data Modeling, Target Database Build, Quality Assurance, Production Migration, Production Data Quality Monitoring, Information Requirements, Map Source Data to Information Requirements, Source Data Analysis, ETL Development, Report Development, Architecture and Design, Roadmap, Operating Model, Strategy, Technology Rationalization]
33. Bridging the Gap
Traditional EDW blended with Data Lake
Lack of agility
Performance
Hard to extend
Structured data only
Missed expectations
Enables experimentation
Satisfies timing and turnaround issues
Allows unstructured data
Mature and useful technology advances
34. Bridging the Gap
Data Lake technology to leapfrog the Data Warehouse
Organizations without a Data Warehouse simply start with a Data Lake
Or organizations that need to evolve their warehouse CAREFULLY replace it with a Data Lake
35. Replacing an Enterprise Data Warehouse
Do:
− Gather the data on your characteristics
− Align how you will use it with business needs
− Remember 30 years of lessons learned
Don’t:
− Assume the characteristics are the same
− Blindly follow a reference architecture
− Just lift tables over to the lake
− Build it and they will come (they still won’t)
Simply put, form follows function.
36. FSFP Reference Architecture
Like an I-beam, the data architecture needs to take the load of meeting business objectives, and distribute that load to supportive structures
DATA INSIGHT ARCHITECTURE
Wrangling Layer
Management Layer
Data Access Layer
Business Strategy
37. FSFP Reference Architecture
DATA INSIGHT ARCHITECTURE
[Diagram with a Vintage Area and a Contemporary Area. Labels: Data Life Cycles; Management; Data Usage; Business Strategy; Legacy BI and Reporting; Data Warehouse, ODS, Mart; ETL, EAI, Replication; Data Lake, Pond; NoSQL (HDFS, Graph); Advanced Analytics; RDBMS, SQL, In-Memory; Appliance; Metadata; Lineage; Reference Data; Alignment; Data Monetization; Visualization; Data Wrangling; Mobile; Logical DW; Unstructured Data]
39. Key Takeaways
Simplifying means being able to use and adapt your BI/Data Lake architecture without a lot of trauma.
If your BI/Data Lake architecture reflects your business environment, it will be easier to understand and use.
Blindly adapting an external reference architecture is a formula for confusion, i.e., complexity.
Leverage what you have – i.e., the knowledge, expertise and opportunities in your organization.
Simply put, don’t completely reinvent the wheel.
41. Thank you for joining – thanks, also, to Looker.com for sponsoring the webinar.
Please join our next webinar on Thursday, March 1: The Importance of Effective Communications in Analytics.
John Ladley @jladley
john@firstsanfranciscopartners.com
Kelle O’Neal @kellezoneal
kelle@firstsanfranciscopartners.com