This Slideshare presentation is a partial preview of the full business document. To view and download the full document, please go here:
http://flevy.com/browse/business-document/shared-services-data-management-strategybig-data-and-bi-2174
BENEFITS OF DOCUMENT
1. Collection of Frameworks to assess Information Architecture & build strategy for Shared services model for Enterprise Data Management
2. Frameworks for Enterprise Architecture - Data Quality, Metadata, Data Governance, Data Modeling, Multi-tenancy in Hadoop, Metering & Charge Back, Operating Model, Engagement Model, Implement Agile in Shared Services model
3. Enhance your understanding of co-existence of Big Data & Traditional Information Management ecosystems
DOCUMENT DESCRIPTION
This is a comprehensive document that details how an enterprise-wide shared services model for data management could transform IT-business synergy while increasing ROI. The strategy document details:
1. Common Issues when adequate data architecture & governance processes are not in place
2. Business/IT Drivers for Shared Services Model for Data Management
3. Enterprise Data Management Framework - Drive Shared Service Model
4. Factors Influencing Shared Services Enterprise Data Management (EDM) Strategies
5. Critical Success Factors for Shared Service Enterprise Data Management (EDM) Implementation
6. Data strategy Overview
7. Shared Services Enterprise Data Platform - Foundational elements
8. Conceptual architecture - Big Data & IM & Analytics for enterprise data management
9. Architecture - Co-existence of Hadoop & Relational databases
10. A Framework to Review & Assess Information Architecture
11. Target state view for Shared Services Enterprise Data Platform
12. Vision of Shared Services Enterprise Data Platform
13. Components of Shared Services Enterprise Data Platform
14. Data Quality as Service
15. Metadata Management Architectures - Why Centralized Metadata management?
16. Data Governance - Mobilize & Implement
17. Data Quality & Data Modeling
18. Multi-tenancy in Hadoop clusters for enterprise re-use of Hadoop data lake
a. Framework for Multi-tenancy implementation
b. Metering and charge back
c. Disaster Recovery
19. Target Operating Model - Shared Services Model
20. Engagement Model - Initiating Data Program
21. Sample Organization Structure - Data roles
22. Innovation & Product / Vendor evaluation framework
23. Talent building framework
24. Agile Delivery Framework - comparative view with traditional BI Development
25. DW Testing - Agile Process Map
Shared Services Data Management Strategy - Big Data & BI
1. Data as an Asset
Building Shared Services Enterprise
Data Management
www.aadhyasolutions.com
Arvind.krishnen@outlook.com
2. Business/IT Drivers for Shared Services Model for Data Management
Identifies the existing architecture services, discovers overlapping and redundant architectures within the organization, and determines which of them can be standardized
Fills the void of the existing architecture services
Adopts best practices from within as well as outside the organization
Establishes enterprise/LOB standards, procedures and governance
Standardizes infrastructure, development methods and operational procedures
Creates repeatable processes, common business rules and components tailored to business needs
Lays down rules and policies on how information is conceived and distributed
Evaluates available tools and methodologies standardizing implementation
Helps to increase business agility
Enables reuse and faster time to market and increases ROI
Business Benefits from Shared Services Model for Enterprise Data Management
This document is a partial preview. Full document download can be found on Flevy:
http://flevy.com/browse/document/strategy-shared-services-enterprise-data-management-2174
3. Culture, Politics, Leadership, and Operating Model: Many cultural, political and leadership factors define an organization/LOB and influence the effectiveness of a shared services EDM program, such as leadership styles, openness of communication, the degree of partnership between business and IT groups, and the degree of autonomy of business groups.
Business Linkage: The extent to which any EDM effort is linked to business strategy.
Senior Management Involvement: If a motivated senior management team exists, ways can be found to make the architecture process more scalable, to overcome or adjust to tight budgets, or to market the shared services program more effectively.
Business Participation: The successful data architect must master the process of navigating cultural, organizational, and political barriers and achieving broad consensus across IT and business organizations alike.
Governance Structure and Compliance Process: A governance structure with appropriate senior-management, business, and IT department representation, together with a compliance process, is essential to a successful ongoing shared services EDM effort.
EDM Resources: EDM effectiveness depends strongly on the competency and availability of the resources performing the activities of the DAM process. This includes not only those directly involved in the process, such as architects, DBAs and modelers, but also those involved in governing the DAM and applying the DAM to their own efforts, such as project managers, infrastructure engineers, and application developers.
Technology Investment and Procurement: The objective is to deepen the penetration of architecture content into daily operational activities. At a minimum, technology investment decisions, even nominal upgrades, must be guided by the shared services EDM process and content.
Critical Success Factors for Shared Service Enterprise Data Management (EDM) Implementation
4. Shared Services Enterprise Data Platform – Foundational elements
Data Categories: Structured, Un-Structured, Real-time, 3rd Party, Social & web
IM infrastructure
Traditional DW Appliance
Research platforms
BigData & HDFS
Reporting & dashboarding
Enterprise Data Strategy is complete when traditional & Hadoop-based data management strategies are brought together
5. A Framework to Review & Assess Information Architecture
Data
Architecture
Architecture
Reference Architecture – E.g. Centralized vs. Federated vs. Distributed, Traditional vs. Cloud or Big Data
Global Strategies – Data Architecture, Master Data Mgmt., Metadata Mgmt. etc.
Certified Provisioning – Canonical based Interfaces & System Rationalization
Design
Design based on Configurability & Reusability
Centralized Data Management Framework for Job Control, Auditing & Monitoring
Embedded Data Hygiene – E.g. Data Reconciliation, Data Quality Management & Cruise Control for standard failures
Technology
Non-Proliferation – Define an Approved Stack
Best Choice for a Use Case – Low Frequency Batch vs. High Frequency Granular ETL; Operational Reporting vs. Analytic Reporting
Benchmarking driven Environment & Development Guidelines
Single Version of Truth Consistency of Processes Time to Market
Assess the current state data architecture in terms of Architecture, Design and Technology on parameters – ability to deliver single version of truth, consistency of processes, time to market.
Outcome:
Areas of optimization across architecture, design and technology
Support with developing Business case for implementation
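The assessment grid above could be tabulated along the following lines. This is a minimal sketch only: the 1-5 scoring scale, the threshold, the sample scores and the function name areas_of_optimization are all illustrative assumptions, not part of the framework in the document.

```python
# The three assessment dimensions and three parameters named on the slide.
DIMENSIONS = ("Architecture", "Design", "Technology")
PARAMETERS = ("Single Version of Truth", "Consistency of Processes", "Time to Market")


def areas_of_optimization(scores, threshold=3):
    """Return (dimension, parameter, score) for cells below threshold, weakest first.

    The scale and threshold are assumptions made for this sketch.
    """
    gaps = [
        (dim, param, scores[dim][param])
        for dim in DIMENSIONS
        for param in PARAMETERS
        if scores[dim][param] < threshold
    ]
    # sorted() is stable, so cells with equal scores keep grid order.
    return sorted(gaps, key=lambda gap: gap[2])


# Illustrative scores only (1 = weak, 5 = strong); not taken from the document.
example_scores = {
    "Architecture": {
        "Single Version of Truth": 2,
        "Consistency of Processes": 4,
        "Time to Market": 3,
    },
    "Design": {
        "Single Version of Truth": 4,
        "Consistency of Processes": 1,
        "Time to Market": 3,
    },
    "Technology": {
        "Single Version of Truth": 5,
        "Consistency of Processes": 3,
        "Time to Market": 2,
    },
}

for dim, param, score in areas_of_optimization(example_scores):
    print(f"{dim} / {param}: {score}")
```

The cells returned are candidate areas of optimization; how they feed a business case is left to the assessor.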
6. Components of Shared Services Enterprise Data Platform
Analytics and Reporting, Data Management, Platform Management, Security & Governance
Information and Insight, Data Optimization, Platform Efficiency, Compliance
Multi Tenancy
• Virtual Machines Adoption
• Containers - Lightweight OS virtualization
Platform Capabilities
• Kafka as a Service
• Cloudera 5.5 Upgrade
• Cloudera Kudu
• Centralized Monitoring & Optimization
Common Enterprise Utilities
• Data Sanitization
• Data ingestion Framework
• Data Quality As Service
Data Ingestion
• Spark streaming
• Real time event ingestion and storage using Kafka
Process
• Project/Change Mgmt
• Knowledge Base
• Process Innovations
Security
• Access controls
• Data Encryption
• Auditing/ Monitoring
Operational Reporting
• MicroStrategy, Tableau, Spotfire integration with Hadoop, Impala
• Spark SQL
Discovery Analytics
• SAS & R based analytics
• Expose Spark ML libraries for Machine Learning
• Real Time Decisioning
Governance & Controls
Newer Capability / Transformation Assets and Accelerators
Rapid Scale in / Scale Out
EA strategy
7. Why Centralized Metadata Management?
• A centrally managed metadata implementation ensures that one definition of data, locations, content and business rules is used by all technologies and solutions
• Key Benefits
  1. Lowers cost of ownership
  2. Makes integration activities easier
  3. Ensures data integrity
• Difficulties
  1. Manual intervention in E2E object linkages at various levels of abstraction
  2. Increased complexity in configuration management
• The centralized metadata architecture ensures
  • Standardized metadata across different systems.
  • No replication of metadata across systems, and hence no need to synchronize metadata across the components used.
  • No need to maintain bi-directional connections between various tools for metadata exchange.
  • Minimal effort in system integration.
  • Optimal hardware resource requirements.
8. Data Quality Aspects
9. Hadoop Multi-Tenancy: Metering and Charge Back
Hadoop Operations Cost Elements
Cost Elements, Unit, Unit cost, Total cost
Compute (CPU)
• Container CPU cores used by apps to perform computation / data processing
• $ / vcore-hour
• vCores of CPU available for an hour
• Monthly CPU Cost / Avail CPU vCores
Compute (Memory)
• Container memory where apps perform computation and access HDFS if needed
• $ / GB-hour
• GBs of memory available for an hour
• Monthly Memory Cost / Avail Memory Capacity
Storage
• HDFS (usable) space needed by an app with default replication factor of three
• $ / GB stored
• Usable storage space (less replication and overhead)
• Monthly Storage Cost / Avg Usable Storage
Bandwidth
• Network bandwidth needed to move data in and out of clusters by the app
• $ / GB inter-region data transfer
• Inter-region (peak) link capacity
• Monthly BW cost / Monthly GB In-Out
Namespace
• Files and directories used by the apps, to understand / limit the load on NN
• NA
• NA
• NA
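The unit-cost ratios above can be turned into a tenant charge roughly as follows. This is a hedged sketch, not the document's method: the ClusterMonth fields, the function names, the 730-hour month used to convert available vCores and memory into vcore-hours and GB-hours, and every figure in the example are assumptions made for illustration. Namespace is left out because the slide lists its unit cost as NA.

```python
from dataclasses import dataclass

# Assumption: capacity is converted to hours using an average month of 730 hours.
HOURS_PER_MONTH = 730


@dataclass(frozen=True)
class ClusterMonth:
    """Hypothetical monthly cost and capacity figures for one cluster."""
    cpu_cost: float               # monthly CPU cost, $
    memory_cost: float            # monthly memory cost, $
    storage_cost: float           # monthly storage cost, $
    bandwidth_cost: float         # monthly inter-region bandwidth cost, $
    avail_vcores: float           # vCores available at any time
    avail_memory_gb: float        # GB of container memory available at any time
    avg_usable_storage_gb: float  # usable HDFS space, after replication and overhead
    monthly_gb_in_out: float      # GB moved in and out of the cluster in the month


def usable_storage_gb(raw_gb: float, replication_factor: int = 3,
                      overhead_fraction: float = 0.0) -> float:
    """Usable HDFS space: raw capacity less replication and overhead."""
    return raw_gb / replication_factor * (1.0 - overhead_fraction)


def unit_rates(c: ClusterMonth) -> dict[str, float]:
    """Unit cost per metered resource, following the ratios listed above."""
    return {
        "vcore_hour": c.cpu_cost / (c.avail_vcores * HOURS_PER_MONTH),
        "memory_gb_hour": c.memory_cost / (c.avail_memory_gb * HOURS_PER_MONTH),
        "storage_gb": c.storage_cost / c.avg_usable_storage_gb,
        "transfer_gb": c.bandwidth_cost / c.monthly_gb_in_out,
    }


def tenant_charge(rates: dict[str, float], usage: dict[str, float]) -> float:
    """Charge for one tenant: metered usage times unit rate, summed over resources."""
    return sum(rates[resource] * amount for resource, amount in usage.items())


# Illustrative figures only.
cluster = ClusterMonth(
    cpu_cost=7300.0, memory_cost=14600.0, storage_cost=5000.0, bandwidth_cost=1000.0,
    avail_vcores=100.0, avail_memory_gb=400.0,
    avg_usable_storage_gb=usable_storage_gb(300_000.0),
    monthly_gb_in_out=20_000.0,
)
rates = unit_rates(cluster)
tenant_usage = {"vcore_hour": 1000.0, "memory_gb_hour": 4000.0,
                "storage_gb": 2000.0, "transfer_gb": 500.0}
charge = tenant_charge(rates, tenant_usage)
print(f"Tenant charge for the month: ${charge:.2f}")
```

Dividing cost by available capacity rather than by consumed capacity means unused headroom is not recovered from tenants; whether that is the intended policy is not stated in the preview.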
10. • Technique #2: Active-Active setup (WD Fusion):
• Uses the WANDisco third-party tool.
• This is a relatively new concept that has not been used at TCS customer sites; WANDisco can be contacted for a list of customers who have deployed it, if needed.
• WD Fusion is a software application that allows Hadoop deployments to replicate HDFS data between Hadoop clusters that are running different, even incompatible, versions of Hadoop. It can even replicate between different vendor distributions and versions of Hadoop.
• Some of the benefits of using WD Fusion are:
  • RPO/RTO is reduced to minutes rather than hours.
  • Virtual file system for Hadoop, compatible with all Hadoop applications.
  • Single, virtual namespace that integrates storage from different types of Hadoop, including CDH, HDP, EMC Isilon, Amazon S3/EMRFS and MapR.
  • Storage can be globally distributed.
Hadoop – Disaster Recovery (Active-Active) (2/2)
WANDisco design of active-active DR between two clusters
11. Sample Organisation structure – indicative key roles
Strategic Interfaces Data Council/Steering group Business Services
Change
Governance
PoC/PoT
Run/Operations
Technology CoE
IBM
Informatica
MDM
Big Data
Product Partners
Quality Assurance
Personal
Commercial
Other Operating Units
Business Operating Units
Core Solution Team
Shared Delivery & Support - Roles
Business and Data SMEs (Client Roles)
Business SMEs
Solution Architect
Delivery Manager/Scrum Master
Tester
Data Analyst/Stewards
ELT/ETL Developer
Hadoop/DW Developer
System Admin
Data Scientist
Big Data Architect
Hadoop/DB Admin
Designer
Centralised Data Team
Core Solution team
Data SMEs/Stewards
Delivery Team
Tech SMEs
Performance Engineering
12. Diagnostics / Assessments
Customized Training Schedule
Self Study component (WBT & Online Learning portal)
Instructor-led Training (Weekly teleconference)
Practice Assignments and group exercises
Evaluation & Feedback
Training Program organized and conducted by senior professors from External & Internal trainings
Toastmasters and weekly soft skills training sessions
Specialized Induction process to bring new joiners up to speed
Highly efficient appraisal process to give feedback and continuous improvement to the associates.
Training Program
Induction Training
Team Induction
Experience Pool
Knowledge Transfer
Knowledge Repository
• Best Practices
• Web Based & Classroom training
• Tools & Checklists
• Proficiency Build
• SPEED: An approach to appraisal
• Peer and Client Feedback
Shadow Support
Domain Training
EXIT TEST
Maturing through multiple projects
Experience
Talent Building Framework
13. Performance Testing – Process Map