DAS Slides: Metadata Management From Technical Architecture & Business Techniques

Metadata provides context for the “who, what, when, where, and why” of data, and is of critical interest in today’s data-driven business environment. Since metadata is created and used by both business and IT, architectural and organizational techniques need to encompass a holistic approach across the organization to address all audiences. This webinar provides practical ways to manage metadata in your organization using both technical architecture and business techniques.

Published in: Data & Analytics

  1. 1. Metadata Management: Technical Architecture & Business Techniques. Donna Burbank, Global Data Strategy, Ltd. July 25th, 2019. Follow on Twitter: @donnaburbank. Twitter event hashtag: #DAStrategies. Copyright Global Data Strategy, Ltd. 2019
  2. 2. Data Intelligence Portfolio Overview. July 2019. Shawn Roberts, VP Solution Strategy
  3. 3. erwin, Inc. Snapshot • Founded March 2016 • 4 strategic M&As to expand portfolio and footprint • R&D for new technologies, including data governance and NoSQL • 3,500+ customers with more than 50,000 users • ~250 employees in 7 offices around the world • Industry-leading NPS © 2019 erwin, Inc. All rights reserved.
  4. 4. Our Mission: What Data Do You Have and Where Is It? 1000s of undocumented applications and databases; 1000s of business terms across different business units. • Harvest data: collect data schema and business terms • Analyze data: map data and attributes • Structure data: standardize on specific business terms and definitions • Govern data: develop a governance model to manage standards and set best practices • Visualize data: enable all stakeholders to see data in one place in their own context
  5. 5. Big $$$ Being Spent on New Technologies to Deliver Data-Driven Insights. [Chart] Categories: Data discovery / analytics; Business process management; Data integration and quality; IT operations; Governance, risk and compliance; Data warehousing; Databases; Business intelligence; Requirements management; Software development lifecycle (SDLC). Figures and labels as they appear on the slide: $21bn, 7%; $2.4bn, 8%; $16bn, 12%; $36bn, 8.7%; $21bn, 8.3%; $728m, 13.8%; Data Governance; $1.2bn, 15%; $4bn, 8%; $5.1bn, 8.7%; $4.1bn, 10.5%; $2.7bn, 4.4%. © 2017 erwin. All rights reserved.
  6. 6. Data Governance for Complex Architectures. erwin Data Intelligence Suite: Business Glossary; Rules; Policies; Communication Center; Data Catalog; DQ, Profiling and Scoring; Automatic Lineage; Automation Framework. Stages: 1. Gather, 2. Clean/Master, 3. Store, 4. Model/Present, 5. Share, 6. Analyze. [Architecture diagram] Labels: CRM; ERP; Operational Systems; External Data; Excel; DataStage 001; DataStage 002; SSIS; ODS; DQS; MDM; Staging; 3NF Data Warehouse; Enterprise Data Warehouse; Data Lake; Data Mart; Sales Schema; Financial Schema; Marketing Schema; Dimensional; Transactional; SSAS; Tabular; Multidimensional; PowerPivot for SharePoint; PowerPivot for Excel; Publish. BI reporting layer: SQL Server Reporting Services; MicroStrategy; Tableau; IBM Cognos; others.
  7. 7. Key Market Trends • Automation: Data professionals need to reduce the time to prepare, share and govern data and build a sustainable data pipeline for the business. How We Help: We automatically scan, harvest and refresh data from the widest array of data sources. We continue to develop and add new data connectors to this portfolio. Our automation framework also includes the ability for third parties to connect with their own ecosystems. • Artificial Intelligence (AI) and Machine Learning (ML): Data professionals need an automated and intelligent approach for data discovery, quality, modeling and more. How We Help: We auto-match business terms with data assets, as well as automatically document lineage, both verbose and simple. Our AI/ML roadmap deepens this capability with more than 8 additional use cases in the next 12 months. • Enterprise Lineage and Impact: “What data do I have, where is it, and who’s using it?” are critical questions for data owners, risk and compliance and governance professionals. Detailed lineage and impact analysis is key to understanding and safeguarding enterprise value streams. How We Help: We provide data within context by connecting it to process models and enterprise architectures. Connecting the data catalog and business glossary to the other erwin products provides new perspective and true business semantics.
  8. 8. Metadata Management Vision • Data Intelligence • Security Aware • Real-Time Monitoring • Semantic Modeling • Pervasive, Active Metadata. The erwin EDGE is the only data intelligence platform that depicts an organization’s entire metadata landscape. Uniquely, we automatically harvest, transform and feed metadata from data sources, operational processes, business applications and data models into a central data catalog and then make it accessible and understandable within contextual, role-based views.
  9. 9. Visionary Impact on Metadata Solution Market. Headings on the slide: Data Intelligence Platform; Active Metadata Approach; Speed to Insights; Data Movement Design; Data Quality. • Automation • Continuous refreshing, versioning • Smart data connectors • Data catalog and data literacy capabilities • Driven by data • Business processes and enterprise applications connected to data • Standardized stakeholder communication • Active data lineage and impact analysis • Fast connectivity to data sources • Document ETL • Unravel complex lineage • Reduce reliance on technical resources • Generate mapping scripts for ETL • Assist cloud migration projects • Support security programs. Data Quality: • Identification of issues and inconsistencies • Link to operational data, such as ServiceNow • Real-time issue identification • Workflow
  10. 10. erwin Enterprise Modeling & Data Intelligence Software. erwin Data Literacy Suite; erwin Data Catalog Suite; Business User Portal; Business Glossary Manager; Mapping Manager; Lifecycle Manager; Reference Data Manager; Data Quality; Data Intelligence Suite; Enterprise Modeling Suites; erwin Enterprise Architecture; erwin Business Process; erwin Data Modeler; Data Automation; Standard Data Connectors; Smart Data Connectors.
  11. 11. Buyer & User Personas: Data Analyst; ETL Developer; Data Architect; BI Developer; Business Analyst; Data Steward; Data Scientist; Data Owner; all others. Powering Digital Transformation: • Improving Digital Experiences • Enhancing Digital Operations • Driving Digital Innovation • Building Digital Ecosystems
  12. 12. Active Metadata Management: Harvesting and Version-Controlled Metadata. Detailed cataloging / crowd-sourced catalog: • erwin-owned connectors for any JDBC data source and a broad variety of formats • Smart connectors for a wide array of code types/repositories including data warehouses, lakes, ETL/ELT/procedural code, BI environments and Big Data solutions • Changed-metadata capture, sensitive-data flagging, enrichment, domain/stewardship assignment, direct association with glossary terms, policies and business rules. Real Business Value: • Dynamically updated, sustainable and accessible • Easy to search/discover and collaborate • Control and accelerate metadata-driven projects • Enables compliance, detailed auditability
  13. 13. Data Stewardship & Curation. Capture, update, associate: • Asset registration, both technical and business-wide • Metadata enrichment • Associated DQ metrics • Inline data profiling • User-defined metadata extensions/properties • Association with business terminology/policy/rules • Sensitive-data categorization and dashboarding
  14. 14. Integrated Data Profiling & Data Quality Scoring. Capabilities: • Data profiling at table and column level: total rows, unique values, distinct values, null values, min value, max value, # of distinct patterns • Display report of statistical summary and associated scoring metrics • On-demand data preview capabilities based on user authentication. Real Business Value: • Business user-friendly profiling and data quality • Management of critical data curation metrics • Easy to use and associated with registered data assets • Data quality scoring follows the data element through the system lifecycle
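
The column-level statistics listed on this slide can be illustrated with a short Python sketch. It is not erwin's implementation: the function names, the reading of "unique" as "occurs exactly once", and the pattern encoding (digits become `N`, letters become `A`) are all assumptions made for the example.

```python
from collections import Counter


def shape_of(value: str) -> str:
    """Reduce a value to its character pattern in one pass: digit -> N, letter -> A."""
    return "".join(
        "N" if ch.isdigit() else "A" if ch.isalpha() else ch for ch in value
    )


def profile_column(values):
    """Compute column-level profile statistics.

    `values` is a list of same-typed values in which None stands for NULL.
    """
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "total_rows": len(values),
        "null_values": len(values) - len(non_null),
        "distinct_values": len(counts),
        # Assumption: "unique" means values that occur exactly once.
        "unique_values": sum(1 for c in counts.values() if c == 1),
        "min_value": min(non_null) if non_null else None,
        "max_value": max(non_null) if non_null else None,
        "distinct_patterns": len({shape_of(str(v)) for v in non_null}),
    }


# Hypothetical sample column with one NULL and one differently formatted value.
postcodes = ["80301", "80302", "80301", None, "CO 80303"]
profile = profile_column(postcodes)
print(profile)
```

A count of distinct patterns greater than one, as in this sample, is often the first hint that a column mixes formats and needs a data quality rule.
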
  15. 15. Integrated Business Glossary. Capabilities: • Multi-glossary • Ownership/stewardship • Business terminology authoring process • Metrics dashboard • Business policy manager • Business rules manager. Real Business Value: • Process management • Business-oriented discovery and navigation • Ease of auditability • Low maintenance overhead
  16. 16. On-Demand Dynamic Data Lineage. Capabilities: • Accessible from erwin Data Catalog and erwin Data Literacy Suites for both technical staff and business users (tabular and graphical) • Dynamically generated from physical data movement mappings, auto-documented and always up to date • Navigable visualizations with drill-down to table and column level detail and full transformations. Real Business Value: • Ease of access and use • Always accurate and complete • Sustainable and trustworthy
  17. 17. On-Demand Dynamic Impact Analysis. Capabilities: • Accessible from erwin Data Catalog and erwin Data Literacy Suites • System generated and maintained based on real-time updates to metadata and mappings. Real Business Value: • Always current • Always accurate and complete • Sustainable • Usage details and metrics
  18. 18. On-Demand Dynamic Mind Mapping. Capabilities: • System-generated “one-stop” contextual view • Associated terms, policies, ownership, status and metadata • Available from erwin Data Catalog and erwin Data Literacy • Navigable visualizations with drill-down to column and transformation details. Real Business Value: • Always current and easily sustained • Accurate and complete • Accessible and intuitive
  19. 19. Thank You
  20. 20. Donna Burbank. Donna is a recognised industry expert in information management with over 20 years of experience in data strategy, information management, data modeling, metadata management, and enterprise architecture. Her background is multi-faceted across consulting, product development, product management, brand strategy, marketing, and business leadership. She is currently the Managing Director at Global Data Strategy, Ltd., an international information management consulting company that specializes in the alignment of business drivers with data-centric technology. In past roles, she has served in key brand strategy and product management roles at CA Technologies and Embarcadero Technologies for several of the leading data management products in the market. As an active contributor to the data management community, she is a long-time DAMA International member, Past President and Advisor to the DAMA Rocky Mountain chapter, and received the Excellence in Data Management Award from DAMA International in 2016. Donna is also an analyst at the Boulder BI Brain Trust (BBBT), where she provides advice and gains insight on the latest BI and analytics software in the market. She has served on several review committees for the Object Management Group for key information management and process modeling notations. She has worked with dozens of Fortune 500 companies worldwide in the Americas, Europe, Asia, and Africa and speaks regularly at industry conferences. She has co-authored two books, Data Modeling for the Business and Data Modeling Made Simple with ERwin Data Modeler, and is a regular contributor to industry publications. She can be reached at donna.burbank@globaldatastrategy.com. Donna is based in Boulder, Colorado, USA.
  21. 21. DATAVERSITY Data Architecture Strategies: This Year’s Lineup • January 24 (on demand): Emerging Trends in Data Architecture – What’s the Next Big Thing? • February 18 (on demand): Building a Data Strategy - Practical Steps for Aligning with Business Goals • March 28 (on demand): Data Modeling at the Environment Agency of England - Case Study • April 25 (on demand): Data Governance - Combining Data Management with Organizational Change • May 23 (on demand): Master Data Management - Aligning Data, Process, and Governance • June 27 (on demand): Enterprise Architecture vs. Data Architecture • July 25: Metadata Management: Technical Architecture & Business Techniques • August 22: Data Quality Best Practices (w/ guest Nigel Turner) • Sept 26: Self Service BI & Analytics: Architecting for Collaboration • October 24: Data Modeling Best Practices: Business and Technical Approaches • December 3: Building a Future-State Data Architecture Plan: Where to Begin?
  22. 22. Today’s Topic: Metadata provides context for the “who, what, when, where, and why” of data, and is of critical interest in today’s data-driven business environment. Since metadata is created and used by both business and IT, architectural and organizational techniques need to encompass a holistic approach across the organization to address all audiences. This webinar provides practical ways to manage metadata in your organization using both technical architecture and business techniques.
  23. 23. Metadata is the “Who, What, Where, Why, When & How” of Data • Who: Who created this data? Who is the Steward of this data? Who is using this data? Who “owns” this data? Who is regulating or auditing this data? • What: What is the business definition of this data element? What are the business rules for this data? What is the security level or privacy level of this data? What is the abbreviation or acronym for this data element? What are the technical naming standards for database implementation? • Where: Where is this data stored? Where did this data come from? Where is this data used & shared? Where is the backup for this data? Are there regional privacy or security policies that regulate this data? • Why: Why are we storing this data? What is its usage & purpose? What are the business drivers for using this data? • When: When was this data created? When was this data last updated? How long should it be stored? When does it need to be purged/deleted? • How: How is this data formatted (character, numeric, etc.)? How many databases or data sources store this data?
  24. 24. Business vs. Technical Metadata • The following are examples of types of business & technical metadata. Business Metadata: • Definitions & Glossary • Data Stewardship • Privacy Level • Security Level • Acronyms & Abbreviations • Business Rules • KPIs and Metrics • Etc. Technical Metadata: • Column structure of a database table • Data Type & Length (e.g. VARCHAR(20)) • Domains • Standard abbreviations (e.g. CUSTOMER -> CUST) • Nullability • Keys (primary, foreign, alternate, etc.) • Validation Rules • Data Movement Rules • Permissions • Etc.
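
To make the business/technical split concrete, here is a minimal Python sketch of one metadata record that carries both kinds of attributes for a single data element. The class, its field names and the sample values are hypothetical; only the `VARCHAR(20)` type and the `CUSTOMER -> CUST` abbreviation come from the slide.

```python
from dataclasses import dataclass

# Hypothetical abbreviation standard; the slide gives only CUSTOMER -> CUST.
ABBREVIATIONS = {"CUSTOMER": "CUST"}


@dataclass
class ColumnMetadata:
    # Business metadata
    business_name: str
    definition: str
    steward: str
    privacy_level: str
    # Technical metadata
    data_type: str
    length: int
    nullable: bool

    def physical_name(self) -> str:
        """Apply the abbreviation standard word by word."""
        words = self.business_name.upper().split()
        return "_".join(ABBREVIATIONS.get(word, word) for word in words)

    def ddl(self) -> str:
        """Render the technical metadata as a column definition."""
        null_clause = "NULL" if self.nullable else "NOT NULL"
        return f"{self.physical_name()} {self.data_type}({self.length}) {null_clause}"


column = ColumnMetadata(
    business_name="Customer Name",
    definition="Full legal name of the customer.",
    steward="Sales Operations",
    privacy_level="Confidential",
    data_type="VARCHAR",
    length=20,
    nullable=False,
)
print(column.ddl())  # CUST_NAME VARCHAR(20) NOT NULL
```

Keeping both halves in one record is one way to let a business definition and its physical implementation be looked up together.
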
  25. 25. Global Data Strategy, Ltd. 2019 Metadata is Part of a Larger Enterprise Landscape 7 A Successful Data Strategy Requires Many Inter-related Disciplines “Top-Down” alignment with business priorities “Bottom-Up” management & inventory of data sources Managing the people, process, policies & culture around data Coordinating & integrating disparate data sources Leveraging & managing data for strategic advantage
  26. 26. Metadata is Hotter than Ever: A Growing Trend. In a recent DATAVERSITY survey, over 80% of respondents stated that metadata is as important as, if not more important than, in the past. Source: Emerging Trends in Metadata Management, 2016, DATAVERSITY, by Donna Burbank and Charles Roe
  27. 27. Global Data Strategy, Ltd. 2019 Who Uses Metadata? Metadata is used by a wide range of roles across the organization. 9 1 Emerging Trends in Metadata Management, 2016, DATAVERSITY, by Donna Burbank and Charles Roe
  28. 28. Global Data Strategy, Ltd. 2019 Who Uses Metadata? “Business” Users were the largest audience. 10 1 Emerging Trends in Metadata Management, 2016, DATAVERSITY, by Donna Burbank and Charles Roe
  29. 29. Metadata is Needed by Business Stakeholders: making business decisions on accurate and well-understood data • 80% of users of metadata are from the business, according to a recent DATAVERSITY survey. • Business users often “get” metadata more than IT does! • Example question: “How was this ‘Total Sales’ figure calculated?” • “Metadata helps both IT and business users understand the data they are working with. Without Metadata, the organization is at risk for making decision based on the wrong data.” Source: Emerging Trends in Metadata Management, 2016, DATAVERSITY, by Donna Burbank and Charles Roe
  30. 30. Global Data Strategy, Ltd. 2019 Who Uses Metadata? Responses for “Other” Users were informative including: • Clients • Data Governance Team, Data Stewards • Management • Technical Sales • External Data Providers • General Public • Government Organizations • Regulators • Library Staff • Research Team • Students • Scientists • GIS Staff • …”Everyone, they just don’t know it yet” 12 1 Emerging Trends in Metadata Management, 2016, DATAVERSITY, by Donna Burbank and Charles Roe
  31. 31. Who Uses Metadata? • Developer: “If I change this field, what else will be affected?” • Business Person (e.g. Finance): “What’s the definition of ‘Regional Sales’?” • Auditor: “How was ‘Total Sales’ calculated? Show me the lineage.” • Data Architect: “What is the approved data structure for storing customer data?” • Data Warehouse Architect: “What are the source-to-target mappings for the DW?” • Business Person (e.g. HR): “How can I get new staff up-to-speed on our company’s business terminology?”
  32. 32. Global Data Strategy, Ltd. 2019 Who Creates Metadata? • Typically, the same people, yes? • ….or is that “someone else’s job”? 14
  33. 33. Data Governance is a Critical Enabler for Metadata Management • Data Governance creates the roles, policies, procedures, and organizational structures to facilitate metadata management. • Multiple roles work together to create business and technical metadata. • Policies, procedures, training, and job descriptions help guide and enforce metadata creation and maintenance. Sample Governance Roles Involved with Metadata Creation*: • Business Data Owner: KPIs; organizational metrics; regulatory guidelines & policy • Business Data Steward: glossary terms & definitions; business rules; acronyms • Data Architect: conceptual & logical models w/ core business rules and definitions; naming standards; data lineage • System Data Steward: physical metadata structures for core applications; business definitions for application fields; alignment of systems with business rules • DBA/Data Engineer: physical metadata structures; naming standards; data type standards. * Note: Roles are different for each organization. Each organization’s governance structure and roles are unique.
  34. 34. Capturing & Storing Business Metadata • Much business metadata and the history of the business exists in employees’ heads. • It is important to capture this metadata in an electronic format for sharing with others. • Avoid the dreaded “I just know”, for example: “Part Number is what used to be called Component Number before the acquisition.” Places to capture it: Business Glossary; Metadata Repository / Catalogue; Data Models; Collaboration Tools; etc.
  35. 35. Crowdsourcing Governance & Metadata Definitions • Many metadata projects & vendors are embracing the concept of “crowdsourcing”, i.e. the Wikipedia vs. Encyclopedia approach: • Open editing • Popularity & usage rankings • Dynamically changing. Encyclopedia (for standardized, enterprise data sets): • Created by a few, then published as read-only • Single source of “vetted” truth • Static. Wikipedia (for self-service data prep & analytics): • Created by many, edited by many • Eventual consistency with multiple inputs • Dynamic
  36. 36. Finding the Right Balance • When implementing metadata management in today’s rapidly changing, self-service data landscape, it is important to find a balance between: • Standards-based Metadata & Governance: well-suited for enterprise-wide data standards • Collaboration-based Metadata & Governance: well-suited for self-service data preparation & analytics. The two methods work well together, using the right approach depending on the data usage.
  37. 37. Capturing & Storing Technical Metadata • Technical metadata can be used “top-down”, i.e. to design and implement systems, as well as “bottom-up”, to discover metadata embedded in existing systems. Top Down: metadata used for active system creation. Bottom Up: metadata discovered through automated ‘scanning’ / ‘crawling’. • Data structures • Lineage & mapping • Relationships & Keys • Indexes • Etc.
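
As an illustration of the "bottom-up" direction, the sketch below scans an existing database and reads its technical metadata back out. It uses Python's standard `sqlite3` module with SQLite's `sqlite_master` catalog and `PRAGMA table_info`; the `scan_schema` function and the `CUST` sample table are hypothetical, and a real scanner would target each platform's own system catalog.

```python
import sqlite3


def scan_schema(conn):
    """Return {table: [column metadata dicts]} discovered from the database itself."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        columns = []
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, name, col_type, notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            columns.append(
                {
                    "name": name,
                    "type": col_type,
                    "nullable": not notnull,
                    "primary_key": bool(pk),
                }
            )
        catalog[table] = columns
    return catalog


# Hypothetical existing system: an in-memory database with one table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUST (CUST_ID INTEGER PRIMARY KEY, CUST_NAME VARCHAR(20) NOT NULL)"
)
catalog = scan_schema(conn)
for column in catalog["CUST"]:
    print(column)
```

The same loop could be extended with `PRAGMA foreign_key_list` and `PRAGMA index_list` to pick up the relationships, keys and indexes the slide mentions.
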
  38. 38. Data Lineage: Reporting & Data Warehouse Example (Audit and Traceability) • In this data warehouse example, metadata exists in a number of tools & data stores that are used to generate the final figure on a given report; metadata can help show that path. [Diagram] Labels: Sales Report (“Total Sales for Customer X this Quarter are $1.5M”); BI Tool; Dimensional Data Model; Logical Data Model; Physical Data Model (twice); ETL Tool (twice); Business Glossary; database tables CUSTOMER (three times), CUST and TBL_C1.
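
Lineage of this kind is commonly stored as a directed graph and answered with a traversal. The Python sketch below is one way to do that, under stated assumptions: the table names come from the slide, but the system prefixes (`CRM`, `ERP`, `STAGING`, `DW`) and the edges between assets are invented for illustration, since the diagram's exact wiring is not recoverable from the transcript.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets it is loaded from.
UPSTREAM = {
    "Sales Report": ["DW.CUSTOMER"],
    "DW.CUSTOMER": ["STAGING.CUST"],
    "STAGING.CUST": ["CRM.CUSTOMER", "ERP.TBL_C1"],
    "CRM.CUSTOMER": [],
    "ERP.TBL_C1": [],
}


def walk(start, edges):
    """Breadth-first traversal returning reachable assets, nearest first."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in edges.get(current, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


def trace_lineage(asset):
    """Lineage: every asset that feeds `asset`, directly or indirectly."""
    return walk(asset, UPSTREAM)


def impact_of(asset):
    """Impact analysis: every asset that depends on `asset`."""
    downstream = {}
    for target, sources in UPSTREAM.items():
        for source in sources:
            downstream.setdefault(source, []).append(target)
    return walk(asset, downstream)


print(trace_lineage("Sales Report"))
print(impact_of("ERP.TBL_C1"))
```

`trace_lineage` addresses the auditor's question about how a reported figure was produced, while `impact_of` addresses the developer's question about what a change to a field would affect; both read the same edge list in opposite directions.
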
  39. 39. Global Data Strategy, Ltd. 2019 Metadata Lineage • Metadata lineage can also be traced in Cloud and “Big Data Environments” 21 Graphic is sourced from : https://aws.amazon.com/glue/
  40. 40. Machine Learning & Metadata Discovery • Machine Learning offers ways to automate tedious tasks that may have been done manually before, e.g. data mapping: • SSN -> Field1_SSN • SSN -> Soc_Num • Etc. • Machine Learning pattern matching: the values in Field_X follow the pattern NNN-NN-NNNN, so it must be an SSN. • There is a place for both methods: sometimes you want to define specific mapping rules; sometimes you want a pattern-matching, discovery-style approach. Source: kdnuggets.com
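
The two approaches on this slide can be combined in a few lines of Python. This sketch is deliberately simple: the pattern check is a plain regular expression standing in for a learned model, not machine learning itself, and the `classify_field` function, the rule table and the 0.9 threshold are assumptions for illustration. The field names `Field1_SSN`, `Soc_Num` and `Field_X` and the `NNN-NN-NNNN` pattern are taken from the slide.

```python
import re

# Explicit mapping rules: known physical field names for the business term SSN.
MAPPING_RULES = {"Field1_SSN": "SSN", "Soc_Num": "SSN"}

# The slide's NNN-NN-NNNN pattern expressed as a regular expression.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def classify_field(name, sample_values, threshold=0.9):
    """Return "SSN" if a mapping rule or the value pattern indicates it, else None."""
    # 1. Rule-based: the field name is already mapped.
    if name in MAPPING_RULES:
        return MAPPING_RULES[name]
    # 2. Discovery-style: enough sampled values match the pattern.
    values = [v for v in sample_values if v]
    if not values:
        return None
    matches = sum(1 for v in values if SSN_PATTERN.fullmatch(v))
    return "SSN" if matches / len(values) >= threshold else None


# Sample values are made up for the example.
print(classify_field("Soc_Num", []))
print(classify_field("Field_X", ["123-45-6789", "987-65-4321"]))
print(classify_field("Field_X", ["80301", "123-45-6789"]))
```

Checking rules first keeps curated mappings authoritative, and the threshold lets the discovery path tolerate a few dirty values without labelling every mixed column as an SSN.
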
  41. 41. Architectural Options for Metadata Management • The following are common architectural options for metadata management within & between organizations. • There is no “one size fits all” approach. • They can be used together within the same organization. Options: • Central, Enterprise-wide Metadata Repository: Metamodel(s); Metadata Storage (Database); Population Interfaces; Matching & Reuse Logic; Publication & Sharing (Reports, Web Portal, Integration & Export) • Tool or Purpose-Specific Repository: Business Glossary; ETL Tool; Data Modeling Tool; BI Tool; Data Dictionary; Database; etc. • Metadata Exchange & Registry: Information Sharing & Standards
  42. 42. Global Data Strategy, Ltd. 2019 Summary • Metadata provides critical business and technical context providing the “who, what, where, when, and why” around data • A wide variety of roles create and consume metadata across business and IT • Data governance provides orchestration for roles and responsibilities around metadata creation and maintenance • Technical metadata can often be automated for metadata discovery; human creation is typically necessary for design and creation • A wide range of architectural options are available for storing, sharing, and managing metadata within and between organizations.
  43. 43. DATAVERSITY Data Architecture Strategies: Join Us Next Month • January 24: Emerging Trends in Data Architecture – What’s the Next Big Thing? • February 18: Building a Data Strategy - Practical Steps for Aligning with Business Goals • March 28: Data Modeling at the Environment Agency of England - Case Study (w/ guest Becky Russell from the EA) • April 25: Data Governance - Combining Data Management with Organizational Change (w/ guest Nigel Turner) • May 23: Master Data Management - Aligning Data, Process, and Governance • June 27: Enterprise Architecture vs. Data Architecture • July 25: Metadata Management: Technical Architecture & Business Techniques • August 22: Data Quality Best Practices (w/ guest Nigel Turner) • Sept 26: Self Service BI & Analytics: Architecting for Collaboration • October 24: Data Modeling Best Practices: Business and Technical Approaches • December 3: Building a Future-State Data Architecture Plan: Where to Begin?
  44. 44. Global Data Strategy, Ltd. 2019 About Global Data Strategy, Ltd • Global Data Strategy is an international information management consulting company that specializes in the alignment of business drivers with data-centric technology. • Our passion is data, and helping organizations enrich their business opportunities through data and information. • Our core values center around providing solutions that are: • Business-Driven: We put the needs of your business first, before we look at any technology solution. • Clear & Relevant: We provide clear explanations using real-world examples. • Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s size, corporate culture, and geography. • High Quality & Technically Precise: We pride ourselves in excellence of execution, with years of technical expertise in the industry. 26 Data-Driven Business Transformation Business Strategy Aligned With Data Strategy Visit www.globaldatastrategy.com for more information
  45. 45. Global Data Strategy, Ltd. 2019 Questions? 27 • Thoughts? Ideas?
