1. Bringing Agility and Flexibility to Data Design and Integration
Phasic Systems Inc
Delivering Agile Data
www.phasicsystemsinc.com
888-735-1774
2. Introduction to Phasic Systems Inc
• Bringing Agile capabilities to the data lifecycle for business success
• Methods and tools tested and refined over years of in-depth, large-scale efforts
• Solve the toughest data problems where traditional methods fail
• Based on extensive consulting lessons learned and real-world results
• Began in 2005 to commercialize advanced Agile methods successfully deployed in competitive development contracts
3. Phasic Systems Inc Management
• Geoffrey Malafsky, Ph.D., Founder and CEO
▫ Research scientist
▫ Supported many organizations in their quest to access the right
information at the right time
• Tim Traverso, Sr VP Federal
▫ Technical Director, Navy Deputy CIO
• Marshall Maglothin, Sr VP HealthCare
▫ Sr. Executive at multiple large health care systems
• Deborah Malafsky, Sr VP Business Development
4. Our Agile Methods
• Why be Agile?
▫ Provide flexibility and adaptability to changing business needs while
maintaining accuracy and commonality
▫ Segmented approach is too slow, rigid, and costly
• How?
▫ Treat data lifecycle as one continuous operation from governance to
modeling to integration to warehouses to Business Intelligence
▫ Emphasize value produced at each step and overall coordination
▫ Seamlessly fit with existing organization, procedures, and tools, but add Agility, commonality, flexibility, and reduced cost and time
• We are Agile and comprehensive
▫ Typical 60-90 day engagement
▫ Deliver completed products, not just plans or partial results
5. Methods and Tools
• DataStar Discovery: Agile data governance, standards and design
▫ Add business and security context to data
▫ Flexible, common data definitions/semantics, models
• DataStar Unifier: Agile warehousing and aggregation
▫ Simplified, common semantics using Corporate NoSQL™
▫ Source to target mapping with flexibility, standardization
▫ Aggregate data using all use case and system variations simply and
easily into standard or NoSQL databases
6. PSI Customer Testimonial
“As a COO of a Wall Street firm and a former Vice Admiral in the United
States Navy in charge of a large integrated organization of thousands of people
and numerous IT systems, I have seen firsthand the critical role that high-quality
enterprise data plays in day-to-day operations of an organization. Without
timely access to reliable and trusted data all of our operations were vulnerable
to poor decision making, weak performance, and a failure to compete. With
Phasic Systems Inc.’s agile methodology and technology, we were finally able to
solve our data challenges at a fraction of the time, cost, and organizational
turmoil that all the previous and more expensive, time-consuming approaches
failed to do. Phasic Systems Inc. offers a new and much-needed approach to
this important area of Business Intelligence.”
VADM (ret) J. “Kevin” Moran
7. The Business Case
Today’s Response Timeline (15 to 27 Months)
• Business Groups (3 to 6 Months): Requirements; Conceptual/Logical Models; Data Quality; Business Rules; Standards
• IT Groups (6 to 9 Months): Develop Systems & Applications; Physical Data Models; Databases / Data Warehouse; ETL controls; MDM
• BI Groups (3 to 6 Months): BI Data Models; Reports; Dashboards
• Users (3 to 6 Months): Capability Problems; New Capabilities; Missing Data
Tomorrow’s Initial Response Timeline with PSI: 2 to 6 Months (Subsequent Response Timeline: Days)
• Requirements; Conceptual Data Model; Logical Data Model; Business Rules; Standards; BI Data Models; Data Quality
• Develop Systems & Applications; Physical Data Models; Databases / Data Warehouse; ETL controls; MDM
8. Agile: Overcome Hurdles
• Group rivalry
▫ Embrace important business variations; recognize no valid reason
to force everyone to use only one view exclusively.
• Terminology confusion
▫ Use a guided framework of well-known concepts to rapidly identify
and implement variations as related entities.
• Poor knowledge sharing
▫ Use integrated metadata where important products (business
models, data models, glossaries, code lists, and integration rules)
are visible, coordinated, and referenceable
• Inflexible designs
▫ Use a hybrid approach (Corporate NoSQL™) for Agile
warehousing and integration blending traditional tables and
NoSQL for its immense flexibility and inherent speed
9. Schemas Are Not Enough
[Figure labels: Governance, Design, Integration, MDM; CEO/CFO/CIO; Sales, Accounting; SAP/IBM/ORACLE (D. Loshin 2008)]
Which Value? Whose? My “customer” or your “customer”?
How is data used?
• Must be agile in order to adapt quickly to new business needs
▫ Continuous change is the norm: requirements, consolidation
▫ We must use all the important business variations of key terms (e.g.
account, client, policy). There is no such thing as a single version for all!
12. Real Estate Listing Example
• Seems simple and well-defined
▫ Each house has a type, id, address, etc.
▫ Industry standards: OSCRE, RETS
• Yet, data systems are very different
▫ Data model tied tightly to business workflow
▫ Extensions and “make-it-work” changes added over time
• Similar to customer relationship management, ERP, and many
other fields
13. Semantic Conflict in Real Estate Models
[Figure: NKY and HOMESEEKERS data models]
• NKY attribute ‘basement’ does not have a counterpart in HOMESEEKERS
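A conflict like the one on this slide can be found by comparing the attribute names of the two models. The following is a minimal sketch, assuming each model is reduced to a set of attribute names; apart from 'basement' (from the slide) and type, id, and address (from the previous slide), the attribute lists are hypothetical, and matching by exact name is a simplification that would miss renamed attributes.

```python
# Attribute sets are illustrative; only 'basement' in NKY is taken from the slide.
nky_attributes = {"id", "type", "address", "basement"}
homeseekers_attributes = {"id", "type", "address"}


def unmatched_attributes(source, target):
    """Return attributes of `source` with no same-named attribute in `target`."""
    return sorted(source - target)


missing_in_homeseekers = unmatched_attributes(nky_attributes, homeseekers_attributes)
print(missing_in_homeseekers)
```

Running the comparison in both directions gives the full list of attributes that need a mapping decision before the two sources can be merged.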
14. Data Value Semantic Errors = Inconsistent, Difficult to Merge, Report, Analyze
• Lot_dimensions: implied semantics for size data. Actually has all sorts of data
• Semiannual_taxes: implied semantics for numeric data. Actually has all sorts of data
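One way to surface this kind of error is to profile a column against the type its name implies. The sketch below assumes the values arrive as raw strings and that a "numeric" value is anything Python's `float` accepts after stripping a dollar sign and thousands separators; the sample values are hypothetical, not taken from the real listings.

```python
def profile_numeric(values):
    """Split raw string values into those that parse as numbers and those that do not."""
    numeric, other = [], []
    for raw in values:
        cleaned = raw.strip().replace("$", "").replace(",", "")
        try:
            float(cleaned)
            numeric.append(raw)
        except ValueError:
            other.append(raw)
    return numeric, other


# Hypothetical sample of a Semiannual_taxes column.
semiannual_taxes = ["1250.00", "$980", "see agent", "N/A", "1,100"]
numeric_values, other_values = profile_numeric(semiannual_taxes)
print(f"{len(other_values)} of {len(semiannual_taxes)} values are not numeric: {other_values}")
```

The share of non-numeric values gives a quick measure of how far a field has drifted from its implied semantics.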
20. DataStar Corporate NoSQL™
• Large systems use NoSQL for its flexibility, performance,
and adaptability
▫ But, it is poorly suited for corporate use – lacks connection to
business
• DataStar Corporate NoSQL™
▫ Blends traditional techniques and NoSQL for speed and agility
▫ Entities come directly from Unified Business Model
▫ Object structure with simple tables
▫ Key-value pairs are basic repeating structure of all tables
▫ Business driven terminology
▫ Easily handles semantic variations & updates w/o changes to
logical or physical models
▫ Can be as ‘dimensional’ or ‘normalized’ as desired
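The idea of key-value pairs as the repeating structure of a simple table can be illustrated with a small sketch. This is not the DataStar implementation; it assumes a single hypothetical `listing` table in an in-memory SQLite database, with hypothetical entity ids and attribute values, and only shows why a new or source-specific attribute needs no change to the table definition.

```python
import sqlite3

# One simple table; every attribute of every entity is a (key, value) row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE listing (entity_id TEXT, source TEXT, key TEXT, value TEXT)"
)


def store(entity_id, source, attributes):
    """Insert each attribute of an entity as its own key-value row."""
    conn.executemany(
        "INSERT INTO listing VALUES (?, ?, ?, ?)",
        [(entity_id, source, k, str(v)) for k, v in attributes.items()],
    )


def load(entity_id):
    """Reassemble an entity from its key-value rows."""
    rows = conn.execute(
        "SELECT key, value FROM listing WHERE entity_id = ?", (entity_id,)
    ).fetchall()
    return dict(rows)


# Sources with different attribute sets share the same table.
store("h1", "NKY", {"type": "single family", "basement": "full"})
store("h2", "HOMESEEKERS", {"type": "condo"})
# A later attribute is just another row; the table definition is unchanged.
store("h2", "HOMESEEKERS", {"garage": "2 car"})

print(load("h1"))
print(load("h2"))
```

The trade-off in a design like this is that type checking and constraints move out of the table definition and into the surrounding rules and metadata.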
22. Results
• Applied to production data:
▫ Fully cleaned & integrated data, governance approved
▫ Requirement: 500,000 records in 2 hrs on Sun E25K
▫ Actual: 50 minutes on 3-year-old low-cost server
• Governance documents produced and approved
▫ Legacy data models: first time in ten years
▫ Common data model, directly derived from ontology: Position-Resume model
• Standing governance board created with short monthly decision-making meetings
▫ Position-Resume Governance Board
• Process approach and technology applied to new IT
systems
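The requirement and actual figures in the first bullet can be put on a common footing as throughput. The short calculation below uses only the numbers stated on the slide and assumes the same 500,000 records were processed in the 50-minute run.

```python
RECORDS = 500_000
required_seconds = 2 * 60 * 60   # requirement: 2 hours
actual_seconds = 50 * 60         # actual: 50 minutes

required_rate = RECORDS / required_seconds   # records per second needed
actual_rate = RECORDS / actual_seconds       # records per second achieved
time_ratio = required_seconds / actual_seconds

print(f"required: {required_rate:.1f} records/s")
print(f"actual:   {actual_rate:.1f} records/s")
print(f"actual run used 1/{time_ratio:.1f} of the allowed time")
```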
23. Navy HR Data Analysis
• Groups “share” data and control only if they don’t lose project
control or funds
• Governance, business process, and data engineering groups create separate
designs and don’t know how to coordinate
• Try hard to follow industry guidance but get stuck
• Actual data is very different from what policy and management assume
▫ Example 1: Multiple Rate/Rating entries. Person xxxxxx has 5
entries: 4 end on the same date, 2 have start dates after
their end dates, 2 start and end on the same days but are
different
▫ Example 2: 30 different values used for RACE but only 6 allowed
values in the Navy Military Personnel Manual derived from DoD
policy
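Both examples describe checks that are easy to automate once the rules are written down. The following is a minimal sketch, assuming entries are dictionaries with `start` and `end` dates and that the allowed codes are available as a set; the sample entries and the codes "A", "B", "C" are hypothetical placeholders, not the values from the Navy Military Personnel Manual.

```python
from datetime import date


def invalid_date_ranges(entries):
    """Return entries whose start date falls after their end date."""
    return [e for e in entries if e["start"] > e["end"]]


def disallowed_values(values, allowed):
    """Return the distinct values that are not in the allowed code list."""
    return sorted(set(values) - set(allowed))


# Hypothetical Rate/Rating entries for one person.
entries = [
    {"rating": "R1", "start": date(2001, 1, 1), "end": date(2003, 6, 30)},
    {"rating": "R2", "start": date(2004, 1, 1), "end": date(2003, 6, 30)},
]
bad_ranges = invalid_date_ranges(entries)

# Hypothetical placeholder codes; the real allowed list comes from policy.
allowed_codes = {"A", "B", "C"}
observed_codes = ["A", "B", "X", "b", "C", "X"]
unexpected = disallowed_values(observed_codes, allowed_codes)

print(f"{len(bad_ranges)} entry has a start date after its end date")
print(f"values outside the allowed list: {unexpected}")
```

Checks like these report how far the actual data departs from policy, which is the gap this slide points out.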