This presentation delivers a detailed understanding of taxonomy definitions, taxonomy value (ROI), and taxonomy design methodologies and approaches. It was originally delivered by Zach Wahl and Tatiana Cakici of Enterprise Knowledge at Taxonomy Boot Camp 2019 in Washington, DC.
2. HELLO!
ZACH WAHL
PRINCIPAL
Areas of Focus: Management & IT Leadership, KM Strategy, Information Governance, Taxonomy Design
TATIANA CAKICI
SENIOR CONSULTANT
Areas of Focus: Taxonomy Design, Information Governance, KM Strategy
@EKConsulting
4. tax·on·o·my (tăk-sŏn′ə-mē)
n. pl. tax·on·o·mies
1. The classification of organisms in an ordered system that indicates natural relationships.
2. The science, laws, or principles of classification; systematics.
3. Division into ordered groups or categories: "Scholars have been laboring to develop a taxonomy of young killers" (Aric Press).
TAXONOMY DEFINITION
EK’s Definition of Taxonomy
Controlled vocabularies used to describe or characterize explicit concepts of
information, for purposes of capture, management, and presentation.
5. TAXONOMY AND METADATA
• Provide structure to unstructured information.
• Join or relate multiple disparate sources of information.
• Provide multiple avenues to find and discover information.
• Enable findability.
6. In a supermarket, where would you expect to find almond milk?
• Breakfast section
• Dairy section
• Baking section
• Beverages section
8. Sometimes content repositories look like products in a supermarket.
Thousands of items. Multiple categories and multiple facets.
Can you find the almond milk?
9. General Product Metadata:
• Delivery Day
• Amazon Prime
• Eligible for Free Shipping
Specific TV Metadata:
• TV Display Size
• Television Resolution
• Electronic Device Model Year
• Etc.
13. Business taxonomies are
classification for findability.
TRADITIONAL V. BUSINESS TAXONOMIES
Traditional taxonomies are
classification for the sake of
classification.
Source: https://adapaproject.org/
14. TRADITIONAL V. BUSINESS TAXONOMIES
                    | Traditional Taxonomy                        | Business Taxonomy
PURPOSE             | Categorization                              | Findability
DESIGNED BY         | Scientists/Librarians                       | The Business
MANAGED BY          | Scientists/Librarians                       | The Business
USED BY             | Scientists/Librarians                       | Everyone
COMPLEXITY          | Deep, Wide, Detailed                        | Flat, Simple, Deconstructed
KEY CHARACTERISTICS | Mutually Exclusive, Collectively Exhaustive | Usable, Intuitive, Natural
15. METADATA FIELD CONCEPTS
▪ Primary Metadata Field:
A field that can apply to all content across all
systems.
▪ Secondary Metadata Field:
A field that can apply to a subset of content across
all systems.
▪ Tertiary Metadata Field:
A system- or function-specific field.
16. BUSINESS TAXONOMIES
A business taxonomy is:
• Usable – Easy to adopt and utilize for any skill level.
• Relatively flat (2-3 levels).
• “Easy” to navigate.
• Intuitive – Does not require training, reflects the way the user thinks.
• Natural – Uses the organization, vocabulary, and logic of the user.
17. TRADITIONAL VS. BUSINESS TAXONOMIES
TRADITIONAL TAXONOMIES
▪ Rigid structure.
▪ Items are classified into a single category.
BUSINESS TAXONOMIES
▪ Tend to be less rigid and constrained.
▪ Influenced by “traditional” usability design.
▪ Driven by the content needs you have today and will have tomorrow.
▪ Leverage multiple categorization approaches (via multiple metadata fields and multiple taxonomies).
▪ Accept imperfect categorization.
18. TAXONOMY AND ONTOLOGY
Taxonomy - Controlled vocabularies used to
describe or characterize explicit concepts of
information, for purposes of capture,
management, and presentation.
Ontology - A defined model that organizes
structured and unstructured information
through entities, their properties, and the way
they relate to one another.
19. KNOWLEDGE ORGANIZATION CONTINUUM
FOLKSONOMY
Free-text tags.
CONTROLLED LIST
List of pre-defined terms. Improves consistency.
TAXONOMY
Pre-defined terms & synonyms. Hierarchical relationships. Improves consistency. Allows for parent/child content relationships.
ONTOLOGY
Predefined classes & properties. Expanded relationship types. Increased expressiveness. Semantics. Inference.
KNOWLEDGE GRAPHS
Capture related data. Integration of structured and unstructured information. Linked data store. Architecture and data models to enable machine learning (ML) and other AI capabilities. Drive efficient and intelligent data and information management solutions.
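To make the step from free-text tags to a taxonomy concrete, the sketch below normalizes folksonomy-style tags to preferred terms through a synonym map and then walks a parent/child hierarchy. This is a minimal illustration only: every term, synonym, and function name here is a hypothetical example, not part of the original presentation.

```python
# Hypothetical synonym map: lowercase free-text tag -> preferred term.
SYNONYMS = {
    "hr": "Human Resources",
    "human resources": "Human Resources",
    "people ops": "Human Resources",
    "hiring": "Recruiting",
    "recruiting": "Recruiting",
}

# Hypothetical hierarchy: term -> parent term (None marks a top-level term).
PARENT = {
    "Human Resources": None,
    "Recruiting": "Human Resources",
    "Benefits": "Human Resources",
}


def normalize(tag):
    """Map a free-text tag to its preferred term, or None if it is uncontrolled."""
    return SYNONYMS.get(tag.strip().lower())


def ancestors(term):
    """Return the chain of broader terms above `term`, nearest parent first."""
    path = []
    parent = PARENT.get(term)
    while parent is not None:
        path.append(parent)
        parent = PARENT.get(parent)
    return path


print(normalize(" HR "))
print(ancestors("Recruiting"))
```

The synonym map plays the role of the controlled list, and the parent map adds the hierarchical relationships that distinguish a taxonomy from a flat list.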
20. ONTOLOGY DEFINITIONS
on·tol·o·gy (änˈtäləjē)
n. pl. on·tol·o·gies
1. “A set of concepts and categories in a subject area or domain that shows their properties and the relations between them.” (Oxford Dictionary)
2. “Controlled, consistent vocabularies to describe concepts and relationships, thereby enabling knowledge sharing.” (Gruber, 1993)
3. “Formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse.” (Wikipedia)
EK’s Definition of Ontology
A defined model that organizes structured and unstructured information through
entities, their properties, and the way they relate to one another.
(Example: pizza has topping cheese, Alsace is located in France)
21. SAMPLE ONTOLOGY
Ontologies = Relationships
• Widgets, Inc. has a contract with Consult, Inc.
• Alice Reddy works for Widgets, Inc.
• Alice Reddy reports to Bob Jones.
• Kat Thomas is working with Bob Jones.
• Kat Thomas is working on the Sales Process Redesign Project.
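The relationships above can be written down as subject-predicate-object triples, which is one common way to record ontology statements. The following is a minimal Python sketch rather than a production model; the predicate names (`has_contract_with`, `works_for`, and so on) and the two helper functions are illustrative assumptions, not taken from the slides.

```python
# Each statement from the slide as a (subject, predicate, object) triple.
# Predicate names are hypothetical labels chosen for this sketch.
TRIPLES = [
    ("Widgets, Inc.", "has_contract_with", "Consult, Inc."),
    ("Alice Reddy", "works_for", "Widgets, Inc."),
    ("Alice Reddy", "reports_to", "Bob Jones"),
    ("Kat Thomas", "works_with", "Bob Jones"),
    ("Kat Thomas", "works_on", "Sales Process Redesign Project"),
]


def objects_of(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def related_to(entity):
    """Return every triple in which `entity` appears as subject or object."""
    return [t for t in TRIPLES if entity in (t[0], t[2])]


print(objects_of("Alice Reddy", "reports_to"))
print(related_to("Bob Jones"))
```

Storing relationships this way lets one entity be reached from several directions, which a single hierarchy cannot express.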
23. THE INFORMATION MANAGEMENT CHALLENGE
“Democratization of Content Management” has resulted in exponential increases in information.
80% of business is conducted on unstructured information.
Unstructured data doubles every three months.
Knowledge workers spend 15% - 35% of their time searching for information.
40% of corporate users can’t find the information they need to do their jobs.
24. BUSINESS TAXONOMY VALUE
TAXONOMY FOR STANDARDIZATION
TAXONOMY FOR FINDABILITY
TAXONOMY FOR RISK AVOIDANCE AND MANAGEMENT
25. TAXONOMY RETURNS – IMPROVED FINDABILITY
Not locating and retrieving information has an opportunity cost of more than $15 million annually.
Time spent looking for and not finding information costs a total of $6 million a year.
Reworking information because it hasn't been found costs a further $12 million a year (15% of time spent duplicating existing information).
*Sue Feldman. “The High Cost of Not Finding Information.”
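Estimates of this kind come from multiplying headcount, labor cost, and the share of time lost. A minimal sketch of that arithmetic follows. The headcount and loaded annual cost are hypothetical placeholders to be replaced with an organization's own numbers; only the 15% rework share is quoted from the slide, and the result is not a reproduction of the cited study.

```python
def annual_cost_of_lost_time(employees, loaded_annual_cost, share_of_time_lost):
    """Estimate the yearly cost of time lost to searching or rework.

    share_of_time_lost is a fraction between 0 and 1 (for example 0.15 for 15%).
    """
    if not 0 <= share_of_time_lost <= 1:
        raise ValueError("share_of_time_lost must be between 0 and 1")
    return employees * loaded_annual_cost * share_of_time_lost


# Hypothetical inputs: 500 knowledge workers at a loaded cost of 100,000 each.
# The 15% share is the rework figure quoted on the slide.
rework_cost = annual_cost_of_lost_time(500, 100_000, 0.15)
print(f"Estimated annual rework cost: {rework_cost:,.0f}")
```

Running the same function with 0.15 and 0.35 gives a low and high bound for the search-time range quoted earlier in the deck.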
26. TAXONOMY RETURNS – INCREASED REVENUES
Web Retail Taxonomy Refreshers Have Yielded:
30% Increased Conversion Rate
20% Increased Order Lift
27. TAXONOMY VALUE EXAMPLE 1
▪ Project: Taxonomy Design for a Customer Call Center System
▪ Expected Business Value:
Add a layer of findability to content for sales agents.
Provide faster access to information (by product, service, key topic, or customer profile) and, in turn, offer proactive customer service.
Tag answers to FAQs with products and customer type to increase first contact resolution and sales conversion.
Improve findability of content on common topics to reduce call handling time and save costs.
Organize information in an intuitive way that allows agents to have streamlined, productive interactions with customers.
Enhance and expand search features to discover content that may be of value to sales agents.
28. TAXONOMY VALUE EXAMPLE 2
• Project: Taxonomy Design for a Public-facing Website
• Expected Business Value:
Increased revenue through more specific conversations with customers.
More targeted conversations with candidates supported by specific language that describes what the company does.
Decreased costs through time savings; content re-creation and pointless searching are eliminated.
Accuracy of reporting to achieve more effective decision making.
29. TAXONOMY VALUE EXAMPLE 3
▪ Project: Taxonomy Design for an Internal Knowledge Repository
▪ Expected Business Value:
Give users the ability to filter content by key facets (e.g. topic, author) and find related documents/content.
Develop standard content types to provide faster creation and access of documents across the organization.
Improve findability of FAQs by tagging them with common topics, type of customer, type of issue, etc.
Reduce cost with smarter reuse of knowledge while improving management of current and future projects.
32. BUSINESS TAXONOMY FOR YOUR ORGANIZATION
Metadata Field
Metadata Values
Your Organization’s Website
TOPIC
❑ Topic 1
❑ Topic 2
❑ Topic 3
❑ Topic 4
❑ …
DOCUMENT TYPE
❑ Type 1
❑ Type 2
❑ Type 3
❑ Type 4
❑ …
LOCATION
❑ Location 1
❑ Location 2
❑ Location 3
❑ Location 4
❑ …
BUSINESS AREA
❑ B. Area 1
❑ B. Area 2
❑ B. Area 3
❑ B. Area 4
❑ …
33. MULTIPLE TAXONOMIES COMBINE SYNERGISTICALLY
• Categorize in multiple, independent categories.
• Allow combinations of categories to narrow the choice of items.
• 4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes.
• Easier to maintain.
• Easier to reuse existing material.
42 values to maintain (10+6+11+15)
9900 combinations (10x6x11x15)
Main Ingredients
• Chocolate
• Dairy
• Fruits
• Grains
• Meat & Seafood
• Nuts
• Olives
• Pasta
• Spices & Seasonings
• Vegetables
Meal Type
• Breakfast
• Brunch
• Lunch
• Supper
• Dinner
• Snack
Cuisines
• African
• American
• Asian
• Caribbean
• Continental
• Eclectic/Fusion/International
• Jewish
• Latin American
• Mediterranean
• Middle Eastern
• Vegetarian
Cooking Methods
• Advanced
• Bake
• Broil
• Fry
• Grill
• Marinade
• Microwave
• No Cooking
• Poach
• Quick
• Roast
• Sauté
• Slow Cooking
• Steam
• Stir-fry
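The arithmetic on this slide can be checked directly. The sketch below, assuming the four facet lists exactly as they appear on the slide, counts the values to maintain and the combinations they can express, then narrows a small set of items by combining facets. The sample recipes and the `filter_items` helper are hypothetical, added only for illustration.

```python
from math import prod

# Facet values copied from the slide.
FACETS = {
    "Main Ingredients": [
        "Chocolate", "Dairy", "Fruits", "Grains", "Meat & Seafood",
        "Nuts", "Olives", "Pasta", "Spices & Seasonings", "Vegetables",
    ],
    "Meal Type": ["Breakfast", "Brunch", "Lunch", "Supper", "Dinner", "Snack"],
    "Cuisines": [
        "African", "American", "Asian", "Caribbean", "Continental",
        "Eclectic/Fusion/International", "Jewish", "Latin American",
        "Mediterranean", "Middle Eastern", "Vegetarian",
    ],
    "Cooking Methods": [
        "Advanced", "Bake", "Broil", "Fry", "Grill", "Marinade", "Microwave",
        "No Cooking", "Poach", "Quick", "Roast", "Sauté", "Slow Cooking",
        "Steam", "Stir-fry",
    ],
}

# Values to maintain add up; expressible combinations multiply.
values_to_maintain = sum(len(values) for values in FACETS.values())
combinations = prod(len(values) for values in FACETS.values())

# Hypothetical tagged items, one value per facet.
RECIPES = [
    {"title": "Granola", "Main Ingredients": "Grains", "Meal Type": "Breakfast",
     "Cuisines": "American", "Cooking Methods": "Bake"},
    {"title": "Pad Thai", "Main Ingredients": "Pasta", "Meal Type": "Dinner",
     "Cuisines": "Asian", "Cooking Methods": "Stir-fry"},
    {"title": "Fruit salad", "Main Ingredients": "Fruits", "Meal Type": "Breakfast",
     "Cuisines": "American", "Cooking Methods": "No Cooking"},
]


def filter_items(items, selected):
    """Keep items whose facet values match every facet/value pair in `selected`."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


print(values_to_maintain, combinations)
print(filter_items(RECIPES, {"Meal Type": "Breakfast", "Cooking Methods": "No Cooking"}))
```

Each facet stays small enough to review by hand, while selecting values from two or more facets narrows the result set quickly.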
34. COMMON METADATA FIELDS
Method | Definition | Examples
Subject-oriented | Information categorized by subject or topic. Instantive: each child category is an instance of the parent category. Partitive: each child category is a part of the parent category. | water pollution, soil pollution, air pollution…
Functional | Information categorized by the process to which it relates. | employment, staffing, training
Organizational | Information categorized by corporate departments or business entities. | Human Resources, Marketing, Accounting, Research…
Document Type | Information categorized by the type of document. | presentations, expense reports, press releases…
35. TAXONOMY DESIGN AND BEST PRACTICES
Leverage Existing Information
Plan for the Long-Term
Leverage Governance
Look to Usability Best Practices
Define & Document Your Purpose
Focus on the Business User
Understand Your Publishing Process
Use the Simplest Language Possible
Deconstruct Your Taxonomy
37. TOP-DOWN, BOTTOM-UP APPROACH
TOP-DOWN
Interviews, Workshops, and Focus Groups
Goals:
1. Identify overall structure and major categories of information.
2. Subdivide categories as necessary to build the taxonomy.
BOTTOM-UP
Analysis of individual documents, key document sets, and major content repositories.
Goals:
1. Identify overall structure and major categories of information.
2. Subdivide categories as necessary to build the taxonomy.
40. BUSINESS CASE
• Define audience.
• Define the mission of your audience.
• Define the true reasons for designing the taxonomy.
• What specifically can the taxonomy do for the end business users?
Any organization can say “we want to build a taxonomy to make finding information easier for our users,” but what does that tell us? How does that help us? We need to understand our users from the business perspective and answer the question: We want our on-the-road sales staff to have one-click access to customer news. We want every employee to find any form we have without calling or emailing anyone. We want new employees to be able to find everything they need to get started on Day 1.
41. SCOPING
Taxonomy Scope Constraints
• Timeline
  - Set dates for “broader” project (technology or organizational).
  - Regulatory requirements.
• People
  - Availability
  - Acceptance
  - Understanding
• Technology
  - Requirements v. Capabilities
• Budget
42. KNOWLEDGE GATHERING
• Communication, Education, and Marketing:
  - Set user expectations
  - Translate “pain points” to solutions in real time
  - Create “buzz” around the project
  - Market the results, not the definitions
• Identify taxonomy and content starting points:
  - Key stakeholders and early adopters
  - Existing taxonomies and information systems
  - Critical “must find” content
43. TAXONOMY TEAM
• Convene a wide-spectrum team (12-18 people) to represent their components of the organization. Strive for diversity in:
  - Function
  - Hierarchy (to a degree)
  - Tenure
  - Geography
• Strive to identify individuals who “get it,” but also wield influence in their respective domains.
• Participation should become an official and measurable job activity, supported by management.
44. TAXONOMY TEAM
• The Taxonomy Team will ensure the taxonomy is a true business taxonomy.
  - Participate in initial workshops to identify metadata fields and top-down taxonomy design.
  - Identify and enlist additional representatives for follow-on workshops, focus groups, and testing.
  - Support the content migration (and tagging) process.
• The Taxonomy Team will continue to meet throughout the length of the effort, and ideally beyond.
45. TAXONOMY DESIGN WORKSHOP
ONE DAY BUSINESS TAXONOMY DESIGN WORKSHOP
Ensure your organization produces a truly impactful taxonomy design.
RESULTS
1 ALIGNMENT
Stakeholders baselined in what taxonomy is, the value it offers, and the resources necessary to sustain and evolve the design.
2 DESIGN
A starter taxonomy design that follows taxonomy design best practices on which to elaborate.
3 APPROACH
A clear path forward around which to proceed, plan, and build a taxonomy which represents your stakeholders.
46. PEOPLE CENTRIC TAXONOMY DESIGN ACTIVITIES
Workshops
Working with a cross-organizational group of stakeholders and guiding them to provide metadata and taxonomy details by asking key questions about content they create or use. During the discussions, participants identify audiences. The discussion leads to the identification of topics, document types and other taxonomies.
Focus Groups
Conducting taxonomy focus groups per business area to identify metadata fields that are applicable to the organization as a whole and metadata fields that are unique to their own business area. Participants are asked to discuss content from their own business area and identify associated keywords. The discussion leads to the identification of topics that are unique to that business area.
Interviews
This approach consists of discussing content and taxonomy needs with a specific individual. Participants are typically key project stakeholders in a senior leadership role. It is also common to conduct interviews with people who have unique roles:
▪ Platform owner
▪ Taxonomy lead
System Demos
This approach consists of attending system demos to learn more about the client’s content and taxonomy needs. These demos help visualize how existing taxonomies (if any) are used for tagging and search purposes and whether they meet the client’s current and future content needs.
47. CONTENT CENTRIC TAXONOMY DESIGN ACTIVITIES
Content Analysis
This approach consists of manually reviewing individual pieces of content (e.g. documents or website pages) to identify patterns of content and possible taxonomies.
Taxonomy Background Documentation Review
A “quick reference” list of past or existing documents, content, and items that provides helpful information for the taxonomy design and taxonomy governance efforts.
▪ Existing systems and taxonomies
▪ Lessons learned from taxonomy efforts
▪ Taxonomy requirements
▪ Existing taxonomy policies/procedures
▪ Search logs
Corpus Analysis
Text mining and entity extraction tools, such as PoolParty, help uncover the complexity of information and identify new ways to see, find and relate information. The analysis of a collection of documents with a text mining application can reveal a set of metadata and associated taxonomies for an organization.
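As a rough illustration of what corpus analysis surfaces, the sketch below counts how many documents each term appears in and lists the most widespread ones as candidate taxonomy terms. It uses only the Python standard library and does not represent how PoolParty or any other product works; the sample documents and the stop-word list are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop words and sample documents for this sketch.
STOP_WORDS = {"the", "a", "an", "and", "of", "for", "to", "in", "on", "new"}

DOCUMENTS = [
    "Travel expense policy for sales staff",
    "Expense report template and travel policy",
    "Onboarding procedure for new sales staff",
]


def candidate_terms(documents, top_n=5):
    """Rank terms by the number of documents that contain them."""
    doc_frequency = Counter()
    for text in documents:
        # Use a set so each term counts once per document.
        words = set(re.findall(r"[a-z]+", text.lower())) - STOP_WORDS
        doc_frequency.update(words)
    # Sort by descending frequency, then alphabetically for stable output.
    ranked = sorted(doc_frequency.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


for term, count in candidate_terms(DOCUMENTS):
    print(term, count)
```

Terms that recur across many documents are only candidates; they still need review in workshops or focus groups before they become taxonomy values.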
48. TAXONOMY VALIDATION OBJECTIVES
Alignment
Taxonomy values are reflected and accurately distributed across content.
Usability
The structure and language of the taxonomy are intuitive to end users.
Completeness
The taxonomy values are applicable to the complete set of content across the system.
Corpus Analysis
Test Tagging
Card Sorting and Tree Test
49. TAXONOMY VALIDATION TECHNIQUES
Card Sorting
A technique that requires participants to sort representative content into categories from the taxonomy. Typically online.
Tree Test
An exercise that consists of separate tasks to find content by navigating through the taxonomy.
Test Tagging
In-person workshops where participants work in pairs to apply values from the taxonomy to tag existing content.
Corpus Analysis
A semantic analysis of content that compares it to the proposed taxonomy to identify gaps through a machine learning algorithm.
50. CONTENT TYPES
A Content Type is a reusable collection of metadata fields for a category of content, with its corresponding taxonomies, that allows you to manage information in a centralized, reusable way. For example:
News Content Type
Topic
Source
Client
Type
Region
Title
Author
Date
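One way to picture a content type in code is as a record whose taxonomy-driven fields only accept values from a controlled list. The sketch below models the News content type with the eight fields named on this slide. The allowed values for Topic and Type are hypothetical placeholders, and validating in `__post_init__` is an assumption made for this sketch, not a description of any particular content management system.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical controlled vocabularies for the taxonomy-driven fields.
TAXONOMIES = {
    "topic": {"Knowledge Management", "Taxonomy Design"},
    "item_type": {"Press Release", "Blog Post"},
    "region": {"North America", "Europe", "Asia"},
}


@dataclass
class NewsItem:
    """News content type: free-text fields plus taxonomy-controlled fields."""
    title: str
    author: str
    published: date
    source: str
    client: str
    topic: str
    item_type: str
    region: str

    def __post_init__(self):
        # Reject any controlled field whose value is not in its taxonomy.
        for field_name, allowed in TAXONOMIES.items():
            value = getattr(self, field_name)
            if value not in allowed:
                raise ValueError(f"{value!r} is not a valid {field_name}")


# Hypothetical example item.
item = NewsItem(
    title="Sample headline",
    author="Sample Author",
    published=date(2019, 1, 15),
    source="Company Blog",
    client="Sample Client",
    topic="Taxonomy Design",
    item_type="Blog Post",
    region="Europe",
)
print(item.topic, item.region)
```

Because the field set lives in one definition, every news item is created and tagged the same way.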
51. TAXONOMY AND CONTENT TYPES
Taxonomy and Content Types help streamline content creation, allowing content authors to focus on entering
content in a standardized way, tagging it with the taxonomy, and getting it published.
Content Creation Content Publishing
Taxonomy
Taxonomy Design
52. MAINTENANCE AND EVOLUTION
• Establish clear taxonomy governance:
  - Policies and Procedures
  - Roles and Responsibilities
  - Communications, Education, and Marketing
• Maintain the Taxonomy Team to guide future development
• Continuously reexamine the taxonomy
• Establish mechanisms to gather user feedback and respond to it in a timely manner
Most of the work in an average taxonomy project will take place within the Maintenance and Evolution Stage.
No initial rollout of a taxonomy will yield 100% perfection. Striving for that will only delay your project and risk your sanity. By preparing for this ongoing work, you ensure the hard work of the project team will not be lost. With the correct mechanisms in place, the team can respond to user feedback and bring the taxonomy closer to 100% perfection over time.
53. TAXONOMY GOVERNANCE
• Define a customized governance model (loose or tight, centralized or decentralized, etc.) that addresses:
  - Roles and responsibilities;
  - Policies and procedures; and
  - Communications and education.
54. TAXONOMY METRICS
Alignment Metrics
• Most Used Terms for Tagging
• Least Used Terms for Tagging
• Usage for Recently Added or Modified Terms
Usability Metrics
• Most Used Search Terms
• Least Used Taxonomical Terms Found in Search
Completeness Metrics
• Number of Taxonomy Requests by Type (New, Modification, Deletion)
• Number of Requests by Taxonomy
• Number of Taxonomy Requests by Business Users / Department
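The alignment metrics can be computed from a simple export of which terms are applied to which content. The following is a minimal sketch: the `TAGGED_CONTENT` mapping and document identifiers are hypothetical sample data, and a real system would pull them from its own tagging store.

```python
from collections import Counter

# Terms of one taxonomy (a Content Type list is used here as the example).
TAXONOMY_TERMS = ["Policy", "Procedure", "Proposal", "Report", "Templates"]

# Hypothetical export: content identifier -> terms applied to it.
TAGGED_CONTENT = {
    "doc-1": ["Policy"],
    "doc-2": ["Policy", "Report"],
    "doc-3": ["Report"],
    "doc-4": ["Policy", "Procedure"],
}


def term_usage(tagged_content, terms):
    """Count how often each taxonomy term is applied, keeping unused terms at 0."""
    usage = Counter({term: 0 for term in terms})
    for tags in tagged_content.values():
        # Ignore tags that are not part of this taxonomy.
        usage.update(tag for tag in tags if tag in usage)
    return usage


usage = term_usage(TAGGED_CONTENT, TAXONOMY_TERMS)
most_used = max(usage, key=usage.get)
unused = sorted(term for term, count in usage.items() if count == 0)
print(most_used, unused)
```

Terms that are never used are candidates for merging or removal, while heavily used terms may need to be split into more specific values.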
55. AUTOMATION
Automation is the use of systems of instruction to carry out a repeated set of processes to spare humans from doing that same set of processes.
Machine Learning
• Classification
• Prediction
• Regression
• Clustering
Natural Language Processing
• Entity Extraction
• Auto-Tagging
• Grammatical Dependencies
• Language Detection
• Summarization
• Language Generation
• Translation
• Dependencies
• Sentiment Analysis
Migration & Cleanup
• Extract, Transform, Load (ETL)
• Data Quality Checks
• Quality Assurance Checks/Processes
57. TAXONOMY DESIGN EXAMPLE 1
Business Taxonomy
• Audience
• Content Type
• Industry
• Language
• Location
• Topic
Content Type
• Policy
• Procedure
• Proposal
• Report
• Templates
• …
Project Type
• Internal
• External
Language
• English
• French
• Spanish
• …
58. TAXONOMY DESIGN EXAMPLE 2
Business Taxonomy
• Solution
• Approach/Offering
• Technology/Platform
• Industry
• Service Line
• Content Type
• Project Type
• Topic
• Region
Industry
• Agriculture
• Construction
• Education
• Health Care
• Oil and Gas
• …
Project Type
• Internal
• External
• …
Region
• North America
• Europe
• Asia
• …