© Concept Searching 2018
www.conceptsearching.com
marketing@conceptsearching.com
Twitter @conceptsearch
Graham Simms
Director of Delivery and Consulting Services
Concept Searching
grahams@conceptsearching.com
Why You Need Intelligent Metadata and
Auto-classification in Records Management
Graham Simms – Director of Delivery and Consulting Services
at Concept Searching has over 20 years’ experience in the IT
industry, and received his Bachelor’s degree in Computer Science
from the Victoria University of Manchester, where he studied under
Professor Steve Furber, developer of the ARM processor. He is an
expert in information retrieval and automated document
classification solutions, and has developed taxonomies and
classification systems for several companies, including AT&T, BP,
DAI, and the US Air Force.
Agenda
• Who we are and what we do
• Why we all agree that metadata is key to records management
• Why manual metadata doesn’t work
• Auto-classification as a solution
• Pros and cons of auto-classification types
• Case studies
• Conclusions
• Company founded in 2002
• Product launched in 2003
• Focus on management of structured and unstructured information
• Profitable, debt free
• Technology Platform
• Delivered as a web service
• Automatic concept identification, content tagging, auto-classification,
taxonomy management
• Only statistical vendor that can extract conceptual metadata
• 9 years KMWorld ‘100 Companies that Matter in Knowledge Management’
• 9 years KMWorld ‘Trend Setting Product’
• Authority to Operate enterprise-wide: US Air Force, NETCON US Army,
and Canadian SLSA
• Client base: Fortune 500/1000 organizations in Healthcare,
Financial Services, Manufacturing, Energy, Professional Services,
Pharmaceutical, Public sector and DoD
• Microsoft Gold Certification in Application Development
• Member of SharePoint PAC and TAP programs
• Suitable for all versions of SharePoint on-premises and SharePoint Online,
including the vNext dedicated platform and the government cloud
The Global Leader in
Managed Metadata Solutions
Why Are We Different?
It’s all about metadata
• Unique IP compound term processing
• Identifies multi-word terms that form
a complex entity
• Ambiguity inherent in single words
is eliminated
• Works in any language, regardless of
grammar or linguistic style
• Generates non-subjective metadata
based on an understanding of
conceptual meaning
• Vocabulary and assumptions based on paper fail
to scale
• Confidence in records management, eDiscovery,
security, privacy, and compliance remains low
• Records managers must develop new lifecycle
roadmaps to include on-premises, cloud, and
mobile
• Technical challenges limit the ability of records
managers to make strategic decisions
• Older records management applications have not
kept pace
• Preservation of older records should be a
concern
• Social media may carry legal and regulatory
risk, and provides little to no metadata
• Email
The Changing Role of Records Management
Innovation or Disruption?
SharePoint Metadata
• Many different field types
• Term Store introduced in 2010
• Provides infrastructure for taxonomy management
• Managed metadata properties designed for hierarchical metadata
• Integrated with search via the refinement panel
• Labels introduced in Office 365
What Does Metadata Impact?
How Is it Supposed to Work?
“Information which is not communicated is valueless, and
information that cannot be found is similarly useless.”
Robek, Brown, and Stephens
Records Management, Fourth Edition
A manual metadata approach will fail 95%+ of the time
What’s the Problem?
Metadata Tagging – the Problem
Ineffective Capture → Manage (X) → Store (X) → Preserve (X) → Deliver (X)
Manual Meta-tagging Problems
• Created from a subjective frame of reference
• May not be in line with corporate governance
• Limits document transparency in an ECM environment
• Risks repercussions from noncompliance, hampers eDiscovery, may expose
private or sensitive information, and degrades enterprise search
• Not cost-effective
Metadata Tagging – a Solution
Effective Capture → Manage → Store → Preserve → Deliver
Solution to Manual Meta-tagging Problems
• Taxonomy management – organizational file plan/folder structure
• Automatic metadata generation – produce highly relevant corporate
metadata
• Automatic document meta-tagging – eliminate all manual meta-tagging
costs
• Auto-classification of all documents – organize all content to
organizational standard
“There is a debilitating disconnect between the proliferation of electronic
information and the constant need to quickly and accurately find all of the
information and expertise that is essential for work every day. From top to
bottom, enterprises have failed to take seriously the high cost of being grossly
inadequate at finding information, data, documents, experts. Instead they have
settled for low performance, low-return techniques to… sort of handle Search.”
Julie Hunt
Search Consultant
Taxonomies
• Hierarchical representation of entities of
interest in an organization
• Primary tool to provide structure to
unstructured data
• Front end and/or back end functionality
• Actualized through metadata
• Business taxonomies
• Tend to be less rigid and constrained
• Usability – minimize clicks
• Content driven
• Allows flexibility and redundancy
• Provides a single methodology for
classification (categorization)
• Provides for entity extraction using NLP
Auto-classification
Types of Classification Metadata
Intrinsic – information that can be extracted
directly from an object (file name, size)
Administrative/Management – information
used to manage the document (author, date
created, date to be reviewed)
Descriptive – information that describes
the object (title, subject, audience)
Semantic – ability to extract concepts from
within content and generate the metadata
(intelligent metadata)
Content-Based – weight given to a
particular subject in a document
determines the class to which the
document is assigned
Request-Based – sometimes referred
to as indexing, classification in which
the anticipated requests from users
influence how documents are
classified
Policy-Based – classification that is
aimed at a particular audience or user
group
The History of Auto-classification
Automatic Document Classification
Supervised – some external
mechanism, such as human feedback,
provides information on the correct
classification
Unsupervised – also known as
document clustering, where the
classification has no reference to
external information
Semi-supervised – where only some of the
documents are labeled by the external
mechanism and the rest are left unlabeled
Taxonomies and thesauri are the foundation of an auto-classifier.
They provide the vocabulary against which rules are built and
‘teach’ the machine how to ‘understand’ and categorize content
Statistical
• Often use Bayes’ theorem: measures ‘degrees of belief’
(or ‘degrees of aboutness’)
• Use frequency and location to determine important or useful
concepts
• Feed the system example text for the specific category
• Statistically identifies and extracts significant keywords and
patterns
• Document training sets
• Match word/concept patterns to categories
• Often need training sets of 50 or more documents
• Poor document choice can cause pollution/noise
• Drawbacks
• Effort required to create the training set
• Relies on the availability of keyword-rich text
• Problems are hard to diagnose
Auto-classification Systems – Statistical
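The statistical approach described on this slide can be illustrated with a toy naive Bayes classifier. This is a minimal sketch in Python, not the algorithm of any particular product: the categories, the training sentences, and the function names (`tokenize`, `train`, `classify`) are all invented for illustration, and a real training set would need far more documents per category.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Lowercase and split on whitespace; real systems do far more preparation.
    return text.lower().split()

def train(examples):
    """Build word and document counts from (category, text) training pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocab = set()
    for category, text in examples:
        doc_counts[category] += 1
        for word in tokenize(text):
            word_counts[category][word] += 1
            vocab.add(word)
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}

def classify(model, text):
    """Return (best_category, log_scores) using Bayes' theorem with add-one smoothing."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    scores = {}
    for category, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][category]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # prior belief in the category
        for word in tokenize(text):
            if word not in model["vocab"]:
                continue  # words never seen in training carry no evidence
            score += math.log((counts[word] + 1) / (total_words + vocab_size))
        scores[category] = score
    return max(scores, key=scores.get), scores

# Hypothetical training set: two categories, two example texts each.
TRAINING = [
    ("contracts", "supplier contract renewal terms and signature"),
    ("contracts", "contract amendment signed by both parties"),
    ("hr", "employee leave request and payroll record"),
    ("hr", "payroll adjustment for employee benefits"),
]

model = train(TRAINING)
label, scores = classify(model, "signed supplier contract")
print(label)
```

With so little training text, the result depends entirely on which keywords happen to appear, which mirrors the drawbacks listed above: the classifier needs keyword-rich examples, and a poorly chosen example can skew a whole category.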
Rule-Based
• Rely on Boolean (and, or, not) categorization rules
to find either positive or negative evidence of a
match to a category
• More control over behavior – More work!
• Success depends on quality of rules
• Example: (Google OR Salesforce) NOT LinkedIn
• Drawbacks
• Dependent on the richness of the taxonomy and collection of
synonyms/keywords
• Creating and/or tweaking the rules for each category – can be onerous
Most popular taxonomy management suites include auto-classification modules
• With few exceptions, taxonomy tools are generally rule-based systems
Auto-classification Systems – Rule-Based
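The example rule on this slide, (Google OR Salesforce) NOT LinkedIn, can be expressed as a small composable predicate. The following Python sketch is illustrative only: the helper names (`has`, `any_of`, `all_of`, `not_`, `matches`) are hypothetical, and NOT is read here as "AND NOT", which is an assumption about the intended rule syntax.

```python
import re

def terms_in(text):
    # Reduce a document to its set of lowercase word tokens.
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def has(term):
    # Positive evidence: the term appears in the document.
    return lambda words: term.lower() in words

def any_of(*rules):
    # Boolean OR over sub-rules.
    return lambda words: any(rule(words) for rule in rules)

def all_of(*rules):
    # Boolean AND over sub-rules.
    return lambda words: all(rule(words) for rule in rules)

def not_(rule):
    # Negative evidence: the sub-rule must not match.
    return lambda words: not rule(words)

# (Google OR Salesforce) NOT LinkedIn
RULE = all_of(any_of(has("Google"), has("Salesforce")), not_(has("LinkedIn")))

def matches(rule, text):
    return rule(terms_in(text))

print(matches(RULE, "Google announced a new product"))       # True
print(matches(RULE, "Salesforce and LinkedIn partnership"))  # False
```

Every category needs a rule like this, written and maintained by hand, which is the "more control, more work" trade-off noted above.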
Linguistic
• No commitment to a taxonomic tree, based on parts of
speech and their relationships, typically not scalable
• Related to parts of speech, syntactic parses, or
semantic interpretations
Machine Learning
• Subfield of computer science (CS) and artificial
intelligence (AI) that deals with the construction and
study of systems that can learn from data, rather than
follow only explicitly programmed instructions
Semantic Networks
• Refers to a set of relationships between concepts and
words, including parts of speech and
real-world relationships
• These can include rules of various types – not just
Boolean
Auto-classification Systems – Other
Pros and Cons of Most Widely Used
Classification Techniques
Statistical | Rule-based
Work involved in building good training sets | Work involved in building exhaustive rules (mitigated by taxonomy tools)
If there’s a problem, can be difficult to diagnose and rectify/retrain | If there’s a problem, go back to the rule set and tweak
Machine learning can augment accuracy or lead to pollution (accuracy can wax and wane) | System doesn’t evolve without new rules, but high degree of control (accuracy mostly increases)
• Most widely used are statistical and rule-based
• Several are a combination of both statistical and rule-based
Auto-classification Systems – What Do They Do?
Document Preparation
• Split into language blocks (paragraphs, headings), formatting, layout
Parsing
• Entity extraction
• NLP: parts of speech, phrases
• Terms, variants
Weighting
• Frequency
• Location in text, phrase
• Proximity
• Combination
• Format of text
Classification
• If threshold reached
• Can influence search results
This is where rules vs statistics come into play…
Not all classification solutions are created equal!
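The four stages on this slide (preparation, parsing, weighting, classification) can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the first line is treated as the heading, a heading hit counts three times as much as a body hit, the threshold is 4.0, and the clue terms are invented. Proximity and text format, which the slide also lists as weighting factors, are left out to keep the sketch short.

```python
HEADING_WEIGHT = 3.0  # assumed: a clue term in a heading counts more
BODY_WEIGHT = 1.0     # assumed: baseline weight for body text
THRESHOLD = 4.0       # assumed cut-off for assigning the category

def prepare(document):
    """Document preparation: split into (kind, text) blocks.

    The first non-empty line is treated as the heading, the rest as body.
    """
    lines = [line.strip() for line in document.splitlines() if line.strip()]
    return [("heading" if i == 0 else "body", line) for i, line in enumerate(lines)]

def score(document, clue_terms):
    """Weighting: combine term frequency with location in the text."""
    total = 0.0
    for kind, text in prepare(document):
        weight = HEADING_WEIGHT if kind == "heading" else BODY_WEIGHT
        lowered = text.lower()
        for term in clue_terms:
            # str.count matches substrings, so this is a crude stand-in for parsing.
            total += weight * lowered.count(term.lower())
    return total

def classify(document, clue_terms, threshold=THRESHOLD):
    """Classification: assign the category only if the threshold is reached."""
    return score(document, clue_terms) >= threshold

DOC = (
    "Retention schedule update\n"
    "The retention schedule for invoices changes in May.\n"
    "Invoices are kept for seven years."
)
CLUES = ["retention schedule", "invoices"]

print(score(DOC, CLUES))     # 6.0
print(classify(DOC, CLUES))  # True
```

In a rule-based system the clue terms and weights are set by hand; in a statistical system they would be learned from training documents.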
• Concept Searching’s unique statistical concept identification underpins all technologies
• Multi-word suggestion is explicitly more valuable than single term suggestion algorithms
Concept Searching has a unique approach to ensure success
• conceptClassifier will generate conceptual metadata by
extracting multi-word terms that identify ‘triple heart bypass’
as a concept as opposed to single keywords
• Metadata can be used by any search engine index or any
application/process that uses metadata
Concept Searching Provides Automatic Concept Term Extraction
• Triple: Baseball, Three
• Heart: Organ, Center
• Bypass: Highway, Avoid
Building a Records Management Concept Index – Example
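The idea that a multi-word term is less ambiguous than its individual words can be shown with a simple n-gram count. This Python sketch is not Concept Searching's compound term processing, which is proprietary and not described in these slides; it is a generic frequency-based stand-in, and the sample sentences, the function names (`ngrams`, `compound_terms`), and the `min_count` cut-off are all assumptions.

```python
import re
from collections import Counter

def ngrams(words, n):
    # All runs of n consecutive words, joined into a single string.
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def compound_terms(documents, max_len=3, min_count=2):
    """Return multi-word sequences (2..max_len words) seen at least min_count times."""
    counts = Counter()
    for doc in documents:
        words = re.findall(r"[a-z]+", doc.lower())
        for n in range(2, max_len + 1):
            counts.update(ngrams(words, n))
    return {term: c for term, c in counts.items() if c >= min_count}

# Hypothetical documents: 'triple', 'heart', and 'bypass' each appear in
# unrelated senses, but only the medical phrase recurs as a unit.
DOCS = [
    "The patient had a triple heart bypass last year.",
    "Recovery after triple heart bypass takes weeks.",
    "He hit one triple in the baseball game.",
    "Take the highway bypass to avoid traffic.",
]

terms = compound_terms(DOCS)
longest = max(terms, key=len)
print(longest)  # triple heart bypass
```

On its own, "bypass" matches both the medical and the highway documents; the three-word term matches only the medical ones.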
Designed for the Business Professional
Unique to conceptTaxonomyManager
• Compound term processing technology that
identifies ‘concepts in context’
• Automatic intelligent metadata generation as
content is created or ingested
• Rule-based engine that eliminates the need
for training sets and highly specialized human
resources
• Automatic taxonomy node clue suggestion
• Dynamic screen updating to immediately see
impact of changes in the taxonomy
• Document movement feedback to see cause
and effect of changes without re-indexing
There are more than 14,000 laws and regulations related to information management, many of
which can be challenging to enforce across an enterprise-size IT infrastructure.
When Does Records Management Become
Information Governance?
• Records Management involves the implementation
of a process or system for directing and controlling
an organization's information (records)
• Information Governance is the strategy or
framework for controlling information (records) in a
way which encourages compliance, mitigates legal
risks, and aligns to corporate governance policies
• Holistic approach to manage information in all
its formats and forms, regardless of where it
is stored or how it was acquired
• Beyond Records Management – transitioning to
information governance
• Risk assessment
• Legal mitigation
• Defensible audit
• Process
“So called ‘data breaches’ are thefts of
information and, as such, they are first and
foremost a traditional records management
problem. Until organizations understand this and
include records management as a critical
component of their long-term cybersecurity
strategy, data breaches – and the disastrous
consequences they bring – will continue
unabated.”
Don Lueders
Advantages
• Ability to develop a single repository of organizationally relevant
metadata to be made available to any application that requires the use
of metadata
• Elimination of costs and errors associated with end user tagging
• Normalization of content across functional and geographic boundaries
to remove ambiguity in vocabulary
• Metadata managed and changed in one place
• Ability to apply policy consistently across diverse repositories and
applications
• Provide flexibility to rapidly make changes to the repository for
regulatory compliance where changes are immediately available for
use by applications
• Works bidirectionally with the SharePoint Term Store, reading and
writing in real time
The Value of Semantic, Multi-term Metadata
Records Management
Situation:
• Nonprofit public benefit corporation
• Highly regulated
• Relied heavily on web site to address the unique
requirements of diverse audiences, updated daily
• Unable to implement content lifecycle management
Challenges:
• Erroneous tagging of documents
• Poor information retrieval
• Needed to improve site visitor experience
• Management of the complexity and amount of content
and data
• Leverage SharePoint investment
Products Used:
• conceptClassifier platform
• conceptClassifier for SharePoint
• conceptTaxonomyWorkflow
Achievements
• Automates document workflow for
storage, preservation, access, and usage
controls and eliminates end user tagging
• Assists in the management of content by
identifying records as well as content that
should be archived or contains sensitive
information
• Facilitates the retrieval of records as well
as highly correlated content that typically
would not be found
• Ensures compliance with industry and
government mandates enabling rapid
implementation to address regulatory
changes
• Native integration with SharePoint and
the Term Store, maintains GUIDs
Identification of Sensitive and Privacy Information
Situation:
• US Government Military Agency
Challenges:
• Erroneous tagging of documents
• Prevent data exposures of privacy, sensitive, and
confidential information
Products Used:
• conceptClassifier platform
• Hosted by Serco
Outcome:
• The solution scans both the Internet and intranet, and
compiles a report of PII erroneously present on
unprotected portals
Achievements
• Ability to proactively take action on the
potential data breaches
• Removes content from unauthorized access
• Prevents portability
• Eliminated end user tagging
• Not transmittable via email without
specific authorization
• No breach for 11 years
• Increased productivity – 2,500 record
codes
• Processed information faster and
achieved higher accuracy
Situation:
• Budget of $6.9 Billion
• Over 60,000 users
• Runs 75 hospitals and clinics providing care to more than 2.6 million
beneficiaries
Challenge:
• Data Privacy
• Intelligent Migration
• Before and after
• Records Management
• 72,000 Site Collections, 5,300 retention codes, classify 200,000
documents per hour with minimum resources (Proof of Concept)
Solution:
• conceptClassifier for SharePoint platform
Benefits:
• Automatic tagging based on organizational vocabulary and descriptors
• Automatic routing and the ability to change the SharePoint content type
• Eliminated manual tagging; removes content from unauthorized access and
prevents portability
• No security exposures or breaches in the 5 years since deployment
The US Air Force deployed the
technologies to implement data
privacy protection processes and
after five years has not had a
data breach
Automatic Tagging, Policy, and Governance
• All information is automatically tagged resulting in the
classification of unstructured data to organizational taxonomies
• All information is retrievable using concepts (high-precision)
instead of keywords, proximity, or full text (low-precision)
• Cleanses file shares, SharePoint, Exchange, and any repository
• Identifies and protects privacy and sensitive information
exposures in real time
• General Data Protection Regulation (GDPR) compliance
• Insight engine feeds search engine index enabling
concept-based searching
• Reduces the risks and costs associated with eDiscovery
• Works interactively with records management applications to
identify and classify records based on the file plan, automatically
classifies them to a SharePoint content type, processes to
records management application
• Provides secure collaboration based on the contents within
documents to be shared
• Knowledge management, research, text mining and analytics
What Are the Results?
Thank You

Optimize and Organize Your Content with conceptClassifier for File Shares
 

Dernier

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Dernier (20)

Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 

FEDSPUG Meeting: Intelligent Metadata and Auto-classification in Records Management

  • 1. © Concept Searching 2018 www.conceptsearching.com marketing@conceptsearching.com Twitter @conceptsearch Graham Simms Director of Delivery and Consulting Services Concept Searching grahams@conceptsearching.com Why You Need Intelligent Metadata and Auto-classification in Records Management
  • 2. Graham Simms – Director of Delivery and Consulting Services at Concept Searching has over 20 years’ experience in the IT industry, and received his Bachelor’s degree in Computer Science from the Victoria University of Manchester, where he studied under Professor Steven Furber, developer of the ARM processor. He is an expert in information retrieval and automated document classification solutions, and has developed taxonomies and classification systems for several companies, including AT&T, BP, DAI, and the US Air Force.
  • 3. Agenda • Who we are and what we do • Why we all agree that metadata is key to records management • Why manual metadata doesn’t work • Auto-classification as a solution • Pros and cons of auto-classification types • Case studies • Conclusions
  • 4. The Global Leader in Managed Metadata Solutions • Company founded in 2002 • Product launched in 2003 • Focus on management of structured and unstructured information • Profitable, debt free • Technology Platform • Delivered as a web service • Automatic concept identification, content tagging, auto-classification, taxonomy management • Only statistical vendor that can extract conceptual metadata • 9 years KMWorld ‘100 Companies that Matter in Knowledge Management’ • 9 years KMWorld ‘Trend Setting Product’ • Authority to Operate enterprise wide US Air Force, NETCON US Army, and Canadian SLSA • Client base: Fortune 500/1000 organizations in Healthcare, Financial Services, Manufacturing, Energy, Professional Services, Pharmaceutical, Public sector and DoD • Microsoft Gold Certification in Application Development • Member of SharePoint PAC and TAP programs • Suitable for all versions of SharePoint on-premises and SharePoint Online, including the vNext dedicated platform and the government cloud
  • 5. Why Are We Different? It’s all about metadata • Unique IP compound term processing • Identifies multi-word terms that form a complex entity • Ambiguity inherent in single words is eliminated • Works in any language, regardless of grammar or linguistic style • Generates non-subjective metadata based on an understanding of conceptual meaning
  • 7. The Changing Role of Records Management: Innovation or Disruption? • Vocabulary and assumptions based on paper fail to scale • Confidence in records management, eDiscovery, security, privacy, and compliance remains low • Records managers must develop new lifecycle roadmaps to include on-premises, cloud, and mobile • Technical challenges tie the hands of records managers when making strategic decisions • Older records management applications have not kept pace • Preservation of older records should be a concern • Social media may carry legal and regulatory risk and provides little to no metadata • Email
  • 9. SharePoint Metadata • Many different field types • Term Store introduced in 2010 • Provides infrastructure for taxonomy management • Managed metadata properties designed for hierarchical metadata • Integrated with search via the refinement panel • Labels introduced in Office 365
  • 10. What Does Metadata Impact?
  • 11. How Is it Supposed to Work?
  • 12. “Information which is not communicated is valueless, and information that cannot be found is similarly useless.” Robek, Brown, and Stephens, Records Management, Fourth Edition
  • 13. What’s the Problem? A manual metadata approach will fail 95%+ of the time
  • 14. Metadata Tagging – the Problem [Diagram: Ineffective Capture, with Manage, Store, Preserve, and Deliver each marked X] Manual Meta-tagging Problems • Created from a subjective frame of reference • May not be in line with corporate governance • Limits document transparency in an ECM environment • Repercussions from noncompliance: impacts eDiscovery, risks exposure of private or sensitive information, degrades enterprise search • Not cost-effective
  • 15. Metadata Tagging – a Solution [Diagram: Effective Capture, then Manage, Store, Preserve, Deliver] Solution to Manual Meta-tagging Problems • Taxonomy management – organizational file plan/folder structure • Automatic metadata generation – produce highly relevant corporate metadata • Automatic document meta-tagging – eliminate all manual meta-tagging costs • Auto-classification of all documents – organize all content to organizational standard
  • 16. “There is a debilitating disconnect between the proliferation of electronic information and the constant need to quickly and accurately find all of the information and expertise that is essential for work every day. From top to bottom, enterprises have failed to take seriously the high cost of being grossly inadequate at finding information, data, documents, experts. Instead they have settled for low performance, low-return techniques to… sort of handle Search.” Julie Hunt Search Consultant
  • 17. Taxonomies • Hierarchical representation of entities of interest in an organization • Primary tool to provide structure to unstructured data • Front end and/or back end functionality • Actualized through metadata • Business taxonomies • Tend to be less rigid and constrained • Usability – minimize clicks • Content driven • Allows flexibility and redundancy • Provides a single methodology for classification (categorization) • Provides for entity extraction using NLP
  • 18. Auto-classification
  • 19. Types of Classification Metadata • Intrinsic – information that can be extracted directly from an object (file name, size) • Administrative/Management – information used to manage the document (author, date created, date to be reviewed) • Descriptive – information that describes the object (title, subject, audience) • Semantic – ability to extract concepts from within content and generate the metadata (intelligent metadata)
  • 20. The History of Auto-classification • Content-Based – weight given to a particular subject in a document determines the class to which the document is assigned • Request-Based – sometimes referred to as indexing; classification in which the anticipated requests from users influence how documents are classified • Policy-Based – classification that is aimed at a particular audience or user group
  • 21. Automatic Document Classification • Supervised – some external mechanism, such as human feedback, provides information on the correct classification • Unsupervised – also known as document clustering, where the classification has no reference to external information • Semi-supervised – where only some of the documents are labeled by an external mechanism, such as human intervention, and the rest are unlabeled
  • 22. Auto-classification Systems – Statistical • Taxonomies and thesauri are the foundation of an auto-classifier. They provide the vocabulary against which rules are built and ‘teach’ the machine how to ‘understand’ and categorize content • Statistical • Often use Bayes’ theorem: measures ‘degrees of belief’ (or ‘degrees of aboutness’) • Use frequency and location to determine important or useful concepts • Feed the system example text for the specific category • Statistically identifies and extracts significant keywords and patterns • Document training sets • Match word/concept patterns to categories • Often need sets of 50 or more documents • Poor document choice can cause pollution/noise • Drawbacks • Effort required to create the training set • Relies on the availability of keyword-rich text • Problems are hard to diagnose
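The statistical approach on this slide can be illustrated with a small naive Bayes text classifier. This is a minimal sketch using only the Python standard library; it is not Concept Searching's algorithm, and the function names, the add-one smoothing, and the four-document training set are assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real parsing."""
    return text.lower().split()


def train(labeled_docs):
    """Count words per category from (text, category) training pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, category in labeled_docs:
        doc_counts[category] += 1
        word_counts[category].update(tokenize(text))
    return word_counts, doc_counts


def classify(text, word_counts, doc_counts):
    """Return the category with the highest log posterior under Bayes' theorem."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    scores = {}
    for category, counts in word_counts.items():
        # Prior: share of training documents that belong to this category.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(counts.values())
        for word in tokenize(text):
            # Add-one smoothing so an unseen word does not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        scores[category] = score
    return max(scores, key=scores.get)


# Hypothetical training set; the slide suggests 50 or more documents per category.
training_set = [
    ("invoice payment due amount", "finance"),
    ("payment received invoice number", "finance"),
    ("patient diagnosis treatment plan", "healthcare"),
    ("patient record treatment notes", "healthcare"),
]
word_counts, doc_counts = train(training_set)
print(classify("overdue invoice payment", word_counts, doc_counts))
```

The sketch also shows why the slide lists training-set quality as a drawback: every score comes from word counts in the example documents, so a poorly chosen example shifts those counts for its whole category.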
  • 23. Auto-classification Systems – Rule-Based • Rely on Boolean (and, or, not) categorization rules to find either positive or negative evidence of a match to a category • More control over behavior, but more work • Success depends on quality of rules • Example: (Google OR Salesforce) NOT LinkedIn • Drawbacks • Dependent on the richness of the taxonomy and collection of synonyms/keywords • Creating and/or tweaking the rules for each category can be onerous • Most popular taxonomy management suites include auto-classification modules • With few exceptions, taxonomy tools are generally rule-based systems
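The example rule on this slide, (Google OR Salesforce) NOT LinkedIn, can be written as a short check. This is a minimal sketch in standard-library Python; the `any_of` / `none_of` rule shape and the function names are assumptions for illustration, not the rule syntax of any particular taxonomy tool.

```python
import re


def terms_in(text):
    """Return the set of lowercase word tokens found in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches_rule(text, any_of, none_of):
    """True if the text has at least one `any_of` term and no `none_of` term."""
    found = terms_in(text)
    has_positive = any(term in found for term in any_of)
    has_negative = any(term in found for term in none_of)
    return has_positive and not has_negative


# The slide's rule: (Google OR Salesforce) NOT LinkedIn
rule = {"any_of": ["google", "salesforce"], "none_of": ["linkedin"]}

print(matches_rule("Quarterly review of the Salesforce rollout", **rule))  # True
print(matches_rule("Google and LinkedIn ad spend", **rule))  # False
```

When a document lands in the wrong category, the fix is a visible change to `any_of` or `none_of`, which is the control the slide describes. The cost is that every category needs its own hand-maintained term lists.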
  • 24. Auto-classification Systems – Other • Linguistic • No commitment to a taxonomic tree; based on parts of speech and their relationships; typically not scalable • Related to parts of speech, syntactic parses, or semantic interpretations • Machine Learning • Subfield of computer science (CS) and artificial intelligence (AI) that deals with the construction and study of systems that can learn from data, rather than follow only explicitly programmed instructions • Semantic Networks • Refers to a set of relationships between concepts and words, including parts of speech and real-world relationships • These can include rules of various types – not just Boolean
  • 25. Pros and Cons of Most Widely Used Classification Techniques • Most widely used are statistical and rule-based • Several are a combination of both statistical and rule-based • Statistical: work involved in building good training sets; if there’s a problem, it can be difficult to diagnose and rectify/retrain; machine learning can augment accuracy or lead to pollution (accuracy can wax and wane) • Rule-based: work involved in building exhaustive rules (mitigated by taxonomy tools); if there’s a problem, go back to the rule set and tweak; system doesn’t evolve without new rules, but high degree of control (accuracy mostly increases)
  • 26. Auto-classification Systems – What Do They Do? • Document Preparation: split into language blocks (paragraphs, headings), formatting, layout • Parsing: entity extraction; NLP (parts of speech, phrases); terms, variants • Weighting: frequency; location in text, phrase; proximity; combination; format of text • Classification: if threshold reached; can influence search results • This is where rules vs statistics come into play. Not all classification solutions are created equal!
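The weighting and classification stages on this slide can be sketched as follows: count how often each category's clue terms occur, give occurrences in the title more weight than occurrences in the body, and assign a category only when its score reaches a threshold. The weights, the threshold, and the two-category taxonomy are all hypothetical values chosen for illustration.

```python
def score_category(title, body, clues, title_weight=3.0, body_weight=1.0):
    """Sum weighted clue occurrences; title hits count more than body hits."""
    title_words = title.lower().split()
    body_words = body.lower().split()
    score = 0.0
    for clue in clues:
        score += title_weight * title_words.count(clue)  # location weighting
        score += body_weight * body_words.count(clue)    # frequency weighting
    return score


def classify(title, body, taxonomy, threshold=4.0):
    """Return (category, score) pairs that reach the threshold, best first."""
    scored = {name: score_category(title, body, clues) for name, clues in taxonomy.items()}
    hits = [(name, s) for name, s in scored.items() if s >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical taxonomy: category name -> clue terms.
taxonomy = {
    "contracts": ["contract", "agreement", "renewal"],
    "invoices": ["invoice", "payment"],
}
result = classify(
    "Supplier contract renewal",
    "the contract term ends in june and payment is due on renewal",
    taxonomy,
)
print(result)
```

Here "contracts" scores 8.0 (two title hits at weight 3 plus two body hits at weight 1) and is assigned, while "invoices" scores 1.0 and falls below the threshold. Proximity and text-format weighting, which the slide also lists, are left out of this sketch.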
  • 27. Building a Records Management Concept Index – Example • Concept Searching Provides Automatic Concept Term Extraction • Concept Searching’s unique statistical concept identification underpins all technologies • Multi-word suggestion is explicitly more valuable than single term suggestion algorithms • Concept Searching has a unique approach to ensure success • conceptClassifier will generate conceptual metadata by extracting multi-word terms that identify ‘triple heart bypass’ as a concept as opposed to single keywords • Metadata can be used by any search engine index or any application/process that uses metadata • [Diagram: single keywords and their competing senses: Triple (Baseball, Three), Heart (Organ, Center), Bypass (Highway, Avoid)]
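The idea of treating ‘triple heart bypass’ as one concept rather than three ambiguous keywords can be illustrated with simple n-gram counting. This is only a sketch of the general multi-word term idea: Concept Searching's compound term processing is proprietary and is not reproduced here, and the stopword list, the function names, and the two sample sentences are assumptions.

```python
from collections import Counter

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "after", "was", "is"}


def candidate_terms(text, max_len=3):
    """Yield runs of 2..max_len consecutive non-stopwords as candidate terms."""
    words = [w.strip(".,;:").lower() for w in text.split()]
    for size in range(2, max_len + 1):
        for start in range(len(words) - size + 1):
            window = words[start:start + size]
            if all(w and w not in STOPWORDS for w in window):
                yield " ".join(window)


def top_terms(documents, min_count=2):
    """Return (term, count) pairs for multi-word terms seen at least min_count times."""
    counts = Counter()
    for doc in documents:
        counts.update(candidate_terms(doc))
    return [(term, n) for term, n in counts.most_common() if n >= min_count]


docs = [
    "The patient was scheduled for a triple heart bypass.",
    "Recovery after a triple heart bypass is slow.",
]
print(top_terms(docs))
```

Because only multi-word runs are counted, the single words "triple", "heart", and "bypass" never appear as metadata on their own, which avoids the baseball, organ, and highway senses shown in the slide's diagram.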
  • 28. Designed for the Business Professional Unique to conceptTaxonomyManager • Compound term processing technology that identifies ‘concepts in context’ • Automatic intelligent metadata generation as content is created or ingested • Rule-based engine that eliminates the need for training sets and highly specialized human resources • Automatic taxonomy node clue suggestion • Dynamic screen updating to immediately see impact of changes in the taxonomy • Document movement feedback to see cause and effect of changes without re-indexing
  • 29. There are more than 14,000 laws and regulations related to information management, many of which can be challenging to enforce across an enterprise-size IT infrastructure.
  • 30. When Does Records Management Become Information Governance? • Records Management involves the implementation of a process or system for directing and controlling an organization's information (records) • Information Governance is the strategy or framework for controlling information (records) in a way which encourages compliance, mitigates legal risks, and aligns to corporate governance policies • Holistic approach to manage information in all its formats and forms, regardless of where it is stored or how it was acquired • Beyond Records Management – transitioning to information governance • Risk assessment • Legal mitigation • Defensible audit • Process “So called ‘data breaches’ are thefts of information and, as such, they are first and foremost a traditional records management problem. Until organizations understand this and include records management as a critical component of their long-term cybersecurity strategy, data breaches – and the disastrous consequences they bring – will continue unabated.” Don Lueders
  • 31. The Value of Semantic, Multi-term Metadata • Advantages • Ability to develop a single repository of organizationally relevant metadata to be made available to any application that requires the use of metadata • Elimination of costs and errors associated with end user tagging • Normalization of content across functional and geographic boundaries to remove ambiguity in vocabulary • Metadata managed and changed in one place • Ability to apply policy consistently across diverse repositories and applications • Provide flexibility to rapidly make changes to the repository for regulatory compliance where changes are immediately available for use by applications • Works bidirectionally with the SharePoint Term Store, reading and writing in real time
  • 33. Records Management Situation: • Nonprofit public benefit corporation • Highly regulated • Relied heavily on web site to address the unique requirements of diverse audiences, updated daily • Unable to implement content lifecycle management Challenges: • Erroneous tagging of documents • Poor information retrieval • Needed to improve site visitor experience • Management of the complexity and amount of content and data • Leverage SharePoint investment Products Used: • conceptClassifier platform • conceptClassifier for SharePoint • conceptTaxonomyWorkflow Achievements • Automates document workflow for storage, preservation, access, and usage controls and eliminates end user tagging • Assists in the management of content by identifying records as well as content that should be archived or contains sensitive information • Facilitates the retrieval of records as well as highly correlated content that typically would not be found • Ensures compliance with industry and government mandates enabling rapid implementation to address regulatory changes • Native integration with SharePoint and the Term Store, maintains GUIDs
  • 34. Identification of Sensitive and Privacy Information Situation: • US Government Military Agency Challenges: • Erroneous tagging of documents • Prevent exposure of private, sensitive, and confidential information Products Used: • conceptClassifier platform • Hosted by Serco Outcome: • The solution scans both the Internet and intranet, and compiles a report of PII erroneously present on unprotected portals Achievements • Ability to proactively take action on potential data breaches • Removes content from unauthorized access • Prevents portability • Eliminated end user tagging • Not transmittable via email without specific authorization • No breach for 11 years • Increased productivity – 2,500 record codes • Processed information faster and achieved higher accuracy
  • 35. Automatic Tagging, Policy, and Governance Situation: • Budget of $6.9 Billion • Over 60,000 users • Runs 75 hospitals and clinics providing care to more than 2.6 million beneficiaries Challenge: • Data Privacy • Intelligent Migration • Before and after • Records Management • 72,000 Site Collections, 5,300 retention codes, classify 200,000 documents per hour with minimum resources (Proof of Concept) Solution: • conceptClassifier for SharePoint platform Benefits: • Automatic tagging based on organizational vocabulary and descriptors • Automatic routing and the ability to change the SharePoint content type • Eliminated manual tagging, removes content from unauthorized access and portability • No security exposures or breaches in 5 years, since deployed • The US Air Force deployed the technologies to implement data privacy protection processes and after five years has not had a data breach
  • 36. What Are the Results? • All information is automatically tagged, resulting in the classification of unstructured data to organizational taxonomies • All information is retrievable using concepts (high-precision) instead of keywords, proximity, or full text (low-precision) • Cleanses file shares, SharePoint, Exchange, and any repository • Identifies and protects privacy and sensitive information exposures in real time • General Data Protection Regulation (GDPR) compliance • Insight engine feeds search engine index enabling concept-based searching • Reduces the risks and costs associated with eDiscovery • Works interactively with records management applications to identify and classify records based on the file plan, automatically classifies them to a SharePoint content type, and passes them to the records management application • Provides secure collaboration based on the contents within documents to be shared • Knowledge management, research, text mining and analytics
  • 38. Thank You www.conceptsearching.com marketing@conceptsearching.com Twitter @conceptsearch Graham Simms Director of Delivery and Consulting Services Concept Searching grahams@conceptsearching.com