80% of information growth today is in unstructured content and many companies are looking for ways to help identify and manage strategic records, satisfy stringent regulations, reduce the amount of content stored and increase employee productivity.
Improve ROI and Productivity with Content Cleansing and Enterprise Search
1. Improve ROI and Productivity with Content Cleansing and Enterprise Search
August 27, 2013
2. Our Speaker
Ed Rawson
Principal, ECM Practice
• 30-year veteran of Enterprise Content Management
• Presents at a number of conferences and events for associations
• Published author of white papers, co-author of two books, and blogger on ECM/CI topics
• Helps organizations across industries align content management and governance solutions with business direction to maximize the return on investment and maintain compliance
• Leverages his extensive experience to implement content lifecycle, content analytics, information governance and content intelligence programs using the latest technologies and best practices
3. Content is “EXPLODING”
• Volume: 12 terabytes of Tweets created daily
• Velocity: 5 million trade events per second
• Variety: 4 terabytes per site per day of surveillance video, on average
• 15 petabytes of new information daily
• 500 million call detail records per day
• 80% of information growth is unstructured content
• The amount of information available to organizations is exploding daily
• According to one source, it doubles every eleven hours
• Within every company there are terabytes of unstructured content
• Content is growing faster than current methods can control
• Companies are simply keeping everything, at a large cost and with compliance exposure
• Companies are exposed to much higher compliance risk and very costly discovery
• Knowledge workers spend too much time looking for the information they need to complete a task
4. Anatomy of the File Share
Typical Structure (results from customer assessments)
• ~24% of unstructured data is actively used (active, known, relevant)
• ~48% is stale: not accessed or modified for 6 months
• ~18% are duplicates
• ~6% is unknown or orphaned
• ~4% is not business related: pictures, mp3, etc.
In most cases 60% to 70% of the content in a file share is unnecessary
“The best way to reduce the amount of content – delete it”
- Sheila Childs, Research VP, Gartner
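The deck does not say which tools produced the assessment figures above. As a minimal sketch only, assuming "stale" means neither accessed nor modified for roughly 182 days and "duplicate" means byte-identical content (same SHA-256 digest), a file share could be bucketed along these lines in Python. The function names and the 182-day cutoff are illustrative assumptions, not the presenter's method.

```python
import hashlib
import time
from collections import defaultdict
from pathlib import Path

STALE_AFTER_DAYS = 182  # assumption: "6 months" approximated as 182 days


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def assess_share(root, now=None):
    """Bucket each file under root as 'duplicate', 'stale' or 'active'."""
    now = time.time() if now is None else now
    cutoff = now - STALE_AFTER_DAYS * 86400
    buckets = defaultdict(list)
    seen = set()
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        st = path.stat()  # read timestamps before hashing touches the file
        digest = file_digest(path)
        if digest in seen:
            # Same content as an earlier file: count it as a duplicate.
            buckets["duplicate"].append(path)
            continue
        seen.add(digest)
        last_touched = max(st.st_atime, st.st_mtime)
        buckets["stale" if last_touched < cutoff else "active"].append(path)
    return dict(buckets)
```

The "unknown" and "non-business related" categories are not covered here; they would need ownership metadata and file-type rules that the slides do not describe.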
5. Anatomy of the File Share (cont’d)
Types of Content
• Business records that must be retained and preserved for legal, audit and business continuity reasons.
• Knowledge records: information assets and reference material that are the basis for the day-to-day operations of the company; in some cases these information assets may be the primary source of revenue for the company.
• Electronic communications, some of which are linked to business records and must be retained and managed as such.
• “Junk Content” that should be deleted. If this is a typical company, the volume of this junk content is much larger than many would like to believe.
6. Content Cleansing
• Content cleansing is the set of methods and tools used to surface the relevant content within any unstructured data store.
• Surfacing the relevant content allows for updating current taxonomies, classifications and records retention plans, and for building a defensible disposition plan.
7. Content Cleansing Process
Diagram: Content Cleansing and Classification Process (Content Intelligence; July 23, 2013 – Content Intelligence CER)
• Inputs: existing content in file stores, file shares, cloud file shares and repositories (.PDF, XML, .pst), plus requirements, policies, taxonomies, dictionaries, etc.
• Content Inventory: file types, age, last used, volume, file size
• Content Classification Process: Analysis, Classification, Disposition
• Outputs: cleansed business information for BPM, BI/BA, Enterprise Information Management and Enterprise Search
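The inventory step in the diagram lists file types, age, last used, volume and file size. A minimal sketch of such an inventory, assuming a locally mounted share and using modification time as a stand-in for "last used", might look like the following. The function name `content_inventory` and the output layout are hypothetical and are not taken from the deck.

```python
import time
from collections import defaultdict
from pathlib import Path


def content_inventory(root, now=None):
    """Summarise files under root per extension: count, bytes, oldest age in days."""
    now = time.time() if now is None else now
    summary = defaultdict(lambda: {"files": 0, "bytes": 0, "oldest_days": 0.0})
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        st = path.stat()
        # Group by lower-cased extension so ".PDF" and ".pdf" count together.
        row = summary[path.suffix.lower() or "(none)"]
        row["files"] += 1
        row["bytes"] += st.st_size
        # Age is measured from the last modification time (an assumption).
        row["oldest_days"] = max(row["oldest_days"], (now - st.st_mtime) / 86400)
    return dict(summary)
```

A summary like this gives the analysis and classification stages a starting point, for example by showing which file types dominate the volume.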
8. Enterprise Search
• International Data Corporation (IDC) published a report, “The High Cost of Not Finding Information”. It and other studies show that knowledge workers spend at least 15 to 25% of their workday searching for information, with only half of the searches returning useful information.
9. Enterprise Search Relevancy
Source: Government Authority Survey of 287 Users with Google Search Appliance Alone and Google Search Appliance + Smartlogic Semaphore
User Search Experience (status, time, result)
• Green, 15s: Info found in first 3 results
• Yellow, 30s: Info found between results 3-10
• Red, 60s: Info not found in first page of results (10); user re-conducts search
Assumption: Users are performing 10 queries per day on average
Fully loaded cost of an employee averages $75/hour
$1.25 per 60 seconds of search versus $150 for an average 2 hours of searching for content that is never returned
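The cost figures above follow directly from the $75/hour rate. As a small worked sketch in Python, assuming the 15s, 30s and 60s search times correspond to the Green, Yellow and Red statuses in that order (the slide layout suggests this but does not state it), the per-query and per-day costs can be computed as:

```python
HOURLY_COST = 75.0    # fully loaded cost of an employee, from the slide
QUERIES_PER_DAY = 10  # average queries per user per day, from the slide


def search_cost(seconds, hourly_cost=HOURLY_COST):
    """Cost in dollars of spending the given number of seconds searching."""
    return hourly_cost * seconds / 3600


# Assumed mapping of status to search time; see the note above.
for status, seconds in [("Green", 15), ("Yellow", 30), ("Red", 60)]:
    per_query = search_cost(seconds)
    per_day = per_query * QUERIES_PER_DAY
    print(f"{status}: ${per_query:.4f} per query, ${per_day:.3f} per day")

# The two figures quoted on the slide:
print(search_cost(60))        # 60 seconds of search
print(search_cost(2 * 3600))  # 2 hours of searching
```

The last two calls reproduce the slide's $1.25 and $150 figures.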
10. Client Case Studies
Major Credit Card Company
• In a recent project, a major credit card company had so many legal holds that they kept all their content, and the data in a file share had grown to over 2 PB.
• This was costing them literally millions of dollars in storage and litigation costs, along with the inability to access relevant information.
• After completing the content cleansing and classification process, the relevant knowledge base dropped to only 300 TB. Within six months of completion the corporation saved over $2.8 million, with ongoing cost savings of $1.3 million per year.
Major Electric Utility
• Significant loss in intellectual capital due to aging workforce
• Inability to track, monitor and audit the length of time content was retained
• Difficulty determining ownership of the 50 TB of content across multiple business entities, making it impossible to place legal holds on content as required for litigation
• Content growing at a pace of 1 TB/month
• After completing content cleansing, classification and defensible disposition:
  • Improved search relevancy of the valid content by 60%
  • Reduced content storage cost (approx. $1.2M savings)
  • Enabled a migration strategy for the valid content after analysis was performed
  • Increased productivity as a result of improved search relevance capabilities
  • Lowered compliance risk
11. Let’s Review
Cleansing Content allows for:
• Lower Storage Costs
• Increased Productivity
• Exploiting the Business Value, i.e. BI, BA
• Increased Customer Service
• Content Life Cycle Management
• Lower Risk and Greater Control of Assets
• Increased ROI with Lower Storage Costs and Increased Productivity
• Happy and Productive Work Force
12. Next Steps
• Detail Requirements
• Taxonomies
• Records Retention Policies
• Classification Policies
• Content Inventory
• Defensible Disposition
• Auto Classification
13. Daily unique content about content management, user experience, portals and other enterprise information technology solutions across a variety of industries.
Perficient.com/SocialMedia
Facebook.com/Perficient
Twitter.com/Perficient
14. About Perficient
Perficient is a leading information technology consulting firm serving clients throughout North America.
We help clients implement business-driven technology solutions that integrate business processes, improve worker productivity, increase customer loyalty and create a more agile enterprise to better respond to new business opportunities.
15. Perficient Profile
• Founded in 1997
• Public, NASDAQ: PRFT
• 2012 revenue $327 million
• Major market locations throughout North America
• Atlanta, Boston, Charlotte, Chicago, Cincinnati, Cleveland, Columbus, Dallas, Denver, Detroit, Fairfax, Houston, Indianapolis, Los Angeles, Minneapolis, New York City, Northern California, Philadelphia, Southern California, St. Louis, Toronto and Washington, D.C.
• Global delivery centers in China, Europe and India
• ~2,000 colleagues
• Dedicated solution practices
• ~85% repeat business rate
• Alliance partnerships with major technology vendors
• Multiple vendor/industry technology and growth awards
16. Our Solutions Expertise
Business Solutions
• Business Intelligence
• Business Process Management
• Customer Experience and CRM
• Enterprise Performance Management
• Enterprise Resource Planning
• Experience Design (XD)
• Management Consulting
Technology Solutions
• Business Integration/SOA
• Cloud Services
• Commerce
• Content Management
• Custom Application Development
• Education
• Information Management
• Mobile Platforms
• Platform Integration
• Portal & Social
17. Thank you for your time and attention today.
Please visit us at Perficient.com
Editor's notes
Based on current statistics, the sheer quantity of content generated and maintained by a modern Fortune 500 company renders 20th-century solutions and approaches grossly inadequate.
As we mentioned, statistics show that an average employee spends 15-25% of their day searching for content that is not found, which in turn equates to an average cost of 2 hours per day per employee, for an overall average of $6M in lost productivity annually per 1,000 staff members. The message is clear: it’s time for your organization to get serious about Content Intelligence.