This document discusses how archiving services can improve enterprise content management (ECM) systems by reducing storage costs and improving performance. It describes how archiving inactive content from the ECM repository to lower-cost storage helps optimize the system for active content creation and collaboration. Integrating archiving capabilities also allows ECM users to access a wider range of business information directly from business applications.
Information Governance Webinar 17 Nov 09
2. Topics
• Speaking Volumes
• Records Management and Compliance
►How ECM and Archiving “Fit” Together
• Technical Challenges
• Archiving as a Service
►For Enterprise Content Management (ECM)
• Spanning the digital divide:
► Unstructured Content (ECM) vs. Structured Content
3. RSD Corporate Background
• Founded in Geneva, 1973
► Affiliates in New York and London
• More than 1,200 customers worldwide
► Over 2,000,000 users
• Pioneer in high-volume mainframe report and output management
► EOS (Enterprise Output Solution)
• Innovator in records and document management, and Information Governance
► RSD Folders
4. RSD Corporate Background
• Pioneer in report / output management
• Leader in Integrated Document Archive and Retrieval Systems (IDARS) Magic Quadrant
• Innovator in Records and Document Management, Information Governance
* Magic Quadrant Disclaimer
The Magic Quadrant is copyrighted June 27, 2006 by Gartner, Inc. and is reused with permission. The Magic Quadrant is a graphical representation of a marketplace at and for a specific time period. It depicts Gartner's analysis of how certain vendors measure against criteria for that marketplace, as defined by Gartner. Gartner does not endorse any vendor, product or service depicted in the Magic Quadrant, and does not advise technology users to select only those vendors placed in the "Leaders" quadrant. The Magic Quadrant is intended solely as a research tool, and is not meant to be a specific guide to action. Gartner disclaims all warranties, express or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
5. Over 1,200 customer sites, 2 million users
Banks, Insurance, Telecom, Distribution, Automotive, Government
6. “Welcome to Information Governance”
• Facts
► Morgan Stanley: $1.45 billion
► Citibank: $400 million
► US Taxpayer: $10 million
• What do all of these have in common?
7. Content Explosion
Hard drive shipments will double within the next five years
Digital Storage Technology Newsletter, July 2009
8. Corporate Challenges
The Information Governance challenge has created urgency at the executive level in every enterprise
Basel II, Title 21 CFR 11, MiFID, Patriot Act, SEC 17a-4, DoD 5015.2
9. Managing Risk is Critical
• Increasingly important executive function
• Exposure (to fines) & compliance (with laws) vs. costs
► Must do so in a cost effective manner
► Must reduce their overall cost of operation
• Within constraints of providing seamless and secure access to their
information and content
[Figure: timeline, 1998 to 2010, of events and regulations: Enron scandal, 9/11, Patriot Act, Sarbanes-Oxley, HIPAA, DoD 5015.2, MoReq, UBS/Warburg, FRCP, Morgan Stanley e-discovery irregularity fine ($1.58b), financial crisis of 2008, and several “New Regulations?” markers. Legend: pain related to costs (governance, infrastructure, etc.); pain related to risks of over-retention; pain related to e-discovery risks; pain related to data privacy breach risks; pain related to future risks (yet to emerge).]
10. A Model for Information Governance
[Figure: the Information Governance Program sits alongside corporate, IT, financial, and other governance. Its policy areas are content retention, content metadata retention, data privacy, ILM, e-discovery and holds, digital rights lifecycle, and audit trail management. Each policy area has policies, control, and enforcement. The majority of Information Governance policies are defined at the record class level of the records retention schedule. Enforcement is shown for RM + ECM and EDD systems, with question marks elsewhere, across these content sources: collaborative content in ECM systems; collaborative content scattered in infrastructure; business operational and transactional content in archives; corporate information in business applications; corporate information in data warehouses; corporate information in “intelligent” storage appliances; other. Content bursting/migration/archiving spans multiple jurisdictions.]
11. Multifaceted and Integrated Definitions of Lifecycle
Record Lifecycle: Collaboration Long-term Record Archival metadata
1) Record Retention (RM): Capture/Declare Record - - Disposition -
2) Metadata Lifecycle: Metadata & Content Indexing Metadata Indexing - Basic Metadata Delete Metadata
3) Metadata Storage Lifecycle: Metadata on Tier 1 Archive Metadata Basic Metadata Delete Metadata
4) Content Storage ILM: Storage on Tier 1 Storage on Tier 2 - Delete, Expunge -
5) Vital Status: Vital Non-Vital - - -
6) Security Lifecycle: Security Classified - Security Declassified - -
7) Data Privacy Settings Lifecycle: Regulatory Controls ? Anonymize ? ?
8) Digital Rights Lifecycle: Assign Key ? Delete Key ? ?
9) Other (expandable - tbd): ? ? ? ? ?
[Figure: content value over time (days, weeks, months, years, decades), marked with Record Declaration, Milestone #1, Milestone #2, and Disposition.]
12. Business Problems: pre-ECM
• Before you adopted Alfresco
► Inconsistent business processes
• Document creation
• Content review and approval
• Publish, update, dispose (lifecycle management)
► Poor productivity
• Many duplicate efforts
• Ineffective use of valuable resources
► Ineffective knowledge management
• Intellectual property not discoverable
• Cannot be leveraged
13. Challenges Created by ECM Success
• Since Alfresco
► More and more departments and users
• Success begets success
► More and more content being generated
• New content types are more storage intensive
– Image, video, audio
► Repository volumes expanding rapidly
► Performance begins to suffer
► New requirements
• COO and Compliance Officer: what about regulatory compliance?
• Why can’t I see customer statements and other production
application data in my ECM?
14. Why is performance suffering?
• ECM systems designed to support “active content”
► Content creation phase of the content lifecycle
• Frequent update and change
• Multi-step workflow to obtain editorial approval
• Activity level dramatically lower post-approval
► Most documents never changed after the approval cycle
► Yet they occupy space in ECM repository and database
• Alternative storage and retrieval services are required
► To ensure that ECM is optimized to serve users' needs: creating and updating “active content”
15. New Compliance Questions
• Board and executive level pressure
► Stakeholders need assurance business information is being handled
properly
• More content considered “business records”
► Paper records, email, voicemail recordings, instant messages, blog
postings…
• Strict retention rules now required by law
► Or by industry “best practice”
• Laws differ by jurisdiction
► Regulations where content is stored tend to “trump” rules where content
is accessed
► Content must be retained for specific time periods
► Content must be disposed of after retention
► Systems must support e-Discovery or litigation hold
– Which overrides retention and disposition rules
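The rules on this slide can be read as a small decision procedure. The Python sketch below is illustrative only: the `Record` fields, the `RETENTION_YEARS` table, and its jurisdictions and periods are hypothetical and are not taken from any regulation or RSD product. It shows the storage jurisdiction selecting the retention period and a litigation hold overriding both retention and disposition.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical retention periods in years, keyed by (jurisdiction, record class).
# Real values would come from the organization's records retention schedule.
RETENTION_YEARS = {
    ("A", "invoice"): 7,
    ("C", "invoice"): 10,
}


@dataclass
class Record:
    record_class: str
    stored_in: str          # jurisdiction where the content is stored
    declared_on: date       # date the content was declared a record
    on_legal_hold: bool = False


def retention_end(record: Record) -> date:
    """Return the first day on which the record may be disposed of."""
    years = RETENTION_YEARS[(record.stored_in, record.record_class)]
    d = record.declared_on
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Declared on 29 February and the target year is not a leap year.
        return d.replace(year=d.year + years, day=28)


def disposition_action(record: Record, today: date) -> str:
    """Decide what to do with a record: 'hold', 'retain', or 'dispose'."""
    if record.on_legal_hold:
        return "hold"       # a litigation hold overrides retention and disposition
    if today < retention_end(record):
        return "retain"     # still inside the mandatory retention period
    return "dispose"        # retention has expired, so disposal is due
```

Checking the hold first is the design point: a record whose retention period has expired must still not be disposed of while a hold is in force.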
16. New Content Requirements
• ECM excels at organization of unstructured
content
► Office documents, spreadsheets, other user-generated content
• Users want to see other relevant
information… in one place
► Statements, reports, other data from
datacenter business applications
• ECM systems do not span this “digital divide”
17. Robust Document Archiving Services are the Answer
• Design Center
► Built specifically for the post-approval, post-publication phases of the
document and records lifecycle
► Proven to handle extremely high volume document and report
environments, such as mission-critical business applications (billing
statements, invoicing systems, payroll slips)
► Designed to support multi-level storage systems typical of high volume content environments
18. A wide range of improvements
• Performance, scalability, user experience ALL improve
► ECM database footprint dramatically reduced
• Bulk of inactive content retired; metadata remains in place
• Performance improves where it counts: active content
► ECM servers can support more users
• Extending the lifetime (and value) of deployed servers
► Users see better response times, no loss of information
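A minimal Python sketch of the idea, assuming a hypothetical in-memory repository: documents untouched for longer than a cutoff have their content moved to an archive store, while their metadata stays in the ECM repository with a pointer to the archived copy. The names `Document`, `retire_inactive`, `read_content`, and `archive_ref` are illustrative and do not correspond to Alfresco or RSD APIs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Document:
    doc_id: str
    metadata: Dict[str, str]
    last_modified: datetime
    content: Optional[bytes] = None      # None once the content is retired
    archive_ref: Optional[str] = None    # hypothetical pointer into the archive


def retire_inactive(repo: List[Document], archive: Dict[str, bytes],
                    now: datetime, max_idle: timedelta) -> int:
    """Move content of inactive documents to the archive; return bytes freed."""
    freed = 0
    for doc in repo:
        if doc.content is None:
            continue                      # already retired
        if now - doc.last_modified < max_idle:
            continue                      # still active content
        ref = f"archive/{doc.doc_id}"
        archive[ref] = doc.content        # copy to lower-cost storage first
        freed += len(doc.content)
        doc.content = None                # then release repository space
        doc.archive_ref = ref             # metadata stays in place
    return freed


def read_content(doc: Document, archive: Dict[str, bytes]) -> bytes:
    """Fetch content from the repository, or from the archive if retired."""
    if doc.content is not None:
        return doc.content
    if doc.archive_ref is None:
        raise LookupError(f"no content for {doc.doc_id}")
    return archive[doc.archive_ref]
```

Because `read_content` follows the pointer transparently, a user browsing by metadata sees no loss of information even though the repository no longer holds the bytes.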
19. “Instantiated” mode
• Built on RSD Folders for Z or Open Systems
• Indexes remain in Alfresco storage
[Diagram: documents, Alfresco Explorer and/or Alfresco Share, “instantiated” document metadata, document storage, and “instantiated” document storage.]
20. « Virtual Mode »
• Built on RSD Folders for Z or Open Systems
• All content is archived outside Alfresco (thus the name
« virtual mode »)
[Diagram: documents, Alfresco Explorer / Alfresco Share, “virtual” document metadata, document storage, “instantiated” document storage, and “virtual” document storage.]
21. Integration of « external » documents
ArchLive for Alfresco brings to the collaborative workspace
access to business documents which may be tens of
thousands of pages in length.
These “logical” documents are available to users without
requiring “bursting” of the output stream into individual
documents.
[Diagram: structured documents and logical documents; a structured document flow and a single logical document, both reached from Alfresco Explorer / Alfresco Share.]
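One way to picture a “logical” document is as an index entry that points into a single stored output stream, rather than a separately stored file. The Python sketch below illustrates that general idea only. It is not a description of how ArchLive or RSD Folders work, and `report_pages`, `page_index`, and `logical_document` are hypothetical names with made-up sample data.

```python
from typing import Dict, List, Tuple

# One stored output stream, modeled here as a list of pages (sample data).
report_pages: List[str] = [
    "Statement for ACME, page 1",
    "Statement for ACME, page 2",
    "Statement for Globex, page 1",
]

# Hypothetical index: logical document key -> (first page, page count).
page_index: Dict[str, Tuple[int, int]] = {
    "ACME": (0, 2),
    "Globex": (2, 1),
}


def logical_document(key: str) -> List[str]:
    """Return the pages of one logical document, read from the shared stream."""
    first, count = page_index[key]
    return report_pages[first:first + count]
```

The stream is stored once. Only the index grows with the number of logical documents, which is what avoids bursting the output into individual files.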
22. Document Capture
Performance and security:
ArchLive for Alfresco can capture approximately 600
documents per second as compared to approximately 50 per
second captured natively in Alfresco
[Diagram: capture rate, with document storage, Alfresco Explorer / Alfresco Share, and the RSD database.]
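To put the two rates in perspective, the short Python calculation below estimates how long a batch would take to capture at each rate. The batch size of one million documents is an assumed figure for illustration. The rates are the approximate ones quoted on this slide.

```python
def capture_hours(documents: int, docs_per_second: float) -> float:
    """Hours needed to capture `documents` at a steady rate."""
    return documents / docs_per_second / 3600


BATCH = 1_000_000  # assumed batch size, not from the source

native_hours = capture_hours(BATCH, 50)     # approximate native Alfresco rate
archlive_hours = capture_hours(BATCH, 600)  # approximate ArchLive rate

print(f"native:   {native_hours:.2f} h")
print(f"ArchLive: {archlive_hours:.2f} h")
print(f"speed-up: {native_hours / archlive_hours:.0f}x")
```

At these rates the assumed batch takes roughly five and a half hours natively and under half an hour with the higher rate, a twelvefold difference.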
23. Archiving
Scalability and cost control:
ArchLive for Alfresco integrates with solutions like Tivoli
Storage Manager (TSM), and cartridge or tape media.
ArchLive also has strong integration with EMC Centera and
Tivoli DR550
[Diagram: storage flexibility and security, with Alfresco data, RSD Folders data, and Alfresco Explorer / Alfresco Share.]
24. RSD ArchLive for Alfresco
• Robust document archiving service for Alfresco
► Improves Alfresco performance
• By retiring large volumes of infrequently accessed content, reducing the “content footprint” of the Alfresco ECM repository and database
► Reduces operating costs
• By moving large volumes of content from expensive media to any major enterprise storage system
► Bridges the digital content divide, extending the Alfresco domain
• By delivering business information from enterprise business applications directly to ECM users