Cohasset Associates, Inc.

           Session 20

           Analytics
           The New Way to Manage e-Records




                            Tuesday, May 08, 2012 3:15 – 4:30 pm




                          What is Analytics?


           Analytics is the application of computer
           technology, operational research, and statistics to
           solve problems in business and industry.

             Wikipedia




                                                                   2




                            Why Analytics?


             Identification of Official Records
             Cleaning up ‘stuff’
             Finding relevant files for e-discovery



             Business insight




                                                                   3




2012 Managing Electronic Records Conference
                         Why Analytics?


           Ginormous volume: 10 TB at 1 minute / doc = 311 years
           Those unreliable users
           Evolving locations
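
The 311-year figure can be sanity-checked with a short calculation. The slide does not state the average document size or the reviewer's working hours, so the sketch below assumes non-stop review (24 hours a day, 365 days a year) and decimal terabytes, and solves for the document size the figure implies.

```python
# Back-of-the-envelope check of "10 TB at 1 minute / doc = 311 years".
# Assumptions (not stated on the slide): review runs 24 hours a day,
# 365 days a year, and 1 TB = 10**12 bytes.

MINUTES_PER_YEAR = 365 * 24 * 60


def years_to_review(total_bytes: float, avg_doc_bytes: float,
                    minutes_per_doc: float = 1.0) -> float:
    """Years needed to look at every document once, without breaks."""
    doc_count = total_bytes / avg_doc_bytes
    return doc_count * minutes_per_doc / MINUTES_PER_YEAR


def implied_doc_bytes(total_bytes: float, years: float,
                      minutes_per_doc: float = 1.0) -> float:
    """Average document size implied by a given review duration."""
    doc_count = years * MINUTES_PER_YEAR / minutes_per_doc
    return total_bytes / doc_count


TEN_TB = 10 * 10**12
avg = implied_doc_bytes(TEN_TB, 311)
print(f"Implied average document size: {avg / 1000:.0f} KB")
print(f"Round trip: {years_to_review(TEN_TB, avg):.0f} years")
```

Under these assumptions the slide's figure corresponds to roughly 160 million documents; a reviewer working eight hours a day instead of around the clock would need three times as long.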




                                                                     4




              "In theory there is no difference
              between theory and practice.
              In practice there is."

              Yogi Berra




                                                                     5




          "If you come to a fork in the road, take it."

          Yogi Berra




                                                                     6




        Enterprise Content Management




                                                     Doug Magnuson
                                                          IBM


                                                                                                                                  © 2012 IBM Corporation






        Traditional approaches are converging


            [Diagram: Enterprise Search, Business Intelligence, and Text Analytics converging on Content Analytics]

            More than keyword search is needed
            “Making unstructured data searchable is now a presumed primary interface for applications of all kinds, as well as for intranets and content repositories.”
              – Whit Andrews, Rita Knox, Gartner

            Analyzing unstructured content no longer optional
            “For many business process professionals, access to structured data, even when supported by BI or predictive analytics, lacks sufficient context for customer service, finance, and other areas where communications with customers involves many channels”
              – Craig Le Clair, Forrester

            Increasing in business importance
            “Early adopters of [text analytics] are already gaining a competitive advantage. Organizations that fail to do so will be at risk.”
              – Sue Feldman, IDC

            Converging toward content analytics
            “Every enterprise should understand how content analytics can produce answers to its critical questions; understanding this now will make it possible to exploit these tools as their availability proliferates.”
              – Rita Knox, Gartner





        Content Analytics Explained

            Source Information: Internal (ECM, Files, DBMS, etc.) and External (Social, News, etc.)

            Example text: John sprained his ankle on the step ...

            Text            Part of speech    Entity
            John            Noun              Person
            sprained        Verb              Injury
            his ankle       Noun Phrase       Body Part
            on the step     Prep Phrase       Location

            Extracted Concept: Claimant: Soft Tissue Injury
            Output: Analyzed Content (and Data)



                    What is Natural Language Processing?
                    NLP describes a set of linguistic, statistical, and
                    machine learning techniques that allow text to be
                    analyzed and key information extracted for business
                    integration
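
To make the extraction example concrete, here is a toy annotator for the slide's sample sentence. It is only a sketch: the word lists, the location pattern, and the function names are hypothetical, and dictionary lookups stand in for the linguistic, statistical, and machine learning techniques the slide refers to.

```python
import re
from typing import Dict, Optional

# Hypothetical mini-lexicons for illustration only; a production system
# would rely on trained models and much larger vocabularies.
PERSONS = {"john"}
INJURY_VERBS = {"sprained", "strained", "bruised"}
BODY_PARTS = {"ankle", "wrist", "knee"}
# A prepositional phrase such as "on the step" is treated as a location.
LOCATION_PATTERN = re.compile(r"\b(?:on|at|in)\s+the\s+(\w+)", re.IGNORECASE)


def annotate(sentence: str) -> Dict[str, str]:
    """Return the entities found in one sentence, keyed by entity type."""
    found: Dict[str, str] = {}
    for token in re.findall(r"[A-Za-z']+", sentence):
        lower = token.lower()
        if lower in PERSONS:
            found["Person"] = token
        elif lower in INJURY_VERBS:
            found["Injury"] = token
        elif lower in BODY_PARTS:
            found["Body Part"] = token
    match = LOCATION_PATTERN.search(sentence)
    if match:
        found["Location"] = match.group(1)
    return found


def derive_concept(entities: Dict[str, str]) -> Optional[str]:
    """Roll the entities up into the higher-level concept from the slide."""
    if {"Person", "Injury", "Body Part"}.issubset(entities):
        return "Claimant: Soft Tissue Injury"
    return None


entities = annotate("John sprained his ankle on the step")
print(entities)
print(derive_concept(entities))
```

The point of the two-step shape is that entities are extracted first and business concepts are derived from combinations of entities afterwards, which mirrors the layers shown on the slide.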




        Real language is real hard

             Chess
              A finite, mathematically well-defined search
               space
              Limited number of moves and states
              Grounded in explicit, unambiguous
               mathematical rules


             Human Language
              Ambiguous, contextual and implicit
              Contains slang, riddles, idioms, abbreviations,
               acronyms and more
              Grounded only in human cognition
              Seemingly infinite number of ways to express
               the same concepts and meaning



        The key is: understanding natural language with confidence
        and accuracy


         Unstructured                                       Structured

         Where was Einstein born?
               One day, from among his city views of Ulm, Otto chose a watercolor to send
               to Albert Einstein as a remembrance of Einstein’s birthplace.

         Welch ran this?
               If leadership is an art then surely Jack Welch has proved himself a master
               painter during his tenure at GE.




        Things we learned from The Jeopardy! Challenge
        5 key dimensions to drive the technology

                     Broad/open domain
                     Complex language
                     High precision
                     Accurate confidence
                     High speed

                     Sample clues:
                     $200: If you're standing, it's the direction you should look to check out the wainscoting
                     $800: In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus
                     $1000: Of the 4 countries in the world that the U.S. does not have diplomatic relations with, the one that’s farthest north



        Decision Plans Layer Multiple Methods for Records classification


         [Chart: classification methods plotted by Consistency / Accuracy (vertical axis, Low to High) against Cost Savings / Productivity (horizontal axis, Low to High)]
           Manual Classification (Ask)
           Rules Based Classification (Imply)
           Context Based Classification (Inspect)
           Multiple Methods
           Consistent Participation & Enforcement

         Decision Plans combine approaches to classification


         Context-based classification delivers high accuracy; rules-based classification
         addresses hard-and-fast requirements. Combining methods delivers the best results.
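
The layering described on this slide can be sketched as a small pipeline that tries metadata rules first, then inspects content, and finally falls back to asking a person. Everything named here is an assumption for illustration: the `Document` type, the `doc_type` field, the category labels, and the keyword-overlap score, which merely stands in for a real context-based classifier.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class Document:
    metadata: dict = field(default_factory=dict)
    text: str = ""


def classify_by_rules(doc: Document) -> Optional[str]:
    """Rules-based step: imply a category from metadata already available."""
    if doc.metadata.get("doc_type") == "invoice":
        return "Finance/Invoices"
    return None


# Hypothetical category vocabularies for the content-inspection step.
CATEGORY_TERMS = {
    "Legal/Contracts": {"agreement", "party", "term", "liability"},
    "HR/Personnel": {"employee", "salary", "performance", "leave"},
}


def classify_by_context(doc: Document,
                        threshold: float = 0.5) -> Tuple[Optional[str], float]:
    """Context-based step: inspect the text and score it per category."""
    words = set(doc.text.lower().split())
    best_label, best_score = None, 0.0
    for label, terms in CATEGORY_TERMS.items():
        score = len(words & terms) / len(terms)
        if score > best_score:
            best_label, best_score = label, score
    if best_score >= threshold:
        return best_label, best_score
    return None, best_score


def decision_plan(doc: Document) -> Tuple[str, str]:
    """Imply from metadata, then inspect content, then ask a person."""
    label = classify_by_rules(doc)
    if label:
        return label, "rules"
    label, _score = classify_by_context(doc)
    if label:
        return label, "context"
    return "NEEDS_REVIEW", "manual"


print(decision_plan(Document(metadata={"doc_type": "invoice"})))
print(decision_plan(Document(text="This agreement sets the liability of each party")))
print(decision_plan(Document(text="lunch on friday?")))
```

Ordering the steps from cheapest to most expensive keeps manual review for the documents that neither automated method can place with confidence.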
        Rule Systems - the Effect of Real-Time Learning

        Use rule systems to act on existing metadata available in the
        process, content system or document properties.

        [Chart: Manual Classification, Rules Based Classification, Context Based Classification, and Multiple Methods on Low-to-High axes]




        Context Based Classification

        Use context based classification to inspect the document when
        there is not enough metadata already available.

        [Chart: Manual Classification, Rules Based Classification, Context Based Classification, and Multiple Methods on Low-to-High axes]

        Simple rules or keyword based analysis can be too coarse to make fine distinctions between long-form texts
        with very different intent.
        Critical dimensions of classification:
        Magnified by exploding volumes

        [Chart: Manual Classification, Rules Based Classification, Context Based Classification, and Multiple Methods on Low-to-High axes]

             Use manual classification for high value documents or when other
             methods do not provide enough information.

                                 Manual            Automated
         Accuracy                X 92% / 46%       60 – 90%
         Cost (per doc)          $ 0.17            < $ 0.01
         Consistency             <50%              100%

         Increasing volume and variety of information magnifies the challenges of
         consistency and cost burdens


        Quickly Understand Timeline & Essence of Custodian & Business Information

        Quickly get a view of the people, sender and recipient domains, and companies involved.

        Combine facets and filters to quickly include and eliminate custodians and data, such as
        people from certain locations or other combinations.

        Automatically extracted phrases in the content show the essence of the information.

        Organize a topographical view by key category. The “peaks” show frequency and phrases to
        quickly identify relevant information.








                                         Tom Reding, CRM
                                         Principal, Information Governance Practice
                                         tom.reding@emc.com




            CIS - Automated Analytics




            Content Analytics

            CONTENT ADDED → TEXT ANALYZED → ENTITIES EXTRACTED → RELATIONSHIPS STORED → CONTENT EASILY FOUND




           © Copyright 2011 EMC Corporation. All rights reserved.                                                                       19




          Discover and Act on Legacy Information
          File Intelligence – Understanding what you have



          Sources: File System, Email Server, SharePoint, Content Repositories, Personal Email Archives, Notebook and Desktop
          File Intelligence: Intelligently Identify Records, then Migrate & Secure Records
          Target: Secure Repository + Retention Policy








            How File Intelligence Works

             Catalog
               Crawl data sources
               Build index
                 – Metadata basic
                 – Metadata with document type
                 – Metadata with hash
                 – Deep crawl full text
                 – Deep crawl with classification

             Analyze (Classify, Search, Report)
               Classify files based on metadata, keyword, content, and pattern matching
                 Age, owner, location, file type, etc.
                 Business value, security risk, intellectual property, PII, PCI
               Analyze data with search and report tools
                 – Semantic search with Boolean, proximity, stemming, phrase support
                 – More than 30 pre-built reports out of the box
                 – Custom reports as needed

             Act
               Robust action set
                 – Move, copy, delete, retain, export, tag
               Policy-based actions
                 – One-time
                 – Scheduled
                 – Recurring
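
The "Metadata with hash" level of the Catalog step can be illustrated with a short crawler. This is a generic sketch built on the Python standard library, not the product's implementation; the record layout and the choice of SHA-256 are assumptions.

```python
import hashlib
import os
from pathlib import Path
from typing import Dict, List


def file_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large files stay out of memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def crawl(root: str, with_hash: bool = True) -> List[Dict]:
    """Walk a directory tree and build one catalog record per file."""
    catalog = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            info = path.stat()
            entry = {
                "path": str(path),
                "size": info.st_size,
                "modified": info.st_mtime,   # epoch seconds
                "accessed": info.st_atime,   # epoch seconds
                "extension": path.suffix.lower(),
            }
            if with_hash:
                # Identical content yields identical hashes, which is what
                # later makes duplicate detection possible.
                entry["sha256"] = file_hash(path)
            catalog.append(entry)
    return catalog
```

Calling `crawl("/some/share", with_hash=False)` (a placeholder path) would correspond to the cheaper basic-metadata level, since hashing requires reading every file in full.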




        Solution Overview

          1  File Intelligence
          2  Content Capture / Archiving
          3  Secure, Retain, Discover: Enterprise Retention
          4  Electronic Discovery

          File Intelligence
          RPS
            • Crawl, index, analyze, search, report information repositories in-place
            • Take action upon the discovered information assets
            • Examples: Decommission non-required information in-place, capture & classify records








        Experience with File Intelligence
              • ~24% of unstructured data is actively used (active, known, relevant)
              • ~48% is stale: not touched in 6 months
              • ~18% are duplicates
              • ~6% is unknown or orphaned
              • ~4% is not business related - pictures
              • Cost to the Customer:
                – It consumes expensive storage capacity
                – It gets managed, backed up, replicated, ...
                – It poses serious legal & compliance risks
                – It gets recovered equally in a DR scenario

              * Results from 37 Kazeon customer assessments
                Stale is defined as files not accessed or modified for 6 months
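
A breakdown like the one above could be computed from a file catalog roughly as follows. This is a sketch under stated assumptions: each record is a dict with `sha256`, `modified`, and `accessed` keys (epoch seconds), six months is approximated as 182 days, and duplicates are counted before staleness is checked. The slide does not say how the assessments resolved that overlap, and the unknown and non-business categories are not modeled here.

```python
import time
from collections import Counter

SIX_MONTHS_SECONDS = 182 * 24 * 60 * 60  # approximation of "6 months"


def categorize(entries, now=None):
    """Count entries as 'duplicate', 'stale', or 'active'.

    The first file seen with a given hash is treated as the original;
    later files with the same hash are counted as duplicates.
    """
    now = time.time() if now is None else now
    seen_hashes = set()
    counts = Counter()
    for entry in entries:
        # Stale means neither accessed nor modified within six months.
        last_touched = max(entry["modified"], entry["accessed"])
        if entry["sha256"] in seen_hashes:
            counts["duplicate"] += 1
        elif now - last_touched > SIX_MONTHS_SECONDS:
            counts["stale"] += 1
        else:
            counts["active"] += 1
        seen_hashes.add(entry["sha256"])
    return counts


def shares(counts):
    """Convert counts into percentages of the total, rounded to 0.1."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}
```

Note that this counts files, whereas storage-oriented assessments often weight each category by bytes; summing `size` per category instead of incrementing by one would give that view.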








             Analytics for eDiscovery

               Full Case Management Workflow
                   – New case creation
                   – Assignment of lead attorney & reviewers
                   – Case specific collection & culling
                   – Legal case processing
                   – Document review & analysis

               Case Tracking
                   – Collection Status
                   – Document Review Status
                   – Reviewer Workload
                   – Reviewer Progress Tracking

               Preservation Notification
                   – E-mail notification to custodians
                   – Customize e-mail messages per matter
                   – Full tracking during hold notice lifecycle
                   – Automatic reminders
                   – Custodian or Proxy acknowledgment

               Early case assessment throughout the eDiscovery process
                   – Identify relevant, important data
                   – Analytics on relevant legal case data
                   – E-mail communication threads
                   – Prepare for FRCP meetings

               [Screenshots: Keyword Hit Report, E-mail Threading, Custodian & Concept Analysis, Legal Hold Notices]










         Network Drive Transformation
         Analytics

         Brian Tuemmler
        Information Management for Everyone                25




             Enterprise Knowledge

                 Data:            SAP, Case Mgt, GIS
                 Wikis & Blogs:   Tech Supp, Dev
                 ECRM:            SharePoint, Other
                 Shared Drives:   M:
                 Hard Copy:       Central, Off site, Desks








             Analytics Perspective

                 Taxonomy Development
                 ICM Tools
                 Records Retention
                 ECM Repository Architecture
                 Image Conversion Services
                 ERM & Compliance Policies
                 Cleanup and Transform
                 Preservation






                   Program Approach

                       Strategy Governance and Policy
                       IT Infrastructure (Mapping)
                       Query Definition

                       Repeat by Share
                           Group cleanup
                           Individual cleanup
                           Auto cleanup

                       Repeat by Workgroup
                           Record Definition
                           Content Migration
                           Improvement








                   Cleanup Categories

                   [Diagram: cleanup categories arranged along a "User Intent" axis. Labels recovered from the slide:]

                       Capture:          Recent, Voluminous, Important, For review
                       Non-capturable:   Database, Application
                       Garbage / Delete opportunities:
                           • Large duplicates
                           • Photos & media
                       Other labels:     Scheduled, Expired, Past retention date, Manually D,
                                         Identified garbage - "To be deleted", Policy deletes,
                                         Opt in Delete, Temporary, Backup, Zero content, Auto-Delete
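
To make a few of the categories above concrete, here is a minimal sketch of rule-based tagging for a shared-drive inventory. The extension lists, the rule order, and the `FileInfo` structure are assumptions chosen for illustration; the slide does not specify how categories are assigned, and categories that depend on policy (for example, past retention date) are left out.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical extension lists, for illustration only.
TEMP_EXT = {".tmp", ".temp"}
BACKUP_EXT = {".bak", ".old"}
MEDIA_EXT = {".jpg", ".png", ".mp3", ".mp4", ".avi"}


@dataclass
class FileInfo:
    path: str
    size: int
    content: bytes = b""


def extension(path):
    """Return the lower-cased extension including the dot, or ''."""
    name = path.replace("\\", "/").rsplit("/", 1)[-1]
    dot = name.rfind(".")
    return name[dot:].lower() if dot > 0 else ""


def categorize(files):
    """Map each path to one cleanup label; the first matching rule wins."""
    seen_hashes = set()
    labels = {}
    for f in files:
        ext = extension(f.path)
        if f.size == 0:
            label = "zero content"
        elif ext in TEMP_EXT:
            label = "temporary"
        elif ext in BACKUP_EXT:
            label = "backup"
        else:
            # Exact duplicates are detected by content hash; the first
            # copy encountered is kept and later copies are flagged.
            digest = hashlib.sha256(f.content).hexdigest()
            if digest in seen_hashes:
                label = "duplicate"
            else:
                seen_hashes.add(digest)
                label = "photos & media" if ext in MEDIA_EXT else "for review"
        labels[f.path] = label
    return labels


inventory = [
    FileInfo("M:/a/empty.doc", 0),
    FileInfo("M:/a/x.tmp", 3, b"abc"),
    FileInfo("M:/a/r1.doc", 5, b"hello"),
    FileInfo("M:/b/r2.doc", 5, b"hello"),
    FileInfo("M:/b/p.JPG", 4, b"jpeg"),
    FileInfo("M:/b/old.bak", 2, b"zz"),
]
labels = categorize(inventory)
```

Anything that matches no rule falls through to "for review", which keeps the automatic rules conservative: a file is only flagged for deletion when a rule positively identifies it.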





                   Example

                   [Chart: Category Volume by Year. Vertical axis: Storage, 0 to 80000. Years: 1997 to 2009. Categories: Document Image, Content, Photo, Database, Archive, Media, Application.]
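
A chart like this one can be produced from a file inventory by summing storage per category and year. The sketch below assumes the inventory is available as (year, category, size) tuples; that input shape, the function name, and the sample numbers are hypothetical and are not the data behind the slide.

```python
from collections import defaultdict


def volume_by_year(records):
    """Sum sizes per category and year.

    `records` is an iterable of (year, category, size) tuples.
    Returns {category: {year: total_size}}.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for year, category, size in records:
        totals[category][year] += size
    # Convert nested defaultdicts to plain dicts for easier printing.
    return {cat: dict(years) for cat, years in totals.items()}


sample_records = [
    (1997, "Photo", 100),
    (1997, "Photo", 50),
    (2009, "Database", 700),
    (2009, "Photo", 25),
]
volumes = volume_by_year(sample_records)
```

Each inner dictionary is one series in the chart, so the result can be handed to any plotting tool without further reshaping.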







         Discussion








                      Contact Information
         Marcy Zweerink
         Marcy.Zweerink@Cohasset.com

         Doug Magnuson
         dmagnuson@us.ibm.com

         Tom Reding
         Tom.Reding@emc.com

         Brian Tuemmler
         Brian.Tuemmler@gimmal.com




M12S20 - Analytics: The New Way to Manage e-Records

  • 1. Cohasset Associates, Inc. NOTES Session 20 Analytics The New Way to Manage e-Records Tuesday, May 08, 2012 3:15 – 4:30 pm G What is Analytics? Analytics is the application of computer technology, operational research, and statistics to solve problems in business and industry. p y Wikipedia 2 Why Analytics? Identification of Official Records Cleaning up ‘stuff’ Finding relevant files for e-discovery g y Business insight 3 20.1 2012 Managing Electronic Records Conference
  • 2. Cohasset Associates, Inc. NOTES Why Analytics? Ginormous volume 10 TB at 1 minute / doc = 311 years Those unreliable users g Evolving locations 4 "In theory there is no difference between theory and practice. In practice there is." p Yogi Berra 5 "If you come to a fork in the road, take it." Yogi Berra 6 20.2 2012 Managing Electronic Records Conference
  • 3. Cohasset Associates, Inc. NOTES Enterprise Content Management Doug Magnuson IBM © 2012 IBM Corporation Enterprise Content Management Traditional approaches are converging More than keyword  Analyzing unstructured  search is needed content no longer optional “Making unstructured data  “For many business process  searchable is now a presumed  professionals, access to structured  primary interface for applications of  Enterprise Business data, even when supported by BI or  all kinds, as well as for intranets  predictive analytics, lacks sufficient  and content repositories.”  Search Intelligence context for customer  service, finance, and other areas where  – Whit Andrews, Rita Knox Gartner , communications with customers involves  Content  many channels” Analytics – Craig Le Clair Forrester Increasing in business  Converging toward  importance Text content analytics “Early adopters of [text analytics]  Analytics “Every enterprise should understand  are already gaining a competitive  how content analytics can produce  advantage. Organizations that fail to  answers to its critical questions;  do so will be at risk.” understanding this now will make it  possible to exploit these tools as their  – Sue Feldman IDC availability proliferates.” – Rita Knox Gartner 8 © 2012 IBM Corporation Enterprise Content Management Content Analytics Explained Analyzed Content Extracted Claimant: Soft Tissue Injury Concept (and Data) Person Injury Body Part Location Noun Verb Noun Phrase Prep Phrase John sprained his ankle on the step ... Source Information Internal (ECM, Files, DBMS, etc.) and External (Social, News, etc.) What is Natural Language Processing? NLP describes a set of linguistic, statistical, and machine learning techniques that allow text to be analyzed and key information extracted for business integration 9 © 2012 IBM Corporation 20.3 2012 Managing Electronic Records Conference
  • 4. Cohasset Associates, Inc. NOTES Enterprise Content Management Real language is real hard Chess  A finite, mathematically well-defined search space  Limited number of moves and states  Grounded in explicit, unambiguous mathematical rules Human Language  Ambiguous, contextual and implicit  Contains slang, riddles, idioms, abbreviations, acronyms and more  Grounded only in human cognition  Seemingly infinite number of ways to express the same concepts and meaning 10 © 2012 IBM Corporation Enterprise Content Management The key is: understanding natural language with confidence and accuracy  Where was Einstein born? Unstructured Structured One day, from among his city views of Ulm, Otto chose a watercolor to send to Alb t Ei t i t Albert Einstein as a remembrance b of Einstein’s birthplace.  Welch ran this? If leadership is an art then surely Jack Welch has proved himself a master painter during his tenure at GE. 11 © 2012 IBM Corporation Enterprise Content Management Things we learned from The Jeopardy! Challenge 5 key dimensions to drive the technology $200 If you're standing, it's the  direction you should look  to check out the  wainscoting Broad/open domain  $800 Complex language C l l In cell division, mitosis  splits the nucleus &  cytokinesis splits this liquid  High precision cushioning the nucleus Accurate confidence $1000 Of the 4 countries in the  world that the U.S. does  High speed not have diplomatic  relations with, the one  that’s farthest north 12 © 2012 IBM Corporation 20.4 2012 Managing Electronic Records Conference
  • 5. Cohasset Associates, Inc. NOTES Enterprise Content Management Decision Plans Layer Multiple Methods for Records classification Consistency Consistent High Participation & Multiple Enforcement Accuracy Methods Context Based Imply Classification Rules Based Inspect Classification Decision Plans combine approaches t bi h to Ask classification Manual Classification Cost Savings Low Productivity Low High Context-based classification delivers high accuracy, rules-based classification addresses hard-and-fast requirements. Combining methods delivers the best results. 13 © 2012 IBM Corporation Enterprise Content Management High Rule Systems - the Effect of Real-Time Learning Multiple Methods Context Based Classification Rules Based Classification Use rule systems to act on existing meta data available in the Manual process, content system or document properties. Low Classification Low High 14 © 2012 IBM Corporation Enterprise Content Management High Context Based Classification Multiple Methods Context Based Classification Rules Based Classification Use context based classification to inspect the document when Manual there is not enough meta data already available Low Classification Low High Simple rules or keyword based analysis can be too coarse to make fine distinctions between long-form texts with very different intent 15 © 2012 IBM Corporation 20.5 2012 Managing Electronic Records Conference
  • 6. Slide 16: Critical dimensions of classification, magnified by exploding volumes.
    Use manual classification for high-value documents or when other methods do not provide enough information.
    Manual vs. Automated. Accuracy: 92%, 60 – 90%, 46% (values as listed on the slide). Cost (per doc): $0.17 manual, < $0.01 automated. Consistency: <50% manual, 100% automated.
    Increasing volume and variety of information magnifies the challenges of consistency and cost burdens.
    Slide 17: Quickly Understand Timeline & Essence of Custodian & Business Information.
    Quickly get a view of the people, sender and recipient domains, and companies involved. Combine facets and filters to quickly include and eliminate custodians and data, such as people from certain locations or other combinations. Automatically extracted phrases in the content show the essence of the information. Organize a topographical view by key category. The "peaks" show frequency and phrases to quickly identify relevant information.
    Speaker slide: Tom Reding, CRM, Principal, Information Governance Practice, tom.reding@emc.com
  • 7. Slide 19: CIS - Automated Analytics.
    Content Analytics: content easily added, content analyzed, text extracted, entities stored, relationships found.
    Slide 20: Discover and Act on Legacy Information. File Intelligence: understanding what you have.
    Diagram: sources (File System, Email Server, SharePoint, Content Repositories, Personal Email Archives, Notebook and Desktop) feed File Intelligence, which is used to Intelligently Identify Records and then Migrate & Secure Records into a Secure Repository + Retention Policy.
    Slide 21: How File Intelligence Works.
    Catalog: crawl data sources and build an index. Index levels: metadata basic; metadata with document type; metadata with hash; deep crawl full text; deep crawl with classification.
    Analyze (Classify, Search, Report): classify files based on metadata, keyword, content, and pattern matching (age, owner, location, file type, etc.; business value, security risk, intellectual property, PII, PCI). Analyze data with search and report tools: semantic search with Boolean, proximity, stemming and phrase support; more than 30 pre-built reports out of the box; custom reports as needed.
    Act: robust action set (move, copy, delete, retain, export, tag). Policy-based actions: one-time, scheduled, recurring.
    © Copyright 2011 EMC Corporation. All rights reserved.
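    The Catalog and Analyze steps on slide 21 can be illustrated with a small sketch: crawl a directory tree, record basic metadata, then tag files by type and by a content pattern. This is not how EMC File Intelligence is implemented; the function names, the tag names, the media extension list and the US Social Security number pattern are assumptions chosen only to make the idea concrete.

```python
import os
import re
from pathlib import Path

# Illustrative pattern only: a simple US SSN shape, not a complete PII detector.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
MEDIA_TYPES = {".jpg", ".jpeg", ".png", ".mp3", ".mp4"}

def catalog(root):
    """Crawl a data source and build a basic metadata index."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            stat = path.stat()
            entries.append({
                "path": str(path),
                "type": path.suffix.lower(),
                "size": stat.st_size,
                "modified": stat.st_mtime,
            })
    return sorted(entries, key=lambda e: e["path"])

def classify(entry):
    """Tag one catalog entry using metadata and pattern matching on content."""
    tags = []
    if entry["type"] in MEDIA_TYPES:
        tags.append("media")
    try:
        # Reads the whole file; fine for a sketch, too costly for large files.
        text = Path(entry["path"]).read_text(encoding="utf-8", errors="ignore")
    except OSError:
        text = ""
    if SSN_PATTERN.search(text):
        tags.append("possible-PII")
    return tags
```

    A policy-based Act step would then map tags to actions such as move, retain or delete; that part is left out here because the slides do not describe the policy rules.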
  • 8. Slide 22: Solution Overview: Secure, Retain, Discover.
    Components: 1 File Intelligence; 2 Content Capture / Archiving; 3 Enterprise Retention; 4 Electronic Discovery. (Diagram labels: File Intelligence, RPS.)
    Crawl, index, analyze, search and report on information repositories in place. Take action upon the discovered information assets. Examples: decommission non-required information in place, capture & classify records.
    Slide 23: Experience with File Intelligence.
    ~24% of unstructured data is actively used (active, known, relevant). ~48% is stale: not touched in 6 months. ~18% are duplicates. ~6% is unknown or orphaned. ~4% is not business related (pictures).
    Cost to the customer: it consumes expensive storage capacity; it gets managed, backed up, replicated, ...; it poses serious legal & compliance risks; it gets recovered equally in a DR scenario.
    * Results from 37 Kazeon customer assessments. Stale is defined as files not accessed or modified for 6 months.
    Slide 24: Analytics for eDiscovery.
    Full Case Management Workflow: new case creation; assignment of lead attorney & reviewers; case-specific collection & culling; legal case processing; document review & analysis.
    Case Tracking: collection status; document review status; reviewer workload; reviewer progress tracking.
    Preservation Notification: e-mail notification to custodians; customize e-mail messages per matter; full tracking during hold notice lifecycle; automatic reminders; custodian or proxy acknowledgment.
    Early case assessment throughout the eDiscovery process: identify relevant, important data; analytics on relevant legal case data; e-mail communication threads; prepare for FRCP meetings.
    Screenshot labels: Legal Hold Notices, Keyword Hit Report, E-mail Threading, Custodian & Concept Analysis.
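    Two of the categories on slide 23 have definitions concrete enough to sketch: stale files (not accessed or modified for 6 months) and duplicates. The code below is an assumption-laden illustration rather than the method behind the Kazeon assessments: "6 months" is approximated as 182 days, duplicates are detected by SHA-256 content hash, and the function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from datetime import datetime, timedelta
from pathlib import Path

# Assumption: "6 months" is approximated as 182 days.
STALE_AFTER = timedelta(days=182)

def is_stale(last_accessed: datetime, last_modified: datetime, now: datetime) -> bool:
    """Stale means neither accessed nor modified within the stale window."""
    return now - max(last_accessed, last_modified) > STALE_AFTER

def sha256_of(path) -> str:
    """Hash file content in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_duplicates(paths):
    """Group files with identical content; return only groups of two or more."""
    groups = defaultdict(list)
    for path in paths:
        groups[sha256_of(path)].append(str(path))
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

    Last-access times are often unreliable in practice (many file systems update them lazily or not at all), so a real assessment would need to confirm how the source system records them.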
  • 9. Slide 25: Network Drive Transformation Analytics. Brian Tuemmler. Information Management for Everyone.
    Slide 26: Diagram of information locations. Labels: Enterprise Data, Knowledge, Case Mgt, SAP, GIS, Tech Supp, Dev, Wikis & Blogs, SharePoint, ECRM, Other, Shared Drives (M:), Hard Copy (Central, Off site, Desks).
    Slide 27: Analytics Perspective. Labels: Taxonomy Development; Records Retention; ICM Tools; ECM Repository Architecture; Image Conversion Services; ERM & Compliance Policies; Cleanup and Preservation; Transform.
  • 10. Slide 28: Program Approach. Labels: Strategy; Record Definition; Governance and Policy; Group cleanup; IT Infrastructure; Individual cleanup; Content Migration (Mapping); Query Definition; Auto cleanup; Improvement. Annotations: Repeat by Share; Repeat by Workgroup.
    Slide 29: Cleanup Categories. Labels: Capture; Recent; Voluminous; Important; Non-capturable; Database; Application; For review; Garbage; Delete opportunities (large duplicates; photos & media); Scheduled Delete; Expire; Past retention date; Manually identified garbage ("To be deleted"); Opt in; Temporary; Backup; Zero content; Auto-Delete; Policy deletes; User Intent.
    Slide 30: Example. Chart: Category Volume by Year, 1997 to 2009, vertical axis 0 to 80000. Legend: Storage, Document, Image, Content, Photo, Database, Archive, Media, Application.
  • 11. Slide 31: Discussion.
    Slide 32: Contact Information.
    Marcy Zweerink, Marcy.Zweerink@Cohasset.com
    Doug Magnuson, dmagnuson@us.ibm.com
    Tom Reding, Tom.Reding@emc.com
    Brian Tuemmler, Brian.Tuemmler@gimmal.com