Improve Electronic Discovery Results by Focusing on the Beginning

By Larry Lieb




Larry Lieb is a National Director at Esquire Litigation Solutions. He has extensive experience related to electronic discovery and computer forensics. He is the former Executive Director of the Litigation Support Vendors Association, an organization dedicated to establishing and maintaining quality standards and professional certifications. Mr. Lieb is a graduate of the University of Illinois at Urbana-Champaign.

Prior to the advent of electronic discovery, parties to major litigations had to make a decision early in the discovery process to either scan and code the documents for review, or to manage the discovery collection using paper documents. If the creation of an image-enabled database was approved by a client, the vendor decision and all key technical specifications were pushed down from the billing partner to the partner in charge of discovery, and from there to an associate and finally to a paralegal, if no official litigation support staff existed. This was acceptable, if not ideal, in the day of paper and scanned images.

The shift in discovery to electronically stored information (ESI) has exacerbated the problems inherent in a hands-off approach by attorneys. There are serious implications if the technology involved in the discovery process is not understood. Certain details must be understood and agreed upon in order for a database to provide the reviewers what they need to do their job.

The recent Federal Rules of Civil Procedure (FRCP) changes relating to ESI have added urgency to the situation. The FRCP in effect has made attorneys personally responsible for the disposition of their client’s electronic evidence (see Qualcomm, Inc. v. Broadcom Corp., 2008 WL 66932 (Jan. 7, 2008), vacated in part, 2008 WL 638108 (S.D. Cal. March 5, 2008); Phoenix Four, Inc. v. Strategic Resources Corp., 2006 WL 1409413 (S.D.N.Y. May 23, 2006)). On top of personal responsibility, costs related to the identification, harvesting, processing, hosting, review and production of electronic discovery have risen sharply, putting attorneys at risk if vendor and review costs are dramatically underestimated.

So how should attorneys and their litigation support staff address critical technical details of the discovery process as it relates to the production and exchange of ESI? This article will cover the major technical areas that must be decided upon up front in order to avoid time-consuming and expensive rework, and possible mistakes, once ESI has been produced to and received from opposing counsel.



Project Meeting

The specifications described below should be covered in a project meeting prior to the commencement of processing ESI, whether it is done in house or by outside vendors. These details should be documented and approved in writing by counsel or their trusted surrogates following the project meeting. Most good vendors have a robust project setup document wherein all of the below items can be recorded for approval by attorneys.



General Processing Specifications

Time Zone. Determine the time zone in which the ESI originates and relay that information to the production team prior to processing to ensure that dates and time stamps come out correctly. For instance, if ESI originates from the Pacific Time Zone, but the clock on the computer or software processing the ESI is set to Eastern Time Zone, the three-hour difference could make dates appear to be one day later than they really are. If one is exchanging metadata, including dates, as part of a document production, an incorrect time zone setting may result in producing documents with incorrect dates. Some electronic discovery software is able to compensate for time zone differences without resetting the internal clock of the computer running the processing software. However, many applications simply process dates and times based upon the setting of the production computer.

De-duplication. ESI collections often contain many copies of the same correspondence sent to different people. As a result, if these duplicate e-mails are not removed during processing, attorneys will be tasked with reviewing the same document over and over again, leading to increased review time and higher client bills. Therefore, “de-duplication” has become a commonly requested service. Unfortunately, the actual technical process behind de-duplication is usually poorly understood.

There are two main methods used to de-duplicate e-mails and loose files in a collection: through a comparison of their metadata fields, such as “To, From, CC, Subject/Re:”, or through a hash value. E-mails are typically de-duplicated via their metadata fields, whereas loose files such as PDF, Excel, Word and PowerPoint files are compared using their hash value. A hash value is a value, unique for all practical purposes, generated by running the zeros and ones that comprise a file through a mathematical algorithm. Changing just one character of a Word file, for example, will alter the hash value that is calculated for that file.

The attorneys typically choose either to de-duplicate the entire, global collection or just within individual custodians. Global de-duplication is usually more effective in reducing the total collection, but adds to the amount of processing time as each new e-mail needs to be compared to every e-mail that preceded it in the processing queue.

If a loose Excel file is processed first, then all following examples of that Excel file could be removed from the collection under de-duplication rules. However, most attorneys would prefer that attachments to e-mails not be removed from the collection, even if they are duplicate files. If duplicate attachments to e-mails are removed during processing, this could become a problem for reviewing attorneys later on.

De-duplication has become an accepted practice to reduce attorney review time and ultimately clients’ bills. However, it is recommended that opposing parties agree up front that de-duplication will take place and address the question regarding global versus custodian level de-duplication.

Creating Image Files. When e-mails and their attachments are processed, typically an image file or “TIFF” file is created. This TIFF image can be electronically redacted and Bates labeled, thus creating a production set. Most times, TIFF images are created in black and white if color is not germane to the underlying matter. However, if a case revolves around intellectual property such as the color of a product’s packaging, then black and white TIFF images may not be sufficient. It is important to discuss the nature of the case during the project meeting from this standpoint.

In most cases, single page TIFF images are acceptable to both sides. However, some production agreements call for multipage image file formats, such as Adobe Acrobat PDFs. If opposing counsel is expecting PDF files, with embedded, searchable OCR, then this detail should be discussed up front during the project meeting.

Exception Files; Files that Will Not Process. The bane of any electronic discovery collection is semi-corrupt, corrupt, or password-protected files. In general, approximately 95 percent of most ESI collections will process without issue. By definition, a file that is able to be processed can be successfully converted to a TIFF image and have its full-text and metadata extracted. However, the remaining five percent of files will not cooperate with processing software and get moved to an “exception list.”

Many hosted ASP solutions also offer “TIFF on the fly” capabilities that can cause problems. TIFF’ing ESI is part science and part art and should be approached with caution. The default TIFF’ing practice for most production software is to use a file’s native application, such as Microsoft® Excel, to open and TIFF Excel files. Sometimes, a generic file viewer is used to print ESI, such as Stellent’s (now owned by Oracle) Quick View Plus, when the native application is having trouble printing a given file.

Semi-corrupt files are files that will open, but will not print. One workaround method to address semi-corrupt files is to generate an image by performing a “print screen” of the opened file. This is a manual process that takes time and expense, but it may be important in the event that opposing counsel has a copy of the semi-corrupt file and goes the extra mile towards processing it.

Corrupt files are files that will not open at all. It is common for attachments to e-mails to become corrupt during transit and thus require a resend. Most ESI processing software will place a TIFF image placeholder in the final deliverable that reads “corrupt file,” thus alerting counsel and possibly opposing counsel that a file existed in the collection but could not be opened and processed. In general, most batches of ESI that are processed for review will generate a list of exception files. It is important to acknowledge and address these files if they become important later in the case.

Most vendors will have access to password cracking software that can get through most common file types. Generally speaking, vendors will provide a list of password-protected files to their clients first to avoid the charge of cracking passwords. Lotus Notes files, which typically come in the form of “.NSF”, are so secure that cracking their passwords is typically impossible. If a discovery collection contains Lotus Notes e-mails, it is important to make sure they have
been created without a password during the collection process. All password-protected files should be included on the exception file list that is sent to counsel for further direction.

Review Platform. Relay to the outside vendor or opposing counsel what type and version of your particular review tool will be used in order to avoid costly rework once data arrives for loading. Unlike the old days when the mere fact that a review tool was being employed was kept a secret, it is assumed today that some form of automated database tool will be employed to manage large discovery collections. It is possible to create a detailed specification sheet that can be provided to vendors and opposing counsel time after time as there are specifications that will not change from case to case.

If an online hosted solution is being used, it is still possible to create a delivery specification sheet. Many leading hosted solutions will accept a Concordance “DAT” file as a load file format. A “DAT” file is typically used to store the extracted metadata and perhaps full-text, or body of e-mails and files. Just be aware of issues involved in text delimiters and field headers.

Native Review versus Image Review. For small collections, typically three gigabytes or less, many attorneys choose to have all ESI processed to TIFF images prior to review. One gigabyte alone of e-mails and their attachments can process from 50,000 up to 300,000 TIFF images. Generally speaking, 20,000 TIFF images can be one gigabyte in size, and therefore one gigabyte of native ESI could result in storage requirements of fifteen gigabytes or more (300,000 images at 20,000 images per gigabyte). TIFF images are large files that occupy a lot of space on the server, so it is important to alert the appropriate IT professional that a lot of storage space will be needed. Also allow for a long time to copy and load more than fifteen gigabytes of data and database files to a network.

One alternative is to process ESI to extracted metadata, full-text and links to native attachments. After e-mails and their attachments have been reviewed and a subset of documents has been flagged as responsive, a list of those responsive documents can be provided back to the vendor or internal processing group for TIFF’ing. As TIFF’ing documents takes a long time and creates a huge collection of files that need to be loaded for review, native review prior to TIFF’ing is an attractive alternative.

Assigning Image Key/DocID. In either example above, it is important to decide upon an “image key” or “DocID” for either the extracted native files or TIFF images. This image key concept is similar to a Bates number in that a unique value must be assigned to files during their processing. A sample DocID would be: JSMITH0000001. Keep in mind that as one gigabyte of ESI can process out to 300,000 images, it is probably prudent to include a seven or eight place numeric component to the DocID. After a production subset of images is created, an actual Bates number can be burned on to the TIFF images using a consecutive numbering scheme.

Database Fields. Whenever ESI is processed for use in a database, metadata fields are extracted. Although Microsoft files contain more than 300 different types of metadata fields, only a select portion are relevant to attorneys. Examples of relevant metadata would include To, From, CC, SentDate, and Subject/RE. One area of confusion revolves around metadata extracted from e-mails versus their attachments. Oftentimes, the only useful pieces of metadata that can be extracted from attachments are the file name and type of file. The extracted metadata field “Author” for Microsoft Word files may be irrelevant and misleading as this piece of metadata refers to the original creator of a file, not the actual author of the contents of a file.

Create a standard list of metadata fields that attorneys would like to see extracted during processing or provided by opposing counsel as part of a production. This will make the task of loading and creating a review database much simpler. These standard fields, which could comprise the above described Concordance DAT file, should be part of a firm’s technical standard.

Filtering Options. There are several common techniques that can be applied to an ESI collection to reduce its size prior to review. These filtering techniques should be agreed upon up front with opposing counsel so that no questions arise later as to why only certain documents have been produced. The most basic technique, de-duplication, has been addressed earlier in this article. The three remaining filtering techniques are key word, date range, and file type filtering.

A list of key words can be agreed upon by opposing counsel prior to collection and processing of ESI. Occasionally litigants themselves will use key words to search e-mail servers for responsive files. However, most IT tools are not capable of running key word searches against archived or compressed files such as ZIP files. If a Word file contains a key word, but is inside of a ZIP file attached to an e-mail, the key word search will typically not return that file.

Filtering by date range is a common technique that can reduce a harvested collection to just those files that fall within the period specified by a discovery request. If e-mails are harvested at the custodian level, by mailbox for example, it is likely that all of the custodian’s e-mails will be collected, including those that extend beyond the scope of the discovery request. Date range filtering can be employed during processing to remove e-mails that are outside of the desired time period from the final deliverable.

Filtering by file type is a technique that can be employed to remove system files from an ESI collection, or to target only those file types that are necessary for review. Text files with the extension “.TXT” are often included in processing, but turn out to be system files containing useless information. It is dangerous to exclude all text files, though, as some may be actual correspondence that has been saved as simple text. One area of concern involves non-standard file types that may derive from home grown or custom software packages. If the underlying matter involves intellectual property, or specifically programming files, then those special file extensions need to be identified prior to processing so that they are not summarily excluded.

Printing Options. ESI processing software typically uses a given file’s native application to open and print that file during processing. The printing process involves the creation of a TIFF image, versus sending that file to a printer. Most ESI processing software is configured so it can override the print settings of a given file that have been put in place by the original author of that file. Generally speaking, the best image fidelity can be achieved by using the native application to print a file. However, some semi-corrupt files can only be printed by a generic file viewer. The drawback of using a generic file viewer is that some formatting can be lost from the original settings. For example, Excel spreadsheets can be forced to always print out with gridlines and all hidden columns and rows revealed. Excel spreadsheets are the most challenging file type to process as they are rarely set up with printing in mind.

One common discovery mistake is to move forward with paper printouts without discussing the format in which ESI will be exchanged. Opposing counsel may make a good argument that printed Excel spreadsheets are not acceptable as they need to have access to formulas. Therefore, it is important to agree up front, prior to processing, on the format in which ESI will be exchanged, or counsel risks an expensive process to go backwards to native files.

Conclusion

The prevalence of electronic discovery, in conjunction with the recent FRCP changes dictating that issues related to ESI be addressed up front, indicates that the time has come for attorneys to address technical issues, or be versed in the issues to a greater extent.

Given that most law firms use the same review platforms over and over again, it is possible to create a firm-wide technical standard governing the processing of ESI. All of the sections addressed above can be memorialized into one document that can be used from case to case, as well as given to opposing counsel prior to the exchange of discovery. Each of the areas that have been addressed deals with important technical decisions that, if left to the vendor or internal processing team, inevitably will not follow the correct or desired path. To avoid this situation, most experienced vendors have resorted to refusing to move forward with a project unless attorneys sign off on a project setup specification sheet.

Many discovery agreements do not go into enough technical depth to cover such detail as text delimiters, filtering options, field naming conventions and more. As a result, litigation support professionals within law firms oftentimes have to perform a rework of produced data or data received from a vendor. A large majority of this rework can be avoided with a robust, up-front project meeting that results in a document memorializing all possible technical details.

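
Hash-based de-duplication of loose files, and the difference between global and custodian-level scope, can be sketched as follows. The choice of SHA-256, the `LooseFile` record, and the sample collection are assumptions made for illustration; the article does not say which hash algorithm any particular processing tool uses.

```python
import hashlib
from typing import Iterable, NamedTuple


class LooseFile(NamedTuple):
    custodian: str
    name: str
    content: bytes


def file_hash(content: bytes) -> str:
    """Hash the raw bytes of a file; changing any byte yields a different digest."""
    return hashlib.sha256(content).hexdigest()


def deduplicate(files: Iterable[LooseFile], scope: str = "global") -> list[LooseFile]:
    """Keep the first file seen for each hash.

    scope="global" compares across the whole collection;
    scope="custodian" compares only within each custodian's files.
    """
    if scope not in ("global", "custodian"):
        raise ValueError("scope must be 'global' or 'custodian'")
    seen: set[tuple[str, str]] = set()
    kept: list[LooseFile] = []
    for f in files:
        bucket = f.custodian if scope == "custodian" else ""
        key = (bucket, file_hash(f.content))
        if key not in seen:
            seen.add(key)
            kept.append(f)
    return kept


# Hypothetical collection: three identical files and one that differs by one character.
collection = [
    LooseFile("jsmith", "budget.xls", b"Q1 forecast"),
    LooseFile("jsmith", "budget copy.xls", b"Q1 forecast"),
    LooseFile("mjones", "budget.xls", b"Q1 forecast"),
    LooseFile("mjones", "budget v2.xls", b"Q1 forecast."),
]

global_kept = deduplicate(collection, scope="global")
custodian_kept = deduplicate(collection, scope="custodian")
print(len(global_kept), len(custodian_kept))
```

Global scope keeps fewer files than custodian scope here because the copy held by the second custodian is dropped, which is exactly the trade-off the parties should agree on up front.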
EDRM, Fall 2008

[IC Manage] Workspace Acceleration & Network Storage Reduction
 
Gfs论文
Gfs论文Gfs论文
Gfs论文
 
The google file system
The google file systemThe google file system
The google file system
 
Backup and recovery_redesign
Backup and recovery_redesignBackup and recovery_redesign
Backup and recovery_redesign
 

Esquire Edrm Q Fall 08 Article+Ad

Improve Electronic Discovery Results by Focusing on the Beginning

By Larry Lieb

Larry Lieb is a National Director at Esquire Litigation Solutions. He has extensive experience related to electronic discovery and computer forensics. He is the former Executive Director of the Litigation Support Vendors Association, an organization dedicated to establishing and maintaining quality standards and professional certifications. Mr. Lieb is a graduate of the University of Illinois at Urbana-Champaign.
EDRM, Fall 2008

Prior to the advent of electronic discovery, parties to major litigations had to make a decision early in the discovery process to either scan and code the documents for review, or to manage the discovery collection using paper documents. If the creation of an image-enabled database was approved by a client, the vendor decision and all key technical specifications were pushed down from the billing partner to the partner in charge of discovery, and from there to an associate and finally to a paralegal, if no official litigation support staff existed. This was acceptable, if not ideal, in the day of paper and scanned images.

The shift in discovery to electronically stored information (ESI) has exacerbated the problems inherent in a hands-off approach by attorneys. There are serious implications if the technology involved in the discovery process is not understood. Certain details must be understood and agreed upon in order for a database to provide the reviewers what they need to do their job.

The recent Federal Rules of Civil Procedure (FRCP) changes relating to ESI have added urgency to the situation. The FRCP in effect has made attorneys personally responsible for the disposition of their client's electronic evidence (see Qualcomm, Inc. v. Broadcom Corp., 2008 WL 66932 (Jan. 7, 2008), vacated in part, 2008 WL 638108 (S.D. Cal. March 5, 2008); Phoenix Four, Inc. v. Strategic Resources Corp., 2006 WL 1409413 (S.D.N.Y. May 23, 2006)). On top of personal responsibility, costs related to the identification, harvesting, processing, hosting, review and production of electronic discovery have increased exponentially, putting attorneys at risk if vendor and review costs are dramatically underestimated.

So how should attorneys and their litigation support staff address critical technical details of the discovery process as it relates to the production and exchange of ESI? This article will cover the major technical areas that must be decided upon up front in order to avoid time-consuming and expensive rework, and possible mistakes, once ESI has been produced to and received from opposing counsel.

Project Meeting

The specifications described below should be covered in a project meeting prior to the commencement of processing ESI, whether it is done in house or by outside vendors. These details should be documented and approved in writing by counsel or their trusted surrogates following the project meeting. Most good vendors have a robust project setup document wherein all of the items below can be recorded for approval by attorneys.

General Processing Specifications

Time Zone. Determine the time zone in which the ESI originates and relay that information to the production team prior to processing to ensure that dates and time stamps come out correctly. For instance, if ESI originates from the Pacific Time Zone, but the clock on the computer or software processing the ESI is set to the Eastern Time Zone, the three-hour difference could make dates appear to be one day later than they really are. If one is exchanging metadata, including dates, as part of a document production, an incorrect time zone setting may result in producing documents with incorrect dates. Some electronic discovery software is able to compensate for time zone differences without resetting the internal clock of the computer running the processing software. However, many applications simply process dates and times based upon the setting of the production computer.

De-duplication. ESI collections often contain many copies of the same correspondence sent to different people. As a result, if these duplicate e-mails are not removed during processing, attorneys will be tasked with reviewing the same document over and over again, leading to increased review time and higher client bills. Therefore, "de-duplication" has become a commonly requested service. Unfortunately, the actual technical process behind de-duplication is usually poorly understood.

There are two main methods used to de-duplicate e-mails and loose files in a collection: a comparison of their metadata fields, such as To, From, CC, and Subject/Re:, or a comparison of their hash values. E-mails are typically de-duplicated via their metadata fields, whereas loose files such as PDF, Excel, Word and PowerPoint files are compared using their hash value. A hash value is an effectively unique value generated by running the zeros and ones that comprise a file through a mathematical algorithm. Changing just one character of a Word file, for example, will alter the hash value that is calculated for that file.

Attorneys typically choose either to de-duplicate globally, across the entire collection, or just within individual custodians. Global de-duplication is usually more effective in reducing the total collection, but adds to the amount of processing time as each new e-mail needs to be compared to every e-mail that preceded it in the processing queue.

If a loose Excel file is processed first, then all following instances of that Excel file could be removed from the collection under de-duplication rules. However, most attorneys would prefer that attachments to e-mails not be removed from the collection, even if they are duplicate files. If duplicate attachments to e-mails are removed during processing, this could become a problem for reviewing attorneys later on.

De-duplication has become an accepted practice to reduce attorney review time and ultimately clients' bills. However, it is recommended that opposing parties agree up front that de-duplication will take place and address the question of global versus custodian-level de-duplication.

Creating Image Files. When e-mails and their attachments are processed, typically an image file or "TIFF" file is created. This TIFF image can be electronically redacted and Bates labeled, thus creating a production set. Most times, TIFF images are created in black and white if color is not germane to the underlying matter. However, if a case revolves around intellectual property such as the color of a product's packaging, then black and white TIFF images may not be sufficient. It is important to discuss the nature of the case during the project meeting from this standpoint.

In most cases, single-page TIFF images are acceptable to both sides. However, some production agreements call for multipage image file formats, such as Adobe Acrobat PDFs. If opposing counsel is expecting PDF files with embedded, searchable OCR, then this detail should be discussed up front during the project meeting.

Exception Files; Files that Will Not Process. The bane of any electronic discovery collection is semi-corrupt, corrupt, or password-protected files. In general, approximately 95 percent of most ESI collections will process without issue. By definition, a file that is able to be processed can be successfully converted to a TIFF image and have its full text and metadata extracted. However, the remaining five percent of files will not cooperate with processing software and get moved to an "exception list."

Many hosted ASP solutions also offer "TIFF on the fly" capabilities that can cause problems. TIFF'ing ESI is part science and part art and should be approached with caution. The default TIFF'ing practice for most production software is to use a file's native application, such as Microsoft Excel, to open and TIFF Excel files. Sometimes, a generic file viewer such as Stellent's (now owned by Oracle) Quick View Plus is used to print ESI when the native application is having trouble printing a given file.

Semi-corrupt files are files that will open, but will not print. One workaround for semi-corrupt files is to generate an image by performing a "print screen" of the opened file. This is a manual process that takes time and expense, but it may be important in the event that opposing counsel has a copy of the semi-corrupt file and goes the extra mile towards processing it.

Corrupt files are files that will not open at all. It is common for attachments to e-mails to become corrupt during transit and thus require a resend. Most ESI processing software will place a TIFF image placeholder in the final deliverable that reads "corrupt file," thus alerting counsel and possibly opposing counsel that a file existed in the collection but could not be opened and processed. In general, most batches of ESI that are processed for review will generate a list of exception files. It is important to acknowledge and address these files, as they may become important later in the case.

Most vendors will have access to password-cracking software that can get through most common file types. Generally speaking, vendors will provide a list of password-protected files to their clients first so that the client can avoid the charge for cracking passwords. Lotus Notes files, which typically come in the form of ".NSF" files, are so secure that cracking their passwords is typically impossible. If a discovery collection contains Lotus Notes e-mails, it is important to make sure they have been created without a password during the collection process. All password-protected files should be included on the exception file list that is sent to counsel for further direction.

Review Platform. Relay to the outside vendor or opposing counsel what type and version of review tool will be used in order to avoid costly rework once data arrives for loading. Unlike the old days, when the mere fact that a review tool was being employed was kept a secret, it is assumed today that some form of automated database tool will be employed to manage large discovery collections. It is possible to create a detailed specification sheet that can be provided to vendors and opposing counsel time after time, as there are specifications that will not change from case to case.

If an online hosted solution is being used, it is still possible to create a delivery specification sheet. Many leading hosted solutions will accept a Concordance "DAT" file as a load file format. A "DAT" file is typically used to store the extracted metadata and perhaps the full text, or body, of e-mails and files. Just be aware of the issues involved in text delimiters and field headers.

Native Review versus Image Review. For small collections, typically three gigabytes or less, many attorneys choose to have all ESI processed to TIFF images prior to review. One gigabyte alone of e-mails and their attachments can process out to anywhere from 50,000 to 300,000 TIFF images. Generally speaking, 20,000 TIFF images can be one gigabyte in size, and therefore one gigabyte of native ESI could result in storage requirements of fifteen gigabytes or more (300,000 images at roughly 20,000 images per gigabyte). TIFF images are large files that occupy a lot of space on the server, so it is important to alert the appropriate IT professional that a lot of storage space will be needed. Also allow for a long time to copy and load more than fifteen gigabytes of data and database files to a network.

One alternative is to process ESI to extracted metadata, full text and links to native attachments. After e-mails and their attachments have been reviewed and a subset of documents has been flagged as responsive, a list of those responsive documents can be provided back to the vendor or internal processing group for TIFF'ing. As TIFF'ing documents takes a long time and creates a huge collection of files that need to be loaded for review, native review prior to TIFF'ing is an attractive alternative.

Assigning Image Key/DocID. In either example above, it is important to decide upon an "image key" or "DocID" for either the extracted native files or TIFF images. This image key concept is similar to a Bates number in that a unique value must be assigned to files during their processing. A sample DocID would be: JSMITH0000001. Keep in mind that as one gigabyte of ESI can process out to 300,000 images, it is probably prudent to include a seven- or eight-place numeric component in the DocID. After a production subset of images is created, an actual Bates number can be burned on to the TIFF images using a consecutive numbering scheme.

Database Fields. Whenever ESI is processed for use in a database, metadata fields are extracted. Although Microsoft files contain more than 300 different types of metadata fields, only a select portion are relevant to attorneys. Examples of relevant metadata would include To, From, CC, SentDate, and Subject/RE. One area of confusion revolves around metadata extracted from e-mails versus their attachments. Oftentimes, the only useful pieces of metadata that can be extracted from attachments are the file name and type of file. The extracted metadata field "Author" for Microsoft Word files may be irrelevant and misleading, as this piece of metadata refers to the original creator of a file, not the actual author of the contents of a file.

Create a standard list of metadata fields that attorneys would like to see extracted during processing or provided by opposing counsel as part of a production. This will make the task of loading and creating a review database much simpler. These standard fields, which could make up the Concordance DAT file described above, should be part of a firm's technical standard.

Filtering Options. There are several common techniques that can be applied to an ESI collection to reduce its size prior to review. These filtering techniques should be agreed upon up front with opposing counsel so that no questions arise later as to why only certain documents have been produced. The most basic technique, de-duplication, has been addressed earlier in this article. The three remaining filtering techniques are key word, date range, and file type filtering.

A list of key words can be agreed upon with opposing counsel prior to collection and processing of ESI. Occasionally litigants themselves will use key words to search e-mail servers for responsive files. However, most IT tools are not capable of running key word searches against archived or compressed files such as ZIP files. If a Word file contains a key word, but is inside of a ZIP file attached to an e-mail, the key word search will typically not return that file.

Filtering by date range is a common technique that can reduce a harvested collection to just those files that fall within the period specified by a discovery request. If e-mails are harvested at the custodian level, by mailbox for example, it is likely that all of the custodian's e-mails will be collected, including those that extend beyond the scope of the discovery request. Date range filtering can be employed during processing to remove e-mails that are outside of the desired time period from the final deliverable.

Filtering by file type is a technique that can be employed to remove system files from an ESI collection, or to target only those file types that are necessary for review. Text files with the extension ".TXT" are often included in processing, but turn out to be system files containing useless information. It is dangerous to exclude all text files, though, as some may be actual correspondence that has been saved as simple text.

One area of concern involves non-standard file types that may derive from home-grown or custom software packages. If the underlying matter involves intellectual property, or specifically programming files, then those special file extensions need to be identified prior to processing so that they are not summarily excluded.

Printing Options. ESI processing software typically uses a given file's native application to open and print that file during processing. The printing process involves the creation of a TIFF image, rather than sending that file to a printer. Most ESI processing software is configured so it can override the print settings that have been put in place by the original author of a given file. Generally speaking, the best image fidelity can be achieved by using the native application to print a file. However, some semi-corrupt files can only be printed by a generic file viewer. The drawback of using a generic file viewer is that some formatting can be lost from the original settings. For example, Excel spreadsheets can be forced to always print out with gridlines and all hidden columns and rows revealed. Excel spreadsheets are the most challenging file type to process as they are rarely set up with printing in mind.

One common discovery mistake is failing to discuss the format in which ESI will be exchanged and moving forward with paper printouts. Opposing counsel may make a good argument that printed Excel spreadsheets are not acceptable as they need to have access to formulas. Therefore, it is important to agree up front, prior to processing, on the format in which ESI will be exchanged, or counsel risks an expensive process of going backwards to native files.

Conclusion

The prevalence of electronic discovery, in conjunction with the new FRCP rules dictating that issues related to ESI be addressed up front, indicates that the time has come for attorneys to address technical issues, or be versed in the issues to a greater extent.

Given that most law firms use the same review platforms over and over again, it is possible to create a firm-wide technical standard governing the processing of ESI. All of the sections addressed above can be memorialized into one document that can be used from case to case, as well as given to opposing counsel prior to the exchange of discovery. Each of the areas that have been addressed deals with important technical decisions that, if left to the vendor or internal processing team, inevitably will not follow the correct or desired path. To avoid this situation, most experienced vendors have resorted to refusing to move forward with a project unless attorneys sign off on a project setup specification sheet.

Many discovery agreements do not go into enough technical depth to cover such details as text delimiters, filtering options, field naming conventions and more. As a result, litigation support professionals within law firms oftentimes have to rework produced data or data received from a vendor. A large majority of this rework can be avoided with a robust, up-front project meeting that results in a document memorializing all possible technical details.
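Three of the decisions discussed above (the processing time zone, hash-based de-duplication with a global or custodian-level scope, and the DocID format) can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions, not how any particular vendor's software works: the names Item, displayed_date, deduplicate and make_doc_id are hypothetical, SHA-256 is an assumed choice of hash algorithm (the article does not name one), and fixed standard-time UTC offsets stand in for real time zone rules.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumption: fixed standard-time offsets; real tools must also handle daylight saving.
PACIFIC = timezone(timedelta(hours=-8))
EASTERN = timezone(timedelta(hours=-5))


def displayed_date(sent, processing_tz):
    """Return the calendar date a time stamp shows under the processing time zone."""
    return sent.astimezone(processing_tz).date()


@dataclass(frozen=True)
class Item:
    """Hypothetical record for one loose file or e-mail attachment."""
    custodian: str
    name: str
    content: bytes
    is_attachment: bool = False


def file_hash(content):
    """Hash the raw bytes of a file; any change to the bytes changes the value."""
    return hashlib.sha256(content).hexdigest()


def deduplicate(items, scope="global", keep_attachments=True):
    """Split items into (kept, removed) by comparing hash values.

    scope="global" compares across the whole collection;
    scope="custodian" compares only within each custodian.
    """
    seen = set()
    kept, removed = [], []
    for item in items:
        if keep_attachments and item.is_attachment:
            kept.append(item)  # attachments stay with their parent e-mail
            continue
        digest = file_hash(item.content)
        key = digest if scope == "global" else (item.custodian, digest)
        if key in seen:
            removed.append(item)
        else:
            seen.add(key)
            kept.append(item)
    return kept, removed


def make_doc_id(prefix, number, width=7):
    """Build a DocID such as JSMITH0000001 with a zero-padded numeric part."""
    return f"{prefix}{number:0{width}d}"


if __name__ == "__main__":
    sent = datetime(2008, 1, 7, 22, 30, tzinfo=PACIFIC)
    print(displayed_date(sent, PACIFIC), displayed_date(sent, EASTERN))
    print(make_doc_id("JSMITH", 1))
```

In this sketch an e-mail sent at 10:30 p.m. Pacific on January 7 shows a January 8 date when processed under Eastern time, which is the kind of one-day shift described under Time Zone. Keeping attachments out of the comparison reflects the stated preference that attachments not be separated from their e-mails; whether attachment hashes should also suppress later loose copies is a choice the parties would need to agree on.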
Because Every Piece of Electronic Evidence Counts: Count on Esquire

Esquire Data Discovery Services

Esquire Litigation Solutions' experienced staff ensures that all documents are forensically sound. They are carefully harvested, preserved, organized, processed and exported to protect against spoliation.

Count on Esquire to:
  • Preserve the integrity of the documents
  • Process files to your specifications
  • Provide a searchable online database

Other Esquire Litigation Support Services:
  • Full discovery services
  • Video services
  • Discovery repository and production
  • Trial preparation services
  • Trial presentation services

Contact us today for a free consultation or download a white paper at www.esquirelitigation.com.