There are two kinds of glossaries in the world. The first lists every related term and acronym in existence, implying users have the leisure to sort through hundreds or even thousands of entries. The second covers a select list of key terms and acronyms, distilling everything into a digestible format.
The Ultimate eDiscovery Glossary is the second kind. Here we lay out essential terminology every legal professional should know. That’s important because, from general concepts like processing and production to granular terms for file types, eDiscovery is a language – and a complex one at that. The terminology can confuse even veteran litigators. No matter where you sit – in a law firm or a law department – the industry lingo can seem like a crazy alphabet soup at times.
We designed this curated glossary to save users time. Dive in, and please let us know how we can keep improving the material.
The Ultimate eDiscovery Glossary Table of Contents
- eDiscovery Practice
- Documents and Data
- eDiscovery Stages
- eDiscovery Terms
- Review
- Productions
Let’s begin.
1. eDiscovery Practice
Clawback Agreement: A safety net for privileged documents produced inadvertently, allowing parties to demand their return.
Deficiency Notice: A government notice of noncompliance concerning document productions and discovery obligations.
Deposition: The process of obtaining sworn oral testimony before trial.
ESI Agreement: An agreement by parties that outlines the scope, format, and requirements of electronic data that will be collected, reviewed, and produced in a matter.
Interrogatories: A list of questions from an opposing party you must answer in writing as part of discovery.
Legal Hold: Ahead of actual or anticipated litigation, an instruction from counsel to potential custodians to not delete anything that may be relevant and discoverable.
Meet and Confer: A meeting between opposing parties to resolve discovery issues regarding data collection, processing, review, and production.
Motion to Compel: A motion asking the court to order an opposing party to comply with a discovery request and produce information they have yet to provide during discovery.
Protective Order (or Confidentiality Order): A court order restricting the dissemination of information produced in discovery to protect private, confidential, or commercially sensitive content.
Request for Production (RFP): A discovery process used to gain access to documents held by an opposing party in a legal matter.
Subpoena: A command by a court or agency to produce documents.
Substantial Compliance: The good faith effort to provide all relevant information in response to discovery requests, even if every discoverable document has not yet been provided.
2. Documents and Data
Custodian: A named individual or discrete data source with administrative control over electronic files to be collected.
Data Sources: The specific accounts, systems, devices, and tools at issue in a legal matter.
Digital Forensics: The science focused on defensibly identifying, acquiring, processing, analyzing, and reporting on electronically stored information.
Electronically Stored Information (ESI): Information that exists in a digital environment. Examples include email, user documents, scans, chat files, mobile data, photos, videos, and voicemails.
Forensic Collection: Defensibly gathering ESI from a custodian or system using tools and protocols that preserve the data’s integrity.
Personal Storage Table (PST): A type of file format used to store or export and transfer emails, contacts, calendars, and metadata associated with an email account.
Spoliation: The destruction or alteration of electronic evidence relevant to actual or anticipated litigation.
System Files: Documents not generated by a human. Contrast with User Files.
User Files: Documents generated by a human. Contrast with System Files.
3. eDiscovery Stages
Collection: The process of defensibly gathering and securing potentially relevant ESI as defined in the identification phase of the electronic discovery process.
Early Case Assessment (ECA): Analyzing datasets prior to review to determine the scope, complexity, and potential relevance to help legal teams make informed decisions about the next steps.
Hosting: The process of storing, managing, and presenting data to legal teams for review in a secure online environment.
Identification Phase: Identifying the data sources for key players and systems important to a legal matter.
Phases: Strategic stages of the electronic discovery process, including identification, preservation, collection, processing, review, analysis, and production.
Presentation Phase: Displaying ESI in a manner that is easily accessible and viewable to a judge, jury, and counsel.
Preservation Stage: Reasonable and deliberate steps to safeguard ESI potentially relevant to a legal matter, including data retention policies and legal holds.
Processing: Standardizing and organizing ESI into a searchable, readable, and transferable format.
Production Phase: The delivery of responsive, non-privileged information to opposing parties in a format consistent with a court order, subpoena specification, or mutual agreement.
Promotion: Moving ESI identified for review from the processing or ECA data store to the review workspace.
Review Phase: Evaluation of ESI to determine responsiveness, privilege, significance, and/or relevance to key themes.
4. eDiscovery Terms
Analytics: Statistical and machine learning technology used to organize ESI, provide insight, and potentially reduce the data volume requiring review.
Deduplication: The process of isolating identical documents in a collection and suppressing the duplicative copies from review.
Document Family: A group of documents that were electronically related in the ordinary course and are treated as a single unit for the purpose of review and production, such as an email and its attachments.
Email Threading: Programmatically grouping related emails and identifying the original and all replies, forwards, and branching messages within that conversation.
Exception File: An electronic document that requires additional attention before review or production due to corruption, password protection, or the need for specialized software.
Images: ESI converted into a picture, used during production, that reflects the contents as if the file were in paper format or viewed on-screen.
Index: An alphanumeric list of every word in the workspace, generated during processing and used for accurate, efficient file retrieval during text-based searching.
Metadata: Information about an electronic document, such as email sender, filename, file extension, and file size.
MD5 Hash: A fixed-length string of 32 hexadecimal characters computed from a document's contents and used as a digital fingerprint during processing and deduplication. Identical files produce identical hashes.
Native File: An electronic document in its original format, retaining the structure, appearance, and functionality defined by the application used to create the file.
Near Duplicates: Documents that are similar in content but not identical, grouped to help streamline review and ensure consistency.
Optical Character Recognition (OCR): The process of converting an image of text into machine-readable and searchable text.
Review Platform: Software such as Relativity that standardizes, organizes, and presents ESI in a format that supports collaborative and efficient searching and review.
Searching: The process of finding individual documents or groups of documents by looking for terms of interest or specific document characteristics.
Text: The written contents of a document.
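To make the Deduplication and MD5 Hash entries concrete, here is a minimal Python sketch. It assumes each document is hashed over its raw bytes and that the first copy encountered is the one kept for review. Both are illustrative assumptions, as are the function names and document IDs. Review platforms may hash selected metadata fields instead, particularly for email.

```python
import hashlib
from collections import defaultdict


def md5_hex(content: bytes) -> str:
    """Return the MD5 digest of the content as 32 hexadecimal characters."""
    return hashlib.md5(content).hexdigest()


def deduplicate(documents: dict[str, bytes]) -> tuple[list[str], dict[str, list[str]]]:
    """Keep the first document seen for each hash and suppress later exact copies.

    Returns (kept_ids, suppressed), where suppressed maps each kept ID
    to the IDs of the duplicates hidden behind it.
    """
    first_seen: dict[str, str] = {}  # hash -> ID of the copy we keep
    suppressed: dict[str, list[str]] = defaultdict(list)
    for doc_id, content in documents.items():
        digest = md5_hex(content)
        if digest in first_seen:
            suppressed[first_seen[digest]].append(doc_id)
        else:
            first_seen[digest] = doc_id
    return list(first_seen.values()), dict(suppressed)


# Hypothetical documents: DOC-002 is byte-for-byte identical to DOC-001,
# while DOC-003 differs by a single character.
docs = {
    "DOC-001": b"Quarterly forecast attached.",
    "DOC-002": b"Quarterly forecast attached.",
    "DOC-003": b"Quarterly forecast attached!",
}
kept, suppressed = deduplicate(docs)
print(kept)        # IDs promoted to review
print(suppressed)  # duplicates suppressed behind each kept ID
```

Note that DOC-003 survives deduplication because a one-character change yields a different hash. Grouping documents like that is the job of near-duplicate detection, not hash-based deduplication.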
5. Review
Batches: A discrete set of electronic documents assigned to an individual reviewer for task distribution and progress management.
Coding: Applying tags to electronic documents based on discoverability, theme, significance, and other work product germane to the matter.
Dataset Analysis: The process of looking at a dataset holistically to identify patterns, themes, gaps, items of interest, and subsets warranting a particular next step.
Data Visualization: A tool within review platforms that displays metadata and work product in tables and charts, allowing users to pivot and filter quickly.
Domains: The portion of an email address after the @ symbol, identifying the organizational entity associated with individual user email addresses.
Fact Development: Discovering facts within a given dataset and building a story for use offensively or defensively in work on the merits.
Fields: Data containers that store document-specific metadata, work product, and tracking details.
Folders: The area in a review platform that mimics the original folder structure of the source data used to organize documents by collection or production.
Layouts: A customizable display of the fields that reviewers reference and use while coding.
Privilege Log: A log of privileged documents withheld from production that includes key metadata fields, a description of the subject matter, and the basis of the privilege asserted.
Quality Control (QC): The process used to validate work product quality and remediate errors.
Redactions: Obscuring text before production to protect privileged and sensitive content.
Saved Searches: A saved set of criteria that returns documents meeting the search and filter definitions.
Technology-Assisted Review (TAR 1.0): An iterative TAR workflow where humans code consecutive sample sets within a dataset to train a model that predicts and propagates coding on unreviewed or low-confidence items.
Technology-Assisted Review (TAR 2.0): A non-iterative TAR workflow that continuously learns, improves, and recodes the entire dataset as human coding progresses.
Viewer: Functionality in eDiscovery platforms that allows users to read and interact with individual documents in a manner consistent with how the document presents in the ordinary course.
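As a rough illustration of the Batches entry, the Python sketch below splits documents into review batches of a target size. It assumes, hypothetically, that each document carries a family identifier and that families should stay together in one batch, since a document family is treated as a single unit for review. The function name, IDs, and batch size are invented for the example and do not reflect any particular review platform.

```python
def batch_by_family(documents: list[tuple[str, str]], batch_size: int) -> list[list[str]]:
    """Group (doc_id, family_id) pairs into batches without splitting a family.

    A batch is closed when adding the next family would exceed batch_size.
    A family larger than batch_size still gets a batch of its own.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # Collect family members, preserving the order in which families appear.
    families: dict[str, list[str]] = {}
    for doc_id, family_id in documents:
        families.setdefault(family_id, []).append(doc_id)

    batches: list[list[str]] = []
    current: list[str] = []
    for members in families.values():
        if current and len(current) + len(members) > batch_size:
            batches.append(current)
            current = []
        current.extend(members)
    if current:
        batches.append(current)
    return batches


# Hypothetical data: F1 is an email with one attachment, F2 is a loose file,
# and F3 is an email with two attachments.
documents = [
    ("D1", "F1"), ("D2", "F1"),
    ("D3", "F2"),
    ("D4", "F3"), ("D5", "F3"), ("D6", "F3"),
]
batches = batch_by_family(documents, batch_size=3)
for number, batch in enumerate(batches, start=1):
    print(f"Batch {number}: {batch}")
```

Keeping families intact means batch sizes can vary slightly, which is a trade-off against giving every reviewer exactly the same number of documents.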
6. Productions
Bates Number: A unique, sequential number applied to each page in a production used to index and cross-reference items exchanged during discovery.
Encryption: A security protocol that makes readable data indecipherable until unlocked with an encryption key.
Endorsement: Text applied to production images, including Bates Numbers, confidentiality endorsements, and other production-specific references.
Load File: A data file that includes metadata and document identifiers for ESI that supports efficient upload and linking after transfer or production.
Placeholder: A one-page, non-substantive production image that stands in for a file withheld for privilege, produced natively, or unable to be fully processed, used to avoid a gap in Bates Numbers.
Production Deliverable: A container file provided during discovery that aggregates production images, text files, native files, and the load file, organized in a way that supports efficient upload into a review platform.
Production Format: The components and format required for productions exchanged in discovery consistent with a court order, subpoena specification, or mutual agreement.
Production Image: The image file exchanged during discovery, consisting of ESI converted into a picture that reflects the contents as if the file were in paper format and that includes Bates Numbers, endorsements, and redactions if present.
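The sequential, page-level nature of Bates Numbers can be sketched in a few lines of Python. The prefix, zero-padded width, and document IDs below are hypothetical choices for illustration. Actual numbering conventions are set by the production format agreed for the matter.

```python
def assign_bates(
    page_counts: list[tuple[str, int]],
    prefix: str,
    start: int = 1,
    width: int = 7,
) -> list[tuple[str, str, str]]:
    """Assign a contiguous Bates range to each document.

    page_counts holds (doc_id, number_of_pages) in production order.
    Returns (doc_id, first_bates, last_bates) for each document.
    """
    ranges = []
    next_number = start
    for doc_id, pages in page_counts:
        if pages < 1:
            raise ValueError(f"{doc_id} must have at least one page")
        first = f"{prefix}{next_number:0{width}d}"
        last = f"{prefix}{next_number + pages - 1:0{width}d}"
        ranges.append((doc_id, first, last))
        next_number += pages  # the next document starts on the following number
    return ranges


# Hypothetical production: DOC-002 is a one-page placeholder for a file
# produced natively, so it still consumes exactly one Bates Number.
ranges = assign_bates(
    [("DOC-001", 3), ("DOC-002", 1), ("DOC-003", 2)],
    prefix="ABC",
)
for doc_id, first, last in ranges:
    print(doc_id, first, last)
```

Because every page, including placeholder pages, takes the next number in sequence, the ranges never overlap or leave gaps. The first and last numbers for each document are the kind of identifiers a load file would typically carry.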