There are two kinds of glossaries in the world.
The first lists every related term and acronym in existence, implying users have the leisure to sort through hundreds or even thousands of entries. The second covers a select list of key terms and acronyms, distilling everything into a digestible format.
The Ultimate eDiscovery Glossary is the second kind. Here we lay out essential terminology every legal professional should know.
That’s important because, from general concepts like processing and production to granular terms for file types, eDiscovery is a language – and a complex one at that. The terminology can mix up even veteran litigators. No matter where you sit – in a law firm or a law department – the industry lingo can seem like a crazy alphabet soup at times.
We designed this curated glossary to save users time. The handy links below allow for quick hops to the relevant sections.
Dive in, and please let us know how we can keep improving the material.
The Ultimate eDiscovery Glossary Table of Contents
- eDiscovery Practice
- Documents and Data
- eDiscovery Stages
- eDiscovery Terms
- Productions
- Review
Let’s begin.
1. eDiscovery Practice
Clawback Agreement: A safety net for privileged documents that get produced inadvertently, allowing parties to demand their return.
Deficiency Notice: A government notice of noncompliance with respect to document productions and discovery obligations.
Deposition: The process of obtaining sworn oral testimony before trial.
ESI Agreement: An agreement by parties that outlines the scope, format, and requirements of electronic data that will be collected, reviewed, and produced in a matter.
Interrogatories: A list of questions from an opposing party you must answer in writing as part of discovery.
Legal Hold: Ahead of actual or anticipated litigation, an instruction from counsel to potential custodians to not delete anything that may be relevant and discoverable.
Meet and Confer: A meeting between opposing parties to resolve discovery issues regarding data collection, processing, review, and production.
Motion to Compel: A motion asking the court to order an opposing party to comply with a discovery request and produce information they have not yet provided during discovery.
Protective Order (or Confidentiality Order): A court order that restricts the dissemination of information produced in discovery to protect private, confidential, or commercially sensitive content.
Request for Production (RFP): A discovery process used to gain access to documents held by an opposing party in a legal matter.
Subpoena: A command by a court or agency to produce documents.
Substantial Compliance: The good faith effort to provide all relevant information in response to discovery requests, even if every single discoverable document has not yet been provided.
2. Documents and Data
Custodian: A named individual or discrete data source with administrative control over electronic files to be collected.
Data Sources: The specific accounts, systems, devices, and tools at issue in a legal matter.
Digital Forensics: The science focused on defensibly identifying, acquiring, processing, analyzing, and reporting on electronically stored information.
Electronically Stored Information (ESI): Information that exists in a digital environment. Examples include email, user documents, scans, chat files, mobile data, photos, videos, and voicemails.
Forensic Collection: Defensibly gathering ESI from a custodian or system, ensuring and preserving the integrity of the data, and satisfying the requirements of the court and opposing parties.
Personal Storage Table (PST): A type of file format used by Microsoft Outlook to store emails, contacts, calendars, and other items. PST files are commonly used by individuals and organizations to archive their email data or to transfer data from one computer to another. They can contain a large amount of data, including email messages, attachments, and metadata.
Spoliation: The destruction or alteration of electronic evidence.
System Files: Documents not generated by a human. Contrast with User Files.
User Files: Documents generated by a human. Contrast with System Files.
3. eDiscovery Stages
Collection: The acquisition of potentially relevant electronically stored information (ESI) as defined in the identification phase of the electronic discovery process.
Early Case Assessment (ECA): A process to quickly identify relevant electronic data in a legal matter. It involves analyzing datasets to determine their scope, complexity, and potential value in a case, which helps attorneys make informed decisions about the next steps. ECA can save time and money by reducing the volume of data that needs to be reviewed for discovery, making it a critical step in the litigation process.
Hosting Costs: A unit-based (per gigabyte/month typically) flat fee for the data being securely stored in the review platform.
Hosting: The process of storing and managing data in a secure online environment. This allows authorized parties to access and review the data, and might also include features such as search functionality, document tagging, and collaborative review tools.
Identification Phase: Identifying key players and their data sources.
Phases: Stages of the electronic discovery process which include identification, preservation, collection, processing, review, analysis, and production of electronic data. Each phase is designed to manage and extract relevant information from electronically stored information (ESI) and to ensure that the information is properly produced in accordance with legal requirements.
Presentation Phase: Displaying Electronically Stored Information in a manner that is easily understood and accessible to an audience, such as a judge, jury, or opposing counsel. This can involve using visual aids, such as charts, graphs, or timelines, to highlight key facts or arguments.
Preservation Stage: Taking reasonable steps to safeguard Electronically Stored Information that is potentially relevant to a legal matter. This can involve issuing litigation holds, implementing data retention policies, and taking other measures to prevent the destruction, alteration, or loss of relevant data.
Processing: Converting data from its raw form into a standardized format so it is searchable, readable, and able to be sent to other parties.
Production Phase: The process of delivering Electronically Stored Information that has been identified as relevant and responsive to opposing parties. It involves formatting, assigning unique identifiers, exporting, and transferring data in a mutually agreed-upon format or according to a court order.
Promotion: Moving data from the Early Case Assessment workspace/folder to a Review workspace/folder so large groups of people can work on it simultaneously.
Review Phase: The process of evaluating Electronically Stored Information to identify relevant documents and information for a legal matter or investigation. This process may include document review, privilege review, and quality control checks, and is typically conducted by legal teams and other relevant parties using specialized software and tools.
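The Hosting Costs and Early Case Assessment entries above can be tied together with a little arithmetic. The sketch below assumes a flat per-gigabyte monthly rate; the rate, the data volumes, and the function name `hosting_cost` are hypothetical values chosen for illustration, not figures from any vendor.

```python
def hosting_cost(gigabytes: float, rate_per_gb_month: float, months: int) -> float:
    """Flat unit-based hosting fee: volume x monthly rate x number of months."""
    return round(gigabytes * rate_per_gb_month * months, 2)


# Hypothetical matter: 500 GB collected, 125 GB promoted to review after ECA.
collected_gb = 500.0
promoted_gb = 125.0
rate = 10.0  # assumed price per GB per month, for illustration only

print(hosting_cost(collected_gb, rate, 6))  # 30000.0
print(hosting_cost(promoted_gb, rate, 6))   # 7500.0
```

Under these assumed numbers, hosting only the promoted data costs a quarter of hosting everything collected, which is the kind of saving ECA is meant to deliver.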
4. eDiscovery Terms
Analytics: Statistical and machine learning technology used to organize and potentially reduce data volume in order to streamline review, reduce costs, and increase the accuracy and completeness of the results.
Deduplication: A mathematical process that determines whether documents are exactly the same and, if so, promotes only one copy for review and production. Duplicates are typically identified by comparing hash values computed from document content or metadata. See MD5 Hash.
Document Family: A group of related documents that are considered as a single unit for the purpose of review and production. Example: an email and its attachments.
Email Threading: An algorithmic process that determines if an email is part of a chain of back-and-forth replies, and if so, where in the chain the relevant email sits.
Exception File: An electronic document that is not produced or disclosed due to the inability to review for relevance or privilege. Reasons for exceptions include password protection, corruption, or proprietary format.
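Hash: A delicious mix of potatoes and corned beef. In eDiscovery, a fixed-length string of characters computed from a file's contents that acts as a digital fingerprint. See MD5 Hash.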
Images: Converting documents, emails, and other Electronically Stored Information to a static format that preserves the appearance and formatting of the original while reducing the risk of metadata or other hidden information being inadvertently disclosed.
Index: All of the text and metadata that is searchable by the front-end user.
Metadata: Information about a document, such as email sender, filename, and file size.
MD5 Hash: A 32-character hexadecimal value computed from a document's contents with the MD5 algorithm. Identical content always yields the same value, which is why hash values are used for deduplication. Values for emails may differ between eDiscovery software because each tool can hash a different combination of fields.
Native File: An electronic file in its original form, e.g., email, Excel, PowerPoint, Word.
Near Duplicates: Documents similar in content but not identical — often differing only by minor changes, such as formatting — which are grouped together to help streamline review and ensure consistency.
Near Native Forms: View option which renders natives with nearly identical resemblance to the original document, while providing enhanced search, review, metadata identification, and other advanced capabilities.
Optical Character Recognition (OCR): Used to generate text from an image.
Review Platform: Software that standardizes native files and extracts key document components into searchable fields for review and production, such as Relativity.
Searching: The method used to find individual or groups of documents by looking for content or characteristics.
Text: The contents of a document.
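The Deduplication and MD5 Hash entries above describe comparing fingerprints rather than reading documents side by side. Here is a minimal sketch of that idea using Python's standard `hashlib` module. It hashes raw file bytes only; real eDiscovery software may hash other combinations of content and metadata, and the document IDs and the `deduplicate` helper are hypothetical.

```python
import hashlib


def md5_hex(content: bytes) -> str:
    """Return the MD5 digest of the bytes as 32 hexadecimal characters."""
    return hashlib.md5(content).hexdigest()


def deduplicate(documents: dict[str, bytes]) -> tuple[dict[str, str], dict[str, str]]:
    """Split documents into retained copies and exact duplicates.

    Returns (unique, duplicates):
      unique maps each hash to the first document ID seen with that hash;
      duplicates maps each duplicate ID to the ID of the retained copy.
    """
    unique: dict[str, str] = {}
    duplicates: dict[str, str] = {}
    for doc_id, content in documents.items():
        digest = md5_hex(content)
        if digest in unique:
            duplicates[doc_id] = unique[digest]
        else:
            unique[digest] = doc_id
    return unique, duplicates


# Hypothetical documents: the first two are byte-for-byte identical.
docs = {
    "DOC-001": b"Quarterly forecast attached.",
    "DOC-002": b"Quarterly forecast attached.",
    "DOC-003": b"Quarterly forecast attached!",
}
unique, duplicates = deduplicate(docs)
print(duplicates)  # {'DOC-002': 'DOC-001'}
```

DOC-003 differs by a single character, so it gets a different hash and is kept. Catching that kind of almost-match is the job of near-duplicate analysis, not hashing.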
5. Productions
Bates Number: A unique number assigned to every page or document produced. Bates numbers act as cross-reference numbers when both parties refer to a specific document. Bates numbers are usually endorsed onto an image; when a document is produced natively, the filename of the native is replaced with the Bates number.
Encryption: A security protocol that makes readable data indecipherable until it is unlocked with an encryption key.
Endorsement: A label or piece of information applied to printed documents. Common examples are Bates numbers and confidentiality designations. See Bates Number.
Load File: A file that provides the necessary information for importing electronic documents into a document review platform. It includes metadata, such as the document’s file name, date, and author, allowing the documents to be easily searched and reviewed.
Placeholder: A one-page image that serves as a bookmark for inaccessible files or files that are produced natively. AKA slipsheet.
Production Deliverable: Typically, a zip container holding a load file and folders with produced images, text files, and native files that can be loaded into another party’s review platform. The exact format is a point of negotiation between the parties. Seemingly can only be made between 5 p.m. and midnight on Fridays.
Production Format: The specific format of electronic document images, natives, and metadata delivered during the discovery process, which are stipulated by all parties in advance.
Production Image: The final version of a document that is presented to the court. A flattened version of the produced document that also shows the Bates number, redactions, and other endorsements.
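The Bates Number and Load File entries above can be illustrated together. The sketch below assigns sequential, zero-padded Bates numbers per page and writes a simple comma-separated load file. The prefix, padding width, column names, and file names are all assumptions for illustration; actual load file formats and fields are negotiated between the parties.

```python
import csv
import io


def bates_range(prefix: str, start: int, page_count: int, width: int = 7) -> tuple[str, str]:
    """Return the (first, last) Bates numbers for a document of page_count pages."""
    if page_count < 1:
        raise ValueError("a produced document needs at least one page")
    first = f"{prefix}{start:0{width}d}"
    last = f"{prefix}{start + page_count - 1:0{width}d}"
    return first, last


def build_load_file(prefix: str, documents: list[tuple[str, int]], start: int = 1) -> str:
    """Build CSV text with one row per document and its Bates range."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["BegBates", "EndBates", "FileName", "PageCount"])
    next_number = start
    for file_name, page_count in documents:
        first, last = bates_range(prefix, next_number, page_count)
        writer.writerow([first, last, file_name, page_count])
        next_number += page_count  # the next document starts on the next page number
    return buffer.getvalue()


# Hypothetical production: a three-page contract and a one-page email.
load_file = build_load_file("ABC", [("contract.pdf", 3), ("email.msg", 1)])
print(load_file)
```

With these inputs the contract spans ABC0000001 through ABC0000003 and the email receives ABC0000004, so no page number is ever reused within the production.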
6. Review
Batches: A manageable set of electronic documents assigned to each reviewer to enable progress tracking and promote consistency.
Coding: Applying tags to electronic documents to identify and organize them according to various criteria, such as relevance, privilege, confidentiality, or issues categories.
Dataset Analysis: The process of looking at a dataset to identify patterns or to verify the completeness of collections.
Data Visualization: A tool within review platforms that displays metadata or work product in tables and charts. AKA widgets and dashboards.
Domains: The part of an email address after the @ symbol, which identifies the entity behind it, such as amazon.com and target.com.
Fact Development: Building the story the data tells by focusing on the key points needed to win.
Fields: A container for storing information about a document. Fields store information such as metadata or work product.
Folders: The area in a review platform that mimics the original folder structure of the source data. Also used to organize documents from other parties.
Layouts: The specific way of displaying fields used by reviewers.
Privilege Log: A record of electronic documents withheld from the production on the basis of attorney-client privilege, work product protection, or other legal privileges. The log typically includes information such as the document’s title, author, recipient, date, and a description of the basis for the privilege claim.
Quality Control (QC): The process used to make sure the review was done correctly.
Redactions: Drawing black boxes over privileged or secret things. Think spy movies.
Saved Searches: The area in a review platform housing dynamic searches. A dynamic search is a saved set of criteria that returns the latest documents that meet those criteria.
Technology-Assisted Review (TAR 1.0): A method of using technology to help humans review large sets of documents or data more quickly and accurately. TAR 1.0 utilizes rounds of overturns to ensure accuracy.
Technology-Assisted Review (TAR 2.0): A method of using technology to help humans review large sets of documents or data more quickly and accurately. TAR 2.0 utilizes continuous active learning to ensure accuracy.
Viewer: Functionality in eDiscovery software platforms that allows users to see and search electronic documents. Viewers can display a wide range of file types, including text, image, and multimedia files, and offer features like keyword searching, highlighting, and redaction.
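The Saved Searches entry above hinges on the word "dynamic": the criteria are stored, not the results. A minimal sketch of that behavior follows. The `Document` and `SavedSearch` classes, field names, and coding values are hypothetical and do not reflect any particular review platform's API.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class Document:
    doc_id: str
    sender: str
    coding: str  # reviewer's tag; empty string means not yet coded


@dataclass
class SavedSearch:
    name: str
    criteria: Callable[[Document], bool]  # stored rule, re-evaluated on every run

    def run(self, workspace: list[Document]) -> list[str]:
        """Return IDs of the documents that meet the criteria right now."""
        return [doc.doc_id for doc in workspace if self.criteria(doc)]


workspace = [
    Document("DOC-001", "pat@amazon.com", "Responsive"),
    Document("DOC-002", "lee@target.com", ""),
]
uncoded = SavedSearch("Uncoded documents", lambda doc: doc.coding == "")

first_run = uncoded.run(workspace)
print(first_run)  # ['DOC-002']

# A reviewer codes DOC-002 and a new document is loaded into the workspace.
workspace[1].coding = "Not Responsive"
workspace.append(Document("DOC-003", "sam@target.com", ""))

second_run = uncoded.run(workspace)
print(second_run)  # ['DOC-003']
```

The same saved search returns different documents on each run because it re-applies its criteria to whatever is in the workspace at that moment.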