Boolean Searching in eDiscovery
Most search tools on the market today use Boolean logic to define search parameters. Boolean logic is a complete system for logical operations and has been in wide use since the popularization of mathematical logic and the debates over the foundations of mathematics. It is named after George Boole, who first defined an algebraic system of logic in the mid-19th century. Boolean logic is built on three basic operators: AND, OR, and NOT.
When AND and OR operators are combined, it may be essential to group search terms in parentheses so that the parts of the expression are evaluated in the intended order. Quotation marks may also be needed to search for multi-word phrases. Proximity searching, wildcards, regular expression searches and field-specific searching (as opposed to searching the full document) usually make a Boolean search more complex still.
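To illustrate why quotation marks matter, here is a minimal Python sketch that contrasts an AND of two single words with a quoted phrase. It is not how any particular search tool is implemented; the helper names (words, has_all_words, has_phrase) and the sample sentence are assumptions made up for this illustration.

```python
import re

def words(text):
    """Return the lowercase word tokens of a document."""
    return re.findall(r"[a-z0-9]+", text.lower())

def has_all_words(text, *terms):
    """AND of single words: every term appears somewhere in the text."""
    tokens = set(words(text))
    return all(term.lower() in tokens for term in terms)

def has_phrase(text, phrase):
    """Quoted phrase: the words appear next to each other, in order."""
    tokens = words(text)
    target = words(phrase)
    n = len(target)
    return any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1))

# Hypothetical document: both words occur, but not as a phrase.
doc = "The price was agreed before fixing the delivery date."

print(has_all_words(doc, "price", "fixing"))  # True
print(has_phrase(doc, "price fixing"))        # False
```

The unquoted search matches a document that has nothing to do with price fixing, while the phrase search does not. Real tools may also apply stemming or stop-word rules, which this sketch ignores.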
A company's legal staff may misunderstand the rules of Boolean logic. When that happens, a search tool will not return the expected results.
For example, when a company is asked to provide all documents belonging to certain individuals that contain any of a series of keywords, company staff may point the search tool at the relevant mail files and use something similar to the following search syntax:
The logic seems simple enough: a series of names separated by OR, joined by an AND to a series of keywords (themselves separated by OR). However, the returned result set may be nearly as large as the combined mail files, and therefore almost useless.
The search syntax above was incorrect because the tool evaluated the AND operator before the OR operators. What was returned was essentially the logical equivalent of:
("Ms C" AND product1) became a single search criterion. Every other name or keyword became its own stand-alone criterion, so a document matching any one of them was returned. Two mistakes were made. First, parentheses should have been used to completely separate the names from the search terms:
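The precedence problem can be reproduced in a few lines of Python, whose own `and` operator binds more tightly than `or`, much like the search tool described here. This is only a sketch: the names "Mr A" and "Mr B", the keyword product2, and the five sample messages are placeholders invented for illustration, and simple substring matching stands in for a real search index.

```python
def contains(doc, term):
    """Case-insensitive substring match, standing in for a real search index."""
    return term.lower() in doc.lower()

def ungrouped(doc):
    """"Mr A" OR "Mr B" OR "Ms C" AND product1 OR product2 (no parentheses)."""
    t = lambda term: contains(doc, term)
    # `and` is evaluated before `or`, so only "Ms C" is tied to product1.
    return t("Mr A") or t("Mr B") or t("Ms C") and t("product1") or t("product2")

def grouped(doc):
    """("Mr A" OR "Mr B" OR "Ms C") AND (product1 OR product2)."""
    t = lambda term: contains(doc, term)
    return (t("Mr A") or t("Mr B") or t("Ms C")) and (t("product1") or t("product2"))

# Hypothetical messages, not real data.
docs = [
    "From: Mr A. Lunch on Friday?",
    "From: Mr B. Pricing for product1 attached.",
    "From: Ms C. Holiday schedule.",
    "From: Ms C. product2 launch notes.",
    "Newsletter: product2 is on sale.",
]

ungrouped_hits = [d for d in docs if ungrouped(d)]
grouped_hits = [d for d in docs if grouped(d)]

print(len(ungrouped_hits))  # 4
print(len(grouped_hits))    # 2
```

Without parentheses, any message that mentions "Mr A" or "Mr B" is returned regardless of keywords, and so is any message that mentions product2 regardless of who it belongs to. Operator precedence differs between search tools, so the tool's own documentation should be checked rather than assumed to behave this way.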
Second, searching for the users' names within their own mail files was redundant, since a name on its own would probably match nearly every document in those databases. In fact, the initial search returned almost every document from two of the three users; the remaining user's name was bound to product1 by the AND, so it did not match on its own.
So it's important to understand both how the search tool works and how Boolean search logic is evaluated. It can also be useful to run a sample search against a subset of the actual data to make sure the search returns the appropriate results.
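One way to run such a sample check is sketched below in Python: draw a random subset of the documents, apply the query, and look at the share of hits. A query that matches nearly everything in the sample deserves a second look before it is run against the full data set. The function names, the sample size of 100, the 0.9 threshold and the synthetic mailbox are all assumptions chosen for illustration, not recommended values.

```python
import random

def hit_rate(docs, query, sample_size=100, seed=0):
    """Run `query` on a random sample of `docs`; return (hits, sample_count)."""
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    sample = docs if len(docs) <= sample_size else rng.sample(docs, sample_size)
    hits = sum(1 for d in sample if query(d))
    return hits, len(sample)

def looks_too_broad(hits, total, threshold=0.9):
    """Flag a query whose hit share in the sample reaches the (assumed) threshold."""
    return total > 0 and hits / total >= threshold

# Synthetic mailbox: every message carries the owner's name,
# and 1 message in 20 mentions the hypothetical keyword product1.
mailbox = [
    f"From: Mr A. Message {i}" + (" about product1" if i % 20 == 0 else "")
    for i in range(1000)
]

name_query = lambda d: "mr a" in d.lower()
keyword_query = lambda d: "product1" in d.lower()

name_hits, name_total = hit_rate(mailbox, name_query)
kw_hits, kw_total = hit_rate(mailbox, keyword_query)

print(looks_too_broad(name_hits, name_total))  # True
print(looks_too_broad(kw_hits, kw_total))      # False
```

The name query matches every sampled message because the owner's name appears in all of them, which is the redundancy described above. The keyword query can match at most 50 of the 1,000 messages, so it can never reach the threshold.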
Preparation and communication are the key principles of effective eDiscovery management. The ability to find relevant data professionally, accurately, and in a manner that's defensible in the eyes of the court is critical when litigation is concerned.