Predictive Coding Software In The E-Disclosure Process Approved By English Court.


25 April, 2016


Pyrrho Investments Limited and another v MWB Property Limited and others [2016] EWHC 256 (Ch) 




The huge potential for electronic discovery in Singapore has been recognised in Global Yellow Pages Ltd v Promedia Directories Pte Ltd and another suit [2013] 3 SLR 758 (“Global Yellow Pages”).


In the recent case of Pyrrho Investments Limited and another v MWB Property Limited and others [2016] EWHC 256 (Ch) (“Pyrrho”), the High Court of England and Wales approved the use of predictive coding software in electronic discovery.


This meant that document review would initially be undertaken by proprietary computer software customised for the issues in the proceedings, rather than by human beings. The software would score the reviewed documents for relevance to the issues raised, saving time and reducing costs. Unlike the costs of human review, the costs of such electronic discovery would not increase at the same rate as the number of documents to be reviewed.




The 1st Claimant sued (as assignee of the 2nd Claimant) in respect of payments made by the 2nd Claimant due to alleged breaches of fiduciary duty by the 2nd to 5th Defendants (who were directors of the 2nd Claimant).


There was also a specific claim in respect of a dividend that was declared by the 2nd Claimant on 11 June 2009 in the sum of approximately £9 million.


The claim was amended in 2014 to include a second group of claims that the 2nd to 5th Defendants caused the 2nd Claimant to enter into transactions with companies in which they themselves were secretly interested, and thereby extracted some £28.5 million from the 2nd Claimant over a period of 5 years.


The bulk of the documents were in the control of the 2nd Claimant. While the original total of 17.6 million files was reduced to 3.1 million by a process of electronic de-duplication, it was still a large and costly number to search.
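As a rough illustration of what electronic de-duplication involves, the sketch below keeps one copy of each file whose content is byte-for-byte identical. It assumes exact-duplicate detection by SHA-256 hash; the function name `deduplicate`, the file names and the file contents are hypothetical, and the article does not say which method was actually used in Pyrrho.

```python
import hashlib


def deduplicate(files):
    """Keep the first file seen for each distinct content hash.

    `files` maps a file name to its raw bytes (a hypothetical input shape).
    """
    seen = set()
    unique = {}
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique[name] = content
    return unique


# Hypothetical files: the first two have identical content.
files = {
    "board_minutes.doc": b"Minutes of the board meeting",
    "board_minutes_copy.doc": b"Minutes of the board meeting",
    "dividend_memo.doc": b"Memo on the proposed dividend",
}
unique = deduplicate(files)
print(len(files), "files reduced to", len(unique))
```

Commercial tools may also detect near-duplicates (for example, the same email with a different footer), which a plain content hash will not catch.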




The Predictive Coding Process


The Court in Pyrrho helpfully outlined a typical predictive coding process.


First, parties will settle a predictive coding protocol, including definition of the data set, sample size, batches, control set, reviewers, confidence level and margin of error. Criteria must then be decided upon for inclusion of documents, such as who had the documents and date range, as well as keywords.


Next, a representative sample of the documents is used to ‘train’ the software. A person (eg a lawyer involved in the litigation) considers and makes a decision for the documents in the sample, and each document is categorised. It is essential that the criteria for relevance be consistently applied at this stage (eg by a single, senior lawyer who has mastered the issues in the case). Based on the ‘training’, the software then reviews and categorises each individual document in the whole document set as either relevant or not.
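The ‘training’ step can be pictured with a toy example. The sketch below learns a weight for each word from a small sample coded by a reviewer, then uses those weights to categorise documents it has not seen. It is a deliberately simplified, naive Bayes-style illustration: the function names `train` and `score`, the sample texts and the zero threshold are all assumptions, and commercial predictive coding software is proprietary and considerably more sophisticated.

```python
import math
from collections import Counter


def train(sample):
    """Learn a weight per word from (text, is_relevant) pairs coded by a reviewer."""
    relevant, irrelevant = Counter(), Counter()
    for text, is_relevant in sample:
        (relevant if is_relevant else irrelevant).update(text.lower().split())
    vocabulary = set(relevant) | set(irrelevant)
    n_rel = sum(relevant.values()) + len(vocabulary)
    n_irr = sum(irrelevant.values()) + len(vocabulary)
    # Log-odds with add-one smoothing: positive weights point towards relevance.
    return {
        w: math.log((relevant[w] + 1) / n_rel) - math.log((irrelevant[w] + 1) / n_irr)
        for w in vocabulary
    }


def score(weights, text):
    """Sum the learned weights of a document's words; unseen words count as 0."""
    return sum(weights.get(w, 0.0) for w in text.lower().split())


# Hypothetical training sample coded by a single reviewer.
sample = [
    ("dividend declared by directors", True),
    ("payment to company owned by director", True),
    ("office party on friday", False),
    ("lunch menu for friday", False),
]
weights = train(sample)

# Categorise documents outside the sample: a score above 0 is treated as relevant.
documents = ["dividend payment approved", "friday lunch"]
labels = [score(weights, d) > 0 for d in documents]
print(dict(zip(documents, labels)))
```

The example also shows why consistent coding of the sample matters: if similar documents are coded differently, the learned weights pull in opposite directions and the categorisation of the wider set becomes less reliable.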


The results of this categorisation exercise are then validated through a number of quality assurance exercises. The higher the level of confidence, and the lower the margin of error, the greater the sample must be, the longer it will take and the more it will cost.
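The trade-off between confidence level, margin of error and sample size can be made concrete with a standard statistical formula. The sketch below is an illustration only: it assumes simple random sampling to estimate a proportion, uses the usual normal approximation with a finite population correction, and takes the conservative proportion of 0.5. The function name `sample_size` and its parameters are hypothetical, and the article does not state which formula the parties in Pyrrho used.

```python
import math
from statistics import NormalDist


def sample_size(confidence, margin_of_error, population=None, proportion=0.5):
    """Sample size needed to estimate a proportion (normal approximation).

    confidence      -- eg 0.95 for a 95% confidence level
    margin_of_error -- eg 0.05 for plus or minus 5 percentage points
    population      -- total number of documents, if a finite correction is wanted
    proportion      -- assumed share of relevant documents (0.5 is the worst case)
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    if population is not None:
        n = n / (1 + (n - 1) / population)  # finite population correction
    return math.ceil(n)


population = 3_100_000  # de-duplicated file count reported in the article

for confidence, margin in [(0.95, 0.05), (0.95, 0.02), (0.99, 0.02)]:
    n = sample_size(confidence, margin, population)
    print(f"confidence {confidence:.0%}, margin {margin:.0%}: review {n} documents")
```

Under these assumptions the required sample depends far more on the chosen confidence level and margin of error than on the size of the document set, which is one reason sampling remains practical even for millions of files.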


The samples selected are reviewed by a human for relevance. The software creates a report of its decisions that the human reviewer overturned. Where an overturn is adjudged correct, it is fed back into the system. Where it is not, the document is removed from the overturns.


This sampling is repeated (usually not less than 3 times) as required to bring the overturns to a level within agreed tolerances.


The number of overturns should fall with each round. This would result in a final report within the agreed tolerance, and the list of documents can then be produced.
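The repeated sampling rounds can be sketched as a simple stopping rule. In the illustration below, the overturn rate is the share of sampled documents on which the human reviewer reversed the software, and review stops once that rate is within tolerance after a minimum number of rounds. The function names, the per-round rates and the 5% tolerance are hypothetical; only the minimum of 3 rounds is taken from the process described above.

```python
def overturn_rate(software_calls, reviewer_calls):
    """Fraction of sampled documents where the reviewer reversed the software."""
    if len(software_calls) != len(reviewer_calls) or not software_calls:
        raise ValueError("both lists must be non-empty and the same length")
    overturned = sum(s != r for s, r in zip(software_calls, reviewer_calls))
    return overturned / len(software_calls)


def rounds_needed(rates, tolerance, min_rounds=3):
    """Return the first round at which the rate is within tolerance, or None.

    min_rounds reflects the 'usually not less than 3 times' guidance.
    """
    for round_number, rate in enumerate(rates, start=1):
        if round_number >= min_rounds and rate <= tolerance:
            return round_number
    return None


# One hypothetical sample of four documents: the reviewer reverses one decision.
print(overturn_rate([True, True, False, False], [True, False, False, False]))

# Hypothetical overturn rates observed over four successive rounds.
rates = [0.12, 0.07, 0.04, 0.02]
falling = all(later < earlier for earlier, later in zip(rates, rates[1:]))
print("rates falling each round:", falling)
print("stop after round:", rounds_needed(rates, tolerance=0.05))
```

If the rates do not fall from round to round, that is a signal to revisit the training sample or the coding criteria before relying on the results.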


The Decision


In allowing the use of predictive coding software, the Court found that the factors in favour of approving the use of predictive coding technology were:


a) Predictive coding software can be useful in appropriate cases.


b) There is no evidence to show that the use of predictive coding software leads to less accurate disclosure, and indeed there is some evidence to the contrary.


c) There will be greater consistency in applying the approach of a single senior lawyer towards the initial sample to the whole document set.


d) The number of electronic documents which must be considered for relevance and possible disclosure in the case was huge.


e) The cost of manually searching the documents would be enormous, amounting to at least several million pounds, which would be unreasonable where a suitable automated alternative exists at lower cost.


f) The cost of using predictive coding software in the case was far lower than that of the full manual alternative.


g) The “value” of the claims made in the litigation was in the tens of millions of pounds. In the Court’s judgment the estimated costs of using the software are proportionate.


h) The trial in the case would not be until June 2017, so there would be time to consider other disclosure methods if for any reason the predictive software route turned out to be unsatisfactory.


i) The parties have agreed on the use of the software, and also how to use it, subject only to the approval of the Court.

Notably, the Court found that there “were no factors of any weight pointing in the opposite direction”, and observed that the use of predictive coding software in the case would promote the overriding objective of the Civil Procedure Rules, which is “to deal with cases justly and at proportionate cost.” The Court further observed that, whatever the cost of manual review, the overall costs of a predictive coding review should be considerably lower, provided that the exercise is large enough.




The decision has yet to be considered by the Singapore courts. Nevertheless, the case highlights novel points regarding the use of predictive coding software which litigants, particularly those involved in cases dealing with voluminous amounts of documents, will want to take note of.


The judgment referred to Practice Direction B to Part 31 of the Civil Procedure Rules (“CPR”) and to the Electronic Disclosure Protocol developed by the Technology and Construction Solicitors’ Association.


Under the English procedural regime, the quality of the search, specifically its reasonableness, was what mattered on a fundamental level. Steps taken in listing and production could not cure defective search processes.


The English procedural regime did not deal in detail with how the search was to be conducted, although Practice Direction B referred to “automated search techniques”, “automated searches”, and “automated methods of searching”. 


In Singapore, the definition of reasonable searching for electronic discovery is governed by paragraph 47 of the Supreme Court Practice Directions (“SCPD”) and paragraph 94 of the Singapore International Commercial Court Practice Directions (“SICCPD”) (collectively, “Singapore PDs”).


Whilst neither of these paragraphs prohibits the use of predictive coding software, the Singapore PDs, unlike Practice Direction B to the UK CPR, contain no express reference to the use of automated searches or the equivalent.


In fact, the Singapore PDs appear to contemplate the use of specified search terms or phrases in their definition of reasonable searches, and therefore the interpolation of manual review of the results arising from specified search terms. This is fortified by the clarification in the Singapore PDs to the effect that the party giving discovery is not obliged to review the search results for relevance.


By contrast, in Pyrrho, the human reviewer was introduced specifically for the purpose of teaching the software the difference between relevant and irrelevant documents.


Further, in Global Yellow Pages, Lee Seiu Kin J highlighted predictive coding as an alternative to search technology.


In our view, predictive coding and search technology need not be considered alternatives. If anything, predictive coding is an iteration, perhaps more advanced, of search technology.


The rationale for introducing predictive coding is already set out in paragraph 48 of the SCPD and paragraph 95 of the SICCPD. An important consideration in electronic discovery is the proportionality and economy of the proposed searches. Predictive coding, by itself or in conjunction with keyword search technology, has great potential to overcome both time and cost barriers in enabling lawyers to zero in on potentially relevant sets of documents.


It remains to be seen whether the Singapore courts are open to predictive coding. The first step would be to include an express reference to automated searches in the Singapore PDs. 


Drew & Napier

For further information, please contact:


Mahesh Rai, Drew & Napier