Got a match, Guv?
Will the changes recommended by the Law Commission on Expert Evidence improve the understanding of mark and impression evidence?
By Professor Allan Jamieson and Dr Scott Bader, The Forensic Institute
The use and abuse of expert testimony in the legal system, and the assessment of that testimony, have been the source of much publicity, debate, opinion, and proposed remedies. The most recent proposals for the UK emanate from the Law Commission[i] (LC) and include specific recommendations regarding the admissibility of expert opinion. These proposals come close on the heels of a widening concern that both novel and apparently established forensic practices may not be as reliable as their disciples suggest, and indeed may not qualify, despite previous claims, as science.
Much of forensic science is about the matching of items to ascertain if they could have had a common source (e.g. DNA, fingerprints, footwear marks, paint, glass, fibres). Once that match is established, the evidential value of such a match must be assessed. These two processes, matching and assessing significance, are the foundation for the reliability of these evidence types.
A recent wide-ranging and authoritative report on forensic science stated,
“Often in criminal prosecutions and civil litigation, forensic evidence is offered to support conclusions about “individualization” (sometimes referred to as “matching” a specimen to a particular individual or other source) or about classification of the source of the specimen into one of several categories. With the exception of nuclear DNA analysis, however, no forensic method has been rigorously shown to have the capacity to consistently, and with a high degree of certainty, demonstrate a connection between evidence and a specific individual or source. In terms of scientific basis, the analytically based disciplines generally hold a notable edge over disciplines based on expert interpretation.”[ii]
In particular, the areas generally termed ‘marks and impressions’ (e.g. fingerprints, bitemarks, footwear marks) are being subject to scrutiny as never before. A recent example involving footwear marks is the Appeal of R v T[iii].
This Appeal case involved,
“The extent to which evaluative expert evidence on footwear marks is reliable and the way in which it was put before the jury. …
The appeal raised an issue of some importance in relation to the use of likelihood ratios in the provision of an evaluative opinion where the statistical data available were uncertain and incomplete.”
We examined all of the evidence in this case, including the case files created in the assessment and analysis of footwear marks. The Appellant’s case was presented by Mr James Wood QC, who had extensive discussions with Professor Jamieson. The general conclusions expressed by Professor Jamieson were,
“There is no clear basis for the strength of evidence derived by Mr R [the Crown analyst], its reliability, nor for the expertise on which it rests.”
“I do not doubt that it is possible that such comparisons can provide useful evidence. I am not disputing Mr R’s opinion, but the scientific basis of it. It is my opinion that the state of development of this expertise is insufficient to ascribe any more than a very rough approximation to the probative value of the evidence, and such opinions cannot be considered scientific.”
This was not the first case in which we considered the general reliability of the science, as opposed to the specific findings of the expert. This contrasts with the traditional ‘fight fire with fire’ approach, in which solicitors generally seek a similarly qualified expert from the same field as the Crown’s witness. As stated by Professor Jamieson,
“It is unnecessary, and probably desirable, that I am not and do not claim to be a footwear expert in assessing the scientific value of such evidence. I would feel equally comfortable assessing claims for the validity of astrology or psychic phenomena without necessarily being a practitioner.”
The Crown expert had provided an opinion of evidential strength based on the so-called verbal scale after using what is termed the Bayesian approach, including a likelihood ratio. The principles of matching and significance highlighted in T are also used in other marks and impression evidence types. Despite a perception among some that T will have a limited effect on other cases, the LC recommendations, by increasing awareness of the parameters of scientific evidence and of the Court’s views, make it likely that the reliability of other practices will be subject to similar challenges.
Use of Likelihood Ratios
A likelihood ratio (LR) is a method of comparing the probabilities of two things by simply dividing one by the other. If one horse is 10-1 and another is 50-1 then, treating those odds as rough measures of each horse’s chance, the LR is about 5 (=50/10) that the former horse will win rather than the latter, and about 1/5 (=10/50=0.2) that the latter will win rather than the former. Note that the LR compares only those two horses in this instance. An LR greater than 1 favours the ‘top line’ (numerator) outcome, whereas an LR less than 1 favours the ‘bottom line’ (denominator) outcome. It is simply a means of measuring how much more likely one thing is compared to another.
The use of the LR in the evaluation of forensic evidence has been promoted by some statisticians and groups of scientists, especially in the UK. They contend that this is the fair and balanced way to look at the evidence, by comparing the probability of the evidence given the prosecution story or outcome (Pp) with the probability of the evidence given the defence story or outcome (Pd). The LR is then Pp/Pd. However, as identified in R v T, this approach is neither universally accepted nor yet fully explored by the courts. Indeed, the Appeal Court has been quite specific in its rejection on a number of occasions.
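The horse-racing arithmetic above can be checked with a short sketch. It assumes that odds of n-1 against imply a probability of 1/(n+1) and that the bookmaker's margin is ignored. Both are simplifying assumptions, and the function names are illustrative only. On that basis the ratio comes out at 51/11, a little under the rounded figure of 5 obtained by dividing the odds directly.

```python
from fractions import Fraction


def implied_probability(odds_against: int) -> Fraction:
    """Probability implied by odds of `odds_against`-1 against.

    Assumption: bookmaker margin is ignored, so 10-1 means 1 chance in 11.
    """
    return Fraction(1, odds_against + 1)


def likelihood_ratio(p_numerator: Fraction, p_denominator: Fraction) -> Fraction:
    """Divide one probability by another; the result compares only these two."""
    if p_denominator == 0:
        raise ValueError("denominator probability must be non-zero")
    return p_numerator / p_denominator


p_first = implied_probability(10)   # horse quoted at 10-1
p_second = implied_probability(50)  # horse quoted at 50-1

lr = likelihood_ratio(p_first, p_second)          # favours the first horse
lr_inverse = likelihood_ratio(p_second, p_first)  # same comparison, reversed

print(f"LR (first vs second): {lr} = {float(lr):.2f}")
print(f"LR (second vs first): {lr_inverse} = {float(lr_inverse):.2f}")
```

An LR above 1 favours the numerator and its reciprocal favours the denominator to the same degree. The calculation says nothing about any other horse in the race.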
It is unnecessary here to consider the scientific argument regarding the pros and cons of the LR. The argument advanced by the defence was mainly that there were no data to support any such calculation in this case.
It is important to understand that in our opinion the Court was not prohibiting the use of an LR in principle. The issues in the case were twofold: first, that it had never been made clear to the trial court that such an approach had been used by the expert and, second, that there were insufficient data to support any numerical calculation. The Court made clear that an expert was able to form an evaluative opinion even without statistics, but that it should not be presented as a mathematical calculation such as an LR,
“However there are cases where it would not be right to confine an examiner (where there are solely class characteristics) to opining on whether the mark could or could not have been made. There may be factors that enable him to go further than "could have made" and express, on the basis of such factors, a more definite evaluative opinion. …
In our judgment, an expert footwear mark examiner can therefore in appropriate cases use his experience to express a more definite evaluative opinion where the conclusion is that the mark "could have been made" by the footwear. However no likelihood ratios or other mathematical formula should be used in reaching that judgement for the reasons we have given.”
This is at odds with the views of some statisticians (and perhaps some scientists) who believe that subjective ‘degrees of belief’ can be incorporated in the LR. It would appear that it is that specific practice which the Appeal Court rejects. There was no blanket prohibition on the use of an LR if reliable data are available. This is explicit in the judgement,
“If there are reliable statistics and data, it would then be necessary to consider how likelihood ratios should be used and how their use should be explained to a jury.”
Evaluative opinion
We can therefore identify evaluative opinion as a global category in which the expert expresses their opinion of the strength of the evidence. Evaluative opinion may then be subdivided into,
- Comparative, where the expert compares propositions using either,
  - data, in which case the use of an LR is appropriate (e.g. DNA)
  - experience, in which case some other term must be used, such as ‘comparative evaluation’ (e.g. footwear mark), with some assessment of the significance of the findings.
- Absolute, where the expert assesses the weight of only one proposition using, again, either,
  - data (e.g. frequency of glass type)
  - experience (e.g. some clinical diagnoses).
The Law Commission makes extensive recommendations (at 9.11 and 9.12) regarding the assessment of the reliability of expert evidence. The Commission also lists some factors that would cause evidence not to be admitted, including untested hypotheses and unjustified assumptions. We have highlighted the difficulties in using experience as a reliable foundation for scientific opinion[iv]. It remains unclear how experience-based opinion, devoid of statistical or other proper scientific bases, will meet criteria such as being ‘soundly based’, or how it will justify any proffered evidential strength in the absence of numerical data.
It therefore must remain a concern whether the judiciary will depart significantly from judgements about the reliability of expert opinion such as Atkins where the Court stated,
“An expert who spends years studying this kind of comparison can properly form a judgment as to the significance of what he has found in any particular case. It is a judgment based on his experience. A jury is entitled to be informed of his assessment. The alternative, of simply leaving the jury to make up its own mind about the similarities and dissimilarities, with no assistance at all about their significance, would be to give the jury raw material with no means of evaluating it.”
In our opinion that Court has conflated the two distinct requirements identified above (matching, significance). It apparently assumes that the mere experience of looking at things and identifying similarities and differences enables a reliable opinion as to the significance of those differences and similarities. The two elements of forensic assessment, matching and significance, can only be measured by scientific means.
To decide whether two things match (other than those that patently do so, in which case no expertise is necessary) requires controlled experiments to establish the range of variation within a population and within an individual. No matter how many footwear marks a person observes, he can never know whether they were made by a particular shoe unless experiments are conducted to establish the range of marks that shoe will produce and what other shoes may produce similar marks.
The human capacity to reinforce subjective perceptions (e.g. why you notice cars of the same type as yours more often than other makes) is one reason why scientific study normally demands counting and the assessment of variation. Only proper counting of features in relevant populations will determine their prevalence and enable a reliable estimate of the probability of finding a match by chance.
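The value of counting can be illustrated with a minimal sketch. The survey counts below are hypothetical and are not drawn from any case or database. The Wilson score interval is used here as one common way of expressing the uncertainty in an estimated proportion. It is an illustrative choice, not a method taken from the judgement or the reports cited.

```python
import math


def wilson_interval(feature_count: int, sample_size: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a feature's prevalence.

    feature_count: items in the sample showing the feature (hypothetical data)
    sample_size:   total items examined
    z:             normal quantile; 1.96 corresponds to roughly 95% coverage
    """
    if sample_size <= 0 or not 0 <= feature_count <= sample_size:
        raise ValueError("need 0 <= feature_count <= sample_size and sample_size > 0")
    p_hat = feature_count / sample_size
    denom = 1 + z**2 / sample_size
    centre = (p_hat + z**2 / (2 * sample_size)) / denom
    half_width = (
        z
        * math.sqrt(p_hat * (1 - p_hat) / sample_size + z**2 / (4 * sample_size**2))
        / denom
    )
    return centre - half_width, centre + half_width


# Same observed prevalence (4%), very different amounts of counting.
small_sample = wilson_interval(2, 50)      # hypothetical: 2 of 50 shoes
large_sample = wilson_interval(200, 5000)  # hypothetical: 200 of 5000 shoes

for label, (low, high) in (("n=50", small_sample), ("n=5000", large_sample)):
    print(f"{label}: prevalence between {low:.3f} and {high:.3f}")
```

Both samples show the feature in 4% of items, but the smaller survey leaves a much wider range of plausible prevalence, and therefore a much less certain estimate of the probability of a match by chance.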
An expert’s experience in establishing the fact of a match cannot be used reliably to determine the significance of the match, in the absence of sufficient statistical data. The response to Atkins must surely therefore be that, in such a situation, the jury have neither expert guidance on the significance of the match nor the means to establish that significance for themselves, and therefore the evidence would be inadmissible.
Contradictions from the courts regarding science are not unknown. For example, in Henderson[v] the Court states,
“The jury should examine the basis of the opinion. Can the witness point to a recognised, peer-reviewed, source for the opinion?”
But in Weller[vi],
“It therefore seems to us that what this appeal demonstrates is that if one tries to question science purely by reference to published papers and without the practical day-to-day experience upon which others have reached a judgment, that attack is likely to fail, as it did in this case.”
In that case, it was the Crown’s reliance on ‘case experience’ for an opinion that was an issue in the Appeal, despite the existence of controlled experimental data (from the same Crown laboratory) that supported an alternative opinion for the case. The judgement appears to ignore the fact that, as Wernher von Braun is credited with saying, “one experiment is worth a thousand expert opinions”. In other words, no matter what ‘experience’ suggests, a controlled scientific experiment provides a reliable test of that experience. The NAS report comments,
“However, some courts appear to be loath to insist on such research as a condition of admitting forensic science evidence in criminal cases, perhaps because to do so would likely “demand more by way of validation than the disciplines can presently offer.””
The NAS report was not complimentary on the topic of the legal system’s ability to differentiate reliable from unreliable science,
“In a number of forensic science disciplines, forensic science professionals have yet to establish either the validity of their approach or the accuracy of their conclusions, and the courts have been utterly ineffective in addressing this problem.”
Only time will tell if the Law Commission’s recommendations, some of which were already practice, will improve the law’s understanding of the nature of scientific evidence.
Professor Allan Jamieson
Dr Scott Bader
The Forensic Institute, Baltic Chambers, 50 Wellington Street, Glasgow, G2 6HJ
[i] The Law Commission (LAW COM No 325) (2011), Expert Evidence in Criminal Proceedings in England and Wales
[ii] National Academy of Sciences of the USA (2009), Strengthening Forensic Science in the United States: A Path Forward.
[iii] [2010] EWCA Crim 2439
[iv] “Experience is the name that everyone gives to their mistakes”, Barrister, 45, 2010
[v] [2010] EWCA Crim 1269
[vi] [2010] EWCA Crim 1085
