Rarely is a new yardstick of legal meaning created. But over the past decade, corpus linguistics has begun to serve as a new tool for measuring ordinary meaning in statutory interpretation and original public meaning in constitutional interpretation. The legal application of corpus linguistics posits that examining every use of a term across a wide variety of documents can yield a more complete, impartial understanding of a word than can dictionaries, intuition, or an unsystematic survey of sources. Corpora could supplement, or even supplant, dictionaries and native-speaker intuition in legal analyses. For originalism in particular, legal corpus linguistics promises a more scientific methodology for an approach that has, until now, lacked one.
However, corpus linguistics, as applied to legal problems, faces a seemingly fatal methodological criticism: the frequency fallacy. The criticism holds that an unusual meaning can appear many times in a corpus while a perfectly ordinary meaning can be entirely absent from it. That is, frequency is not a good measure of meaning. Since legal corpus linguistics relies on frequency, the criticism concludes, the corpus cannot inform legal meaning.
This article parries this otherwise fatal critique. It argues that while the frequency fallacy identifies a real problem, that problem is not inherent to the corpus; rather, it is an artifact of misinterpreting the corpus by treating it like a dictionary. The defense proceeds in stages. First, the article distinguishes between two methods of discerning ordinary meaning: extension and abstraction. As illustrated by Yates v. United States and United States v. Marshall, extension applies the statutory term to varying facts, while abstraction holds the facts constant and abstracts out their key qualities to find an appropriate term. Critically, this article argues that abstraction offers a way to avoid the frequency fallacy. Second, to use abstraction properly, one must analyze not only the presence of the legal term in question but also its absence. That is, one must ask whether other terms are used to describe a similar factual scenario, so as to distinguish artifacts of language from facts about the world.
This article concludes by arguing that this method has a beneficial emergent quality. Not only does it make legal corpus analysis methodologically sound, but it also paves the way for the first tool to approximate how an ordinary person would read the law, thus potentially furthering the rule of law.