Consider “The” Dictionary

Think back to the last time you used a dictionary – perhaps it was to settle an argument that erupted during a “friendly” game of Scrabble, perhaps it was to clarify the meaning of a long, jargony word in an article or perhaps it was when you were learning a new language and wanted to know how to say ‘beer’ or ‘please’ or ‘fuck’ – regardless of why you used it, your encounter with the dictionary was probably brief, inconsequential and even slightly mundane. Chances are that a quick Google search or the flick of a few pages allowed you to find what you were looking for, consult it, digest it and then go on your merry way without a second thought. At school, we are taught to use dictionaries and, by adulthood, the weighty tomes have become part of the furniture of our everyday lives, their authority cemented and their function rarely examined or questioned.

However, if we pause for a closer inspection, the pages (or webpages) of our dictionaries can be very revealing. In representing the common understanding of each and every word, they are infused with the politics of accepted and shunned norms, identities, values and behaviours. Hidden in their seemingly dry definitions is evidence of ideologies, worldviews and timely cultural references. 

A dictionary’s primary function – to classify, order and define the common uses of words – sounds simple enough. Yet words and language are not straightforward. They are unstable, generative and perpetually shifting. Each publication of a new dictionary edition signposts not only the development of language but also the shifts occurring in how we think, feel, understand and respond to the world we inhabit. In some ways, we can think of dictionaries as museums of language, but while museums collect cultural artefacts of the past, dictionaries collect words and our use of them across time. Consider these two definitions of marriage:

  1. The act of uniting a man and woman for life (American Dictionary of the English Language 1928) 
  2. The state of being united as spouses in a consensual and contractual relationship recognized by law (Merriam-Webster Online Dictionary, July 2020) 

The two definitions differ greatly, with the first reflecting the heteronormativity of the 1920s and the expectation that marriage was forever. The latter definition highlights equality and consent and is inclusive of same-sex marriages, reflecting 21st-century changes in both laws and attitudes towards what kind of love is ‘common’ or legitimate. 

Despite their inevitable obsolescence, each contemporary edition of the dictionary is important because words are very powerful. Words can uplift or oppress. They can reveal or conceal. They can influence, manipulate, challenge or change. In the essay Politics and the English Language (1946), George Orwell elaborates on the danger of vagueness when using words. He argues that “the present political chaos is connected with the decay of language, and that one can probably bring about some improvement by starting at the verbal end.” He calls for a deliberateness in using and defining words to effect change. An example of the power of language and definitions to effect change is the reclamation of the word ‘queer’ by the LGBTQIA+ community, where the meaning was shifted from something with derogatory connotations to a word that is empowering and uniting.

Given the importance of words, it is odd that we unthinkingly accept the definitions presented to us in the dictionary as truth. Who is designing our language? What might their biases be? What are the mechanics of words entering the dictionary? And what are the ethics of definition-making? The people who compile, write and edit dictionaries are called lexicographers. They choose which words are included and excluded from dictionaries and therefore which words are legitimised and which are disregarded. Lexicographer Erin McKean estimates that there are fewer than 200 lexicographers working in the USA, indicating that our dictionaries are made by a very small, elite group.

It is tempting to paint a picture of lexicographers as power-hungry villains who want to control the world of words: the Gatekeepers of Language who are ruthless grammarians and intellectuals looking to prove their cleverness. You might imagine them as white-haired conservatives, sitting in their walnut-panelled private libraries scoffing at the abhorrent way the common folk are butchering words. And looking back on the history of dictionary-making, this imagined portrait isn’t so far from the truth.

For centuries, humans have been compelled to order, categorise and define words – the oldest known dictionary dates back to c. 2300 BCE. “In the past, most dictionaries were written by one person, usually an educated middle-class person,” says Ramesh Krishnamurthy, a lexicographer and corpus linguist. He explains that these dictionary-makers relied on their memories, intuition and limited personal experiences to create definitions. “That meant that the only words and phrases that would be explained were words that were used by well-educated middle-class people. And of course, they were explained in terms of their prejudices and beliefs.” And because historically the educated and wealthy have been predominantly male, most early lexicographers were men.

One of the most famous lexicographers is Dr Samuel Johnson, who published A Dictionary of the English Language in 1755. In the preface, he writes about his motivation for spending seven years single-handedly compiling his 40,000-word dictionary: “I found our speech copious without order, and energetic without rules: wherever I turned my view, there was perplexity to be disentangled, and confusion to be regulated”. In other words, he wanted to order language and control its use.

Despite this desire for control, Johnson’s dictionary is widely considered by lexicographers as “the first dictionary that was based on serious evidence of English usage as opposed to just making it up,” as he used references and citations from texts dating back to the 1500s to substantiate his definitions. Although Johnson had a more scientific approach to defining words than his predecessors, some of his definitions are still highly subjective. For example, he defines ‘shabby’ as ‘a word that has crept into conversation and low writing; but ought not to be admitted into the language’; and ‘patron’ as ‘one who countenances, supports, or protects. Commonly a wretch who supports with insolence, and is paid with flattery’, which is thought to be a dig at Lord Chesterfield, a prominent statesman and man of letters, who promised to be the patron of Johnson’s dictionary but did not follow through with support.

However, as time has marched slowly on and words have wriggled out of the definitions we try to bind them to, the role and attitudes of lexicographers have shifted too. In a 2007 TED talk, The Joy of Lexicography, McKean uses the analogy of a traffic policeman and a fisherman to describe this shift. She says that the public often sees her as a word-cop directing “real” words into the dictionary and pointing “bad” words away, but really, she is a fisherman [fisherperson] who throws her “net into the deep, blue ocean of English and see[s] what marvellous creatures I can drag up from the bottom.” So rather than designing language, McKean is observing, documenting and preserving it.

Similarly, when I asked Michael Rundell, who has been a lexicographer since the 1980s and is the Editor-in-Chief at macmillandictionary.com, if lexicographers are the gatekeepers of language, he chuckled. “There is a lot of misunderstanding amongst the public. Linguists and lexicographers are the least pedantic and most tolerant people about language use, far more so than the general public, because we observe the way that language is changing rather than complaining about it.” He explains that during his career, there have been “two massive shifts which have revolutionised the dictionary-making business” and have completely changed the mechanics of how words are included and excluded from dictionaries. This, in turn, has radically democratised dictionaries and the role of the lexicographer. These two changes boil down to the advent of the internet.

“The first revolution was the availability of corpus data which affects the production end of dictionary-making,” i.e. the writing and editing, says Rundell. A corpus (plural: corpora) is a huge data set made up of billions of words (the largest at the moment is about 15 billion). These are words pulled from millions of varied sources such as articles, blogs, social media posts, books, film transcripts, gaming forums, packaging information etc. The use of corpora in dictionary-making began in the 1960s but it was cumbersome and the data sets were quite small. As technology improved, so too did the enthusiasm for corpus-built dictionaries. This led to the pivotal COBUILD project in the 1980s, which radically advanced corpus linguistics and dictionary-making. It was the first dictionary based entirely on a custom-built corpus of over 7 million words (small by today’s standards but massive at the time). Krishnamurthy, who worked on COBUILD, explains that using corpus data “meant we could get a view of the language not just as it was used by educated middle-class people but by the whole population. So basically it enabled us to have a more democratic view of how language is used in our society.” Another advantage of using corpora is the real-time data they provide about word usage, such as how frequently words occur and their shifting meanings over time. This data is used to determine if a word is common enough for inclusion in the dictionary.
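For the technically curious, the idea of measuring how common a word is can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny `toy_corpus`, the `tokenize` helper and the `frequency_per_million` function are all hypothetical names invented here, and a real corpus tool works on billions of words with far more careful text processing.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_per_million(corpus_texts):
    """Count every word across the texts, normalised to hits per million words."""
    counts = Counter()
    for text in corpus_texts:
        counts.update(tokenize(text))
    total = sum(counts.values())
    return {word: n * 1_000_000 / total for word, n in counts.items()}


# Hypothetical stand-in for a corpus; a real one holds millions of documents.
toy_corpus = [
    "It is common sense to look for common ground.",
    "Loss of bowel control is surprisingly common.",
]

freqs = frequency_per_million(toy_corpus)
for word in sorted(freqs, key=freqs.get, reverse=True)[:3]:
    print(word, round(freqs[word]))
```

Normalising to occurrences per million words makes counts comparable between corpora of different sizes. Where to draw the line for "common enough" remains an editorial decision.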

To understand the usefulness of corpora, I tried using Sketch Engine, an industry-standard corpus database and text analysis software for dictionary-making. It was created in the early 2000s specifically for lexicographers and linguists to truly map and understand language. If you look up ‘common’ on Sketch Engine, 3,376,093 concordances (which is a fancy way of describing lines of text showing every instance of the given word in the context in which it occurs) appear. Although it would be impossible to read every entry, you quickly begin to see patterns of how the word ‘common’ is used – in what contexts and with what other words (e.g. ‘common sense’, ‘common ground’). The software also supports many other specific, nuanced and deeper searches and analyses of words. Basically, it is kind of a super-advanced, analytical Google search designed specifically for language nerds.
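To make the idea of a concordance concrete, here is a minimal Python sketch. It is not how Sketch Engine works internally; the function names `concordance` and `right_collocates` and the three sample sentences are assumptions made up for illustration.

```python
import re
from collections import Counter


def concordance(texts, keyword, width=30):
    """Return keyword-in-context lines: each hit with `width` characters either side."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for text in texts:
        for match in pattern.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            # Right-align the left context so the keywords line up in a column.
            lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines


def right_collocates(texts, keyword):
    """Count the word that immediately follows each occurrence of the keyword."""
    pattern = re.compile(
        r"\b" + re.escape(keyword) + r"\s+([a-z']+)", re.IGNORECASE
    )
    counts = Counter()
    for text in texts:
        for match in pattern.finditer(text):
            counts[match.group(1).lower()] += 1
    return counts


# Hypothetical sample sentences standing in for a corpus.
sample_texts = [
    "It is common sense to check the dictionary.",
    "The two sides found common ground at last.",
    "Common sense is not always common practice.",
]

for line in concordance(sample_texts, "common"):
    print(line)
print(right_collocates(sample_texts, "common").most_common(2))
```

Reading down the aligned column of hits is what lets patterns such as ‘common sense’ and ‘common ground’ jump out, even when there are far too many lines to read one by one.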

One of Sketch Engine’s other features is a tool called GDEX (derived from Good Dictionary EXamples) which provides shortlists of potential example sentences. The GDEX website explains that the sentences are chosen by the algorithms based on “length, advanced vocabulary, sufficient context, pronouns pointing outside of the sentence and other criteria.” When I searched ‘common’, my favourite result was “loss of bowel control is surprisingly common.” By taking example sentences from a database rather than from your own head, it is less likely that your subjectivities will skew the sentences. However, using corpora does not entirely eliminate the problem of biases in the dictionary. The texts within corpora are also products of their times with underlying prejudices (unconsciously or consciously) built into them. Rundell cites writing an example sentence for the verb ‘to nag’ to demonstrate the dilemma that using corpora can present:

“If you look in older corpora, the word nag is always used about women – they are the people who are doing it. So if you then say, our dictionary examples should reflect common usage but you look at that and see that 90% of all the examples you get shown by the corpora reference women nagging, what do you do? Do you override that and have an example of men nagging which then could be seen as a sort of language engineering? Or do you get around it by saying something like ‘the children are nagging’ and put the blame on them instead?”

To deal with these conundrums, lexicographers follow editorial style guides which help them choose good example sentences and write definitions. “There is also a blacklist of things you have to avoid,” Rundell says, “like sexism or anything potentially offensive etc.” Inevitably, some sentences will enter that are biased, outdated or offensive to someone. While these thorny sentences were once set in print, today lexicographers can be much more agile at updating problematic entries.

This brings us to the second revolution in the industry: online dictionaries. As dictionaries have gone digital, the size and cost restrictions associated with books that once forced lexicographers to choose one word over another have evaporated, leaving an endlessly updatable platform. Digitisation has redefined what makes a good entry, what should be included and how quickly new words are added.

Rundell says, “If you are doing an online dictionary, one of the imperatives is that you stay up to date because instead of people saying ‘that word is not in the dictionary so it is not a good word’ which used to be the common reaction, now they are saying ‘this word is not in this dictionary and therefore it’s not a good dictionary, I’ll look it up in another one.’” There is a growing awareness that when we use dictionaries we are not using The Dictionary but a dictionary. There are many dictionaries with many different approaches, and between these dictionaries there are inconsistencies, gaps and varied underlying ideologies. We are in the process of dismantling the absolute authority of dictionaries and, in turn, the absolute power of lexicographers. They are increasingly considered language enthusiasts rather than language dictators.

Linguist Gretchen McCulloch welcomes the change in our attitudes towards words and dictionaries. She points out: “Language is the ultimate participatory democracy. To put it in technological terms, language is humanity’s most spectacular open-source project […] It spreads and disseminates through conversations and interactions.” Each person therefore has the power to shape and shift our words and their definitions simply by using them, making us all quasi-experts in language.

So, next time you use a dictionary, whether for Scrabble, definitions, translations or whatever else, don’t take its word for it. Pause for a moment and consider: what led to this definition? How recently was it updated? Whose worldview does it represent? Do I agree with it? Do I trust it? A dictionary is not an authority; it is a guidebook.

Lara Chapman