Making the Anglo-Norman Dictionary

Like most dictionaries of its type, the Anglo-Norman Dictionary (AND) is, and has to be, a long-term project. It started in 1947, and the initial aim was simply to produce a glossary (a list of key words with their meanings) designed to help people to read literary texts written in Anglo-Norman. The idea was that individual volunteer scholars should each ‘take on’ a letter of the alphabet. In the course of their normal reading and research, they would then compile a list of relevant Anglo-Norman words beginning with ‘their’ letter, along with their meanings in the contexts in which they encountered them. These lists would then be passed to an editorial committee who would draw up an alphabetical glossary. But what sounded like a simple idea proved to have all sorts of complications in practice, and for fifteen years or so after this idea was first proposed, little useful progress was made.

A fresh start

In 1962, William Rothwell, then Reader in French in the University of Leeds, and later to become Professor of French at the University of Manchester, was persuaded to become involved in giving new impetus and focus to the plan, working together with Louise Stone, the original editor. From that point onwards, the AND became in effect Bill Rothwell’s project and developed into a scheme to produce something much more than a simple word list. Between 1977 and 1992, seven separate printed fascicles, a total of 889 pages, appeared. (A ‘fascicle’ is a set of pages that are intended to make up a larger work which is being published in stages. They are generally issued in a soft temporary binding, with the pages numbered consecutively from one fascicle to the next. The idea is that libraries (the main purchasers of works of this kind) can buy the fascicles as they emerge and make them available to readers as they stand, then in the longer term, when all the fascicles have appeared, have them rebound permanently into more robust volumes, each of which will bring together several fascicles.)

Widening the scope

But even within this early period, a relatively short one by the standards of scholarly dictionary creation, there was a clear evolutionary development, leading to a noticeable difference between the earlier and later parts of the published work. From the middle of the alphabet onwards, Bill Rothwell had been granted access to two important collections which were to transform the nature, coverage, and scope of the AND as a whole: the collection of administrative and legal vocabulary assembled by Professor J. P. Collas of Queen Mary College, University of London, and the embryonic Dictionary of Law French compiled by Elsie Shanks for the Selden Society. Neither of these works had ever seen the light of day in its own right, but their contents, merged into the Anglo-Norman Dictionary, immeasurably enriched it and pointed towards further developments in its range, content and objectives.

In the workshop

Until well into the 1980s, creating a dictionary was essentially a very traditional enterprise: texts were read, both in printed and manuscript form, words were examined to determine their meaning in their contexts (not as easy as it may sound, especially when you are actually in the process of building the vital tool you require to do just that job), quotations illustrating the meanings and usage of those words were hand-written on individual slips of paper, additional material was merged in from other sources (such as the Collas and the Shanks collections just mentioned). The resultant thousands of separate bits of paper were painstakingly checked, sorted and arranged under headword groupings, then typed up on a manual typewriter in successive drafts which eventually went to the printer. Back from the printer came galley, then page proofs, which were often used not just to correct the inevitable errors caused by the typesetting process, but to add new quotations or modify existing definitions in the light of things that had emerged since the original typescript had been sent to press, as can be seen below.

 

The raw material is – to modern eyes – pretty old-fashioned, down to and including the now discoloured paper:

Some idea of what the process involved at typescript stage is apparent from the example below, part of work on the entry for lange1, which in modern French now means only ‘nappy’ or ‘diaper’ (of the towelling variety):

The amount of correction and change even at this stage is revealing. Scholarly dictionaries (even once they have been printed) are never finished. Even with a language that is long since “dead” and so has ceased to evolve in the mouths and at the hands of its original speakers, previously unknown documents are still being found, or previously known ones read in a new light (sometimes quite literally so: technological advances in ultraviolet readers, multispectral photography and X-ray imaging are allowing scholars to see things in medieval manuscripts that had apparently long since vanished from view).

Into print

If dictionary editing at this stage was still firmly in the artisan tradition (pots of paste, scissors and kitchen tables were as important as learned texts), the printing was similarly traditional. As late as the 1980s, the AND’s printers, Maney’s of Leeds, continued setting the work by the so-called ‘hot-metal’ method, i.e., with little blocks of metal in trays, each of which printed one character. To all intents and purposes, then, the process which the Anglo-Norman Dictionary underwent once it was with Maney’s was no different from the method which Westerners claim was invented in the late fifteenth century by Gutenberg (in fact, Koreans were printing books in their recently-invented writing system using moveable metal type from much earlier in the same century, but they didn’t tell anybody about it).

No doubt to the considerable exasperation of the printers, correction and extension of the AND entries continued on the printers’ ‘proofs’ (normally intended only to allow correction of typesetting errors and very minor authorial amendments). As the galley-proof below shows, every single quotation which was inserted at this very late stage, after the type had already been set in metal, meant that somehow or other text with exactly the same number of characters had to be removed to compensate, so that the result would fit on the same plate.

Towards computerisation

For printer and editors alike, such a manual system had its drawbacks. The disadvantages of multiple, retyped drafts are fairly obvious: a colossal expenditure of time and energy, and a significant risk of accidentally introducing new errors into previously correct passages while adding new material. And the editorial side was, for most of this period, essentially a one-man operation, largely reliant on one individual’s memory, powers of concentration and organising abilities. Then in 1986, Bill Rothwell was joined by Stewart Gregory and David Trotter as assistant editors. By this time, personal computers had arrived on the scene and given the project a considerable boost. It became possible to dispense with repeated and uneconomical typewritten drafts, since the editing of entries could now be done in a word-processed format that could be corrected and updated at will. For the time being, however, the core editorial processes of finding attestations in sources and grouping them to form the basis of entries still relied on the traditional methods. In the same year, two new arrivals at the University of Leeds met at a staff induction event. Andrew Rothwell had come from the University of Exeter, where the Pallas Project was pioneering the application of computers to Humanities teaching and research, while Michael Beddow had arrived from King’s College London, another institution where (under Roy Wisbey’s leadership) similar initiatives were being fostered. They began an informal but enduring technological collaboration that over the ensuing years was to lead, among other things, first to the fuller computerisation of AND editing procedures, then eventually to the shifting of the AND into a wholly-computerised operation at all its stages of production and delivery.

The first phase of the process that would eventually take the AND to the forefront of technology in the sphere of scholarly dictionaries was the use of OCR (optical character recognition) to convert printed sources into machine-readable form. Initially this was done using a Kurzweil Data Entry Machine at King’s College London, an extremely expensive device attached to its own ‘minicomputer’ (which in those days meant something the same size as a small car and a lot more costly to buy and run), but before long this task was being routinely performed on PC-driven equipment shared between the Leeds French and German Departments. Once the sources were converted to a form the computer could read, ‘gleaning’ of sources for suitable examples was no longer dependent on the once-and-for-all and time-consuming manual perusal of a text, requiring immense powers of concentration and memory to ensure that relevant instances were located. Instead, concordancing software (originally the DOS package TACT from the University of Toronto, later Rob Ward’s Concordance program for Windows, and most recently the project’s own Web-based concordancer, illustrated here, which is also accessible to Dictionary users on-line, to the extent that copyright restrictions on some source texts allow) made it possible to bring up on screen every single instance of every form, displayed in its context, and to select the most significant or characteristic instances for incorporation into entries.
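
What a concordancer does with a machine-readable text can be illustrated in a few lines of Python. This is only a minimal sketch of the general ‘keyword in context’ technique, not a description of how TACT, Concordance or the project’s own Web-based concordancer work internally; the function name kwic, the width parameter and the sample sentence are all invented for illustration, and the sample is not a real attestation.

```python
import re

def kwic(text, form, width=30):
    """Return every occurrence of `form` in `text`, each shown with
    up to `width` characters of context on either side."""
    # Collapse line breaks and repeated spaces so context reads as running text.
    flat = " ".join(text.split())
    # Match the form as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(form) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

# Invented sample text, used only to show the shape of the output.
sample = "Le roi vint a la cite. Puis le roi parla."
for line in kwic(sample, "roi", width=15):
    print(line)
```

Because every instance is listed with its surroundings, an editor can scan the column of keywords and pick out the most characteristic uses, instead of relying on having noticed them during a single reading of the text.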

Before long, the body of digitised texts converted from print editions could be supplemented by materials that had actually been created by computer, as various scholars who were by now preparing new editions using PCs generously made their document files available to the AND editors, often in advance of print publication. The benefits to the Dictionary of these new technological aids, both in terms of breadth of coverage and speed of production, were dramatic. Texts which had been (probably imperfectly) gone through once could now be checked and rechecked at will. Huge quantities of material could be broken down in alphabetical word order, and searched for new words, new uses, and new phrases.
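
Breaking a text down into alphabetical word order and looking for forms not yet recorded might be sketched as follows. This is an illustrative sketch under stated assumptions, not the editors’ actual procedure: the function names word_index and unrecorded_forms are hypothetical, the list of known forms stands in for whatever record of existing headwords an editor might consult, and sorting is by simple character order rather than by any dictionary-specific collation.

```python
import re
from collections import Counter

def word_index(text):
    """Return (form, count) pairs for every word form in `text`,
    in alphabetical order."""
    # Runs of letters only; digits and punctuation (including apostrophes)
    # act as separators in this simplified tokenisation.
    forms = re.findall(r"[^\W\d_]+", text.lower())
    return sorted(Counter(forms).items())

def unrecorded_forms(text, known_forms):
    """Return, alphabetically, the forms in `text` that are absent
    from `known_forms` (a hypothetical list of already recorded forms)."""
    known = {form.lower() for form in known_forms}
    return [form for form, _count in word_index(text) if form not in known]

# Invented sample text and invented list of known forms.
sample = "le roi et le chevalier"
print(word_index(sample))
print(unrecorded_forms(sample, ["le", "et"]))
```

A list of this kind draws attention to candidate new words; deciding what they mean in context remains the editor’s job.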

A new edition – and a new medium

Many factors combined to make it apparent, even before the last fascicles of the first edition appeared in the early 1990s, that a new edition of the Dictionary was called for. Work towards it was begun in 1989. Texts which had appeared since the first edition were gone through; electronic texts were searched; the Collas and Shanks proto-dictionaries were scoured for new material. It rapidly became apparent that the new edition was going to be much more than that – it would be an entirely new Dictionary. And so it turned out, in more ways than one.

In the first place, in the new edition, each letter has so far turned out to be at least three times as long as its counterpart in the old version. A-E in the second edition occupy 1100 pages as against 289 pages in the first edition. The entries for ajoindre in the two versions illustrate the difference: not only is the newer entry much longer, illustrating a greater range of meanings, but it draws on far more texts.

Though the entries for Letters A to E of AND2 were researched and compiled by computer, the original intention was that they should still be published in traditional print volume form. By the late 1990s, however, the World Wide Web was assuming an importance and scope no-one had even anticipated a few years earlier, and emerging technologies were making it possible to publish scholarly works on-line at relatively low cost. What was then the Arts and Humanities Research Board of the United Kingdom (now a Research Council, hence AHRC) launched a Resource Enhancement Scheme to help make already existing research resources more readily and widely accessible via new technologies. The editors of the Anglo-Norman Dictionary submitted a successful bid to fund the conversion of the as yet unpublished new AND2 A-E articles from Microsoft Word documents to XML format and to create a server platform that would allow the resulting digital entries to be consulted on the Web using standard browsers. The idea, fully realised by the end of the funding period, was that there would be a parallel print and on-line publication of AND2 A-E, with the latter being made available at no cost to end users. This scheme, as well as proving to be a technological success, also awakened the interest of a new and much wider international user constituency, and as a consequence the AHRC awarded a second tranche of Resource Enhancement Funding under which those letters in AND1 that had not at that stage been revised (i.e. F-Z) could be digitised as well and then published on-line alongside the revised A-E, making a fully scholarly dictionary of Anglo-Norman available without restriction to anyone who had Internet access, an undertaking that was completed early in 2006.

Same aims, new basis

While the AHRC-funded digitisation of AND2 A-E was underway, the editors had made a further application to the AHRC for a Major Research Award to fund the revision and on-line publication of AND letters F-H, and with the success of this application in 2003 the editorial operation entered another distinctive phase, now set to continue until at least 2012 after the award in mid-2007 of another AHRC grant to support the revision of letters I to M on the same basis.

Prior to this, the Dictionary had been, like most other long-established Humanities research projects in the UK, funded mainly from the baseline salary and premises provision of the Universities who employed the editors (although there had already been significant additional support for portions of the work from the Modern Humanities Research Association and the Leverhulme Foundation). But the AND was now shifted to a funding basis more akin to that of projects in the Sciences (and indeed in continental European lexicography), drawing on competitively-awarded time-limited and target-oriented research grants, employing full-time research assistants, retaining a technical consultant, ‘buying’ the time spent by the editors on overseeing the project from their employing institutions, and reimbursing those institutions for the resources, office space and other material costs entailed in hosting work on the Dictionary, with the whole operation subject to strict financial as well as academic controls, ensuring that the outcomes are delivered on time and are not only valuable to scholars, but also represent demonstrably good value for taxpayers’ money. The AND thus became a leading example of the new, more accountable and efficient, face (and substance) of UK Humanities research.

On this new funding basis, two full-time assistant editors, Dr Virginie Derrien and Dr Geert De Wilde, were appointed, who in due course assumed the main responsibility for the detailed work on the revision of old entries and compilation of new ones, under the day-to-day guidance of David Trotter and with assistance on detailed points of especial difficulty from Bill Rothwell, who also continues to review all entries before publication. Whereas in the previous phase of the revision, the editors had compiled the entries using Microsoft Word, with their word-processor documents then requiring specialist conversion to XML before they could be published on-line, the new editorial staff members worked from the very beginning in XML, meaning that entries created by the editors were ready for immediate on-line publication with no intervening conversion stage. They proved more than able to meet the ambitious time-scheme specified in the grant conditions for the completion and publication of specified letters of AND2, and the work, as well as progressing at an impressive rate while maintaining the highest academic standards, is being made freely available to an international public with none of the delays usually associated with the mechanics of publication, even by purely electronic means.
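
Why entries written directly in XML need no conversion stage can be shown with a small sketch. The element and attribute names used here (entry, headword, sense, definition, citation, source) are hypothetical and chosen only for illustration; they are not the AND’s actual markup scheme, and the headword, definition and quotation are placeholders, not real dictionary content.

```python
import xml.etree.ElementTree as ET

def build_entry(headword, senses):
    """Build a dictionary entry as an XML element.

    `senses` is a list of (definition, citations) pairs, where each
    citation is a (source, quotation) pair. All tag names are hypothetical.
    """
    entry = ET.Element("entry")
    ET.SubElement(entry, "headword").text = headword
    for definition, citations in senses:
        sense = ET.SubElement(entry, "sense")
        ET.SubElement(sense, "definition").text = definition
        for source, quotation in citations:
            citation = ET.SubElement(sense, "citation", source=source)
            citation.text = quotation
    return entry

# Placeholder content only; not a real entry.
entry = build_entry(
    "exemple",
    [("placeholder definition", [("Hypothetical Text 12", "placeholder quotation")])],
)
print(ET.tostring(entry, encoding="unicode"))

# Because the structure is explicit, software can address each part directly,
# for instance listing the sources cited in an entry.
print([c.get("source") for c in entry.iter("citation")])
```

In a word-processor document a headword or a quotation is distinguished only by its appearance on the page, whereas in a structured entry each part is explicitly labelled, which is what allows it to be displayed and searched without further processing.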

There are currently no plans for publishing any of the letters from F onwards in print form, though that possibility has not been ruled out in principle, and there are no obstacles to it in practice, aside from the (considerable) ones of cost. The AND has now become a purely digital undertaking in all its phases. Quite apart from the gains in scope, consistency and efficiency which electronic document preparation brings to the editing process, users of the on-line Dictionary enjoy numerous advantages thanks to Web delivery. They have unrestricted access to the Dictionary anywhere an Internet connection is available; it can contain far more, and longer, quotations than would ever be economical in a printed version; entries can be easily and quickly expanded or corrected after initial publication as knowledge advances, so that users can be sure they are seeing the very latest authoritative material; and as well as the sort of lookups by headword that are all a paper dictionary allows, the electronic Dictionary can be searched in a large number of ways impossible or extremely difficult on paper. The vast majority of users who have expressed a view are strongly in favour of the digital AND. One reviewer, after remarking that ‘the online AND permits an ease, speed and depth of consultation that a printed dictionary could never rival’, concluded that it ‘represents the future of lexicography, in a freely available form that surpasses in every respect the commercial electronic versions of other dictionaries in the field’ [D. Burrows in Medium Aevum Vol 76 (2007)].

The Dictionary now has its own premises, where, close by one of the three Internet servers (synchronised but self-contained and geographically dispersed) which deliver the dictionary to users, the still-invaluable Collas slips are safely housed in a modern filing system, rather than in shoeboxes and elderly and disintegrating drawers.

Links to the past are carefully preserved in the form of a substantial archive of previous drafts, old slips, and various largely uncatalogued papers. The underlying methodology – close analysis of original texts – remains what it has always been, but this is now harnessed to modern editing and delivery technology bringing together, we hope, the best of the old and the new, and making the results freely available to anyone interested, whether they are academics, students, or simply people who would like to find out more about all those aspects of the past that only a proper understanding of Anglo-Norman words and phrases and the civilisation they document can open up.