The controversial ethnographer Daisy Bates recorded many Aboriginal languages in the early 20th century that would otherwise be lost today; her papers have now been digitised
© Dreamtime Story (mural in Adelaide), Michael Coughlan/Flickr
In 1904 Daisy Bates, an Irish-Australian journalist and ethnographer, sent out a questionnaire to squatters, police, and other authorities across Western Australia asking them to record examples of the local Aboriginal language.

Mrs Bates (1859-1951) was something of an eccentric - wearing full Edwardian outfits even when living in small tents in Aboriginal camps - and she remains a controversial figure. But she was one of the few Europeans of the era who lived closely with Indigenous Australians and recorded their culture.

Daisy Bates was one of the few Europeans in the early twentieth century to live closely with Indigenous Australians. Picture: South Australian Museum Archives
The responses to her questionnaire, preserved in 21,000 pages of handwritten notes and typescript, are immeasurably valuable. In some cases they record all we have left of Aboriginal languages that were otherwise lost as a result of European invasion.

The value of the Bates papers

The papers are important not only for a general understanding of the diversity of languages that have been part of Australia's heritage for thousands of years, but also for the people associated with those languages. Aboriginal communities can not only reconnect with their languages through the papers; in some cases they can also trace relatives who are named in them. Some of the words listed have also been used in Native Title claims, establishing continuity of the language over time.

But these papers aren't just hugely valuable to linguistic researchers - even biologists trawl through them to understand the local plants and wildlife named in the different Indigenous languages.

However, until now, they've been largely inaccessible. The pages themselves have been spread across three libraries in different states and territories - the Barr Smith Library in Adelaide, the National Library of Australia in Canberra, and the Battye Library in Perth. Some have been published with English translations, but this work has not been linked back to the primary records in a way that is now possible with digital technology.

The Bates Online project has now done this and more, putting all 21,000 pages online in a searchable database, complete with maps showing where the words and phrases come from, as well as images of the original notes and typescript.

There are 4,500 pages of typescript representing languages from the southern end of the South Australia/Western Australia border all the way up to the Kimberley. At least 123 speakers are named in the vocabularies and, even now, it's not clear how many languages they represent.

The vocabularies preserved in the Daisy Bates questionnaires are extraordinarily precious as little else was recorded in the same time period, and nothing of the same scale has been attempted before or since.

The questionnaires she sent out contained some 2,000 prompt words and sentences in English, and asked each respondent to fill in as much as possible in the local Aboriginal language. This means that, in addition to the lists of words totalling over 90,000 individual items, the collection includes grammatical information in the form of example sentences.

Presenting images of the original lists together with a searchable text version allows new information to be located within the pages - for example, where additional written material is included in the manuscript but not in the typescript.

By including images of all the original documents, people can also verify the accuracy of the typed versions.

We know of more than 300 Indigenous languages in Australia, or about 700 if all the different dialects are included. But for many of these, we have very few records, especially for those languages that were spoken in locations where the European invasion occurred first. This makes the Bates collection a linguistic treasure, but one that has been largely neglected in the past, partly because it has existed only on paper.

New technologies for old papers

The project, a collaboration with the National Library of Australia (NLA), uses cutting-edge methods to make the collection available online. The text of all the vocabularies is encoded with XML tags, which means they can be linked to the image of the source document, and a map of locations provides an entry point to the vocabularies. In addition to the search system, we are developing a 'fuzzy' search system that will allow users to find what they are looking for despite a range of different spellings.
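To give a sense of how a 'fuzzy' search can cope with variant spellings, here is a minimal Python sketch. It is not the project's own code: the letter equivalences, the similarity threshold, the function names and the sample spellings are all assumptions made up for illustration. It folds a few letters together, collapses doubled letters, and then compares what is left using the standard library's `difflib`.

```python
import re
from difflib import SequenceMatcher

# Hypothetical letter equivalences, chosen only for illustration;
# a real system would need rules suited to each source and language.
EQUIVALENTS = str.maketrans({"b": "p", "d": "t", "g": "k"})


def normalise(word: str) -> str:
    """Lower-case, merge the assumed equivalent letters, collapse doubled letters."""
    folded = word.lower().translate(EQUIVALENTS)
    return re.sub(r"(.)\1+", r"\1", folded)


def fuzzy_search(query, entries, threshold=0.8):
    """Return (entry, score) pairs whose normalised spelling is close to the query's."""
    target = normalise(query)
    hits = []
    for entry in entries:
        score = SequenceMatcher(None, target, normalise(entry)).ratio()
        if score >= threshold:
            hits.append((entry, round(score, 2)))
    # Best matches first.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Invented placeholder spellings, NOT taken from the Bates papers.
SAMPLE_ENTRIES = ["talpa", "dalba", "tallpa", "thalba", "mirrin"]

if __name__ == "__main__":
    for entry, score in fuzzy_search("dalpa", SAMPLE_ENTRIES):
        print(entry, score)
```

In this sketch a search for one spelling also returns the other variants that differ only in the folded letters or in doubled letters, while unrelated words fall below the threshold. Tightening or loosening `threshold` trades missed variants against false matches.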
Digitising the Daisy Bates papers creates a new opportunity for Aboriginal communities to reconnect with their endangered languages. © Getty Images
The technology underpinning this project is the Text Encoding Initiative, normally used to create a textual version of classical documents like ancient texts or mediaeval manuscripts. This is the first time it has been applied to Aboriginal language manuscripts in Australia. One of the key benefits of the project is that all the information that has been created is stored in text files that can remain accessible over time.
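As a rough illustration of how XML-encoded text can be tied back to page images, the Python sketch below reads a tiny TEI-style fragment and pairs each vocabulary entry with the image of the page it appears on. The fragment is an assumption, not the project's actual encoding: it leaves out the header a full TEI file would carry, and the file paths, words and English prompts are invented placeholders. It relies on the TEI page-break element `<pb>` and its `facs` attribute, which points to a facsimile image.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented TEI-style fragment for illustration only (no teiHeader,
# placeholder words and image paths, not taken from the Bates papers).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <pb n="1" facs="images/page-001.jpg"/>
    <entry><form><orth>talpa</orth></form><sense><def>example prompt 1</def></sense></entry>
    <entry><form><orth>mirrin</orth></form><sense><def>example prompt 2</def></sense></entry>
    <pb n="2" facs="images/page-002.jpg"/>
    <entry><form><orth>dalba</orth></form><sense><def>example prompt 3</def></sense></entry>
  </body></text>
</TEI>"""


def entries_with_images(tei_xml: str):
    """Pair each entry with the image of the most recent page break before it."""
    root = ET.fromstring(tei_xml)
    body = root.find(".//tei:body", TEI_NS)
    current_image = None
    rows = []
    for element in body:  # direct children of <body>, in document order
        tag = element.tag.split("}")[-1]  # strip the namespace prefix
        if tag == "pb":
            current_image = element.get("facs")
        elif tag == "entry":
            rows.append({
                "word": element.findtext("tei:form/tei:orth", namespaces=TEI_NS),
                "english": element.findtext("tei:sense/tei:def", namespaces=TEI_NS),
                "image": current_image,
            })
    return rows


if __name__ == "__main__":
    for row in entries_with_images(SAMPLE):
        print(row["word"], "|", row["english"], "|", row["image"])
```

Because the markup is ordinary text, a short script like this is enough to read it, which is part of why text files of this kind can stay usable as software changes.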

Today, fewer than 50 Australian Indigenous languages are still spoken, and only about 15 of those are being passed on to the next generation. At a time when Australian Indigenous languages are under severe threat from English, it's critical to make the best historical sources available to everyone so they're not forgotten, but particularly to Indigenous Australians who want to relearn and reinforce their languages.

Bates Online will be officially launched on 12 June 2018 at the National Library of Australia in Canberra. The project was undertaken as part of Associate Professor Nick Thieberger's ARC Future Fellowship, and has been supported by the Faculty of Arts, University of Melbourne; the Australian Research Council; the ARC Centre of Excellence for the Dynamics of Language; and the National Library of Australia.