Endangered Languages


I read in this morning’s Herald that a school in Victoria has been trialing the use of iPods for facilitating school work. iTouches1 are being used to research and submit assignments, to download music, and to let students communicate with their teachers over email. The results so far suggest that students are much more likely to engage with school work on an iPod than through more traditional methods, and are more likely to use the iPods than laptops.

This story ties in with the work that James and I have done over the past year, and will continue throughout this year, on the use of mobile phones for the maintenance of endangered languages. It also overlaps with the government’s ‘education revolution’ promise of the last election, in which each student receives a laptop.

So far the government’s plan has been marred by cost blowouts - although I’m almost certain this is due to the ‘Government letterhead’ effect2 - and concerns about the long-term technical support of the computers. The iTouch wins hands down on both counts, as they’re much cheaper - about 300 bucks as opposed to a grand at least - and they can be easily supported by Apple’s existing technical support infrastructure, especially if the iTouches come with the extended warranty.

Another issue raised here is the future of personal technology - though this is getting considerably geeky of me. I’ve long thought that there was an ever-increasing overlap between personal portable computers and mobile phones. More and more, mobile phones are internet enabled (although costly, as you have to go through your telco), support more data, can run programs, and generally operate like mini-computers. My prediction has been that mobile phones will get bigger and more functional, and laptops will get smaller and more portable, until they meet in the middle with personal PDA-style touchscreen computers with phones in them. Obviously such things have already been created, like Blackberries, iPods and, until recently, Palm Pilots, but the market is only beginning to catch on.

In addition to mobile phone applications for dictionaries of endangered languages, we think we can probably make downloadable programs for other devices, like iPods, and mobile phones that run Android (Google’s open-source and free answer to Apple’s iPhone). And we don’t just mean dictionary viewing programs, but dictionary creation tools as well.

Imagine, for instance, if students of outback schools were equipped with iTouches pre-loaded with bilingual Kriol-English learning programs, and were pre-configured with a Kriol language pack, so that the iTouch’s menus and options started out in Kriol, until such a time as their English literacy reaches the point where they can switch it over to operate it in English.


  1. I’ve written right to the end of this post and realised that I’ve said ‘iTouch’ way too many times. I should point out right from the start that the device may as well be any of this new breed of mobile phone - though preferably something developed by the Open Handset Alliance and running Android. But for ease, I’m just going to refer to ‘iPod’ and ‘iTouch’ all the way through.
  2. The Government letterhead effect is when a private contractor increases their prices exponentially when they receive a quote request with a government letterhead. Remember the guys that wrote ‘No War’ on the Sydney Opera House in red paint? It cost $100,000 to clean.

    As if.

I’ve been back in Sydney for almost a week now, having been in Melbourne before that to attend the University of Melbourne Linguistics and Applied Linguistics Postgraduates Conference, where I presented the Kaurna Electronic Dictionary1 to a sell-out crowd. It was the final leg of an epic, two-part whirlwind tour that began in Wellington almost two weeks ago.


  1. For some background on the dictionary, see these posts (definitely not automatically generated):
    Mobile Phone Dictionaries
    Ceased to Be
    Conferences, Seminars and Dictionaries
    More Good News
    One down, one to go

As I promised last week, I’ve managed to find a copy of the SBS World News report in which I appeared, that mentions and demonstrates the mobile phone dictionary - thanks to Jeremy who recorded it - and so I’ve put it up here.

Just bear in mind that I had no idea that I was going to be interviewed, which is why I’m unshaven and wearing - ahem - a Transformers T-shirt (Decepticons, no less).

I suppose this destroys for good any semblance of internet anonymity that I had feigned.

<UPDATE>
As Michael noticed, I think the large video file was causing some strife for the company that generously hosts this site, Affernet, so I’ve YouTubed it instead.
</UPDATE>

When collecting field recordings, always, always begin each audio file with a little blurb mentioning the date, the location, who’s present, and what language is being researched. It’ll cost you about 10 seconds of each recording and you’ll sound like a bit of a tool repeating yourself, but you’ll save yourself hours of work years later when you (finally) get around to archiving your recordings and you need to find all this information from other sources, like airline booking confirmation emails.

Oh, and transcribe your recordings while they’re fresh in your head, lest you find yourself devoting countless hours of unpaid work to do so when you have a brazillion1 other things to do.


  1. I’m alluding to a George W. Bush joke here:
    One of the president’s advisers rushes into the oval office and tells the president that there’s been a terrorist attack in Rio and that 2 Brazilians have been killed.
    “Oh my God!” screams the president, to the astonishment of the adviser, who didn’t think the death of a mere 2 people would have fazed the president so much. “How many are in a brazillion?”

If you’re in Australia, tune in to SBS World News either tomorrow or Sunday night [I just got a call from them; they've bumped it back to the weekend] at 6:30pm. I have a feeling that there’ll be an interesting report on indigenous languages in Australia, and the use of modern technology (such as electronic dictionaries and mobile phones) in their revitalisation.

Or such was the impression I got when I gave them the interview.

A few weeks ago I mentioned that a bunch of us at Sydney Uni had submitted an abstract for a conference presentation of the Kaurna electronic dictionary.

Just recently, we received the news that our abstract has been accepted. So, if you’re planning on coming along to Australex ‘08 at the Victoria University of Wellington in November and you’d like to see the public unveiling of our Kirrkirr and mobile phone dictionaries, then by all means look out for us - by which I mean me.

As it’s been about a month since my last post, it’s probably about time I posted something at least to ensure that this site doesn’t get referred to as a ‘dead blog’. To make matters worse, not only have I not been posting, I’ve also been neglecting my reciprocal blogger duties of reading other people’s work, which I hope is a good indicator of how busy I’ve been. Reading through the myriad of blogs in my feed reader is normally one of my most favoured activities.

So what is my excuse then?

The same old story really: work. But this time the various jobs are a little different. Besides my regular duties as audio engineer at Paradisec and my unrelenting duties as tutor of first-year linguistics, I have been preparing a grant application with a colleague to continue our work developing electronic dictionaries of minority languages, including dictionaries available as Java applications on your mobile phone1.

We have also been preparing several papers, conference talks, seminars and so on to detail our project and our process of producing visually-rich multimedia electronic dictionaries from basic wordlists. There are a couple of conferences later in the year that this sort of thing would be perfect for, but we also plan to get a paper sent off to some prestigious lexicography journal somewhere.

As a teaser, here’s an abstract that we sent off to one such conference earlier this month:

Kaurna is the indigenous Australian language of Adelaide and the Adelaide Plains. It has not been actively used since 1929, when the last native speaker died. More recently, efforts have been undertaken to restore Kaurna to a state of community use. One recent project involved the creation of an electronic Kaurna dictionary carried out by a team at the University of Sydney during the first half of 2008. As this was a community-driven project, it had certain requirements, such as the need to archivally preserve the two main documentary sources of Kaurna: a book published in 1840, and a hand-written manuscript from 1857.

In an effort to maximise flexibility, portability and transparency, the Kaurna dictionary project opted for an XML formatted master dictionary that could then be converted to other formats, such as an HTML web-page, or even a printed dictionary. The current means of presentation is through Kirrkirr, a multimedia-rich dictionary visualisation tool.

In this project we also developed software for presenting the dictionary on mobile phones. Mobile phones are almost ubiquitous today and most modern mobile phones have the memory capacity and features necessary for storing and presenting the dictionary content. They therefore present an excellent opportunity for learners of minority languages to have access to a dictionary. The mobile phone dictionary software is currently in its early stages, but we hope to improve it with further work and make it available to people compiling electronic dictionaries for other languages.
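
For the curious, the "XML master, converted to other formats" idea in the abstract can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions: the element names (`dictionary`, `entry`, `headword`, `gloss`, `source`), the placeholder headwords and the function names are all made up for the example, and it is not the schema or the code the Kaurna project actually uses.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical master dictionary. The element names and the placeholder
# headwords are assumptions for illustration, not the project's real data.
SAMPLE_XML = """
<dictionary>
  <entry>
    <headword>word-b</headword>
    <gloss>second gloss</gloss>
    <source>1857 manuscript</source>
  </entry>
  <entry>
    <headword>word-a</headword>
    <gloss>fire &amp; firewood</gloss>
    <source>1840 book</source>
  </entry>
</dictionary>
"""


def entries_from_xml(xml_text):
    """Parse the XML master into a list of plain dicts, in document order."""
    root = ET.fromstring(xml_text.strip())
    entries = []
    for entry in root.iter("entry"):
        entries.append({
            "headword": entry.findtext("headword", default="").strip(),
            "gloss": entry.findtext("gloss", default="").strip(),
            "source": entry.findtext("source", default="").strip(),
        })
    return entries


def entries_to_html(entries):
    """Render entries as an HTML definition list, sorted by headword."""
    rows = []
    for e in sorted(entries, key=lambda e: e["headword"].lower()):
        rows.append(
            "<dt>{}</dt><dd>{} <small>({})</small></dd>".format(
                escape(e["headword"]), escape(e["gloss"]), escape(e["source"])
            )
        )
    return "<dl>\n" + "\n".join(rows) + "\n</dl>"


if __name__ == "__main__":
    print(entries_to_html(entries_from_xml(SAMPLE_XML)))
```

The point of keeping the XML as the single master copy is that the web page, a printed edition or a phone version are each just another rendering function over the same parsed entries, so a correction only ever has to be made in one place.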

I’ll let you know how it all goes.


  1. You can read all about this project, which began with Kaurna, at a post of mine here, and at James’ post here. James’ post also includes example software for download, in case you want to try any of this out.

To continue the saga of the stolen wordlists (see my own posts on this here and here, or Peter Austin’s posts here and here for background) I’ve decided that if you can’t beat ’em, join ’em.

It is with that in mind that I give you (over the fold) the Murrinh-Patha crossword puzzle, my own creative work, using Philip M. Parker’s online dictionary of the Murrinh-Patha language.

A few posts back, I wrote about a book that David Nash had found on Amazon.com, which appeared to be a bi-directional crossword-puzzle book between English and Wageman [sic1]. It seemed as though these books, and a few others on Amazon on Wageman, contained the very same wordlist collected by a previous researcher and published under copyright at AIATSIS.

This is by no means an isolated incident. Parker has wordlists for around 600 languages stored online, and could potentially create crossword books, dictionaries and thesauri for each of them. See also Peter Austin’s post at Transient Languages and Cultures regarding a similar thing having happened to the Kamilaroi/Gamilaraay dictionary.

Instead of letting this issue slide into the obscurity of my Mabitjbaran, or Archives, I bought a copy of each, English to Wageman and Wageman to English, and have made contact with the ‘author’, Philip M. Parker, to solicit his explanation of what appears to be a blatant violation of copyright restrictions.

First things first though. The books actually appear to be a pretty good educational resource, assuming that the school in Pine Creek is up to the point of recommencing its Wagiman language programs, of which I’ve only ever seen fleeting evidence2. The books comprise probably hundreds of automatically generated crosswords with the solution words in alphabetical order at the bottom. In spite of the books’ copyright restrictions by their supposed author, I’ve scanned a page of one of these books, which you can view here.

I’ve also done a little more background research on the author of these books, Philip M. Parker, and as it turns out, he’s not at all involved with dictionary compiling, language work or language education. In actual fact, he’s a professor of marketing and a generic entrepreneur at the Singapore campus of an international private business and marketing college based in France, called INSEAD. He even has a biography page on Wikipedia, which is interesting to this topic, as it goes into detail about his book publishing career. Apparently he’s quite famous in the marketing and entrepreneurial world.

His fame derives from the fact that he has developed a process that automatically produces and prints books on demand, with little or no interactive work. Each book that gets printed costs him an estimated 12 pence Sterling. So good is his software apparently that he has authored 85,764 books on sale at Amazon.com.

Parker estimates that it costs him about 12p to write a book, with, perhaps, not much difference in quality from what a competent wordsmith or an MBA might produce.

Nothing but the title need actually exist until somebody orders a copy. At that point, a computer assembles the book’s content and prints up a single copy.

Not much difference in quality from what a competent3 wordsmith might produce? If you check a random selection of some of these books, you’d be forgiven for not seeing what sort of quality he’s referring to:

The 2007-2012 Outlook for Tufted Washable Scatter Rugs, Bathmats, and Sets That Measure 6-Feet by 9-Feet or Smaller in India

Riveting. And that costs US$495.00, in case you were wondering.

What Parker does is harvest data, irrespective of what sort of data it is, and churn out books with it. It doesn’t matter if no one’s interested in the statistical prognostications for the Indian mid-sized bathmat industry, because each book is printed if and only if someone actually orders it; a copy may never actually exist. But considering there are libraries around the world that will buy a copy of each and every publication under the sun, Parker is probably earning a lot of money.

As I mentioned at the start, I’ve made contact with Parker and courteously attempted to solicit some information, such as which wordlist he used, and whether there were any copyright protections on that data. This is the response I got back:

Thank you for your concern; there are no copyright violations. Please feel free to copy my puzzles for your teaching4.

p.s. translations of words, themselves, cannot hold copyright, only the format in which they are presented (translations of single words are public knowledge; translations of creative works are not). I will later be doing anagrams, poems, rhyming sections, etc.. java-based web games (free to use), etc.

I felt a little confused by this response; I’m not very knowledgeable about copyright law and would have expected that someone’s research and work would be protected under copyright. At the same time though, I’m sure that Parker has done his legal research and knows full well what he can and cannot do. Peter Austin has a legal advantage over me in this respect; his Gamilaraay dictionary included some reconstructions:

It is not possible to copyright common knowledge such as words and meanings. Unfortunately for Parker, some of the quoted forms, like muRumuRu on page 11 are creative works since they are reconstitutions which I have posited on the basis of 19th century published and unpublished amateur recordings (as explained in the preface of my dictionaries — note that the orthographic R is not a Gamilaraay sound but a cover term for where I could not determine whether the source represented a flap rr or a continuant r). Now that is copying of creative work without attribution, in my view.

It may turn out to be a little more difficult to demonstrate some ‘creative work’ with the Wagiman dictionary, and we may just have to accept that legally, this sort of blatant plagiarism will be allowed to continue.

Let my warning be this: If you find a book written by Philip M. Parker that looks interesting, avoid it; you can probably find the content online for free.


  1. We spell it Wagiman these days. Wageman was the spelling adopted by earlier researchers, Ethnologue and AIATSIS. Phonetically speaking, I couldn't judge either way. For ease of fact-checking, I'll retain the spelling used in the books.
  2. Perhaps Wamut could help me out here.
  3. Notice also that he implies here that he is an incompetent wordsmith.
  4. I take my blog to be ‘teaching’, thereby indemnifying myself against the apparent copyright violation of my publishing of a scan of one of his crosswords.

Over the weekend, David Nash drew my attention to a book that he found on Amazon, which purported to contain bilingual crossword puzzles in English and Wageman1.

I was a bit perplexed by this, since, well, Wagiman doesn’t have much in the way of practical applications such as second-language learning, that is, of course, beyond the community of Wagiman people. It should be noted at this point though, that this book is not being marketed towards the small community of non-Wagiman speaking Wagiman people, but to a North American audience.

The book is published by a mob called Webster’s Online Dictionary, who I take to have no connection whatsoever to Merriam-Webster, given the look of their respective websites. Theirs appears to contain wordlists of hundreds and hundreds of languages, many of them minority languages, and it seems some of them have been converted to print, albeit in the bizarre form of bidirectional crossword puzzle books.

Here is the product description, as supplied by Amazon, and likely supplied by Philip M. Parker, the person behind Webster’s Online Dictionary:

Webster’s Crossword Puzzles are edited for three audiences. The first audience consists of students who are actively building their vocabularies in either Wageman or English in order to take foreign service, translation certification, Advanced Placement® (AP®) or similar examinations. By enjoying crossword puzzles, the reader can enrich their vocabulary in anticipation of an examination in either Wageman or English.

A translation certificate, Advanced Placement certificate, in Wagiman? Really?

The second includes Wageman-speaking students enrolled in an English Language Program (ELP), an English as a Foreign Language (EFL) program, an English as a Second Language Program (ESL), or in a TOEFL® or TOEIC® preparation program. The third audience includes English-speaking students enrolled in bilingual education programs or Wageman speakers enrolled in English speaking schools.

EFL, ESL, TOEFL or TOEIC programs being run anywhere near Wagiman country? Really?

However, I can see in this book a benefit for some eventual teaching of Wagiman language in the local school, to help increase literacy in Wagiman, but unfortunately, the book uses an outdated orthography and may actually undermine increased Wagiman literacy efforts.

I wouldn’t want to financially support someone who - it appears - has taken a wordlist published in the public domain2 and has created something proprietary, like a book, with the goal of profit in mind, but I think I might still have to have a Wagiman-English crossword puzzle book on my shelf, just for the fun of it.


  1. Wageman was one of the variant spellings. Others include Wakiman (Cook, Austin) and Wogeman (Tryon).
  2. I find it ironic, furthermore, that while the original wordlist was a public domain web-publication, Webster’s Online Dictionary prohibits automatic harvesting of any of their data. I doubt that they copy-pasted each and every entry from the wordlist.
