2011-08-22

“We're here, we're using this language”: Michael Bauer on Scottish Gaelic

   Scottish Gaelic (a.k.a. "Scots Gaelic", "Gàidhlig", or just "Gaelic") is the Celtic language traditionally spoken in Scotland.  It is closely related to my own language of Irish, and also to Manx Gaelic which is spoken on the Isle of Man.  While it has a relatively healthy population of around 60,000 speakers (2001 census), there has been a steady shift to English over the last hundred years, even in the places where the language is the strongest and where it remains the primary community language, on Scotland's Western Isles.  UNESCO's Atlas of the World's Languages in Danger lists Gaelic as "definitely endangered".

   Gaelic has been used online for many years.  Indeed, my friend Caoimhín Ó Donnaíle, who teaches Computing at Sabhal Mòr Ostaig, the Gaelic-medium university on the Isle of Skye, co-founded the email list GAELIC-L as far back as 1989!  You'll find more than twenty years of messages, millions of words of Irish, Manx and Scottish Gaelic, in the archives of that list.

  Over the last couple of years, a flurry of open source software packages has been made available in the language, mostly due to the tireless work of Michael Bauer, who was kind enough to take time out of his busy schedule to talk with us about the state of the language and some of his recent projects.  Michael is self-employed as a full-time language consultant, providing what he calls "Gaelic Language Services": translation, proofreading, adult teaching, linguistic research, and, latterly, micro-publishing. He has produced some truly remarkable online resources for speakers of the language: a large bilingual dictionary (Am Faclair Beag), a high-quality digitized version of Dwelly's famous 1911 dictionary (a massive undertaking, produced over a ten-year period), an open source spell checker (An Dearbhair Beag), and translations of several important software packages, including Firefox, Thunderbird, Opera, and Freeciv, the open source version of the classic game "Civilisation".  All of these projects were done on a purely volunteer basis, with no external funding: a good lesson for any small language groups that might be waiting for financial support before beginning terminology development or software translation projects!  Since the launch of Indigenous Tweets in March, Michael has provided a huge amount of help to me personally, using his broad linguistic expertise to find tweets in several new languages.   You'll find him on Twitter as @LowRisingTone and @akerbeltzalba (Gaelic tweets only).

KPS: For readers unfamiliar with Scottish Gaelic, tell us a bit about the language, how many speakers, whether it's taught in schools, etc.

MB: Scots Gaelic is in a peculiar situation today. It enjoys official support not seen for centuries while at the same time suffering severe attrition of speakers and usage. The 2011 census figures aren't out yet but in 2001, the census reported just under 60,000 speakers. That's just over 1% of the population and to put that into perspective, that's down from just over 200,000 (about 4.5% of the population) in 1901. The largest challenge posed to the language today is a mixture of rural depopulation, an ageing speaker demographic and a collapse of everyday usage in the remaining majority Gaelic-speaking communities up and down the West Coast. Yet at the same time, Gaelic-medium education (GME) is on the increase (though still pitifully low, with about 0.4% of all schoolchildren receiving GME), as are adult learner numbers; the offer of published books is improving, there is a Gaelic TV channel, and the government is broadly supportive of the language.

Michael Bauer
Legally the language has a similarly ambiguous status – for example for immigration purposes, a knowledge of Scots Gaelic fulfils the legal requirements of speaking a UK language and you can sit the Citizenship Test in Gaelic but on the other hand, it's not an official language which you are entitled to use at an official level unless it happens to be on offer.

The other challenge it still faces is widespread ignorance of the language and its history in the general population. The curriculum makes little to no reference to the position of dominance the language enjoyed for centuries or the reasons for its decline and though a majority of people feels broadly supportive of the language, there is still much animosity towards the language and a vocal minority who feels it is irrelevant to Scottish identity in the 21st century. The emergence of the concept of the Gaelic-speaking Highlander (them) and the Scots or English-speaking Lowlander (us, for most) goes back so far that most simply aren't aware that there was a time when a Scot de facto spoke Gaelic. 

KPS: What opportunities are there to use the language online? 

MB: It's a mixed picture. Google has had a Gaelic interface since about 2001 (which I started working on while at university) but I don't know what the uptake is. The main problem is that Google has a very strange approach to selecting which parts of their software suite are up for localisation and which aren't. For example the simple search interface is available, Google Docs isn't.

Facebook isn't available - their selection process is even stranger, but on the bright side, it doesn't seem to deter a lot of people from using the language on Facebook. And I know of a fair number of people who use the Irish interface.


There's an old release of OpenOffice (and I'm working on the update); Microsoft has been working on a (C)LIP [Language Interface Pack] for … oh, a long time but hasn't released anything yet. There are no localised operating systems but I personally feel that's a low priority. With limited resources, I always try to focus on projects which maximise impact. Few everyday users tinker with their OS on a daily basis and even fewer would be confident doing that in Gaelic – there is no Gaelic support team, and you have the problem that many computers are shared by speakers and non-speakers. So it's sort of on my to-do list but way down.

There are a few spell-checkers, only two of which are used widely (again thanks to you for helping us create one of them!). The Firefox app version of one of them has about 400 daily users, which is encouraging. As is the increase in the number of Open Source software packages in general. Firefox was launched in Gaelic in 2010 (and I'd like to thank you for bullying me into that!), followed by Thunderbird (Mozilla's email program), an app for Firefox that lets you switch between interface languages (the Quick Locale Switcher), and hopefully the upcoming release of Lightning, Mozilla's calendar program, and a localised version of Accentuate.us which automatically inserts grave accents. And then there's the re-release of the Opera browser at the end of 2010 (the project had fallen dormant and way behind until I took it over in 2010). The phpBB forum interface has also been translated by a friend and me, and is used in several places now.

There are three main dictionaries online now, plus a few smaller ones. One is essentially a big wordlist (the Stòr-dàta), the other a digitised version of the nearest equivalent Gaelic has to the OED (Dwelly's dictionary) and the third is a merger of Dwelly's and more modern material (called Am Faclair Beag 'the small dictionary'). Dwelly-d and Am Faclair Beag were developed between me and another friend who's a software developer in our free time (at least that's what other people call it).

There's a Gaelic Wikipedia (the Uicipeid) which isn't doing too badly considering the number of speakers but we could do with more active editors, especially fluent ones. I hear the Welsh are thinking of giving retired Welsh teachers some training in how to edit Wikipedia to add more content, which might be a way forward for us too.

Beyond that, there's not much else but overall, the Open Source movement has been a great opportunity so far for Gaelic and will continue to benefit the language. I just wish I could clone myself!

KPS: Many speakers of indigenous and minority languages are reluctant to use their languages online, for many different reasons (orthography, terminology, etc.)  How do speakers of your language feel about using the language online?

MB: Depends on whether we're talking just casually using the language or using localised software. In terms of casual use, terminology on the whole is not a massive issue: both in speaking and writing the language people code-switch a lot and I've rarely come across complications when using an English term in a Gaelic phrase. The only way that usually happens is when you get a learner who hasn't yet developed enough sensitivity to adjust the number of newly coined words they use depending on their audience. That can be a bit of an issue.

Gaelic-medium school in Glasgow, Scotland
Literacy is an issue for many older speakers that still needs addressing, sadly, which unfortunately reduces the number of potential users of the language online overall.

Translation of software on the other hand is an interesting challenge. There's not much that you cannot translate into Gaelic but the challenge is translating it in such a way that a non-technical user of the language can find their way around without having to resort to the dictionary all the time which tends to turn people off. But it can be done with a bit of forethought and a healthy approach to using loanwords. For example, when I was translating Firefox, we had to tackle the term 'export', quite a good example of subtle language engineering. There are several terms in dictionaries for the verb 'export' but they all try to carry the meaning by using native roots, for example 'às-mhalairt' – literally 'out-trade'. That sort of word sometimes works but in this instance it leaves most native speakers confused. So after some debate we settled on a new term, 'às-phortaich' or 'out-port' because it gives non-technical users more clues as to the meaning and that seems to have worked very well. The other aspect of this involves a bit of best practice in translation – when you get volunteers who translate software they often stick too close to the original language which results in really bad translations which put off end-users but if you get it right, it makes the localised versions much more readily acceptable to everyone, including non-technical native speakers.

The writing system is not too much of an issue – there is a grave accent (and an acute if you follow the traditional spelling) but casually, you can understand written Gaelic even without the accents. The one thing that causes minor headaches is the Gaelic ampersand – Gaelic doesn't use '&' but instead the so-called Tironian Ampersand ⁊. And most of you are probably seeing a square box now. QED. On the bright side, the mathematical operator ┐ looks just like it and bizarrely, displays widely so I tend to use that. I'm quite pleased that Mozilla and Opera have it. It may sound like a so-what issue but even Gaels are generally so ignorant about the period when Ireland and Scotland were at the forefront of scholarship in Europe, writing in their own languages and with their own scribal tradition, that it's an important little landmark for the language to have the ┐.
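The substitution Michael describes can be sketched in a few lines. This is only an illustration: the function name and the `font_has_tironian` flag are hypothetical, and it assumes the two characters shown above are U+204A and U+2510 respectively.

```python
# Assumed code points for the two characters discussed above.
TIRONIAN_ET = "\u204a"     # the Tironian sign, often rendered as a box
FALLBACK_GLYPH = "\u2510"  # the widely displayed look-alike

def with_fallback(text: str, font_has_tironian: bool) -> str:
    """Return text unchanged if the sign can be displayed,
    otherwise swap in the look-alike glyph."""
    if font_has_tironian:
        return text
    return text.replace(TIRONIAN_ET, FALLBACK_GLYPH)

sample = "A \u204a B"
print(with_fallback(sample, font_has_tironian=False))
```

The trade-off is that the fallback only looks right; search and spell-checking will treat it as a different character.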

KPS: How is computing terminology for Gaelic developed?  Is there a "language board" or are terms developed naturally by the community?  If there are official terms, how are they communicated to the community?

MB: Yeees... good question. It's very haphazard. There's no official body that oversees terminology development so there are the usual gaps and the problem of too many terms for the same thing or indeed some terms getting overused.

Interestingly, being locale leader on Mozilla, Opera, phpBB, Google and other projects has allowed me to standardize at least web-terminology across most of the software on offer. For example, there are about 4 words each for copy, browser and import but virtually all now use the same terms. A very small number of people is upset about some of the choices but on the whole, people are glad that software is beginning to speak 'the same language'.

KPS: Are there other special challenges your community faces in terms of developing technology for the language and/or communicating online? For example, differences in dialects, different spelling systems, problem with fonts, lack of computing expertise in the community, lack of interest from software vendors like Microsoft/Apple/Google?

MB: All of the above? No, it's not that bad. Funnily enough, access to fast internet is what I'd put at the top of the list. The web is an increasing source of Gaelic stuff, from TV to radio and news, software, the web, access to services and so on. But access is not always straightforward, especially bearing in mind the geography of the West Coast with its many inhabited islands. So funnily enough, in this regard I'd ask Santa for superfast broadband in those remote Gaelic-speaking communities.

There are dialect issues but they're not insurmountable; fortunately, the writing system is native and very old so usually a single spelling can accommodate a vast variety of pronunciations. For example the word 'bainne' (milk) has about a dozen pronunciations but fortunately they can all be derived from the same spelling.

With less than 60,000 speakers, interest from Apple & Co is as you'd expect. Low. Elsewhere, it's not quite that bad. The expertise is there but getting people to commit time is much more of a challenge. That and, which is not so much the fault of the community, the fact that even for the Open Source movement, localisation seems to be a bit of an afterthought. You could argue that the option was always there, which is true, but if you look at the processes within each project and across projects, they are nothing but arcane to your average educated user of any language.

It's a "you'd-think" thing – you'd think that there would be a central pool of translations for all Open Source projects (they're Open Source after all...), pooling all of OpenOffice, Mozilla, Linux, WikiTranslate and so on in one place, with each project able to draw upon this pool. Real-time would be nice but even a manual or nightly update would be great. Instead, I don't know how many times I have translated the word “Edit” or “Close”, “Save as” and so on. On their own, it doesn't seem like much but it adds up. And you have to remember that the ratio of speakers to localisers is big, scarily big. Gaelic has some 60,000 speakers and at the most, 2 1/2 people whom I would regard as being “regularly active” in unpaid localisation of Open Source software. Irish has somewhere between 80,000 and 100,000 highly fluent speakers and at a guesstimate, I'd say maybe half a dozen active people. If you look, for example, at the Mozilla localisation dashboard, you'll find that even large languages like Bengali, Hebrew or Indonesian are struggling to stay up to date. So there's something in the localisation process that's not working as well as it could. Or should.

From a personal angle too, I would have translated Firefox a long time ago but being a good translator is quite obviously not good enough – don't get me wrong, the Mozilla team are great people and very supportive but it's still a big challenge to understand a lot of what you have to do. I wouldn't recommend it without the help of someone who can speak code. That's really something that the Open Source community needs to improve.

Related to that are the more general problems of translation and localisation – programmers in general are very keen to rush off and program some neat bit of code that will calculate your tax, make a roach dance rumba across your screen and remind you to eat and drink once in a while but they rarely seem to consider cross-linguistic issues. They write their code and then downstream, some poor translator is going insane because they chopped up sentences in a way that's ok in English but not any other language, or they go placeholder happy. It's getting better but there's still a lot that needs improving. Plurals for example are getting quite good these days. English has 1 file and 2+ files. So you'd often get things like “You are about to delete %s file” and “You are about to delete %s files” to translate. Let's just say that this is a pattern few languages follow... Today on most localisation projects you can specify which numbers go with which plurals, which is good. But there's still a lot of weird language appearing on screens because such issues are rarely thought about. For example, in English you can use a sentence like “Give me results in” and then just have a dropdown of language names. But in a lot of languages the preposition “in” plus a language name results in a variety of different outcomes. For example “in English” is “sa Bheurla” in Gaelic, but “in Japanese” is “san t-Seapanais”. Or worse, there are languages which don't do prepositions. In Basque for example you have to use an instrumental suffix, resulting in “Ingelesez” and “Japonieraz”. And usually, you don't have the option of having two lists of languages. So you end up with strange syntax and strange idiom, which isn't that great for the user.
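For developers, the two problems Michael raises here can be sketched as follows. This is a minimal illustration, not how any particular project does it: the function and table names are hypothetical, and the Gaelic plural boundaries are an assumption taken from commonly published plural-rule tables (such as CLDR), not from the interview. The phrases in the lookup table are the ones quoted above.

```python
def en_plural_category(n: int) -> str:
    """English: '1 file' versus everything else."""
    return "one" if n == 1 else "other"

def gd_plural_category(n: int) -> str:
    """Scottish Gaelic (assumed rule): four categories, not two."""
    if n in (1, 11):
        return "one"
    if n in (2, 12):
        return "two"
    if 3 <= n <= 10 or 13 <= n <= 19:
        return "few"
    return "other"

# Store the whole phrase per (interface locale, language) instead of
# gluing a translated "in" onto a shared list of language names.
RESULTS_IN = {
    ("gd", "English"): "sa Bheurla",
    ("gd", "Japanese"): "san t-Seapanais",
    ("eu", "English"): "Ingelesez",
    ("eu", "Japanese"): "Japonieraz",
}

def results_in_label(ui_locale: str, language: str) -> str:
    """Look up the full phrase; fall back to the English pattern."""
    return RESULTS_IN.get((ui_locale, language), f"in {language}")

for n in (1, 2, 5, 11, 20):
    print(n, en_plural_category(n), gd_plural_category(n))
print(results_in_label("gd", "Japanese"))
```

The point of the second table is that the translator gets a complete phrase to work with, so the dropdown never has to assemble a sentence from fragments.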
Language revitalization from below!

KPS: Are young people using the language online?  Do you think social media sites like Facebook and Twitter are helping encourage language use by younger speakers?

MB: Good question. If people between 20-35 are young, then yes. As for those under that age bracket, I think they are but I'm not sure, I'm not really connected to any really young people or following any.

But speaking of younger people, there's another project which isn't particularly technical but nonetheless exciting.  It's called the Sgoil-Choimhearsnachd or 'Community School'. The underlying issue we're trying to address is the fact that in a place like Glasgow, even though there are more than 10,000 speakers, they're hard to find. Also, perhaps only some 200 or so regularly show up at Gaelic cultural events – which means we're losing a lot of opportunities for interaction in Gaelic. Beyond that, there's not much on offer for adults as everything is centred on kids. Which is jolly good for the kids but what will they do after leaving school? Not to mention the age makeup of those 10,000 speakers...

We're working on the bold assumption that, as important as traditional stuff is, it's not everyone's cup of tea. Not every American shoots moose and not every Welshman owns a harp. So it's unreasonable to assume that every Gaelic-speaker likes waulking songs or indeed should like them.

So we ran a 6-week pilot in which members of the community who were willing to pass on their skills taught 2 hours a week, offering an art course, Esperanto, creative writing, Tae Kwon Do, Jazz Dancing and Chinese arts and crafts – all taught through the medium of Gaelic. We had some problems with advertising and attendance but we hope to improve that next time round because the feedback was great: people really enjoyed doing something totally different where Gaelic wasn't the target but just a means of interacting. People have to pay a contribution which pays the tutor and the rooms and so on but split between several people, that's not a lot. We also don't require the tutors to have teaching qualifications or suchlike or indeed offer certificates – for the most part, people are just interested in the subjects. It's a very simple model but we have great hopes for it and I think it could be easily applied to other communities.

KPS: Tell me about the third picture above, of the "No Fouling" sign!

MB: It's a picture I took on Skye. It's the other side of the coin, in a way the more precious one and the one that is much harder to achieve. It's just a cheap sign, a piece of wood on a stake with a laminated page someone ran off the printer. But it stands for someone locally who decided to put their own language on the sign as well. No application for funding, no big fuss but a small bit of linguistic landscape that says “we're here, we're using this language”.


KPS: What is your vision for your language in ten years, both in general terms and in terms of software/online use?

MB: In terms of technology, I think we can look forward to a few more programs and applications in the language, with Open Source playing an increasing role. In particular, I'd like to see smaller languages exploit the games niche more, perhaps even on a cross-national collaborative basis. If games can teach German speaking children English without a teacher, then that's something we cannot afford NOT to use. In a way I'm quite proud that 2011 is the year that saw the release of the first Scots Gaelic computer game (Freeciv – the open source development of what many of you over 30 will remember as Civilisation II). It was fun doing the translation, actually, so much better scratching your head over how to say “The Basque catapult has been destroyed by the Babylonian horsemen” than some policy document. But I also can't believe it's 2011... we're really missing a trick here, given how much grief my mum used to give me over playing this game.
Screenshot of Freeciv in Scottish Gaelic

I'd also like to see speech technology advance, in particular to support speakers with shaky literacy. One item on my personal wishlist which probably won't happen is one I mentioned earlier – a shared online repository for all these localisation projects, linked into a better online translation memory. Sort of a mega-Pootle with live suggestions bringing together Mozilla, OpenOffice, Linux and the rest. I waste a lot of time re-translating the same strings.
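The shared repository on Michael's wishlist amounts to a translation memory pooled across projects. A toy sketch of the idea follows; the project names and function names are made up for illustration, and the two candidate terms for "Export" are the ones mentioned earlier in the interview.

```python
from collections import defaultdict

def build_pool(projects):
    """Merge per-project string tables into one pool:
    source string -> {translation: [projects using it]}."""
    pool = defaultdict(dict)
    for project, strings in projects.items():
        for source, translation in strings.items():
            pool[source].setdefault(translation, []).append(project)
    return pool

def suggest(pool, source):
    """List known translations for a source string,
    most widely used first (ties broken alphabetically)."""
    candidates = pool.get(source, {})
    return sorted(candidates, key=lambda t: (-len(candidates[t]), t))

# Hypothetical projects sharing one pool.
projects = {
    "browser": {"Export": "às-phortaich"},
    "mail": {"Export": "às-phortaich"},
    "office": {"Export": "às-mhalairt"},
}
pool = build_pool(projects)
print(suggest(pool, "Export"))
```

Ranking by how many projects already use a term also nudges new translations toward the standardised vocabulary rather than a fresh coinage each time.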

Overall... I'd like to see GME become compulsory in those areas that still have a strong Gaelic-speaking element, which also entails more teachers being trained. Less money on white, flashy elephants and more for someone to grapple with the thorny issue of seriously increasing language use in the Gaelic-speaking communities before we lose them. And nationally, better education about the historical role of Gaelic in Scottish history. Some of the views people have about Gaelic are about as down-to-earth as the birther debate in the United States.