What is the Glossed Audio Corpus of Ainu Folklore?

The last decade has been marked by an increase in global awareness of language endangerment and by the emergence of language documentation as a separate field focused on building multi-purpose corpora of data from endangered languages.

Ainu is now gravely endangered. Ainu was originally not a written language, but thanks to accumulated recording efforts that began more than a century ago, the language, culture and oral literature have been well documented. Ainu studies will therefore continue, and they are most likely to thrive when presented on a wider international scale. This will strengthen the connection of Ainu studies to parallel endangered-language communities elsewhere.

A Glossed Audio Corpus of Ainu Folklore is the first fully glossed and annotated digital collection of Ainu folktales with translations into Japanese and English. The materials were recorded by Hiroshi Nakagawa between 1977 and 1983 with a very talented speaker and storyteller, Mrs. Kimi Kimura (1900-1988, born in Penakori Village, upper district of the Saru River), whose proficiency in Ainu considerably surpassed her proficiency in Japanese. The abundance, repertoire and tempo of the folktales are outstanding.

To provide a safe long-term repository of language materials, the audio files were deposited with the Endangered Languages Archive of SOAS, University of London, along with other outcomes of the project “Documentation of the Saru Dialect of Ainu” (2007-2009; principal investigator: Anna Bugaeva), funded by the Endangered Languages Documentation Programme (Rausing Foundation). The deposit (http://elar.soas.ac.uk/deposit/0107) includes 23 folktales, viz. 20 uepeker ‘prosaic folktales’ and 3 kamuy yukar ‘divine epics’; the total recording time is about 7 hours and the total number of Ainu words is 44,717.

In fiscal year 2015, we are pleased to release 10 glossed folktales (8 uepeker ‘prosaic folktales’ and 2 kamuy yukar ‘divine epics’) with a total recording time of about 3 hours. In the next fiscal year, we plan to release the remaining folktales, about 4 hours of recordings.

This corpus was created as part of two NINJAL Collaborative Research Projects, “Typological and Historical/Comparative Research on the languages of the Japanese Archipelago and its Environs” (project leader: John Whitman; Ainu research group leader: Anna Bugaeva) and “Documentation and Transmission of Endangered Languages and Dialects in Japan” (project leader: Nobuko Kibe), and was funded by the FY2015 Grant for Publication of Project Outcomes.

The Ainu texts were transcribed by Hiroshi Nakagawa (Chiba University, professor; NINJAL project member) and Anna Bugaeva (NINJAL project associate professor; Ainu research group leader). The translations into Japanese were made by Hiroshi Nakagawa and those into English by Anna Bugaeva (with the assistance of Sarah Rumme). English and Japanese glossing (morphological annotation) was done by Miki Kobayashi (Chiba University PhD student; NINJAL adjunct researcher; NINJAL project member) under the supervision of Anna Bugaeva. This outcome would not have been possible without the help of Shirō Akasegawa (Lago Institute of Language), who built the online system.

We truly hope that the corpus will be useful to the Ainu people who are now in the process of revitalizing their language and culture, to the international community of linguists and cultural anthropologists, and to all people who are interested in the Ainu language and oral literature, which are an integral part of human intellectual heritage.

Hiroshi Nakagawa, Anna Bugaeva, Miki Kobayashi

Tokyo, February 23, 2016