Systems and Means of Informatics

2020, Volume 30, Issue 2, pp. 124-135

INSTABILITY OF NEURAL MACHINE TRANSLATION

A. Yu. Egorova
I. M. Zatsman
V. V. Kosarik
V. A. Nuriev

Abstract

The paper describes an experiment studying the instability of neural machine translation (NMT). Over the course of a year, an array of Russian text fragments was repeatedly translated into French at one-month intervals using Google's NMT system. The experiment reveals the instability of NMT: translations of a given text fragment tend to change over time, but the changes do not always improve quality. The generated translations were linguistically annotated, which uncovered several distinct types of NMT instability. The annotation relied on a previously designed classification of machine translation errors, adapted to meet the objectives of the experiment. The ultimate goal of the experiment was to obtain a frequency distribution of the different types of NMT instability. Its first step, however, was limited to categorizing NMT instability, and it is this step that the paper describes. As empirical data, the experiment uses Russian-French annotations generated in a supracorpora database. Each annotation contains a fragment of the source Russian text, its French translation, and a description of the translation errors found in it.

