1
Evaluating the Impact of Integrating Similar Translations into Neural Machine Translation. INFORMATION 2022. [DOI: 10.3390/info13010019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023]
Abstract
Previous research has shown that simple methods of augmenting machine translation training data and input sentences with translations of similar sentences (or fuzzy matches), retrieved from a translation memory or bilingual corpus, lead to considerable improvements in translation quality, as assessed by a limited set of automatic evaluation metrics. In this study, we extend this evaluation by calculating a wider range of automated quality metrics that tap into different aspects of translation quality and by performing manual MT error analysis. Moreover, we investigate in more detail how fuzzy matches influence translations and where potential quality improvements could still be made by carrying out a series of quantitative analyses that focus on different characteristics of the retrieved fuzzy matches. The automated evaluation shows that the quality of NFR translations is higher than the NMT baseline in terms of all metrics. However, the manual error analysis did not reveal a difference between the two systems in terms of total number of translation errors; yet, different profiles emerged when considering the types of errors made. Finally, in our analysis of how fuzzy matches influence NFR translations, we identified a number of features that could be used to improve the selection of fuzzy matches for NFR data augmentation.
2
do Carmo F, Shterionov D, Moorkens J, Wagner J, Hossari M, Paquin E, Schmidtke D, Groves D, Way A. A review of the state-of-the-art in automatic post-editing. MACHINE TRANSLATION 2021; 35:101-143. [PMID: 34720417 PMCID: PMC8550288 DOI: 10.1007/s10590-020-09252-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/31/2019] [Accepted: 10/09/2020] [Indexed: 11/25/2022]
Abstract
This article presents a review of the evolution of automatic post-editing, a term that describes methods to improve the output of machine translation systems, based on knowledge extracted from datasets that include post-edited content. The article describes the specificity of automatic post-editing in comparison with other tasks in machine translation, and it discusses how it may function as a complement to them. Particular detail is given in the article to the five-year period that covers the shared tasks presented in WMT conferences (2015–2019). In this period, discussion of automatic post-editing evolved from the definition of its main parameters to an announced demise, associated with the difficulties in improving output obtained by neural methods, which was then followed by renewed interest. The article debates the role and relevance of automatic post-editing, both as an academic endeavour and as a useful application in commercial workflows.
Affiliation(s)
- Félix do Carmo
  - Centre for Translation Studies, University of Surrey, Surrey, UK
  - ADAPT Centre, Dublin City University, Dublin, Ireland
- Dimitar Shterionov
  - Department of Cognitive Science and Artificial Intelligence, Tilburg University, Tilburg, The Netherlands
  - ADAPT Centre, Dublin City University, Dublin, Ireland
- Joss Moorkens
  - School of Applied Language and Intercultural Studies, Dublin City University, Dublin, Ireland
  - ADAPT Centre, School of Computing, Dublin City University, Dublin, Ireland
- Joachim Wagner
  - ADAPT Centre, School of Computing, Dublin City University, Dublin, Ireland
- Murhaf Hossari
  - ADAPT Centre, School of Computing, Dublin City University, Dublin, Ireland
- Eric Paquin
  - ADAPT Centre, School of Computing, Dublin City University, Dublin, Ireland
- Dag Schmidtke
  - Microsoft, South County Business Park, Leopardstown, Dublin, Ireland
- Declan Groves
  - Microsoft, South County Business Park, Leopardstown, Dublin, Ireland
- Andy Way
  - ADAPT Centre, School of Computing, Dublin City University, Dublin, Ireland
4
Lore Vandevoorde, Joke Daems, and Bart Defrancq (eds.): New Empirical Perspectives on Translation and Interpreting. MACHINE TRANSLATION 2020. [DOI: 10.1007/s10590-020-09246-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]