29 Jun The 2018 EAMT Conference: Key Takeaways
The 21st Annual Conference of the European Association for Machine Translation (EAMT) took place in Alicante, Spain. The conference brought together researchers, developers, LSPs, and translators from around the globe to discuss the present and future of machine translation. Speakers from many different backgrounds invited the audience to ask questions and share their opinions and ideas, which is the essence of this type of event.
Broadly speaking, we can divide the lectures into two main categories: those more connected with the development of MT products, and those more focused on the use of MT.
The first group was the more academic one, and its speakers came mostly from university research departments. They presented examples of their current work and of current trends in the field. One trend that caught our attention was the growing use of synthetic corpora to build an MT engine from scratch. A synthetic corpus is made of aligned content in which the target-language side is the output of processing the source content with an existing MT engine (as opposed to an authentic corpus, in which the target text comes from human translation).
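To make the definition concrete, here is a minimal sketch of how such a corpus could be assembled. The `build_synthetic_corpus` helper and the `toy_engine` function are hypothetical names of our own, and the toy engine is only a word-for-word stand-in for a real MT system; none of this reflects the tooling presented at the conference.

```python
from typing import Callable, Iterable, List, Tuple


def build_synthetic_corpus(
    source_sentences: Iterable[str],
    translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair each source sentence with the output of an existing MT engine."""
    corpus = []
    for sentence in source_sentences:
        sentence = sentence.strip()
        if not sentence:
            continue  # skip blank lines so the pairs stay aligned
        corpus.append((sentence, translate(sentence)))
    return corpus


def toy_engine(sentence: str) -> str:
    """Hypothetical stand-in for an existing MT engine (word-for-word lookup)."""
    glossary = {"hello": "hola", "world": "mundo"}
    return " ".join(glossary.get(word, word) for word in sentence.lower().split())


# The target side comes from the engine, not from a human translator.
synthetic = build_synthetic_corpus(["Hello world", "", "Hello again"], toy_engine)
```

In practice `translate` would wrap whatever existing engine is available, and the resulting pairs would be used as training data for the new engine.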
The idea that better training material yields better results is firmly established in the MT industry, but the results of some work with synthetic corpora show that even firmly established ideas sometimes need to be revisited. Also worth mentioning is the work on minor languages and on languages without large bilingual corpora. It is well known that NMT requires a large corpus to reach an acceptable level of quality. However, this is simply unrealistic for the many languages for which parallel corpora are scarce or almost non-existent. And this does not apply only to minor languages: Telugu, for example, is spoken by over 70 million people in India, but the available bilingual corpora are not enough to build NMT engines with decent results, so statistical or rule-based engines are still the only option.
Regarding the use of MT products, the other main track of the conference, LSPs and translation agencies showed how they integrate MT into their production cycle. Depending on their size, experience, or resources, companies adopt different solutions, such as managing their own MT installations or working with a third-party provider. However, no matter how big or experienced a provider is, all of them seem to share the same challenge: finding post-editors. Many professional translators are still reluctant to work with MT and admit that they prefer to keep doing traditional editing when possible. They claim that some agencies apply a generic rule to assess the effort of a post-editing assignment, even though this effort can vary considerably from one job to another. Additionally, many of them believe that post-editing tasks are not well paid and that they normally take more time than estimated. The key point, hence, is to find an objective way of paying for this service that is easily understood by translators and fair to both parties. While this sounds quite challenging, a consensus will need to be reached sooner or later. In fact, something similar happened with the rates for exact matches, fuzzy matches, and repetitions when CAT tools started to spread around 20 years ago. Therefore, we can expect standard metrics for MT post-editing to emerge in the near future.
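As an illustration of what a per-job measure could look like, the sketch below computes a word-level edit distance between the raw MT output and the post-edited text, normalized by the length of the post-edited text. This is our own assumption about one possible measure, not a metric proposed at the conference, and `word_edit_distance` and `post_edit_effort` are hypothetical names.

```python
from typing import List


def word_edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance over words (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, 1):
        curr = [i]
        for j, word_b in enumerate(b, 1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def post_edit_effort(mt_output: str, post_edited: str) -> float:
    """Edits per word of the final text; 0.0 means the MT output was left untouched."""
    mt_words = mt_output.split()
    pe_words = post_edited.split()
    if not pe_words:
        return 0.0 if not mt_words else 1.0
    return word_edit_distance(mt_words, pe_words) / len(pe_words)


# Two word edits (one substitution, one insertion) over six final words.
effort = post_edit_effort("the cat sat in mat", "the cat sat on the mat")
```

A score like this is computed after the job is done, so it reflects the actual changes made rather than a generic rule fixed in advance. It does not capture time spent or cognitive effort, which is part of why agreeing on a fair measure is hard.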
On the translators’ side, a common comment was that development teams don’t normally consult them or take their opinions into account. Though this may be a subjective impression, it is true that most of the presentations in the research and development track did not include views or feedback from linguistic teams. Perhaps involving linguists in the initial stages would pave the way to acceptance of MT among translators.
At Donnelley Language Solutions, which is now part of SDL, we share these conclusions. We have always catered to and fully supported our community of linguists, providing them with multiple fast payment methods, among other services such as SDL MultiTrans training. ProZ’s Blue Board, a risk management tool that aggregates reviews from linguists, gives us an aggregate score of 4.7 out of 5 for the last five years. Even though it’s a good score, we continuously strive to improve, and our Vendor Management team works hard at addressing and managing our linguists’ requests and concerns.
As for how our tools help translators deal with machine translation output, SDL MultiTrans TMS connects to multiple MT engines and allows post-editors to do their work in its Web Editor, which was designed to make the post-editor’s job easier. Adhering to the client’s terminology is a key aspect of translation. By continuously updating our MT engines, we ensure that terminology, style, and preferred changes are reflected in the MT output, thus improving its quality and making the post-editing task smoother. Furthermore, the post-editor has access to the approved terminology from the SDL MultiTrans Web Editor, which allows them to run consistency checks and deliver top-quality translations.
All in all, the EAMT conference was a good opportunity to meet professionals in the MT industry and exchange views and ideas. Next year, the European conference will coincide with the International one, which will take place in August in Dublin. We hope to attend it again and meet more professionals to continue exchanging ideas about this exciting and ever-evolving industry!