Sentence-Alignment and Application of Russian-German Multi-Target Parallel Corpora for Linguistic Analysis and Literary Studies

  • Desislava Zhekova Ludwig-Maximilians-Universität München
  • Robert Zangenfeind Ludwig-Maximilians-Universität München
  • Alena Mikhaylova Ludwig-Maximilians-Universität München
  • Tetiana Nikolaienko Ludwig-Maximilians-Universität München


This paper presents the application of multi-target parallel corpora, each consisting of a single source text and multiple translations of it, to linguistic analysis. We discuss the alignment, interactive search and visualization of this type of data in a dedicated tool, ALuDo (Alignment with Lucene for Dostoyevsky), a Java implementation that uses local grammars, ontological information, bilingual dictionaries and statistical approaches for alignment and search. The data set consists of the Russian novel Crime and Punishment by Fyodor Dostoyevsky and three German translations of it. This bilingual corpus makes possible a range of investigations in linguistics and in literary studies. Additionally, we release part of the resulting parallel corpus.


Secção Temática | Thematic Section


interactive alignment; rule-based alignment; statistical alignment; coreference resolution; paraphrase identification