TRANSLATION MEMORY


Translators rarely translate completely new documents that are unrelated to any texts they have seen before. More often, a translator may recall examples of translations for an idiom or domain-specific expression but be unable to locate the source material to confirm her suspicions.

In these situations, a memory of good translations that can be searched easily by keyword would be an ideal aid to the translator. CRL's "Translation Memory" tool provides this capability. In most translation memory schemes, however, getting examples into the database can be difficult. CRL's "XAlign" tool can automatically pair sentences or passages from translated documents with high accuracy. Translations can then be stored in Translation Memory directly from XAlign, where they are immediately available for searching.

Operation

XAlign and Translation Memory are separate windows that work together. The first step is getting translations into Translation Memory. Once translations are in, you can then search for examples of past usages and quickly scan the examples for the most appropriate ones.
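The store-then-search workflow can be pictured with a minimal Python sketch. This is an illustration only, not the Translation Memory tool's implementation: the class name `TranslationMemory`, its methods, and the sample sentence pairs are all hypothetical, and a plain case-insensitive substring match stands in for whatever indexing the real tool uses.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TranslationMemory:
    """Toy in-memory store of (source, target) segment pairs (hypothetical)."""

    pairs: List[Tuple[str, str]] = field(default_factory=list)

    def add(self, source: str, target: str) -> None:
        """Store one aligned pair of segments."""
        self.pairs.append((source, target))

    def search(self, keyword: str) -> List[Tuple[str, str]]:
        """Return every pair whose source or target contains the keyword.

        Matching is a case-insensitive substring test, which is an
        assumption made for this sketch.
        """
        needle = keyword.casefold()
        return [
            (src, tgt)
            for src, tgt in self.pairs
            if needle in src.casefold() or needle in tgt.casefold()
        ]


# Step 1: get translations into the memory.
tm = TranslationMemory()
tm.add("It is raining cats and dogs.", "Il pleut des cordes.")
tm.add("The cat sleeps.", "Le chat dort.")

# Step 2: search for past usages and scan the matches.
hits = tm.search("cats")
for source, target in hits:
    print(source, "<->", target)
```

Keeping each entry as a source/target pair means a keyword found on either side of the translation brings back both sides for the translator to inspect.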

You use XAlign to get translated texts from the Tipster Document Manager. XAlign displays the texts side by side and applies a segmentation strategy that chunks them according to a user-specified scheme, for example by punctuation or by SGML or HTML markup. After segmentation, you can perform automatic alignment of the segments. This is not foolproof, but it is often very helpful in getting an initial pairing of translated segments. You can then manually correct any wrong pairings and send the results to Translation Memory to be stored in an existing or new database.
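The segment-then-align sequence can be sketched in Python as follows. The text does not describe XAlign's actual alignment algorithm, so this sketch substitutes a simple length-based dynamic programming heuristic: it pairs segments whose character lengths are similar and allows one segment to match two, or to be left unpaired, at a small penalty. The function names, the default tag list, and the penalty value are all assumptions made for illustration.

```python
import re
from typing import List, Tuple

Pairing = Tuple[List[str], List[str]]


def segment_by_punctuation(text: str, terminators: str = ".!?") -> List[str]:
    """Split text at whitespace that follows a terminator character."""
    pattern = r"(?<=[" + re.escape(terminators) + r"])\s+"
    return [s.strip() for s in re.split(pattern, text) if s.strip()]


def segment_by_markup(text: str, tags: Tuple[str, ...] = ("p", "br", "li")) -> List[str]:
    """Split text at the named SGML/HTML tags, then drop any other tags."""
    alternation = "|".join(re.escape(t) for t in tags)
    boundary = re.compile(r"</?(?:" + alternation + r")\b[^>]*>", re.IGNORECASE)
    cleaned = (re.sub(r"<[^>]+>", "", chunk).strip() for chunk in boundary.split(text))
    return [chunk for chunk in cleaned if chunk]


def align(src: List[str], tgt: List[str], penalty: float = 2.0) -> List[Pairing]:
    """Pair segments by minimising total character-length mismatch.

    Allowed moves are 1-1, 1-2, 2-1, 1-0 and 0-1 (source-target counts);
    every move other than 1-1 costs an extra `penalty`. This is a stand-in
    heuristic, not XAlign's algorithm.
    """
    moves = [(1, 1), (1, 2), (2, 1), (1, 0), (0, 1)]
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[(0, 0)] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            for di, dj in moves:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                src_len = sum(len(s) for s in src[i:ni])
                tgt_len = sum(len(t) for t in tgt[j:nj])
                step = abs(src_len - tgt_len) + (0.0 if (di, dj) == (1, 1) else penalty)
                if cost[i][j] + step < cost[ni][nj]:
                    cost[ni][nj] = cost[i][j] + step
                    back[ni][nj] = (di, dj)
    # Walk the chosen moves backwards from the end of both texts.
    pairs: List[Pairing] = []
    i, j = n, m
    while i > 0 or j > 0:
        di, dj = back[i][j]
        pairs.append((src[i - di:i], tgt[j - dj:j]))
        i, j = i - di, j - dj
    pairs.reverse()
    return pairs


english = segment_by_punctuation("The cat sleeps. It is raining. We stay home.")
french = segment_by_punctuation("Le chat dort. Il pleut. Nous restons à la maison.")
for src_group, tgt_group in align(english, french):
    print(src_group, "<->", tgt_group)
```

A length-only heuristic like this can pair the wrong segments when translations differ greatly in length, which is one reason a manual correction step before saving to Translation Memory remains useful.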

Highlights

After you have created a Translation Memory database, you can then use the TM tool to search for examples. Some features of TM are:

In combination, XAlign and Translation Memory provide you with the tools to manage translations and make them available for future use.

Configuration

XAlign segmentation schemes can be designed by the user to meet specific segmentation needs. For example, a scheme that splits documents on HTML markup may not be appropriate for free text, and sentence-ending punctuation may differ between languages. In XAlign, segmentation schemes are saved transparently and remain available to each user from session to session.
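One way to picture user-defined schemes that persist between sessions is as named records written to a per-user file. The sketch below is an assumption-laden illustration in Python: XAlign's real storage format is not described here, and the scheme names, the fields `kind`, `terminators`, and `tags`, the JSON encoding, and the helper functions are all hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict

Scheme = Dict[str, Any]

# Hypothetical schemes: punctuation differs by language, and marked-up
# documents are split on tags instead of punctuation.
EXAMPLE_SCHEMES: Dict[str, Scheme] = {
    "english-free-text": {"kind": "punctuation", "terminators": ".!?"},
    "japanese-free-text": {"kind": "punctuation", "terminators": "。！？"},
    "html": {"kind": "markup", "tags": ["p", "li", "br"]},
}


def save_schemes(schemes: Dict[str, Scheme], path: Path) -> None:
    """Write the user's schemes to a JSON file so a later session can reuse them."""
    path.write_text(json.dumps(schemes, ensure_ascii=False, indent=2), encoding="utf-8")


def load_schemes(path: Path) -> Dict[str, Scheme]:
    """Read previously saved schemes, or return an empty set on first use."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))
```

Calling `load_schemes` at startup and `save_schemes` whenever a scheme changes would give the session-to-session behaviour described above without the user having to manage the file directly.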

From within XAlign, you can also create new Translation Memory databases and add translations to them. The databases are all managed by CRL's NDS server, which may run on a remote computer. This relieves the local host computer of processing and indexing the translation texts. The location of the Translation Memory databases is specified in the NDS configuration file.

Status

XAlign and Translation Memory are integrated components of Oleada. They make use of the Tipster Document Manager (TDM) and Norm Data Server (NDS) for fully distributed text computing. To use XAlign and Translation Memory, you must import or create translated documents within Oleada. You can then load them into XAlign for alignment and save them to Translation Memory.

The automatic alignment algorithm used by XAlign was developed to cope with real-world documents, including documents with different markup schemes. At present, the algorithm is best suited for French and Spanish, although XAlign is fully multilingual and alignments can be prepared in any of the supported Oleada languages.


Oleada/Cíbola Home Page
Last Modified: 12:54pm MDT, July 25, 1996