This tutorial shows how to use OmegaT to edit the XLIFF files that WPML produces.

1. Create and open a new project

Select Project → New… from the menu.

Navigate to the folder where you wish to save the project files, and type in a name for the translation project. This name will be used for the main project folder; all the project files will be stored in this folder or its subfolders.

OmegaT will prompt you to confirm or change the project folders to be created, using the dialog box below:

Create and open a new project

Select the source and target language codes (2 letters) or language-and-region codes (2 + 2 letters) from the drop-down lists, or type them in by hand.

If you want OmegaT to segment your files in a specific way, check the segmentation rules now, or do that later from the menu: Options → Segmentation…

When you click OK to accept the project set-up, OmegaT will prompt you to select the source documents to import. You can import individual files, or you can import entire folder trees (with all files in all subfolders).

To check your list of files to be translated, consult the Project Files window (Menu: Project → Project Files…, if it does not open automatically). If you have had to change the contents of the Source folder, remember to reload the project first (Menu: Project → Reload). OmegaT opens the first file in the project list by default.

OmegaT can translate only files in the formats below, and only if their names match the patterns defined in the file filters. Any other files will be ignored.

  • OpenDocument/OpenOffice.org
  • Plain text
  • .po
  • Bundle.properties Java
  • XHTML, HTML
  • HTML Help Compiler
  • INI (‘key=value’ format)
  • DocBook
  • Microsoft Open XML
  • Okapi monolingual XLIFF
  • QuarkXPress CopyFlowGold
  • Subtitle files (SRT)
  • ResX
  • Android resource
  • LaTeX

2. How to import XLIFF files

The format that matters most here is XLIFF. If you try to import a file with the *.xliff extension, you may be surprised to find that the file is not recognized. To solve this, open the file filters (Menu: Options → File Filters) and check whether the source filename pattern for XLIFF files includes the *.xliff extension.

How to import XLIFF files

Unfortunately, the default settings of OmegaT include only the *.xlf extension, so you have to add the *.xliff extension yourself: click the row for XLIFF files and then Edit. Then Add a row identical to the existing one, with only the source filename pattern changed to the one you want to use.

Adding .xliff extension

After you finish with these operations, the file will successfully be imported into your project.
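
To see why the default filter skips these files, here is a minimal Python sketch that mimics wildcard matching with the standard fnmatch module. OmegaT does its own matching internally; the file name post-42.xliff and the helper accepted_by_filter are hypothetical and only illustrate the effect of adding a second pattern.

```python
from fnmatch import fnmatch


def accepted_by_filter(filename, patterns):
    """Return True if the file name matches at least one source filename pattern."""
    return any(fnmatch(filename, pattern) for pattern in patterns)


default_patterns = ["*.xlf"]               # the only default pattern described above
extended_patterns = ["*.xlf", "*.xliff"]   # after adding the extra row

print(accepted_by_filter("post-42.xliff", default_patterns))   # False
print(accepted_by_filter("post-42.xliff", extended_patterns))  # True
```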

3. Source segmentation

OmegaT can segment a text in two ways: by paragraph or by sentence. To choose the type of segmentation, select Project → Properties… from the main menu and use the check box provided.

Edit Project

Note that paragraph segmentation is largely outdated; for most projects, sentence segmentation is the better choice. If sentence segmentation has been selected, you can set up the rules by selecting Options → Segmentation… from the main menu.

OmegaT first parses the text for structure-level segmentation. During this process it is only the structure of the source file that is used to produce segments.

For example, text files may be segmented on line breaks, empty lines, or not be segmented at all. Files with formatting (OpenOffice.org documents, HTML/XLIFF documents, etc.) are segmented on block-level (paragraph) tags.
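
As a rough illustration of structure-level segmentation for plain text, the Python sketch below splits the same sample either on every line break or on empty lines only. The function names and the sample text are made up for this example; OmegaT's text filter is not implemented this way, it merely offers comparable options.

```python
import re


def split_on_line_breaks(text):
    """Treat every non-empty line as its own block."""
    return [line for line in text.splitlines() if line.strip()]


def split_on_empty_lines(text):
    """Treat runs of lines separated by blank lines as one block each."""
    blocks = re.split(r"\n\s*\n", text)
    return [block.strip() for block in blocks if block.strip()]


sample = "Title\nSubtitle\n\nFirst paragraph,\nstill the first paragraph.\n\nSecond paragraph."

print(len(split_on_line_breaks(sample)))   # 5 blocks, one per line
print(len(split_on_empty_lines(sample)))   # 3 blocks, one per paragraph
```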

After segmenting the source file into logical units, OmegaT will further segment these blocks into sentences.

There are two kinds of segmentation rules:

Break rules separate the source text into segments. For example, “Did it make sense? I was not sure.” should be split into two segments, which means there should be a break rule for “?”.

Break Rules

Exception rules specify which parts of the text should NOT be separated. In spite of the period, “Mrs. Dalloway” should not be split into two segments, so an exception rule should be established for Mrs (and Mr, Dr, Prof, etc.) followed by a period.

Exception Rules

4. Rules creation

In order to edit or expand an existing set of rules, simply click on it in the top table. The rules of the set will appear in the bottom half of the window.

Edit or expand an existing set of rules

To create an empty set of rules for a new language or file pattern, click Add in the upper half of the dialog. An empty line will appear at the bottom of the upper table (you may have to scroll down to see it). Change the name of the rule set and the language pattern. The language pattern follows regular expression syntax. If your set of rules handles a language-country pair, we advise you to move it to the top using the Move Up button.

Create an empty set of rules for a new language/type of file pattern

The Break/Exception check box determines whether a rule is a break rule (check box set) or an exception rule (check box unset). Two regular expressions, Before and After, specify what must come before and after a position for the rule to apply there.
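
The interplay of break rules, exception rules, and the Before and After patterns can be sketched in a few lines of Python. This is not OmegaT's code: the Rule class and the segment function are hypothetical, and the sketch assumes that the first rule in the list that matches at a position decides what happens there, which is why the exception rule is listed first.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    is_break: bool  # True = break rule, False = exception rule
    before: str     # regex that must match the text ending at the position
    after: str      # regex that must match the text starting at the position


def segment(text, rules):
    """Split text wherever the first matching rule is a break rule."""
    segments, start = [], 0
    for pos in range(1, len(text)):
        for rule in rules:
            before_ok = re.search(f"(?:{rule.before})\\Z", text[:pos])
            after_ok = re.match(rule.after, text[pos:])
            if before_ok and after_ok:
                if rule.is_break:
                    segments.append(text[start:pos].strip())
                    start = pos
                break  # the first matching rule decides; skip the rest
    segments.append(text[start:].strip())
    return segments


rules = [
    Rule(False, r"(Mrs|Mr|Dr|Prof)\.", r"\s"),  # exception: no break after titles
    Rule(True, r"[.?!]", r"\s"),                # break after sentence-ending marks
]

print(segment("Did it make sense? I was not sure. Mrs. Dalloway agreed.", rules))
# ['Did it make sense?', 'I was not sure.', 'Mrs. Dalloway agreed.']
```

If the two rules were listed in the opposite order, the break rule would match first at “Mrs.” and split the name from the title, which is the kind of mistake an exception rule is meant to prevent.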

5. Segmentation rules creation for XLIFF files

When dealing with XLIFF files, you will want to define a segmentation rule right away; otherwise the document to translate will look something like this:

Segmentation rules creation for XLIFF files

As with any file with formatting, segmentation happens at paragraph-level tags, which does not produce well-organized segments for XLIFF files, so you will probably want to split the translatable content into separate segments. To do that, you need to create a segmentation rule.

Although XLIFF is a standard format, these files contain some elements that are specific to WordPress. Normally, in HTML, you insert <br /> tags to generate line breaks. In WordPress, newlines generate line breaks too. In the XLIFF files, these newlines are replaced with a tag: <br class="xliff-newline"/>.

You can easily create a rule for this kind of file by following the steps in the “Rules creation” section. Use this tag as the After pattern and “.” as the Before pattern; “.” is a predefined character class that matches any character except line terminators.

Create the rule as in:

Segmentation setup-creating a rule

After the rule is applied, the inconvenient tags are placed in separate segments (for empty lines) or at the start of a segment, before the sentence (for lines that contain text). As a result, the translatable content is easier to spot than before.

Segmentation setup-rule applied
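
The effect of this rule can be reproduced outside OmegaT with a short Python sketch. It assumes the tag appears as literal text in the segment, uses “.” as the Before pattern and the escaped tag as the After pattern, and breaks at every position where both match. The function split_before_tag and the sample text are hypothetical.

```python
import re

NEWLINE_TAG = '<br class="xliff-newline"/>'


def split_before_tag(text, before=r".", after=re.escape(NEWLINE_TAG)):
    """Break wherever `before` matches the preceding text and `after` matches what follows."""
    before_re = re.compile(f"(?:{before})\\Z")
    after_re = re.compile(after)
    segments, start = [], 0
    for pos in range(1, len(text)):
        if before_re.search(text[:pos]) and after_re.match(text, pos):
            segments.append(text[start:pos])
            start = pos
    segments.append(text[start:])
    return segments


sample = (
    "Hello world" + NEWLINE_TAG + NEWLINE_TAG
    + "Second paragraph" + NEWLINE_TAG + "Last line"
)

for part in split_before_tag(sample):
    print(part)
# Hello world
# <br class="xliff-newline"/>
# <br class="xliff-newline"/>Second paragraph
# <br class="xliff-newline"/>Last line
```

The tag that stands for an empty line ends up alone in its own segment, while the other tags sit at the start of the segment that carries the text.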

6. Create translated documents

When you have translated all the segments (or earlier if you wish), OmegaT will update the target document(s) using the translations stored in the translation memory. To do so, select Project → Create Translated Documents from the menu. OmegaT will build translated versions of all the translatable documents in the Source folder of the project, whether or not they have been fully translated. The wholly or partially translated files will be saved to the project’s Target folder with the same name and the same extension as the source document. To finalize your translation, open the target files in their associated applications (browser, word processor…) to check the content and formatting of your translation. You can then return to OmegaT to make any necessary corrections; do not forget to recreate the translated documents.

Project’s directory

In the screenshot above you can see the whole project directory.

Let’s say you have just finished translating an XLIFF file and want to upload it into WPML. All you need to do is find the translated file in the Target folder and upload it from that location.

Target folder

You can also make a copy of it and save it in another location. Any changes to the translation’s formatting will be made in WPML, and corrections to the translation itself can be made either in WPML or in OmegaT.
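
Before uploading, it can be useful to check whether any unit in the translated file was left empty. The Python sketch below is one possible check, not part of OmegaT or WPML. It assumes the file follows XLIFF 1.2 (trans-unit elements in the urn:oasis:names:tc:xliff:document:1.2 namespace); the function name and the inline sample document are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the file uses the XLIFF 1.2 namespace.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def untranslated_units(xliff_text):
    """Return the ids of trans-units whose <target> is missing or empty."""
    root = ET.fromstring(xliff_text)
    missing = []
    for unit in root.iterfind(".//x:trans-unit", XLIFF_NS):
        target = unit.find("x:target", XLIFF_NS)
        if target is None or not "".join(target.itertext()).strip():
            missing.append(unit.get("id"))
    return missing


# Hypothetical two-unit document standing in for a file from the Target folder.
sample = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="hypothetical-post" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="title">
        <source>Hello</source>
        <target>Bonjour</target>
      </trans-unit>
      <trans-unit id="body">
        <source>World</source>
        <target></target>
      </trans-unit>
    </body>
  </file>
</xliff>"""

print(untranslated_units(sample))  # ['body']
```

To run the same check on a real file, read its contents as text and pass them to untranslated_units in place of the sample string.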

One Response to “Translating WordPress XLIFF Files in OmegaT”

  1. The just released version of OmegaT (2.3) has added the .xliff extension as a default for the XLIFF filter, along with .xlf and .sdlxliff. I put the request on Sourceforge and sent the code modification right after seeing this article.

    http://tech.groups.yahoo.com/group/OmegaT/message/22308

    https://sourceforge.net/projects/omegat/files/OmegaT%20-%20Standard/OmegaT%202.3.0/
