This is the technical support forum for WPML - the multilingual WordPress plugin.

Everyone can read, but only WPML clients can post here. The WPML team replies on the forum 6 days per week, 22 hours per day.

Supporter availability, Monday to Friday: 12:00 to 14:00 and 17:00 to 21:00. Not available on Saturday and Sunday.

Supporter timezone: Europe/Vienna (GMT+01:00)

This topic contains 10 replies, has 2 voices.

Last updated by Bigul 1 year, 4 months ago.

Assisted by: Bigul.

June 28, 2023 at 8:56 am #13908429

olegA-5

I have a question about the algorithm WPML uses to handle text that resembles text already translated and merged (or merged and translated). I have repeatedly run into the following.

Suppose I have a text of three sentences, say, "Sentence one. Sentence two. Sentence three". WPML (in ATE) shows each sentence as a separate segment to translate. I merge them and translate (or translate and merge; the order doesn't matter). Then I create a new text: "Sentence one. Sentence two special. Sentence three". ATE again allocates three segments, and I translate and merge them as well.

Now, if I add something to my first text (without changing anything already written, just appending a new sentence), the translations of everything previously translated are lost, and the three merged segments are split apart again.

I suspect this happens because WPML sees identical sentences in different texts (here, the first and third sentences) and runs some algorithm that splits the merged segments and discards their translations. I want to understand how WPML processes text when the same sentences appear in both already translated and new texts. (By "sentences" I mean the minimal segments that ATE allocates for translation by default.)

June 29, 2023 at 10:04 am #13916611

Bigul
Supporter

Languages: English (English )

Timezone: Europe/Vienna (GMT+01:00)

Hello,

Welcome to the WPML support forum. I will do my best to help you to resolve the issue.

ATE works on the basis of translation memory: https://wpml.org/documentation/translating-your-contents/how-wpml-keeps-track-of-your-translations/

Please share the following details so we can track this down.

1) To help you faster, I've enabled debug information for this support ticket. Please see this link for how to get this information from your site and give it to us: http://wpml.org/faq/provide-debug-information-faster-support/

2) Please share a screencast of this bug for a clear picture. Also, it will help us a lot in our internal communication. You can share it via Google Drive or Dropbox.

3) Visit WPML>>Support>>Advanced Translation Editor>>Error Logs and check if you are getting any errors while having this problem.

--
Thanks!

Bigul

June 29, 2023 at 10:48 am #13916943

olegA-5

Hello Bigul,
I don't have any relevant errors in the log.
The screenshots are below.

Untitled11.png
Untitled09.png
Untitled08.png
Untitled07.png
Untitled06.png
Untitled05.png
Untitled04.png
Untitled03.png
Untitled02.png
Untitled01.png
June 29, 2023 at 4:41 pm #13919843

Bigul
Supporter

Hello,

Thank you for the details. This requires further debugging, so I have created a test site on our sandbox server. Reproducing the issue in a fresh, minimal installation helps us a lot: we can troubleshoot without affecting your live site and escalate the ticket directly to our developers.

Please try the following steps and check whether the bug also occurs on the sandbox site.

1) Click this URL to visit the Sandbox site backend - hidden link
2) Configure WPML like your live site and activate WPML add-ons
3) Create a Page and translate it
4) Create another Page with similar contents
5) Open it for translation

--
Thanks!

Bigul

June 29, 2023 at 5:13 pm #13919919

olegA-5

Done, with the results I described. I didn't do any special configuration except activating the Classic Editor, because I hate Gutenberg.

June 29, 2023 at 5:15 pm #13919927

olegA-5

Please note that your instructions (points 3-5) _do not_ describe my situation. I did not write about problems with creating and translating a page with _exactly_ the same content! Please re-read what I wrote.

June 29, 2023 at 6:32 pm #13920125

olegA-5

Here's what I've done in the sandbox, point by point:
- created a new page;
- wrote on it: "Sentence one. Sentence two. Sentence three."
- opened ATE to translate this page into Spanish;
- translated each of the three segments (sentences) allocated by the ATE;
- merged all three translated segments;
- saved the translation of the new merged segment (in this example, the whole text on the page, which fits into one segment);
- published the translation;
- created a new page;
- wrote on it: "Sentence one. Sentence two special. Sentence three.", i.e., text that _partially_ matches the text on the first, already translated page, such that some of the _original_ segments (as allocated before translation) match in full (here, sentences one and three);
- opened the Spanish translation of this page (no segments have been translated - and that's OK);
- translated each of the three segments;
- merged all three segments;
- saved the translation of this new single segment (in this example, again, the whole text on the page);
- published the translation;
- opened the first of the created pages for editing, added the next sentence after the existing text: "sentence four" (I didn't change anything in the existing text; I didn't touch it - there were still three sentences that had been merged and translated);
- saved the new version of this page;
- opened ATE to update the translation of this page;
- found that ATE allocated four segments, none of them translated; i.e., each of the four sentences is again a separate segment. In other words, the single merged segment consisting of the first three sentences is split apart again, and its translation is lost.

July 3, 2023 at 7:03 am #13930325

Bigul
Supporter

Hello,

Thank you for the details and for configuring the sandbox site. We are currently working on it and will get back to you as soon as possible. Please wait.

Sorry for the late response; it was due to the holidays and the weekend.

--
Thanks!

Bigul

July 5, 2023 at 2:06 pm #13951517

Bigul
Supporter

Hello,

We tested this in detail on several systems. Please check the following screencasts; I hope they show the behavior you are describing.

hidden link

hidden link

As of now, this is the expected result. The Advanced Translation Editor (ATE) breaks long content into smaller segments so that it is easier to translate and to maintain the translation memory.

When a post or page is sent for translation to the Advanced Translation Editor (ATE), an XLIFF file is created and passed to ATE. ATE splits the XLIFF content into small segments and shows them to you. WPML itself does not do any segmentation; the translation tool does, following the semantic rules of HTML and grammar. Every CAT (computer-assisted translation) tool does something similar.
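To make the XLIFF step a little more concrete, here is a minimal sketch in Python of reading the source text out of an XLIFF 1.2 file before any segmentation happens. The sample document, the `read_sources` helper, and the unit id `body` are all made up for illustration; the file that WPML actually sends to ATE may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written XLIFF 1.2 sample; the real file WPML produces may differ.
SAMPLE_XLIFF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="page-1" source-language="en" target-language="es" datatype="plaintext">
    <body>
      <trans-unit id="body">
        <source>Sentence one. Sentence two. Sentence three.</source>
        <target></target>
      </trans-unit>
    </body>
  </file>
</xliff>
"""

# XLIFF 1.2 elements live in this namespace.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def read_sources(xliff_text):
    """Return {trans-unit id: source text} for every trans-unit in the file."""
    root = ET.fromstring(xliff_text)
    return {
        unit.get("id"): unit.findtext("x:source", default="", namespaces=NS)
        for unit in root.iterfind(".//x:trans-unit", NS)
    }


sources = read_sources(SAMPLE_XLIFF)
print(sources)
```

In this picture the whole page body arrives as one unit, and the translation tool splits it into segments afterwards.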

--
Thanks!

Bigul

July 5, 2023 at 3:47 pm #13952311

olegA-5

Bigul,
I'm not sure that you understand the issue.
What is "an expected result"? That the completed translation of a merged segment is erased when the original post is edited _without any change to the part of the text made up of the merged and translated segments_, just because there is another, _different_ text containing segments identical to some of those merged? Is that an expected result?! If it is, why don't you erase all translations of merged segments upon _any_ update of an original post? Just because the smaller the segments, the better, yeah? I write: "One. Two. Three." Merge, translate, save. Done. Then I create a completely independent page where, for some reason, I have the segments "One" and "Three." I translate them and save. After that, I decide to make some changes to the first page that are absolutely unrelated to the part I described above. As a result, my merged, translated, completed segment is split apart and its translation is lost. How is that expected?!
Your video is not about that. It has nothing to do with any _other_ page having an identical segment.

July 6, 2023 at 6:18 am #13955007

Bigul
Supporter

Hello,

Thank you for the details. The Advanced Translation Editor (ATE) employs segmentation rules to improve the translation process and maintain the translation memory. These rules are similar to those used in other CAT tools.

Segmentation occurs under the following circumstances:

1. Full Stop (.)

When a full stop is encountered in the text, ATE treats it as the end of a sentence and starts a new segment. These segments can be joined together if needed.

2. HTML Blocks

HTML block-level elements, such as those listed at this link: hidden link, start on a new line with automatic margin space before and after the element, so each one starts a new segment.

3. Break Tags

If a line contains a break tag (<br/>), ATE will create a new segment based on it.

4. Inline Elements

Inline elements such as "em", "strong", and "span" (listed at this link: hidden link) do not trigger segmentation on their own. Instead, ATE considers the word count and the number of inline tags in the paragraph: if the word count exceeds 50 or there are more than 15 inline tags, segmentation occurs.
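The rules above can be approximated with a short Python sketch. Only the thresholds (50 words, 15 inline tags) come from this reply; the function names, the regular expressions, and the decision to split after a full stop followed by whitespace are assumptions made for illustration. This is not ATE's actual code and will not necessarily reproduce its output, and block-level elements (rule 2) are left out for brevity.

```python
import re

# Thresholds quoted in the reply above; everything else here is an assumption.
MAX_WORDS = 50
MAX_INLINE_TAGS = 15

BREAK_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)
SENTENCE_END = re.compile(r"(?<=\.)\s+")  # whitespace that follows a full stop
INLINE_OPEN_TAG = re.compile(r"<(?:em|strong|span)\b[^>]*>", re.IGNORECASE)
ANY_TAG = re.compile(r"<[^>]+>")


def split_segments(text):
    """Rules 1 and 3: split at <br/> tags, then after each full stop."""
    segments = []
    for line in BREAK_TAG.split(text):
        for part in SENTENCE_END.split(line):
            part = part.strip()
            if part:
                segments.append(part)
    return segments


def inline_paragraph_needs_split(paragraph):
    """Rule 4: segment only when the paragraph is long or tag-heavy."""
    inline_tags = len(INLINE_OPEN_TAG.findall(paragraph))
    words = len(ANY_TAG.sub(" ", paragraph).split())
    return words > MAX_WORDS or inline_tags > MAX_INLINE_TAGS


print(split_segments("Sentence one. Sentence two. Sentence three."))
print(split_segments("Line one<br/>Line two. Line three"))
print(inline_paragraph_needs_split("<strong>Short</strong> paragraph"))
```

With the three-sentence text from this topic, the sketch yields three segments, which matches what ATE showed before any manual merging.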

Regarding your specific case:

Case 1:

Hello 1. Hello one 2. Hello two 3.
Segmentation:
Hello
1. Hello one
2. Hello two 3.

Case 1 Updated:

Hello 1. Hello one 2. Hello two 3. Hello three
Segmentation:
Hello
1. Hello one
2. Hello two
3. Hello three

Please be aware that if you change the text, as in Case 1, and WPML sends the translations to ATE for re-evaluation, any differences in segmentation, as seen in Case 1 Updated, will be reflected. This may split previously joined segments, because the previous segmentation has changed.

On the other hand, if you add text to the end of the existing content, so that new segments are added without affecting the previous ones, the originally joined segments remain unchanged.
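As a rough illustration of these two situations, the Python sketch below models the rule as "joined segments survive only if the old automatic segmentation is an unchanged prefix of the new one". The `joins_survive` function is hypothetical and only mirrors the explanation given in this reply; it is not how WPML or ATE is known to be implemented.

```python
def joins_survive(old_segments, new_segments):
    """True when the old automatic segmentation is an unchanged prefix of the new one."""
    return new_segments[:len(old_segments)] == old_segments


# Segment lists copied from Case 1 and Case 1 Updated above.
case_1 = ["Hello", "1. Hello one", "2. Hello two 3."]
case_1_updated = ["Hello", "1. Hello one", "2. Hello two", "3. Hello three"]

# Hypothetical edit that only appends a new segment after the existing ones.
appended = case_1 + ["A new sentence."]

print(joins_survive(case_1, case_1_updated))  # False: the last old segment was re-split
print(joins_survive(case_1, appended))        # True: earlier segments are untouched
```

In Case 1 Updated the added text changes where the last existing segment ends, so the old segmentation no longer matches; a pure append leaves it intact.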

I hope this explanation helps clarify the segmentation rules. If you have any further questions, please feel free to ask.

--
Thanks!

Bigul

The topic ‘[Closed] WPML algorithm to handle texts similar to parts of already translated and merged’ is closed to new replies.