#2 When updating file in Zanata, translation are removed

Closed
opened 5 years ago by LecygneNoir · 6 comments
Owner

Hello,

We have noticed that when we update a file in Zanata with a new version, the existing translation is sometimes removed.

From my investigation, there are 2 cases:

  1. The dev has completely changed a text: its key, its content, etc. Zanata therefore assumes the old one was deleted and the new one is entirely new.
  2. The dev has changed a text's content but not its key. In this case, it seems Zanata assumes the text has changed so much that the translation is no longer valid, and drops it.

Case 1 could be difficult to patch. If the dev changed both the key and the text, we have no real means of finding the original translation :'(

For case 2, on the other hand, it would be better if Zanata did not remove the translation but marked it fuzzy. We need to find out whether there are parameters to change how Zanata deals with that. Perhaps using a pot key instead of comparing the text content, or telling it to use fuzzy instead of removal.
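
To make the behaviour wanted for case 2 concrete, here is a minimal sketch in Python. It is not Zanata's code or API: the `Entry` class, the `update_catalog` function and the `card.target` key are all hypothetical names used only for illustration. It assumes strings are matched by key, and that a changed source text keeps its old translation flagged as fuzzy.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One translatable string (hypothetical model, not a Zanata type)."""
    source: str
    translation: str = ""
    state: str = "untranslated"  # "translated", "fuzzy" or "untranslated"


def update_catalog(old: dict[str, Entry], new_sources: dict[str, str]) -> dict[str, Entry]:
    """Build the updated catalog, matching entries by key instead of by text."""
    updated = {}
    for key, source in new_sources.items():
        previous = old.get(key)
        if previous is None or not previous.translation:
            # Case 1: unknown key, nothing can be recovered.
            updated[key] = Entry(source)
        elif previous.source == source:
            # Same key, same text: keep the translation untouched.
            updated[key] = Entry(source, previous.translation, previous.state)
        else:
            # Case 2: same key, changed text: keep the translation, flag it fuzzy.
            updated[key] = Entry(source, previous.translation, "fuzzy")
    return updated


old = {"card.target": Entry("Old English text", "<existing translation>", "translated")}
new = update_catalog(old, {"card.target": "New English text", "card.other": "Brand new text"})
print(new["card.target"].state, new["card.other"].state)
```

With this model, only strings whose key disappears lose their translation; everything else is either kept or queued for review.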

I'll dig through the documentation, support and options to see if something exists.

Poster
Owner

Probably related: https://zanata.atlassian.net/browse/ZNTA-1134

But the ticket is old and still not solved, and options exist to deal with copying/comparing translations 🤔

It also talks about the key system; perhaps the current compilation does not use the keys properly. Specifically, I need to check the different comment formats for gettext. I must have missed something that would allow Zanata to recognize identical translations.

Poster
Owner

I have created a debug version in Zanata to run some tests on it.

https://translate.zanata.org/iteration/view/thea2/debug-version?dswid=-8148

The set of options to check existing translations and decide whether or not to copy them when uploading files seems a good start :-)

Poster
Owner

Okay, in fact Zanata keeps the translated text in its translation memory, BUT does not copy it automatically if the English text has changed too much.

In the attached screenshot, the English changes from `The target has to be an empty card slot in either ally melee or ally ranged row.` to `Targets an empty card slot in either ally melee or ally ranged row.`, so Zanata chooses not to copy the translation, as it matches "only" 88%...

The good news is that the translation is not lost: we can copy it from the memory using the "Copy" button.

But it's still a manual action, clearly not perfect...
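
To get a feel for how such a match percentage comes about, the two English strings can be compared with Python's standard `difflib`. This is only an illustration: Zanata uses its own similarity algorithm, so the number printed here is not expected to equal the 88% shown in the screenshot. The `similarity` helper is a name made up for this sketch.

```python
from difflib import SequenceMatcher


def similarity(old: str, new: str) -> float:
    """Return a ratio between 0.0 (nothing in common) and 1.0 (identical)."""
    return SequenceMatcher(None, old, new).ratio()


old_text = "The target has to be an empty card slot in either ally melee or ally ranged row."
new_text = "Targets an empty card slot in either ally melee or ally ranged row."

score = similarity(old_text, new_text)
print(f"similarity: {score:.0%}")
```

Most of the sentence is unchanged, so the score is high, yet it stays below 100% and the translation is therefore not copied automatically.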

Poster
Owner

There is a tool in Zanata which does exactly what we need: merge translation from translation memory (http://docs.zanata.org/en/release/user-guide/versions/version-tm-merge/).

Basically, it searches all files for strings matching the translation memory and copies the translation as fuzzy for review.

In my test, it found matches for a lot of untranslated strings, and even filled files we have not translated yet :-)

Copied as Translated
100% Match
28,441 word(s) — 152,887 character(s) — 1,483 message(s)

Copied as Fuzzy
100% Match
26,735 word(s) — 142,783 character(s) — 1,800 message(s)

90% to 99% Match
4,040 word(s) — 21,915 character(s) — 154 message(s)

80% to 89% Match
2,638 word(s) — 14,396 character(s) — 140 message(s)

0% to 79% Match
1,858 word(s) — 10,012 character(s) — 101 message(s)

There are some pain points, however:

  • It's a manual option, so I need to launch it each time I update documents
  • It takes ages! Around 2h30 to 3h with all the text we have...
  • The similarity threshold cannot be set below 75%. Bad luck, lots of DES translations are at 71% similarity, and thus not copied...

I am looking into how we can improve this process!

LecygneNoir changed title from When updating file in Zanata, trnaslation are removed to When updating file in Zanata, translation are removed 5 years ago
Poster
Owner

Okay, new test for version 370:

  • Create a new translation from the last one (v0.1-ea here); this brings over translated AND reviewed text
  • Upload the new files (this loses everything that is not 100% similar)
  • Run a Merge Translation with:
    • locale fr
    • 75% similarity
    • Use previous versions, with priority to the latest one
    • Copy everything as translated (from other doc, from other context)

With this, we get all reviewed text, translated text that is still similar, and the majority of the texts that have changed.

Everything marked fuzzy needs a review.
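
The merge settings above can be pictured with the following rough sketch. It is an assumption-laden illustration, not Zanata's implementation: `best_tm_match` is a hypothetical function, each previous version is modelled as a plain dict from source text to translation, and `difflib` stands in for Zanata's own similarity measure.

```python
from difflib import SequenceMatcher


def best_tm_match(source, versions, threshold=0.75):
    """Find the best translation for `source` in previous versions.

    `versions` is a list of dicts (source text -> translation), newest first.
    Returns (score, translation), or None if nothing reaches the threshold.
    """
    best = None
    for memory in versions:
        for tm_source, translation in memory.items():
            score = SequenceMatcher(None, source, tm_source).ratio()
            # Strict ">" keeps the newest version when two scores are equal.
            if score >= threshold and (best is None or score > best[0]):
                best = (score, translation)
    return best


versions = [
    {"abc def": "<translation from latest version>"},
    {"abc def": "<translation from older version>"},
]
print(best_tm_match("abc def", versions))
print(best_tm_match("zzz", versions))
```

Anything below the threshold returns `None`, which corresponds to the strings that stay untranslated after the merge.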

Poster
Owner

Tested with version 382, it does the job.

Can be closed, details are in other tickets.

LecygneNoir closed this issue 5 years ago