How does your CAT tool handle entity references?
Thread poster: tz7
tz7
United States
Local time: 05:16
English to Japanese
Apr 2, 2020

Entity references for particular symbols, like &quot;, &nbsp;, and &amp;, used in HTML or XML source should be shown as actual characters, like ", [space], and &, when you are translating (&nbsp; may be shown differently from a regular space in some CAT tools), so that you can see what they are and, if necessary, change or remove them in your translation. In the resulting target file, these entity references need to be kept as is unless you changed or removed them. I guess many popular CAT tools work like this: Trados, memoQ, Smartling, Memsource... except XTM.
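
To make the expected behaviour concrete, here is a minimal sketch in Python of the round trip described above: decode entity references for display, then re-encode them when writing the target. The function names `to_display` and `to_target` are hypothetical and are not taken from any CAT tool. The sketch also assumes that a non-breaking space is written back as the numeric reference &#160;, which is valid in both HTML and XML, so it normalises rather than preserving the exact original spelling.

```python
import html
from xml.sax.saxutils import escape


def to_display(raw_segment: str) -> str:
    """Decode entity references so the translator sees real characters."""
    return html.unescape(raw_segment)


def to_target(translated: str) -> str:
    """Re-encode special characters when writing the target file.

    escape() handles &, < and >; the extra mapping also encodes double
    quotes. The non-breaking space (U+00A0) is written as a numeric
    reference (an assumption of this sketch).
    """
    encoded = escape(translated, {'"': "&quot;"})
    return encoded.replace("\u00a0", "&#160;")


raw = "&quot;Save&quot;&nbsp;&amp; close"
shown = to_display(raw)        # what the translator would work with
written = to_target(shown)     # what would go into the target file
```

A translator who deletes the non-breaking space from `shown` simply gets a target string without the reference, which is the behaviour asked for above.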

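Handling these entities correctly is very important for translators. Otherwise, you have to deal with strings like the one below, where every entity reference appears in raw form:
    [nbsp entity][nbsp entity][nbsp entity][nbsp entity]&lt;RecipientStatuses&gt;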

How does your CAT tool handle these entities?

[Edited at 2020-04-02 22:57 GMT]

[Edited at 2020-04-02 22:58 GMT]


 
Stepan Konev
Russian Federation
Local time: 12:16
English to Russian
XTM Apr 3, 2020

Why “except XTM”?
Both XTM and Smartling insert a special tag for nbsp, unlike Memsource, memoQ and Trados, which use a regular character (an “invisible” degree sign).


 
Jorge Payan
Colombia
Local time: 04:16
Member (2002)
German to Spanish
DéjaVu DVX3 Apr 3, 2020

It allows exporting special characters either as entities or directly as special characters. The XML filter has to be configured for this purpose.
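
As a rough illustration of what such an export switch does (this is not Déjà Vu's actual filter, and `export_text` with its `as_entities` flag is a hypothetical name), a Python sketch might look like this. The markup characters &, < and > are always escaped; the flag only decides whether the remaining non-ASCII characters are written directly or as numeric character references.

```python
from xml.sax.saxutils import escape


def export_text(text: str, as_entities: bool) -> str:
    """Serialise segment text for an XML target file."""
    escaped = escape(text)  # &, < and > must be escaped in either mode
    if as_entities:
        # Replace every non-ASCII character with a numeric reference.
        return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")
    return escaped


as_refs = export_text("Café & thé", as_entities=True)
as_chars = export_text("Café & thé", as_entities=False)
```

Both outputs are well-formed XML text; which one is right depends on what the downstream build or the client expects.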

 
Samuel Murray
Netherlands
Local time: 10:16
Member (2006)
English to Afrikaans
@TZ7 Apr 3, 2020

tz7 wrote:
Entity References for particular symbols, like &quot;, &nbsp;, and &amp;, used in HTML or XML source, should be shown as actual characters, like ", [space], and & when you are translating.


1. Using the full-width ampersand instead of the generic ampersand in the forums, to prevent the forum software from interpreting it, is a neat trick.

2. Yes, I agree that that would be what I would have assumed, if the source text is XML or an XML-like format.

... except XTM 


Are you speaking as a project manager or a translator? If the latter, do you have access to the source file, so that you can verify that the source says e.g. &quot; and not perhaps &amp;quot;? Or: are you sure the source text is an HTML or XML file, and not e.g. a Word file with XML-like text in it?


 
Samuel Murray
Netherlands
Local time: 10:16
Member (2006)
English to Afrikaans
@TZ7 II Apr 3, 2020

Stepan Konev wrote:
Both XTM and Smartling insert a special tag for nbsp ...


Smartling does so by default for XML files, but it can be disabled by the project administrator:
https://help.smartling.com/hc/en-us/articles/360008000893-XML

Smartling does support DOCX, but I don't know what it does with non-breaking spaces in DOCX files (the help files don't say). What I usually work on in Smartling are localisation files, so I imagine special rules might apply to such types of text, but I can't be certain what the actual file types are.

I confirm that I have seen non-breaking spaces displayed as an "sp" tag in XTM, but I don't know for which file formats those were. I was not able to find an XTM user manual page on this.


 
tz7
United States
Local time: 05:16
English to Japanese
TOPIC STARTER
RE: XTM Apr 3, 2020

Stepan Konev wrote:

Why “except XTM”?
Both XTM and Smartling insert a special tag for nbsp, unlike Memsource, memoQ and Trados, which use a regular character (an “invisible” degree sign).


Yes, by default XTM represents a non-breaking space as a red inline tag. If you want, as I do in Japanese, you can remove these tags from your translation. The issue was that XTM automatically changed non-breaking spaces (U+00A0) in XML to &nbsp;, and this broke our build process, because &nbsp; is illegal in XML (it is not one of XML's predefined entities). The solution XTM provided was to represent all entities as-is. So when translating, translators see the raw entity references (four non-breaking-space references followed by "&lt;RecipientStatuses&gt;") instead of "{sp}{sp}{sp}{sp}<RecipientStatuses>". Well... at least we got valid target XML files. But, as expected, translators complained. The other solution offered was to represent all entities, including non-breaking spaces, as inline tags, which you can't remove from your translation. We couldn't accept this second solution, so translators currently see the raw entity references.
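
For anyone who wants to reproduce the build failure, here is a small Python sketch using the standard library parser. It assumes the target file has no DTD declaring nbsp; the helper names `parses_as_xml` and `use_numeric_nbsp` and the wrapper element `seg` are made up for the example. It shows that the named reference is rejected while the numeric one is accepted, and that swapping one for the other is enough to make the fragment parse again.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment is well-formed inside a wrapper element."""
    try:
        ET.fromstring(f"<seg>{fragment}</seg>")
        return True
    except ET.ParseError:
        # Raised for an undefined entity such as &nbsp; when no DTD declares it.
        return False


def use_numeric_nbsp(fragment: str) -> str:
    """Replace the HTML-only named reference with the numeric one."""
    return fragment.replace("&nbsp;", "&#160;")


broken = "&nbsp;&nbsp;&lt;RecipientStatuses&gt;"
ok_numeric = parses_as_xml("&#160;&#160;&lt;RecipientStatuses&gt;")
ok_named = parses_as_xml(broken)
ok_repaired = parses_as_xml(use_numeric_nbsp(broken))
```

A check like this could run on exported target files before they reach the build, though whether that fits a given workflow is a separate question.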

I work as a translator for Japanese and as a PM for other languages. I have access to both source and target files. We don't translate any Word files in XTM.

Thank you everyone for your comments!

