https://arm.proz.com/forum/translator_resources/278962-language_identifiers.html

Language Identifiers
Thread poster: Erwin_Franz
Erwin_Franz
Latvia
Local time: 23:04
Russian to Latvian
Dec 10, 2014

Dear colleagues,

Do you know any language identifying tools for processing tmx files?


 
Jack Doughty  Identity Verified
United Kingdom
Local time: 21:04
Russian to English
In memoriam
Polyglot Dec 11, 2014

I don't know anything about .tmx files, but I use a language identifier called Polyglot 3000.
http://www.polyglot3000.com/

[Edited at 2014-12-11 08:48 GMT]

[Edited at 2014-12-11 08:49 GMT]


 
FarkasAndras  Identity Verified
Local time: 22:04
English to Hungarian
Interesting Dec 11, 2014

I didn't know there were GUI language ID tools.
I've played a little with a Perl module that does this:
http://search.cpan.org/~ambs/Lingua-Identify-0.56/lib/Lingua/Identify.pm
It seems to work pretty well.

If you need to do this automatically on a large number of files, I may be able to write a Perl script to do it.

[Edited at 2014-12-11 11:55 GMT]
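
To give a rough idea of what guessing a language from the text itself can look like, here is a minimal Python sketch. It is not how Lingua::Identify or Polyglot 3000 work; it is just a naive stopword count. The word lists and the function name `guess_language` are made up for illustration and are far too small for real use.

```python
import re

# Tiny illustrative stopword lists (assumption: a real tool would use much
# larger lists or character n-gram statistics).
STOPWORDS = {
    "en": {"the", "and", "of", "to", "is", "in", "that", "it"},
    "fr": {"le", "la", "les", "et", "de", "est", "un", "une", "que"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein", "zu"},
}


def guess_language(text):
    """Return the code whose stopwords occur most often in text, or None."""
    # Split into lowercase runs of letters (digits and punctuation are dropped).
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(1 for word in words if word in stopwords)
        for lang, stopwords in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)  # on a tie, the first language listed wins
    return best if scores[best] > 0 else None
```

Short segments will often be misclassified by something this simple, so in practice it would make more sense to pool all segments of one language column before guessing.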


 
Rolf Keller
Germany
Local time: 22:04
English to German
The question is ?? Dec 11, 2014

In principle, any .tmx file includes language identifiers. So, what is the question?

 
FarkasAndras  Identity Verified
Local time: 22:04
English to Hungarian
Mislabeled Dec 12, 2014

I assumed that the tmx files in question have incorrect or missing language identifiers.
On second thought, it may well be a case of having a bunch of tmx files (potentially hundreds or even thousands) and needing to sort them, e.g. find all the en-fr files among the lot based on the language codes. That could also be automated with software. It would be easier than recognizing them based on the text itself.
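
Sorting by the declared language codes could be sketched roughly like this in Python. It assumes the files follow the usual TMX layout, where each `<tuv>` element carries an `xml:lang` attribute (older TMX versions used a plain `lang` attribute, which is checked as a fallback). The function names `tmx_languages` and `group_by_languages` are made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path

# ElementTree exposes xml:lang under the standard XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_languages(path):
    """Return a sorted tuple of the lowercase language codes found on <tuv> elements."""
    langs = set()
    # iterparse streams the file, so large TMX files need not fit in memory.
    for _, elem in ET.iterparse(str(path), events=("end",)):
        if elem.tag == "tuv":
            code = elem.get(XML_LANG) or elem.get("lang")
            if code:
                langs.add(code.lower())
        elif elem.tag == "tu":
            elem.clear()  # free the finished translation unit
    return tuple(sorted(langs))


def group_by_languages(folder):
    """Map each language combination to the names of the .tmx files that use it."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).glob("*.tmx")):
        groups[tmx_languages(path)].append(path.name)
    return dict(groups)
```

Calling `group_by_languages("some_folder")` returns a dictionary keyed by language combination, so all en-fr files end up under one key as long as they use the same codes. Files that mix regional variants such as `en-US` and `en-GB` would land in separate groups unless the codes are normalized first.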


 

