Language Identifiers
Thread poster: Erwin_Franz
Erwin_Franz
Latvia
Russian to Latvian
Dec 10, 2014

Dear colleagues,

Do you know of any language identification tools for processing TMX files?


 
Jack Doughty
United Kingdom
Russian to English
In memoriam
Polyglot Dec 11, 2014

I don't know anything about .tmx files, but I use a language identifier called Polyglot 3000.
http://www.polyglot3000.com/

[Edited at 2014-12-11 08:48 GMT]

[Edited at 2014-12-11 08:49 GMT]


 
FarkasAndras
English to Hungarian
Interesting Dec 11, 2014

I didn't know there were GUI language ID tools.
I've played a little bit with a Perl module that does this:
http://search.cpan.org/~ambs/Lingua-Identify-0.56/lib/Lingua/Identify.pm
It seems to work pretty well.

If you need to do this automatically on a large number of files, I may be able to write a Perl script to do it.
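
As a rough illustration of how text-based identification can work, here is a minimal Python sketch that counts common function words. It is not the algorithm or API of Lingua::Identify or Polyglot 3000; the word lists and the `guess_language` name are made up for the example, and a real tool would use far larger models.

```python
import re

# Tiny, hand-picked stopword lists (an assumption for illustration only;
# real identifiers use much larger statistical models).
STOPWORDS = {
    "en": {"the", "and", "of", "to", "is", "in", "that", "it"},
    "de": {"der", "die", "und", "das", "ist", "nicht", "ein", "zu"},
    "fr": {"le", "la", "et", "les", "des", "est", "un", "une"},
}


def guess_language(text):
    """Return the code whose stopwords occur most often, or None if none match."""
    words = re.findall(r"\w+", text.lower())
    scores = {
        lang: sum(1 for w in words if w in stops)
        for lang, stops in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(guess_language("The cat is in the garden and it is happy."))
    print(guess_language("Der Hund ist nicht in dem Garten."))
```

Short segments with few function words will often come back as None or be misclassified, so a check like this is best run on several segments per file rather than on a single one.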

[Edited at 2014-12-11 11:55 GMT]


 
Rolf Keller
Germany
English to German
The question is ?? Dec 11, 2014

In principle, every .tmx file includes language identifiers. So what is the question?
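
For reference, a TMX file carries its language codes in two places: the `srclang` attribute of the `<header>` element and the `xml:lang` attribute of each `<tuv>` element (older TMX versions used a plain `lang` attribute). A minimal Python sketch that reads them with the standard library only; the sample content and the function name `tmx_languages` are made up for the example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample file content, for illustration only.
SAMPLE_TMX = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Hello</seg></tuv>
      <tuv xml:lang="fr"><seg>Bonjour</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def tmx_languages(root):
    """Return (srclang, sorted list of tuv language codes) for a parsed TMX tree."""
    header = root.find("header")
    srclang = header.get("srclang") if header is not None else None
    codes = set()
    for tuv in root.iter("tuv"):
        # Fall back to the plain "lang" attribute used by older TMX versions.
        code = tuv.get(XML_LANG) or tuv.get("lang")
        if code:
            codes.add(code.lower())
    return srclang, sorted(codes)


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE_TMX.encode("utf-8"))
    print(tmx_languages(root))
```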

 
FarkasAndras
English to Hungarian
Mislabeled Dec 12, 2014

I assumed that the TMX files in question have incorrect or missing language identifiers.
On second thought, it may well be a case of having a bunch of TMX files (potentially hundreds or even thousands) and needing to sort them, e.g. to find all the en-fr files among the lot based on their language codes. That could also be automated with software, and it would be easier than identifying the languages from the text itself.
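
Sorting by the existing codes could look roughly like the Python sketch below. It assumes the files are well-formed TMX with `xml:lang` (or the older `lang`) on each `<tuv>`, and that region subtags can be ignored, so `en-US` and `en-GB` both count as `en`. The function names are hypothetical, and the folder to scan is passed on the command line.

```python
import sys
from collections import defaultdict
from pathlib import Path
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def languages_in(path):
    """Collect the primary language subtags used on <tuv> elements in one file."""
    codes = set()
    # iterparse streams the file, so large TMs are not loaded into memory at once.
    for _event, elem in ET.iterparse(path, events=("end",)):
        if elem.tag == "tuv":
            code = elem.get(XML_LANG) or elem.get("lang")
            if code:
                # Assumption: drop region subtags, e.g. "en-US" -> "en".
                codes.add(code.lower().replace("_", "-").split("-")[0])
        elif elem.tag == "tu":
            elem.clear()  # free finished translation units
    return tuple(sorted(codes))


def group_by_languages(folder):
    """Map each language combination, e.g. ("en", "fr"), to the files using it."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).rglob("*.tmx")):
        try:
            groups[languages_in(path)].append(path)
        except ET.ParseError:
            # Keep broken files visible instead of silently skipping them.
            groups[("unparseable",)].append(path)
    return dict(groups)


if __name__ == "__main__":
    # Usage: python sort_tmx.py FOLDER [FOLDER ...]
    for folder in sys.argv[1:]:
        for langs, files in group_by_languages(folder).items():
            print("-".join(langs) or "no codes", len(files))
```

Finding all the en-fr files is then a lookup of the `("en", "fr")` key in the returned dictionary. Files that end up under an unexpected key are candidates for the mislabeled case, where the text itself would have to be checked.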


 

