Plagiarism Scanner as translation resource?
Thread poster: Noe Tessmann
Noe Tessmann
Austria
English to German
May 13, 2008

Dear colleagues,

I have always wondered whether plagiarism-detection software could be used for translation purposes. Many authors copy parts of their texts from the internet, and sometimes it would be helpful to know, before you start working, whether translations already exist in your language, for example on the EU website.

It should work like any CAT tool looking for fuzzy matches. The best thing would be a rough online alignment, but this seems to be too good to be true.


Does anyone use such software and, if so, what are your experiences?


Thanks in advance

Noe


[Edited at 2008-05-13 13:38]
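
A minimal sketch of the fuzzy-match idea from the post above, using Python's standard difflib module. The function name, the 0.75 threshold and the sample sentences are illustrative assumptions; this is not how any particular CAT tool or plagiarism scanner scores its matches.

```python
from difflib import SequenceMatcher


def fuzzy_matches(segment, candidates, threshold=0.75):
    """Return (score, candidate) pairs at or above the threshold, best first."""
    scored = []
    for candidate in candidates:
        # ratio() is a similarity between 0.0 and 1.0 based on matching characters
        score = SequenceMatcher(None, segment.lower(), candidate.lower()).ratio()
        if score >= threshold:
            scored.append((score, candidate))
    return sorted(scored, reverse=True)


# Hypothetical source segment and hypothetical texts found on the web
segment = "Flexicurity combines flexibility and security in the labour market."
found_online = [
    "Flexicurity combines flexibility with security in the labour market.",
    "The weather was fine yesterday.",
]
for score, text in fuzzy_matches(segment, found_online):
    print(f"{score:.2f}  {text}")
```

Only the near-identical sentence passes the threshold here. Finding the candidate texts on the web, and then their translations, is the hard part and is not covered by this sketch.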


 
Attila Piróth
France
Member
English to Hungarian
Wordfast May 13, 2008

Noe Tessmann wrote:

The best thing would be a rough online alignment, but this seems to be too good to be true.



Check out the Very Large Translation Memory (VLTM) project: http://www.wordfast.net/index.php?whichpage=jobs&lang=engb

Attila


 
Jack Doughty
United Kingdom
Russian to English
In memoriam
Exam cheats May 13, 2008

I don't know the answer to this, but there were some stories in the UK press a few months ago about examiners using software to check exam papers, degree theses etc. for plagiarism.

 
Noe Tessmann
Austria
English to German
TOPIC STARTER
Inspiration May 13, 2008

Hello,

I am thinking of situations where you realise at the end of your translation that large paragraphs already exist somewhere on the web, which could at least have inspired you.


@ Attila:

I am thinking of machine alignment, not human alignment; asking for a TM would be too much.
Part of the EU website is now accessible through the DGT TM, but does the rest have to be scanned by hand?

Regards

Noe


 
Tony M
France
Member
French to English
SITE LOCALIZER
Google helps a lot! May 13, 2008

Hardly a systematic way of going about it, but I very often find that if I Google a salient chunk of text, I come up with the exact-same document that I'm working on, and once I've found that, it often leads on to other treasures...

 
Noe Tessmann
Austria
English to German
TOPIC STARTER
Help Google further May 13, 2008

Hello Tony,

Yes, but there is software that checks every chunk of your text against Google and tells you whether the author took some pieces from the internet. From experience, you could then usually tell whether there is a chance that something exists in your language.



Noe

[Edited at 2008-05-13 17:14]


 
Lia Fail (X)
Spain
Spanish to English
software May 13, 2008

Noe Tessmann wrote:

yes but there is software

Noe

[Edited at 2008-05-13 17:14]


Noe

No one seems to know exactly what you are talking about, but I've had a similar experience to Tony and discovered plagiarism using my sixth sense and Google. There IS software, publishing houses use it, and if you search Google for "plagiarism software" you will find lots of it, some of it even free.

As a translator I don't use this software; it doesn't seem to fit with what translators are required to do (meaning that I think the editors of publishing houses are more responsible for this kind of "policing").

Yet I do take a principled stand and, difficult though it is, I somehow manage to get out of translating anything that shows evidence of plagiarism.


 
Daina Jauntirans
German to English
Have not run into this May 14, 2008

I have not run into this issue myself in business and financial translation. I could see that this would be useful if authors re-use parts of their own company's materials (that's not plagiarism, though). Often they do this, but don't mention which parts of a report or document may have been taken from other material. In that case, it would be useful to find the other document and a possible translation, since consistency is important.

On a different note, a friend of mine teaches online classes and is required to run all student work through plagiarism software. She has caught people this way.

[Edited at 2008-05-14 00:54]



 
hazmatgerman (X)
English to German
Doughty post May 14, 2008

The British publication referred to was The Economist, AFAIK. You may be able to use their website to get hold of the article and start from there. Otherwise let me know and I will go through my issues. Good luck.

 
Noe Tessmann
Austria
English to German
TOPIC STARTER
Results of a test with plagiarism finder May 14, 2008

Hello,

Sorry for not being clear enough. Here's the result for a two-page text about flexicurity.
Some of the links seem useful, especially those from the EU website, but I am not willing to pay 50 euros for the full version of plagiarism finder.


I'll do some more tests.

Regards

Noe


BIBLIOGRAPHY
--------------------------------------------------------------------------------

All sources with a match of at least 100 characters are shown:

http://ec.europa.eu/employment_social/calls/2007/vt_2007_016/tenderspecs_de.pdf # 171 characters
http://ec.europa.eu/employment_social/employment_strategy/flex_meaning_en.htm # 154 characters
http://www.springerlink.com/index/6LCFUTWCC6HM5QWE.pdf # 140 characters
http://www.ingentaconnect.com/content/sage/j279/2003/00000025/00000005/art00012 # 140 characters
http://www.mt-archive.info/LREC-2000-Cucchiarini.pdf # 140 characters
http://64.233.183.104/search?q=cache:e-u5lKJH4XEJ:www.mt-archive.info/LREC-2000-Cucchiarini.pdf%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=2&gl=de # 140 characters
http://www.tcd.ie/Germanic_Studies/localpages/bsg/texts/course/bsghandbook03.htm # 140 characters
http://www.wesc.ac.uk/tc/te-wcrwesc.pdf # 140 characters
http://files.idiominc.com/Globalization2020-MultilingualComputing.pdf # 140 characters
http://64.233.183.104/search?q=cache:JCLHLMo6R_IJ:files.idiominc.com/Globalization2020-MultilingualComputing.pdf%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=5&gl=de # 140 characters
http://64.233.183.104/search?q=cache:B-bifCymmGYJ:www.ist-world.org/ProjectDetails.aspx?ProjectId=38d58da3d3234f4bb917e249266a319e%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=1&gl=de # 140 characters
http://64.233.183.104/search?q=cache:tZKuXzj8a8sJ:language123.com/l/working_as_a_translator.html%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=7&gl=de # 140 characters
http://language123.com/l/working_as_a_translator.html # 140 characters
http://64.233.183.104/search?q=cache:rZDzGFnamhYJ:ec.europa.eu/translation/reading/articles/pdf/2001_03_30_brussels_goetschalckx.pdf%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=8&gl=de # 140 characters
http://ec.europa.eu/translation/reading/articles/pdf/2001_03_30_brussels_goetschalckx.pdf # 140 characters
http://64.233.183.104/search?q=cache:YJqrSGHEc6AJ:www.tcd.ie/Germanic_Studies/localpages/bsg/texts/course/bsghandbook03.htm%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=6&gl=de # 140 characters
http://64.233.183.104/search?q=cache:ca4idSMoFkIJ:www.wesc.ac.uk/tc/te-wcrwesc.pdf%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=9&gl=de # 140 characters
http://www.ist-world.org/ProjectDetails.aspx?ProjectId=38d58da3d3234f4bb917e249266a319e # 140 characters
http://64.233.183.104/search?q=cache:WLzQSvKrU7UJ:www.air.org/news/documents/AERA2005Test%20Translation%20Advantages.pdf%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=10&gl=de # 140 characters
http://www.air.org/news/documents/AERA2005Test%20Translation%20Advantages.pdf # 140 characters
http://ec.europa.eu/employment_social/news/2007/jun/flexicurity_en.pdf # 104 characters
http://papers.ssrn.com/sol3/Delivery.cfm/SSRN_ID1118725_code983355.pdf?abstractid=1118725&mirid=1 # 104 characters


--------------------------------------------------------------------------------
RESULT OF THE EXAMINATION
--------------------------------------------------------------------------------

number of words in document: 736
therefrom examined words: 98
therefrom congruent words found in the Internet: 90

record length: 7
increment: 50

so that 13 % of all words have been examined
a total of 12 % congruent words have been found in the Internet
from all examined words this is a total of 92 % congruent words found in the Internet
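
One possible reading of "record length 7, increment 50" (an assumption, since the tool's documentation is not quoted in this thread) is that the scanner looks up a 7-word phrase roughly every 50 words. The 98 examined words equal 14 × 7, which fits that reading. The sketch below shows such a sampling scheme and re-derives the three percentages from the reported counts; `sample_phrases` and `percent` are hypothetical helper names, not part of the tool.

```python
def sample_phrases(words, record_length=7, increment=50):
    """Pick a record_length-word phrase at every increment-th word (assumed scheme)."""
    phrases = []
    for start in range(0, len(words), increment):
        phrase = words[start:start + record_length]
        if len(phrase) == record_length:  # skip a truncated phrase at the end
            phrases.append(" ".join(phrase))
    return phrases


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Counts taken from the report above
total_words, examined, congruent = 736, 98, 90
print(percent(examined, total_words))   # 13: share of the document examined
print(percent(congruent, total_words))  # 12: share of the document found online
print(percent(congruent, examined))     # 92: share of examined words found online
```

Starting at word 0, this sketch would take 15 phrases from a 736-word text, one more than the 14 implied by the report, so the tool evidently positions or counts its samples slightly differently. Either way, only about one word in eight is actually looked up, which is worth keeping in mind when reading the "92 %" figure.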


 
FarkasAndras
English to Hungarian
interesting idea May 14, 2008

I just google characteristic bits and go from there. (One job contained a very strange word which is not in any dictionary I have access to. I googled it and there was only one hit... the document that the bits making up my job came from. The site contained the official translation as well.)

Automating this could come in handy.

I just had a look and there are a couple of free programs, like this one: http://www.plagiarism.phys.virginia.edu/

No idea if any of them are any good, never tried any.


The "rough online alignment" bit is unlikely, especially as you need to find a translation as well as your original document.
But I just did a sizable alignment project (73,000 TUs) and can vouch for hunalign. If you have a word-pair dictionary for your language pair and the originals are mostly in sync, it does a remarkable job even without human intervention. Most of the time it correctly detected when one of the texts had 5 paragraphs missing!
With precisely formatted and exactly matching input texts like the Europarl corpus, you just throw the text at it and get a 99.9% correct automatic alignment.
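
To illustrate why a word-pair dictionary helps with automatic alignment, here is a toy sketch that scores a candidate sentence pair by dictionary overlap. It is an assumption-laden illustration only and is not hunalign's actual algorithm, which is considerably more involved; the function names and the tiny English-German dictionary are hypothetical.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def dictionary_overlap(source_sentence, target_sentence, dictionary):
    """Share of source words with at least one dictionary translation in the target."""
    source_words = tokenize(source_sentence)
    target_words = set(tokenize(target_sentence))
    if not source_words:
        return 0.0
    hits = sum(
        1 for word in source_words
        if dictionary.get(word, set()) & target_words
    )
    return hits / len(source_words)


# Hypothetical toy English-German dictionary: word -> set of translations
toy_dictionary = {
    "the": {"der", "die", "das"},
    "house": {"haus"},
    "is": {"ist"},
    "green": {"grün"},
}
print(dictionary_overlap("The house is green.", "Das Haus ist grün.", toy_dictionary))
print(dictionary_overlap("The house is green.", "Der Hund schläft.", toy_dictionary))
```

The true translation scores far higher than the unrelated sentence, which is the kind of signal an aligner can use to decide which sentences belong together, or to notice that a stretch of text is missing on one side.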


 
Allesklar
Australia
English to German
iMacros May 15, 2008

If you are inclined and able, iMacros could be of some help with this.

I haven't looked into it properly yet, but the commercial version also has a scripting interface which could theoretically be used to link a Google search macro to an alignment application.

Don't know whether the potential benefits would justify the trouble, but certainly interesting to play around with.


 
Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
Yes, but you can't use the translation, can you? May 15, 2008

Noe Tessmann wrote:
...and sometimes it would be helpful to know if there are existing translations in your language for example on the EU website before you start working.


What good would it do you if you find an existing translation? You can't use someone else's translation except for terminological purposes. Copyright in existing translations belongs to their respective translators, even if the client owns the copyright or has a licence to use the source text.


 
Alex Eames
English to Polish
Good point Samuel May 15, 2008

Samuel Murray wrote:

Noe Tessmann wrote:
...and sometimes it would be helpful to know if there are existing translations in your language for example on the EU website before you start working.


What good would it do you if you find an existing translation? You can't use someone else's translation except for terminological purposes. Copyright in existing translations belongs to their respective translators, even if the client owns the copyright or has a licence to use the source text.


Good point! If you then use this to plagiarise the text you have found, this software might be used against you, and you end up with egg on your face at best and a breach-of-copyright lawsuit at worst. (Of course, whether or not you get caught depends entirely on the use to which your translation will be put, but the question remains: is it morally acceptable to copy large chunks of text without permission?)

I remember a situation once where we were translating a long document issued by the Polish State Treasury. It was a prospectus showing the tendering/bidding procedures for building power stations. By coincidence, we found out through an associate that they were translating the same document for another client. Happy days. We pooled resources and shared the work. No copying issues in this case, but it was somewhat exceptional.

Alex Eames
http://www.translatortips.com/
helping translators do better business


 
Noe Tessmann
Austria
English to German
TOPIC STARTER
I am a plagiarist May 15, 2008

Hello,

Thanks for your input, but I am a plagiarist: I don't reinvent phrases like "turn the screw to the left". I borrow formulations from Wikipedia and even from Windows glossaries. I know you're the good guys; you would never do this.

Actually, I translate a lot of EU-related texts, so I even have to copy entire paragraphs from white or green papers, press releases, communications, quotations and so on. I am soooo evil. Sometimes they're hard to find, and a plagiarism detector would be helpful for me.

Kind regards


Noe