Establishing a word count for scanned PDFs
Thread poster: Ashley Wans
Ashley Wans
Ashley Wans  Identity Verified
United States
Local time: 08:55
Spanish to English
+ ...
May 8, 2011

Hi all,

I'm hoping to find out what other translators do about the following issue. Lately, I have been getting a lot of work in the form of scanned PDF files (often these are certificates or diplomas, but a few times I have received longer files this way which, for whatever reason, the client cannot get in any other format). I actually don't mind working with these files, as I have become fairly savvy at converting and formatting them. My issue is determining a word count on which to base what I charge the client. In my language pair, the word count is usually established by the number of source words, but I can't/won't go through and manually count the words in a PDF.

So what do you do when faced with this problem? I know some translators decline to work with scanned PDFs because they can be problematic to format, but I'm happy to keep doing it as long as I have a better method of determining how much to charge the client.

Thanks for your input.


 
Germaine
Germaine  Identity Verified
Canada
Local time: 11:55
English to French
+ ...
A search in the forums... May 8, 2011

will give you instant answers to this question as it has been discussed very often.

 
Ashley Wans
Ashley Wans  Identity Verified
United States
Local time: 08:55
Spanish to English
+ ...
TOPIC STARTER
What search terms did you use? May 8, 2011

Germaine wrote:

will give you instant answers to this question as it has been discussed very often.



I actually didn't pull up what I was looking for when I searched. What terms did you use to find the other threads about this?


 
Henry Hinds
Henry Hinds  Identity Verified
United States
Local time: 09:55
English to Spanish
+ ...
In memoriam
Count May 8, 2011

Sometimes it is possible to copy a .pdf and stick it into an MSWord file and get a count, and sometimes not. Many times I am faced with paper documents or a .pdf, so I cannot say I can't or won't go through and manually count the words in a PDF. I just do it, by averaging words by line and then counting lines.

That is just part of what we do, so you just do it. It also may be reflected in your charges.
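For anyone who wants to see the arithmetic, here is a minimal sketch of the line-averaging estimate described above. The function name and the sample figures are made up for illustration; the idea is simply to count the words on a few representative lines, average them, and multiply by the total number of lines.

```python
def estimate_word_count(sample_line_word_counts, total_lines):
    """Estimate a document's word count from a few hand-counted lines.

    sample_line_word_counts: words counted on each sampled line
    total_lines: number of text lines in the whole document
    """
    if not sample_line_word_counts:
        raise ValueError("count the words on at least one line first")
    average_per_line = sum(sample_line_word_counts) / len(sample_line_word_counts)
    return round(average_per_line * total_lines)


# Hypothetical figures: five sampled lines in a document of 120 lines.
sampled = [11, 12, 10, 13, 12]
print(estimate_word_count(sampled, 120))
```

The estimate is only as good as the sample, so it works best when the lines are of similar length.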


 
Lesley Clarke
Lesley Clarke  Identity Verified
Mexico
Local time: 09:55
Spanish to English
I always convert the pdf May 8, 2011

I always use the word count I get from a rough conversion of the PDF. It is usually not too dreadfully different from a proper conversion.

 
TargamaT team
TargamaT team  Identity Verified
France
Local time: 16:55
Member (2010)
English to Arabic
+ ...
Adobe New Service "Export PDF to Word" May 8, 2011

https://exportpdf.acrobat.com/convert-pdf-to-word.html

This will OCR documents and then export them to Word.

You can use Anycount for OCRed PDF files...

If you have Adobe Acrobat (full version/not Reader), you can OCR PDFs then apply Anycount...

Rgds,

Oussama


 
Diana Coada (X)
Diana Coada (X)  Identity Verified
United Kingdom
Local time: 15:55
Portuguese to English
+ ...
Wordfast Anywhere May 8, 2011

can OCR the scanned pdf and it exports the file in .rtf so you will have a wordcount of the original.

 
Edward Vreeburg
Edward Vreeburg  Identity Verified
Netherlands
Local time: 16:55
Member (2008)
English to Dutch
+ ...
another (VERY EASY) option May 9, 2011

Invoice based on the TARGET word count, which is easily verifiable by the client, although the client does not get an accurate upfront quote...
Depending on your language combination, this may be advantageous to you (however, English is pretty high on the word count side).


Ed


 
Ashley Wans
Ashley Wans  Identity Verified
United States
Local time: 08:55
Spanish to English
+ ...
TOPIC STARTER
Estimation not a problem? May 9, 2011

Thanks, everyone, for your input on this.

I initially assumed that a (close to) exact word count was always needed. But what I gather from all of the feedback thus far is that estimation, when based on a conversion or line count, is a perfectly fine alternative?


 
John Fossey
John Fossey  Identity Verified
Canada
Local time: 11:55
Member (2008)
French to English
+ ...
Agree first May 9, 2011

Ashley Wans wrote:

Thanks, everyone, for your input on this.

I initially assumed that a (close to) exact word count was always needed. But what I gather from all of the feedback thus far is that estimation, when based on a conversion or line count, is a perfectly fine alternative?


The main thing is to agree in advance on the price, or at least on how it will be calculated, so there are no surprises after it's done. How you arrive at that price is less important than agreeing on it.


 
DIANNE BEREST
DIANNE BEREST  Identity Verified
Montenegro
Local time: 16:55
Spanish to English
+ ...
Simple online OCR and a way to split PDFs Jun 24, 2019

First, years later, adding my thanks to those who answered this question. In case it's helpful for anyone, https://ocr.space/ allows you to extract text from a PDF online, providing the text in a box that you can copy and paste into a Word document to get a word count. (There are other options, but this is the one I used.) The limit is 5MB. For longer PDF documents, this page explains how to split them by opening them in Google Chrome; you can then upload the parts to ocr.space to get the full word count: https://www.wikihow.com/Split-PDF-Files. The text provided by the OCR, by the way, was not perfect. (The results are probably more exact if you use one of their other options. This was the fastest option.) My document was complicated, with lots of numbers and tables, but for a pretty close word count, it was perfect.
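If you would rather not paste the OCR output into Word, a rough count can also be taken directly from the extracted text. The sketch below is only an illustration, not what any of the tools mentioned here do internally: the function name and the sample text are made up, and it assumes a "word" is any whitespace-separated token containing at least one letter or digit, so stray OCR marks such as "|" or "-" are not counted.

```python
def count_words(text):
    """Count whitespace-separated tokens that contain a letter or digit."""
    tokens = text.split()
    return sum(1 for token in tokens if any(ch.isalnum() for ch in token))


# Hypothetical OCR output with a couple of stray marks from the scan.
ocr_text = """UNIVERSIDAD NACIONAL |
otorga a - Juan Pérez
el título de Licenciado"""

print(count_words(ocr_text))
```

Word may count punctuation-only tokens as words, so its figure can come out slightly higher than this one on noisy OCR text.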


 
Colleen Roach, PhD
Colleen Roach, PhD  Identity Verified
United States
Local time: 08:55
French to English
+ ...
yep; averaging words per line +counting --can work Jun 24, 2019

Henry Hinds wrote:

Many times I am faced with paper documents or a .pdf, so I cannot say I can't or won't go through and manually count the words in a PDF. I just do it, by averaging words by line and then counting lines.

Yes, this is what I have done generally. Not that time consuming really, even for a long document, providing the lines are approximately the same size (no. of characters). The problem with things like diplomas, etc. (which she cited) is that the lines may well be of varying sizes so you can't do a meaningful average. And, as someone mentioned, it's always worth a try copying and pasting from a PDF into a Word doc -- sometimes it works and sometimes it doesn't. (I haven't yet figured out why and when it works).


 

