MacBH928

macrumors G3
Original poster
May 17, 2008
I am having difficulty finding such an app. I want to drop in a PDF file, have its text OCRed and translated, and get back a new PDF document that looks like the original but is translated.

I found apps that will OCR a PDF in English, and another that will output the OCRed text to a TXT file. I can then upload that file to Google Translate, which will translate it, but I found nothing that is a one-stop shop for the whole process.

Does anyone know of such a solution? I got UPDF, but it didn't have OCR for Intel Macs. PDF Gear will output the file as TXT. Textify will not translate.
 
There are several solutions to this via the terminal. A few questions: which languages do you want to translate from and to? And since you mention OCR: are you working with PDFs that contain text as images and therefore require the OCR step, or do they contain (e.g. selectable) text that you want to extract (and translate)?
a new translated PDF document similar to the original but translated
You want something which holds the translated text but preserves the original page/document layout?
 
DeepL has a macOS app that can translate documents (pdf, docx, txt), but I couldn't get it to work in Monterey 12.7.2.
I think it fails to properly ask for Accessibility permissions.
DeepL for Mac https://www.deepl.com/en/macos-app/
Translating whole documents with the app https://support.deepl.com/hc/en-us/articles/360020613199-Translating-whole-documents-with-the-app
Free user “PDF (.pdf) 5 MB 100,000 characters”
You can try the online version to get an idea (requires free account registration) https://www.deepl.com/translator/files
 
Thanks for the reply,

1) English -> Arabic

2) The text can be selectable, OR it can come from an image converted to PDF that needs OCR

3) I was 90% successful by using the Textify app and then uploading the PDF to Google Translate.

To demonstrate what I want here is an image:
bbbb.jpeg
 
Thanks, the Google Translate site can do it too, albeit without OCR.
 
The translation to Spanish in the example is quite awful 😂

Otherwise: if you have a Google account, you can upload the PDF to Google Drive and turn it into editable text via 'Open with > Google Docs'. You can then use 'Tools > Translate Document' within Docs to translate the PDF in place. There is support for OCR of images, as @bogdanw already indicated.

Using pdftotext (part of poppler-utils) with the -layout and/or -table flag - or converting to HTML via ebook-convert from Calibre - and then running the result through a translator would be another option.
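
For anyone who wants to try the pdftotext route, here is a minimal Python sketch, not a finished tool. It assumes pdftotext is installed and on your PATH. The helper names (build_pdftotext_cmd, looks_scanned, extract_text) and the 20-character threshold are made up for illustration. It only uses the -layout flag; as far as I know -table comes from the Xpdf build of pdftotext, so check `pdftotext -h` on your machine before relying on it.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftotext_cmd(pdf_path, txt_path, keep_layout=True):
    """Return the argument list for a pdftotext call (nothing is executed here)."""
    cmd = ["pdftotext"]
    if keep_layout:
        cmd.append("-layout")  # try to keep the physical layout of the page
    cmd += [str(pdf_path), str(txt_path)]
    return cmd


def looks_scanned(text, min_chars=20):
    """Rough heuristic (an assumption): almost no text means image-only pages that need OCR."""
    return len(text.strip()) < min_chars


def extract_text(pdf_path):
    """Run pdftotext and return the extracted text; raises if the tool is missing."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler first")
    txt_path = Path(pdf_path).with_suffix(".txt")
    subprocess.run(build_pdftotext_cmd(pdf_path, txt_path), check=True)
    return txt_path.read_text(encoding="utf-8", errors="replace")
```

If looks_scanned(extract_text("file.pdf")) comes back True, the PDF most likely has no text layer and needs an OCR pass first. Otherwise the resulting .txt file can be uploaded to whichever translator you prefer.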

Preserving the layout of a page while changing the language as well as the writing direction, as you intend to do, probably requires (quite) some human intervention. 🙃
 
Google Translate can translate images directly https://translate.google.com/?sl=en&tl=ar&op=images

My man, that's what I was looking for. It's not perfect but close enough. Thanks!

-I did some editing to the Spanish text to get my idea across. I have no idea what the Spanish text says.

-The Google Docs trick works with searchable PDFs, with non-searchable PDFs not so much. I am surprised you figured out this Google Docs workaround.

-I am not a terminal guy. I kind of do not like to install CLI apps because I do not know where the files are installed on my computer, so I can't uninstall them later on.

-I downloaded Calibre for macOS but I got an icon that has a 🚳. I think it doesn't work on Intel Macs.

Thanks for the tips and the help. I appreciate it!
 
The Google Docs trick works with searchable PDFs, with non-searchable PDFs not so much. I am surprised you figured out this Google Docs workaround.

🙃 I read the manual 😁

-I am not a terminal guy. I kind of do not like to install CLI apps because I do not know where the files are installed on my computer, so I can't uninstall them later on.

Most of these tools come with a clean uninstaller. Tesseract (originally from HP, later Google, now free) lets you comfortably OCR PDFs with hundreds of pages of scanned text. Outside the terminal you don't have to look far: Adobe provides another free tool to do the same, and Acrobat Professional does this too, of course.
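
A rough sketch of the Tesseract route in Python, in case it helps. It assumes both pdftoppm (from poppler) and tesseract are installed and on your PATH. Tesseract reads images rather than PDFs, so the pages are rasterized first. The function names, the "page" file prefix and the 300 dpi setting are my own choices for illustration, not anything the tools require.

```python
import subprocess
from pathlib import Path


def build_rasterize_cmd(pdf_path, prefix, dpi=300):
    """pdftoppm writes one PNG per page, named <prefix>-<page number>.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(prefix)]


def build_ocr_cmd(image_path, lang="eng"):
    """tesseract <image> <output base> -l <lang> pdf writes <output base>.pdf with a text layer."""
    image_path = Path(image_path)
    out_base = image_path.with_suffix("")  # tesseract appends .pdf itself
    return ["tesseract", str(image_path), str(out_base), "-l", lang, "pdf"]


def ocr_pdf(pdf_path, workdir, lang="eng"):
    """Rasterize every page, OCR each one, and return the per-page searchable PDFs."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_rasterize_cmd(pdf_path, workdir / "page"), check=True)
    pages = sorted(workdir.glob("page-*.png"))
    for png in pages:
        subprocess.run(build_ocr_cmd(png, lang), check=True)
    return [png.with_suffix(".pdf") for png in pages]
```

Calling ocr_pdf("scan.pdf", "ocr_out") should leave one searchable PDF per page in the ocr_out folder. Merging them back into a single file is a separate step that this sketch does not cover.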
 
Hey, thanks for the tips and tools!
 