We're a Windows 7 Enterprise (32-bit) SP1 shop running Office 2007 SP2. When we copy text from PDFs and paste it anywhere (a web application, Word, Outlook, etc.), the pasted text is not in the original language but is instead random English letters and characters.
I'm working with a handful of PDFs in various languages (English, Japanese, Chinese, Korean and Russian). Most, if not all, of the PDFs have English on the first 2-3 pages, and the rest of the document is in the foreign language. All of the PDFs appear to have been OCR'd to some degree, since I can select text in them:
- On the English pages I can select, copy & paste the English text & numbers etc without issue.
- On the pages with the foreign languages, I can select, copy & paste the English text, numbers & characters (like !, &, $ etc) without issue.
- On the pages with the foreign languages, I can select and copy the foreign-language text (words, or characters in the case of the Asian languages), but the copied text, as seen in the clipboard or when pasting, is incorrect.
To expand on the third item above: when copying sections of the document in the foreign language (be it one or more words, characters, or sentences), the pasted text is in random English characters, not the original language. For example:
We copy this Japanese text from the PDF: これは私のテストです
When we paste it, we get this text instead: XAM E§ 1& OD Tz.Q C AtIl.aztrl 2
In fact, when we repeat the process and open an Office application to view the clipboard, it shows the same thing: random characters, letters and numbers.
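One way to confirm what actually lands on the clipboard is to count which Unicode scripts the pasted string contains. The following is a minimal Python sketch, not something we have run in our environment; `script_profile` and the `SCRIPTS` tuple are hypothetical names introduced here for illustration, and the only library used is the standard `unicodedata` module.

```python
import unicodedata
from collections import Counter

# Script prefixes as they appear at the start of Unicode character names.
# This list is an assumption chosen to cover the languages in our PDFs.
SCRIPTS = ("HIRAGANA", "KATAKANA", "CJK", "HANGUL", "CYRILLIC", "LATIN")


def script_profile(text):
    """Count the non-space characters in text by Unicode script prefix."""
    counts = Counter()
    for ch in text:
        if ch.isspace():
            continue
        # unicodedata.name() returns "" here for unnamed code points.
        name = unicodedata.name(ch, "")
        for script in SCRIPTS:
            if name.startswith(script):
                counts[script] += 1
                break
        else:
            # Digits, punctuation, symbols and anything not listed above.
            counts["OTHER"] += 1
    return counts


copied = "これは私のテストです"               # what we select in the PDF
pasted = "XAM E§ 1& OD Tz.Q C AtIl.aztrl 2"   # what we get when pasting

print("copied:", dict(script_profile(copied)))
print("pasted:", dict(script_profile(pasted)))
```

For the example above, the copied string should come back as hiragana, katakana and one CJK ideograph, while the pasted string should come back as Latin letters plus digits and punctuation. That would show the characters themselves are different on the clipboard, rather than the right characters being displayed in the wrong font.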
We've tried opening the documents with Reader X (10.1.0) and XI (11.0.0), with and without the Japanese Font Packs (X, XI), as well as with Nuance PDF Converter Enterprise 7.0 and 8.1.1, but the result is the same.
After seeing these results a few times, we added the Japanese keyboard, enabled the language bar, and switched the language both in the application we were copying from (i.e., Adobe, Nuance) and in the application we were pasting into (e.g., Word, Internet Explorer). No change. I even went as far as adding the Japanese display language (Control Panel > Region & Language > Keyboards and Languages tab > Install/uninstall languages...), but that did not seem to help either.
Is this a problem unique to these PDFs?
Is this a Windows [configuration] problem?
Do we need something else in order to be able to do this?