anandds

macrumors newbie
Original poster
May 8, 2009
Hi,

We have a client-server application where the client runs on Mac and the server runs on Windows.

The client sends some Mac file pathnames to the server, and the server displays them to the user. I am having problems sending the high-ASCII characters.

For example, I send the HIRAGANA LETTER BO ぼ from the Mac system, but in Windows the server displays this as HIRAGANA LETTER HO ほ followed by a junk character (looks like a dot in the UI).

The Mac client UI displays this correctly; it's only the Windows display that has the problem.

Any help would be great.

Thanks,
Anand
 
Are you sure they are using ASCII? Seems like Unicode is much more likely. I imagine the Mac is sending as UTF-16. What is the server interpreting it as?
 
There's no such thing as "high ASCII". ASCII is a 7-bit code. Anything outside the range 0x00-0x7F isn't ASCII.
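
To make the 7-bit point concrete, here is a small Python sketch (Python is used only for illustration, and the helper name `is_ascii` is made up). It simply checks whether every code point in a string falls inside 0x00-0x7F:

```python
def is_ascii(text: str) -> bool:
    """Return True only if every code point is in the 7-bit ASCII range."""
    return all(ord(ch) <= 0x7F for ch in text)


print(is_ascii("report.txt"))  # True: plain Latin letters and a dot
print(is_ascii("\u307c"))      # False: HIRAGANA LETTER BO is U+307C
```

Anything that fails this check needs a real encoding (UTF-8, UTF-16, ...) agreed on by both ends.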


HFS+ filenames are stored on disk in "a form very nearly the same as Unicode Normalization Form D (NFD)":
http://en.wikipedia.org/wiki/HFS_Plus

Look at the article, read the links, and then refer to the references.

One of the references will be this table:
http://developer.apple.com/mac/library/technotes/tn/tn1150table.html

You will almost certainly need to understand Unicode, normalization, and composed vs. decomposed forms in order to solve this.
 
On Mac OS X, POSIX filenames are UTF-8 encoded, using canonically decomposed Unicode. Go to www.unicode.org to find out what that means. The Mac handles this fine, but whatever code you are using on your Windows box doesn't (which means it is broken: any process that cannot handle composition of Unicode characters does not conform to the Unicode standard). Convert the text to canonically precomposed Unicode, either on the Macintosh or on Windows.
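
The symptom in the first post matches what decomposition does to this letter. A minimal sketch, using Python's standard `unicodedata` module as a stand-in (it applies standard NFD, which may differ slightly from the near-NFD form HFS+ uses):

```python
import unicodedata

bo = "\u307c"  # HIRAGANA LETTER BO, precomposed form

# Canonical decomposition splits the letter into a base plus a combining mark.
decomposed = unicodedata.normalize("NFD", bo)
for ch in decomposed:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Canonical composition puts the two code points back together.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == bo)
```

The loop prints HIRAGANA LETTER HO followed by COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK. A display that does not compose the pair would show ほ plus a stray mark, which is what the Windows side is doing.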
 
Looks like we had to normalize the path name (CFStringNormalize) before sending it across to the Windows box. I normalize using the kCFStringNormalizationFormC argument, and it seems to work fine now.

At the same time, we deal with Mac path names (Japanese hiragana) coming in from the Windows box to the Mac. Here again, we have to normalize the paths before storing them in the Mac filesystem, but in this direction we have to use the kCFStringNormalizationFormD argument.
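
For anyone who wants to try the two directions without CoreFoundation, here is a rough Python equivalent. The function names `to_windows` and `to_mac` and the sample file name are made up for illustration, and `unicodedata` applies standard NFC/NFD, which is only an approximation of what the Mac filesystem does:

```python
import unicodedata


def to_windows(path: str) -> str:
    """Compose (NFC) before sending a Mac path name to the Windows side."""
    return unicodedata.normalize("NFC", path)


def to_mac(path: str) -> str:
    """Decompose (NFD) before storing a path name on the Mac side."""
    return unicodedata.normalize("NFD", path)


mac_name = "\u307b\u3099.txt"     # HO + combining voiced sound mark
wire_name = to_windows(mac_name)  # single precomposed BO, U+307C
print(len(mac_name), len(wire_name))
print(to_mac(wire_name) == mac_name)
```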

It's actually Unicode, but people conventionally call it high ASCII, meaning higher ASCII values or whatever it means.

Thanks for the help.
 
Anyone calling it "high ASCII" doesn't have the slightest clue what they are talking about.

And you may have to be careful about Macintosh filenames that are not allowed in the Windows filesystem. On the Macintosh, all characters are allowed in a filename except the nul character and the slash character ("/"). In Windows, many characters are not allowed. You might have fun if I name a file "*.*" on the Macintosh and you try to create a file with that name on a Windows machine.
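
A quick way to spot such names before they reach the Windows side is sketched below. The helper name is hypothetical; the character set is the usual Windows list (`< > : " / \ | ? *` plus control characters 0-31), and the sketch deliberately ignores other Windows rules such as reserved device names and trailing dots or spaces:

```python
from typing import List

# Characters Windows rejects in a file name, besides control characters.
WINDOWS_FORBIDDEN = set('<>:"/\\|?*')


def windows_forbidden_chars(name: str) -> List[str]:
    """Return the characters in `name` that Windows would not accept."""
    return [ch for ch in name if ch in WINDOWS_FORBIDDEN or ord(ch) < 32]


print(windows_forbidden_chars("*.*"))         # ['*', '*']
print(windows_forbidden_chars("report.txt"))  # []
```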
 
Would it help in such a case to use the NSString method -stringByAddingPercentEscapesUsingEncoding: to convert to a standard URL? How would a Windows box handle the translation back to a filename?
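
To illustrate the idea only: the sketch below uses Python's `urllib.parse` rather than the NSString method, whose set of escaped characters may well differ, and the function names are made up. The point is that the receiving side has to undo the escaping itself to get the original name back:

```python
from urllib.parse import quote, unquote


def escape_name(name: str) -> str:
    """Percent-escape a file name as UTF-8; safe="" means "/" is escaped too."""
    return quote(name, safe="")


def unescape_name(escaped: str) -> str:
    """Reverse escape_name on the receiving side."""
    return unquote(escaped)


escaped = escape_name("*.*")
print(escaped)                 # %2A.%2A
print(unescape_name(escaped))  # *.*
```

The escaped form is safe to store as a Windows file name, but a user browsing the folder would see `%2A.%2A` rather than `*.*`, so this is a trade-off rather than a full answer.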
 
Holy smokes.

Unicode is not conventionally called "high ASCII". Please seek the path of character encoding enlightenment:

http://www.joelonsoftware.com/articles/Unicode.html
 
High ASCII is like unicorn tears.

IF ascii had more bits
AND those bits were standardized
THEN high ascii would exist.

IF an albino equid had a single horn extending from its forehead
AND it wept tears for emotional reasons instead of eye irritation
THEN unicorn tears would exist.

The logic is impeccable.
 