ZLIB CRio (VxWorks) bug?


Posted

So I have been trying to get the OpenG zip library working on a cRio (VxWorks), and I think I have finally got it to work, but not before finding a nasty little bug. If any of the file path controls use an uppercase 'C' for the drive letter (like "C:ni-rtsystem"), the zip functions fail to find the files in question (the open/create returns error code 7). Maybe I am missing something, but this took me about an hour to diagnose, so I thought I would share it in case anyone else runs into the same problem. I have no idea if the bug is a problem on any other systems (Windows or Pharlap), but I suspect it is not. Hopefully it is an easy fix in the code. Thanks for getting zip functionality to work on the cRio!

 

Stephen

Posted (edited)
VxWorks paths are case sensitive (Windows paths aren't by default).

Edited by ShaunR
Posted

In addition to what Shaun said, there are several potential problems in the current OpenG ZIP code with respect to localized character sets. If you use filenames with characters outside the 7-bit ASCII table, the result will be very platform dependent. Currently the OpenG ZIP library simply takes the names as handled by LabVIEW, which means whatever MBCS the platform uses at that moment. This has many implications.

 

The ZIP standard only supports local encoding or UTF8, and a flag in the archive entry says what it is. This is currently not handled at all in OpenG ZIP. Even if it was there are rather nasty issues that are not trivial to work out.
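To make the flag concrete: in the ZIP format it is bit 11 (0x0800) of each entry's general purpose bit flag. The sketch below uses Python's standard `zipfile` module, not OpenG ZIP, to show which entries get the flag; the helper name `utf8_flags` is made up for this illustration.

```python
import io
import zipfile

UTF8_FLAG = 0x0800  # bit 11 of the general purpose bit flag


def utf8_flags(names):
    """Write empty entries to an in-memory archive and report,
    per entry name, whether the UTF8 flag ended up set."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    buf.seek(0)
    with zipfile.ZipFile(buf) as zf:
        return {info.filename: bool(info.flag_bits & UTF8_FLAG)
                for info in zf.infolist()}


if __name__ == "__main__":
    # Python's zipfile sets the flag only for names that are not pure ASCII.
    print(utf8_flags(["log.txt", "M\u00fcll.txt"]))
```

Python's writer only sets the flag when a name cannot be encoded as ASCII, which is the same rule proposed further down for OpenG ZIP.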

 

For one, if you run the library on a platform that uses UTF8 encoding by default (modern Linux and MacOSX versions), the pathnames in an archive created on that computer will in fact be UTF8 (since LabVIEW uses the platform MBCS encoding), but the flag saying so is not set, so it will go wrong when you move that archive to a different platform.

 

On the other hand, on Windows LabVIEW uses the CP_ANSI codepage for all its string encoding, since that is what Windows GUI apps are supposed to use (unless you make it a full Unicode application, which is a beast of burden on its own even for normal GUI apps, and an almost impossible move for a programming environment like LabVIEW if you do not want to throw out compatibility with already created LabVIEW VIs). CP_ANSI is an alias for the codepage set in your control panel depending on your country settings. pkzip (and all other command line ZIP utilities) traditionally use the CP_OEM codepage. This is an alias for another codepage, again depending on your country settings. It contains mostly the same language specific characters in the upper half of the codepage as CP_ANSI does, but in a considerably different order. It seems to come from the IBM DOS times, and for some reason MS decided for once to go with an official standard for Windows rather than the standard set by IBM.

 

So an archive created on Windows with OpenG ZIP will currently use the CP_ANSI codepage for the language specific characters and therefore come up with very strange filenames when you look at it in a standard ZIP utility.
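A small illustration of why the names look strange, assuming Western European settings where CP_ANSI maps to cp1252 and CP_OEM maps to cp850 (other country settings map to other codepages). The helper name `view_as_oem` is hypothetical.

```python
def view_as_oem(name, ansi="cp1252", oem="cp850"):
    """Encode a name the way an ANSI-codepage writer would store it,
    then decode the bytes the way an OEM-codepage ZIP tool reads them."""
    return name.encode(ansi).decode(oem)


if __name__ == "__main__":
    original = "M\u00fcll.txt"
    # The same character has a different byte value in each codepage.
    print(original.encode("cp1252"), original.encode("cp850"))
    # Pure ASCII names survive unchanged, the umlaut does not.
    print(view_as_oem("log.txt"), view_as_oem(original))
```

Names limited to 7-bit ASCII are unaffected because both codepages agree on the lower half of the table.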

 

The solution I have been working on over the past months is something along these lines:

 

On all platforms when adding a file to the archive:

 

- Detect if a path name uses characters outside the 7-bit ASCII table. If not, just store it as is with the UTF8 flag cleared.

- If it contains characters outside the 7-bit ASCII range, do the following:

 

  On systems other than Windows and MacOSX:

 

  - Detect if we are on a UTF8 system; if not, convert the path to UTF8. In all cases set the UTF8 flag in the archive entry and store it

 

  On Windows and MacOSX:

 

  - Detect if we are on UTF8 (likely not); if so, just set the UTF8 flag and store the file

  - otherwise convert from CP_ANSI to CP_OEM and, if the conversion succeeds, store the file under this name without the UTF8 flag

  - if the conversion fails for some reason, store the name as UTF8 anyhow
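The steps above can be sketched roughly as follows. This is Python for illustration only, not the OpenG ZIP code (which is not shown in this thread); the function names and the convention of passing `oem_codec=None` on platforms other than Windows and MacOSX are assumptions of this sketch.

```python
import codecs

UTF8_FLAG = 0x0800  # bit 11 of the ZIP general purpose bit flag


def is_utf8(codec_name):
    """True if the codec name is any alias of UTF8."""
    return codecs.lookup(codec_name).name == "utf-8"


def entry_name_for_archive(raw, platform_codec, oem_codec=None):
    """Return (name_bytes, flag) for a name given in the platform encoding.

    oem_codec is passed only on Windows/MacOSX (assumption of this sketch);
    elsewhere it stays None and non-ASCII names are always stored as UTF8.
    """
    if all(b < 0x80 for b in raw):
        return raw, 0                      # pure ASCII: store as is
    if is_utf8(platform_codec):
        return raw, UTF8_FLAG              # already UTF8: only set the flag
    text = raw.decode(platform_codec)
    if oem_codec is not None:
        try:
            return text.encode(oem_codec), 0   # CP_ANSI to CP_OEM worked
        except UnicodeEncodeError:
            pass                           # not representable in CP_OEM
    return text.encode("utf-8"), UTF8_FLAG


if __name__ == "__main__":
    print(entry_name_for_archive(b"M\xfcll.txt", "cp1252", "cp850"))
    print(entry_name_for_archive(b"\x80.txt", "cp1252", "cp850"))
```

The second call shows the fallback: the euro sign exists in cp1252 but not in cp850, so the name is stored as UTF8 with the flag set.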

 

When reading, there is not very much we can do other than observe the UTF8 flag in the archive entry.

 

On non-Windows systems, if the flag differs from the current platform setting we have a real problem. Codepage translation under Unix is basically impossible without pulling in external libraries like ICU. Although such libraries are fairly standard nowadays, there are a lot of differences between Linux distributions. Making OpenG ZIP depend on them is going to be a big problem. On VxWorks it is not even an option without porting such a library too.

 

On Windows we can use MultiByteToWideChar and WideCharToMultiByte to do the right thing. On MacOSX we have a similar API that "tries" to do mostly the same as the Windows functions, but I'm 100% positive that there will be differences for certain character sets.

 

There is still a big problem, since the ZIP standard only provides a flag that says whether the names are in UTF8 or not. If they are not, there is no information anywhere as to what the actual codepage is. Remember that CP_OEM is simply an alias that maps to a codepage which depends on your language settings. It is a very different codepage for Western European and Eastern European country settings, and more different still for Asian country settings.
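The reading side, as a minimal Python sketch. The function name `decode_entry_name` and the cp437 default are assumptions: the archive does not record the codepage, so the reader has to pick one (Python's own `zipfile` module also falls back to cp437).

```python
UTF8_FLAG = 0x0800  # bit 11 of the ZIP general purpose bit flag


def decode_entry_name(raw, flag_bits, assumed_codec="cp437"):
    """Decode an entry name: trust the UTF8 flag if set,
    otherwise guess a codepage because the archive does not record one."""
    if flag_bits & UTF8_FLAG:
        return raw.decode("utf-8")
    return raw.decode(assumed_codec)


if __name__ == "__main__":
    stored = b"M\x81ll.txt"  # written with a Western European OEM codepage
    print(decode_entry_name(stored, 0))            # guess matches the writer
    print(decode_entry_name(stored, 0, "cp866"))   # Cyrillic OEM guess: wrong name
```

The same bytes give a different name under each assumed OEM codepage, and nothing in the archive tells the reader which guess is correct.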

Posted

 

Nice link. I thought the same thing, but upon further testing it seems to only matter for the drive letter. A folder "Log" was accessible as "log" in the file path. This might explain why their document mentions the drive letter as well. Man, it seems like NI could have fixed this one pretty easily, but what do I know.

 

Well in any case I am glad it is working finally.

 

Thanks for your help.

Stephen

Posted
Well, the VxWorks based controllers are a bit of a strange animal in the flock. VxWorks uses a lot of Unix and POSIX like functionality but also deviates from it quite a bit. I'm not really sure if the Windows like file system is part of this at all, or if the drive letter nomenclature is in fact an addition by NI to make them behave more like the Pharlap controllers. Personally I find it strange that they use drive letters at all, as the Unix style single-root file hierarchy makes a lot more sense. But it is how it is, and I'm in fact surprised that the case sensitivity does not apply to the whole filename. Maybe that is a VxWorks kernel configuration item too, which NI disabled for the sake of easier integration with existing Pharlap ETS tools for their Pharlap based controllers. VxWorks was only used because Pharlap did not support PPC compilation, and at that time x86 based CPUs for embedded applications were rather non-existent, whereas PPC was more or less dominating the entire high end embedded market, from printers to routers and more. The use of PPCs in Mac computers was a nice marketing fact but really didn't amount to any big numbers in comparison to the embedded applications of that CPU.

Posted

You are correct. NI have added path labels so that it still looks like Windows/Pharlap. When you use the file path types on VxWorks, LabVIEW converts under the hood, so c:foo.bar becomes /c/foo.bar. If you put the Windows style paths directly in as a string, it wouldn't work.

I am surprised that the drive letters are case sensitive as well considering this conversion is happening. May have to have a play.
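A rough sketch of that kind of conversion, in Python for illustration only. The function name `to_vxworks_path` is hypothetical and this is not NI's actual implementation; lowercasing the drive letter is the workaround from this thread, added here as an assumption.

```python
import re


def to_vxworks_path(win_path):
    """Map a Windows style path to the /<drive>/... form described above.

    Assumption: the drive letter is lowercased, since only the lowercase
    form was reported to work in this thread.
    """
    match = re.match(r"^([A-Za-z]):[\\/]*(.*)$", win_path)
    if match is None:
        # No drive letter: only normalise the separators.
        return win_path.replace("\\", "/")
    drive, rest = match.groups()
    return "/" + drive.lower() + "/" + rest.replace("\\", "/")


if __name__ == "__main__":
    print(to_vxworks_path("C:\\foo.bar"))
    print(to_vxworks_path("c:foo.bar"))
```

Normalising the drive letter before the path reaches the zip functions would sidestep the error 7 reported at the top of the thread, if the case of the drive letter really is the only issue.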
