
Reading the string output of a DLL



Hi,

 

I am writing a LabVIEW API based on a 3rd party DLL. The DLL function I am working on right now returns a string and 2 U32s. The string is only an output, but the 2 U32s are used as both inputs and outputs, so, long story short, the 3 parameters are defined as pointers. The 2 U32s work as specified in the documentation, one of them (psize) returning the number of characters that the string is supposed to contain.

 

However I can't get the string properly. The documentation tells me that it is a C string and I configured the parameter as such, but when I run the function I get only the first character of what I'm supposed to get. The next test I made (see below) returns all the other characters except the first one and helped me understand why this first test returned only the first character: the string is Unicode! This means that each character (though representable in ASCII) takes 2 bytes, the second of which is a null byte. And since a C string ends when the null character is encountered, this explains why LabVIEW only gives me the ASCII value of the first character.

 

In the second test I'm talking about, I modified the parameter to be a Pascal String Pointer instead of a C String Pointer. I got all the characters except the first one, which makes sense, I guess, since the first byte holds the length in a Pascal string.

 

Basically so far I haven't managed to find a way to read the whole string in one call. I would love to find a solution where I just ask for an array of bytes and just perform my own parsing, but the DLL returns an Invalid Parameter error when I try it.

 

Have you run into something similar? Do you have any tips?

 

Help would be much appreciated!! (on top of this issue, LabVIEW crashed 50% of the time :throwpc: )

[attached screenshot]


Hi,

 

I think your best bet is to create a wrapper DLL that converts that string to a LabVIEW-friendly string. You can't do the conversion in LabVIEW, as there is no way to tell LabVIEW how many bytes it should read.
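Something along these lines could work as the wrapper (a rough sketch only: the exported names, the original prototype and the meaning of the parameters are assumptions here, the real ones have to come from your DLL's header and documentation):

#include <windows.h>
#include <stdint.h>

/* assumed prototype of the 3rd party export (illustrative only) */
int32_t __stdcall ThirdParty_GetString(wchar_t *buffer, uint32_t *pSize, uint32_t *pFlags);

/* wrapper that returns an ANSI string LabVIEW can read as a C String Pointer */
__declspec(dllexport) int32_t __stdcall GetStringAnsi(char *out, uint32_t *pSize, uint32_t *pFlags)
{
    wchar_t wide[512];                 /* scratch buffer, fixed size for the example */
    uint32_t chars = 512;
    int32_t err = ThirdParty_GetString(wide, &chars, pFlags);
    if (err != 0)
        return err;
    /* convert UTF-16 to the current ANSI codepage */
    int written = WideCharToMultiByte(CP_ACP, 0, wide, (int)chars, out, (int)*pSize, NULL, NULL);
    if (written > 0 && written < (int)*pSize)
        out[written] = '\0';           /* NUL-terminate so LabVIEW sees a normal C string */
    *pSize = (uint32_t)(written > 0 ? written : 0);
    return 0;
}

In LabVIEW you would then call GetStringAnsi with the string parameter configured as a C String Pointer and psize as the size of the buffer you allocated.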

 

 

the string is Unicode! This means that each character (though representable in ASCII) takes 2 bytes, the second of which is a null byte. And since a C string ends when the null character is encountered, this explains why LabVIEW only gives me the ASCII value of the first character.

 

To be precise, the string is encoded in UTF-16 (or maybe UCS2). There are other Unicode encodings.
 
If the string were encoded in UTF-8 (another Unicode encoding), then you would not have faced this issue, because UTF-8 is a strict superset of ASCII. That means that when you convert an "ASCII string" to UTF-8, the output bytes look exactly the same as the input bytes. Thus, LabVIEW would be able to treat it as a plain C string.
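For characters in the ASCII range the encodings compare like this (just an illustration, not anything from your DLL):

/* the text "Hi" in three encodings: ASCII and UTF-8 are byte-for-byte identical
   below 128, while UTF-16LE adds a 0x00 high byte after every character */
const unsigned char ascii[]   = { 0x48, 0x69 };
const unsigned char utf8[]    = { 0x48, 0x69 };
const unsigned char utf16le[] = { 0x48, 0x00, 0x69, 0x00 };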

 

 

(on top of this issue, LabVIEW crashed 50% of the time  :throwpc: )

 
If it crashes when you use your Pascal string approach, that's probably because LabVIEW read the first byte, interpreted it as a (very long) length, and then tried to read beyond the end of the string (into memory that it's not allowed to read).

 

LabVIEW takes the specification you set in the Call Library Node pretty literally. For C strings this means that it will parse the string buffer on the right side of the node (if connected) for a 0 termination character and then convert this string into a LabVIEW string. For a Pascal string it interprets the first byte in the string as a length and then assumes that the rest of the buffer contains that many characters (although I would hope that it uses the buffer size passed in on the left side as an upper bound).

 

Since your "String" contains embedded 0 bytes, you can not let LabVIEW treat it as a string but instead have to tell it to treat it as binary data. And a binary string is simply an array of bytes (or in this specific case possibly an array of uInt16) and since it is a C pointer you have to pass the array as an Array Data Pointer. You have to make sure to allocate the array to a size big enough for the function to fill in its thing (and probably pass in that size in pSize so the function knows how big the buffer is it can use) and on return resize the array buffer yourself to the size that is returned in pSize.

 

And you of course have to make sure that you treat pSize correctly. This is likely the number of characters, so if this is a UTF-16 string it would be equal to the number of uInt16 elements in the array (if you use a byte array on the LabVIEW side instead, the size in LabVIEW bytes would likely be double what the function considers the size). But note the "likely" above! Your DLL programmer is free to require a minimum buffer size on entry and ignore pSize altogether, or treat pSize as number of bytes, or even number of apples if he likes. This information must be documented in the function documentation in prose text and can not be specified in the header file in any way.

 

Last but not least you will need to convert the UTF-16 characters to a LabVIEW MBCS string. If you have treated it as a uInt16 array, you can basically scan the array for values that are higher than 127; these would need to be treated specially. If your array only contains values up to and including 127 you can simply convert them to a U8 byte and then convert the resulting byte array to a LabVIEW string. And yes, values above 127 are not directly translatable to ASCII. There are special translation tables that can get pretty involved, especially since they depend on your current ANSI codepage. The best would be to use the Windows API WideCharToMultiByte(), but that is also not a very trivial API to invoke through the Call Library Node. On the dark side you can find some more information here about possible solutions to do this properly.
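For reference, this is roughly how WideCharToMultiByte() is used from C (a sketch; you would either set up these calls through the Call Library Node or put them into a small wrapper DLL):

#include <windows.h>

/* convert nChars UTF-16 characters to the current ANSI codepage;
   returns the number of bytes written, or 0 on failure */
int utf16_to_ansi(const wchar_t *wide, int nChars, char *out, int outBytes)
{
    /* first call with a NULL output buffer only asks for the required size */
    int needed = WideCharToMultiByte(CP_ACP, 0, wide, nChars, NULL, 0, NULL, NULL);
    if (needed <= 0 || needed > outBytes)
        return 0;
    return WideCharToMultiByte(CP_ACP, 0, wide, nChars, out, outBytes, NULL, NULL);
}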

 

The crashing is pretty normal. If you deal with the Call Library Node and tell LabVIEW to pass in a certain datatype or buffer and the underlying DLL expects something else, there is really nothing LabVIEW can do to protect you from memory corruption.


Just to expand on what Rolf is saying. There is a catch-22 with these sorts of functions: you need to know the buffer size before you know the buffer size. To resolve this, many functions allow calling with a null pointer for the array/string, which then populates the size parameter. You can then call the function a second time, knowing the size from the previous call.
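In C the pattern looks roughly like this (the function name and the NULL-pointer behaviour are assumptions for illustration; not every DLL supports this, so check the documentation):

#include <stdint.h>
#include <stdlib.h>

int32_t __stdcall ThirdParty_GetString(uint16_t *buffer, uint32_t *pSize, uint32_t *pFlags);

uint16_t *read_string(uint32_t *pFlags, uint32_t *outChars)
{
    uint32_t size = 0;
    ThirdParty_GetString(NULL, &size, pFlags);          /* 1st call: only query the required size */

    uint16_t *buf = (uint16_t *)calloc(size, sizeof(uint16_t));
    if (buf == NULL)
        return NULL;
    if (ThirdParty_GetString(buf, &size, pFlags) != 0)  /* 2nd call: fill the buffer */
    {
        free(buf);
        return NULL;
    }
    *outChars = size;                                   /* characters actually written */
    return buf;                                         /* caller frees the buffer */
}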


To be precise, the string is encoded in UTF-16 (or maybe UCS2). There are other Unicode encodings.

 

You're right, it is likely UTF-16.

 

If it crashes when you use your Pascal string approach, that's probably because LabVIEW read the first byte, interpreted it as a (very long) length, and then tried to read beyond the end of the string (into memory that it's not allowed to read).

 

It actually crashes in general, not just with Pascal strings, and not just with this function. Even when I open the LabVIEW examples provided by the DLL manufacturer, it regularly works (the DLL returns values as expected) and then when the VI stops, LabVIEW crashes...

 

Since your "String" contains embedded 0 bytes, you can not let LabVIEW treat it as a string but instead have to tell it to treat it as binary data. And a binary string is simply an array of bytes (or in this specific case possibly an array of uInt16) and since it is a C pointer you have to pass the array as an Array Data Pointer.

 

It worked!!! :worshippy:  Array of U8 and Array Data Pointer did the trick. I can't thank you enough for saving me from hours of coffee+headache pills combo!

 

Your DLL programmer is free to require a minimum buffer size on entry and ignore pSize altogether, or treat pSize as number of bytes, or even number of apples if he likes. This information must be documented in the function documentation in prose text and can not be specified in the header file in any way.

 

So as it turns out, the psize I wire in has to be the number of bytes of the allocated buffer, but the psize returned is the number of characters, so half the number of "meaningful" bytes that it returns. That's OK; as long as I know how it works, it's easy to adapt the code to it.

 

If your array only contains values up to and including 127 you can simply convert them to a U8 byte and then convert the resulting byte array to a LabVIEW string.

 

Yes, all the characters are less than 127, so I just decimated the array to keep all the even indexes (all the odd indexes being null characters) and then converted this array of bytes into a string.
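In C the same decimation would look something like this (just to show the idea; it only works as long as every character really is in the ASCII range):

#include <stddef.h>

/* keep the low byte of every UTF-16LE code unit; valid only for ASCII-range text */
size_t narrow_utf16le_ascii(const unsigned char *in, size_t inBytes, char *out)
{
    size_t n = 0;
    for (size_t i = 0; i + 1 < inBytes; i += 2)
    {
        if (in[i + 1] != 0x00 || in[i] > 127)
            break;                    /* not plain ASCII, a real conversion would be needed */
        out[n++] = (char)in[i];
    }
    return n;                         /* number of characters written to out */
}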

 

Shaun, you have a good point. I could always guarantee that the string will fit by feeding a huge psize, but that's probably a waste of memory allocation in most cases, so what I will do is feed a reasonable psize, and then compare it to the returned psize. If the comparison shows that some characters are missing, I will call the function a second time, this time with the exact number of expected characters since I know it.

Thank you all for your help again!

(I'm pretty sure I'll run into more DLL issues VERY soon)


 so what I will do is feed a reasonable psize, and then compare it to the returned psize. If the comparison shows that some characters are missing, I will call the function a second time, this time with the exact number of expected characters since I know it.

That approach will probably end in random crashes. When dealing with DLLs, the memory to be written to must be the correct size or bigger. If it is so much as one byte too small, a crash (GPF) is inevitable, although sometimes not predictable. The only safe way is to know how big the memory allocation needs to be before writing data into it.


 

A proper API would specifically document that one can call the function with a NULL pointer as buffer to receive the necessary buffer size to call the function again.

 

But that thing about having to specify the input buffer size in bytes but getting back the number of characters would be a really brain-damaged API. I would check again! What happens if you pass in the number of int16 (so half the number of bytes)? Does it truncate the output at that position?

 

And you still should be able to define it as an int16 array. That way you don't need to decimate it afterwards.


I just tried declaring the string parameter as a U16 array instead of a U8. In this case it does treat the psize input as being the number of characters, not the number of bytes. The reason why it seems to be the number of bytes is that the string was defined as an array of U8.

 

As you said Rolfk, I don't need to decimate anymore; I directly feed the U16 array into the Byte Array To String function. I get a coercion dot of course, since it needs to be coerced into a U8 array first, but that's fine, it's what we want...

 

OK Shaun, I will then increase the psize to be pretty big (maybe 500 characters). I expect it always to be big enough.

 

There is really a lot to know to be able to call DLLs properly in LabVIEW. I will do some googling of course, but do you know what is the best source to learn this?

 

Side-note: GPF = General Protection Fault?

There is really a lot to know to be able to call DLLs properly in LabVIEW. I will do some googling of course, but do you know what is the best source to learn this?

Learn C. All of this is based on C conventions. When you understand C data types and pointers, the LabVIEW part will make sense (mostly - I still get thrown off by the way clusters are packed versus C structures). Unfortunately I don't know of any other way to learn it.


I just tried declaring the string parameter as a U16 array instead of a U8. In this case it does treat the psize input as being the number of characters, not the number of bytes. The reason why it seems to be the number of bytes is that the string was defined as an array of U8.

 

This can't be! The DLL knows nothing about whether the caller provides a byte buffer or a uInt16 array buffer and consequently can't interpret the pSize parameter differently.

 

And as ned told you, this is basically all C knowledge. There is nothing LabVIEW can do to make this any easier. The DLL interface follows C rules and those are both very open (C is considered only slightly above assembly programming) and the C syntax is the absolute minimum to allow a C compiler to create legit code. It is and was never meant to describe all aspects of an API in more detail than what a C compiler needs to pass the bytes around correctly. How the parameters are formatted and used is mostly left to the programmer using that API. In C you do that all the time; in LabVIEW you have to do it too, if you want to call DLL functions.

Learn C. All of this is based on C conventions. When you understand C data types and pointers, the LabVIEW part will make sense (mostly - I still get thrown off by the way clusters are packed versus C structures). Unfortunately I don't know of any other way to learn it.

 

LabVIEW uses normal C packing rules too. It just uses different default values than Visual C. While Visual C has a default alignment of 8 bytes, LabVIEW always uses 1-byte alignment in the 32-bit Windows version. This is legit on an x86 processor, since a significant number of extra transistors have been added to the operand fetch engine to make sure that unaligned operand accesses in memory don't incur a huge performance penalty. This is all to support the holy grail of backwards compatibility, where even the greatest octa-core CPU must still be able to execute original 8086 code.

 

Other CPU architectures are less forgiving, with SPARC having been really bad if you did unaligned operand accesses. However, on all current platforms other than Windows 32-bit, including the Windows 64-bit version of LabVIEW, LabVIEW does use the default alignment.

 

Basically this means that if you have structures in C code compiled with default alignment, you need to adjust the offsets of cluster elements to align on the natural element size when programming for LabVIEW 32-bit, possibly by adding filler bytes. Not really that magic. Of course a C programmer is free to add #pragma pack() statements in his source code to change the alignment for parts or all of his code, and/or change the default alignment of the compiler through a compiler option, throwing off your assumption of the Visual C 8-byte default alignment.  :P
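A small example of what that means, compiled once with the Visual C default alignment and once with 1-byte packing:

#include <stdio.h>

#pragma pack(push, 8)                                  /* Visual C default alignment */
typedef struct { char flag; double value; } Padded;   /* sizeof == 16: 7 filler bytes after flag */
#pragma pack(pop)

#pragma pack(push, 1)                                  /* what a 32-bit LabVIEW cluster expects */
typedef struct { char flag; double value; } Packed;   /* sizeof == 9: no filler */
#pragma pack(pop)

int main(void)
{
    printf("padded: %u bytes, packed: %u bytes\n",
           (unsigned)sizeof(Padded), (unsigned)sizeof(Packed));
    return 0;
}

So to pass the padded variant from a 32-bit LabVIEW cluster you would add 7 U8 filler elements after the first byte.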

 

This special default case for LabVIEW for Windows 32-bit does make it a bit troublesome to interface to DLL functions that use structure parameters if you want the code to run equally on 32-bit and 64-bit LabVIEW. However, so far I have always solved that by creating wrapper shared libraries anyhow, and usually I also make sure that structures I use in the code are really platform independent by aligning all elements in a structure explicitly to their natural size.


This can't be! The DLL knows nothing about whether the caller provides a byte buffer or a uInt16 array buffer and consequently can't interpret the pSize parameter differently.

 

Fair enough. I probably misinterpreted my tests. I define the minimum size of the array to be equal to psize in the call declaration; this is probably what influences the results, not the DLL itself. Does that make more sense?

[attached screenshot]

