Human Readable vs. Machine Data


Recommended Posts

In the context of a large application that will be maintained/added on to for years, what is the best choice for data communication?

For the purposes of this discussion, we are talking about using TCP to communicate between two LV applications. Each application will be updated separately. The server side will ALWAYS be up to date with the latest transmission/receiving types, the clients may be out of date at any given time, but newer transmission types will not be used on legacy clients.

Would it be better to:

Use a messaging system relying on a typedef'ed enum with variant data (Flattened String To Variant and Variant To Flattened String used for message formation/conversion). Each message has an associated typedef for variant conversion.

OR

Use a messaging system relying on a typedef'ed enum with a human-readable string following the binary data of the enum. Each message has its own typedef, a string-formation VI, and a custom parser to convert the string back to data (a rough sketch of this layout follows below).

Additionally, LVCLASSES cannot be used, so don't go there.

Would love to hear some takes including your perceived benefits/drawbacks to each system.
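For concreteness, here is a minimal sketch of what the second option's wire layout could look like, in Python rather than LabVIEW (the enum names and the 16-bit enum width are hypothetical illustrations, not from the post):

    import struct
    from enum import IntEnum

    class MsgType(IntEnum):   # hypothetical message enum
        SET_VOLTAGE = 0
        READ_STATUS = 1

    def form_message(msg: MsgType, payload_text: str) -> bytes:
        # Binary enum value first, then the human-readable payload.
        return struct.pack(">H", msg) + payload_text.encode("ascii")

    def parse_message(raw: bytes):
        # Inverse of form_message: split the enum off the readable string.
        (msg,) = struct.unpack_from(">H", raw, 0)
        return MsgType(msg), raw[2:].decode("ascii")

For example, form_message(MsgType.SET_VOLTAGE, "5.00") yields two binary bytes followed by readable text.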

Link to comment


Why not take a lesson from the OSI 7-layer model and look specifically at how packets are transferred: design a generic envelope that you can pack any type of data into, with an ID for the type of packet. So, something like:

32 bits: message type

32 bits: length of the data

The first value will tell you what "language" is being used, and the length tells you how big the data is.

A quick check of the first value will tell your software whether it can handle the protocol; if it can, then look at the data.
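A minimal sketch of packing and unpacking such an envelope, in Python (big-endian byte order is an assumption here; both ends just need to agree):

    import struct

    def pack_envelope(msg_type: int, payload: bytes) -> bytes:
        # 32-bit message type, 32-bit payload length, then the payload.
        return struct.pack(">II", msg_type, len(payload)) + payload

    def unpack_envelope(packet: bytes):
        msg_type, length = struct.unpack_from(">II", packet, 0)
        return msg_type, packet[8:8 + length]

A receiver that doesn't recognize msg_type can still skip exactly length bytes and stay in sync with the stream.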

Just my quick 2 cents,

Ben

Link to comment

Unless message speed is absolutely critical (and maybe not even then, given the speed of modern computers) or you need to obscure the data, I'd go with a human-readable format. There's a reason most of the major internet protocols are human-readable (HTTP and SMTP come to mind) - the communication is easy to debug, log and capture, and you can write clients in any programming language without worrying about how that platform interprets data internally.

Link to comment


Ben, that's exactly what the variant model is doing: the data is always the flattened string, and it is always prepended by the type string (which itself is prepended by the type string's length).

Link to comment

In the context of a large application that will be maintained/added on to for years, what is the best choice for data communication?

For the purposes of this discussion, we are talking about using TCP to communicate between two LV applications. Each application will be updated separately. The server side will ALWAYS be up to date with the latest transmission/receiving types, the clients may be out of date at any given time, but newer transmission types will not be used on legacy clients.

Would it be better to:

Use a messaging system relying on a typedef'ed enum with variant data. (Flattened String to Variant, Variant to flattened string used for message formation/conversion.) each message has an associated typdef for variant conversion.

OR

Use a messaging system relying on a typedef'ed enum with a human readable string following the bin data of the enum. Each message has its own typedef and a string formation VI as well as a custom string parser to data.

Additionally, LVCLASSES cannot be used, so don't go there.

Would love to hear some takes including your perceived benefits/drawbacks to each system.

I always expect I'm missing something, but why do you need the variant? If you've defined a header that includes message length and message type, that works to provide enough info to unflatten the message at the receiver if you just flatten whatever. And don't variants limit you to non-RT targets? If you use flattened data I don't think you have that restriction.

Second, if you decide to use a human-readable/cross platform protocol, check out the XML-RPC server in the Code Repository - that's a published standard (although an old one, it still gets used) that defines message packing/unpacking in XML formatted texts and has a protocol for procedure invocation and response. It's pretty lightweight but still applicable for many tasks. And clients can be language/platform independent. But any of these human readable schemes are less efficient than byte streams. For instance, to move an I32, you need something like

<params>
  <param>
    <value><i4>41</i4></value>
  </param>
</params>

That's a lot of data for a 4-byte value! But it is easy to understand and debug. And if you need to move arbitrary chunks of data, the protocol supports base64 encoding of binary.
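As a cross-language aside, Python's standard library can produce and parse this same kind of encoding, which makes the overhead easy to measure (Python emits <int> rather than <i4>; the XML-RPC spec treats them as equivalent):

    import xmlrpc.client

    # Marshal a single I32 the XML-RPC way, then recover it.
    wire = xmlrpc.client.dumps((41,))
    print(len(wire), wire)   # dozens of bytes of XML for one 4-byte value
    params, method = xmlrpc.client.loads(wire)
    assert params == (41,) and method is None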

Mark

Link to comment


The reason for using variants is that it is a public data structure (i.e., not LabVIEW data; some people are concerned that LV will change the way it flattens clusters, or something to that effect).

Here is what I consider the core issue:

Using human-readable (and custom, for that matter) means our team will have to program every parser and every function. The data is turned into a variant/enum at the other end NO MATTER WHAT. Is it really worth the time (CODING TIME) to convert a piece of LabVIEW data to a human-readable string, transmit it over TCP, convert it back to LabVIEW data, then turn it into a variant/combo?

I don't actually have to maintain this code, but I have a bias toward a particular method (it might be more obvious now).

The criteria for selection are:

Flexibility.

Speed.

Maintainability.

Time to implement.

Ease of debugging.

By the way, the XML-RPC server is awesome. I'll suggest that as another alternative.

~Jon

Link to comment

Well. My 2 cents.

In practical terms, to transmit data over TCP/IP at the application layer you only need to know the length (ignoring the transport layers). How you bundle the data into the payload is irrelevant as long as you know how many bytes you are expecting. So the simplest and most effective approach is an n-bit length and then your payload. You can use delimiters, but then you cannot send binary data without escaping it all, and/or you have to put a lot more logic into your software to keep reading and testing data to find the end.

That ticks all your boxes for sending and receiving. It's the payload, however, that you need to decide how to package to make it "future proof". Abstract the interface from the data and treat them separately. Once you have decided how you are going to package it, it will either be a simple case of adding a length field and transmitting, or the packaging will dictate that you use delimiters and (probably) some bloaty engine to parse it.
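A minimal sketch of that length-then-payload scheme over a TCP socket, in Python (the 4-byte big-endian length is an assumption; any agreed width works):

    import socket
    import struct

    def recv_exact(sock: socket.socket, n: int) -> bytes:
        # TCP is a byte stream, so keep reading until all n bytes arrive.
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("peer closed mid-message")
            buf += chunk
        return buf

    def send_message(sock: socket.socket, payload: bytes) -> None:
        # Prepend a 4-byte big-endian length, then the raw payload.
        sock.sendall(struct.pack(">I", len(payload)) + payload)

    def recv_message(sock: socket.socket) -> bytes:
        (length,) = struct.unpack(">I", recv_exact(sock, 4))
        return recv_exact(sock, length)

No delimiter scanning or escaping is needed, which is exactly the point made above about binary payloads.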

Link to comment

The reason for using variants is that it is a public data structure (i.e., not LabVIEW data; some people are concerned that LV will change the way it flattens clusters, or something to that effect).

Weird reasoning. The LabVIEW variant is not the same as an ActiveX Variant. LabVIEW does wrap ActiveX variants in its own variant to make them compatible diagram-wise, but the actual internal implementation of LabVIEW variants is NOT an OLE/ActiveX Variant. And the same applies to the flattened format, which is just as proprietary as the other flatten formats, although fairly well documented, except for the Variant. NI usually does a good job of maintaining compatibility with documented behavior, but reserves the right to change any non-documented detail at any time. In the case of the flatten format, they changed the type descriptor internally with LabVIEW 8.0 and in the meantime even documented that new format, but maintained a compatibility option to return the old flattened type descriptor. The actual flattened data format stayed the same, except of course it was extended to support new datatypes (I64/U64, timestamps, etc.).

The only real change in the flattened data itself was in LabVIEW 4, when they changed the Boolean to be an 8-bit value instead of a 16-bit value (and Boolean arrays changed to be arrays of 8-bit integers, whereas before they were packed).

Other changes in the flattened data format came in various versions in how refnums got flattened, but NI does not document most refnums' internals, so their internal implementation is private and cannot be relied upon. But if you know where refnums are, you can usually skip them in the datastream without version dependency (almost :rolleyes:).

And claiming ActiveX Variants are a standard is also a bit far-reaching. It's a Windows-only implementation, and many other platforms don't even have such a beast.

Link to comment

OK, so it sounds like the only way to be sure of compatibility with future LabVIEW versions is to not flatten data to string. NI will just change it at some point, and it will affect all past projects. I guess a human-readable, proprietary transfer mechanism will be the preferred system for us.

Link to comment


I'm not sure this is the right take-away message. How LabVIEW flattens data to string (serializes it) is up to NI to decide, but they've done a good job providing documentation and backward compatibility. And if you're going to use TCP/IP, you have to serialize (flatten) the data at some point, since the TCP/IP payload has to be a flattened string (byte array) anyway. I've got code going back to LabVIEW 7.1 that uses the Flatten To String functions, and it hasn't broken yet (as of LabVIEW 2010); I don't expect it to in any major way. The flattened string (serialized data) is used by way too many people in way too many applications (like yours, possibly!) for NI to risk arbitrary, non-backward-compatible changes.

Mark

Link to comment


I'm sure you're not working with Windows anymore, then, since Windows changes its way of working and its data formats with every new version. Not sure you'll find another solution without at least some of that problem, though :rolleyes:.

As for flattened data format changes, let's see: one incompatible change, at version 4.0, since LabVIEW was introduced as a multiplatform version in 1992 (version 4.0 was around 1995). Sounds at least to me like a VERY stable data format.

Link to comment


My preference is to take the risk that flattened data will remain compatible with future versions of LabVIEW.

Link to comment

Is there any LabVIEW-data-friendly, standardized (i.e. open source) way to flatten LV data? I would see that as useful for a "future proof" scheme. Something like C-style structs (only the struct is a cluster). It wouldn't have to handle all LV data (I don't care about non-fundamentals like refnums or picture controls).

Extend that question to a human-readable version.

It definitely would have to do the flatten/unflatten (converting to a variant is OK at that point, because we'd be in the native LV version).

~Jon

Link to comment


Have you read the actual document that describes the flattened format of LabVIEW data? For the fundamental datatypes like scalars and structs, it can't get much more standard than the default C data format. The only LabVIEW specifics are the prepended string and array sizes, the structure alignment of 1 byte, and the default big-endian byte order.

It only gets LabVIEW-specific when you talk about the aforementioned array sizes that get prepended, complex numbers and the extended-precision datatype, and other LabVIEW-specific datatypes such as timestamps, refnums, etc. As to open source, there exists an implementation, although not in C but in LabVIEW: check out the OpenG lvdata Toolkit. Feel free to translate that to a C library or any other language of your choice :D.
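To make that concrete, here is a minimal Python sketch of unflattening one simple case, a cluster of (I32, string), assuming the documented defaults (big-endian, prepended I32 string length, 1-byte alignment); it is an illustration, not a general unflattener:

    import struct

    def unflatten_i32_string_cluster(flat: bytes):
        # I32 first (big-endian), then a string prefixed by its I32 length.
        (number,) = struct.unpack_from(">i", flat, 0)
        (strlen,) = struct.unpack_from(">I", flat, 4)
        return number, flat[8:8 + strlen].decode("latin-1")

    # Bytes matching what Flatten To String should produce for (42, "OK"):
    flat = struct.pack(">i", 42) + struct.pack(">I", 2) + b"OK"
    assert unflatten_i32_string_cluster(flat) == (42, "OK")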

Link to comment


I would be interested in a document describing the flattening format (I didn't think this information was released). This is the closest thing I could find in a Google search:

http://mp.ustu.ru/Users/Kalinin/Files/LabView/National%20Instruments6.0/LabVIEW%206/manuals/datastrg.pdf

Something on ni.com or in the help files would be preferred.

Link to comment


Another case of someone not seeing the forest for all the trees :rolleyes:. It has been part of the LabVIEW online documentation for quite some time already, and it is based on that document too: Help -> LabVIEW Help... -> Fundamentals -> How LabVIEW Stores Data.

Opening the help and searching for "flatten" would have given you this in less time than it took to write your post.

Link to comment

OK, here is my 2 cents. Regarding the comment about wasting developer time: doing things simply because they are quick is NOT the best mindset for solving a problem. You have repeatedly mentioned concerns about future-proofing your code, so it would seem it is worth your time to design a good, maintainable solution. Quick and dirty doesn't sound like the best approach. While it might work now, it could very likely bite you in the butt later. Spend the time to plan up front.

Some quick questions I thought of which may help you decide on the best solution:

Will this forever be a LabVIEW-only solution?

If yes, variant or flatten-to-string will work. If there is any chance these messages may come from or go to an application written in another language, then don't use ANY native LabVIEW type. The basic tuple-style message suggested earlier is probably the most flexible and will easily work in other languages.

What is the reason to select human-readable?

If it is simply because it is generic, it is only beneficial if you will actually need a human to read it. If only machines need to deal with the data, use a format that is machine-friendly and save the effort of translation.

Given National Instruments' track record of maintaining the format for variants/flatten-to-string, do you really need to be that concerned about using it? The small likelihood of this happening can be dealt with in the future if necessary.

My personal recommendation would be to define a generic, language-agnostic format. This gives you the greatest flexibility and allows clients written in other languages to be used easily.

Link to comment
