Serializing aggregated objects?

Daklu · November 12, 2011

So we've talked a lot about the dangers of using LV's built-in object serialization techniques. When I save objects to disk I'll throw the data in a typedef, convert it to xml, and write it to disk. All is good. If I'm going to need version mutation I can wrap the xml data in a cluster with a version number, convert *that* to xml, and write to disk.

That second xml conversion makes it a little difficult to read the original xml string. Usually it's not that big of a deal--I can figure it out enough to modify the data if I need to.

In my current project I have a SystemConfig object that is an aggregation of its own data and unique config objects for several subsystems. I did this so I wouldn't have half a dozen different config files. Each config object has its own serialization implementation, similar to the one above. (It doesn't use JKI's Easy XML. I've been exploring that on my own.) When the SystemConfig.Serialize method is invoked it in turn invokes each object's Serialize method, puts all the serialized strings in an array, adds the version number, and flattens it all to xml so it can be written to disk.

This works, but all those xml conversions really mess up the readability. Has anyone found a good way to serialize aggregated objects while maintaining the ability to manually mutate the data and preserve readability?
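A minimal sketch of the readability problem described above, written in Python rather than LabVIEW and using only the standard library. The element names (SystemConfig, CameraConfig, ExposureMs, Serialized) are hypothetical stand-ins, not names from the actual project. It shows that storing an already-serialized XML string as text inside another XML document escapes the inner markup, while attaching the inner document as a child element keeps it readable.

```python
import xml.etree.ElementTree as ET


def serialize_subsystem(name, settings):
    """Return one subsystem's config as an XML string (hypothetical layout)."""
    root = ET.Element(name, version="1")
    for key, value in settings.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


camera_xml = serialize_subsystem("CameraConfig", {"ExposureMs": 20})

# Approach from the post: the serialized string becomes text in an outer document.
outer = ET.Element("SystemConfig", version="1")
ET.SubElement(outer, "Serialized").text = camera_xml
flattened = ET.tostring(outer, encoding="unicode")  # inner "<" and ">" are escaped

# Alternative: parse the child string and attach it as a real child element.
nested_root = ET.Element("SystemConfig", version="1")
nested_root.append(ET.fromstring(camera_xml))
nested = ET.tostring(nested_root, encoding="unicode")  # inner markup stays intact

print(flattened)
print(nested)
```

Whether something equivalent is practical in LabVIEW depends on the XML tooling in use; the sketch only illustrates where the escaping comes from.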

Jon Kokott · November 12, 2011

I've given up on trying to do this. It is an incredibly difficult task to manage mutation history alone and have things work. I've resolved to never support old objects (even though it sometimes works) and always store as binary. If someone needs to edit the file I create an editor program which is released alongside the actual test software, and use Windows to dispatch it on a special file extension.

Probably not what you want to hear, but I've had waay too many headaches from people manipulating .ini, xml, or any other type of human readable file.

~Jon

mje · November 13, 2011

I've grown an aversion to XML for serialization due to the size of the documents I'm creating. Also, let's face it: to the non-programmer, XML is NOT readable.

However, I think the only way of really doing this is to have each class implement a common interface for serialization, whether it's XML, binary, or something in between. In the end, it means nothing you serialize will directly inherit from LabVIEW Object, but from whatever your core serialization superclass is. Then each class implements its own ToXML, or whatever. Messy and cumbersome. If you support reading multiple versions, somewhere there is a case structure monolith in each class too.

For the record, my method usually involves dumping serial data into an anonymous cluster and writing it to disk (or whatever). Depending on the implementation the cluster might be preceded by a version and class id of some type. Usually I do binary, but I do use XML from time to time if the data is small, say less than 50 MB.
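A rough sketch of that pattern in Python, assuming a hypothetical two-field header (16-bit class id, 16-bit version) in front of a binary payload. The class names, field layout, and the ZoomConfig example are all invented for illustration; the per-version branches in from_bytes play the role of the case structure mentioned above.

```python
import struct

# Hypothetical header: big-endian class id and version, 16 bits each.
HEADER = struct.Struct(">HH")


class Serializable:
    """Common serialization superclass; subclasses supply the payload."""
    CLASS_ID = 0
    VERSION = 1

    def payload(self) -> bytes:
        raise NotImplementedError

    def to_bytes(self) -> bytes:
        return HEADER.pack(self.CLASS_ID, self.VERSION) + self.payload()


class ZoomConfig(Serializable):
    CLASS_ID = 7
    VERSION = 2  # version 2 added the units string

    def __init__(self, levels, units="x"):
        self.levels = list(levels)
        self.units = units

    def payload(self) -> bytes:
        units = self.units.encode("utf-8")
        return (struct.pack(">B", len(units)) + units
                + struct.pack(">H", len(self.levels))
                + struct.pack(f">{len(self.levels)}d", *self.levels))

    @staticmethod
    def _read_levels(body, offset):
        (count,) = struct.unpack_from(">H", body, offset)
        return list(struct.unpack_from(f">{count}d", body, offset + 2))

    @classmethod
    def from_bytes(cls, data):
        class_id, version = HEADER.unpack_from(data, 0)
        if class_id != cls.CLASS_ID:
            raise ValueError(f"unexpected class id {class_id}")
        body = data[HEADER.size:]
        if version == 1:  # old layout: levels only, default units
            return cls(cls._read_levels(body, 0))
        if version == 2:  # current layout: units string, then levels
            n = body[0]
            return cls(cls._read_levels(body, 1 + n), body[1:1 + n].decode("utf-8"))
        raise ValueError(f"unsupported version {version}")
```

Every new version adds another branch to from_bytes, which is the maintenance cost being described.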

crelf · November 14, 2011

The question becomes defining what "readable" means. In these cases, I tend to think that the software can read it, so that's one level of readability, and I can read it to a certain level, so that's another level of readability. Does your average user need to dive into the xml? Hopefully not - your software should be what's taking care of the data (who knows what a user can do to it), and if you do need the user to modify the data in an xml file, then provide a file editor that protects the stuff they shouldn't play with, and only exposes encapsulated access to the things they should.

I know it doesn't answer your question, but I think your question might lead to others in the application of your architecture.

Daklu · November 14, 2011

Good answers from everyone. Thanks.

"The question becomes defining what 'readable' means."

Primarily I meant readable by me and readable enough for an advanced user to be able to make a change if I gave them instructions on what to change and where to change it.

One of the things stored in the config file is a list of available microscope zoom levels populating a certain ring control. The customer wanted to be able to easily add new zoom levels to the list. Creating a user interface and doing that via software is the better solution, but time is short so I was thinking directly editing the xml config file would be an easy workaround. That's when I ran smack into the readability issue of flattening an xml string to xml.

Based on the feedback and time constraints, I think I'll stick with binary config files and push that feature off to version 2.

Daklu · November 14, 2011

Hmm... odd question I had while converting all my xml outputs to binary strings. Flatten to String has error input and output terminals, but the help file doesn't indicate whether the function generates any errors on its own or just passes through the errors it receives. I don't think it produces its own errors, but I'm not sure. Anyone else know?

Darin · November 14, 2011

"I don't think it produces its own errors, but I'm not sure. Anyone else know?"

If you use 7.x mode and wire something which cannot be represented by the Type String then supposedly you can get an error. Don't ask me what that something is, I don't know.

Aristos Queue · November 15, 2011

Flatten to String can return Out Of Memory.

PaulL · November 15, 2011

OK, I will ask again what I think is an obvious question here: Why doesn't NI include a native feature to serialize LabVIEW objects in an exchangeable way? (Alternatively, why doesn't NI provide enough access to allow a third party to develop such a framework?)

For me, "exchangeable" definitely means in a manner that allows the data to be shared between platforms. (Hence having "default data" without specifying the values of the default data is not allowed.) Moreover, using a more common format (such as "Simple XML") is appropriate.

Of course, including the object version number is only meaningful within LabVIEW, but this is useful within LabVIEW thanks to LabVIEW objects' capability to translate between versions. (Note: I recognize the versioning can't avoid all possible issues, but in practice I think that is rarely a practical issue.)

I understand that for security reasons a developer may want to turn off the ability to serialize an object. To support that, I envision a checkbox to allow serialization (default = True) in the class properties dialog.

I think XML is the best option for this for several reasons:

1) It is a common way to serialize objects in different environments. This means that I can exchange serialized data with Java applications, for example.

2) It is readable, albeit not easily readable, by human beings. (I actually don't want humans to read serialized data very often--and really never the operator, but it is good that they can on the rare occasion when they need to do so.)

Why I think NI should implement this:

1) This is relatively straightforward for NI to do since NI can already serialize a class to the current (noninterchangeable) LabVIEW XML format.

2) Having this capability would greatly expand the application space of LabVIEW, since it would make it orders of magnitude easier to interface with non-LabVIEW applications. This is by far the most compelling reason to include this feature.

3) That there is a need for this is quite obvious, given the number of lengthy discussions just on LAVA about this topic.

4) The current situation, in which each class must contain specific code for serialization, is patently inefficient and nonsensical.

5) In other major languages meaningful object serialization is a given, and LabVIEW should include (indeed, must include) this functionality to be competitive.

For the record, to serialize LabVIEW object data for communication within LabVIEW we use either the methods to flatten to string or to XML, and this works fine. I realize it's not theoretically 100% fool-proof, because of potential issues across different object versions, but in practice we use version control, so that we build applications using the same versions of interface code (usually), and we only have one large system, so we can pretty easily control our deployed applications. (I think that versioning an application could achieve the same.) In practice, we've never experienced a version problem with this approach, and it avoids having to write any class-specific code (which, again, a developer should definitely not have to do) to support serialization.

PaulL · November 15, 2011

By the way, we don't use objects directly to store configuration information any more.

We started out having a Configuration Handler class that contained classes with different specific configurations. We moved away from that since:

1) A LabVIEW class does not have a control (natively) associated with it. A cluster (or any primitive) does. We need the control for the configuration editor view.

2) In practice, configuration data is just flat data. A primitive or typedef is perfectly capable of representing this. The only advantage I can see to using an object is the ability to remember versions, and this is in practice not worth the trade-off given the limitations of LabVIEW objects with respect to 1 and 3.

3) We can easily and directly flatten data to a file (e.g., XML) if it is a primitive or a typedef.

So, what we have for each application is a set of items (almost all are strict typedef'd clusters, but even primitives are just fine) that we write to a corresponding set of files. (Each typedef saves to one file. Each typedef defines a group of closely related values, e.g., compensator parameters.)

Each class that uses these parameters has an associated initialization method. Thus, Compensator.init() reads compensatorParameters.xml. Each class may read multiple configuration files, and more than one class can read any given configuration file.
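As a rough illustration of this one-file-per-typedef arrangement, here is a Python sketch in which a dataclass stands in for the strict typedef'd cluster. Only Compensator.init() and compensatorParameters.xml come from the description above; the gain and offset fields, the XML layout, and the helper function names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from pathlib import Path


@dataclass
class CompensatorParameters:
    """Stand-in for one typedef'd cluster; the fields are hypothetical."""
    gain: float = 1.0
    offset: float = 0.0


def write_parameters(path, params):
    """Write one parameter group to its own small XML file."""
    root = ET.Element("CompensatorParameters")
    ET.SubElement(root, "gain").text = repr(params.gain)
    ET.SubElement(root, "offset").text = repr(params.offset)
    ET.ElementTree(root).write(str(path), encoding="utf-8", xml_declaration=True)


def read_parameters(path):
    """Read the parameter group back from its file."""
    root = ET.parse(str(path)).getroot()
    return CompensatorParameters(
        gain=float(root.findtext("gain")),
        offset=float(root.findtext("offset")),
    )


class Compensator:
    CONFIG_FILE = "compensatorParameters.xml"

    def init(self, config_dir):
        # The class loads only the configuration it needs, where it needs it.
        self.parameters = read_parameters(Path(config_dir) / self.CONFIG_FILE)
        return self
```

Any other class that needs the same values could call read_parameters on the same file, which matches the point that several classes may share one configuration file.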

Yes, there is a trade-off here because clusters do not have version memory. In practice, though, we have encountered zero issues because of this limitation, and the advantages of this approach have been many.

For clarification, our configuration editor is an Object-Oriented application, but the configuration items are not objects.

PaulL · November 15, 2011

I thought I had posted such an idea to ni.com/ideas, but I couldn't find it. Hence I created a new idea:

http://forums.ni.com/t5/LabVIEW-Idea-Exchange/Support-serialization-of-LabVIEW-objects-to-interchangeable-form/idi-p/1776294

PaulL · November 16, 2011

Dave,

The scheme my group developed a couple years ago is very similar to what you did.

I have attached an example. The missing VIs are EasyXML VIs--my computer crashed and I haven't reinstalled these just yet.

The major difference is that we didn't create an intermediate cluster for each class. Doing so is extra maintenance, doesn't help with writing except to include a version number, and I don't think helps much on reading. (If the data exists with the correct tag and format in the XML, the reader can always find it--even if the order of the data has changed. If the data doesn't exist the reader will throw an exception, unless we write some complicated mutation code within the reader method itself, which, I think, is not a winning proposition. Even then, we would really have a different object as an output. I'm not sure this is even possible.) I must be missing something here, but I don't see what.
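The reader behaviour described here (find each value by tag regardless of order, and fail when a tag is absent) can be sketched in Python roughly as follows. This is not the EasyXML implementation; the function name, the exception type, and the tag names in the usage lines are hypothetical.

```python
import xml.etree.ElementTree as ET


class MissingFieldError(KeyError):
    """Raised when a required tag is not present in the document."""


def read_fields(xml_text, required_tags):
    """Look up each required tag by name; element order does not matter."""
    root = ET.fromstring(xml_text)
    values = {}
    for tag in required_tags:
        node = root.find(tag)
        if node is None:
            raise MissingFieldError(tag)
        values[tag] = node.text
    return values


# Same data, different element order: both reads give the same result.
a = read_fields("<Cfg><gain>2</gain><offset>1</offset></Cfg>", ["gain", "offset"])
b = read_fields("<Cfg><offset>1</offset><gain>2</gain></Cfg>", ["gain", "offset"])
```

A document written before a field existed would raise MissingFieldError, which is the case where mutation code (or regenerating the file) becomes necessary.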

Paul

"In my current project I have a SystemConfig object that is an aggregation of its own data and unique config objects for several subsystems. I did this so I wouldn't have half a dozen different config files. Each config object has it's own serialization implementation, similar to the one above. (It doesn't use JKI's Easy XML. I've been exploring that on my own.) When the SystemConfig.Serialize method is invoked it in turn invokes each object's Serialize method, puts all the serialized strings in an array, adds the version number, and flattens it all to xml so it can be written to disk."

OK, we have found it is much easier to have multiple configuration files. Small, specific files are much, much easier to maintain. (If I add an element, just the one small file changes, not everything.) Moreover, we can read the required configuration exactly where we need it, which we have found much simpler than reading everything globally and then distributing the pieces. When each class has only the configuration it needs it is much more coherent as well (which is the most important benefit).

The individual configuration files are much smaller (so that we don't have the monstrous files another poster complained about) and the files are much more easily readable as well, since the structures are inherently simpler. We can always read our files!

For clarification: Again, within LabVIEW (i.e., for configuration files) we don't use EasyXML. We just use the native serialization tools (flatten to string or XML) with clusters, not objects. This works fine. Yes, there could be an issue between versions, but in practice this is not a problem for us, since when we modify a typedef we generate new XML files anyway.

Aristos Queue · August 31, 2012

Prototype of my Serialization library now available (JSON, XML, your favorite format...)
