Persisting classes and genealogy

Daklu · April 7, 2010

For those who missed it, AQ gave an interesting Advanced LVOOP online presentation today. In it he briefly touched on how persistent classes are able to mutate data from previous versions of the same class. Very few things will break the genealogy; however, if the class is renamed the version info is reset and all genealogy is lost.

As I've been thinking about this I realized this has major repercussions on my workflow with respect to reuse code. My standard practice is to use regular naming styles in my source code, such as MessageQueue.lvclass or FileIoAdapter.lvclass. During the build process, whether I'm using OpenG Builder or Source Distribution builds, I'll append a major version tag to my class name (MessageQueue__v1.lvclass) so I can have multiple major versions installed on my dev system at the same time.

If I'm understanding this correctly, when the build process creates MessageQueue__v1 it discards the genealogy information, meaning even though MessageQueue version 1.0.0.4 can successfully convert MessageQueue version 1.0.0.3 persistence data, MessageQueue__v1 version 1.0.0.4 cannot convert MessageQueue__v1 version 1.0.0.3 persistence data. In other words, none of the classes I distribute will be able to convert persistence data from previous versions of the same class.

Furthermore, at first I assumed it would only affect classes whose data is designed to be saved to disk. On further reflection this has much more severe, and subtle, implications. Other developers could have any number of reasons for needing to persist the state of my reusable classes. As a reuse code developer I've unknowingly made certain assumptions about how it will be used--namely, that it won't need to be saved to disk. It looks like my development practice and lack of foresight have created a gaping compatibility problem with persisting any of my reuse code.

Am I understanding these consequences correctly? If so, is there a way to transfer the genealogy information from MessageQueue.lvclass to MessageQueue__v1.lvclass? I assume I cannot simply copy the xml from one lvclass file to the other.

mje · April 7, 2010

Is this presentation available for those of us who couldn't make the live version? Link?

Aristos Queue · April 7, 2010

Am I understanding these consequences correctly?

Yes, it sounds like you have the right of it.

If so, is there a way to transfer the genealogy information from MessageQueue.lvclass to MessageQueue__v1.lvclass? I assume I cannot simply copy the xml from one lvclass file to the other.

Yes, but it wouldn't help you.

You have A.lvclass at version 1.0.0.4. You save lots of data. Now you rename the class to B.lvclass.

You certainly could bump the class version in the file from 1.0.0.0 to 1.0.0.4. And you could cut and paste the mutation history from A.lvclass to B.lvclass. That's great -- you now have a version 4 of B that can load any data that was flattened as B versions 0, 1, 2 or 3. The problem is that all of your data was flattened as *A*. The name of the class and the version number are both part of the flattened data. It has to be because when I flatten data on an A wire, that data might be a child class like ChildOfA.lvclass. When I unflatten data onto an A wire, the data has to declare which class it represents.

So, yes, you could preserve all the mutation instructions of your previous class on the new class, but it doesn't make any of your data accessible. To be able to unflatten old data, you have to be a new version of the same class and "same class" means same name.
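To make the point concrete, here is a minimal Python sketch of the idea, not of LabVIEW's actual flattened format. The JSON layout, the `REGISTRY` table, and the function names are all hypothetical; the only thing taken from the explanation above is that the class name and version travel with the data and that unflattening looks the class up by that name.

```python
import json

# Hypothetical table of loaded classes: name -> (current version, mutations).
REGISTRY = {}

def register(name, version, mutations):
    """Load a class; mutations[i] upgrades a payload from version i to i + 1."""
    REGISTRY[name] = (version, mutations)

def flatten(name, version, payload):
    # The class name and version are written alongside the data itself.
    return json.dumps({"class": name, "version": version, "data": payload})

def unflatten(text):
    record = json.loads(text)
    name, saved = record["class"], record["version"]
    if name not in REGISTRY:
        raise LookupError(f"no loaded class named {name!r}")
    current, mutations = REGISTRY[name]
    if saved > current:
        raise ValueError("data is newer than the loaded class")
    data = record["data"]
    # Apply only the steps between the saved version and the current one.
    for step in mutations[saved:current]:
        data = step(data)
    return name, data

def add_timeout(data):
    return {**data, "timeout_ms": 100}

register("A", 1, [add_timeout])
old = flatten("A", 0, {"size": 8})
print(unflatten(old))  # old data is mutated up to version 1

# "Rename" A to B and carry the whole mutation history across.
REGISTRY["B"] = REGISTRY.pop("A")
try:
    unflatten(old)
except LookupError as err:
    print("rename broke old data:", err)
```

In this model B has every mutation step A had, yet the old record still fails, because the lookup key stored in the data is the name "A".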

This also has ramifications for people who use the Application Builder options to add a prefix to their files as part of the build. You should stop using that option if you want data versioning of class data to work from one version of your distribution to the next. Otherwise every distribution you produce is version zero, because each build's renamed class is treated as a brand-new class.

Is this presentation available for those of us who couldn't make the live version? Link?

The presentation isn't posted yet. When it is, it'll be linked from here: http://decibel.ni.com/content/docs/DOC-8462

smenjoulet · April 7, 2010

If so, is there a way to transfer the genealogy information from MessageQueue.lvclass to MessageQueue__v1.lvclass? I assume I cannot simply copy the xml from one lvclass file to the other.

Yes, but it wouldn't help you.

You have A.lvclass at version 1.0.0.4. You save lots of data. Now you rename the class to B.lvclass.

You certainly could bump the class version in the file from 1.0.0.0 to 1.0.0.4. And you could cut and paste the mutation history from A.lvclass to B.lvclass. That's great -- you now have a version 4 of B that can load any data that was flattened as B versions 0, 1, 2 or 3. The problem is that all of your data was flattened as *A*. The name of the class and the version number are both part of the flattened data. It has to be because when I flatten data on an A wire, that data might be a child class like ChildOfA.lvclass. When I unflatten data onto an A wire, the data has to declare which class it represents.

So, yes, you could preserve all the mutation instructions of your previous class on the new class, but it doesn't make any of your data accessible. To be able to unflatten old data, you have to be a new version of the same class and "same class" means same name.

So admittedly this is the end of the day and I haven't given it too much thought, but would it be possible in a future version of LV to decouple the Class Name from the .lvclass file name on disk?

Would that allow us to avoid this issue with renaming a class? I'm willing to bet that in most cases (as Dave points out) we're just needing the *file* renamed and not the actual CLASS itself. If you actually needed to rename the CLASS, then it works as it does now and wipes the mutation history.

Any problems that could engender?

-Scott

Nice presentation by the way!

Daklu · April 8, 2010

You have A.lvclass at version 1.0.0.4. You save lots of data. Now you rename the class to B.lvclass.

My use case is actually a little different. In source I create A.lvclass ver 1.0.0.3 and during the source distribution build rename it to B.lvclass ver 1.0.0.3. The class is deployed and lots of data is saved as B.lvclass. Back in source I modify A and save it as A.lvclass ver 1.0.0.4. During the build I rename it to B.lvclass ver 1.0.0.4 and deploy it as such. A.lvclass ver 1.0.0.4 contains the correct genealogy data, but doesn't work because it has the wrong name. B.lvclass ver 1.0.0.4 has the right name, but is missing the genealogy information.

You certainly could bump the class version in the file from 1.0.0.0 to 1.0.0.4. And you could cut and paste the mutation history from A.lvclass to B.lvclass. That's great -- you now have a version 4 of B that can load any data that was flattened as B versions 0, 1, 2 or 3.

That's exactly what I need. From a practical standpoint I'm not at all comfortable including that as part of my deployment process. It's far too fragile. You mentioned a mutation history API in your presentation but I didn't catch its location. Would that provide a more robust way to transfer genealogy? This could be the issue that forces me to abandon major version suffixes. (We may come kicking and screaming into the 21st century, but we'll get there eventually.)

Why have I continued to use suffixes? Convenience primarily. While developing reuse code I often need to have the source code, built code, and deployed code open all at the same time. I have not gotten into the habit of opening a new project before double clicking on the code in Windows Explorer. That means when I do open other code it uses either the default app instance or the topmost user app instance. I frequently end up with unintentional linking. Maybe I just need to adopt better habits. If I had the choice I'd set up LV to automatically open a new app instance when interacting through Windows Explorer. (wink, wink, nudge, nudge)

So admittedly this is the end of the day and I haven't given it too much thought, but would it be possible in a future version of LV to decouple the Class Name from the .lvclass file name on disk?

Would that allow us to avoid this issue with renaming a class? I'm willing to bet that in most cases (as Dave points out) we're just needing the *file* renamed and not the actual CLASS itself. If you actually needed to rename the CLASS, then it works as it does now and wipes the mutation history.

Ultimately what you're talking about is namespaces. I know I've harped on LV's relatively primitive namespacing functionality in the past--I'll spare you the agony this time around. I'll just say I would absolutely love to see more robust and flexible namespacing in LabVIEW (more so than Interfaces, for those that remember that soapbox), but I'm not holding my breath.

Yair · April 8, 2010

This was precisely the reason I brought up this point yesterday - people are not aware of this issue and it's very easy to get bitten.

When I unflatten data onto an A wire, the data has to declare which class it represents.

The problem is that this declaration is done by name. There are legitimate reasons for wanting to change a class name without wanting to lose the history, so it would have been nicer if this was some sort of GUID and we would get the option to reset this GUID when renaming the class.

Here are some examples for legitimate reasons:

Code distribution (Daklu's case).
Spelling mistakes (Oh, I forgot to type the F in "shift"?).
Changes in the class subject (I first encountered this issue when The Device was nearing the end of the development and marketing gave it its final production name. "OK, I'll just rename the class", thought I).

I'm willing to bet that in most cases (as Dave points out) we're just needing the *file* renamed and not the actual CLASS itself.

I would really like to avoid this. The class name is supposed to be descriptive and having different names will just create confusion (see case 3 above).

For further reading, you may wish to have a look at the preserving class data document: http://zone.ni.com/devzone/cda/tut/p/id/6316

Daklu · April 8, 2010

This was precisely the reason I brought up this point yesterday - people are not aware of this issue and it's very easy to get bitten.

Yeah, I was pretty sure renaming the class cleared the genealogy; I just hadn't realized the impact renaming the class during the build process has on the ability to reuse my code in applications.

There are a couple reasons I didn't make the connection:

1. I've always assumed the LV builds automatically compensate for changes made during the build. Preserving genealogy seems like a pretty basic expectation if I'm creating a source distribution. (Recent attempts at renaming mnu files in my reuse modules during the build clued me in that the builder isn't quite what I expected.)

2. I don't typically persist class data to disk, so I don't think about it that much. Designing reuse code requires thinking about what other developers will do with my code, and I hadn't properly considered they might want to persist my classes to disk.

Lesson learned.

The class name is supposed to be descriptive and having different names will just create confusion

I agree. Descriptive naming really helps readability. I frequently rename classes, libraries, and VIs as their functionality changes over time. I do wish LV better supported refactoring.

Aristos Queue · April 8, 2010

There are legitimate reasons for wanting to change a class name without wanting to lose the history, so it would have been nicer if this was some sort of GUID and we would get the option to reset this GUID when renaming the class.

Yeah, and as soon as we implemented that, even more people would be complaining that when you change the name of a class that the file *doesn't* get renamed. I say this based on the screams and complaints that the project originally generated when renaming a virtual folder did not rename the folder on disk and moving a VI from one folder to another didn't move the file on disk.

But beyond my wild speculation about the preferences of LV users, there's a very big usability problem with the GUID solution... file copying. How do you resolve the problem of File >> Save As? When a user does Save As, are they making a backup copy (where you'd want to preserve the GUID exactly) or are they forking a copy (where they'd want a new GUID)? When they have to recreate a missing class, how does a user fill in the GUID? Ok, you upgrade this to be a user defined name instead of a GUID. Great... now you have the same problem with misspelling the name that you had before making you want to change it. So that solves nothing. And further, now when you create a class you are prompted for the name of the class and the file name for the class. I'd actually recommend making them *not* match, otherwise you're going to get burned in the rare cases when they don't match -- I know this because it is exactly what happens to me right now in MS Visual Studio where the name of the class and the name of the file start off the same but someone may change the internal class name without renaming the file (generally because they don't want to mess up the source code control change tracking). LabVIEW made the opposite choice, and what we gain in clarity of file contents and findability, we lose in data preservation and source code control. Which is better?

The debate about name handling in computer environments is endless. I have joked before that management of names is the primary job of a computer scientist. We, as code poets, must give to airy nothing a local memory address and a name. And my studies of mythology and history have taught me that renaming a thing is not done lightly nor without consequence. In the end, we picked a paradigm that seemed like it would work for most situations. It has its pros and its cons.

If you *really* want a solution, do this: Save every individual .lvclass file inside its own .llb. That gives you the ability to rename the file on disk without changing the class name. It won't help the data unflatten case because that's the data name and any layer of indirection there leads to the GUID problems, but it will help with the source code control name change problem. But I promise you, you'll be ticked off by it, just like I am in MSVS, just as soon as the file name stops matching the contained class name. Trust me.

Is this presentation available for those of us who couldn't make the live version? Link?

The presentation is now available.

http://zone.ni.com/wv/app/doc/p/id/wv-2003

Yair · April 9, 2010

people would be complaining that when you change the name of a class that the file *doesn't* get renamed.

And I would be one of them. I didn't say the GUID should have anything to do with the class name. Ideally, it should be something which the user does not see.

I agree that there are problems with the GUID suggestion, but I'm not sure the ones you mentioned are necessarily the biggest ones. Here are some attempts at addressing those concerns.

First, any copy operation on the class creates a full copy, GUID and all. One option of handling the GUID issue is that when you next open the class, it recognizes that its current name is different from its last name and asks you "do you want to change the GUID?". A good solution? I don't know. It will likely confuse some users.

Another potential mechanism is adding a "change GUID" button to the properties dialog.

The fill-in-a-missing-class scenario? Probably not a problem, if the class is missing, so is its mutation history.

LabVIEW made the opposite choice, and what we gain in clarity of file contents and findability, we lose in data preservation and source code control. Which is better?

I don't know. Both clarity and data preservation are important, or I wouldn't be running into this issue. All I know is that there is a real requirement here - I want to be able to rename a class and still maintain its history. Is there a satisfactory solution? I don't know. Is the current one the best that can be done? Don't know that either. I'm sure you gave this way more thought, but the problem remains.

Aristos Queue · April 9, 2010

First, any copy operation on the class creates a full copy, GUID and all. One option of handling the GUID issue is that when you next open the class, it recognizes that its current name is different from its last name and asks you "do you want to change the GUID?". A good solution? I don't know. It will likely confuse some users.

If you do this, you end up with not allowing a class to load into memory because it has the same GUID as another class because having two classes in memory with the same GUID means you cannot unflatten any data because it is ambiguous which class the data represents. The name saved with the data -- what I'll call the data name of the class -- has to be constant from the time the data is flattened to the time it is unflattened. Either that data name is tied to the file name (so that there is an easy and obvious way to control what that name is and to explain why two classes conflict with each other when the data names match) or it is its own independent entity (so that you can change it without changing the name of the class).
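A small Python sketch of the load-time conflict described here. The `ClassTable` type, the GUID string, and the file names are hypothetical and only illustrate why two loaded classes cannot share one data name: the lookup used when unflattening would have two candidates.

```python
class LoadError(Exception):
    """Raised when a class cannot be loaded into memory."""

class ClassTable:
    """Hypothetical table of classes in memory, keyed by data name (or GUID)."""

    def __init__(self):
        self._owner_by_data_name = {}

    def load(self, file_name, data_name):
        owner = self._owner_by_data_name.get(data_name)
        if owner is not None and owner != file_name:
            # Two classes claiming one data name would make unflatten ambiguous.
            raise LoadError(
                f"{file_name} conflicts with {owner}: both claim {data_name!r}"
            )
        self._owner_by_data_name[data_name] = file_name

    def resolve(self, data_name):
        # Unflatten needs exactly one answer for a given data name.
        return self._owner_by_data_name[data_name]

table = ClassTable()
table.load("MessageQueue.lvclass", "guid-1234")
try:
    # A file copy made outside the editor keeps the same GUID.
    table.load("Copy of MessageQueue.lvclass", "guid-1234")
except LoadError as err:
    print(err)
```

When the data name is the file name, the conflict is at least easy to explain to the user; with a hidden GUID the same error appears with no visible cause.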

Black Pearl · April 9, 2010

AQ: thank you for the presentation. On the download, the audio tracks are a bit screwed up. I get the Q&A section played over your speech. It starts at about minute 32.

Felix

Daklu · April 9, 2010

Yeah, and as soon as we implemented that, even more people would be complaining that when you change the name of a class that the file *doesn't* get renamed. I say this based on the screams and complaints that the project originally generated when renaming a virtual folder did not rename the folder on disk and moving a VI from one folder to another didn't move the file on disk.

Chalk it up to users adjusting to a new paradigm. Do you still receive those complaints? And looking back, don't you think *not* renaming folders or moving VIs on disk was the right decision?

there's a very big usability problem with the GUID solution... file copying. How do you resolve the problem of File >> Save As? When a user does Save As, are they making a backup copy (where you'd want to preserve the GUID exactly) or are they forking a copy (where they'd want a new GUID)?

Well, it's generally not a good idea to change the default behavior of operations, so although I'd much rather have the Save As and Rename operations preserve the genealogy, it's probably better to leave them alone. Perhaps new options for "Save As and Preserve" and "Rename and Preserve"? Or adding a "Preserve Genealogy" checkbox to the Save As dialog box?

When they have to recreate a missing class, how does a user fill in the GUID?

The only time I have to recreate a missing class I've purposely deleted is when LV won't let me remove the missing class from a library that for one reason or another didn't get the message that the class is being deleted. This seems like an overly restrictive UI issue, not a real use case.

If you do this, you end up with not allowing a class to load into memory because it has the same GUID as another class because having two classes in memory with the same GUID means you cannot unflatten any data because it is ambiguous which class the data represents. The name saved with the data -- what I'll call the data name of the class -- has to be constant from the time the data is flattened to the time it is unflattened. Either that data name is tied to the file name (so that there is an easy and obvious way to control what that name is and to explain why two classes conflict with each other when the data names match) or it is its own independent entity (so that you can change it without changing the name of the class)

It sounds like the class file name is the first level type check... if the file name embedded with the data doesn't match the unflattening class file name all other operations are skipped and an error is raised. What happens if I save an object of A.lvclass to disk, replace A's genealogy with B.lvclass's genealogy, and try to unflatten the data from disk? My original thought was to insert additional error checking during the unflatten process, then I realized you're probably not saving any type information other than the class name itself and another layer of error checking isn't possible.

So given that flatten to string is a very low level function... maybe we need a higher level API for certain class behaviors. Since LV doesn't have Interfaces, perhaps LabVIEW Object could have some default methods (such as Flatten/Unflatten Strings) added to it. Developers using the class level API gain the benefit of more flexibility, such as being able to unflatten class data of a different name (as long as the GUIDs match) and overriding the Flatten/Unflatten methods to make it possible to recreate dynamic resources (queues, etc.) at runtime.

We, as code poets, must give to airy nothing a local memory address and a name.

Obviously I'm not a code poet -- this crashes my parser.

And my studies of mythology and history have taught me that renaming a thing is not done lightly nor without consequence.

I'm pretty sure the concepts of refactoring and well-documented code weren't around while Prometheus was watching his liver get eaten.

Aristos Queue · April 10, 2010

I'm pretty sure the concepts of refactoring and well-documented code weren't around while Prometheus was watching his liver get eaten.

Ah, a non-believer. A real Greek knows his liver is still being eaten. :-)

Daklu · April 10, 2010

Ah, a non-believer. A real Greek knows his liver is still being eaten. :-)

The only Greek I know is "Gyro."

Yair · April 10, 2010

Your concerns about having multiple classes with the same GUID are legitimate. Daklu's suggestion of throwing an error in such a case and adding an explicit "rename and preserve history" option sounds reasonable.

This will mean that people who copy and rename the file outside of LV will get an error.

We, as code poets, must give to airy nothing a local memory address and a name.

Assuming you got grades which were less than A in your CS courses, did you defend the work under the shield of the poetic license argument?

Also, personally, I'm more of a code cartoonist than a code poet, and as such, should be exempt from such stringent requirements.

Daklu · April 12, 2010

So, given that someone editing a reuse class can unintentionally screw up LV's ability to unflatten previously saved class instances, what's the best way to create a unit test to make sure that doesn't happen? Is it to create an instance with non-default data, save it to disk, and make sure that all subsequent class versions can correctly recover the data? If so, I presume I'd need to create a new saved object for unit testing every time LV bumps the version number?
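One way to picture the test described in this question is a golden-data regression test: keep one saved record per released class version and check that the current code still loads all of them. The Python sketch below is only an analogy. The JSON records, `MUTATIONS`, and `unflatten` are hypothetical stand-ins, and in a real LabVIEW test the fixtures would be files flattened by each released build rather than inline strings.

```python
import json
import unittest

CLASS_NAME = "MessageQueue__v1"
CURRENT_VERSION = 2

def add_timeout(data):
    return {**data, "timeout_ms": 100}   # hypothetical 0 -> 1 mutation

def add_label(data):
    return {**data, "label": ""}         # hypothetical 1 -> 2 mutation

MUTATIONS = [add_timeout, add_label]

def unflatten(text):
    record = json.loads(text)
    if record["class"] != CLASS_NAME:
        raise LookupError(f"data was saved as {record['class']!r}")
    data = record["data"]
    for step in MUTATIONS[record["version"]:CURRENT_VERSION]:
        data = step(data)
    return data

def saved(version, data):
    return json.dumps({"class": CLASS_NAME, "version": version, "data": data})

# One non-default record per released version; add an entry at every bump.
GOLDEN = {
    0: saved(0, {"size": 8}),
    1: saved(1, {"size": 8, "timeout_ms": 250}),
    2: saved(2, {"size": 8, "timeout_ms": 250, "label": "rx"}),
}

class GoldenDataTest(unittest.TestCase):
    def test_every_released_version_still_loads(self):
        for version, text in GOLDEN.items():
            with self.subTest(version=version):
                data = unflatten(text)
                self.assertEqual(data["size"], 8)
                self.assertEqual(set(data), {"size", "timeout_ms", "label"})

    def test_fixture_exists_for_current_version(self):
        self.assertIn(CURRENT_VERSION, GOLDEN)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(GoldenDataTest)
result = unittest.TextTestRunner().run(suite)
```

The second test is what answers the "new saved object at every version bump" question in this sketch: it fails until a fixture for the current version has been added.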

Your concerns about having multiple classes with the same GUID are legitimate. Daklu's suggestion of throwing an error in such a case and adding an explicit "rename and preserve history" option sounds reasonable.

I take back that suggestion. The more I think about it, the more I believe the issue with the mutation history isn't the problem; it's a symptom of the problem. The problem as I see it is:

1. NI's recommended way of persisting an object is to flatten it, and

2. NI hasn't provided class developers with a way to hook into and override the flatten prim.

I can't fault NI for removing the requirement to write mutation code from LVOOP. That could have easily become an obstacle preventing the adoption of classes by the community. However, in hiding the code mutation capabilities behind the curtain they have also removed the ability for developers to write their own mutation code. That decision leaves developers with more complex use cases unable to resolve them.

Also, personally, I'm more of a code cartoonist than a code poet, and as such, should be exempt from such stringent requirements.

Huh... a poet and a cartoonist. I'm more of a sparkly glue and dry macaroni kind of coder myself...

Yair · April 13, 2010

That decision leaves developers with more complex use cases unable to resolve them.

I'm not sure how to describe the general problem, but I would say that this isn't a complex use case, rather one which is likely to be encountered on occasion. The rest of the mutation code seems to work fine and has some great advantages (such as correctly mutating the data when you rename a control, which I view as basically the same kind of change), although I admit to having some discomfort at not being able to supervise or override it easily.

I'm more of a sparkly glue and dry macaroni kind of coder myself...

But surely that would mean that you end up with spaghetti code!

Daklu · April 13, 2010

But surely that would mean that you end up with spaghetti code!

Rats... good point. Maybe I should move on up to sidewalk chalk. (And I so liked the sparkly glue...)

Aristos Queue · April 13, 2010

1. NI's recommended way of persisting an object is to flatten it, and
2. NI hasn't provided class developers with a way to hook into and override the flatten prim.

You are about to scowl at me just as much as Paul did (I forget his username on LAVA -- he's the Paul who works on all the astronomy stuff) when I gave him the following answer when he was talking about customizing the XML flattened format...

The answer is that as soon as you cross the line where the contents of the string are not enough to know which class should unflatten it, you have to write your own UnflattenFromString function from scratch. You might use our UnflattenFromString internally, but there's no way for LV to magically choose which class should unflatten the data if you want a class other than the one named to do the work of unflattening. Even if LV gave you the ability to add class specific code to the unflatten sequence, the class that would be invoked would be the class named in the flattened data, and that's exactly the class that no longer exists in your scenario. The only way this might work is if the class of the old name were still loaded into memory and had in its mutation history instructions that said, "When I am asked to unflatten, I actually unflatten data of this entirely other class over there..." Then you get into some really messy cross link scenarios where the mutation history has to have a path to the other class even though nothing in the class' current version references that other class at all, you have to keep old classes around long after they should be gone, etc.

I've delved into allowing classes to have custom hooks to flatten and unflatten data, but none of the variations that I've played with would solve the rename problem. For that, you have to write your own. The easiest is a subVI that takes in a string, does a find-and-replace to substitute the old name for the new name and then calls the unflatten primitive with the modified string. That's a fairly expensive operation, and it's pretty heavy when you don't know if it is needed or not (i.e., you're reading a string that was written out after the name change, but you still have to do the find and replace just in case). It is error prone if there happens to be an actual string value embedded in the flat data that is the name of the old class but means something else entirely.
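The find-and-replace shim and its hazard can be sketched in Python as follows. The pipe-delimited record layout is invented for illustration and is not LabVIEW's flattened format; the header-only variant additionally assumes the class name sits in an identifiable leading field, which is an assumption of this sketch and not something stated above.

```python
OLD, NEW = "A.lvclass", "B.lvclass"

def flatten(name, version, payload):
    # Hypothetical layout: class name | version | payload
    return f"{name}|{version}|{payload}"

def rename_everywhere(flat):
    """Blind find-and-replace over the whole string."""
    return flat.replace(OLD, NEW)

def rename_header_only(flat):
    """Rewrite the name only when it appears in the leading class-name field."""
    name, rest = flat.split("|", 1)
    return f"{NEW}|{rest}" if name == OLD else flat

# The payload happens to contain the old class name as ordinary string data.
saved = flatten(OLD, 3, "owner=A.lvclass")

print(rename_everywhere(saved))   # the payload value is rewritten as well
print(rename_header_only(saved))  # only the class-name field changes
```

Either variant still has to run on every string read, including ones written after the rename, which is the "just in case" cost mentioned above.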

Magical mutation is a major time savings. But once you break the spell, you're back to writing the full parse-and-mutate-if-needed code that programmers must write in other languages.

Daklu · April 13, 2010

You are about to scowl at me just as much as Paul did...

Not at all. I enjoy learning about LabVIEW's implementation details--especially when it explains why I can't do what I think I should be able to do.

but there's no way for LV to magically choose which class should unflatten the data if you want a class other than the one named to do the work of unflattening. Even if LV gave you the ability to add class specific code to the unflatten sequence, the class that would be invoked would be the class named in the flattened data, and that's exactly the class that no longer exists in your scenario.

I agree there's no good way to resolve the problems of my or Yair's existing persisted data in our use cases. I was thinking about changes that would allow us to deal with those problems in the future.

I've delved into allowing classes to have custom hooks to flatten and unflatten data, but none of the variations that I've played with would solve the rename problem.

So from what I've seen it looks like unflatten first checks the class name. If those match then it checks the class version. If the class version in memory is greater than the class version on disk, then it goes on to unflatten the data. I think there are still error checks in the unflattening process. If the data on disk is 4 bytes and the class is only expecting 2 bytes LV throws an error. Speaking conceptually, wouldn't it be possible to have flatten/unflatten hooks that replace the class name with a class GUID?

(FWIW, I wasn't actually proposing having classes hook into the flatten/unflatten prims. I think in the long run having class methods magically hook into an existing LV prim would be very confusing. I'd much rather see NI develop a better OO framework designed specifically for classes.)

Magical mutation is a major time savings.

No doubt... but it's also an insurmountable obstacle when the implementation doesn't fit our needs.
