Are you misusing the Not A Refnum function and putting your app at risk?

Aristos Queue · March 11, 2012

Just so everyone is clear:

Using the Not A Refnum function to decide to create a refnum is ok because the zero refnum cannot be destroyed/released in a parallel thread. But since the primitive does the extra work of validating the refnum, you take a performance hit on the cases that are going to succeed. Using the Not A Refnum function to decide to do an action or not is a race condition because a parallel thread could be destroying that refnum in between the test and the action.

If you need to actually do an operation, do one of these:

If you need the combined behavior of "if the refnum is bad, allocate it, and do some operation on it", then do this:

What is an acceptable use case for the Not A Refnum primitive?

When you're evaluating the status of one refnum before taking an action on another refnum!
Or if you're just displaying information in a Custom Probe.
Or if you have absolutely nothing happening in parallel (in which case, you might reconsider your use of references anyway).

(For advanced users who suggest the use of semaphores to protect the access, that counts only as an OK usage because you still have a performance hit of unlocking and locking the refnum twice.)

If you ever are tempted to file a bug report that the LabVIEW queue functions are broken, please check your use of Not A Refnum first. I guarantee I will when the bug report gets to me. 🙂

And, for the record, all of the above also applies to using "Get Queue Status" and "Get Notifier Status" functions. And any other similar "is this refnum still valid" functions that you are using to make decisions in code.

Edited February 25, 2022 by Aristos Queue
Expand the last sentence to cover other similar nodes.

Fredy · March 12, 2012

Hi Aristos,

is the last "COMBINED GOOD USAGE" correct?

I would expect that, in case of an error on Enqueue Element, I would create a new queue...

Jiri

Phillip Brooks · March 12, 2012

Also, if the queue is created as a result of an error using enqueue, the subsequent pass through the loop does not pass out the newly created queue (use default if unwired).

Zombies!

Edited March 12, 2012 by Phillip Brooks

Aristos Queue · March 12, 2012

You're right... I need to flip the combined case to the Error case instead of No Error. I took the screenshot before fixing that.

Corrected image (I'll ask Moderator to move this image into the original post)

Phillip Brooks · March 12, 2012

Cross link from the Idea Exchange:

http://forums.ni.com/t5/LabVIEW-Idea-Exchange/Allow-References-to-be-Wired-into-Case-Selectors-to-Check-for/idi-p/1022833

Would the implementation of this idea make this situation less likely?

Daklu · March 12, 2012

I was curious about the performance difference between the left implementation in the first image and the left implementation in the second image, so I whipped up a quick benchmarking test. (Unfortunately I'm still working through some premium membership issues so I can't upload any files or images.) I created an array of 250,000 queues and iterated through them once using Not A Refnum, then again comparing them against a queue constant using the Is Equal function.

Not A Refnum took 39 ms while Is Equal took ~1 ms. So yeah, Not A Refnum definitely takes significantly longer. But Not A Refnum still only takes ~150 nanoseconds per call, so is it worth worrying about? Perhaps if you're running in a really tight loop, but I have to admit I'm rarely concerned about a time hit that small.

(As a side note, creating all those queues took 53.1 seconds for an average of 212 microseconds each. That's 1,000 times longer than the Not A Refnum function.)
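The per-call figures follow directly from the totals quoted above; a quick check in Python, using only the numbers from this post (the variable names are mine):

```python
n_queues = 250_000
not_a_refnum_total_s = 39e-3  # 39 ms for all calls
is_equal_total_s = 1e-3       # "~1 ms" for all calls
create_total_s = 53.1         # time to create every queue

not_a_refnum_ns = not_a_refnum_total_s / n_queues * 1e9
is_equal_ns = is_equal_total_s / n_queues * 1e9
create_us = create_total_s / n_queues * 1e6
ratio = create_total_s / not_a_refnum_total_s

print(f"Not A Refnum: {not_a_refnum_ns:.0f} ns per call")
print(f"Is Equal:     {is_equal_ns:.0f} ns per call")
print(f"Obtain Queue: {create_us:.1f} us per call")
print(f"Create vs. Not A Refnum: {ratio:.0f}x")
```

That works out to about 156 ns per Not A Refnum call, about 4 ns per Is Equal call, and about 212 µs per queue creation, so creation is a bit over 1,300 times slower than the Not A Refnum test.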

--------------------------------

To phrase AQ's point in a slightly different way, it's a question of pretesting versus posttesting. In general, pretesting feels cleaner to me. I find it easier to reason through the code when fewer errors are possible. However, operations on a reference can't be pretested without exposing yourself to a race condition. The only thing you can do is attempt the operation and then see if it worked as AQ shows in the "Combined Good Usage" example.

Using the Not A Refnum function to decide to create a refnum is ok because the zero refnum cannot be destroyed/released in a parallel thread.

I don't think it is a practice that should be encouraged. In fact, it should probably be actively discouraged. You're still exposing yourself to a race condition. Let's use your "Performance Problem" snippet as an example and ignore the performance issues you raised.

The purpose of that snippet is to guarantee the output terminal contains a valid refnum. You are correct that we get the desired behavior in those cases where the input terminal has a zero refnum. It will fail the test, allocate a new queue, and because nobody else has that new refnum it is guaranteed to be valid.

But if the input terminal already has a valid refnum on it all bets are off. You end up with the exact same race condition as in your "Bad Usage" example. The only place we can safely use that snippet is in situations where we know the input queue will have a zero refnum, and if we know that there's no reason to check the refnum in the first place.

What is an acceptable use case for the Not A Refnum primitive?

Rather than trying to list the specific examples of where it is and isn't okay to use Not A Refnum, I'd just go with,

Pretesting a reference before performing an operation on it creates race conditions.

It doesn't matter what function is used for the pretest (Not A Refnum, Is Equal, etc.) or what you're actually testing for (valid refnum, specific data values, etc.). If you're pretesting to decide execution flow, there is a race condition. (Unless, as you pointed out, there is nothing happening in parallel.)

Would the implementation of this idea make this situation less likely?

No, but it would make writing code with race conditions easier and more visually pleasing, possibly increasing the number of users who encounter that race condition.

JackDunaway · March 12, 2012

Cross link from the Idea Exchange: http://forums.ni.com...r/idi-p/1022833 Would the implementation of this idea make this situation less likely?

I'll copy my response from the LV IdEx over here:

I have a tendency to agree that this idea could promote "seemingly safe" but actually bad practices, and after AQ's warning in 2009, I actually came to realize this is probably not a good feature to add to the language. Granted, it would be acceptable and syntactically slick to execute code in the <Not a Refnum> case, but the tendency for developers to place code in the <Valid Refnum> case makes it dangerous (even my snippet to promote this Idea makes this mistake!).

I respectfully suggest this Idea be Rejected on the grounds of potentially causing more problems than the syntactic sugar is worth (and also because it complicates the case selection of class types, a much cooler and more useful Idea).

Aristos Queue · March 12, 2012

Cross link from the Idea Exchange:

http://forums.ni.com...r/idi-p/1022833

Would the implementation of this idea make this situation less likely?

Nope. It would make it just as likely... the structure node could not hold the lock on the refnum because you're using nodes that need to be able to lock the refnum inside the structure node. Even if you taught the nodes to recognize when they are directly within a structure node AND the refnum they're given is the one that the structure node is locking, it wouldn't help if the nodes were in a subVI inside that structure node.

Ultimately, this is the problem with references and is why I push so hard against being able to wire a reference directly to a by value terminal for method invocation. Without a single function that checks "is valid and if so do the operation" atomically, there's a race condition. If I write a by value class that has methods "Is Initialized" and "Do Something", wiring a reference to those two functions in sequence is incorrect usage. Whoever writes the reference API needs to build a single VI that locks the reference once and then does both of those operations before unlocking. These sorts of race conditions become ubiquitous very quickly and they're nearly impossible to debug. Heck... half the time I can't even convince people they have a bug because "it works just fine when I test it." And it will... until you've deployed it on your largest customer's end system. And then it will mysteriously fail.
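To make the "lock once, check, then act" idea concrete in a text language, here is a minimal Python sketch. `QueueRef` and its methods are hypothetical and are not a LabVIEW API; `threading.Lock` stands in for whatever lock guards the reference. The point is only that the validity check and the operation sit inside the same critical section, so a release in another thread cannot slip in between them.

```python
import threading

class QueueRef:
    """Hypothetical reference wrapper guarding its state with one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = []
        self._valid = True

    def release(self):
        # Invalidate the reference; takes the same lock as every operation.
        with self._lock:
            self._valid = False
            self._items.clear()

    def enqueue_if_valid(self, item):
        # "Is valid, and if so do the operation" under a single lock
        # acquisition. Returns True if enqueued, False if already released.
        with self._lock:
            if not self._valid:
                return False
            self._items.append(item)
            return True

ref = QueueRef()
before = ref.enqueue_if_valid("a")  # reference still valid
ref.release()
after = ref.enqueue_if_valid("b")   # reference released, nothing enqueued
```

A separate `is_valid()` followed by a separate `enqueue()` would take the lock twice and leave a gap between the two calls, which is the incorrect usage described above.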

Not A Refnum took 39 ms while Is Equal took ~1 ms. So yeah, Not A Refnum definitely takes significantly longer. But Not A Refnum still only takes ~150 nanoseconds per call, so is it worth worrying about? Perhaps if you're running in a really tight loop, but I have to admit I'm rarely concerned about a time hit that small.

Your time will vary depending upon the type of the refnum. Some types of refnums take longer to validate than others, depending on what system is used for the backing store for that refnum type and, in some cases, how many refnums are in memory. Also, some plug-in type refnums may require calling into a DLL using the UI Thread, so there could be a thread swap. Finally, the "equals not a refnum" test is deterministic, whereas the other test is not.

Edited March 12, 2012 by Aristos Queue

ShaunR · March 12, 2012

Well, this isn't a new phenomenon, and the accepted way is to create an LV2-style global to get the reference. In fact, I supplied exactly that solution for this thread (although it was for events). I've also noted that JGCode uses this method in his classes. KISS.

jzoller · March 12, 2012

I'm a bit confused on AQ's second "Good usage"... why throw away the error, rather than just handling it downstream?

Daklu · March 12, 2012

And the accepted way is to create a LV2 style global to get the reference... KISS.

Just because it's an accepted idea doesn't mean it's a good one. I took the liberty of downloading the code you provided for the user on the other thread and poked around. Before I comment on the code, let me first say this...

--Your solution is based on the code supplied by the other user, who was trying to address a very specific problem. Sometimes we help people bandage the cut on their arm without pointing out the railroad spike buried in their skull. I understand that. In that particular case it sounds like the solution you gave him does the job. I also understand examples are necessarily simplistic and cannot cover all the cases we are likely to encounter. Still, I'm not sure it's an example of a good general purpose solution.

First, the purpose of your Not A Refnum test is simply to initialize the references on startup. If anyone calls the AE's Close action the entire thing breaks down. The NAR test isn't particularly helpful for keeping the system up and running. The limitations of that AE would be far more clearly communicated by replacing Not A Refnum with an Is First Call function. (Or even better, an explicit "Allocate References" action.)

Second, using an AE as a poor man's mutex can't prevent race conditions as long as the refnum is exposed to other code. The only way to verify a race condition does not exist is by inspecting the code and making sure no operations are performed on that reference anywhere other than in the AE. Imagine how much fun that will be on a large project.

There are other things that smell too (one event refnum attached to multiple event structures(!?), no clear owner of the references, etc.) but they are sidebars to the question of pretesting references.

Aristos Queue · March 13, 2012

ShaunR: The AE solution doesn't provide any protection against the race condition I originally posted about as long as the refnum is shared anywhere else... you'd have to encapsulate all operations on the refnum within the AE.

I'm a bit confused on AQ's second "Good usage"... why throw away the error, rather than just handling it downstream?

Because, in general, the whole point of "check if the refnum is valid and only do the operation if it is valid" is from use cases where you don't want an error if the refnum is invalid. So doing the op but ignoring the error gets you that effect.
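In text-language terms, "do the op but ignore the error" might look like the Python sketch below. `Ref`, `ReleasedError` and `enqueue_best_effort` are hypothetical names, not LabVIEW functions, and the `released` flag is a stand-in for the refnum having been released elsewhere.

```python
import queue

class ReleasedError(Exception):
    """Hypothetical error: the reference is no longer valid."""

class Ref:
    """Hypothetical queue reference that can be released elsewhere."""

    def __init__(self):
        self.q = queue.Queue()
        self.released = False

    def enqueue(self, item):
        if self.released:
            raise ReleasedError()
        self.q.put(item)

def enqueue_best_effort(ref, item):
    # No pretest: attempt the operation and deliberately drop the
    # "invalid reference" error, since the caller does not care about it.
    try:
        ref.enqueue(item)
    except ReleasedError:
        pass

r = Ref()
enqueue_best_effort(r, "event 1")  # listener present: item is queued
r.released = True
enqueue_best_effort(r, "event 2")  # listener gone: silently skipped
```

Only the specific "reference is gone" error is swallowed here; any other failure would still propagate.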

asbo · March 13, 2012

Because, in general, the whole point of "check if the refnum is valid and only do the operation if it is valid" is from use cases where you don't want an error if the refnum is invalid. So doing the op but ignoring the error gets you that effect.

I was curious about this too. I think it's going to vary from developer to developer because personally I'd want some feedback if the enqueue failed. One of the modules we use internally has two error outputs on its action VIs: one for the outcome of the action of the VI and one for the pass-through and/or generic logic of the VI. Sometimes the terminals get merged, sometimes they are handled separately, sometimes the action error just gets discarded, but it's nice to have that flexibility.

In short, I definitely agree with "the whole point", but think it's valuable to expose the error output anyway.

Aristos Queue · March 13, 2012

Asbo: You miss the point... these are cases where you *don't care* whether the enqueue succeeds or not. Things such as "I am a task that fires this event if there's someone listening... if there's no one listening anymore, I keep doing my work." Literally, you don't want *any* notification if the thing fails. That's almost always the reason that I see "Not A Refnum" included in people's code.
