What can kill a queue?

John Lokanis · October 16, 2008

QUOTE (nighteagle @ Oct 15 2008, 04:52 AM)

So my question is: is the VI shown above somewhere inside a timed structure?

No, we do not use any timed loops, only the standard producer-consumer while loop structures. We do a lot of dynamic launching of reentrant VIs, however. I really think this is a memory corruption issue. We see this happen sporadically, and it is always a queue, notifier, or user event reference that has gone bad.

I am still on 8.5, not 8.5.1. I only see this in deployed apps running under the RTE.

Our next plans are to upgrade to 8.6 and see if the problem goes away.

John Lokanis · October 17, 2008

This just gets more interesting. I have now trapped another 'impossible' error. The reference to my top level VI became invalid inside a sub-vi. I cannot think of any way this could happen. I derive the reference in the top VI via the 'This VI' static reference node and then pass it to the sub-vi. The error I get (very rarely even though this sub-vi is called 1000's of times a day by this top level VI) is:

Error 1055 occurred at Property Node in <this VI's Path>

Possible reason(s):

LabVIEW: Object reference is invalid.

How is this even possible? The reference input to the sub-vi is a required input. And I have traced the code to ensure that it is not being closed somewhere else or passed through a non-iterating for loop. If the sub-vi is running, how could its caller's ref go bad?

See the attached VI for an example of the code.

Download File:post-2411-1224117137.vi

-John

crelf · October 17, 2008

QUOTE (jlokanis @ Oct 15 2008, 08:34 PM)

This just gets more interesting. I have now trapped another 'impossible' error.

Random question: could you have a RAM hardware issue? Or maybe something in the OS is screwing with you.

John Lokanis · October 17, 2008

QUOTE (crelf @ Oct 15 2008, 05:51 PM)

Random question: could you have a RAM hardware issue? Or maybe something in the OS is screwing with you.

I am seeing this on multiple machines, and we have taken the machines offline and run full memory diagnostics. I think the LV 8.5 RTE is just plain buggy. At this point, I have done everything I can in the code. I now need to take the leap and just upgrade to 8.6 and cross my fingers...

Michael Aivaliotis · October 17, 2008

I just stumbled across this thread and have some comments. I find queues to be the most rock solid feature in LV. Most of the time, when they do cause problems it's because I had an oversight and the VI that created the queue went idle so the queue died.

One thing that I see as a red flag is the usage of shared clones. Disable shared clones, rebuild it and give it a spin.

John Lokanis · October 17, 2008

QUOTE (Michael_Aivaliotis @ Oct 15 2008, 10:26 PM)

One thing that I see as a red flag is the usage of shared clones. Disable shared clones, rebuild it and give it a spin.

Thanks for the tip, but can you tell me why that is a red flag? Is there something about the shared clone vs. preallocated clone that is bad? I was under the impression that shared clones were good when you used a lot of reentrant VIs because they reduced the memory footprint. As long as you avoided FuncGlob types of VIs where previous states mattered, they should work the same, right?

I will give this a try but I would like to know more about the difference so I can educate myself.

thanks,

-John

Michael Aivaliotis · October 18, 2008

QUOTE (jlokanis @ Oct 16 2008, 12:25 AM)

Thanks for the tip, but can you tell me why that is a red flag? Is there something about the shared clone vs. preallocated clone that is bad? I was under the impression that shared clones were good when you used a lot of reentrant VIs because they reduced the memory footprint. As long as you avoided FuncGlob types of VIs where previous states mattered, they should work the same, right?
I will give this a try but I would like to know more about the difference so I can educate myself.

thanks,

-John

If you've really been using LabVIEW since 1993 as your profile states, then you've also worked through many versions of LabVIEW. From my experience, if things have been working fine, you look at what changed. Specifically, you look at what changed in this version of LV and ask yourself, "Am I using any of the new features?"

I've been burned many times over the years when jumping onto new LV features, and the shared clone feature was added in LV 8.5. I haven't personally had problems with it but let's just say, I have a hunch.

In the end, using preallocate clone does not change the functionality of your program, so just give it a shot. If you still have the problem then at least you have eliminated one question in your mind, "could it be that?"

John Lokanis · October 18, 2008

QUOTE (Michael_Aivaliotis @ Oct 16 2008, 03:53 PM)

If you've really been using LabVIEW since 1993 as your profile states, then you've also worked through many versions of LabVIEW. From my experience, if things have been working fine, you look at what changed. Specifically, you look at what changed in this version of LV and ask yourself, "Am I using any of the new features?"
I've been burned many times over the years when jumping onto new LV features, and the shared clone feature was added in LV 8.5. I haven't personally had problems with it but let's just say, I have a hunch.

In the end, using preallocate clone does not change the functionality of your program, so just give it a shot. If you still have the problem then at least you have eliminated one question in your mind, "could it be that?"

Yes, that is a good point. I actually saw a big improvement when I moved to 8.5 from 8.2. The application seemed to speed up dramatically on multi-core machines. I also changed all the VIs to shared mode because it seemed to make more sense from a memory allocation perspective.

I have converted the whole app back to preallocate and it does not seem to do any harm so far. I will need to run it through the test release process and deploy it to production for a while before I know if the problem is gone.

I just wish someone at NI could give me a 'warm fuzzy' about this by saying 'yes, there are some bugs in the garbage collection routines in the 8.5 RTE'. It would be even nicer if they also said this was fixed in 8.5.1 or 8.6.

I have seen my share of LV bugs going back to version 3.1. Some critical and some just annoying. This one is the most bizarre, however, because it is so hard to reproduce.

I will post the results after a few days of testing...

-John

John Lokanis · November 6, 2008

QUOTE (jlokanis @ Oct 16 2008, 03:02 PM)

I will post the results after a few days of testing...

Well, changing to the pre-allocate clone method does seem to have corrected the memory corruption issue. So, beware of shared clones! Especially if you are doing a lot of dynamic launching.

Now, unfortunately, I am dealing with a speed issue, likely caused by all that allocation each time I create a new clone of a reentrant VI. I have posted a new thread on this topic, so I will see if anyone has ideas on this one.

Thanks for all the comments and help on this bug. I will follow up with NI to see if they can repro this and fix it in LV 8.7 or whatever the next version is...

-John

Aristos Queue · February 5, 2009

This has been confirmed as CAR 136680.

Summary: "Queue/Notifier references can become unexpectedly invalid when using "Force Destroy" option for long-running VIs"

Details: A bug has been discovered with the queue and notifier refnums that affects all versions of LabVIEW from 6.1.0 through 8.6.1. It can only occur when you have multiple top-level VIs that are all using queues or notifiers AND you occasionally use the "force destroy?" option when releasing refnums AND your VIs run for an extremely long time without stopping. If you are seeing queue or notifier refnums becoming invalid unexpectedly, this bug may apply to you.

When you use Obtain Queue or Obtain Notifier, LabVIEW records that refnum so that if the VI that allocated it goes idle (stops running), the refnum will be automatically released. If you call Release Queue or Release Notifier explicitly, then the refnum is released and is removed from the list of refnums that need to be automatically cleaned up.

However, when Release Queue or Release Notifier is called with the "force destroy?" option set to TRUE, all refnums that refer to that same named queue are invalidated, but only the refnum that is explicitly passed to the Release function is removed from the automatic cleanup list. As a result, all of those refnum values become available for LabVIEW to reuse in future Obtain calls made by other VIs, while some of them are still on the first VI's cleanup list. So a first top-level VI can obtain multiple references to the same named item, then "force destroy?" those references. A second top-level VI can then obtain a refnum to a different item. When the first top-level VI goes idle, it automatically destroys the refnums that were incorrectly left on its list. The refnums that are still in use by the other top-level VI then return errors when they are used for operations.

Because released refnums are reused only when LV wraps around on the refnum count, the first top-level VI must be running for a significant amount of time before this bug can occur. It has taken as long as three days to manifest in some of the applications that were experiencing problems with this bug.
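For readers who think better in text code, here is a rough Python toy model of the mechanism described above. It is only a sketch under stated assumptions and not how LabVIEW is implemented: `RefnumTable`, `wrap_at`, the `buggy` flag, and the tiny wrap-around count are all hypothetical, and real refnums are more than a simple counter.

```python
class RefnumTable:
    """Toy model: refnum values come from a small counter that wraps around."""

    def __init__(self, wrap_at=4):
        self.wrap_at = wrap_at   # hypothetical; the real wrap-around is far larger
        self.next_id = 0
        self.live = {}           # refnum -> name of the queue it refers to
        self.cleanup = {}        # owner VI -> refnums to auto-release when idle

    def obtain(self, owner, name):
        # Take the next unused refnum value, wrapping around the counter.
        for _ in range(self.wrap_at):
            refnum = self.next_id
            self.next_id = (self.next_id + 1) % self.wrap_at
            if refnum not in self.live:
                break
        else:
            raise RuntimeError("no free refnums")
        self.live[refnum] = name
        self.cleanup.setdefault(owner, set()).add(refnum)
        return refnum

    def release(self, owner, refnum, force_destroy=False, buggy=True):
        name = self.live[refnum]
        if force_destroy:
            doomed = [r for r, n in self.live.items() if n == name]
        else:
            doomed = [refnum]
        for r in doomed:
            del self.live[r]
        if buggy:
            # Modelled bug: only the refnum passed in leaves the cleanup list.
            self.cleanup[owner].discard(refnum)
        else:
            for refs in self.cleanup.values():
                refs.difference_update(doomed)

    def go_idle(self, owner):
        # Auto-release everything still recorded for this owner.
        for refnum in self.cleanup.pop(owner, set()):
            self.live.pop(refnum, None)

    def is_valid(self, refnum):
        return refnum in self.live


def demo(buggy):
    table = RefnumTable(wrap_at=4)
    a = table.obtain("VI_A", "shared")
    table.obtain("VI_A", "shared")  # second reference to the same named queue
    table.release("VI_A", a, force_destroy=True, buggy=buggy)
    # VI_B keeps obtaining until the counter wraps and old values are reused.
    refs = [table.obtain("VI_B", f"other{i}") for i in range(4)]
    table.go_idle("VI_A")
    return all(table.is_valid(r) for r in refs)
```

In this model, `demo(buggy=True)` reports that one of VI_B's refnums went invalid after VI_A went idle, while `demo(buggy=False)` leaves all of them valid. The small `wrap_at` only makes the wrap-around happen immediately instead of after days of running.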

Workaround: Avoid using "force destroy" to destroy all the refnums for a given item (queue or notifier). One solution to this is to use the NamedRefnumManager.vi in the attached VI file. Unzip the attachment and take a look at "DemoWorkaround.vi" for more information. Download File:post-5877-1233763916.zip

Status: This bug was not isolated in time to be included in the 8.6.1 bug fix release. It will be fixed in the next release of LabVIEW.

Grampa_of_Oliva_n_Eden · February 5, 2009

QUOTE (Aristos Queue @ Feb 4 2009, 11:12 AM)

This has been confirmed as CAR 136680.
Summary: "Queue/Notifier references can become unexpectedly invalid when using "Force Destroy" option for long-running VIs"

...

Thank you very much for that update!

I believe I just delivered an app with this bug (that I was writing off as an issue with loading and unloading templates; I was setting up a half-dozen queues for each and destroying them when unloading). My observations fit the above description if the "wrap-around" occurred after about 1000 queues were created.

So....

What is the number of queue create/destroys that are required to hit this bug?

Ben

jdunham · February 5, 2009

QUOTE (Aristos Queue @ Feb 4 2009, 08:12 AM)

This has been confirmed as CAR 136680.

We have been vexed by this too, though it has been difficult to isolate into something we can report to NI. Thanks for the update!

John Lokanis · February 5, 2009

QUOTE (Aristos Queue @ Feb 4 2009, 08:12 AM)

Status: This bug was not isolated in time to be included in the 8.6.1 bug fix release. It will be fixed in the next release of LabVIEW.

Yes, I have been working with NI support on this for what seems like a year now. It is unfortunate we were not able to find the root cause in time for the 8.6.1 release. I am just glad they have finally confirmed the existence of this bug that has plagued my code for so long. I was beginning to feel like Hurley on Lost. Was I just crazy or was there really something wrong in the LabVIEW libraries?

Too bad it couldn't have been called CAR 4815162342.

One benefit of this has been to review my entire architecture and change it to no longer share queue references between threads. Instead, every queue in a VIT gets namespaced, and then every sub-thread obtains and destroys its own reference to that queue. This eliminates the need to force destroy a queue to clean things up.
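As a rough illustration of that pattern outside LabVIEW, here is a minimal Python sketch of a reference-counted registry of named queues. Everything in it (`NamedQueues`, `worker`, the `clone_7/commands` name) is hypothetical and only mirrors the idea: each thread obtains and releases its own reference by name, and the queue goes away when the last reference is released, so nothing ever needs a force destroy.

```python
import queue
import threading


class NamedQueues:
    """Hypothetical registry: one queue per name, destroyed at refcount zero."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}  # name -> [queue, reference count]

    def obtain(self, name):
        with self._lock:
            if name not in self._entries:
                self._entries[name] = [queue.Queue(), 0]
            entry = self._entries[name]
            entry[1] += 1
            return entry[0]

    def release(self, name):
        with self._lock:
            entry = self._entries[name]
            entry[1] -= 1
            if entry[1] == 0:
                # Last reference gone: the queue is destroyed normally.
                del self._entries[name]

    def exists(self, name):
        with self._lock:
            return name in self._entries


def worker(registry, name, item):
    # Each thread gets its own reference by name and releases only that one.
    q = registry.obtain(name)
    try:
        q.put(item)
    finally:
        registry.release(name)


def demo():
    registry = NamedQueues()
    name = "clone_7/commands"  # namespaced per clone, as an assumption
    owner_q = registry.obtain(name)
    threads = [
        threading.Thread(target=worker, args=(registry, name, i))
        for i in range(3)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    received = sorted(owner_q.get() for _ in range(3))
    alive_before = registry.exists(name)
    registry.release(name)
    return received, alive_before, registry.exists(name)
```

Here the queue stays alive while the owner still holds its reference, even after every worker has released its own, and disappears only once the owner releases too.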

-John

Grampa_of_Oliva_n_Eden · February 5, 2009

QUOTE (jlokanis @ Feb 4 2009, 12:26 PM)

Yes, I have been working with NI support on this for what seems like a year now. ....
-John

Thanks John for doing so. I nominate you for the "Voice Crying in the Wilderness Award".

Ben

Aristos Queue · February 5, 2009

QUOTE (neBulus @ Feb 4 2009, 10:49 AM)

What is the number of queue create/destroys that are required to hit this bug?

The minimum is slightly above 4000 queues, although you probably won't see anything that soon because the refnum is made up of three components, one that is a cycling number and one that is based on something like the tick count and one that is random. Probability is in your favor until you get up around 10000 refnums of the same kind (meaning that if you create 5000 queues and 5000 notifiers, you're probably still safe on both).

Grampa_of_Oliva_n_Eden · February 5, 2009

QUOTE (Aristos Queue @ Feb 4 2009, 12:54 PM)

The minimum is slightly above 4000 queues, although you probably won't see anything that soon because the refnum is made up of three components, one that is a cycling number and one that is based on something like the tick count and one that is random. Probability is in your favor until you get up around 10000 refnums of the same kind (meaning that if you create 5000 queues and 5000 notifiers, you're probably still safe on both).

Thank you. That's the right order of magnitude for what I was seeing*.

Ben

* Control on the Fly app where each object had command response, subscription and update queues, with 100 objects in both screens. Objects (and their queues) were created and destroyed (using the destroy switch). Most of the time I saw no trouble, but if I was loading and unloading large designs (switching from operate mode to edit mode removed all of the old set and reloaded the new set, which required about 800 destroy/create queue operations), it was only a matter of time until things "just stopped happening" and my event logger started complaining about invalid queue refs.

Matteo.T · December 16, 2020

Hi there, I have been using LabVIEW 2018 for a couple of months for my master's thesis, so I am not an expert. I am co-simulating LabVIEW and Multisim and simulating an FPGA target on the desktop with the "Desktop Execution Node". After many minutes of simulation I always encounter this error: "Error 1055 occurred at Property Node in niFpgaGetTargetClassFromTarget.vi:6910002->niFpgaDEN_Execute.vi:32001->Testbench_base.vi Possible Reason(s): LabVIEW: (Hex 0x41F) Object reference is invalid". I couldn't figure out why it appears only long after the start of the simulation, and I have no idea how to fix it. I would be very glad if anyone had a suggestion. Thanks in advance!

LogMAN · December 17, 2020

@Matteo.T You need to start a new topic for your question, it doesn't belong to this thread.

Neil Pate · December 17, 2020

But I am actually grateful for the necro-ing of this thread, as I had not seen it the first time and was interested to read it.

Matteo.T · December 22, 2020

On 12/17/2020 at 8:07 AM, LogMAN said:

@Matteo.T You need to start a new topic for your question, it doesn't belong to this thread.

OK, thank you, I will do it.
