Timeout of -1 Queue Performance on RT


I'm seeing some weird behavior on RT in my SoftMotion application. SoftMotion is throwing errors as if it's not getting enough CPU to perform its tasks, but my CPU usage on the RIO is only in the 50% range.

 

I have several (6-10) concurrent tasks running in my RT code. Most of them are idle, however, sitting on a dequeue with a timeout of -1. My communication manager loop waits for a message from the PC, and when one arrives it enqueues a message to one of the concurrent tasks. That task does a little casting as needed, then starts executing a job that can take anywhere from 5 sec to 5 min. Some of these jobs involve time-critical subcomponents. None of the time-critical parts run for longer than 30 seconds.

 

So as someone who's more experienced with desktop programming, all those dequeues look like 0% processor usage to me. And judging by the Distributed System Manager, the cRIO seems to agree. When there's nothing going on, my CPU usage drops down to the 15%-20% range. But I'm still getting weird SoftMotion errors (like -77055).

 

So my question is: Does a dequeue with a -1 timeout tie up the CPU in RT, even though the usage doesn't seem to show it?

Link to comment

Dequeue with -1 timeout should use virtually zero of your CPU resources. You can rely on this mechanism on RT the same as you do on the PC.
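
As a rough desktop analogy (not LabVIEW RT code), the same idea can be sketched in Python: a consumer thread parked on a blocking `get()` with no timeout, while the process CPU time is measured. The names `worker` and `measure_idle_cpu` are made up for this illustration, and the numbers it produces say nothing about a cRIO.

```python
import queue
import threading
import time


def worker(q, results):
    # Blocks indefinitely, comparable to a dequeue with a timeout of -1.
    msg = q.get()  # block=True, timeout=None by default
    results.append(msg)


def measure_idle_cpu(wait_s=0.3):
    """Return (CPU seconds used while the consumer was blocked, messages received)."""
    q = queue.Queue()
    results = []
    t = threading.Thread(target=worker, args=(q, results))
    t.start()

    cpu_before = time.process_time()   # CPU time of the whole process, all threads
    time.sleep(wait_s)                 # consumer sits in q.get() for this whole interval
    cpu_used = time.process_time() - cpu_before

    q.put("start task")                # wake the consumer so the thread can finish
    t.join()
    return cpu_used, results


if __name__ == "__main__":
    cpu_used, results = measure_idle_cpu()
    print(f"CPU seconds used while blocked: {cpu_used:.4f}, received: {results}")
```

On a desktop OS the CPU time used during the wait should be a tiny fraction of the wall-clock wait, because the blocked thread is descheduled rather than spinning.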

 

RT-FIFOs do have the option of making reads and writes either polling (using CPU) or blocking (CPU sleeping), but since you say you are using queues and not RT-FIFOs, this does not apply to your application.

 

Is it possible that something else is going on that periodically causes a big CPU spike? If it is a quick event you may not notice it in your CPU profiling, but some other bit of code may be affected.
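
To illustrate why a short spike can hide in an averaged CPU figure, here is a small Python sketch on a made-up trace. The trace, the 1 ms sample spacing, and the helper `window_loads` are all assumptions for the example, not measurements from any real target.

```python
def window_loads(busy, window):
    """Mean load (0..1) for each consecutive, non-overlapping window of samples."""
    return [sum(busy[i:i + window]) / window
            for i in range(0, len(busy) - window + 1, window)]


# Hypothetical trace: one sample per millisecond, 1 = CPU busy, 0 = idle.
# Baseline: busy 1 ms out of every 5 (20 %), plus one 20 ms burst at full load.
trace = [1 if i % 5 == 0 else 0 for i in range(1000)]
for i in range(400, 420):
    trace[i] = 1

coarse = window_loads(trace, 1000)  # a single one-second average
fine = window_loads(trace, 10)      # one hundred 10 ms averages

if __name__ == "__main__":
    print("1 s average load:", coarse[0])
    print("worst 10 ms window:", max(fine))
```

With this trace the one-second average comes out at about 22%, while the worst 10 ms window is fully loaded. A task with a deadline shorter than the burst would be starved even though the averaged usage looks healthy.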

 

The other culprit could be timed loops. Are you using any?

Link to comment

Thanks for the response.

 

Yes, I'm using timed loops. Most of the time there is only 1 timed loop running; it is responsible for broadcasting some communication values up to the PC. It's set to a low priority and its period is 200 ms. I've recently gone back and switched my low-priority timed loops out for while loops with waits, and it doesn't seem to be helping at all (although I've only run it once or twice). Related follow-up question: if a timed loop only has a little bit of work to do in a longer period (say 2 ms of work in a 10 ms period), is this affecting my CPU more than I would think?

 

I've done some CPU profiling with the RT trace toolkit and there don't seem to be any spikes. I usually have a pretty healthy amount of time in the "VxWorks Null thread".

 

I'm currently investigating the possibility of a bad 9512 module (admittedly it's a long shot). I have 5 axes that I need to control. 4 of them are on an Ethernet RIO and all work fine; the one axis that is plugged into the cRIO is the one giving me problems.

Link to comment
What cRIO are you using?

I've seen very bad performance with the older range of cRIOs (VxWorks), and was told that this was because they had a very limited number of threads.

The point was that anything with a sleep function, like Timed Loop, Dequeue, Enqueue etc., caused the performance to go down and caused a lot of jitter.

 

Our solution was to move to a more poll-based design rather than an interrupt-driven one.

 

/J

Link to comment

I think the older cRIOs ran Phar Lap ETS, and it's the newer ones that run VxWorks? But maybe I have it backwards. A bit off-topic, but I had a possibly related issue with code on a newer sbRIO (running VxWorks, if I remember correctly) where it crashed or wouldn't run an executable that used more than one execution subsystem. I was in a rush to get it working and didn't have time to investigate or report the problem, so I just set everything to the same execution subsystem and it ran fine.

Link to comment
Do you mean resetting the system clock on the cRIO?

Yeah, I was trying to sync the cRIO's time with the connecting PC's time, so I could use "Get Date/Time in Seconds" on the cRIO.

 

Sync'ed time was a "nice to have but not necessary" feature, so I just scrapped it.

Link to comment

I recently had some nasty issues doing something very similar. I have reasonably solid evidence that setting the system clock too often could cause the cRIO to crash. I was trying to sync the cRIO clock from a master PC.

 

Something I wanted to try (but never got around to) was to use a time sync protocol rather than doing it "manually" in the RT code.

Link to comment
 We have some systems using 1588 to sync clocks, haven't noticed any problems with that under VxWorks.

 

That, I think, is the root of my problem: I was *not* using 1588 (which I presume you set up in the RT system ini file?), but rather getting time updates over UDP and adjusting the system clock manually (i.e. in code). Doing this too quickly (i.e. more than a couple of times a minute) seemed to cause my RT code to crash horribly (some bits were still running, but most of it was not).
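
For anyone who does end up setting the clock from code, one way to avoid doing it too often is to throttle the corrections. The Python sketch below is only an illustration of that idea, not what was actually running on the cRIOs: `ClockSyncThrottle`, the `set_clock` callback, and the 60 s / 50 ms defaults are all hypothetical, and the thread does not establish what rate is actually safe.

```python
import time


class ClockSyncThrottle:
    """Limits how often a clock correction is applied (illustrative sketch)."""

    def __init__(self, set_clock, min_interval_s=60.0, min_offset_s=0.05,
                 now=time.monotonic):
        self._set_clock = set_clock          # hypothetical callback that sets the system clock
        self._min_interval_s = min_interval_s
        self._min_offset_s = min_offset_s
        self._now = now                      # monotonic, so unaffected by the clock being set
        self._last_set = None

    def offer(self, reference_time_s, local_time_s):
        """Apply the reference time if allowed; return True when the clock was set."""
        if abs(reference_time_s - local_time_s) < self._min_offset_s:
            return False                     # close enough, leave the clock alone
        t = self._now()
        if self._last_set is not None and t - self._last_set < self._min_interval_s:
            return False                     # too soon after the previous correction
        self._set_clock(reference_time_s)
        self._last_set = t
        return True
```

The interval is measured with a monotonic timer rather than the wall clock, since the wall clock is the very thing being adjusted. Each incoming UDP time update would be passed to `offer()`, and most of them would simply be ignored.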

 

This happened on a bunch of new cRIO-9024s.

Link to comment
1588 (which I presume you set up in the RT system ini file?)

 

Had to install the right software to the cRIO, then enable it via MAX (Time Settings tab, under Time Synchronization).  Then some tweaks to the configuration so that we had a consistent master, in order to avoid a situation where on every power cycle, a different clock in our network became master (which caused the time to jump around a lot more than we wanted).

 

Also using cRIO-9024 and 9025s.

Edited by MikeC3
Link to comment
Well, my cRIOs seem to be stable now. I have nine of them running a long-term test, with uptimes of about 60 days so far, so hopefully my issues are resolved.

 

Next time I will definitely go the time sync route.

Link to comment
