
async call by ref workers



Hey all,

 

I've spent a little time here and there working on this and I figured now was the right time to ask for feedback.

 

Typically when making a new UI I'll use something like AMC and have a producer (the event structure) and a consumer (a QMH). This is the standard template in AMC (image here) and it's also used in, for example, the sample projects. This is OK and has done well for a long time, but there are weak points. (a) The QMH can get clogged. After all, you're sending all of your work down there, and if something is slow, the consumer will run slow. (b) This pattern seems to always end up with a weird subset of state and functionality shared between the two loops. For example, maybe your UI is set up to disable some inputs in state X, except that it's your QMH, not your UI loop, which determines that you're in state X. So maybe you send a message to the QMH, it takes some time, and your user is able to press buttons they shouldn't be able to. You fix this by putting the disable code in your UI loop, but then you need both loops to know that you're in state X. Another example: if you're using features like right-click menus, you need to share state between the UI and the QMH so you can generate the appropriate right-click menu. There are many examples like this. None of them is particularly heartbreaking, but my hope is that this is a better way.
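For anyone reading along without LabVIEW open, point (a) can be shown with a rough Python sketch. This is not LabVIEW and not AMC's API; `run_consumer` and the `"slow"`/`"fast"` message names are made up for illustration. The only claim is that a quick message queued behind a slow one has to wait for it when there is a single consumer loop.

```python
import queue
import threading
import time

def run_consumer(messages, slow_seconds=0.2):
    """Process messages one at a time in a single consumer loop.

    Returns {message name: seconds from enqueue to completion}.
    """
    q = queue.Queue()
    latency = {}

    def consumer():
        while True:
            item = q.get()
            if item is None:          # sentinel: stop the loop
                break
            name, enqueued_at = item
            if name == "slow":
                time.sleep(slow_seconds)   # stand-in for slow work
            latency[name] = time.monotonic() - enqueued_at

    worker = threading.Thread(target=consumer)
    worker.start()
    for name in messages:
        q.put((name, time.monotonic()))
    q.put(None)
    worker.join()
    return latency

# The "fast" message does no work itself, yet it inherits the delay
# of the "slow" message that was queued ahead of it.
latency = run_consumer(["slow", "fast"])
```
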

 

At one point a few months ago I was in a conversation with R&D about events and we got onto some of these issues. Aristos and some others pointed out this was basically making two UI threads and suggested pulling everything back into a single loop (just the event handler) but then using async call by ref to take care of all the work that takes more than 200 ms (or whatever you personally consider the cutoff to be). This solves both problems because (a) async call by ref has a pool of VI instances it can use, so the code never blocks and (b) you only have one loop for the UI and associated state information, so there are fewer chances for weird situations.
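As a rough text-language analogy of that single-loop design (a sketch under assumptions, not the prototype library): one loop owns all UI state, and anything slow goes to a worker pool that only posts a result event back. Here `ThreadPoolExecutor` stands in for the async call-by-ref clone pool, and `ui_loop`, `fetch_site` and the event names are hypothetical.

```python
import queue
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_site(url):
    """Hypothetical slow job, standing in for anything over the cutoff."""
    time.sleep(0.05)
    return f"fetched {url}"

def ui_loop(user_events):
    """Single loop that owns all UI state; slow work is handed to a pool."""
    events = queue.Queue()
    for event in user_events:
        events.put(event)

    state = {"busy": 0, "results": []}   # touched only by this loop
    closing = False
    with ThreadPoolExecutor(max_workers=4) as pool:
        while not (closing and state["busy"] == 0):
            kind, payload = events.get()
            if kind == "click":
                state["busy"] += 1       # the loop itself knows work is in flight
                future = pool.submit(fetch_site, payload)
                # The worker only posts an event back; it never touches state.
                future.add_done_callback(
                    lambda f: events.put(("result", f.result())))
            elif kind == "result":
                state["busy"] -= 1
                state["results"].append(payload)
            elif kind == "quit":
                closing = True           # exit once in-flight work has reported
    return state

final_state = ui_loop([("click", "a"), ("click", "b"), ("quit", None)])
```

Because `state["busy"]` lives in the same loop that handles clicks, the decision to disable inputs never has to be shared with a second loop.
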

 

Since the code for doing all that manually is kind of tedious, I put together this prototype library to hopefully make the above design really really easy.

 

Feedback I am looking for:

-is this a worthwhile pursuit at all? (i.e. do you agree with the first couple paragraphs above?)

-has this been done before? (I searched and searched but I may have missed something)

-any thoughts on this first draft at implementation?

 


The code is here and examples are in the project or here. The main example is "example UI get websites" but this example also requires the lovely variant repository. Not for any particular reason, I just like it. There are more details about the code in the readme.

 


 

 

Link to comment

Have a look at the “Async Action.lvlib:Action.lvclass” in my “Messenger Library” on the Tools Network, as well as child classes such as “Time Delayed Message”, “Async File Dialog”, or “Address Watchdog”.  This is my OO implementation of what you're describing, where child classes override “Execute.vi”, and optionally “Setup.vi” (run before Async Call).

 

However, though I have 10 asynchronous actions built into “Messenger Library”, and can create new ones easily, I don’t generally make application-specific ones.  Instead, I use either an on-diagram “helper” loop, or have an independent subactor.  The pattern could be described as a “Manager” (an event- and message-handling loop with state data) and a number of “Workers” (specialized “helper” loops or subactors).

 

Asynchronous actions are still very worthwhile pursuing, though, as things like a delayed enqueue or an asynchronous dialog box are very valuable.

 

—James

 

BTW> The “Producer-Consumer” QMH templates produced by NI are to be avoided, for all sorts of reasons.  I just did a talk on this for the CSLUG user group on Monday; there is a recording of it in their Google+ account (I mentioned the issue of state-sharing between the loops, but my primary criticism was the horrifying potential for race conditions).

Link to comment


I thought about yours and AF before moving forward on this and decided it still made sense as more of a loop co-processor than as a dedicated logical actor. That is, it's more of an off-diagram "helper" loop, in your terminology.

 

I may need to go back and look at the code, though, as I was under the impression that everything in there was an actor and I wanted to avoid that because you still have the problem of clogging the QMH. If every instance is its own async call then I think I must have just missed the right spot in the code or misunderstood.

 

Looking at it again, now that I'm looking in the right place, it looks like yours does most of the same stuff mine does. I had been under the impression that your library was more focused on communicating between actors, but now I see it's way more general-purpose.

 

 

 


I can't access the google+ or youtube page. Could you upload the slides here or just describe the race condition problem? I think we're talking about similar sets of issues but I don't see them as being all that horrible, so I'm curious why you're so against the idea. I tend to think it just ends up being a lot more work than it needs to be to make good code.

Link to comment

Some comments after having a look at the code:

1) Is there value in the Execution System stuff?  I’ve never discovered much reason to mess with Execution Systems.  It would be simpler (and possibly less overall overhead) to have a single clone pool for all Async stuff, created on first use as in the Actor Framework or “Messenger Library”.

 

2) Consider combining the “Action” and “Function” classes (and thus making Actions children of “Task”).  This reduces the number of wire types and would make Tasks fully recursive (so a Batch can contain other Batches).

 

3) Do you really need the Variant Parameter stuff in “Action”, given that children of Action can just add whatever parameters they want in private data?
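Points 2 and 3 are easier to see in a text language, so here is a loose Python sketch of the idea (class names `Task`, `Call` and `Batch` are illustrative and are not the classes in either library). Everything is a `Task`, a `Batch` is itself a `Task` so batches nest, and each child keeps its own parameters in its own fields rather than in a generic variant input.

```python
from concurrent.futures import ThreadPoolExecutor

class Task:
    """Common parent: anything that can be executed asynchronously."""
    def execute(self):
        raise NotImplementedError

class Call(Task):
    """Leaf task; its parameters live in its own private data."""
    def __init__(self, func, *args):
        self._func = func
        self._args = args

    def execute(self):
        return self._func(*self._args)

class Batch(Task):
    """A task made of tasks, so a Batch can contain other Batches."""
    def __init__(self, *tasks):
        self._tasks = tasks

    def execute(self):
        return [task.execute() for task in self._tasks]

nested = Batch(Call(pow, 2, 3), Batch(Call(len, "abc"), Call(str.upper, "x")))

with ThreadPoolExecutor(max_workers=2) as pool:
    # One wire type: the pool only ever sees Task.execute.
    nested_result = pool.submit(nested.execute).result()
```
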

 

— James

Link to comment

We have a lot of third-party IP in dlls here so the Execution System wrappers would help us out a lot (and avoid continuously customizing ini files to extend the number of threads per system). But agreed this wouldn't be beneficial to everyone.

Edited by ak_nz
Link to comment


1- ^^what he said, it's mostly just there if you know you're going to block an entire thread doing something. For example, with HTTP Get I believe it's calling a DLL, so you're blocking a thread during that process. Same thing with some of the other I/O types. It's not clearly documented, but those inputs can be completely ignored and it will automatically create a pool of size 10 on the standard exec system and always run it there.

 

2-I thought about that one a lot and went back and forth. On the one hand I liked the idea of batches of batches, and of course you can still do that with your own tasks. But I figured that it could be handy to focus things down somewhat, which is why I added the actions. That way people who are afraid of objects can use callbyref actions, people who are ok with objects can make new actions, and people who want to use all the features can make tasks.

--> At the same time, given that the inputs are basically the same, I kind of see what you mean. It probably makes sense to merge them.

 

3-I also went back and forth on this. First, you're absolutely right. But... it kind of simplifies things to always have a 'parameter' input that you can call on any type of action. I think the best solution would probably be to remove it from the parent action (which, combined with #2 above, would basically mean I delete action entirely) but leave it on the callbyref class, since that's supposed to be the easiest to use and there has to be a generic parameter input on that one anyway.

Link to comment

Ok I made the changes you suggested and I think I like it better this way.

 

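Also I realized I forgot to address one point on your #1 (whether there is value in the Execution System stuff versus a single clone pool created on first use).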

 

 


 

Even if I did change it to just support the standard execution system, I still prefer having a specific 'context' or whatever that everything runs on, rather than using what is basically an inaccessible FGV. If you want a semi-real reason, it's this: the async call pool can only grow, which makes it kind of scary to me to use in a long-running application unless you have the ability to shut down the entire clone pool (which I don't think you can do in AF or yours). Having a separate reference to a specific clone pool means that as you launch or shut down parts of your application you could launch and shut down the paired clone pool. Not a huge deal, but it just makes me feel more comfortable using it.
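In text-language terms, the 'context' idea looks roughly like the Python sketch below. It is only an analogy: `Subsystem` is a hypothetical name, `ThreadPoolExecutor` stands in for a clone pool, and the default of 10 workers just mirrors the default pool size mentioned earlier. The point is that the pool has an owner and is shut down together with that part of the application, instead of living in a global that nothing can reach.

```python
from concurrent.futures import ThreadPoolExecutor

class Subsystem:
    """One part of the application, paired with its own worker pool."""

    def __init__(self, name, workers=10):
        self.name = name
        self._pool = ThreadPoolExecutor(
            max_workers=workers, thread_name_prefix=name)

    def run_task(self, func, *args):
        """Hand work to this subsystem's pool; returns a future."""
        return self._pool.submit(func, *args)

    def close(self):
        """Shut down the paired pool along with the subsystem."""
        self._pool.shutdown(wait=True)

logger_part = Subsystem("logger")
total = logger_part.run_task(sum, [1, 2, 3]).result()
logger_part.close()

# After close() the pool is gone; new work is rejected rather than
# silently growing a pool nobody owns.
try:
    logger_part.run_task(sum, [4, 5])
    rejected = False
except RuntimeError:
    rejected = True
```
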

Link to comment


Well, clone pools are a “high water mark” type thing; you will never have more than the maximum number that you had running at any one time.  I’ve done testing with my Async Actions and Actors, and running a few thousand is no issue.  Note that any shared-reentrant subVIs called by your top-level “function” won’t be cleaned up when you release the pool, so I’m not sure if you can reliably recover from a “do a million things at once” bug.  Whatever you do, you need to have a simple high-level API for the new User, with a minimum of different new wire types and “Create so-and-so” calls, even if you have a more complex low-level API semi-hidden in an “Advanced” palette.

Link to comment

BTW, I didn’t comment on the Cancelation-token functionality.  I don’t have that, because once I have to send something a message, even if that message is just “cancel”, I consider making the something an actor, which can be extended to new messages as needed.  My “actions” are to be as simple as possible.  They always poll their equivalent of the “results reporter”, and abort if that object dies, since they then have no reason to continue (their job being to send results).  This feature means you don’t have to “cancel” them in order to shut down the application.

 

So I suggest you think about either building the cancelation token up into something more message-like, or eliminating it for simplicity.

 

Added Later>> Check out the “CancelableObserver” class in Messenger Library.  This allows one to make a cancelable “forwarding address” out of any other standard address.  You could do this for your “Results Reporter”.  Then your Actions can just poll the validity of their Results Reporter instead of a Cancelation token.  Note that it is guaranteed that no message can be sent after you call “cancel” on the communication method, in contrast to calling cancel on a token, where the running Action may have just checked the token and be about to send the results.  The latter behavior can lead to race conditions.
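The difference between the two cancel styles can be sketched in Python. This is an illustration of the guarantee only, not how “CancelableObserver” is actually implemented; `CancelableAddress` and `worker` are hypothetical names. With a separate token the worker does "check, then send" as two steps, so a cancel can land in between. Here the check and the delivery happen under one lock, so once `cancel()` returns nothing more is delivered.

```python
import threading
import time

class CancelableAddress:
    """Forwarding address whose send and cancel are serialized by a lock."""

    def __init__(self, deliver):
        self._deliver = deliver        # e.g. enqueue on the real address
        self._lock = threading.Lock()
        self._cancelled = False

    def send(self, message):
        """Deliver unless cancelled; returns False once the address is dead."""
        with self._lock:
            if self._cancelled:
                return False
            self._deliver(message)     # check and send are one atomic step
            return True

    def cancel(self):
        with self._lock:               # waits for any send already in progress
            self._cancelled = True

def worker(address):
    """Action that aborts as soon as its results address stops accepting."""
    n = 0
    while address.send(n):
        n += 1
        time.sleep(0.001)

inbox = []
address = CancelableAddress(inbox.append)
thread = threading.Thread(target=worker, args=(address,))
thread.start()
time.sleep(0.02)
address.cancel()
count_at_cancel = len(inbox)   # snapshot right after cancel() returns
thread.join()
```

The worker never needs an explicit "cancel" message: it just notices that its results address has died and stops.
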

Link to comment


Meh, you're right. I hadn't thought about that issue...the DD calls will eventually all add up, and they'll be shared across all the call pools probably. Oh well :(

 

On the adv vs. easy API topic what I was considering was creating a FGV which has the same behavior as what you and AF have, so it would initialize a default call pool and provide a simple 'run task' function you can just grab and use. But having a backing API makes me happy :)

 

 

 


 

I've been kind of on the fence about the cancellation thing since I made my UI example, as it's kind of hard to keep track of. It felt like it would be easier to ignore a result than to cancel *and* ignore the partial result. That having been said, my goal was not really to make shutdown faster, just to let the task know we don't care if it finishes...but then we get back to whether there is really a benefit. I tend to think that for my purposes I'd choose to avoid modifying the state of the system, so cancelling really just saves CPU time, which isn't an issue. Since any really long-running tasks (like wait on TCP or whatever) can't be effectively cancelled, it makes me think your suggestion of eliminating it for simplicity is the right one. Will think about it more.

Link to comment

Looks pretty straightforward. On the one hand I like the type safety the object gives you but on the other hand, objects are a pain to use in large quantities. I hate the whole documentation, inheritance, etc process. I know there are some fixes out there but really I'm ok with giving up type safety in exchange for just passing in a vi server reference...of course nothing about your version prevents someone from doing that. May just have to use yours in the future :)

Link to comment


What would be nice would be if LabVIEW supported a multi-part “Call by Reference”, where one could fill the inputs of a function, pass it to an async process for execution, then read the outputs when it comes back.  That would be type-safe and very simple.  For async HTTP Get, you’d just need a reference to HTTP Get.  Might also simplify command messages as in the AF.
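For comparison, this is roughly what the three-step shape (fill inputs, run elsewhere, read outputs) looks like in Python. It is a sketch of the concept, not a LabVIEW feature: `http_get_stub` is a made-up stand-in that does no network I/O, and Python gives none of the compile-time type checking the idea above is after.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial

def http_get_stub(url, timeout_s=1.0):
    """Stand-in for a real HTTP GET; returns (status, body) without any I/O."""
    return 200, f"<html>{url}</html>"

with ThreadPoolExecutor(max_workers=2) as pool:
    # 1) Fill the inputs: bind arguments to the function itself.
    call = partial(http_get_stub, "http://example.com", timeout_s=0.5)
    # 2) Pass it to an async process for execution.
    future = pool.submit(call)
    # 3) Read the outputs when it comes back, with a timeout on the wait.
    status, body = future.result(timeout=2.0)
```
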

Link to comment


So I mean they were trying to get there with call and collect, I just don't think it's very user-friendly. Your concept makes a lot of sense, and it would be handy for all the different types of call-by-ref. Instead of a single node, we'd have "latch values", "run", and "get values", and I suppose we'd have to have a function to "get instance from clone pool". The other usability items I can think of:

  • Timeout on the wait/collect node
  • Easy way to abort reentrant clone pools if we need to shut down (probably solvable if we had a "get instance from clone pool" function).
  • Improved type propagation so if you update your connpane it doesn't break everything in your code (this seems to happen more with objects...to fix this I've resorted to just feeding a variant through everywhere :/)
  • Decorate VI server references with different settings, so you don't have to remember the correct call by ref setting (and so the compiler could check the type for you--if you say "open this for reentrant run" and the VI isn't reentrant, it should break).
  • Some of the functional programming/lambda discussions from one of the other forums would be handy (I was thinking earlier "well hey most of this I could probably do with an xnode" and then I realized that I'd need to make a second VI, and this would be solved if I could script the VI I needed inside of the node...)

How do those sound to you?

Link to comment
  • 4 months later...

I can't access the google+ or youtube page. Could you upload the slides here or just describe the race condition problem? I think we're talking about similar sets of issues but I don't see them as being all that horrible, so I'm curious why you're so against the idea. I tend to think it just ends up being a lot more work than it needs to be to make good code.

I gave a talk at the recent CLD Summit in the UK where I explain the issue.  It is a public video on the CSLUG youtube channel.
