async call by ref workers

smithd · May 21, 2015

Hey all,

I've spent a little time here and there working on this and I figured now was the right time to ask for feedback.

Typically when making a new UI I'll use something like AMC and have a producer (the event structure) and a consumer (a QMH). This is the standard template in AMC (image here) and it's also used in, for example, the sample projects. This is OK and has done well for a long time, but there are weak points. (a) The QMH can get clogged. After all, you're sending all of your work down there, and if something is slow, the consumer will run slow. (b) This pattern seems to always end up with a weird subset of state and functionality shared between the two loops. For example, maybe your UI is set up to disable some inputs in state X, except that it's your QMH, not your UI loop, which determines that you're in state X. So maybe you send a message to the QMH, it takes some time, and your user is able to press buttons they shouldn't be able to. You fix this by putting the disable code in your UI loop, but then you need both loops to know that you're in state X. Another example: if you're using features like right-click menus, you need to share state between the UI and the QMH so you can generate the appropriate right-click menu. There are many examples like this. None of them is particularly heartbreaking, but my hope is that this is a better way.

At one point a few months ago I was in a conversation with R&D about events and we got onto some of these issues. Aristos and some others pointed out this was basically making two UI threads and suggested pulling everything back into a single loop (just the event handler) but then using async call by ref to take care of all the work that takes more than 200 ms (or whatever you personally consider the cutoff to be). This solves both problems because (a) async call by ref has a pool of VI instances it can use, so the code never blocks and (b) you only have one loop for the UI and associated state information, so there are fewer chances for weird situations.
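Since LabVIEW diagrams don't paste well as text, here is a rough Python sketch of the single-loop idea, purely as an illustration. It assumes a thread pool stands in for the async call by ref clone pool and a queue stands in for the event structure; `slow_fetch`, `run_ui_loop`, and the event names are made up for the example and are not part of the library.

```python
import queue
import time
from concurrent.futures import ThreadPoolExecutor


def slow_fetch(url):
    """Hypothetical stand-in for work too slow to run inside the UI loop."""
    time.sleep(0.05)
    return f"fetched {url}"


def run_ui_loop(user_events, pool_size=4):
    """One loop owns all UI state; slow work is handed to a worker pool."""
    events = queue.Queue()
    for ev in user_events:
        events.put(ev)

    state = {"pending": 0, "results": [], "inputs_enabled": True}
    quitting = False

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        while True:
            kind, payload = events.get()
            if kind == "fetch":
                # State changes happen here, in the only loop that owns them.
                state["pending"] += 1
                state["inputs_enabled"] = False
                future = pool.submit(slow_fetch, payload)
                # The worker reports back by posting an event, never by
                # touching UI state directly.
                future.add_done_callback(
                    lambda f: events.put(("done", f.result()))
                )
            elif kind == "done":
                state["pending"] -= 1
                state["results"].append(payload)
                state["inputs_enabled"] = state["pending"] == 0
            elif kind == "quit":
                quitting = True
            if quitting and state["pending"] == 0:
                return state
```

The loop never blocks on the slow call, and "are the inputs enabled?" is decided in exactly one place. A worker that raises would need its own error event; that is left out here to keep the sketch short.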

Since the code for doing all that manually is kind of tedious, I put together this prototype library to hopefully make the above design really really easy.

Feedback I am looking for:

-is this a worthwhile pursuit at all? (i.e., do you agree with the first couple of paragraphs above?)

-has this been done before? (I searched and searched but I may have missed something)

-any thoughts on this first draft at implementation?

The code is here and examples are in the project or here. The main example is "example UI get websites" but this example also requires the lovely variant repository. Not for any particular reason, I just like it. There are more details about the code in the readme.

ak_nz · May 21, 2015

Looks interesting; I have been working on something similar. I notice your naming follows the TPL convention in the 4.0+ .NET Framework. Were you inspired by this strategy?

smithd · May 21, 2015

Yeah, that's correct. I had been learning a bit of C# and thought the ease with which you could run things in the TPL was impressive. Mine is... nowhere near as fancy, and probably never can be, but the intent was similar.

drjdpowell · May 21, 2015

Have a look at the “Async Action.lvlib:Action.lvclass” in my “Messenger Library” on the Tools Network, as well as child classes such as “Time Delayed Message”, “Async File Dialog”, or “Address Watchdog”. This is my OO implementation of what you're describing, where child classes override “Execute.vi”, and optionally “Setup.vi” (run before Async Call).

However, though I have 10 asynchronous actions built into “Messenger Library”, and can create new ones easily, I don’t generally make application-specific ones. Instead, I use either an on-diagram “helper” loop, or have an independent subactor. The pattern could be described as a “Manager” (an event- and message-handling loop with state data) and a number of “Workers” (specialized “helper” loops or subactors).

Asynchronous actions are still very worthwhile pursuing, though, as things like a delayed enqueue or an asynchronous dialog box are very valuable.
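To illustrate what a delayed enqueue does, here is a small text-language analogue in Python. It is only a sketch: `enqueue_after` is a hypothetical helper name, and a `threading.Timer` stands in for the asynchronously launched action; it is not how Messenger Library implements “Time Delayed Message”.

```python
import queue
import threading


def enqueue_after(delay_s, target_queue, message):
    """Put `message` on `target_queue` after `delay_s` seconds.

    The caller is not blocked; the wait happens on a separate timer thread.
    """
    timer = threading.Timer(delay_s, target_queue.put, args=(message,))
    timer.daemon = True  # do not keep the process alive just for the timer
    timer.start()
    return timer  # the caller may call timer.cancel() before it fires


inbox = queue.Queue()
enqueue_after(0.05, inbox, "refresh")
inbox.put("first")  # arrives immediately, ahead of the delayed message
```

The message-handling loop reading `inbox` sees “first” right away and “refresh” roughly 50 ms later, without ever sleeping itself.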

-- James

BTW> The “Producer-Consumer” QMH templates produced by NI are to be avoided, for all sorts of reasons. I just did a talk on this for the CSLUG user group on Monday; there is a recording of it in their Google+ account (I mentioned the issue of state-sharing between the loops, but my primary criticism was the horrifying potential for race conditions).

smithd · May 22, 2015

Have a look at the “Async Action.lvlib:Action.lvclass” in my “Messenger Library” on the Tools Network, as well as child classes such as “Time Delayed Message”, “Async File Dialog”, or “Address Watchdog”. This is my OO implementation of what you're describing, where child classes override “Execute.vi”, and optionally “Setup.vi” (run before Async Call).

However, though I have 10 asynchronous actions built into “Messenger Library”, and can create new ones easily, I don’t generally make application-specific ones. Instead, I use either an on-diagram “helper” loop, or have an independent subactor. The pattern could be described as a “Manager” (an event- and message-handling loop with state data) and a number of “Workers” (specialized “helper” loops or subactors).

Asynchronous actions are still very worthwhile pursuing, though, as things like a delayed enqueue or an asynchronous dialog box are very valuable.

I thought about yours and AF before moving forward on this and decided it still made sense as more of a loop co-processor than as a dedicated logical actor. That is, it's more of an off-diagram "helper" loop, in your terminology.

I may need to go back and look at the code, though, as I was under the impression that everything in there was an actor and I wanted to avoid that because you still have the problem of clogging the QMH. If every instance is its own async call then I think I must have just missed the right spot in the code or misunderstood.

Looking at it again, now that I'm looking in the right place, it looks like yours does most of the same stuff mine does. I had been under the impression that your library was more focused on communicating between actors, but now I see it's way more general-purpose.

BTW> The “Producer-Consumer” QMH templates produced by NI are to be avoided, for all sorts of reasons. I just did a talk on this for the CSLUG user group on Monday; there is a recording of it in their Google+ account (I mentioned the issue of state-sharing between the loops, but my primary criticism was the horrifying potential for race conditions).

I can't access the Google+ or YouTube page. Could you upload the slides here or just describe the race condition problem? I think we're talking about similar sets of issues but I don't see them as being all that horrible, so I'm curious why you're so against the idea. I tend to think it just ends up being a lot more work than it needs to be to make good code.

drjdpowell · May 24, 2015

Some comments after having a look at the code:

1) Is there value in the Execution System stuff? I’ve never discovered much reason to mess with Execution Systems. It would be simpler (and possibly less overall overhead) to have a single clone pool for all Async stuff, created on first use as in the Actor Framework or “Messenger Library”.

2) Consider combining the “Action” and “Function” classes (and thus making Actions children of “Task”). This reduces the number of wire types and would make Tasks fully recursive (so a Batch can contain other Batches).

3) Do you really need the Variant Parameter stuff in “Action”, given that children of Action can just add whatever parameters they want in private data?

-- James

ak_nz · May 24, 2015

We have a lot of third-party IP in dlls here so the Execution System wrappers would help us out a lot (and avoid continuously customizing ini files to extend the number of threads per system). But agreed this wouldn't be beneficial to everyone.

Edited May 24, 2015 by ak_nz

smithd · May 24, 2015

Some comments after having a look at the code:

1) Is there value in the Execution System stuff? I’ve never discovered much reason to mess with Execution Systems. It would be simpler (and possibly less overall overhead) to have a single clone pool for all Async stuff, created on first use as in the Actor Framework or “Messenger Library”.

2) Consider combining the “Action” and “Function” classes (and thus making Actions children of “Task”). This reduces the number of wire types and would make Tasks fully recursive (so a Batch can contain other Batches).

3) Do you really need the Variant Parameter stuff in “Action”, given that children of Action can just add whatever parameters they want in private data?

-- James

1- ^^what he said, it's mostly just there if you know you're going to block an entire thread doing something. For example, with HTTP Get I believe it's calling a DLL, so you're blocking a thread during that process. Same thing with some of the other I/O types. It's not clearly documented, but those inputs can be completely ignored and it will automatically create a pool of size 10 on the standard exec system and always run it there.

2-I thought about that one a lot and went back and forth. On the one hand I liked the idea of batches of batches, and of course you can still do that with your own tasks. But I figured that it could be handy to focus things down somewhat, which is why I added the actions. That way people who are afraid of objects can use callbyref actions, people who are ok with objects can make new actions, and people who want to use all the features can make tasks.

--> At the same time, given that the inputs are basically the same, I kind of see what you mean. It probably makes sense to merge them.

3-I also went back and forth on this. First, you're absolutely right. But... it kind of simplifies things to always have a 'parameter' input that you can call on any type of action. I think the best solution would probably be to remove it from the parent action (which, combined with #2 above, would basically mean I delete action entirely) but leave it on the callbyref class, since that's supposed to be the easiest to use and there has to be a generic parameter input on that one anyway.

smithd · May 27, 2015

Ok I made the changes you suggested and I think I like it better this way. :thumbup1:

Also I realized I forgot to address one point on your #1

Is there value in the Execution System stuff? I’ve never discovered much reason to mess with Execution Systems. It would be simpler (and possibly less overall overhead) to have a single clone pool for all Async stuff, created on first use as in the Actor Framework or “Messenger Library”.

Even if I did change it to just support the standard execution system, I still prefer having a specific 'context' or whatever that everything runs on, rather than using what is basically an inaccessible FGV. If you want a semi-real reason, it's this: the async call pool can only grow, which makes it kind of scary to me to use in a long-running application unless you have the ability to shut down the entire clone pool (which I don't think you can do in AF or yours). Having a separate reference to a specific clone pool means that as you launch or shut down parts of your application you could launch and shut down the paired clone pool. Not a huge deal, but it just makes me feel more comfortable using it.

drjdpowell · May 27, 2015

Not a huge deal, but just makes me feel more comfortable using it.

Well, clone pools are a “high water mark” type thing; you will never have more than the maximum number that you had running at any one time. I’ve done testing with my Async Actions and Actors, and running a few thousand is no issue. Note that any shared-reentrant subVIs called by your top-level “function” won’t be cleaned up when you release the pool, so I’m not sure if you can reliably recover from a “do a million things at once” bug. Whatever you do, you need to have a simple high-level API for the new User, with a minimum of different new wire types and “Create so-and-so” calls, even if you have a more complex low-level API semi-hidden in an “Advanced” palette.
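The “high water mark” behavior, and the earlier point about holding a reference to a specific pool that can be shut down, can be shown with a toy model in Python. This is only an illustration of the bookkeeping, not LabVIEW's actual clone-pool implementation; `ClonePool` and its methods are hypothetical names.

```python
import threading


class ClonePool:
    """Toy grow-only instance pool (illustrative, not LabVIEW's internals)."""

    def __init__(self, factory):
        self._factory = factory
        self._idle = []
        self._lock = threading.Lock()
        self.created = 0  # total instances ever created by this pool

    def acquire(self):
        """Reuse an idle instance if one exists, otherwise create a new one."""
        with self._lock:
            if self._idle:
                return self._idle.pop()
            self.created += 1
        return self._factory()

    def release(self, instance):
        """Return an instance to the pool for later reuse."""
        with self._lock:
            self._idle.append(instance)

    def close(self):
        """Drop all idle instances; returns how many were dropped."""
        with self._lock:
            dropped = len(self._idle)
            self._idle.clear()
            return dropped
```

If at most three instances are ever in use at once, `created` stops at three no matter how many calls are made afterwards. Because the pool is an ordinary reference, the part of the application that created it can also call `close()` when it shuts down.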

drjdpowell · May 27, 2015

BTW, I didn’t comment on the Cancelation-token functionality. I don’t have that, because once I have to send something a message, even if that message is just “cancel”, I consider making the something an actor, which can be extended to new messages as needed. My “actions” are to be as simple as possible. They always poll their equivalent of the “results reporter”, and abort if that object dies, since they then have no reason to continue (their job being to send results). This feature means you don’t have to “cancel” them in order to shut down the application.

So I suggest you think about either building the cancelation token up into something more message-like, or eliminating it for simplicity.

Added Later>> Check out the “CancelableObserver” class in Messenger Library. This allows one to make a cancelable “forwarding address” out of any other standard address. You could do this for your “Results Reporter”. Then your Actions can just poll the validity of their Results Reporter instead of a Cancelation token. Note that it is guaranteed that no message can be sent after you call “cancel” on the communication method, in contrast to calling cancel on a token, where the running Action may have just checked the token and be about to send the results. The latter behavior can lead to race conditions.
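The difference between the two cancel styles is easier to see in text code. Below is a minimal Python sketch of a cancelable forwarding address; `CancelableAddress` is a hypothetical name used for illustration and is not the Messenger Library API. The point is that the “is it cancelled?” check and the delivery happen under one lock, so nothing can slip through once `cancel()` has returned.

```python
import threading


class CancelableAddress:
    """Forwarding address that drops every message sent after cancel()."""

    def __init__(self, deliver):
        self._deliver = deliver  # callable that forwards to the real address
        self._lock = threading.Lock()
        self._cancelled = False

    def send(self, message):
        """Forward `message` unless cancelled; returns True if delivered."""
        with self._lock:
            # Check and delivery are atomic with respect to cancel().
            if self._cancelled:
                return False
            self._deliver(message)
            return True

    def cancel(self):
        with self._lock:
            self._cancelled = True

    def is_valid(self):
        """Workers can poll this to decide whether to keep running."""
        with self._lock:
            return not self._cancelled
```

With a plain token, a worker does `if not token.cancelled: reporter.send(result)` as two separate steps, so a cancel that lands between them still lets the result through. Here that window does not exist, at the cost of holding a lock during delivery.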

smithd · May 27, 2015

Well, clone pools are a “high water mark” type thing; you will never have more than the maximum number that you had running at any one time. I’ve done testing with my Async Actions and Actors, and running a few thousand is no issue. Note that any shared-reentrant subVIs called by your top-level “function” won’t be cleaned up when you release the pool, so I’m not sure if you can reliably recover from a “do a million things at once” bug. Whatever you do, you need to have a simple high-level API for the new User, with a minimum of different new wire types and “Create so-and-so” calls, even if you have a more complex low-level API semi-hidden in an “Advanced” palette.

Meh, you're right. I hadn't thought about that issue...the DD calls will eventually all add up, and they'll be shared across all the call pools probably. Oh well

On the adv vs. easy API topic what I was considering was creating a FGV which has the same behavior as what you and AF have, so it would initialize a default call pool and provide a simple 'run task' function you can just grab and use. But having a backing API makes me happy

BTW, I didn’t comment on the Cancelation-token functionality. I don’t have that, because once I have to send something a message, even if that message is just “cancel”, I consider making the something an actor, which can be extended to new messages as needed. My “actions” are to be as simple as possible. They always poll their equivalent of the “results reporter”, and abort if that object dies, since they then have no reason to continue (their job being to send results). This feature means you don’t have to “cancel” them in order to shut down the application.

So I suggest you think about either building the cancelation token up into something more message-like, or eliminating it for simplicity.

Added Later>> Check out the “CancelableObserver” class in Messenger Library. This allows one to make a cancelable “forwarding address” out of any other standard address. You could do this for your “Results Reporter”. Then your Actions can just poll the validity of their Results Reporter instead of a Cancelation token. Note that it is guaranteed that no message can be sent after you call “cancel” on the communication method, in contrast to calling cancel on a token, where the running Action may have just checked the token and be about to send the results. The latter behavior can lead to race conditions.

I've been kind of on the fence about the cancellation thing since I made my UI example, as it's kind of hard to keep track of. It felt like it would be easier to ignore a result than to cancel *and* ignore the partial result. That having been said, my goal was not really to make shutdown faster, just to let the task know we don't care if it finishes... but then we get back to whether there is really a benefit. I tend to think that for my purposes I'd choose to avoid modifying the state of the system, so cancelling really just saves CPU time, which isn't an issue. Since any really long-running tasks (like wait on TCP or whatever) can't be effectively cancelled, it makes me think your suggestion of eliminating it for simplicity is the right one. Will think about it more.

smithd · May 27, 2015

Since it's a simple change I made a branch here: https://github.com/smithed/taskpool/tree/removecancel

I think I like it better, but I'm still thinking about it.

Edit: yeah I think it makes sense to leave that to the end user. It should be easy to make a task which supports a custom cancellation mechanism if needed.

drjdpowell · May 28, 2015

For comparison, I made a quick version of the asynchronous HTTP Get (LabVIEW 2013; Messenger Library):

Async HTTP Get.zip

Since any really long running tasks (like wait on TCP or whatever) can't be effectively cancelled,

You can, if it is alright to kill the TCP reference from a parallel loop.

smithd · May 29, 2015

Looks pretty straightforward. On the one hand I like the type safety the object gives you but on the other hand, objects are a pain to use in large quantities. I hate the whole documentation, inheritance, etc process. I know there are some fixes out there but really I'm ok with giving up type safety in exchange for just passing in a vi server reference...of course nothing about your version prevents someone from doing that. May just have to use yours in the future

drjdpowell · May 29, 2015

I'm ok with giving up type safety in exchange for just passing in a vi server reference...

What would be nice would be if LabVIEW supported a multi-part “Call by Reference”, where one could fill the inputs of a function, pass it to an async process for execution, then read the outputs when it comes back. That would be type-safe and very simple. For async HTTP Get, you’d just need a reference to HTTP Get. Might also simplify command messages as in the AF.
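For readers more used to text languages, the three steps (fill inputs, run elsewhere, read outputs) look roughly like this in Python. This is a loose analogy only: `http_get_stub` and `demo` are hypothetical stand-ins, no network call is made, and Python does not give the compile-time type checking that the LabVIEW version would.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def http_get_stub(url, timeout_s=1.0):
    """Hypothetical stand-in for an HTTP Get call; performs no network I/O."""
    return {"url": url, "status": 200, "timeout_s": timeout_s}


def demo():
    # 1. Fill the inputs of the function without running it.
    bound = partial(http_get_stub, "http://example.com", timeout_s=2.0)
    with ThreadPoolExecutor(max_workers=2) as pool:
        # 2. Hand the bound call to an asynchronous worker.
        future = pool.submit(bound)
        # 3. Read the outputs when it comes back, giving up after 5 seconds.
        return future.result(timeout=5)
```

The caller only needs a reference to the function itself, and the timeout on the collect step raises an error instead of waiting forever.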

smithd · June 1, 2015

What would be nice would be if LabVIEW supported a multi-part “Call by Reference”, where one could fill the inputs of a function, pass it to an async process for execution, then read the outputs when it comes back. That would be type-safe and very simple. For async HTTP Get, you’d just need a reference to HTTP Get. Might also simplify command messages as in the AF.

So I mean they were trying to get there with call and collect, I just don't think it's very user-friendly. Your concept makes a lot of sense, and it would be handy for all the different types of call-by-ref. Instead of a single node, we'd have "latch values", "run", and "get values", and I suppose we'd have to have a function to "get instance from clone pool". The other usability items I can think of:

-Timeout on the wait/collect node
-Easy way to abort reentrant clone pools if we need to shut down (probably solvable if we had a "get instance from clone pool" function)
-Improved type propagation so if you update your connpane it doesn't break everything in your code (this seems to happen more with objects... to fix this I've resorted to just feeding a variant through everywhere :/)
-Decorate VI server references with different settings, so you don't have to remember the correct call by ref setting (and so the compiler could check the type for you: if you say "open this for reentrant run" and the VI isn't reentrant, it should break)
-Some of the functional programming/lambda discussions from one of the other forums would be handy (I was thinking earlier "well hey, most of this I could probably do with an xnode" and then I realized that I'd need to make a second VI, and this would be solved if I could script the VI I needed inside of the node...)

How do those sound to you?

drjdpowell · October 6, 2015

I can't access the Google+ or YouTube page. Could you upload the slides here or just describe the race condition problem? I think we're talking about similar sets of issues but I don't see them as being all that horrible, so I'm curious why you're so against the idea. I tend to think it just ends up being a lot more work than it needs to be to make good code.

I gave a talk at the recent CLD Summit in the UK where I explain the issue. It is a public video on the CSLUG YouTube channel.
