John Lokanis

Network Messaging and the root loop deadlock

I have been working on an architecture that uses VI Server to send messages between application instances, both local and across the network. One of the problems I have run into is the fact that VI Server calls are blocked by activity in the root loop (sometimes referred to as the UI Thread).

There are several things that can cause this: other VI Server calls, system dialogs (calls to the One Button and Two Button Dialog functions), and the user dropping down a menu without making a selection. (I'm sure there are more.)

Since this is a pretty normal way of communicating between applications, I was wondering if anyone had any ideas for a workaround.

Here is a basic description of my architecture:

Message is created and sent to local VI that sends to outside application instance.

Local sending VI opens a VI Server connection to the remote instance. It then calls a VI in the remote instance that takes the message as input.

This remote VI then places the message in the appropriate queue on the remote instance so it gets handled.

If the remote instance root loop is blocked, the sending VI on the local machine is also blocked.

I could try to eliminate all system dialogs from the remote application, but that only partially addresses the issue. I really wish a future version of LabVIEW would eliminate this problem with the root loop and VI Server altogether.

BTW: using LV2012 but this issue exists in all versions.

-John

I think you need to distinguish a few things. First, the root loop and the UI thread are closely related but, as far as I'm aware, not exactly the same. Second, not everything in VI Server is blocked by the root loop, but Open Application Reference and Open VI Reference surely are, and any property nodes that operate on UI elements are executed in the UI thread. Once you have a VI reference open and can keep it open, you should be able to invoke the VI remotely with Call by Reference without blocking. I'm not sure about the asynchronous Call by Reference, though. Do you see problems with the synchronous CbR, or are you trying to do other things on the VI and application reference?


Since the Send function is self contained, I designed it as a fully encapsulated function. I pass it the machine name, port and VI to call. It then does everything in one shot. If I was to cache the VI Server ref and the VI ref, I would have to devise some sort of FGV mechanism and then find a way to deal with multiple calls to this re-entrant VI that are directing the message to different targets.

Oh, and I would love to hear a detailed explanation of the similarities and differences between the root loop and UI thread, including all the potential blocking operations.

The reason I went with this message architecture is it was the only one I could think of that did not require any polling on the receiver's part.


Why are you using VI Server, rather than other communication methods (like TCP, Network Streams or Shared Variables)?


Why are you using VI Server, rather than other communication methods (like TCP, Network Streams or Shared Variables)?

See reply above. :-)

Guess we posted at the same time.

The reason I went with this message architecture is it was the only one I could think of that did not require any polling on the receiver's part.

One can do a “Server” without polling, if one dynamically spawns new processes to handle each connection. Though dynamic spawning brings one right back to the root loop problem...


Do you have an example? Seems I would have to create a listener on the receiver side for every 'potential' sender. Since the receiver does not know who the senders might be, I can't think of how to do that.


Do you have an example? Seems I would have to create a listener on the receiver side for every 'potential' sender. Since the receiver does not know who the senders might be, I can't think of how to do that.

What is so bad about a polling server? Do you foresee dozens of clients connecting to the server at the same time and loading it with large messages to be processed and answered? Otherwise a polling server can work quite well. If you really need to handle potentially many simultaneous connections, you might have to rethink your strategy anyhow. LabVIEW is not the ideal environment for heavy-load network servers, and even in C it requires some very careful programming to avoid thread starvation and/or process creation overload in such situations. The Apache web server uses some very sophisticated and platform-specific code paths to handle many simultaneous connections, most of which are not directly portable to LabVIEW.


Do you have an example? Seems I would have to create a listener on the receiver side for every 'potential' sender. Since the receiver does not know who the senders might be, I can't think of how to do that.

There is only one listener. It listens for clients on a port and creates connections (one for each connected client). Those connections can be serviced either by polling through them or by dynamically spawning a connection handler for each. But there is only one listener.
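The "one listener, one handler per connection" variant can be sketched as follows. This is Python rather than LabVIEW, since a block diagram cannot be shown as text, and it is only an illustration under assumptions: `handle_connection`, `serve`, and `start_server` are hypothetical names, the handler simply echoes data, and threads stand in for dynamically spawned handler VIs.

```python
import socket
import threading


def handle_connection(conn: socket.socket) -> None:
    """Per-client handler: echoes each chunk back until the client disconnects."""
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:  # an empty read means the peer closed the connection
                break
            conn.sendall(data)


def serve(listener: socket.socket) -> None:
    """The single listener: accept clients and hand each one to its own handler."""
    while True:
        try:
            conn, _addr = listener.accept()
        except OSError:  # the listening socket was closed, so stop accepting
            break
        threading.Thread(
            target=handle_connection, args=(conn,), daemon=True
        ).start()


def start_server(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Open the listening socket and run the accept loop in the background."""
    listener = socket.create_server((host, port))  # port 0 picks a free port
    threading.Thread(target=serve, args=(listener,), daemon=True).start()
    return listener
```

The receiver never needs to know its senders in advance: each accepted connection becomes its own socket, and `listener.getsockname()[1]` gives the port that clients should connect to.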



Also see the NI examples “DataServerUsingStartAsynchronousCall” or “DataServerUsingReentrantRun”.


The end goal is a system where there are N servers and N clients. Each client can connect to N servers at the same time. Servers support N connections from clients simultaneously. Servers 'push' data changes to the clients. Clients send commands to servers to control them. The data is mostly small but there are some circumstances where it could be around 1M. But that would not be continuous. Only once a minute or less. Most of the time the messages will contain a few k of data at most.

Clients will drop off from time to time and the servers will automatically detect this and stop sending to them.

Clients will know the machine names of the servers and will contact them to start a connection.

I really wish the VI Server method was feasible. It is by far the simplest and cleanest.

I will look at those examples. thanks,

-John


One can do a “Server” without polling, if one dynamically spawns new processes to handle each connection. Though dynamic spawning brings one right back to the root loop problem...

How so? I thought you can avoid the root loop when spawning as long as you hold a refnum to build clones off of for the lifetime of an application? Of course that implies at some point you need to open the initial target refnum, but if done at start-up, no root access should be required whenever you need to spin off a new process.


The end goal is a system where there are N servers and N clients. Each client can connect to N servers at the same time. Servers support N connections from clients simultaneously. Servers 'push' data changes to the clients. Clients send commands to servers to control them. The data is mostly small but there are some circumstances where it could be around 1M. But that would not be continuous. Only once a minute or less. Most of the time the messages will contain a few k of data at most.

Clients will drop off from time to time and the servers will automatically detect this and stop sending to them.

Clients will know the machine names of the servers and will contact them to start a connection.

I really wish the VI Server method was feasible. It is by far the simplest and cleanest.

I will look at those examples. thanks,

-John

Yup. I meant the example to show dynamically launching of TCP processes (there are hidden "handlers" which are dynamically launched).

I don't see any reason why it should be an issue. Launching dynamic VIs means you can run them in separate threads and/or execution systems from the launching process so although your dispatcher might be in the UI thread, the spawned processes need not be.

Edited by ShaunR


In my application, the client would open a connection to the server using VI Server (open app ref, open VI ref, call VI). The server would have been up and running for some time before this, so it is not in the startup state. The client, too, could have been up for some time and be making the connection in response to some user action. The server is unlikely to block the root loop since it is headless. But the client could easily block messages from the server if the user takes an action (drops down a menu) or a system dialog box is active. This would cause the server to hang while it waits to execute the send message operation.

I was planning to have the server cache the client machine name and port for sending replies, but I suppose I could cache a reference instead so the only block-able action would be the initial connection establishment. Now it becomes a problem of risk and mitigation. Still, I would like to eliminate the risk altogether.
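A minimal sketch of the reference-caching idea, written in Python because a LabVIEW diagram cannot be shown as text. Everything named here is hypothetical: `ReferenceCache`, the `Target` key, and the `opener` callable stand in for the Open Application Reference and Open VI Reference steps and are not VI Server APIs. The point is only that the potentially blocking open happens once per target, and that keying the cache by target lets parallel callers send to different destinations.

```python
import threading
from typing import Callable, Dict, Tuple

# Hypothetical cache key: (machine name, port, VI path).
Target = Tuple[str, int, str]


class ReferenceCache:
    """Keeps one open reference per target so only the first send pays for the open."""

    def __init__(self, opener: Callable[[Target], object]) -> None:
        self._opener = opener  # assumed to open the app ref and the VI ref
        self._refs: Dict[Target, object] = {}
        self._lock = threading.Lock()  # callers may run in parallel

    def get(self, target: Target) -> object:
        """Return the cached reference, opening it on first use."""
        with self._lock:
            ref = self._refs.get(target)
            if ref is None:
                # The only step that could block on the remote root loop.
                ref = self._opener(target)
                self._refs[target] = ref
            return ref

    def drop(self, target: Target) -> None:
        """Forget a reference, for example after a send fails because the peer left."""
        with self._lock:
            self._refs.pop(target, None)
```

For simplicity the lock is held while opening, so a slow open for one target delays lookups for the others. A per-target lock would avoid that at the cost of more bookkeeping.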


Of course. You can't cache a connection to a machine you don't yet know exists...

How so? I thought you can avoid the root loop when spawning as long as you hold a refnum to build clones off of for the lifetime of an application? Of course that implies at some point you need to open the initial target refnum, but if done at start-up, no root access should be required whenever you need to spin off a new process.

I was going to go into that, but bailed with a “…”

The end goal is a system where there are N servers and N clients. Each client can connect to N servers at the same time. Servers support N connections from clients simultaneously. Servers 'push' data changes to the clients. Clients send commands to servers to control them.

Maybe you should use a central message “broker”, with all servers and clients connecting via the broker. Then there is only one connection per process. I think Shaun’s Dispatcher works this way, if I recall right.


If I can get this all sorted, I will post the example. It is part of an architecture I am trying out where each process in an application registers a transport with which to receive messages. So, when a message is sent, the system determines what process owns it and then uses the selected transport to send it.

The cool thing is, both the sender and receiver contain each process's class and all its messages. For networked messages, the sender simply registers the process with a network transport and does not implement a process loop to handle messages. On the receiver side, the same process is set to use a local queue for messages, and we do implement a loop to handle messages. The sender's transport code simply puts the message into the receiver's local queue. The neat part is that any process can receive local and remote messages. And I can move a process from the receiver to the sender by simply implementing a process handler loop for it and changing its transport from remote to local.

The idea is to treat the messages as an API for each process. But, since I am still working out the details, I have not shared any of it yet. This VI Server issue came up as I was implementing my idea for the remote transport. But, I could easily replace that code with something else that gets the message from the sender to the receiver's queue.

One drawback is each application instance can only own one copy of a process type. (messages determine their destination by process type, not the actual owning process object)
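The transport registration described above might look roughly like this. It is a Python sketch under assumptions, not the actual LabVIEW design: `MessageRouter`, `Transport`, and `local_queue_transport` are hypothetical names, and a real network transport would serialize the message and send it to the remote instance.

```python
import queue
from typing import Callable, Dict

# A transport delivers one message for one process type (hypothetical signature).
Transport = Callable[[str, object], None]


class MessageRouter:
    """Maps each process type to the transport currently registered for it."""

    def __init__(self) -> None:
        self._transports: Dict[str, Transport] = {}

    def register(self, process_type: str, transport: Transport) -> None:
        # One entry per process type, which mirrors the stated drawback:
        # an instance can own only one copy of a given process type.
        self._transports[process_type] = transport

    def send(self, process_type: str, message: object) -> None:
        """Look up the owner by process type and deliver via its transport."""
        self._transports[process_type](process_type, message)


def local_queue_transport(inbox: "queue.Queue[object]") -> Transport:
    """Transport for a process that runs its handler loop in this instance."""

    def deliver(_process_type: str, message: object) -> None:
        inbox.put(message)

    return deliver
```

Moving a process between instances then amounts to calling `register` again with a different transport. The code that calls `send` does not change.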

Also, I have not sorted out a clean way to deal with local data within a process loop. But I think this can be done elegantly.

Thanks for the ideas. I will see if I can make something workable.

Maybe you should use a central message “broker”, with all servers and clients connecting via the broker. Then there is only one connection per process. I think Shaun’s Dispatcher works this way, if I recall right.

That just creates a single point of failure. Something I cannot do in this system as the cost of it failing is expensive. I can live with one server going down or one client, but not something central to everything.


That just creates a single point of failure. Something I cannot do in this system as the cost of it failing is expensive. I can live with one server going down or one client, but not something central to everything.

Depends how many dispatchers you have. If you have a single dispatcher (a centralised server like a DNS server) then yes. If you have a dispatcher on every machine (like Apache), then no, as long as you have redundant processes elsewhere. The usual topology I use is to have a machine with, say, 5 processes and replicate that for failover. If the machine goes down then you just point to the other machine(s). That's the way the web works ;)

But haven't we discussed this before?

Edited by ShaunR

That just creates a single point of failure. Something I cannot do in this system as the cost of it failing is expensive. I can live with one server going down or one client, but not something central to everything.

One thing to possibly look into is LabbitMQ, a LabVIEW wrapper of RabbitMQ, a message broker system. An already developed message system might have addressed many of your issues. Haven’t tried it myself (has anyone used LabbitMQ?) so I can’t tell how much effort it would be to set up.

— James


But haven't we discussed this before?

Yes. But I ran into this again when working on my architecture. I really want to keep this simple. That is why I want the VI Server implementation to work. It lets me do messaging without having any code on the receiver side specific to the transport. I just call the local VI and stuff in the message. Simple and elegant, but not bulletproof unfortunately...


I guess no one at NI wants to chime in here with some tips about avoiding the root loop issue. I know some other architectures use VI Server for messaging. There must be some ways to avoid these issues or at least mitigate them.


See this discussion, where I suggested the Actor Framework adopt mje’s mitigation of the issue (which he referred to above).

Wait, sorry. You mean for using VI Server for messaging, rather than dynamic launching of VIs. I don’t know what to do about that.

Edited by drjdpowell


One thing to possibly look into is LabbitMQ, a LabVIEW wrapper of RabbitMQ, a message broker system. An already developed message system might have addressed many of your issues. Haven’t tried it myself (has anyone used LabbitMQ?) so I can’t tell how much effort it would be to set up.

— James

We use networked shared variables and also ActiveMQ (an implementation of Java Message Service) -- via a custom LabVIEW interface -- for messaging much as John describes. Both are examples of publish-subscribe communication (although ActiveMQ actually supports several modes); see also Observer Pattern. Communication with both (in our system) is event-driven (the DSC Module supports shared variable events). (I think polling is not as desirable both because of the resource issues but even more because it is possible to miss value changes and, on the flip side, it means acquiring and at least in some sense processing the same data multiple times.) John, you might want to look at the publish-subscribe approach. Isn't publish-subscribe communication really what you are after? That seems to me to be more or less what you are describing.

(For the record, RTI offers a LabVIEW interface to yet another implementation of publish-subscribe communication, this time using the DDS standard. The LabVIEW interface in the earlier versions wasn't as convenient as the native shared variable API was, but I haven't seen the latest versions.)


Well, when I was talking about polling I didn't mean the server polling the clients, but rather the old traditional multi-client TCP/IP server example that adds incoming connection requests to an array of connections, which is then processed continuously inside a loop with a very small TCP Read timeout. Polling is probably the wrong name here. Unlike the truly asynchronous operation with one server handler clone being spawned per incoming connection, this solution simply processes all incoming data packets sequentially. While this can cause problems with response time if there are large messages to be processed and/or many parallel connections to be served, it completely avoids any root loop issues, as it does not use any root-loop-synchronized LabVIEW nodes.
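The loop described above can be sketched roughly as follows. This is Python rather than LabVIEW and only an illustration: `poll_once` is a hypothetical name, the 10 ms timeout and 4096-byte read size are arbitrary choices, and message framing is left out. Each pass checks for one new connection, then gives every existing connection a short read window in turn.

```python
import socket
from typing import List, Tuple


def poll_once(
    listener: socket.socket,
    connections: List[socket.socket],
    timeout: float = 0.01,
) -> List[Tuple[socket.socket, bytes]]:
    """One pass of the server loop: accept a pending client, then read each in turn."""
    received: List[Tuple[socket.socket, bytes]] = []

    # Wait briefly for a new client; a timeout just means nobody is knocking.
    listener.settimeout(timeout)
    try:
        conn, _addr = listener.accept()
        conn.settimeout(timeout)  # very small read timeout, as in the description
        connections.append(conn)
    except socket.timeout:
        pass

    # Service every connection sequentially (iterate over a copy so we can remove).
    for conn in list(connections):
        try:
            data = conn.recv(4096)
        except socket.timeout:
            continue  # nothing to read from this client right now
        except OSError:
            data = b""  # treat a reset connection like a closed one
        if data:
            received.append((conn, data))
        else:
            connections.remove(conn)  # the client dropped off
            conn.close()
    return received
```

A server would call `poll_once` in a loop and dispatch whatever it returns. With many idle clients each pass costs roughly one timeout per connection, which is the response-time trade-off mentioned above.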

I have done several applications with an architecture based on this scheme and, aside from some race conditions in handling TCP Reads with low timeouts in LabVIEW 5.0 that could crash LabVIEW hard, this has always worked fine, even with several parallel connections needing to be serviced.

One architecture is based on a binary data protocol similar to the CVT Client Communication (CCC) Reference Library. The other uses HTTP-based messages, which even allow simple web browsers to connect to certain subsets of the server, and it also supports user authentication for connections.
