
Intra-process signalling (was: VIRegister)



Hi.

This discussion is carried over from the Code-In-Development topic "VIRegister", since it developed into a more general discussion of the sanity of using by-reference value-sharing (globals by lookup-table in different forms). I'll start this off by describing a few use cases that drove me to implement AutoQueues in general, and the special case of the VIRegister toolset:

Consider a Top-Level VI with several subVIs. I call that a module in this post. A module could in itself be the entire application, or it could be a dynamically dispatched module from a logically higher level application.

Use case 1: The global stopping of parallel loops

We face this all the time: we have multiple parallel while-type loops running, maybe in the same VI, maybe in different (sub)VIs of a module. Something happens that should make the module stop - this "something" could happen externally (e.g. user quits main application) or internally (e.g. watchdog). Anyway, we need a way to signal a parallel structure to stop. If the parallel loop is already throttled by an event-type mechanism (event, queue, TCP-read etc.), we could use that medium to carry our stop signal. But if the loop is more or less free-running, possibly just throttled by some timer, then we usually poll a local variable in each iteration for the stop condition. Locals work fine for such signalling inside a single VI. There are some drawbacks with locals, like you need to tie them to an FP object, and you need to take special precautions regarding use of the terminal versus the local on Real-Time and so on.

If you cross VI boundaries, locals won't cut it anymore. Globals could work, but they are tied to a file. Other types of "globals" exist, like Single-Process Shared Variables (they are rigidly defined in the LV project). So to be good architects we deploy another construct that'll allow sharing of our signal inside the module - typically an event or a queue (most often depending on 1:1 or 1:N topology). It's more work to create a user event, or to obtain a queue, for this: we need somewhere to store the reference so everybody can get to it (FGs are often used for this), we must be careful that the data (the signal) isn't invalidated when its context goes out of scope, and we might have to handle the destruction or release of the reference when we're done using it.

A while back I made a toolset I called "Flags" to solve both cases above. Flags are dynamically dispatched FGs that each contain a single boolean, just nicely wrapped up passing the FG-VI reference from Flag function to Flag function:

[image: Flag toolset block diagram]

This works ok, but not perfectly. The signal can only be boolean in this case (95% of my uses are for booleans anyway, so this isn't that bad), but it gets unwieldy when I have many Flags (most of my modules use 3-6 different Flags), which again usually means not all Flag references need to be wired to all subVIs:

[images: block diagrams showing multiple Flag references wired between subVIs]

To improve my intra-process signalling toolbox I made the VIRegisters. The same example as above, but with VIRegisters: no initialization necessary, and no references to pass around:

[image: the same example rewritten with VIRegisters]
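Since LabVIEW diagrams don't reproduce well in text, here's a rough Python sketch of the lookup-table idea behind a VIRegister: a named, lossy, latest-value register that needs no initialization and no reference wires. The API names are mine for illustration, not the toolset's actual interface.

```python
import threading

# Module-wide lookup table: register name -> latest value (single element).
_registers = {}
_lock = threading.Lock()

def write_register(name, value):
    """Lossily overwrite the single element stored under 'name'."""
    with _lock:
        _registers[name] = value

def read_register(name, default=None):
    """Read the latest value; no prior initialization is required."""
    with _lock:
        return _registers.get(name, default)

# Writer side (e.g. a watchdog):
#     write_register("MyModule.Stop", True)
# Reader side (each free-running loop, once per iteration):
#     if read_register("MyModule.Stop", False): break
```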

Use case 2: Module-wide available data

In the TCPIP-Link code snippet above there is also another data set replaced with a VIRegister: Stream Parameters. This is a cluster of parameters that are used by a buffering subsystem of TCPIP-Link. The data is set with an external user event, but various parts of the Stream Parameters are used in different places (subVIs) of this module. Therefore it makes more sense to store this data in a VIRegister than it does to store it in a control on the Top-Level VI of the module and get to it by control reference.

Use case 3: Application/Instance-wide available data

There is an additional set of data in TCPIP-Link that must be "globally" accessible so a user API function can read it, namely a set of traffic information for each connection (payload ratio etc.). This data is stored inside a VI of a module in TCPIP-Link, but to make it available to the entire LV instance, that module was required to support generating an event to any party asking for this continuously updating information. Replacing this mechanism with a VIRegister, I disposed of all the event generation code, including the need for the API to fetch the event refnum to register from another global storage. Now I just drop in a VIRegister read function wherever I want to read the traffic information. It's leaps and bounds simpler to code, and I can't see that it's any more fragile than the previous setup.

Remember I'm talking about lossy "signal" type data here (a.k.a. latest-value or current-value single-element registers). Similar to locals, but much better (in my view). I have other toolsets for lossless data transmission, but we can discuss those at another time.

Are VIRegisters (or any similar architecture) bad? Stephen has some grief regarding the reference-nature of VIRegisters, but I don't think they are worse than named queues. Queues can be accessed from anywhere in the LV instance by name, using an internal lookup-table, and I don't see NI discouraging the use of named queues. So, the discussion commences. I'm looking forward to learning something from this :rolleyes:.

Cheers,

Steen


Hi Steen,

Curious to know... what is TCPIP-Link? Sounds like some cool inter-process messaging architecture.

It is :D. It's basically a TCP/IP based messaging toolset that I've spent the last 1½ years developing. I'm the architect and only developer on it, but my employer owns (most of) it. I spoke with Eli Kerry about it at the CLA Summit, and it might be presented to NI at NI Week if we manage to get our heads around what we want to do with it. But as it's not in the public domain I unfortunately can't share any code really. But this is what TCPIP-Link is (I'm probably forgetting some features):

  • A multi-connect server and single-connect client that maintains persistent connections with each other. That means they connect, and if the connection breaks they stay up and attempt to reconnect until the world ends (or until you stop one of the end-points :rolleyes:).
  • You can have any number of TCPIP-Link servers and clients running in your LabVIEW instance at a time.
  • Both server and client support TCP/IP connection with other TCPIP-Link parties (LabVIEW), as well as non-TCPIP-Link parties (LabVIEW or anything else, HW or SW). So you have a toolset for persistent connections with anything speaking TCP/IP basically.
  • Outgoing messages can be transmitted using one of four schemes: confirmation-of-transmission (no acknowledge, just an ack that the message went into the transmit-buffer without error), confirmation-of-arrival (TCPIP-Link at the other end acknowledges the reception; happens automatically), confirmation-of-delivery (you, in the receiving application, acknowledge reception; this is done with the TCPIP-Link API, and the message tells you if it needs COD-ack), and a buffered streaming mode.
  • The streaming mode works a bit like Shared Variables, but without the weight of the SVE. The user can set up the following parameters per connection: Buffer expiration time (if the buffer doesn't fill, it'll be transmitted anyway after this period of time), Buffer size (the buffer will be transmitted when it reaches this size), Minimum packet gap (specifies minimum idle time on the transmission line, especially useful if you send large packets and don't want to hog the line), Maximum packet size (packets are split into this size if they exceed it), and Purge timeout (how long the buffer will be maintained if the connection is lost, before it's purged). A rough sketch of these flush rules follows after this list.
  • You transmit data through write-nodes, and receive data by subscribing to events.
  • Subscribable system-events are available to tell you about connects/disconnects etc.
  • A log is maintained for each connection; you can read the log when you want, or you can subscribe to log-events. The log holds the last 500 system events for each connection (Connection, ConnectionAttempt, Disconnection, LinkLifeBegin, LinkLifeEnd, LinkStateChange, ModuleLifeBegin, ModuleLifeEnd, ModuleStateChange etc.) as well as the last 500 errors and warnings.
  • The underlying protocol, besides persistence, utilizes framing and byte-stuffing to ensure data integrity. 12 different telegram types are used, among which is a KeepAlive telegram that discovers congestion or disconnects that otherwise wouldn't propagate into LabVIEW. If an active network device exists between you and your peer, LabVIEW won't tell you if the peer disconnected by mistake. If you and your peer have a switch between you, for instance, your TCP/IP connection in LabVIEW stays valid even if the network cable is disconnected from your peer's NIC - but no messages will get through. TCPIP-Link will discover this scenario and notify you, close the sockets down, and go into reconnect-mode.
  • TCPIP-Link of course works on localhost as well, but it's clever enough to skip TCP/IP if you communicate within the same LV-instance, in which case the events are generated directly (you can force TCPIP-Link to use the TCP/IP-stack anyway in this case though, if you want to).
  • Something like 20 or 30 networking and application related LabVIEW errors are handled transparently inside all components of TCPIP-Link, so it won't wimp out on all the small wrenches that TCP-connections throw into your gears. You can read about most of what happens in the warning log if you care though (error 42 anyone? Oh, we're hitting the driver too hard. Error 62? Wait, I thought it should be 66? No, not on Real-Time etc.).
  • The API will let you discover running TCPIP-Link parties on the network (UDP multicast to an InformationServer on each LV-instance, configurable subnet time-to-live and timeout). Servers and clients can be configured individually as Hidden to be excluded from this discovery though.
  • Traffic data is available for each connection, mostly stuff like line-load, payload ratio and such.
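As promised in the streaming-mode bullet above, here's a hedged Python sketch of just those flush conditions; the parameter names and defaults are assumptions drawn from the list, not TCPIP-Link's actual internals.

```python
import time

class StreamBuffer:
    """Toy model of the streaming-mode buffer: flush on size or expiration,
    honour a minimum gap between sends, split oversized payloads."""
    def __init__(self, size=8192, expiration_s=0.1, min_gap_s=0.01, max_packet=1400):
        self.size, self.expiration_s = size, expiration_s
        self.min_gap_s, self.max_packet = min_gap_s, max_packet
        self.buf = bytearray()
        self.first_write = None   # arrival time of the oldest unsent byte
        self.last_send = 0.0

    def write(self, data, send):
        if not self.buf:
            self.first_write = time.monotonic()
        self.buf += data
        self._maybe_flush(send)

    def poll(self, send):
        """Call periodically so the expiration timer can fire on an idle line."""
        self._maybe_flush(send)

    def _maybe_flush(self, send):
        now = time.monotonic()
        full = len(self.buf) >= self.size
        expired = bool(self.buf) and (now - self.first_write) >= self.expiration_s
        gap_ok = (now - self.last_send) >= self.min_gap_s
        if (full or expired) and gap_ok:
            while self.buf:                       # split into max-size packets
                packet = bytes(self.buf[:self.max_packet])
                del self.buf[:self.max_packet]
                send(packet)
            self.first_write = None
            self.last_send = time.monotonic()
```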

It's about 400 VIs, but when you get your connections up and running (which isn't harder than dropping a StartServer or StartClient node and wiring an IP address to it), the performance is 90-95% of the best you (I) can get through the most raw TCP/IP implementation in LabVIEW. And such a basic implementation (TCP Write and TCP Read) leaves a bit to be desired, if the above feature list is what you need :rolleyes:.

We (CIM Industrial Systems) use TCPIP-Link in measurement networks to enable cRIOs to persistently stay connected to their host 24/7 for instance. I'm currently pondering implementing stuff like adapter-teaming (bundling several NICs into one virtual connection for redundancy and higher bandwidth) as well as data encryption. Here's a connection diagram example from the user guide (arrows are TCPIP-Link connections):

[image: TCPIP-Link connection diagram from the user guide]

Cheers,

Steen


In the past I have bought into the whole NI DSC/NSV technology for my cRIO-based SCADA applications. I really like the SVE Publish-Subscribe Protocol and all the built-in features like NSV-NSV and NSV-LV control binding, as well as the Distributed System Manager and DSC Citadel data logging. The problem is that it is a closed system: for example, even though the SVE supports SV events, in LabVIEW this functionality is not exposed and you have to buy the DSC just to get such a basic capability. Also, NSVs are CPU-intensive; if you wanted to use them to emulate a CVT you would hit the CPU ceiling rapidly. I think, at least on the read side, the SVE could offer a more efficient interface for getting current NSV data. In the future I will consider an open TCP/IP-based architecture.


It's quite a shame really, because Shared Variables held such promise, but we use them as little as we can get away with now. We've been burned repeatedly by their shortcomings, and I feel we've been let down by NI Support when we've tried to get issues resolved. Issues that still carry several CARs around - like how every Shared Variable in an SV library inherits the error message when any member SV experiences a buffer overflow: the error emerges from all the SV nodes even though only a single node experienced it. NI have known about that for years now, but won't fix it. And the update rate performance specifications aren't transferable to real life. SVs are quite limited with regard to dynamic IPs - we use DataSocket to connect to SVs on a target with a dynamic IP, but DataSocket holds its own bag of problems (needing the root loop to execute), and it's not truly flexible enough anyway. A Real-Time application can't deploy SVs on its own programmatically, etc.

Hi Steen,

I totally agree with you about the limitations of Shared Variables and about the disappointing support from NI regarding SVs. However, I haven't given up on them quite yet. Actually, the majority of our systems are still designed with SVs as the main tool for intra-process (and extra-process) signalling, and I must say they work really well.

I design data acquisition and control systems for thermal vacuum chambers that are used to perform space simulation. Tests can be up to one month long (24/7) but everything happens relatively slowly. The acquisition rate of our CompactRIOs is 1Hz and losing a reading once in a while is not an issue. Also, the outputs can take a couple seconds to react without causing problems. Finally, data loggers that record hundreds of temperature readings are queried only once every 10 seconds. In this environment, Shared Variables with the DSC module and a Citadel database can be an excellent solution, especially because they benefit from the power of the Distributed System Manager. When we consider other designs, like VIRegister, the DSM is the factor that tips the scale in favour of using Shared Variables.

Having said that, I don't see anything wrong with VIRegister and I will consider it next time I design a high throughput application (yes, we do have some of those as well ;) ).

Regards,

LP.


(Quoting LP's reply above in full.)

Hi LP.

It's not my intention to turn this into a flamewar against SVs. I sincerely hope the kinks will get ironed out. NI is putting a lot of effort into making SVs better at least, so I'm certain we'll see big improvements in the next several LabVIEW releases. SVs, especially network-enabled ones, are very powerful when they work. My concern isn't performance, as any technology has a performance envelope - I'm disappointed that NI didn't disclose the full monty about the expected performance envelope when we struggled so hard to make SVs work in streaming applications (our typical Real-Time application will need to stream 15-30 Mbytes/s, often distributed on several RT-targets). CIM Industrial Systems is one of the biggest NI integrators out there, so I'd expected more honesty. Now I believe we have a good idea about that performance envelope, we have, after all, probably field-tested SVs for 5-10,000 hours :rolleyes:. No, my real concern is the SVs' tendency to undeploy themselves when we're pushing their limits. And it's not possible to recover from that mode of failure without human interaction, simply because we cannot redeploy an SV-lib from the Real-Time system itself. LabVIEW for Desktop can, but not RT. That's a risk I can't present with a straight face to our customers. And I agree, the DSM is a great tool!

Regarding VIRegisters, please note that this is a lossy register - more like a global variable than a buffered Shared Variable. I'm looking into making a buffered VIRegister, where all updates are received by all readers, but it's quite complicated if I do not want to have a central storage like the SVE. It'd be very simple to enable buffering in the CVT toolset, since there you have the central storage (an FG), but I don't want to go that way. That'd be too straightforward, and we know straightforwardness in implementation is inversely proportional to performance ;).

A replacement for network enabled SVs could be TCPIP-Link (minus the control binding feature and the integration with DSM, at least for now), but that's a different story.

Cheers,

Steen


It's not my intention to turn this into a flamewar against SVs. [...] I'm disappointed that NI didn't disclose the full monty about the expected performance envelope when we struggled so hard to make SVs work in streaming applications. [...]

I totally understand your frustration; we are not a major NI integrator, so imagine how much they tell us :frusty: . We are the kings of workarounds.

Now I believe we have a good idea about that performance envelope, we have, after all, probably field-tested SVs for 5-10,000 hours :rolleyes:.

5-10,000 hours!!! :worshippy: I would be very interested to hear about your findings, it would make a great post...

No, my real concern is the SVs' tendency to undeploy themselves when we're pushing their limits. And it's not possible to recover from that mode of failure without human interaction, simply because we cannot redeploy an SV-lib from the Real-Time system itself. LabVIEW for Desktop can, but not RT. That's a risk I can't present with a straight face to our customers.

Personally, I have never witnessed this phenomenon but I must say I never ask much of the Shared Variables I deploy on real-time systems. I will print this quote and post it on my office wall to remind me to always be gentle with RTS SVs.


Steen, thank you for posting this information.

I have held off replying in this thread... I'm still trying to take all of the content in and formulate a reply, but I can give some feedback at this point.

Your first post is a set of use cases. However, all three of them make assumptions about what the code needs to do, a bit too late in the design process from where I hope to intercept and redirect. Take the "stop parallel loops" use case, for example. The way you've written it, there's an unspoken assumption that the code requires a special communications channel among the loops in order to communicate "stop". This particular problem is *exactly* the one that a group of us inside NI have spent over a year debating. We found a wide variety of techniques used in various applications to do this task. After a lot of analysis, where we were looking for ease of setup, correctness, resistance to accidentally missing a stop signal, etc., we came to a conclusion.

I just typed that conclusion and then deleted it. I am unsure what the best way to approach this is: whether to bootstrap this conversation with some of the preliminary ideas of the small group inside NI, or whether to let this LAVA post develop and see whether the results that develop here are the same as have developed in our small group. The problem is that debating all of this over LAVA threads is a LOT harder than doing it in person. I'm going to be presenting some portion of this content at NI Week this year -- I'm still working on the presentation, and at this point, I have no idea how much I'll pack into the limited time I've got.

Ok... let me try this and see how it goes.

Continuing to focus on "stop two parallel while loops." Suppose we have two loops, Alpha and Beta, both in the same app but not on the same block diagram. Let's ignore "stop" for a moment and focus on the normal communications between these two loops. Global VIs, functional global VIs, queues, notifiers, user events, local variables, TCP/IP, shared variables, etc... We've got a list of about 20 different communications technologies within LV that can all be used to do communications between two loops. Suppose for a moment that Alpha is going to send messages to Beta. Alpha was written to have two queues, one for high priority messages and one for low priority messages.

[image: Alpha.vi block diagram with high- and low-priority queues]

What does Beta.vi look like? Assume that it, like Alpha, is an infinite loop that does no error checking... what I want us to focus on for the moment is how it receives and processes the messages from Alpha. Assume that the goal is to call a subVI "Process Message.vi" for each message, but don't waste time processing messages in the low priority queue if there are messages in the high priority queue waiting. If it helps, feel free to assume that an empty string is never a valid message.

How efficient can you make Beta.vi? Does it sleep when there are no messages in either queue? How much wiring did it take to write Beta? Is it more or less wiring than you would expect to have to wire? How does the complexity of the code scale if the architecture added a third queue for "middle priority", or even an array of queues where the first queue in the array was highest priority for service and the last was lowest? What are your overall thoughts on this communication scheme?

[LATER] I have now uploaded three different variations that work for Beta.vi in a later post. But before you skip ahead to that answer, I encourage you to try to write Beta.vi yourself. It's a really good LV exercise.


Your first post is a set of use cases. However, all three of them make assumptions about what the code needs to do [...] After a lot of analysis, where we were looking for ease of setup, correctness, resistance to accidentally missing a stop signal, etc., we came to a conclusion.

Ok, I'm getting this feeling that I'm kind of missing the point here and probably being hopelessly naive, but...

Isn't the two parallel loops exactly what notifiers were invented for? Ok, yes they're lossy, but if you are just using it to stop two parallel loops and you know the last message ever posted to the notifier is the 'stop now', then surely it doesn't matter? The problem with any queue based design is surely that you're stuffed if someone else grabs the queue and takes your message - or if you want to stop N loops in parallel where N is only known at run time.

What would be handy for multiple loops distributed over indeterminate numbers of parallel running VIs would be a one-to-many queue with priorities - so that you could enqueue an element with an arbitrary priority, have that element delivered to multiple waits, and have the elements presented to the wait sorted first by priority and then by enqueue timestamp. Thus each dequeue node could pull entries safe in the knowledge that it wasn't affecting any other dequeue node, could choose to discard elements if it wanted, but would process them in an order determined by the enqueuer that wasn't necessarily FIFO. But I don't see that this is necessary for the stated problem...?


Isn't the two parallel loops exactly what notifiers were invented for? [...] What would be handy for multiple loops distributed over indeterminate numbers of parallel running VIs would be a one-to-many queue with priorities [...]

That is really what events are for. However, I have a dislike for them since they cannot be encapsulated easily while maintaining genericism. There are a couple of other options though (with queues). You can peek a queue and dequeue the message only if it is meant for you (with the downside that if you don't dequeue an element, it stalls). Or, my favorite, each "module" has a queue linked to the VI instance name. To close all dependents, you only need to list all VI names and poke (enqueue at the opposite end) an exit message onto all of them (just a for loop). This becomes very straightforward if all queues are string types; just compare (or have a case) to detect EXIT, STOP, DIE or whatever to terminate. Some people however prefer strict data types.

But I think you are right. In the absence of events, notifiers are the next best choice for one-to-many messaging. I think that most people prefer to have one or the other rather than both in a loop though. And if a queue is already being used, it makes sense to try and incorporate an exit strategy using it.

Isn't the two parallel loops exactly what notifiers were invented for ?
The question originally came up because it's hard to teach Notifiers to beginning LV programmers. It's a lot simpler to teach local variables, which do often work for stopping two loops on the same diagram, albeit with some difficulty around a button's mechanical action. Surely there must be a quick-to-program, easy-to-understand way to stop two parallel loops? It's a question I've asked myself for years, but only recently have I undertaken trying to identify why it is so frigging hard (compared with the expected level of difficulty) and to see if a better pattern could be found.
And if a queue is already being used, it makes sense to try and incorporate an exit strategy using it.
Ten points and a bowl of gruel to Shaun for hitting on the core point!

Here are three possible solutions to what Beta.vi could look like:

[images: three working variations of Beta.vi]
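The three diagrams above are attachments, but a rough text-language analogue shows the shape of the problem. This Python sketch of a working Beta blocks briefly on the high-priority queue and only then polls the low one - exactly the poll-instead-of-sleep compromise criticized below. process_message stands in for Process Message.vi; all names are assumptions.

```python
import queue

high_q = queue.Queue()
low_q = queue.Queue()

def process_message(msg):
    print("handled:", msg)

def beta_loop():
    while True:
        try:
            msg = high_q.get(timeout=0.1)   # sleeps here most of the time
        except queue.Empty:
            try:
                msg = low_q.get_nowait()    # low priority only if high is empty
            except queue.Empty:
                continue                    # both empty: go back and block again
        if msg == "":                       # an empty string is never a valid
            break                           # message, so it can mean "stop"
        process_message(msg)
```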

All solutions I've seen to this problem are either extremely messy or extremely inefficient (because they poll instead of sleep). In my experience, this sort of mess crops up whenever you try to mix multiple communications channels -- either you end up polling all of them, you end up having some complex "wake everyone up when one wakes up" scheme, or you have sentinel values to give time to each one (this last not being an option if all channels are equal priority). A good architecture rule appears to be: "Between any two parallel loops Alpha and Beta, there should be one and only one communication channel from Alpha to Beta." (There's a separate conversation to be had about whether communication from Beta to Alpha should use the same or a separate communications channel.)

The rule "there should be only one communications channel from Alpha to Beta" implies that the right way to stop two loops is not a fixed answer. If you say, "Use notifiers to stop the loops", that would imply you should only ever use notifiers to communicate between loops, which is daft. Instead, the right way to stop two loops is to use whatever communication scheme you've already got between the two loops. If there isn't any then, yes, Notifiers are probably the simplest to set up and get right for all use cases (including the quite tricky "stop both loops and then restart the sender loop but only after you're sure that both loops stopped"). But if you have an existing queue/event/network stream/etc, then use that.

Using the same channel does not necessarily mean tainting your data with a sentinel value for stop. The pattern of "producer calls Release Queue and the consumer stops on error" is an example of using the channel.
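For illustration, here's a hedged Python analogue of that pattern (Queue.shutdown exists from Python 3.13): the producer releases the queue, every blocked dequeue raises, and that error is the stop signal - the data stream itself never carries a sentinel.

```python
import queue, threading

q = queue.Queue()

def handle(item):
    print("got", item)

def consumer():
    while True:
        try:
            item = q.get()        # sleeps until data arrives...
        except queue.ShutDown:    # ...or until the producer releases the queue
            break                 # the "error" *is* the stop signal
        handle(item)

t = threading.Thread(target=consumer)
t.start()
q.put("payload")
q.shutdown()                      # producer side: release the channel
t.join()
```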

Complete and total tangent: Option 3, as I say in my comments, is the sort of code I generally discourage people from trying to write. Mixing the Status functions with the Wait functions (meaning Dequeue, Wait for Notifier, Event Structure, etc.) is a perfect storm for race conditions and missed signals, because the Status check and starting the Wait are not atomic operations. Here's a slight variant of Option 3, but this variant doesn't work... there are times when a low priority message will be handled even when there are multiple high priority messages waiting in the queue. This is my diagram, but it is based on some other code I've been shown that tried to do this, and it looked right to its author.

[image: the broken variant of Option 3]
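The broken diagram is an attachment, but the race is easy to restate in any language. A minimal Python rendering (queue names assumed) shows why a Status check followed by a separate Wait cannot be trusted:

```python
import queue

high_q = queue.Queue()
low_q = queue.Queue()

def racy_iteration():
    if high_q.qsize() == 0:    # (1) status check says "no high-priority traffic"
        # (2) a high-priority message can arrive right here, unseen
        return low_q.get()     # (3) so a low message is handled even though a
                               #     high one is now waiting: the reported bug
    return high_q.get()
```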


(Quoting Steen's TCPIP-Link feature list from above.)

This sounds like a more polished/advanced evolution of the Dispatcher in the CR (I like the use of events here, although I ran into issues with them and decided TCP/IP timeouts were more robust). Many of the features you highlight here (like auto-reconnect, system messages, heartbeat etc.) I've been meaning to add, along with a more bi-directional architecture (although the version I have in my SVN also has control channels as well as the subscriber streaming channels). But on the whole, your stuff sounds a lot more flexible and useful (I'd love to get a peek at your error detection and recovery :) )


That is really what events are for. However, I have a dislike for them since they cannot be encapsulated easily while maintaining genericism.

I think the thing that has put me off events is that it forces one to use an event structure as the only way to handle messages and then one might actually want an event structure somewhere else in the same VI and I have this prejudice against two event structures on the same diagram... Also, there's no equivalent to the queue/notifier status and flush primitives (although I take note of AQ's warning over race conditions when used in combination with dequeue/wait primitives). I guess what I really want is something that combines elements of everything:

  • Non-lossy like queues
  • One to many mappings like notifiers
  • Something I can feed the reference straight into an event structure and handle 'new element' events.
  • Re-ordering of queued elements so that one can have high priority traffic overtake lower priority.


I think the thing that has put me off events is that it forces one to use an event structure as the only way to handle messages [...] I guess what I really want is something that combines elements of everything: non-lossy like queues, one-to-many like notifiers, feedable straight into an event structure, and with re-ordering so high-priority traffic can overtake lower priority. [...]

Indeed. Events have been screaming for an overhaul for some time. I'm not sure, but I think they may also run in the UI thread, which would make them useless for running in different execution systems and priorities (another reason I don't use them much... just in case).

I would also add to your list being able to feed VISA sessions straight in so we can have event-driven serial (pet dislike of mine :P ).


I agree with both Shaun and Stephen. My loops only have one way to receive messages; adding more for special cases creates unnecessary complexity. I only use user events to send messages to "user input" loops (loops with an event structure to handle front panel actions). Sometimes I'll use notifiers for very simple single-command loops--a parallel loop on a block diagram that starts up and runs continuously until instructed to shut down. Usually I just stick with queues though, since the consistency is convenient and queues are a lot more flexible.

All solutions I've seen to this problem are either extremely messy or extremely inefficient <snip> either you end up polling all of them, you end up having some complex "wake everyone up when one wakes up" scheme, or you have sentinel values to give time to each one (this last not being an option if all channels are equal priority)

Polling or junk messages are the only ways I've been able to figure out how to deal with the problem. LapDog's PriorityQueue class uses polling simply because it's easier to implement and understand. I believe it would be possible to do the more efficient junk message implementation that allows for an arbitrary number of priority levels by wrapping the dequeue prim in a vi and dynamically launching an instance for each queue in the PriorityQueue's internal array when PriorityQueue.Dequeue is called. It's a fairly complex solution, but at least it's all wrapped in a class and hidden from the end user.

What would be handy for multiple loops distributed over indeterminate numbers of parallel running VIs would be a one-to-many queue with priorities...

I don't think it would be hard to implement what you describe. The aforementioned LapDog PriorityQueue class is an example of how to implement message priority functionality in a queue. There are several ways to make a custom "queue" class have the same one-to-many behavior as a notifier. You could have one input queue and an array of output queues, one for each message receiver. You could call it a queue but under the hood use a notifier instead. Combine the two and you're good to go.
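To make that concrete, here's a sketch of the combination in Python: one publish side fanning out to a per-subscriber priority queue, so every receiver sees every message, ordered by priority and then by enqueue order. The class and its API are hypothetical; LapDog's actual PriorityQueue is a LabVIEW library.

```python
import itertools, queue, threading

class FanoutPriorityQueue:
    def __init__(self):
        self._seq = itertools.count()   # tie-breaker: enqueue order
        self._subs = []
        self._lock = threading.Lock()

    def subscribe(self):
        """Each receiver gets its own output queue, notifier-style."""
        out = queue.PriorityQueue()
        with self._lock:
            self._subs.append(out)
        return out

    def publish(self, priority, msg):
        """Lower number = higher priority; every subscriber sees every message."""
        item = (priority, next(self._seq), msg)
        with self._lock:
            for out in self._subs:
                out.put(item)

# Usage: a high-priority element overtakes earlier low-priority traffic.
bus = FanoutPriorityQueue()
rx = bus.subscribe()
bus.publish(10, "log data")
bus.publish(0, "STOP")
print(rx.get())   # -> (0, 1, 'STOP')
```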

Stephen has some grief regarding the reference-nature of VIRegisters, but I don't think they are worse than named queues. Queues can be accessed from anywhere in the LV instance by name, using an internal lookup-table, and I don't see NI discouraging the use of named queues.

NI may not discourage their use, but there are many of us (admittedly a minority) who believe they are a quick solution instead of a good solution. Here's a quote taken from Stephen's excellent article, The Decisions Behind the Design:

LabVIEW has mechanisms already for sharing data on a wire. Although those mechanisms are insufficient by some standards, they do exist, and they improve in every LabVIEW release. LabVIEW didn’t need another way to share data. It needed a way to isolate data. It is hard to guarantee data consistency when you cannot limit changes to the data. To support encapsulation, we decided that a class in LabVIEW should be basically a cluster that cannot be unbundled by all VIs. Thus, unlike Java or C# and unlike C++, LabVIEW has a pure by-value syntax for its objects. When a wire forks, the object may be duplicated, as decided by the LabVIEW compiler. (Emphasis added.)

I remember not agreeing with that when I first read it. After all, data obtained in one part of an application is often needed by another part of the application. Isolation makes it harder to get the data from here to there, right? (As it turns out, not really, as long as you have a good messaging system and a well-defined loop control hierarchy.)

Unfortunately it's really easy to miss the key sentence in that paragraph, and there's no explanation about why it is important. IMO, that sentence is,

It is hard to guarantee data consistency when you cannot limit changes to the data.

As is usually the case when I disagree with Stephen, over time I began to see the why until eventually I understood he was right. The guarantee of data consistency pays off in spades in my ability to understand and debug systems. Any kind of globally available data--named queues, functional globals, globals, DVRs, etc.--breaks that guarantee to some extent.

When I'm digging through someone else's code and run into those constructs, I know the amount of work I need to do to understand the system as a whole has just increased--probably significantly. There's no longer a queue acting as a "single point of entry" for commands to the loop. Instead, I've got this data that is being magically changed somewhere else (possibly in many places) in the application. It is usually much harder to figure out how that component interacts with the other components in the system. Named queues are especially bad. We have some control over the scope of the other constructs and can limit where they're used, but once we create a named queue there's no way to limit who interacts with it.

"Good" application architectures limit the connections between components and the knowledge one component has of another. Flags usually, imo, reveal too much information about other components and lead to tighter coupling. Suppose I have a data collection loop and a data logging loop and I want the logger to automatically save the file when data collection stops. A flag-based implementation might have the collection loop trip a "CollectionStopped" boolean flag when it stopped collecting data, while the logging loop polls the flag and saves the data when it switches to true. Quick and easy, right? Yeah, if you don't mind the consequences.

For starters, why should the logging loop even know of the existence of a collection loop? (It does implicitly by virtue of the CollectionStopped boolean.) It shouldn't care where the data came from or the conditions under which it should save the data, only that it has data that might need to be saved at some time. Maybe the data was loaded from a file or randomly generated. Maybe I want users to have the option to save the data every n minutes during collection to guard against data loss. How do I save that data?

I could trip the flag in other parts of my application to cause the logging loop to save the data (if we optimistically assume the flag doesn't also trigger actions by other components), but why is a timer loop way over on this side of the app tripping a "CollectionStopped" flag when it doesn't have any idea if collection has started or stopped? To clarify the code perhaps we rename the flag to "SaveDataLog." Better? Not really. Why is the collection loop issuing a command to save data? What if I'm running a demo or software test and don't want to save the data? Another flag? How many flags will I have to add to compensate for special cases? Each time I make a change to accommodate a special case I have to edit the collection loop or the logging loop, possibly introducing more bugs.

Contrast that with a messaging-based system, where the collection loop starts when it receives a "StartDataCollection" message on its input queue and sends a "DataCollectionStopped" message on its output queue when it stops. Likewise the logging loop saves data to disk when it receives a "SaveData" input message and sends a "DataSaved" output message when it has finished saving. Both loops send and receive messages from a mediator (or control) loop that does the message routing. Once implemented and tested the collection and logging loops don't need to be revisited to handle special cases. Separating the code that performs the collection and logging from the logic that controls the collection and logging makes it far easier (again imo) to build robust, sustainable, and testable applications.
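A minimal Python sketch of that routing logic: the collection and logging loops know only their own messages, and the rule "save when collection stops" lives in exactly one place. Message names follow the text; the queue plumbing is assumed.

```python
import queue

to_mediator = queue.Queue()
to_collector = queue.Queue()
to_logger = queue.Queue()

def mediator_loop():
    while True:
        msg = to_mediator.get()
        if msg == "StartTest":
            to_collector.put("StartDataCollection")
        elif msg == "DataCollectionStopped":
            to_logger.put("SaveData")   # the only place this rule exists
        elif msg == "DataSaved":
            break                       # test finished; shut the mediator down
```

Changing when data gets saved (periodic saves, demo runs that skip saving) now means editing only the mediator, not the collection or logging loops.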

Acquiring data and saving data are two independent processes. Directly linking their behavior via flags or any other specific reference type leads to long term coupling and makes it harder to extend the app's functionality as requirements change. Do I use reference data and flags? Yep, when I have to, but I think they are used far more often than they need to be and I've found I rarely need them myself.

Where do they run then? In the execution system set in the VI properties?
Yes. Each Event Structure has its own event queue, and it dequeues from that queue in whatever thread is executing that section of the VI. The default is "whatever execution thread happens to be available at the time", but the VI Properties can be set to pick a specific thread to execute the VI, and the Event Structure executes in that thread.

The "Generate User Events" node follows the same rules -- it runs in whatever thread is running that part of the VI. UI events are generated in the UI thread, obviously, but they're handled at the Event Structure.

I think the thing that has put me off events is that it forces one to use an event structure as the only way to handle messages and then one might actually want an event structure somewhere else in the same VI and I have this prejudice against two event structures on the same diagram..
A good prejudice to have, generally. But I have to ask... why wouldn't you use the same event structure to handle both? The Event Structure was certainly intended to mix dynamic user events and static UI events handling, and it works really well for keeping that behavior straight. And if the two event structures are handling completely disjoint sets of user events, that's one of the times when the prejudice can be relaxed without worry. Not that I've ever seen a reason to do this, but it would be ok.

Yes. Each Event Structure has its own event queue, and it dequeues from that queue in whatever thread is executing that section of the VI. [...] The "Generate User Events" node follows the same rules -- it runs in whatever thread is running that part of the VI. [...]

Hmmm. If that is true, how is it reconciled with the events of front panel controls, which surely (neck stretched far) must be generated in the UI thread? I could understand "User Events" being able to run in anything, but if bundled with a heap of front panel events, is it still true?


Named queues are especially bad. We have some control over the scope of the other constructs and can limit where they're used, but once we create a named queue there's no way to limit who interacts with it.

I am not disagreeing with this point, but I always struggle with how to exchange queue information. How does object A tell object B to send it a message when criterion X has been achieved, when A and B have been created independently? This is a classic Publish-Subscribe scenario. While I do not use the name parameter in my queue creation for this purpose, I do have a look-up table that is effectively the same thing.

So, what is the best way to share queue references in a network of queues? Or perhaps you don't really share queues. Object A simply tells some message clearing house component that it is interested in knowing about 'criteria X', and Object B tells the same message clearing house component that it can generate a message on 'criteria X'. So, potentially, neither Object A nor B knows about the other, only the message clearing house component.


How does object A tell object B to send it a message when criterion X has been achieved, when A and B have been created independently? This is a classic Publish-Subscribe scenario.

Preface: I'm assuming there are unstated reasons obj A and obj B must run in separate loops. This implementation pattern is overkill if A and B need not be asynchronous.

Publish-subscribe is one solution to that problem. PS establishes the communication link between the two components at runtime. That is useful primarily when the communication links between components are not well-defined at edit time. For example, a plug-in framework, where the number of installed plug-ins and the messages each plug-in is interested in will vary, is a good candidate for PS.

Usually I don't need that level of flexibility, so using PS just adds complexity without a tangible benefit. To keep A and B decoupled from each other I combine them together in a higher level abstraction. What you are calling a "message clearing house" I call a "mediator loop" because it was inspired by the Mediator Pattern from GoF. The mediator receives messages from A and B and, based on the logic I've built into the mediator loop, forwards the message appropriately.

[image: mediator loop routing messages between the Obj A and Obj B loops]

A couple things to point out in the diagram above:

1. Each mediator loop is the "master loop" for one or more "slave loops" (which I've mentioned elsewhere.) "Master" and "slave" are roles the loops play, not attributes of the loop. In other words, the Obj A loop may actually be another mediator loop acting as master for several slaves, and the Mediator Loop on the diagram may be a slave to another higher level master.

2. The arrows illustrate message flow between loops, not static dependencies between components (classes, libraries, etc.) Instances of slave components only communicate with their master, but in my implementations they are not statically dependent on the component that contains the master. However, the master component is often statically dependent on the slave. (You may notice having master components depend on slave components violates the Dependency Inversion Principle. I'm usually okay with this design decision because a) I want to make the application's structure as familiar to non-LVOOP programmers as possible to lower the barrier to entry, and b) if I need to it is fairly straightforward to insert an inversion layer between a master and a slave.)

In practice my applications have ended up with a hierarchical tree of master/slave loops, like this:

[image: hierarchical tree of master/slave loops]

The top loop exposes the application's core functionality to the UI loops (not shown) via the set of public messages it responds to. Each loop exposes messages appropriate for the level of abstraction it encapsulates. While a low level loop might expose a "LoadLimitFileFromDisk" message, a high level loop might just expose a "StartTest" message with the logic of actually starting a test contained within a mediator loop somewhere between the two.

Slave loops--which all loops are except for the topmost master ("high master?") loop in each app--have two fundamental requirements (a rough sketch follows the list):

1. They must exit and clean up when instructed to do so by their master. That includes shutting down their own slave loops.

2. They must report to their master when they exit for any reason.
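A rough Python sketch of those two requirements, with message strings and queue plumbing assumed: the slave propagates "Exit" to its own slaves first, waits for their reports, and always reports back, whatever made it stop.

```python
import queue

def handle(msg):
    ...  # normal message processing

def slave_loop(inbox, to_master, my_slaves, from_slaves):
    try:
        while True:
            msg = inbox.get()
            if msg == "Exit":
                for s in my_slaves:      # requirement 1: shut down own
                    s.put("Exit")        # slaves before exiting
                for _ in my_slaves:
                    from_slaves.get()    # wait for each slave's "Exited"
                break
            handle(msg)
    finally:
        to_master.put("Exited")          # requirement 2: always report back
```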

These two requirements make it pretty straightforward to do controlled exits and have eliminated most of the uncertainty I used to have with parallelism. A while back I had an app that in certain situations was exiting before all the data was saved to disk. To fix it I went to the lowest mediator loop that had the in-memory data and data persistence functionality as slaves and changed the message handling code slightly so the in-memory slave wasn't instructed to exit until after the mediator received an "Exited" message from the persistence slave. The change was very simple and very localized. There was little risk in accidentally breaking existing behavior and the change didn't require crossing architectural boundaries the way references tend to.

Final Notes:

-Strictly speaking, I don't think my mediator loops are correctly named. As I understand it a Mediator's sole responsibility is to direct messages to the appropriate recipient. Sometimes my mediators will keep track of a slave's state (based on messages from the slave) and filter messages instead of passing them along blindly.

-Not all masters are mediators. I might have a state machine loop ("real" SM, not QSM) that uses a continuously running parallel timer loop to trigger regular events. The state machine loop is not a mediator, but it is the timer loop's master and is responsible for shutting it down.

-The loop control hierarchy represents how control (and low speed data) messages propagate through the system. For high-speed data acquisition the node-hopping nature of this architecture probably will not work well. To solve that problem I create a data "pipe" at runtime to run data directly from the producer to the consumer, bypassing the control hierarchy. The pipe refnum (I use queues) is sent to the producer and consumer as part of the messages instructing them to start doing their thing.

Final Final Note:

In general terms, though not necessarily in software engineering terms, a mediator could be anything that intercepts messages and translates them for the intended recipient. Using that definition, any kind of abstraction is a mediator. An instrument driver mediates messages between your code and the instrument. Your code mediates messages between the user and the system. I don't know where this line of thought will lead me, but it's related to the vague discomfort I have over calling it a "mediator loop."


I've been on vacation and I haven't looked at the VIRegister library at all, but I did want to share how our team stops loops.

All of our code has something like the following architecture, and all loops are stopped by destroying a reference to a queue or notifier, which is a pattern we call "scuttling". It doesn't matter whether the wait functions have timeouts or not: it works immediately, and all you have to do is filter out the error (error 1 or error 1122) that it throws. Usually the code is split into many VIs, but as long as you make sure that any queue or notifier reference is eventually destroyed (we try always to do that in the same VI as the creation), then all the loops will stop.

[image: example architecture using the scuttling pattern]
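A hedged Python analogue of scuttling (Queue.shutdown, Python 3.13+): destroying the one queue everybody waits on unblocks every loop at once, and the resulting exception plays the role of error 1/1122, which each loop filters out and treats as "stop".

```python
import queue, threading

shared_q = queue.Queue()

def worker(name):
    while True:
        try:
            item = shared_q.get()         # blocked wait, timeout or not
        except queue.ShutDown:            # the scuttle: filter the error...
            print(name, "stopped cleanly")
            break                         # ...and treat it as the stop signal
        print(name, "got", item)

threads = [threading.Thread(target=worker, args=(f"loop{i}",)) for i in range(3)]
for t in threads:
    t.start()
shared_q.shutdown(immediate=True)         # destroy the reference; all loops stop
for t in threads:
    t.join()
```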


Or, my favorite, each "module" has a queue linked to the VI instance name. To close all dependents, you only need to list all VI names and poke... [snip]

That is basically what VIRegisters are. And exactly the use case that prompted me to make AutoQueues, and to wrap a couple of those in a polyVI --> VIRegister. But it's not good style, as you're spawning queues by reference this way, Shaun ;). But you must admit it's convenient...

Cheers,

Steen


Ok, a lot of discussion went on before I got back to this - super :rolleyes:.

(On named queues):

NI may not discourage their use, but there are many of us (admittedly a minority) who believe they are a quick solution instead of a good solution. Here's a quote taken from Stephen's excellent article, The Decisions Behind the Design:

(Quote from Stephen: It is hard to guarantee data consistency when you cannot limit changes to the data):

As is usually the case when I disagree with Stephen, over time I began to see the why until eventually I understood he was right. [...]

When I'm digging through someone else's code and run into those constructs, I know the amount of work I need to do to understand the system as a whole has just increased [...] once we create a named queue there's no way to limit who interacts with it.

I agree in principle with what you're stating, and have had my share of apps suffering from entanglement - but I don't think it's all as black and white as you make it sound.

There is a duality in "hiding" and "encapsulating" functionality. A subVI does both, as does a LabVIEW class.

LabVIEW is evolving into a language more and more devoid of wires; first we bundled stuff together, then we magically wafted data off into the FP terminals of dynamically dispatched VIs, then we got events, and now we have all sorts of wire-less variables like Shared Variables. Every time we use a shift register or feedback node we have the potential for inheriting data that others have changed - and this didn't get any better with the introduction of shared reentrancy. My point is that I wouldn't dispense with any of this. What other people can or can't comprehend, and what they might make a mess of, isn't my concern. As long as I ensure my application or toolset works as intended, I will happily use 5 references when the alternative is a gazillion wires. And when I need to make sure some data stays verbatim, I'll of course make sure it's transferred by value.

Therefore, the difference in my mind between a named and an unnamed queue is convenience. Queues are very flexible, as we agree on, but unnamed queues have the built-in limitation that you have to wire the refnum to all your users. And getting a refnum to a dynamically dispatched VI can't be done by wire. All a named queue does differently is to link its refnum to another static refnum (the name). When I keep that name secret, no one else is going to change my data, but I as developer know the name and can pull my shared data from Hammer space when I want to.

Inside my application it's my responsibility to maintain modularization to ensure maintainability and testability. I do that by having several hierarchies of communication - it looks a bit like your "mediator" tree. From the top there is typically a single command channel going out to each module. Then each module might have one or more internal communication channels for each sub-component. The need for more than one channel is typically tied to prioritization, as Stephen highlights (so I'm thinking in parallel about a good architecture for that). Down at the lowest level I might use flags (or now VIRegisters) to signal to the individual structures that they must do something simple, like stopping (just as Shaun favors). But these "flags" don't span an entire application; they are created, used, and destroyed inside an isolated module. There isn't any replacement for sharing data (by reference) when you want to share data (a common signal). I can't use by-value for this.

So regarding references, the issue seems to boil down to whether the reference is public or private. Isn't that it? If you keep the scope small enough, you can guarantee even the most complex tool will work.

Regarding your (very good) data logger/data collection analogy, I couldn't agree more. But application development is always about compromise. It seems to be a rule that whenever you improve encapsulation you also add weight (system load, latency etc.) to your application. 99 out of 100 times I tend to favor encapsulation and abstraction though, as computers get faster, and I trust I can always optimize my way out of any performance deficiency :lol:. But if every operation got absolutely abstract, the code would be very heavy to run, not to mention extremely hard to understand. The magic happens when you get the abstractions just right: your application runs on air, it takes a wizard to shoot its integrity out of the water, and a freshman still understands what happens. One case needs some tools and another case needs other tools to achieve that. There are many ways to do it wrong, but I still believe there are also many ways to do it right.

Cheers,

Steen

