Grampa_of_Oliva_n_Eden Posted May 9, 2011 Report Posted May 9, 2011 Is LV a Pure dataflow language? No. 1) Mark a sub-VI as "sub-routine" priority. 2) Find an instance of that sub-VI and choose "skip if busy". If the sub-VI is crunching numbers in one thread when called by another it will be skipped. So no, LV is not pure dataflow. Ben Quote
Val Brown Posted May 9, 2011 Report Posted May 9, 2011 How do you not see that as dataflow? I'm a bit confused. Quote
Grampa_of_Oliva_n_Eden Posted May 9, 2011 Report Posted May 9, 2011 How do you not see that as dataflow? I'm a bit confused. It is an exception to pure dataflow since the VIs following the skipped sub-VI execute even though they do NOT get the data from the skipped sub-VI. Ben Quote
Val Brown Posted May 9, 2011 Report Posted May 9, 2011 They definitely "get" that the VI was skipped and then, presumably use default values or previously stored values if they include something like FGV code. What do you think should happen instead? Quote
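For readers who want something concrete to argue over, the "skip if busy" behaviour described above can be loosely sketched in Python. LabVIEW is graphical, so this is only an analogy: the class name, the `default` value and the doubling "computation" are all hypothetical, and a non-blocking lock stands in for whatever LabVIEW really does internally.

```python
import threading

class SkippableSubVI:
    """Hypothetical stand-in for a sub-VI whose call site is set to 'skip if busy'."""

    def __init__(self, default=0):
        self._busy = threading.Lock()
        self.default = default  # assumed value handed downstream when the call is skipped

    def call(self, x):
        # Non-blocking acquire: if another caller is inside, skip instead of waiting.
        if not self._busy.acquire(blocking=False):
            return self.default, True   # (value, skipped)
        try:
            return x * 2, False         # placeholder for the real number crunching
        finally:
            self._busy.release()

vi = SkippableSubVI(default=-1)
normal = vi.call(21)            # nobody else is inside: the sub-VI runs
with vi._busy:                  # pretend another thread is currently inside
    skipped = vi.call(21)       # downstream code still gets *a* value
print(normal, skipped)
```

The downstream code runs in both cases; in the skipped case it runs on a default rather than on data produced by the sub-VI, which is the behaviour Ben and Val are disagreeing about how to classify.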
Daklu Posted May 9, 2011 Report Posted May 9, 2011 The links are very good. Thanks for posting them. Well I was hoping that AQ would have backed me up by now, but then I realized that he backed me up on this last time I tried to tell people that queues don't poll. So if you won't take my word for it, you can post a rebuttal on that other thread... Well I didn't mean that the OS allows LabVIEW to own a hardware interrupt and service it directly. But an OS provides things like a way to register callbacks so that they are invoked on system events. The interrupt can invoke a callback (asynchronous procedure call). I don't think you are understanding what I am trying to say. I'm talking about the entire software stack... user application, Labview runtime, operating system, etc., whereas you are (I think) just talking about the user app and LV runtime. Unless the procedure is directly invoked by the hardware interrupt, there must be some form of polling going on in software. Take the timer object in the multimedia timer link you provided. How do those work? "When the clock interrupt fires, Windows performs two main actions; it updates the timer tick count if a full tick has elapsed, and checks to see if a scheduled timer object has expired." (Emphasis mine.) There's either a list of timer objects the OS iterates through to see if any have expired, or each timer object registers itself on a list of expired timer objects that the OS then checks. Either way that checking is a kind of polling. Something--a flag, a list, a register--needs to be checked to see if it's time to trigger the event handling routine. The operating system encapsulates the polling so developers don't have to worry about it, but it's still there. How is it relevant to our discussion? You said dequeue functions break dataflow because they are "event-driven." I'm saying "event-driven" is an artificial distinction since software events are simply higher level abstractions of lower level polling. 
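The timer mechanism quoted above ("checks to see if a scheduled timer object has expired") can be sketched roughly as follows. This is not how Windows is implemented; it is a toy model in Python, and `TimerList`, `schedule` and `on_clock_interrupt` are made-up names. It only shows the shape of the check: a list of timers is examined each time the tick handler is invoked, and never otherwise.

```python
import heapq

class TimerList:
    """Toy model of an OS timer list; all names here are hypothetical."""

    def __init__(self):
        self.now = 0      # tick count
        self._heap = []   # entries are (expiry_tick, sequence_number, callback)
        self._seq = 0     # tie-breaker so callbacks are never compared
        self.checks = 0   # how many times the list has been examined

    def schedule(self, delay_ticks, callback):
        heapq.heappush(self._heap, (self.now + delay_ticks, self._seq, callback))
        self._seq += 1

    def on_clock_interrupt(self):
        # Runs only when the (simulated) clock interrupt fires.
        self.now += 1
        self.checks += 1
        while self._heap and self._heap[0][0] <= self.now:
            _, _, callback = heapq.heappop(self._heap)
            callback()

fired = []
timers = TimerList()
timers.schedule(3, lambda: fired.append("timer A"))
for _ in range(5):                 # five simulated clock interrupts
    timers.on_clock_interrupt()
print(fired, timers.checks)
```

Whether that per-tick check deserves the name "polling" is exactly the point in dispute; the sketch takes no side, it just makes the mechanism visible.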
What about the discussion where AQ said the queues don't poll? Under certain circumstances--the number of parallel nodes is equal to or less than the number of threads in the execution system--the execution system may not do any polling. (Though the OS could on behalf of the execution system.) However, he also said this: "If the number of parallel nodes exceeds the number of threads, the threads will use cooperative multitasking, meaning that a thread occasionally stops work on one section of the diagram and starts work on another section to avoid starvation of any parallel branch." (Emphasis mine.) Windows uses pre-emptive multitasking to decide when to execute threads. The application, or in this case the Labview RTE, decides what is going to be executed on each thread. The cooperative multitasking has to be implemented in the Labview RTE. I ran a quick check and spawned 50 different queues each wired into a dequeue function with infinite timeout, then used Process Explorer to monitor Labview's threads. As expected, LV didn't spawn new threads for each sleeping queue, so it's safe to say the os thread doesn't necessarily sleep on a waiting dequeue. According to that discussion LV8.6 apparently uses a strategy similar to what I described above in wild speculation #2. The dequeue prim defines the start of a new clump. That clump stays in the waiting area until all its inputs are satisfied. Assuming all the clump's other inputs have been satisfied, when the enqueue function executes the queued data is mapped to the dequeue clump's input, which is then moved to the run queue. What happens if the clump's other inputs have not been satisfied? That clump stays in the waiting room. How does LV know if the clump's inputs have been satisfied? It has to maintain a list or something that keeps track of which inputs are satisfied and which inputs are not, and then check that list (poll) to see if it's okay to move it to the run queue. 
Having said all that, whether or not the dequeue function polls (and what definition we should use for "poll") isn't bearing any fruit and seems to have become a sideshow to more productive topics. (You are of course free to respond to any of the above. Please don't be offended if I let the topic drop.) Do you think LabVIEW polls the keyboard too? No, but the operating system does. Based on re-reading this old AQ post, I'm trying to reconcile your concept of constant-source dataflow with the clumping rules and the apparent fact that a clump boundary is caused whenever you use a node that can put the thread to sleep. I assume you're referring to an os thread here? In LV8.6 it looks like that was true. In one of the threads AQ went on to say, "All of the above is true through LV 8.6. The next version of LV this will all be mostly true, but we're making some architecture changes... they will be mildly beneficial to performance in the next LV version... and they open sooooo many intriguing doors..." and we know they made a lot of improvements for 2010, so the clumping rules may not require that anymore. It sounds like it is much harder for LV to optimize code which crosses a clump boundary. If someone cares about optimization (which in general, they shouldn't), then worrying about where the code is put to sleep might matter more than the data sources. My understanding is the clumping *is* the optimization. But your comment raises a very good point... In a dataflow language if the clumps are too big parallelism suffers and performance degrades. If the clumps are too small a disproportionate amount of time is spent managing the clumps and performance degrades. The best thing we can do (from an efficiency standpoint) as developers is let Labview make all the decisions on where and how to clump the code. (That means don't enforce more serialization than is necessary and don't write code that requires clump breaks.) 
As computer hardware and compiler technology improve through yearly Labview releases, recompiling the app will automatically create clumps that are the "best" size. So maybe when AQ talks about "breaking" dataflow, he is talking about stuff that forces a clump boundary? (Edit - Hmm... that doesn't seem quite right either.) Overall I'm having trouble seeing the utility of your second category. Yes, your queue examples can be simplified, but it's basically a trivial case, and 99.99% of queue usage is not going to be optimizable like that so NI will probably never add that optimization. It's applicable whenever a queue only has a single enqueue point and a single dequeue point. There are other cases where depending on how it has been coded it could be resolved with multiple enqueue points. In principle there are lots of places where a queue could be factored out completely. For example, you could have a global variable that is only written in one place, so it might seem like the compiler could infer a constant-source dataflow connection to all its readers. But it never could, because you could dynamically load some other VI which writes the global and breaks that constant-source relationship. If the dynamically loaded vi is included as part of the executable build, then the compiler would know that it cannot replace the global. I admit I don't know what would happen if the dynamic vi were not included in the executable, loaded during runtime, and tried to write to the global variable in the executable--I've never tried it. Assuming it works, then constant-source obviously isn't the right way to determine if dataflow is broken. ------------- [Edit] Is LV a Pure dataflow language? Excellent point Ben. Maybe we need to make a sharper distinction between edit time dataflow and run time dataflow? Quote
jdunham Posted May 9, 2011 Report Posted May 9, 2011 (edited) I don't think you are understanding what I am trying to say. I'm talking about the entire software stack... user application, Labview runtime, operating system, etc., whereas you are (I think) just talking about the user app and LV runtime. Unless the procedure is directly invoked by the hardware interrupt, there must be some form of polling going on in software. Take the timer object in the multimedia timer link you provided. How do those work? "When the clock interrupt fires, Windows performs two main actions; it updates the timer tick count if a full tick has elapsed, and checks to see if a scheduled timer object has expired." (Emphasis mine.) There's either a list of timer objects the OS iterates through to see if any have expired, or each timer object registers itself on a list of expired timer objects that the OS then checks. Either way that checking is a kind of polling. Something--a flag, a list, a register--needs to be checked to see if it's time to trigger the event handling routine. The operating system encapsulates the polling so developers don't have to worry about it, but it's still there. Well I do understand, but I don't agree. We'll probably never converge, but I'll give it one more go. Repeating, "When the clock interrupt fires, ... it checks to see if a scheduled timer object has expired". This seems to you like polling, but I don't think it is. If the interrupt never fires, the timer objects will never be checked. Now since the clock is repetitive, it seems like polling, but if you cut the clock's motherboard trace, that check will never occur again, since it's not polled. If app-level code has registered some callback functions with the OS's timer objects, then those callbacks will never be called, since there is no polling, only a cascade of callbacks from an ISR (I suppose you could call that interrupt circuitry 'hardware polling', but it doesn't load the CPU at all). 
According to that discussion LV8.6 apparently uses a strategy similar to what I described above in wild speculation #2. The dequeue prim defines the start of a new clump. That clump stays in the waiting area until all its inputs are satisfied. Assuming all the clump's other inputs have been satisfied, when the enqueue function executes the queued data is mapped to the dequeue clump's input, which is then moved to the run queue. What happens if the clump's other inputs have not been satisfied? That clump stays in the waiting room. How does LV know if the clump's inputs have been satisfied? It has to maintain a list or something that keeps track of which inputs are satisfied and which inputs are not, and then check that list (poll) to see if it's okay to move it to the run queue. Again, polling is one way to do it, but is not required. Not all testing is polling! You might only test the list of a clump's inputs when a new input comes in (which is the only sensible way to do it). So if a clump has 10 inputs, you would only have to test it 10 times, whether it took a millisecond or a day to accumulate the input data. I guess that's back to definitions, but if you're not running any tests, not consuming any CPU while waiting for an event (each of our 10 inputs = 1 event in this example), then you're not polling the way I would define polling. You don't have to run the test after every clump, because LV should be able to associate each clump output with all the other clump inputs to which it is connected. You only have to run the test when one of the clumps feeding your clump terminates. It makes sense that LV wouldn't poll the queues in the example because it wouldn't really help with anything. That's the watched pot that never boils. As long as you design the execution system to be able to work on a list of other clumps which are ready to run, and you can flag a clump as ready based on the results of the previous clumps, then you don't need to poll. 
It's sort of like... dataflow! If the LV execution thread has exhausted all the clumps, I suppose it could poll the empty list of clumps, but by the same logic it doesn't need to. The minimum process load you see LV using at idle may be entirely dedicated to performing Windows System Timer event callbacks, all driven from the hardware ISR (which seems like polling but I already tried to show that it might not be polled). If Microsoft or Apple or GNU/Linux engineers chose to simulate event behavior with a polled loop within the OS, then yes it could be, but it doesn't have to be polled. And as Steve pointed out, if there are no functions to run at all the processor will do something, but I don't think you have to call that polling, since the behavior will go away when there is real code to run. Having said all that, whether or not the dequeue function polls (and what definition we should use for "poll") isn't bearing any fruit and seems to have become a sideshow to more productive topics. (You are of course free to respond to any of the above. Please don't be offended if I let the topic drop.) Understood. I'm enjoying the spirited debate, but it doesn't need to go on forever. I'm glad we're having it because the mysteries about queues and sleeping threads and polling are common. If the topic were silly or had easy answers, presumably someone else would have come forward by now. Edited May 9, 2011 by jdunham Quote
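The "test only when a new input comes in" scheme from this post can be sketched in a few lines of Python. This is purely illustrative and says nothing about how the LabVIEW execution system is actually written; `Clump`, `Scheduler` and `deliver_input` are hypothetical names, and the counter exists only to make the number of readiness tests observable.

```python
from collections import deque

class Clump:
    """Hypothetical clump that becomes runnable once all its inputs have arrived."""

    def __init__(self, name, n_inputs):
        self.name = name
        self.missing = n_inputs   # inputs still outstanding
        self.tests = 0            # how often readiness was tested

class Scheduler:
    def __init__(self):
        self.run_queue = deque()  # clumps that are ready to execute

    def deliver_input(self, clump):
        # The readiness test runs here, and only here: once per arriving input.
        clump.tests += 1
        clump.missing -= 1
        if clump.missing == 0:
            self.run_queue.append(clump)

sched = Scheduler()
clump = Clump("dequeue clump", n_inputs=10)
for _ in range(9):
    sched.deliver_input(clump)
ready_after_nine = len(sched.run_queue)   # still waiting on one input
sched.deliver_input(clump)                # tenth input arrives
print(ready_after_nine, len(sched.run_queue), clump.tests)
```

With ten inputs the test runs ten times, whether the inputs take a millisecond or a day to arrive, which is the property the post is describing.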
ShaunR Posted May 9, 2011 Report Posted May 9, 2011 (edited) Well. I'm coming in a bit late but I'm a bit surprised that there's no discussion of tagged-token/static dataflow or even actors and graphs. Yet this whole thread is about dataflow? I don't know specifically because I have never given it much thought and only have a cursory understanding since it's not my remit. But if I were to categorise LabVIEW I would say it is either a tagged-token dataflow model or a variant thereof (hybrid?). Asking whether it is "pure" or not is a bit like asking if the observer design pattern is a "pure" design pattern. Edited May 9, 2011 by ShaunR Quote
jdunham Posted May 9, 2011 Report Posted May 9, 2011 Well. I'm coming in a bit late but I'm a bit surprised that there's no discussion of tagged-token/static dataflow or even actors and graphs. Yet this whole thread is about dataflow? Apologies for not being sufficiently buzzword-compliant. And it's about time you got here! Quote
SteveChandler Posted May 9, 2011 Author Report Posted May 9, 2011 Repeating, "When the clock interrupt fires, ... it checks to see if a scheduled timer object has expired". This seems to you like polling, but I don't think it is. If the interrupt never fires, the timer objects will never be checked. I think you are right - it is all about definitions. My definition is that if Windows is checking something every time an ISR is fired it is still polling regardless of the fact that the check was fired by a hardware interrupt. I think what you are saying is that since the code that does the check is automatically fired directly or indirectly by the ISR it is not polling. I can kind of understand that definition. Quote
jdunham Posted May 9, 2011 Report Posted May 9, 2011 I think you are right - it is all about definitions. My definition is that if Windows is checking something every time an ISR is fired it is still polling regardless of the fact that the check was fired by a hardware interrupt. I think what you are saying is that since the code that does the check is automatically fired directly or indirectly by the ISR it is not polling. I can kind of understand that definition. Sure. And it depends on what you are checking. If you are looking for new keypresses every time the timer ISR is fired, that's certainly polling. But if you only check for keypresses when the keyboard ISR fires, that's not polling. I don't think Windows polls the keyboard. Quote
Daklu Posted May 10, 2011 Report Posted May 10, 2011 Perhaps some of our disagreement is related to our backgrounds. I spent a year writing microcontroller code in assembly and loved it. (I still dig out my stuff for fun on rare occasions.) When you're working that close to the metal software events don't exist. This seems to you like polling, but I don't think it is. In your opinion, what is polling? What characteristics are required to call something polling? Based on your comments I'm guessing that it needs to be based on a real-world time interval instead of an arbitrary event that may or may not happen regularly? [Edit] I don't think Windows polls the keyboard. You're right that windows doesn't poll the keyboard directly. But unless you're using a computer circa 1992 the keyboard doesn't generate interrupts either. The polling is done by the USB controller chip. USB is a host-driven communication protocol. Every so often the usb host sends out signals to the connected devices to see if they have any new information to report. If so, the host sends another message asking for the report. For mice the default polling interval is 8ms. Since mice only report the position delta from the last report, if you could move a mouse and return it to its original position before 8 ms has passed windows would never know the mouse had moved. You couldn't do that with an event-based mouse. USB Keyboards operate essentially the same way, except they probably use a buffer to store the keystrokes and pass them all to the host when the next request comes along to avoid missing keystrokes. [/end Edit] Well. I'm coming in a bit late but I'm a bit surprised that there's no discussion of tagged-token/static dataflow or even actors and graphs. Yet this whole thread is about dataflow? Ugh... your linked document is ancient! Try something a little more up to date. (It's been a while since I've read over it but I do keep it around in my reference library.) 
I don't know specifically because I have never given it much thought and only have a cursory understanding since it's not my remit. But if I were to categorise LabVIEW I would say it is either a tagged-token dataflow model or a variant thereof (hybrid?). My guess is LV uses a hybrid approach as described in section 4 of the link. Specifically, I think they are using a Large-Grain Dataflow approach. Asking whether it is "pure" or not is a bit like asking if the observer design pattern is a "pure" design pattern. Maybe "pure" dataflow is conceptual. Everything that can execute in parallel does. Every node has a single output that propagates as soon as the node finishes operating. Every node executes as soon as its inputs are satisfied. (I think the article refers to this as fine-grained dataflow.) The delay between when the inputs are satisfied and when the node is fired is due to the threading model of modern computer processors. In other words, variable execution time isn't a property of dataflow, it's a side-effect of implementing dataflow on the von Neumann architecture. That raises the question... what about writing LV code for an FPGA? They can run many more (albeit much shorter) parallel "threads" than a pc cpu. I would think FPGA programming would benefit a lot from writing code in "pure" form. Quote
jdunham Posted May 10, 2011 Report Posted May 10, 2011 In your opinion, what is polling? What characteristics are required to call something polling? Based on your comments I'm guessing that it needs to be based on a real-world time interval instead of an arbitrary event that may or may not happen regularly? I would say polling is testing something in software repeatedly to detect state change rather than executing code (possibly including the same test) as a result of a state change (or the change of something else strongly coupled to it). Time is not relevant. I would also say that if polling is going on beneath some layer of abstraction (like an OS kernel), then it's not fair to say that everything built on top of that (like LabVIEW queues) is also polled. At last I would say that there could exist an implementation of LabVIEW such that one could create a queue and then the LV execution engine could wait on it without ever having to blindly test whether the queue is empty in order to decide whether to dequeue an element. That test would only need to be run once whenever the dequeue element node was called, and then once more whenever something was inserted into the queue, assuming the node was forced to wait. Given that an ignoramus like me could sketch out a possible architecture, and given that AQ has previously posted that a waiting queue consumes absolutely no CPU time, I believe that such an implementation does exist in the copy of LabVIEW I own. Pretty much all of those statements have been denied by various posters in this thread and others, though I'm sure that's because it has taken me a long time to express those ideas succinctly. I'm sort of hoping it worked this time, but extrapolating from the past doesn't give me too much confidence. Thanks for indulging me! Jason Quote
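A possible shape for the queue Jason describes (test once when the dequeue is called, then once more when something is inserted) is sketched below in Python using a condition variable. It is an illustration of the idea only, not a claim about LabVIEW's queue primitives; `WaitingQueue` and `empty_tests` are invented for the example, and what `threading.Condition` does underneath is a separate layer of the very question being debated.

```python
import threading
from collections import deque

class WaitingQueue:
    """Hypothetical queue whose consumer sleeps instead of re-testing in a loop."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()
        self.empty_tests = 0      # counts how often "is the queue empty?" is evaluated

    def enqueue(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify()   # wake one waiting consumer, if any

    def dequeue(self):
        with self._cond:
            while True:
                self.empty_tests += 1
                if self._items:
                    return self._items.popleft()
                self._cond.wait() # sleep until enqueue() notifies

q = WaitingQueue()
result = []
consumer = threading.Thread(target=lambda: result.append(q.dequeue()))
consumer.start()
q.enqueue("hello")
consumer.join()
print(result, q.empty_tests)      # the emptiness test ran once or twice, not continuously
```

The count depends on whether the consumer reached `dequeue` before or after the `enqueue`, but it does not grow with the length of the wait.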
SteveChandler Posted May 10, 2011 Author Report Posted May 10, 2011 I would also say that if polling is going on beneath some layer of abstraction (like an OS kernel), then it's not fair to say that everything built on top of that (like LabVIEW queues) is also polled. Thanks for indulging me! Jason Ah that is the crux of it. What I was getting at earlier is that something somewhere is polling for a message in a queue even though, as AQ says, the thread with the dequeue is fast asleep. And if code is triggered by a state change that is event driven. But something is continuously checking for the state change. I also got my start in extremely low level assembly language and most of my experience with event driven programming is Twisted. That has to impact on my way of viewing the world. Quote
Daklu Posted May 15, 2011 Report Posted May 15, 2011 I would also say that if polling is going on beneath some layer of abstraction (like an OS kernel), then it's not fair to say that everything built on top of that (like LabVIEW queues) is also polled. I thought this was an interesting comment. Suppose I created a sub vi like this and used it in an app. Is this polling or is it waiting for a TRUE event to occur? I'll go out on a limb and assume you'll agree this is polling. But wait... I've abstracted away the functionality that is doing the polling into a sub vi. Does that mean the vis that use this sub vi aren't based on polling? Is it fair to consider them event-driven? To save a bit of time I'll step further out on the limb and assume you'll respond with something along the lines of "an OS is different than a sub vi." (Feel free to correct me if my assumption is wrong.) To that I agree--an os is different than a sub vi. An os provides a much more extensive set of services than any single sub vi can hope to provide. They both, however, do so by abstracting away lower-level functionality so the sub vi does meet the requirement of "some layer of abstraction." All software is layer upon layer of abstraction. From high level functions that are part of your application code down to the lowest level assembly routines, it's all an abstraction of some bit of functionality provided by a lower level abstraction. What is it about the abstraction provided by an os that makes it okay to ignore the polling aspect of it while not being able to ignore the polling aspect of the sub vi? Conversely, what changes would we have to make to the sub vi that would make it okay to ignore the polling aspect of it? Quote
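The sub vi image did not survive in this copy of the thread, so the following Python sketch is a guess at its spirit: a helper that loops until a boolean reads TRUE, next to an alternative that blocks on an event object. All function names are hypothetical. The point of interest is the `caller` function, which is identical in both cases.

```python
import threading
import time

def wait_until_true_polling(read_flag, interval_s=0.001):
    """Assumed shape of the polling sub vi: re-read the flag until it is True."""
    polls = 0
    while True:
        polls += 1
        if read_flag():
            return polls
        time.sleep(interval_s)

def wait_until_true_event(event):
    """Alternative helper: block on the event object instead of re-reading a flag."""
    event.wait()

def caller(wait_for_true):
    # The calling code only knows "this returns once the condition is TRUE".
    wait_for_true()
    return "handled"

flag_a = threading.Event()
threading.Timer(0.05, flag_a.set).start()
via_polling = caller(lambda: wait_until_true_polling(flag_a.is_set))

flag_b = threading.Event()
threading.Timer(0.05, flag_b.set).start()
via_event = caller(lambda: wait_until_true_event(flag_b))

print(via_polling, via_event)
```

From the caller's side the two helpers are interchangeable, which is the crux of the question: whether the caller should be described by what its own code does or by what happens beneath it.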
Val Brown Posted May 15, 2011 Report Posted May 15, 2011 I think the bottom line is that this is a question of perspective. Yes, in the end (or is it the beginning?) ALL computer operations are based on polling, and can't not be. But from the perspective of certain "levels" of abstraction, it appears that real "events" are happening without ANY polling. It just depends on which perspective you want to adopt for the purpose of the discussion. It's kind of like discussing the differences and similarities of absolute and relative boddhicitta but, then again, I also have a background in philosophy, both Western and Eastern. Quote
jdunham Posted May 16, 2011 Report Posted May 16, 2011 (edited) I thought this was an interesting comment. Suppose I created a sub vi like this and used it in an app. Is this polling or is it waiting for a TRUE event to occur? Well if I can't seem to be right, I'll take interesting as a consolation prize. I'll go out on a limb and assume you'll agree this is polling. But wait... I've abstracted away the functionality that is doing the polling into a sub vi. Does that mean the vis that use this sub vi aren't based on polling? Is it fair to consider them event-driven? Yes it's polling. However, I don't see that you've abstracted it away. 'Abstracting' implies you've hidden the implementation details, which you haven't. Now if you put that in an lvlib or an lvclass and make it a private method, remove the diagrams, and then I would say my code is not polling, even if it calls yours. If I instrument the code and discover that it really is polling underneath the abstraction, I could rewrite your lvlib, maintaining the same API, and get rid of your pesky polling. Since the system can go from polling to not polling without any change in my highest-level code, it's not useful to say that my code is polling. Furthermore if you truly believe the OS and/or LabVIEW queue functions are polling, then you could pay NI 10 billion dollars or perhaps less, to furnish a new operating system with a different architecture and get rid of the polling and your queue-based code would not have to change (unless you were polling the queue status which I don't recommend). Now I don't believe this investment is necessary since I've already laid out arguments that queues are not polled, and no compelling evidence to the contrary has been presented. Didn't you already back me up on this by making some code to wait on dozens of queues and observing that the CPU usage does not increase? I think the bottom line is that this is a question of perspective. Yes, in the end (or is it the beginning?) 
ALL computer operations are based on polling, and can't not be. I don't agree. Like most CPUs, 80x86 chips have interrupt handlers, which relieves the CPU of any need to poll the I/O input. If the interrupt line does not change state, none of the instructions cascading from that I/O change are ever executed. The ability to execute those instructions based on a hardware signal is built into the CPU instruction set. I guess you could call that "hardware polling", but it doesn't consume any of the CPU time, which is what we are trying to conserve when we try not to use software polling. If you put a computer in your server rack, and don't plug in a keyboard, do you really think the CPU is wasting cycles polling the non-existent keyboard? Is the Windows Message handler really sending empty WM_KEYDOWN events to whichever app has focus? Well the answer, which Daklu mentioned in a previous post, is a little surprising. In the old days, there was a keyboard interrupt, but the newer USB system doesn't do interrupts and is polled, so at some point in time, at a level way beneath my primary abstractions, the vendors decided to change the architecture of keyboard event processing from event-driven to polled. While it's now polled at the driver level, an even newer peripheral handling system could easily come along which restores a true interrupt-driven design. And this polling is surely filtered at a low level so while the device driver is in fact polling, the OS is probably not sending any spurious WM_KEYDOWN events to LabVIEW. So I suppose you could say that all my software changed from event-driven to polled at that moment I bought a USB keyboard, but I don't think that's a useful way to think about it. Maybe my next keyboard will go back to interrupt-handled. (see a related discussion at: http://stackoverflow...d-input-generic) Edited May 16, 2011 by jdunham Quote
Aristos Queue Posted May 16, 2011 Report Posted May 16, 2011 So that node (Dequeue Element) is executed under dataflow rules, like all LabVIEW nodes are, but what goes on inside that node is a non-dataflow operation, at least the way I see it. It's "event-driven" rather than "dataflow-driven" inside the node. Similarly a refnum is a piece of data that can and should be handled with dataflow, but the fact that a refnum points to some other object is a non-dataflow component of the LabVIEW language. I agree with this analysis generally. Now, having said that, there is a concept of asynchronous dataflow, where data flows from one part of a program to another, and a queue that has exactly one enqueue point and exactly one dequeue point can be dataflow safe. Shared, local and global variables also violate dataflow for the same underlying reason as refnums. A variable "write" doesn't execute until its input value is available, but what it does under the hood is dependent upon non-local effects. So the best definition I can offer for "dataflow safe" is "all nodes execute when their inputs are available and, regardless of the behavior behind the scenes, no piece of data is simultaneously 'owned' by more than one node." I've been trying for some time to refine this definition. (we're not still calling it 'G', are we?). Actually, NI is trying to encourage the use of "G" as a way to differentiate the language from the IDE. This is contrary to our earlier position, because we were trying to avoid confusion among customers, but as we have gotten larger and our products have been more used in larger systems, the distinction has become more useful than harmful. That was something we started differentiating at the end of 2010, and it will take a while to permeate our communications/documentation/etc. Quote
Yair Posted May 16, 2011 Report Posted May 16, 2011 Actually, NI is trying to encourage the use of "G" as a way to differentiate the language from the IDE. This is contrary to our earlier position, because we were trying to avoid confusion among customers, but as we have gotten larger and our products have been more used in larger systems, the distinction has become more useful than harmful. That was something we started differentiating at the end of 2010, and it will take a while to permeate our communications/documentation/etc. I thought there was some sort of legal issue there, where someone else had rights to the name. I know that NI has used "G" officially and unofficially since the beginning, so I was sure that this was the reason that NI seemed to be trying to avoid using it officially in recent years. Quote
Val Brown Posted May 16, 2011 Report Posted May 16, 2011 I think the bottom line is that this is a question of perspective. Yes, in the end (or is it the beginning?) ALL computer operations are based on polling, and can't not be. I don't agree. Like most CPUs, 80x86 chips have interrupt handlers, which relieves the CPU of any need to poll the I/O input. If the interrupt line does not change state, none of the instructions cascading from that I/O change are ever executed. The ability to execute those instructions based on a hardware signal is built into the CPU instruction set. I guess you could call that "hardware polling", but it doesn't consume any of the CPU time, which is what we are trying to conserve when we try not to use software polling. You make my point: it IS polling but your perspective is that, if the polling doesn't consume CPU time then it doesn't "count" as polling. For me it's far simpler: it IS polling. And the real issue here - again - is one of perspective: viz, where do you stand when you look at these phenomena and what "counts" or doesn't. But in the end it is fundamentally polling of SOMETHING, SOMEWHERE and can't not be. Quote
Daklu Posted May 16, 2011 Report Posted May 16, 2011 Well if I can't seem to be right, I'll take interesting as a consolation prize. 'Abstracting' implies you've hidden the implementation details, which you haven't. I agree I haven't hidden the details. I disagree software abstraction requires that the implementation details are unknown to the user; it merely means the details do not need to be known. Knowledge or ignorance of the implementation details seems like a poor condition to use as the basis for defining "abstraction." I know the details of the LapDog Message Library, but it is still an abstraction of some bit of functionality. If NI unlocked the source code for Labview's queues and I learned that inside and out, I would still consider a queue an abstraction. Now if you put that in an lvlib or an lvclass and make it a private method, remove the diagrams, and then I would say my code is not polling, even if it calls yours... <snip> ...it's not useful to say that my code is polling. I agree, there is no polling in your code. (There is polling in the execution path though.) I also agree that when communicating with other developers it isn't very useful to say your code is polling if you haven't explicitly implemented a polling algorithm. But the question isn't about the relative merits of referring to your code as polled or event-based in everyday language. The question is whether using event-based features of a given language is sufficient condition to claim dataflow has been broken. The principle of dataflow transcends any programming language or os implementation. In general terms, dataflow is a way of organizing and scheduling the various steps required to reach a goal by basing it on dependencies rather than on strict sequencing. Of course, goals and scheduling are concepts that stretch beyond the boundaries of Labview, Windows, x86 processors, etc., so dataflow exists outside of that environment. 
(A Gantt Chart is a common representation of a dataflow-ish system in the world of project management.) We need to look beyond Labview's implementation to determine what "breaking dataflow" means in general before we can even hope to figure out what, if anything, within Labview breaks dataflow. Didn't you already back me up on this by making some code to wait on dozens of queues and observing that the CPU usage does not increase? Nope, I waited on dozens of queues and observed Labview's thread count didn't increase, indicating Labview doesn't necessarily put the thread to sleep when a wait is encountered. I didn't look at cpu usage, but I'd be surprised if there was any noticeable increase in cpu load. Any polling that is going on (assuming there is polling) is probably happening in low-level C code in the LVRTE or the OS and the incremental workload is small enough to be insignificant. While it's now polled at the driver level, an even newer peripheral handling system could easily come along which restores a true interrupt-driven design. It could, but I'd be surprised if it was applied to something as pedestrian as keyboard inputs or UI events. Hardware interrupts are necessary, but the "give me attention NOW" nature of them makes them disruptive and costly. Too many interrupts in the system bring it to a screeching halt. Occasionally my pc gets very lethargic. Almost invariably when I bring up Process Explorer I see 25% or more of my cpu load is being consumed by hardware interrupts. (Likely caused by a faulty driver somewhere.) Users can't tell the difference between a 4 microsecond response and a 20 millisecond response. Better to put keyboard activity in a regular polling loop and avoid the interrupt overhead. And this polling is surely filtered at a low level so while the device driver is in fact polling, the OS is probably not sending any spurious WM_KEYDOWN events to LabVIEW. Agreed. But polling *is* occurring, which is the point I'm trying to make. 
When we get a value change event in Labview, somewhere in the execution path that generates that event someone wrote software that is "testing something to detect a state change." Events are nothing more than abstractions created by lower level code as a convenience for programmers. They exist in source code, but not in compiled code. Since dataflow is a run-time activity based on compiled code, events in source code cannot be an indicator of broken dataflow. ---- ...there is a concept of asynchronous dataflow, where data flows from one part of a program to another, and a queue that has exactly one enqueue point and exactly one dequeue point can be as dataflow safe. It seems more complicated than that. I can easily create a vi with n enqueue points and d dequeue points that is still dataflow safe. I can also create a vi using local variables that is dataflow safe. I guess both of these examples illustrate the concept of data "ownership" you referred to. I've had this nagging desire to include time as a necessary component to determining dataflow, but haven't quite been able to wrap my brain around it. "Simultaneous ownership" seems like as good an explanation as any. Quick question... It seems like your definition of dataflow is very closely tied to what you called "local effects" and Labview's G's by-value data handling. In your opinion, is that an inherent part of dataflow in general or is it important because of how NI has implemented dataflow and the limitations of current technology? Quote
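To make the polling versus event-driven argument above concrete, here is a small Python sketch (Python standing in for G, which is an assumption of this example). The helper names `wait_blocking`, `wait_polling` and `deliver_later`, the 1 ms poll interval and the 50 ms delay are all hypothetical. One caller blocks on the queue, the other explicitly tests it in a loop, and both receive the same data.

```python
import queue
import threading
import time


def wait_blocking(q):
    """'Event-style' wait: the caller sleeps inside get() until data arrives."""
    return q.get()


def wait_polling(q, interval=0.001):
    """Explicit polling: repeatedly test the queue for a state change."""
    while True:
        try:
            return q.get_nowait()
        except queue.Empty:
            time.sleep(interval)   # give up the CPU between checks


def deliver_later(q, value, delay=0.05):
    """Enqueue a value from another thread after a short delay."""
    def worker():
        time.sleep(delay)
        q.put(value)

    t = threading.Thread(target=worker)
    t.start()
    return t
```

From the caller's side the two waits differ in CPU cost and latency but not in the value that comes out, which is roughly the distinction being argued over: how the wait is serviced underneath is an implementation detail, while the data dependency is the same in both cases.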
Aristos Queue Posted May 18, 2011 Report Posted May 18, 2011 Quick question... It seems like your definition of dataflow is very closely tied to what you called "local effects" and Labview's G's by-value data handling. In your opinion, is that an inherent part of dataflow in general or is it important because of how NI has implemented dataflow and the limitations of current technology? My opinion? Inherent. Whenever anyone in CS research says that "dataflow programming is (safer/more provable/easier to refactor) than procedural", they say these things because of the local effects rules. Quote
Daklu Posted May 19, 2011 Report Posted May 19, 2011 My opinion? Inherent. Fair enough. Follow up questions based on that definition... 1. Earlier in this thread I posted examples of using queues that could be refactored into more direct dataflow code. I believe it is safe to say any code that can be refactored in that way is dataflow safe in its original form. Do you know of situations where code cannot be refactored into direct dataflow yet is still dataflow safe? 2. What are the major consequences of breaking dataflow? Is it primarily related to human requirements (readability, complexity, debuggability, etc.) or are there significant run-time penalties? I know you've mentioned fewer compiler optimizations when dataflow is broken. Have these been benchmarked at all? I've been wondering if it's possible to concisely describe what it means to "own" a piece of data. I think it's more than just access to the data. Multiple nodes reading the same data shouldn't cause a dataflow problem. Maybe a simultaneous data write operation is required to violate single ownership? I dunno... this line of thought naturally progresses into equating ownership with race conditions, which I'm not sure is helpful when discussing dataflow. Thoughts? Quote
Aristos Queue Posted May 21, 2011 Report Posted May 21, 2011 I just got home from Spain where I was attending an academic conference for researchers in functional programming languages. I put this question to some of the attendees and got a refreshingly straightforward answer. A VI should be considered clean if, when executed, for any set of input parameters, the output values are always the same. In other words, if you have a function that takes two integers and produces a string output, if those two integers are 3 and 5 and the output is "abc" the first time you call the function, it should ALWAYS be "abc" every time you call the function with 3 and 5. The term for such functions in other languages is "hygienic functions." It's a term I'd heard before but never thought to apply back to LabVIEW. This definition covers your queue example. Hygienic functions are not required to have the same performance characteristics whenever they are called, just the same output data values. There are variants on this theme. For example, the multi-enqueue queue example you posted is hygienic because the Obtain Queue is in the function body. If that were in a caller VI, you would have the situation where the caller VI would be hygienic but the subVI would not be (because given the same refnum [even during the same run of the application where that refnum refers to the same queue] the output might not always be the same because it might be called from places where others had mucked with the queue instead of being right after the Obtain Queue the way it is now). There is a concept called monads that I've been playing with for a number of years as a way of eliminating refnums from LabVIEW, and I hadn't had much luck with that approach, but this conversation highlighted specifically why monads exist in other functional programming languages and how they are used to make non-hygienic functions become safe for callers. I'm going to make another run at how monads could fit into LabVIEW. 
Do you know of situations where code cannot be refactored into direct dataflow yet are still dataflow safe? Yes. By the hygienic definition, a producer/consumer loop pair used to do some kind of divide and conquer algorithm or pipelining algorithm would be dataflow safe. Quote
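The "same inputs, same outputs" rule described above (usually called a pure function in the functional programming literature) and the point about where the queue is obtained can be sketched in Python. This is only an analogy under stated assumptions: `sum_via_private_queue` and `sum_via_shared_queue` are hypothetical names, and creating a `queue.Queue()` inside the function stands in for having Obtain Queue in the function body.

```python
import queue


def sum_via_private_queue(a, b):
    """Queue is created inside the function, so no caller can disturb it."""
    q = queue.Queue()
    q.put(a)
    q.put(b)
    return q.get() + q.get()


def sum_via_shared_queue(q, a, b):
    """Queue is passed in, so the result depends on what others left in it."""
    q.put(a)
    q.put(b)
    return q.get() + q.get()


# The private version returns the same value for (3, 5) on every call.
first = sum_via_private_queue(3, 5)
second = sum_via_private_queue(3, 5)

# The shared version gives a different answer if another caller has
# already enqueued something on the same queue reference.
shared = queue.Queue()
shared.put(100)                      # someone else "mucked with the queue"
polluted = sum_via_shared_queue(shared, 3, 5)
```

Here `first` and `second` are both 8, while `polluted` is 103 because the first dequeue returns the stale 100. The function body is identical in both versions; only the ownership of the queue differs.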
Daklu Posted May 21, 2011 Report Posted May 21, 2011 A VI should be considered clean if, when executed, for any set of input parameters, the output values are always the same. In other words, if you have a function that takes two integers and produces a string output, if those two integers are 3 and 5 and the output is "abc" the first time you call the function, it should ALWAYS be "abc" every time you call the function with 3 and 5. I thought that was "deterministic?" How do hygienic functions and deterministic functions differ? There is a concept called monads that I've been playing with for a number of years... There is a concept called monads that I've read about for a number of years but never quite been able to wrap my brain around. Maybe someday... Quote
SteveChandler Posted May 23, 2011 Author Report Posted May 23, 2011 I just got home from Spain where I was attending an academic conference for researchers in functional programming languages... Oh yea? Well I just got home from the hardware store where I was buying some caulking to plug a hole under my sliding glass door that ants were crawling through Quote