LAVA 1.0 Content Posted November 10, 2006 Hi! Current processors have multiple cores, so taking advantage of parallel computation has become more important. Even though LV is a highly parallel programming language, it still lacks parallel looping. I'd like to suggest a new LV structure, a "parallel loop", that would evaluate the iterations of the loop in parallel rather than sequentially as current loops do. The parallel loop would be a variant of the for-loop. When a parallel loop structure is encountered, LV would execute multiple parallel copies of the diagram containing the loop content instead of executing the loop itself. The parallel loop would not have shift registers, but from the programmer's point of view it would otherwise be similar to the current for-loop. The following image illustrates how an expression would actually be evaluated using a parallel loop.
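Since LV diagrams cannot be shown as text, here is a rough Python sketch of the idea, under the assumption that the loop content depends only on the iteration index (that is, nothing a shift register would carry). The names `loop_body`, `sequential_for`, and `parallel_for` are hypothetical and only illustrate the proposal; this is not how LV would implement it.

```python
from concurrent.futures import ThreadPoolExecutor


def loop_body(i):
    # Hypothetical loop content: uses only the iteration index,
    # so no value is carried from one iteration to the next.
    return i * i + 1


def sequential_for(n):
    # Ordinary for-loop: iterations run one after another.
    return [loop_body(i) for i in range(n)]


def parallel_for(n, workers=4):
    # Proposed "parallel loop": iterations are handed to a pool of workers.
    # map() returns results in iteration order, much like an auto-indexed
    # output tunnel would collect them into an array.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(loop_body, range(n)))


results = parallel_for(8)
print(results)
```

Both versions produce the same array because the iterations are independent. Note that in standard CPython, threads show the structure of the idea rather than a real speedup for CPU-bound code; a process pool would be closer to what multiple cores would give.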
LAVA 1.0 Content Posted November 10, 2006 Author I remember reading this blog entry some time ago. It may be worth a read. Note his last comment: "If you're a student of computer architecture or parallel programming (as I once was), always pay attention to the overhead of distributing your workload because it may swamp any savings you hope to gain."
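The point about overhead can be made concrete with a toy cost model. This is only a sketch under a simplifying assumption: every worker adds a fixed start-up cost and the work divides evenly. The function `estimated_speedup` and its parameters are hypothetical and not taken from the blog entry.

```python
def estimated_speedup(n_iterations, work_per_iteration, workers, overhead_per_worker):
    """Toy model: speedup of a parallel loop over a sequential one.

    Assumes (hypothetically) that the work divides evenly among workers and
    that each worker costs a fixed overhead to set up. All times in seconds.
    """
    sequential_time = n_iterations * work_per_iteration
    parallel_time = sequential_time / workers + workers * overhead_per_worker
    return sequential_time / parallel_time


# Heavy loop body: the overhead is small next to the work, so parallelism pays off.
heavy = estimated_speedup(1000, 1e-3, workers=4, overhead_per_worker=1e-3)

# Light loop body: the same overhead swamps the savings (speedup below 1).
light = estimated_speedup(1000, 1e-6, workers=4, overhead_per_worker=1e-3)

print(f"heavy body: {heavy:.2f}x, light body: {light:.2f}x")
```

With these made-up numbers the heavy loop gets close to a 4x speedup, while the light loop actually runs slower in parallel than sequentially.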
Aristos Queue Posted November 10, 2006 The LV compiler already does some loop unrolling. That's something you'd never actually see on the diagram, just as you don't see it in C compilers, etc. If we ever had distributed parallelism, where each frame of the loop was passed to a different computer, that's something that would be displayed to users.
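For readers unfamiliar with the term, loop unrolling can be sketched by hand as follows. This is only an illustration of the general technique, written in Python; it makes no claim about what the LV compiler actually generates, and the function names are hypothetical.

```python
def sum_rolled(values):
    # The loop as written: one element per iteration.
    total = 0
    for v in values:
        total += v
    return total


def sum_unrolled_by_4(values):
    # The same loop unrolled by a factor of 4: fewer iterations,
    # each doing the work of four original iterations.
    total = 0
    n = len(values)
    main = n - n % 4  # largest multiple of 4 not exceeding n
    for i in range(0, main, 4):
        total += values[i] + values[i + 1] + values[i + 2] + values[i + 3]
    # Remainder loop handles the last 0 to 3 elements.
    for i in range(main, n):
        total += values[i]
    return total


data = list(range(10))
print(sum_rolled(data), sum_unrolled_by_4(data))
```

A compiler applies this kind of transformation behind the scenes, which is why it never shows up in the source or on the diagram.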
Mike Ashe Posted November 10, 2006 The example of unrolling given above is trivial: it is not hard to unroll and distribute. But I imagine that with complex contents in the loop, the difficulty of unrolling/atomizing goes up exponentially. I would think that, eventually, to make coding with this efficient, we would need some type of on-the-fly syntax warning that the code (fragment) we just created or connected is not unrollable, and that we would have to edit it if we wanted to take advantage of the parallel capability. Interesting CS/compiler problems. I can see where more parallel distribution would be highly advantageous if we were distributing to, say, an array of processing cards, each with one or more CPUs running LabVIEW Embedded, and we were trying to create an MxN matrix processor for something like image processing of video streams.
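One simple case of a loop that cannot be distributed as-is is a loop whose iterations depend on the previous iteration (what a shift register expresses in LV). The Python sketch below is a made-up example: `carried` and `naive_split` are hypothetical names, and the recurrence is chosen only to make the effect visible.

```python
def carried(values, start=0):
    # Each iteration needs x from the previous iteration,
    # the role a shift register plays on a diagram.
    x = start
    out = []
    for v in values:
        x = 2 * x + v
        out.append(x)
    return out


def naive_split(values, start=0):
    # Naively cutting the loop in two and running each half from the same
    # initial value ignores the dependency between the halves.
    mid = len(values) // 2
    return carried(values[:mid], start) + carried(values[mid:], start)


correct = carried([1, 1, 1, 1])
wrong = naive_split([1, 1, 1, 1])
print(correct, wrong)
```

The two results differ, which is the kind of situation an editor warning would have to detect before offering to parallelize the loop.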
LAVA 1.0 Content Posted November 11, 2006 Author Quote: "The LV compiler already does some loop unrolling. That's something you'd never actually see on the diagram, just as you don't see it in C compilers, etc. If we ever had distributed parallelism, where each frame of the loop was passed to a different computer, that's something that would be displayed to users." There are cases where you know that your problem is highly parallel and very processor intensive. In these cases it would be nice if you could force the calculation to be distributed dynamically to multiple processors without explicitly dividing the problem into multiple parallel loops. There could, for example, be a right-click menu from which you could select how the loop is parallelized. The options would be: Sequential; Parallel: Dynamic, Two Threads, Three Threads, Four Threads, Five Threads, Six Threads, Seven Threads, Eight Threads, ... The Dynamic option would determine the number of threads for the loop at runtime and create a copy of the loop dataspace for each thread dynamically at runtime. If the user specified the number of threads at compile time, the compiler could create the dataspace for each thread at compile time, and the looping would therefore be more efficient. If the number of required threads is not known at compile time, the user should select Dynamic, especially when the computation inside the loop is heavy and the cost of spawning the threads is low compared to the total computation time. As an alternative to the context menu configuration, the parallel loop structure could have an optional constant that would define the number of threads at compile time. This constant would be disabled if the loop were configured for dynamic parallelism.
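The menu choice described above can be sketched in Python as a setting that is either fixed in advance or resolved at runtime. This is only an illustration of the proposal: `resolve_workers` and `run_loop` are hypothetical names, and "dynamic" is assumed here to mean "use the number of CPUs reported by the operating system".

```python
import os
from concurrent.futures import ThreadPoolExecutor


def resolve_workers(setting):
    # setting mirrors the proposed menu: "sequential", "dynamic",
    # or a fixed thread count (2, 3, 4, ...).
    if setting == "sequential":
        return 1
    if setting == "dynamic":
        # Assumption: decide at runtime from the reported CPU count.
        return os.cpu_count() or 1
    if isinstance(setting, int) and setting >= 2:
        return setting
    raise ValueError(f"unknown parallelism setting: {setting!r}")


def run_loop(body, n, setting="dynamic"):
    # Run body(0) .. body(n-1) and collect the results in iteration order.
    workers = resolve_workers(setting)
    if workers == 1:
        return [body(i) for i in range(n)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(body, range(n)))


print(run_loop(lambda i: i * 2, 5, setting=3))
```

A fixed count corresponds to the compile-time constant in the proposal, while "dynamic" defers the decision until the loop actually runs.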
Ton Plomp Posted November 11, 2006 Quote: "The LV compiler already does some loop unrolling. That's something you'd never actually see on the diagram, just as you don't see it in C compilers, etc. If we ever had distributed parallelism, where each frame of the loop was passed to a different computer, that's something that would be displayed to users." I have seen this: removing a shift register (it was not useful anymore) sped up (is that proper English?) the for-loop by a lot (10 times faster or so). Should have :camera: it. I got the idea that all cases (or several) were executed at the same moment! But I am a little afraid of trying to reproduce this, because it sounds too good to be true... :laugh: Ton