Posted

Hi!

Current processors have multiple cores, so taking advantage of parallel computation has become more important. Even though LV is a highly parallel programming language, it still lacks parallel looping. I'd like to suggest a new LV structure, a "parallel loop", that would evaluate the content of the loop in parallel rather than sequentially as current loops do. The parallel loop would be a variant of the for-loop. When a parallel loop structure is encountered, LV would execute multiple parallel diagrams containing the loop content instead of executing the loop itself. The parallel loop would not have shift registers, but it would otherwise be similar to the current for-loop from the programmer's point of view. The following image illustrates how the evaluation of an expression would actually be done using a parallel loop.

[Image: post-4014-1163158907.png]
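
As a rough text-language analogue of the proposed structure (a sketch only, not LabVIEW and not a claim about how LV would implement it), the Python below runs each iteration of a for-loop as an independent task. The names `parallel_for` and `body` are made up for illustration, and the thread pool merely stands in for whatever scheduler LV would use. Because there is no shift register, no iteration can see another iteration's result, which is what makes it safe to run them side by side.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_for(body, n, max_workers=4):
    """Run body(i) for i in 0..n-1 as independent tasks.

    Results come back in index order, much like an auto-indexed
    output tunnel on a for-loop.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even if tasks finish out of order.
        return list(pool.map(body, range(n)))


def body(i):
    # Hypothetical loop content: depends only on the iteration index i.
    return i * i + 1


sequential = [body(i) for i in range(8)]
parallel = parallel_for(body, 8)
```

Note that in standard CPython builds threads do not speed up CPU-bound Python code because of the global interpreter lock, so this shows the shape of the idea rather than the speedup.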

Posted

I remembered reading this BLOG entry some time ago. It may be worth a read.... :book:

Note his last comment:

"If you're a student of computer architecture or parallel programming (as I once was), always pay attention to the overhead of distributing your workload because it may swamp any savings you hope to gain."

Posted

The LV compiler already does some loop unrolling. That's something you'd never actually see on the diagram, just as you don't see it in C compilers, etc. If we ever had distributed parallelism, where each frame of the loop is passed to a different computer, that's something that would be displayed to users.
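
For readers who have not met the term, loop unrolling means the loop body is replicated so that each pass does the work of several iterations. Below is a hand-written illustration in Python, purely to show the transformation; the LV compiler does this internally, and its actual strategy is not described in this thread. The unroll factor of 4 and the function names are arbitrary choices for the example.

```python
def sum_squares_rolled(values):
    # Ordinary loop: one element per pass.
    total = 0
    for v in values:
        total += v * v
    return total


def sum_squares_unrolled(values):
    """Same computation, with the body replicated four times per pass."""
    total = 0
    n = len(values)
    i = 0
    # Main loop: four iterations' worth of work per pass.
    while i + 4 <= n:
        total += values[i] * values[i]
        total += values[i + 1] * values[i + 1]
        total += values[i + 2] * values[i + 2]
        total += values[i + 3] * values[i + 3]
        i += 4
    # Remainder loop: handle the 0 to 3 leftover elements.
    while i < n:
        total += values[i] * values[i]
        i += 1
    return total


data = list(range(10))
```

Both functions return the same result for any input; only the loop bookkeeping differs.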

Posted

The example of unrolling given above is trivial and not hard to unroll and distribute. But I imagine that with complex contents in the loop, the difficulty of unrolling/atomizing goes up exponentially.

I would think that, eventually, to make coding with this efficient, we would need some type of on-the-fly syntax warning that the code (fragment) we just created/connected is not unrollable, and that if we wanted to take advantage of the parallel capability we would have to edit it.
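
To make the "not unrollable" case concrete: the usual obstacle is a loop-carried dependency, where one iteration needs a value produced by the previous one (in LV terms, a shift register). Here is a small Python contrast, with function names made up for the example:

```python
def scale_all(values, factor):
    # Independent iterations: each output depends only on its own input,
    # so the iterations could run in any order or at the same time.
    return [v * factor for v in values]


def running_total(values):
    # Loop-carried dependency: `total` plays the role of a shift register.
    # Iteration i needs the result of iteration i-1, so the iterations
    # cannot simply be handed to separate threads as written.
    out = []
    total = 0
    for v in values:
        total += v
        out.append(total)
    return out
```

A warning of the kind described would, in effect, flag the second function and leave the first alone.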

Interesting CS/compiler problems.

I can see where more parallel distribution would be highly advantageous if we were distributing to, say, an array of processing cards each with one or more CPUs running LabVIEW Embedded and we were trying to create an MxN matrix processor for something like image processing of video streams.

Posted
The LV compiler already does some loop unrolling. That's something you'd never actually see on the diagram, just as you don't see it in C compilers, etc. If we ever had distributed parallelism, where each frame of the loop is passed to a different computer, that's something that would be displayed to users.

There are cases where you know that your problem is highly parallel and very processor intensive. In these cases it would be nice if you could force the calculation to be distributed dynamically across multiple processors without explicitly dividing the problem into multiple parallel loops. There could, for example, be a right-click menu from which you select how the loop is parallelized. The options would be:

  • Sequential
  • Parallel
    • Dynamic
    • Two threads
    • Three threads
    • Four threads
    • Five threads
    • Six threads
    • Seven threads
    • Eight threads
    • ...

The Dynamic option would determine the number of threads for the loop at run time and create a copy of the loop dataspace for each thread dynamically. If the user specified the number of threads at compile time, the compiler could create the dataspace for each thread at compile time, and the loop would therefore perform better. If the number of required threads is not known at compile time, the user should select Dynamic, especially when the computation inside the loop is heavy and the cost of spawning the threads is low compared to the total computation time.

As an alternative to the context menu configuration, the parallel loop structure could have an optional constant that defines the number of threads at compile time. This constant would be disabled if the loop were configured for dynamic parallelism.
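
The configuration choices described above could be modelled in a text language roughly as follows. This is a sketch under assumptions: `configured_parallel_for`, `resolve_workers`, and the `threads` parameter are invented names; `None` stands in for the Dynamic option (the worker count is looked up at run time via `os.cpu_count()`), `1` for Sequential, and any larger integer for a fixed thread count. It does not model compile-time dataspace allocation, which has no direct Python equivalent.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def resolve_workers(threads):
    """Map the menu choice to a worker count.

    threads=None -> Dynamic: decide at run time from the machine.
    threads=int  -> fixed count chosen by the programmer.
    """
    if threads is None:
        return os.cpu_count() or 1  # cpu_count() may return None
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return threads


def configured_parallel_for(body, n, threads=None):
    workers = resolve_workers(threads)
    if workers == 1:
        # Sequential option: an ordinary loop, no pool needed.
        return [body(i) for i in range(n)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Results are returned in iteration order.
        return list(pool.map(body, range(n)))
```

For example, `configured_parallel_for(f, 100, threads=4)` would correspond to picking "Four threads" from the menu, and omitting `threads` to picking Dynamic.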

Posted
The LV compiler already does some loop unrolling. That's something you'd never actually see on the diagram, just as you don't see it in C compilers, etc. If we ever had distributed parallelism, where each frame of the loop is passed to a different computer, that's something that would be displayed to users.

I have seen this: removing a shift register (it was no longer useful) sped up (is that proper English?) the for-loop a lot (10 times faster or so). Should have :camera: it. I got the idea that all cases (or several) were executed at the same moment!!!! But I am a little afraid of trying to reproduce this, because it sounds too good to be true... :laugh:

Ton
