
Formula nodes: code readability comes at a price


Oakromulo

After two years "leeching" content every now and then from the Lava community I think it's time to contribute a little bit.

Right now, I'm working on a project that involves lots of data mining operations through a neurofuzzy controller to predict future values from some inputs. For this reason, the code needs to be as optimized as possible. With that idea in mind I've tried to implement the same controller using both a Formula Node structure and Standard 1D Array Operators inside an inlined SubVI.

Well... the results surprised me. I had expected the SubVI with the Formula Node to perform a little better than the one with standard array operators. In fact, it was quite the opposite: the inlined SubVI was consistently around 26% faster.

Inlined Std SubVI

rtp6j9.png

Formula Node SubVI

2ex7vq0.png

evalSugenoFnode.vi

evalSugenoInline.vi

perfComp.vi

PerfCompProject.zip

Link to comment

I agree, LabVIEW lacks readability when it comes to doing math like this, but there's not much you can do about it. Wire labels now help a little, because you can label your intermediate "variables", so to speak. You've made it about as clean as you can. I'm guessing the formula node has some sort of per-call overhead, so calling it inside the for loop causes the longer execution times compared to the primitives. I think, in general, primitives are the best option in terms of performance due to optimization (but someone else can probably explain the why better than I can, so I'll leave it at that).

Edited by for(imstuck)
Link to comment

Another surprise over here... I've tried the formula node with C-like array manipulation and constant dimensions. This time the primitives were 71% faster than the formula node. The O(n) overhead theory seems unlikely...

eqti8j.png

Slight modification, same results:

35bdmyh.png

2upuwlw.png

Edited by Oakromulo
Link to comment

LaTeX --> G... that's awesome. It should definitely become a core LV feature! It'd be nice to meet Darin at the next NI Week...

By the way, the equations represent a simplified First Order Sugeno Fuzzy Inference System. Always a good idea to add them to the VIs!

141uueq.png
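Since the equations themselves are only in the image, here is a rough text sketch of how a first-order Sugeno system is usually evaluated. It is not the code from the attached VIs: the single input, the Gaussian membership functions, and the names `eval_sugeno`, `centers`, `sigmas`, `p` and `q` are all assumptions made for illustration.

```python
import math


def eval_sugeno(x, centers, sigmas, p, q):
    """Evaluate a single-input first-order Sugeno system (illustrative only).

    Rule i reads: IF x is A_i THEN y_i = p[i] * x + q[i].
    A_i is ASSUMED to be a Gaussian membership function; the thread
    does not say which membership functions the real controller uses.
    """
    # Firing strength of each rule (membership degree of x in A_i)
    w = [math.exp(-0.5 * ((x - c) / s) ** 2) for c, s in zip(centers, sigmas)]
    # Linear (first-order) consequent of each rule
    y = [pi * x + qi for pi, qi in zip(p, q)]
    # Output: firing-strength-weighted average of the rule consequents
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
```

With two rules placed symmetrically around the input, both fire equally and the output is the plain average of the two consequents, which is a quick sanity check for any implementation (formula node or primitives).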

Edited by Oakromulo
Link to comment

Formula nodes are for C and MATLAB programmers who can't get their heads around LabVIEW (or don't want to learn it).

It's well known that they are a lot slower than native LV, and they're a bit like the "Sequence Frame" in that they are generally avoided. I would guess there are optimisations relying on G language semantics (in-placeness?) that LabVIEW is unable to apply when a formula node is used.

Link to comment

Some time ago I performed some benchmarks of the Formula Node, and the conclusion was that the difference from native code is negligible (as long as you don't use arrays in the node). But that was on a single-core machine. I think the difference you observe comes from execution parallelism.

Link to comment

Formula nodes are slow, particularly when using array manipulations. Things get even worse on RT. This is a pity, because when doing math you want to take one look and recognize the code. I typically refactor as much as possible and include a comment with the equivalent text code.

Link to comment

I only get a speed improvement of 4%

  • LabVIEW 2012
  • Win7 32-bit
  • Intel i5-2410M @2.3 GHz

Ton

Link to comment

Then you are doing something wrong that does not filter out overhead etc. I also consistently get 20-30% improvements with diagram vs formula node. Try doing 2D array math in a formula node. Last I checked it was a 50-100% slowdown (but that was a looong time ago). Writing a DLL in C gives the fastest-running code, but that kind of defeats the purpose of making the code accessible, readable and maintainable. The wire diagram HAS improved in the latest iterations of LV, and IMO it is overall the best solution (given reasonably complex math).

I have written a matrix solver using exclusively wires. It's pretty fast for matrices smaller than approximately 200 x 200. For larger matrices the native solver (using a DLL) is faster, but I guess one of the main reasons is that it probably uses a more complex algorithm that scales better, maybe a parallel one, I don't know.

Link to comment

Ton,

Same thing here... I ran the first comparison again on my laptop at work and it was just 5% faster too!

Desktop (home):

AMD Phenom II 965BE C3 @ 3.7 GHz (quad core)

8GB DDR3-2000 CL5

Laptop (work)

Intel Core i5 M540 @ 2.53 GHz (dual core, Hyper Threading enabled)

6GB DDR3-1333 CL8

Both with Win7 x64 and LV2011.

bsvingen,

I think I'm going to try an equivalent DLL called from LV. I have little to no experience with DLLs in LV apart from the system ones.

vugie,

If I put the code inside a timed loop with manual processor affinity, is it safe to say it runs on a single core only?

Edited by Oakromulo
Link to comment

I've just realized that the percentages were calculated in a very wrong way. I invite you all to check the new comparison below, which uses a queue structure.

9fvfht.png

Now, with parallelized for loops and a queue, the primitives were a full 4 times faster than the formula node SubVI!
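Because "X% faster" and "N times faster" are easy to mix up, here is a small sketch of the usual conversions between two measured run times. The thread does not say which formula was used originally, so this is only one common convention; the function names are hypothetical.

```python
def speedup(t_ref, t_new):
    """How many times faster the new code is than the reference."""
    return t_ref / t_new


def percent_faster(t_ref, t_new):
    """Speed gain relative to the reference, in percent."""
    return (t_ref / t_new - 1.0) * 100.0


def percent_time_saved(t_ref, t_new):
    """Reduction in run time relative to the reference, in percent."""
    return (t_ref - t_new) / t_ref * 100.0
```

Under this convention, code that is 4 times faster is 300% faster and saves 75% of the run time, so the three figures should never be quoted interchangeably.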

PerfCompProject3.zip

Edited by Oakromulo
Link to comment
With output auto-indexing disabled, wouldn't the indicators outside the loop kick in compiler optimizations? Anyway, a queue in this case seems a better option.

Yes. That's what you want, right? Fast? Also, LV has to task-switch to the UI thread. UI components kill performance, and humans can't see any useful information at those sorts of speeds anyway (10s of ms). If you really want to show some numbers whizzing around, use a notifier or local variable and update the UI in a separate loop every, say, 150 ms.
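In text form, the "separate display loop" idea looks roughly like the sketch below. It is a Python analogy, not LabVIEW code: `LatestValue`, `compute_loop`, `display_loop` and `run_demo` are hypothetical names, and the `i * 0.5` computation is just a stand-in for the real work.

```python
import threading


class LatestValue:
    """Holds only the most recent value (analogy for a notifier or local variable)."""

    def __init__(self, initial=None):
        self._lock = threading.Lock()
        self._value = initial

    def set(self, value):
        with self._lock:
            self._value = value

    def get(self):
        with self._lock:
            return self._value


def compute_loop(latest, n):
    """Fast loop: publishes every result but never touches the display."""
    for i in range(n):
        latest.set(i * 0.5)  # placeholder computation (assumption)


def display_loop(latest, stop, period_s, show):
    """Slow loop: samples the latest value once per period."""
    while not stop.is_set():
        show(latest.get())
        stop.wait(period_s)  # sleeps, but wakes early when asked to stop


def run_demo(n=100_000, period_s=0.15):
    latest = LatestValue(0.0)
    stop = threading.Event()
    shown = []  # stands in for the front-panel indicator
    ui = threading.Thread(target=display_loop,
                          args=(latest, stop, period_s, shown.append))
    ui.start()
    compute_loop(latest, n)
    stop.set()
    ui.join()
    return latest.get(), shown
```

The display only ever sees a sample of the values, which is the point: the compute loop runs at full speed and the indicator is refreshed at a rate a human can actually read.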

Edited by ShaunR
Link to comment

Sure! I added the indicators just to avoid the "unused code/dangling pin" compiler optimization. You're right, it wasn't very clever; the queue idea is much better. The slow random number generator inside the for loop is there for the same reason, to avoid unfair comparisons between the formula node SubVI and the standard one.

2mpblv4.png

Edited by Oakromulo
Link to comment

A local variable will be the fastest option apart from putting the indicator outside the loop (and I think it won't trigger that particular optimisation as long as you read it somewhere). Queues, however, have to reallocate memory as the data grows, so they are better if you want all the data; otherwise a local or notifier is preferable, as they don't grow memory.

Link to comment

I'd forgotten that only RT FIFOs are pre-allocated and therefore allow constant-time writes. This time I replaced the queue with a DBL functional global variable.

xqb1bd.png
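To illustrate why pre-allocation gives constant-time writes, here is a minimal fixed-capacity buffer sketched in Python. It is only an analogy and says nothing about how RT FIFOs are implemented internally; the `RingBuffer` name and the overwrite-oldest-when-full policy are assumptions of this sketch.

```python
class RingBuffer:
    """Fixed-capacity FIFO that overwrites the oldest element when full."""

    def __init__(self, capacity):
        self._data = [0.0] * capacity  # storage allocated once, up front
        self._capacity = capacity
        self._head = 0   # index of the next write
        self._count = 0  # number of valid elements stored

    def write(self, value):
        """Constant-time write: no allocation, just an index update."""
        self._data[self._head] = value
        self._head = (self._head + 1) % self._capacity
        if self._count < self._capacity:
            self._count += 1

    def read_all(self):
        """Return the stored values, oldest first."""
        start = (self._head - self._count) % self._capacity
        return [self._data[(start + i) % self._capacity]
                for i in range(self._count)]
```

An unbounded queue keeps everything but may have to grow its storage while the benchmark is running; a buffer like this (or a single-value holder such as a functional global) never grows, so its cost per write stays flat.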

Link to comment
