Wouter Posted July 26, 2012 (edited)

The current implementation is very slow, so I checked whether I could do it any faster, and well, I can :-) A lot faster, too. This is my revised version. Differences:

# elements   current Filter Array              my implementation
10,000       30 ms                             0 ms
100,000      2154 ms                           5 ms
1,000,000    still running after 7 minutes     56 ms

The code: wgtk_filterArray.vi

The only difference is that I do not remove duplicates from the "items to filter". Furthermore, in this implementation the "filtered items indices" are returned sorted. The code could (theoretically) be made even faster by first sorting the "items to filter" and then using a binary search instead of LabVIEW's linear search.

-edit- Small improvement: the second loop iterator for indexing the "filtered items indices" array wasn't necessary, because it can be calculated by subtracting the iterator for the other array from the loop iterator.

wgtk_filterArray.vi
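For readers without the VI open, here is a rough textual sketch of the approach (Python standing in for the block diagram; the function name and the exact shape of the outputs are my assumptions, not taken from wgtk_filterArray.vi):

```python
def filter_array(data, items_to_filter):
    """Remove from `data` every element that occurs in `items_to_filter`.
    Returns the surviving elements plus the indices of the removed ones.
    Preallocate-then-truncate instead of growing arrays inside the loop."""
    kept = [None] * len(data)             # preallocated, like the revised VI
    removed_indices = [None] * len(data)
    n_kept = 0
    for i, x in enumerate(data):
        if x in items_to_filter:          # linear search (Search 1D Array)
            removed_indices[i - n_kept] = i   # slot derived by subtraction,
                                              # so no second loop counter
        else:
            kept[n_kept] = x
            n_kept += 1
    return kept[:n_kept], removed_indices[:len(data) - n_kept]
```

The subtraction trick works because at iteration i exactly n_kept elements have been kept, so i - n_kept elements have been removed so far; that count is precisely the next free slot in the removed-indices array, and it also guarantees the indices come out sorted.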
crossrulz Posted July 26, 2012

The thing that's hurting the current implementation is the abundance of Build Arrays inside of loops, which you found a way around by preallocating arrays. As long as the functionality is the same, I'm all for the upgrade (I haven't fully tested it yet). I am curious what would happen if we removed the duplicate filtered items.
asbo Posted July 26, 2012

Cool! You better wire that error through, though.
Wouter Posted July 26, 2012 (edited)

As long as the functionality is the same, I'm all for the upgrade (I haven't fully tested it yet). I am curious what would happen if we removed the duplicate filtered items.

Nothing. But I didn't understand why it was built in the first place, because why would it contain duplicate items? I think it's up to the programmer to check whether the filter list contains duplicates.

Cool! You better wire that error through, though.

No, because the error inputs are not used in the terminals either. I just put them there because I may want to throw an error in the future; then I will wire them. For now, LabVIEW will just treat them as dead code.
crossrulz Posted July 26, 2012

Nothing. But I didn't understand why it was built in the first place, because why would it contain duplicate items? I think it's up to the programmer to check whether the filter list contains duplicates.

I'm inclined to agree. Once I started thinking about it, I was like "what idiot would put duplicates in filtered items anyways?" And even if they did, there should be too few to really care about. And get rid of the error clusters. They are not needed for this function.
Phillip Brooks Posted July 26, 2012 (edited)

No, because the error inputs are not used in the terminals either. I just put them there because I may want to throw an error in the future; then I will wire them. For now, LabVIEW will just treat them as dead code.

Be careful about error clusters that do nothing. I updated some low-level internal libraries a couple of years ago and had a bunch of developers get mad at me. Basically, if you do add the error in / error out terminals to your connector pane for 'future use', make sure you place an Always Copy node on the error wire, or any caller could require a recompile if you make changes...
Wouter Posted July 26, 2012

Be careful about error clusters that do nothing. I updated some low-level internal libraries a couple of years ago and had a bunch of developers get mad at me. Basically, if you do add the error in / error out terminals to your connector pane for 'future use', make sure you place an Always Copy node on the error wire, or any caller could require a recompile if you make changes...

Yes, but they are not even wired now, so, as said, as far as I know LabVIEW will see them as dead code and remove them when the code is compiled.
Mellroth Posted July 26, 2012

Yes, but they are not even wired now, so, as said, as far as I know LabVIEW will see them as dead code and remove them when the code is compiled.

Personally, I feel that unwired terminals just add confusion to the BD (why are these not wired, should they have been, etc.). Regarding the code, I think a function like this should reuse the input buffer if possible. If the input array terminates in your function, it still requires an equally large memory portion to be allocated to store the result; instead, I prefer to save the result in the input array.

/J
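In a textual language, the buffer-reuse idea looks roughly like this (a hypothetical Python sketch; in LabVIEW the equivalent is operating in place on an array whose wire terminates in the subVI):

```python
def filter_array_in_place(data, items_to_filter):
    """Compact `data` in place: surviving elements are written over the
    front of the same buffer, so no second full-size array is allocated."""
    write = 0
    for x in data:
        if x not in items_to_filter:
            data[write] = x
            write += 1
    del data[write:]    # truncate to the kept elements
```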
ShaunR Posted July 26, 2012

Yes, but they are not even wired now, so, as said, as far as I know LabVIEW will see them as dead code and remove them when the code is compiled.

Wiring through error clusters (if they exist) is recommended, since otherwise, if someone wires them up, it will clear any errors on the wire from previous operations, which is unexpected behaviour.
Wouter Posted July 26, 2012

Wiring through error clusters (if they exist) is recommended, since otherwise, if someone wires them up, it will clear any errors on the wire from previous operations, which is unexpected behaviour.

Yes, but they aren't wired to the terminals. Anyhow, this is a bit off topic; let's stay on topic. If you want, you can remove them; it indeed does not matter.
Wouter Posted July 26, 2012 (edited)

I thought of something extra as well. This is nearly the same function as the previous one; however, this function has a callback as input instead of a list of items. So you can define your own filter function, with the same terminal type as the wgtk_filterArrayCallback template. Example: currently it is not possible to filter out all odd elements. If you build a separate VI that checks whether an element is odd and pass that strict VI ref to filterArray, it will filter out all odd elements.

wgtk_filterArray (callback).vi
wgtk_filterArrayCallback.vit
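The textual analogue of the callback version is a predicate passed in instead of an item list (again a Python sketch; the function name and the convention that the callback flags elements to remove are my assumptions):

```python
def filter_array_by_callback(data, should_filter):
    """Drop every element for which the callback returns True,
    like passing a strict VI reference built from the template."""
    return [x for x in data if not should_filter(x)]

# Usage: filter out all odd elements, as in Wouter's example
result = filter_array_by_callback([1, 2, 3, 4, 5, 6], lambda x: x % 2 == 1)
# result == [2, 4, 6]
```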
Mads Posted July 27, 2012

The OpenG array functions are some of the most useful VIs there are, I think. I'm very grateful for them. I've done quick jobs where those functions have been used extensively. In the cases where those applications have ended up being used on very large chunks of data, I've had to revisit the design though, to remove or redesign the OpenG functions to get acceptable performance.

So, when I saw this thread, I just picked the next array function I could find on the palette and threw together something I would expect to run faster (i.e. not necessarily the optimal solution, but a better one). Attached is the OpenG Remove Duplicates function revised (as mentioned, it was thrown together, so I've just used auto-cleanup on it; not very nice, but it does outperform the original quite nicely). The improvement varies a lot with the size of the array and the number of duplicates. I got anything from 2 to 5 times the speed of the original in my quick tests.

I've kept the original's inputs and outputs, but the performance would have been better if the indices output was dropped. For optimal performance it would be better to sort the input array and do a binary search for the duplicates. That would add another (optional?) step to recreate the original order of the input array, but still, on large arrays it would have a huge impact on the performance.

PS. To actually publish an improved version there is of course still a lot of work needed. If deemed of interest, I could always redo this one properly and apply it to the full polymorphic VI set.

MTO

Remove Duplicates from 1D Array (DBL)__improved.vi
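For reference, the contract being discussed (unique elements in their original order, plus the indices of their first occurrences) can be pinned down like this. Note this sketch uses a hash set, which is not how the LabVIEW VI works (it searches the output array as it builds it); the sketch only fixes the expected outputs:

```python
def remove_duplicates(data):
    """Order-preserving duplicate removal.
    Returns the unique elements and the index of each first occurrence."""
    seen = set()
    unique, indices = [], []
    for i, x in enumerate(data):
        if x not in seen:
            seen.add(x)
            unique.append(x)
            indices.append(i)
    return unique, indices

# remove_duplicates([3, 1, 3, 2, 1]) -> ([3, 1, 2], [0, 1, 3])
```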
Wouter Posted July 27, 2012 (edited)

I've kept the original's inputs and outputs, but the performance would have been better if the indices output was dropped. For optimal performance it would be better to sort the input array and do a binary search for the duplicates. That would add another (optional?) step to recreate the original order of the input array, but still, on large arrays it would have a huge impact on the performance.

I would rather first sort the array and then iterate over it; I think that's faster because of how LabVIEW works. It still has the same theoretical complexity: yours would be O(n log n + n log n) = O(2 n log n) = O(n log n), because you do a binary search, O(log n), inside a loop, which multiplies by n to give n log n. Mine is O(n log n + n) = O(n log n). But either would be faster than the current implementation, which is O(n^2).

-edit- Funny, a simple implementation of the sort-then-linear approach is still slower than yours.
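The sort-then-scan variant Wouter describes, written out as a sketch (Python again; note it loses the original element order, which is exactly the trade-off discussed above):

```python
def remove_duplicates_sorted(data):
    """Sort once, O(n log n), then one linear pass, O(n),
    dropping each element that equals its predecessor."""
    if not data:
        return []
    s = sorted(data)
    out = [s[0]]
    for x in s[1:]:
        if x != out[-1]:
            out.append(x)
    return out

# remove_duplicates_sorted([3, 1, 3, 2, 1]) -> [1, 2, 3]
```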
crossrulz Posted July 27, 2012

I'm a little surprised how much that In Place Element Structure helped with the split array. My attempt at the sort-first approach came out 2x slower than the current implementation. Maybe somebody can find somewhere to improve it.

Remove Duplicates from 1D Array (DBL)__improved.vi
egraham Posted July 27, 2012

I've realized there are a few caveats to keep in mind when looking at optimizing array manipulation functions.

1. Optimization of code for one data type does not necessarily affect other data types the same way.
2. The type of computer/processor can have a HUGE effect on how well your optimized code performs.
3. There are likely significant differences in code performance between different versions of LabVIEW.

Do some benchmark tests with different data types, different methods of processing your arrays, on different types of processors. I was astounded at the differences, and realized there's a lot of low-level processing going on behind the scenes that I would need to understand to know why performance differences existed between the different scenarios.
Mads Posted July 28, 2012

If the starting point is a function that repeatedly resizes the array unnecessarily, then I would expect there to be plenty of room for improvement before the variations due to data types etc. become significant (or?). If we could make sure that that part of the optimization is applied to all the OpenG functions, then we would surely have achieved a lot - and maybe as much as is possible without having different code for different data types, operating systems etc. (The former might not be too bad though...)
Wouter Posted July 28, 2012

I've realized there are a few caveats to keep in mind when looking at optimizing array manipulation functions. 1. Optimization of code for one data type does not necessarily affect other data types the same way. 2. The type of computer/processor can have a HUGE effect on how well your optimized code performs. 3. There are likely significant differences in code performance between different versions of LabVIEW.

1. This is true, but I think the differences may be very small. We can design two different algorithms: one that swaps elements in an array in place, and one that copies contents to another array. Case 1: when we swap elements in an array, it's just changing pointers, so constant time. Case 2: when we retrieve an element from array A and copy it to array B, it may indeed be the case that an implementation of case 1 is faster, because copying and moving that element in memory takes more than constant time; its cost depends on the amount of memory the data type consumes. (I'm not entirely sure case 2 works like this internally, but I think it does?)

2. This will always be the case.

3. True, but that is not our fault. I think it's fair to assume that the algorithm runs fastest on the newest version of LabVIEW and slowest on the first version of LabVIEW.

The goal, imo, is to achieve an algorithm that minimizes the number of operations and the memory footprint.
Mellroth Posted July 28, 2012

So, when I saw this thread, I just picked the next array function I could find on the palette and threw together something I would expect to run faster (i.e. not necessarily the optimal solution, but a better one). Attached is the OpenG Remove Duplicates function revised (as mentioned, it was thrown together, so I've just used auto-cleanup on it; not very nice, but it does outperform the original quite nicely). The improvement varies a lot with the size of the array and the number of duplicates. I got anything from 2 to 5 times the speed of the original in my quick tests.

We have a similar approach, but we didn't make use of the In Place Element Structure (we will from now on), and I think you can optimize this further by getting rid of that array copying before you enter the loop and by converting to a boolean before the case structure (case evaluation works faster on booleans than on integers, at least in my experience).

/J
Aristos Queue Posted July 29, 2012

Basically, if you do add the error in / error out terminals to your connector pane for 'future use', make sure you place an Always Copy node on the error wire, or any caller could require a recompile if you make changes...

Right idea, wrong actual solution. Instead of "Always Copy", use the "Mark As Modifier" option on the In Place Element Structure for this purpose. It signals the same "this subVI could modify this value" on the conpane without introducing an unnecessary copy in No Error cases.
Mads Posted July 30, 2012

We have a similar approach, but we didn't make use of the In Place Element Structure (we will from now on), and I think you can optimize this further by getting rid of that array copying before you enter the loop and by converting to a boolean before the case structure (case evaluation works faster on booleans than on integers, at least in my experience).

You are right, Mellroth - you save quite a bit of memory that way. The difference in speed is not noticeable at first (in fact it was negative on my machine), but it surfaces when debugging is disabled. Nice.
GregSands Posted July 30, 2012

In my experience, keeping memory allocation to a minimum is at least as important as execution time. The original array should be reused for the output (where possible), and every array access should be checked to confirm it is performed in place (with or without the IPES). In addition, output arrays of indices should be optional as to whether they are generated, with the default being false.

As a related note - wouldn't it be nice if LabVIEW could choose whether to create an array depending on whether the sub-VI output is wired or not? I think LabVIEW does take wiring into account within a single VI, but not in terms of connections to sub-VIs - please correct me if I'm wrong. I wonder if marking the subroutine as "inline" would make any difference?
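Greg's point about optional index outputs, as a textual sketch (hypothetical signature; LabVIEW would express the same with an optional input that skips the index-building branch entirely):

```python
def filter_array(data, items_to_filter, generate_indices=False):
    """When the caller does not ask for indices, skip allocating
    and filling that second output array entirely."""
    kept = [x for x in data if x not in items_to_filter]
    if not generate_indices:
        return kept, None
    indices = [i for i, x in enumerate(data) if x in items_to_filter]
    return kept, indices
```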
Aristos Queue Posted July 31, 2012

Greg: Yes, making the subVI inline does make it possible for the output calculation to be avoided entirely. Doing it on any subVI call is something the compiler optimization team has on its list of possible optimizations, but it is not currently implemented.
Mads Posted July 31, 2012

I've spent some additional time on the array function optimization. Attached is an archive of the revised VIs. It's just the 1D DBL versions so far, but the logic itself will of course apply to all the data types. The filter-array logic is the one developed by Wouter, but the inputs and outputs (of all the VIs) are identical to those of the originals.

The VI with the largest improvement so far is, not surprisingly, the Delete Elements function. The larger the array, the more significant the change, but to give an idea: on a 250k-element array the speed was 400 to 900x that of the original on my machine (depending on how big a fraction of the array (random elements) it should remove).

I've run tests in the IDE with and without debugging enabled, and on built applications. Sometimes the difference between variations of the algorithms would be very small, and then I've chosen the one that uses the least memory. I'm sure all of them can be improved even further, but it's a good step in the right direction, I think.

OpenG Array functions improved Rev1.zip
crossrulz Posted July 31, 2012

Impressive, but I think you got your Remove Duplicates from 1D Array messed up. It's the same as the Conditional Auto-Indexing Tunnel.
Mads Posted July 31, 2012

Indeed. I've attached a new set here with the correct Remove Duplicates.

OpenG Array functions improved Rev2.zip