
Delete multiple items from array?


jpo


The Delete From Array function in LabVIEW deletes only a contiguous set of elements, defined by the Length parameter starting at the Index input. I'm guessing you are looking to delete specific indexes of the array; for that you will need to implement a For loop. The AutoGrow is there to support multi-dimensional arrays. The OpenG Array library does have a 'Delete Elements from Array' function which could meet your needs.

Edited by Justin Thomas

You cannot. The costly method (more so the larger the array and the number of deletes) is to do repeated deletes, decrementing the delete indexes as you go if you do it in ascending order (see the attached example code). It is costly in both memory and speed due to reallocation on every round.

(The same goes for array building; do not do it one element/subarray at a time. Pre-allocate the full size then replace, or use auto-indexing in a For loop to have LabVIEW pre-allocate automatically.)
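In rough Python terms the difference looks like this (an illustrative sketch only; in LabVIEW the second version corresponds to Initialize Array plus Replace Array Subset, or letting a For loop auto-index the output so LabVIEW pre-allocates for you):

```python
def build_incrementally(n):
    out = []
    for i in range(n):
        out = out + [i * 2]   # re-allocates and copies the whole array every round
    return out

def build_preallocated(n):
    out = [0] * n             # allocate the full size once up front
    for i in range(n):
        out[i] = i * 2        # replace in place, no re-allocation
    return out
```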

For multiple deletes, nowadays I suspect conditional array indexing will most often be the most (or at least quite) efficient way to handle it. Something like this (NOTE: this particular example assumes sorted (and non-empty) delete indexes!):

(Image: delete given indexes from array.png)
If the evaluation of which elements to delete can be done within the loop, the logic gets simpler...
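In rough text form, the idea in the snippet is something like this (an illustrative Python sketch of the approach only, not the attached VI; as noted, it assumes the delete indexes are sorted and non-empty):

```python
def delete_indexes(arr, delete_idx):
    """Single pass over arr, keeping every element whose index is not listed
    in delete_idx. Assumes delete_idx is sorted ascending and non-empty."""
    out = []              # plays the role of the conditional (indexing) tunnel
    j = 0                 # position in the sorted list of indexes to delete
    for i, x in enumerate(arr):
        if j < len(delete_idx) and delete_idx[j] == i:
            j += 1        # this index is marked for deletion; skip the element
        else:
            out.append(x)
    return out

# delete_indexes([10, 20, 30, 40, 50], [1, 3])  ->  [10, 30, 50]
```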

Doing a quick test on my computer, comparing it with deletes of half the elements in an array of 100k elements (an extreme case), this cuts the time of the operation from a whopping 3 seconds to 0.4 ms. (OpenG does the multiple deletes from the end instead of the start; that is only marginally better (1.5 s) than deleting from the start (3 s), as you can see in the example code.)

Before effective conditional indexing in For loops was added to LabVIEW, a more efficient algorithm (it still is, but only very slightly) was to shift the array elements we want to keep to the front (using replace, which is very fast and does not allocate), then take the subarray with the new size in one operation. I have added that to the attached example.
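A corresponding sketch of that shift-then-truncate approach (again illustrative Python, not the attached VI; the set lookup here just stands in for the index bookkeeping the diagram does):

```python
def delete_indexes_shift(arr, delete_idx):
    """Compact the kept elements toward the front in place, then truncate once.
    Modifies arr in place; the per-element replace does not re-allocate, and the
    final truncation is a single operation (Array Subset, in LabVIEW terms)."""
    to_delete = set(delete_idx)
    write = 0                       # next position to write a kept element
    for read in range(len(arr)):
        if read not in to_delete:
            arr[write] = arr[read]  # replace is cheap: no re-allocation
            write += 1
    del arr[write:]                 # one truncation instead of many deletes
    return arr
```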

 

Methods of deleting multiple array elements.vi

Edited by Mads

Hey, that's a pretty cool speed test. Even if you turn down the samples to something more reasonable like 100 or 1000, the OpenG method still loses by an order of magnitude. Would you mind if I adapted your code into the Hooovahh Array VIMs package? At the moment it is basically the OpenG method, with an optional input for whether the indexes to remove are already sorted or not. The OpenG method returns the deleted elements, and there is some bookkeeping that needs to take place if that array isn't sorted. But if your method works the way I think it does, its performance with or without sorted indexes should be similar.

Also if anyone sees performance improvements for that array package I'd be interested in adding them.  Most of it is the OpenG methods, with a few changes to help performance.

EDIT: Oh, the OpenG method does work with unsorted elements to remove, and returns the deleted elements in the correct order. I think the shift-and-subarray method can still generate the same output, but it needs extra work to track things, which might eat into that time difference.


Okay, as with most things, there is some nuance. If the number of elements being deleted is very small, the OpenG method is faster, but it has to be pretty small, and the main array you are deleting from needs to be pretty large. Attached is the version that I think works well; it supports sorted or unsorted indexes to delete, with the same output as the OpenG method, which includes the deleted elements.

Methods of deleting multiple array elements Hooovahh Test.vi


Looks good @hooovahh, feel free to use it / the inspiration 🙂 in your VIMs.

PS. Twelve years ago (!) I made other optimized implementations of many of the OpenG functions and posted them here.
I could not find that post, though, so I have attached them here now. I am sure some of them are not that much quicker anymore or have better alternative implementations now, but some of them probably are still better. I have not gone through them again yet.

1509324986_OpenGArrayRevisedR4LV2011.zip


Wow, there is a lot to go over here. I'm unsure how long it will take to run various performance tests on these. I'm sure they are better than the native OpenG stuff, but I need to also compare them to the changes I made. I see you tend to reuse input arrays, which is a great idea. I tend to use indexing and conditional tunnels, and my suspicion is that your method is better. Another thing I tend to do is reuse functions in other functions, like how my Filter 1D Array uses the Delete Elements From Array inside it. This makes for more readable code, and improvements to the Delete function will improve the Filter function. But I will have to run some tests to see if there is a measurable improvement with yours.


The old readme file says that only 8 functions were changed, so I took a look at one of them now:

Filter 1D array

Improvement when filtering every 5th element in a 100k array:   2.4x
Improvement when filtering every 100th element: 2.2x
Improvement when filtering every 1000th element: 1.8x

OK, but not that impressive...

However, looking at the algorithm I saw that it does repeated (linear) 1D array searches through the whole array of elements to filter, and that is costly. So I threw in a new alternative where that search is based on a sorted search instead. That makes a big difference when the number of items to filter is big; for the test cases above, the results on my computer were:

Improvement when filtering every 5th element in a 100k array:   227x
Improvement when filtering every 100th element: 22x
Improvement when filtering every 1000th element: 3x

Now the last algorithm changed two things: it uses both conditional indexing and a sort/search-sorted-array approach. The sorting bit could just as well be added to the original revision... I guess the reason OpenG still does not use conditional indexing is to stay backwards compatible (with very old versions by now though...), and then the built-in sorted search would not be available either. That could be replaced by a binary search function that was available back then too, though.
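In rough Python terms, the change amounts to something like this (a sketch only, with names of my own; the point is sorting the filter list once and then binary-searching it for every element, instead of doing a linear search each time):

```python
from bisect import bisect_left

def filter_1d(arr, items_to_filter):
    """Keep the elements of arr that are not in items_to_filter."""
    sorted_filter = sorted(items_to_filter)      # O(m log m), done once
    out = []
    for x in arr:                                # O(n log m) in total
        pos = bisect_left(sorted_filter, x)
        found = pos < len(sorted_filter) and sorted_filter[pos] == x
        if not found:
            out.append(x)                        # conditional indexing
    return out
```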

Filter 1D test.zip

Edited by Mads

I might be wrong, but I don't think your Revised v2 is working. The Result array doesn't have the same number of elements as it does with the other modes. My Array VIMs package at the moment targets LabVIEW 2018 and newer, so I don't mind conditional tunnels, inlining, or VIMs (obviously). No Maps or Sets yet, but maybe one day.

That being said, my Filter 1D is already pretty good with your previous help. OpenG is 1.3, Revised is 0.7, my version with your help is 0.7, and if I use the No Duplicates input on my version it is 0.4. I did go through the other VIs you said had changes; some had a measurable improvement, some were close enough to what I already had.


That is pretty fast. I have noticed that on smaller arrays some of the other methods work better. Somewhere between 100 and 1000 elements the first Revision works better than Revision 2. And the Hooovahh method with No Duplicates is faster than Revision 2 at at least 500 elements. I've also seen that with a smaller number of items to filter, other methods work better than Revision 2. Even OpenG beats it when there are many items, but only a few to filter. In most cases when I'm filtering an array, I have a relatively large set of data and a small-ish array of things to remove.

I'm finding it difficult to know which method is best for the most common use cases. Is it worth spending some processing time reading the array sizes, and then picking the algorithm that works best for that size? The problem I see with this is that compiler optimizations in later releases of LabVIEW may make some methods faster or slower. Since my method basically runs a For loop over all Items to Filter, and your method runs over all of Array In, I could say that if the Items to Filter count is 10 or less, run the Hooovahh method; if it is greater than 10, use your Revision 2. If No Duplicates is used and the items to filter is 500 or less, use the Hooovahh method, otherwise use Revision 2. I'll think about it, but a hybrid approach might not be a bad idea, even if it complicates things a bit. I'd say of the array VIMs on my palette, Filter and Remove Duplicates are the ones I most commonly use, so getting the most performance out of these would be good.
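Roughly, the dispatch being considered would look like this (an illustrative Python sketch; the helper implementations and the 10/500 thresholds are just placeholders from the paragraph above, not anything in the package):

```python
def filter_by_items(arr, items):
    """Loop over the (small) filter list and delete matches from a working copy."""
    out = list(arr)
    for item in items:
        out = [x for x in out if x != item]
    return out

def filter_by_array(arr, items):
    """Loop once over the (large) input array with a fast membership test."""
    lookup = set(items)
    return [x for x in arr if x not in lookup]

def filter_1d_hybrid(arr, items_to_filter, no_duplicates=False):
    """Pick an approach based on the size of the filter list; the thresholds
    are tentative values, to be tuned by benchmarking."""
    limit = 500 if no_duplicates else 10
    if len(items_to_filter) <= limit:
        return filter_by_items(arr, items_to_filter)
    return filter_by_array(arr, items_to_filter)
```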


Performance mainly becomes an issue when the task at hand is heavy.

I do not think saving microseconds on a small array/filter list is worth losing milliseconds or seconds when the array gets large. So that would be the rationale for simply ignoring the "quick on small arrays, slow on large" and choosing the one that is the fastest in the worst cases.

However, there is one case where this can become an argument *for* keeping the other algorithms too; namely when you have to repeatedly run through small array/filtering jobs... 🤨 Life is never easy 😬

The "Revised version" is the best middle ground; quicker than all on most small and medium arrays, and still 2x times quicker than OpenG on worst case scenarios. The Rev2 one is 10 times quicker then...but in truly worst case scenarios you typically end up writing customized algorithms instead of using generic libraries anyway...(I would then often skip the deleted array indexes output to get things even quicker e.g.).

 

Edited by Mads

We are not sorting the array we are processing, only the array of elements to delete. That is done just to facilitate the filtering algorithm; it does not affect the array that is to be processed.

(Unless the sort has somehow gotten misplaced somewhere in the code that has been bounced around here...)


When it comes to "Remove Duplicates" the performance of OpenG is a bit of a curious case (unless I am just overlooking something obvious).

The original OpenG function builds the output arrays one element at a time, but still (in my LabVIEW 2022 at least) outperforms the alternatives I made where I reuse the input array.

The latter use less memory, it seems, as expected, but it is as if LabVIEW has become smarter when there are repeated build operations (though not for the functions we have gone through previously... and yes, debugging is disabled). 🤨

The attached version is a simplified version of the originally suggested revision. As mentioned, it does seem to use less memory than the original OpenG, but it is still slower 😞
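For reference, the two strategies being compared look roughly like this in Python (a sketch only; the set membership test is just a Python convenience, not how the VIs do the lookup, and it says nothing about how LabVIEW allocates under the hood):

```python
def remove_duplicates_build(arr):
    """Build the output one element at a time (OpenG style)."""
    seen, out = set(), []
    for x in arr:
        if x not in seen:
            seen.add(x)
            out.append(x)      # repeated build operation
    return out

def remove_duplicates_in_place(arr):
    """Reuse the input array: overwrite kept elements toward the front,
    then truncate once at the end."""
    seen = set()
    write = 0
    for x in arr:
        if x not in seen:
            seen.add(x)
            arr[write] = x
            write += 1
    del arr[write:]
    return arr
```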

Remove duplicates.zip

Edited by Mads
8 hours ago, Mads said:

We are not sorting the array we are processing, only the array of elements to delete. That is done just to facilitate the filtering algorithm; it does not affect the array that is to be processed.

(Unless the sort has somehow gotten misplaced somewhere in the code that has been bounced around here...)

My bad, it does not scramble elements 😞 Being busy on the side dealing with a lab flood doesn't help with focus...

On 2/25/2023 at 8:12 AM, Mads said:

I do not think saving microseconds on a small array/filter list is worth losing milliseconds or seconds when the array gets large. So that would be the rationale for simply ignoring the "quick on small arrays, slow on large" and choosing the one that is the fastest in the worst cases.

Oh, I totally agree here and wouldn't suggest otherwise. All I'm saying is that specifically for Filter 1D there are two main approaches: perform the filter in a For loop running over all elements in the Filter Array, or in a For loop running over all elements in Array In. If one is small and the other is large, one approach will easily outperform the other, even if it isn't optimized. So maybe the technique to use could be decided based on the size of the Filter Array. Additionally, I like the No Duplicates input, because there are often times when I'm filtering an array of numeric indexes, and removing 0 or 1 elements for each Filter Array element can make for lots of shortcuts. I tried optimizing your sort and binary search with this, but the benefit wasn't much, since I think the majority of the time is spent sorting the array for large arrays, and then the actual binary search is very quick.

Similarly, I've added an Array Is Sorted input to my Remove Duplicates VIMs. If you are removing duplicates from an array that has already been sorted, there are shortcuts that can be taken that save lots of time. I think the Remove Duplicates I have is a very slight improvement on the OpenG method, and I'll stick with what I have at the moment. However, your sort-then-work approach does give me an idea I'll test when I get time.
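The sorted-array shortcut is essentially a single compare-with-previous pass, roughly like this Python sketch (illustrative only):

```python
def remove_duplicates_sorted(arr):
    """Remove duplicates from an already sorted array: duplicates are
    adjacent, so comparing each element with the previous kept one is enough."""
    out = []
    for x in arr:
        if not out or x != out[-1]:
            out.append(x)
    return out

# remove_duplicates_sorted([1, 1, 2, 3, 3, 3, 7])  ->  [1, 2, 3, 7]
```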

  • 1 month later...

Crap, it sure looks like it. I did lots of unit testing, but apparently it didn't test for that. Also note that if Indicies Ascending is true, the result is correct. Version 3.0.1.15 has this fix on VIPM.IO. Thanks a lot for finding issues like this.

