Jump to content

Index of the Median Value in an array


Laur

Recommended Posts

I am a fairly new LabVIEW user, so I apologize is this is an incredibly silly question. I am trying to find the index of where the median value is in an array. I am finding the frequency spectrum of a simulated sine wave, and I can get the correct median value; however, I am unsuccessful in finding the index value of where that median value is (I use "Search 1D Array", which returns -1 every time). Please see the attached for a shot of what I have attempted using LabVIEW 8.5. Any suggestions will be extremely appreciated!

Link to comment

QUOTE (Laur @ Feb 1 2009, 12:54 AM)

I am a fairly new LabVIEW user, so I apologize is this is an incredibly silly question. I am trying to find the index of where the median value is in an array. I am finding the frequency spectrum of a simulated sine wave, and I can get the correct median value; however, I am unsuccessful in finding the index value of where that median value is (I use "Search 1D Array", which returns -1 every time). Please see the attached for a shot of what I have attempted using LabVIEW 8.5. Any suggestions will be extremely appreciated!

A workaround:

You could sort your data first and take out index at 50% (of max index) of your sorted data.

Aha! If you think of the definition of median... you will realize why you get -1.. You need to be very lucky to get a element that is equal to an average of the two elements in the middle.

Link to comment

QUOTE (Anders Björk @ Feb 1 2009, 01:19 AM)

A workaround:

You could sort your data first and take out index at 50% (of max index) of your sorted data.

Aha! If you think of the definition of median... you will realize why you get -1.. You need to be very lucky to get a element that is equal to an average of the two elements in the middle.

If the array has an even number of elements, the median is the average of the two points just below and above the 50% value (top 50% and bottom 50%). This may of course result in a median being returned which is not, in itself, part of the array. Thus searching for this median will (almost always) be unsuccessful. You MAY luck out and have exactly the same value for the two respective values....

Only if the array has an odd number of elements will the Median returned actually be a member of the array.

Shane.

Link to comment

QUOTE (Anders Björk @ Feb 1 2009, 06:05 PM)

I think NI should provide index/-es as an output not only the median value. Certainly it is in their computations, why not gives us that data too.

Please re-read my post: A median index normally does not exist!

By far the majority of arrays being "medianed" are of even length. As such, the median value is not actually present in the array, therefore it can't have an index....

What you are requesting is impossible.

Shane.

Link to comment

The median is just the middle number of the array. In the case where the array has an even number of elements, the median is the average of the two elements at the middle. But the *index* of the median is always (size of array / 2) rounded up. So all you need are the Array Size function, the Quotient&Remainder function and the Add -- all of which can be found by searching the palettes. Just get the size of the array, use quotient and remainder to divide by two and then add the resulting remainder to the quotient. (If the remainder was zero, then you had an even array and you have the lower index; if the remainder was 1 then you have the index of the middle element of the array).

Link to comment

QUOTE (Aristos Queue @ Feb 1 2009, 03:17 PM)

... if the array is sorted! Most interesting physical data is not sorted (until you run the Median function on it, but the function does not return the sort order).

QUOTE (Anders Björk @ Jan 31 2009, 04:19 PM)

You could sort your data first and take out index at 50% (of max index) of your sorted data.

If you go this route, you should run your array into a FOR loop, and cluster each point with its index value. Then sort the array of clusters. The median will be at the halfway point of the sorted index, and the second element in that cluster will be the original index.

QUOTE (Laur @ Jan 31 2009, 03:54 PM)

I am finding the frequency spectrum of a simulated sine wave, and I can get the correct median value; however, I am unsuccessful in finding the index value of where that median value is

... Any suggestions will be extremely appreciated!

Laur, what the heck are you trying to measure? The FFT of a sine wave should be a sharp peak at the fundamental. The majority of the FFT values will represent the tiny amounts of noise at the rest of the frequencies. Whichever one happens to be the median will be at a nearly random frequency. A much more interesting measurement is the median power frequency, which divides the spectrum into equal areas. However, that is also trivial for a sine wave, since it should be at the fundamental.

Link to comment

QUOTE (jdunham @ Feb 1 2009, 10:54 PM)

... if the array is sorted!
Bah. "Median" is the middle number of the set... and an array is an ordered set. So I'll contend that for an array, giving the middle value is technically valid. But I will concede that the subVI in question gives -- what I would call -- the median of the sorted permutation of the array, which does make the finding of the median more difficult.

Having said that ... you recommended clustering the value with the index and then sorting... I wonder if that would really benchmark well compared with just calling Sort 1D Array, calculating the middle and then using Search 1D Array on the original array. Sorting an array can be done without any new array allocations, and the data movement is significantly less that moving around the value plus an additional 4 bytes of index. The Search 1D Array is linear, but a multi-core CPU might break that down pretty quick compared to the actual data movement incurred in the larger sort -- after all, if your data is type double then the double+int is larger than a single assembly "move" instruction.

[Note that for any array less than a few milion elements this sort of performance discussion is probably pointless on a modern CPU, but I find the thought experiment interesting nonetheless.]

Link to comment

QUOTE (Aristos Queue @ Feb 2 2009, 12:03 AM)

Having said that ... you recommended clustering the value with the index and then sorting...

Well the OP said he was new to LabVIEW, and a lot of new users don't understand that the sort routine will take an array of clusters and that you can drop the index in there and that that can be extremely useful.

I hadn't thought about your idea of searching the array rather than sorting a cluster, though in most cases, you would have to be very careful about duplicated search values. You might have to find all of the search values in the original array and come up with a way to handle them. That's another reason why the OP request is totally bogus. If his FFT were symmetric, he would get duplicate median values on either side of the peak. I just can't imagine how that would be useful.

Link to comment

QUOTE (Aristos Queue @ Feb 1 2009, 03:17 PM)

But the *index* of the median is always (size of array / 2) rounded up. So all you need are the Array Size function, the Quotient&Remainder function and the Add -- all of which can be found by searching the palettes. Just get the size of the array, use quotient and remainder to divide by two and then add the resulting remainder to the quotient. (If the remainder was zero, then you had an even array and you have the lower index; if the remainder was 1 then you have the index of the middle element of the array).

I think this needs some modification in order to be correct.

Example:

Array size = 4

Quotient (after dividing by 2) = 2

Remainder = 0

Median = 2

The indexes of the array would be 0, 1, 2 and 3 and the median index should be 1.5. Using the method above with an array with an odd number of elements produces a median index that is one higher than it should be.

The correct algorithm for finding the median index of an array should be subtracting 1 from the array length and then dividing by 2.

Example revisited:

Array size = 4

Array size - 1 = 3

Quotient (after dividing by 2) = 1.5

Link to comment

QUOTE (Aristos Queue @ Feb 2 2009, 09:03 AM)

Bah. "Median" is the middle number of the set... and an array is an ordered set. So I'll contend that for an array, giving the middle value is technically valid. But I will concede that the subVI in question gives -- what I would call -- the median of the sorted permutation of the array, which does make the finding of the median more difficult.

Having said that ... you recommended clustering the value with the index and then sorting... I wonder if that would really benchmark well compared with just calling Sort 1D Array, calculating the middle and then using Search 1D Array on the original array. Sorting an array can be done without any new array allocations, and the data movement is significantly less that moving around the value plus an additional 4 bytes of index. The Search 1D Array is linear, but a multi-core CPU might break that down pretty quick compared to the actual data movement incurred in the larger sort -- after all, if your data is type double then the double+int is larger than a single assembly "move" instruction.

[Note that for any array less than a few milion elements this sort of performance discussion is probably pointless on a modern CPU, but I find the thought experiment interesting nonetheless.]

For 1 million DBL values, the NI Median implementation is 4 times faster than an array sort (According to my quick test).

My median implementation is around 12-15 times faster than a sort. You need a lot of cores to make up that difference.

I still maintain however that the index of a median position is in upwards of 90% of the time a non-existent entity.

Shane.

Link to comment

QUOTE (shoneill @ Feb 2 2009, 10:36 AM)

For 1 million DBL values, the NI Median implementation is 4 times faster than an array sort (According to my quick test).
True, but the NI Median implementation is not returning the data we are looking for -- namely the index of that median. If we put a new requirement on the NI implementation, I'm sure they'd have to change their calculations to find that value somehow. I was just commenting on benchmarking the "sort then search" method vs the "cluster then sort" method.

QUOTE

I still maintain however that the index of a median position is in upwards of 90% of the time a non-existent entity.

Mathematics would be in a sorry state today if people gave up searching just because the entity was non-existent. Are you familiar with Stanislaw Lev's Cyberiad? One of my all-time favorite quotes, posted on my desk at work...

Everyone knows that dragons don't exist. But while this simplistic formulation may satisfy the layman, it does not suffice for the scientific mind. The School of Higher Neantical Nillity is in fact wholly unconcerned with what does exist. Indeed, the banality of existence has been so amply demonstrated, there is no need for us to discuss it any further here. The brilliant Cerebron, attacking the problem analytically, discovered three distinct kinds of dragon: the mythical, the chimerical, and the purely hypothetical. They were all, one might say, nonexistent, but each nonexisted in an entirely different way...

Link to comment

QUOTE (Aristos Queue @ Feb 3 2009, 12:06 AM)

True, but the NI Median implementation is not returning the data we are looking for -- namely the index of that median. If we put a new requirement on the NI implementation, I'm sure they'd have to change their calculations to find that value somehow. I was just commenting on benchmarking the "sort then search" method vs the "cluster then sort" method.

Mathematics would be in a sorry state today if people gave up searching just because the entity was non-existent. Are you familiar with Stanislaw Lev's Cyberiad? One of my all-time favorite quotes, posted on my desk at work...

Everyone knows that dragons don't exist. But while this simplistic formulation may satisfy the layman, it does not suffice for the scientific mind. The School of Higher Neantical Nillity is in fact wholly unconcerned with what does exist. Indeed, the banality of existence has been so amply demonstrated, there is no need for us to discuss it any further here. The brilliant Cerebron, attacking the problem analytically, discovered three distinct kinds of dragon: the mythical, the chimerical, and the purely hypothetical. They were all, one might say, nonexistent, but each nonexisted in an entirely different way...

Just wanted to point out the execution time differences between a "proper" median search and a sort.

There be no dragons here....

Shane.

Link to comment

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.