torekp Posted October 17, 2008

I found one resource on clustering in LabVIEW, using k-means: http://forums.ni.com/ni/board/message?boar...uireLogin=False and several non-LabVIEW free software packages: http://glaros.dtc.umn.edu/gkhome/cluto/cluto/overview http://www.prudsys.com/Produkte/Algorithmen/Xelopes/ http://www.shih.be/dataMiner/screenShots.html ... and I'm just wondering if anyone has any suggestions. My data has over 1k dimensions and probably 10k samples (observed ordered N-tuples), and I want to cluster them into some fixed number of groups, fewer than 10. For a simple 2D example of clustering, here's some made-up data clustered into 3 groups:
Anders Björk Posted October 20, 2008

QUOTE (torekp @ Oct 16 2008, 07:55 PM) My data has over 1k dimensions and probably 10k samples ... and I want to cluster into some fixed number of groups, less than 10.

Hello. Your data table is 10k observations by 1k variables? Then I would build a PCA model and do the clustering on the PCA scores, to get a reasonable dimensionality for each observation. When new data arrive, you can use the PCA loadings to calculate new PCA scores.
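The suggested pipeline (PCA to reduce the dimensionality, then k-means on the scores, then reuse the stored mean and loadings for new observations) can be sketched in Python/NumPy rather than LabVIEW. This is only an illustrative sketch: the data, the component count, and the simple k-means routine are made up for the example, not taken from the thread.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the real table: n observations x p variables.
# (The thread's data is ~10k x ~1k; much smaller here so it runs quickly.)
n, p, k_components, k_clusters = 300, 50, 5, 3

# Make three well-separated groups so the clustering has something to find.
centers = rng.normal(size=(k_clusters, p)) * 8
X = np.vstack([centers[i] + rng.normal(size=(n // k_clusters, p))
               for i in range(k_clusters)])

# --- PCA by SVD on the mean-centered matrix ---
mean = X.mean(axis=0)
Xc = X - mean
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
loadings = Vt[:k_components].T      # p x k loading matrix
scores = Xc @ loadings              # n x k reduced coordinates

# --- plain k-means (k-means++-style init) on the scores ---
def kmeans(Y, k, iters=100, seed=0):
    r = np.random.default_rng(seed)
    # Spread initial centers out, weighting by squared distance.
    C = [Y[r.integers(len(Y))]]
    for _ in range(1, k):
        d2 = np.min(((Y[:, None] - np.array(C)[None]) ** 2).sum(-1), axis=1)
        C.append(Y[r.choice(len(Y), p=d2 / d2.sum())])
    C = np.array(C)
    for _ in range(iters):
        labels = np.argmin(((Y[:, None] - C[None]) ** 2).sum(-1), axis=1)
        newC = np.array([Y[labels == j].mean(axis=0) if np.any(labels == j)
                         else C[j] for j in range(k)])
        if np.allclose(newC, C):
            break
        C = newC
    return labels, C

labels, C = kmeans(scores, k_clusters)

# New observations reuse the stored mean and loadings, as Anders suggests:
x_new = centers[0] + rng.normal(size=p)
score_new = (x_new - mean) @ loadings
```

Clustering in the 5-dimensional score space rather than the original 50 (or 1000) dimensions makes each k-means distance computation correspondingly cheaper, which is the point of the suggestion.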
torekp Posted October 21, 2008 (Author)

Thanks Anders, that sounds like a really smart idea for saving lots of computing time. Unfortunately it also means more programming time, so I guess I'll see how bad this k-means computation is first.
Anders Björk Posted October 21, 2008

QUOTE (torekp @ Oct 20 2008, 02:39 PM) Unfortunately it involves more programming time - I guess I'll see how bad this k-means computation is, first.

If your matrix is complete, it can be done with SVD plus four or five other VIs.
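"SVD plus four or five other VIs" really is all PCA takes on a complete matrix. A minimal NumPy sketch of those few steps (center, SVD, scores, loadings, variance shares), with made-up data for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
# Made-up data with correlated columns, so the PCs are meaningful.
X = rng.normal(size=(40, 6)) @ rng.normal(size=(6, 6))

Xc = X - X.mean(axis=0)                            # 1. center each column
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # 2. SVD
scores = U * s                                     # 3. scores  T = U*S
loadings = Vt.T                                    # 4. loadings P (columns = PCs)
var_explained = s**2 / (s**2).sum()                # 5. relative importance

# Sanity check: the factorization reproduces the centered data exactly.
assert np.allclose(Xc, scores @ loadings.T)
```

SVD returns the singular values already sorted from largest to smallest, so no extra sorting step is needed (unlike the eigenvector route discussed later in the thread).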
torekp Posted October 23, 2008 (Author)

QUOTE (Anders Björk @ Oct 20 2008, 03:16 PM) If your matrix is complete, it can be done with SVD plus four or five other VIs.

Can you explain that more fully? Here is my attempt to follow someone else's recipe for PCA (pp. 52-53). I made up some data, and the resulting factor weights (eigenvectors) and eigenvalues SEEM reasonable, but what do I know. (Not much.)
torekp Posted October 23, 2008 (Author)

And here is my attempt to follow this recipe (thanks, Los Alamos!) using SVD. I changed from normalizing my data matrix to merely centering it, but I changed my earlier VI similarly and the results agree. Hooray! Does this mean I actually did it right? According to the website, the vector S^2 is proportional to the variances of the principal components, so I'm taking it as a measure of how important each component is.
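The S^2 claim is easy to verify numerically: on a centered matrix, the variance of the i-th principal component equals s_i^2/(n-1), so the squared singular values rank the components by importance exactly. A small check, on made-up data with deliberately unequal column variances:

```python
import numpy as np

rng = np.random.default_rng(2)
# Columns with growing variance, so the PC variances differ clearly.
X = rng.normal(size=(200, 8)) * np.arange(1, 9)

n = X.shape[0]
Xc = X - X.mean(axis=0)        # centering only (no scaling), as in the post
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T             # principal-component coordinates

# Variance of PC i is s_i^2 / (n - 1): S^2 is proportional to the variances.
pc_var = scores.var(axis=0, ddof=1)
assert np.allclose(pc_var, s**2 / (n - 1))
```

Note the difference between centering and normalizing matters: centering alone keeps each variable's own scale, so high-variance variables dominate the leading components; normalizing (dividing by the standard deviation) puts all variables on an equal footing first.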
Anders Björk Posted October 28, 2008

QUOTE (torekp @ Oct 22 2008, 02:58 PM) Can you explain that more fully? ... the resulting factor weights (eigenvectors) and eigenvalues SEEM reasonable, but what do I know. (Not much.)

It is common to collect the vectors into a matrix so that the eigenvectors can be sorted from largest to smallest eigenvalue, which is standard in PCA.

QUOTE (torekp @ Oct 22 2008, 03:59 PM) According to the website, the vector S^2 is proportional to the variances of the principal components, so I'm taking it as a measure of how important each component is.

A PCA model is usually written X = TP' + E, where T is S times what you now call scores: score vector 1 times S1, score vector 2 times S2, and so on. That way your PCA model is factored into two matrices instead of three. Sorry for the late answer; I had a bit too much work for a while.
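The X = TP' + E factorization follows directly from the SVD: folding S into U gives T = U*S, P comes from V, and truncating to k components leaves the residual E. A NumPy sketch on made-up data (the size and k are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
Xc = rng.normal(size=(30, 10))
Xc -= Xc.mean(axis=0)            # PCA assumes a centered matrix

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 3                            # number of retained components
T = U[:, :k] * s[:k]             # scores already scaled by S (two factors, not three)
P = Vt[:k].T                     # loadings
E = Xc - T @ P.T                 # residual: the part the k components miss

# The residual is orthogonal to the model part on both sides,
# because the discarded singular vectors are orthogonal to the kept ones.
assert np.allclose(T.T @ E, 0, atol=1e-8)
assert np.allclose(E @ P, 0, atol=1e-8)
```

With k equal to the full rank, E vanishes and T @ P.T reproduces Xc exactly; for smaller k, the Frobenius norm of E is the reconstruction error.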
GregSands Posted October 28, 2008

Does the experimental clustering library at NI Labs help you at all? Cheers ~ Greg
torekp Posted November 21, 2008 (Author)

QUOTE (GregSands @ Oct 27 2008, 11:24 PM) Does the experimental clustering library at NI Labs help you at all?

Wow! I haven't tried it yet, but I imagine so. It figures: every time I reinvent the wheel, either NI publishes the same thing a little later, or I discover it in OpenG or something, usually a lot more robust and versatile than my version.
LAVA 1.0 Content Posted November 21, 2008

QUOTE (torekp @ Nov 20 2008, 02:48 PM) ...It figures. Every time I reinvent the wheel, either NI publishes the same thing a little later, or I discover it in OpenG or something. Usually a lot more robust & versatile than my version.

Ditto that! I developed an architecture that would allow arbitrary data paths between VIs, so that customers could compose a system from building blocks that I provided. I spent weeks getting it all to work correctly, in BridgeVIEW 2.1 (LV 5.1). Before the ink was dry on the paper, NI released LV 6.0, which featured Control References and completely invalidated all of my work. :headbang: Funny, but I actually threw away the disks with backups of that code and enjoyed doing it. :thumbup: Sometimes you are better off being the second person with the idea.* Ben

*Grand unification theory is probably one of those exceptions.