
Logistic regression in LabVIEW



How would you do logistic regression in Labview? Some key points (see the wiki link for more detail), where f(z) is a probability function for some event:

f(z) = 1 / (1 + e^(-z))

The variable z is usually defined as

z = β0 + β1·x1 + β2·x2 + β3·x3 + ...

where β0 is called the "intercept" and β1, β2, β3, and so on, are called the "regression coefficients" of x1, x2, x3 respectively. To find the parameters βi, one optimizes them so as to obtain the maximum likelihood of getting the actual observed data. Making a few semi-reasonable assumptions, this amounts to maximizing

L*(θ) = ln L(θ) = Σi [ yi · ln f(zi) + (1 − yi) · ln(1 − f(zi)) ]

where L(θ) is the likelihood of getting the observed data, L* is just the log of that, and yi is the observed 0/1 outcome for case i.
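In text form (a rough Python sketch standing in for the block diagram; the variable names are just illustrative), those pieces look something like:

```python
import numpy as np

def logistic(z):
    """f(z) = 1 / (1 + exp(-z)): the probability of the event."""
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(beta, X, y):
    """L*(theta): log-likelihood of the observed outcomes y given measurements X.

    beta -- array [beta0, beta1, ..., betak] (intercept plus regression coefficients)
    X    -- (n_cases, k) array of measurements x1..xk, one row per case
    y    -- (n_cases,) array of observed 0/1 outcomes
    """
    z = beta[0] + X @ beta[1:]          # intercept plus weighted measurements
    p = logistic(z)
    return np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
```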

Why use this approach? In my application, I've got a few categories of scrap metal that I'm trying to sort, let's say Al, Mg, and Zn. They're on a belt, and several diagnostic tests are done as they whiz by; each test outputs one or more real numbers. No one test is absolutely decisive, thus the need for optimizing their weightings.

I'm thinking of negating L*(θ) and using Conjugate Gradient nD.vi from the Optimization palette. Does anyone have experience with this type of problem, and want to offer advice? If you think I'm barking up the wrong statistical/optimization tree, I'd like to hear your thoughts on that too.
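For anyone who wants to check the math outside LabVIEW, a rough Python sketch of the negated objective and its analytic gradient would look something like the following (SciPy's conjugate-gradient routine standing in for Conjugate Gradient nD.vi; the data here are made up just to exercise it):

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(beta, X, y):
    """-L*(theta): maximizing L* is the same as minimizing -L*."""
    z = beta[0] + X @ beta[1:]
    p = 1.0 / (1.0 + np.exp(-z))
    return -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def neg_log_likelihood_grad(beta, X, y):
    """Analytic gradient: d(-L*)/dbeta_j = sum_i (p_i - y_i) * x_ij."""
    z = beta[0] + X @ beta[1:]
    p = 1.0 / (1.0 + np.exp(-z))
    Xd = np.column_stack([np.ones(len(y)), X])   # design matrix with a column of ones for beta0
    return Xd.T @ (p - y)

# Made-up data: 200 pieces, 4 diagnostic measurements each
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(size=200) > 0).astype(float)

beta_start = np.zeros(5)                         # intercept + 4 coefficients
fit = minimize(neg_log_likelihood, beta_start, args=(X, y),
               jac=neg_log_likelihood_grad, method="CG")
print(fit.x)                                     # fitted [beta0, beta1, ..., beta4]
```

The analytic gradient is cheap here, so if the LabVIEW VI accepts a gradient function reference, supplying it should help with larger datasets rather than relying on numerical differencing.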


Why not use principal component analysis to cluster the different scrap classes together instead? From my perspective that would be easier (and probably more robust). Once you have found the classes, you could build a classifier based on the PCA scores.

Another method would be discriminant analysis using partial least squares.


QUOTE (Anders Björk @ Jul 2 2008, 03:15 PM)

Why not use principal component analysis to cluster the different scrap classes together instead? From my perspective that would be easier (and probably more robust). Once you have found the classes, you could build a classifier based on the PCA scores.

Another method would be discriminant analysis using partial least squares.

Thanks for the ideas. Without buying the toolkit from NI (Advanced Signal Processing, I think it was), how would you set up PCA in LabVIEW?


QUOTE (torekp @ Jul 3 2008, 08:23 PM)

Thanks for the ideas. Without buying the toolkit from NI (Advanced Signal Processing, I think it was), how would you set up PCA in LabVIEW?

Using the SVD VI or coding up the NIPALS algorithm. There are also examples of PCA here on LAVA, with some VIs. The SVD approach would work well as long as you don't have extremely many scrap pieces, say fewer than a few thousand, since I guess you don't have more than 10 measures per scrap piece.
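In text form (Python here just as pseudocode for the SVD route; mean-centering and the score computation are the essential steps), a rough sketch:

```python
import numpy as np

def pca_scores(X, n_components):
    """PCA via SVD: scores T and loadings P for the first n_components PCs.

    X -- (n_pieces, n_measures) data matrix, one row per scrap piece
    """
    Xc = X - X.mean(axis=0)                 # mean-center each measure
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                 # loadings (n_measures x n_components)
    T = Xc @ P                              # scores   (n_pieces  x n_components)
    return T, P
```

NIPALS computes the same components one at a time, which is handy when you only need the first few.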


Thanks again. Would you like to comment on the "philosophical" aspects of PCA vs Logistic modeling vs whatever? I'm not very familiar with PCA. From what I've read, it seems like it's designed more to select which measures to use, rather than (or in addition to) telling you how much weight to put on each measure.

In my case, I already have a good idea which measures I want to use. All of them contribute to improving the signal to noise ratio. You're right by the way, there are less than 10.

(2 weeks later...)

QUOTE (torekp @ Jul 11 2008, 05:51 PM)

Thanks again. Would you like to comment on the "philosophical" aspects of PCA vs Logistic modeling vs whatever? I'm not very familiar with PCA. From what I've read, it seems like it's designed more to select which measures to use, rather than (or in addition to) telling you how much weight to put on each measure.

In my case, I already have a good idea which measures I want to use. All of them contribute to improving the signal to noise ratio. You're right by the way, there are less than 10.

Sorry for not answering sooner; I have been on vacation.

Since the scores from a PCA model are just a few new coordinates, it is easier to apply a classification rule to them than to the original variables.

The scores are the new coordinates of your data (along the directions of the first, second, ... largest variation in your measured variables).

You could see a PCA model as a model that captures most of the variation in your data, where you select how many underlying phenomena you want to describe by choosing the number of principal components, PCs (the new coordinates). The principal components are perpendicular to each other, i.e. orthogonal.

The PCA model is

X = TP' + E

where

X is the data matrix (objects in rows, variables in columns),

T is the scores matrix (objects in rows; columns hold the score values for PC 1, 2, ..., n),

P is the loadings matrix, and

E is the error (residuals) matrix.
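As a rough numerical check of that decomposition (Python as a stand-in again, with made-up data and two components kept):

```python
import numpy as np

# X: data matrix, objects (scrap pieces) in rows, variables (measures) in columns
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 8))

n_pc = 2                                    # number of principal components kept
Xc = X - X.mean(axis=0)                     # PCA is usually done on mean-centered data

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
P = Vt[:n_pc].T                             # loadings matrix P
T = Xc @ P                                  # scores matrix T
E = Xc - T @ P.T                            # error (residual) matrix, so Xc = T P' + E

explained = s[:n_pc] ** 2 / np.sum(s ** 2)  # fraction of variance captured by each PC
print(explained, np.abs(E).max())
```

A simple classifier on the scores could then be, for example, nearest class mean in the T coordinates.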


Thanks Anders. I wrote a preliminary logistic regression solver, but it performs very poorly when the number of pieces of scrap metal is >300 or so. Here it is, in case anyone's interested. It requires the Optimization VIs, which you only get with LabVIEW ... Full? Professional? Not sure.

I also found this web-based Logistic Regression calculator, which really helped me test my stuff. I haven't tested large datasets in the web-based one yet, to see if it's faster. Edit: the web page algorithm is a lot faster.

Meanwhile, I haven't done anything on the PCA front yet. But I'll start by looking on LAVA as you suggested.

