
Logistic regression in LabVIEW



How would you do logistic regression in Labview? Some key points (see the wiki link for more detail), where f(z) is a probability function for some event:

f(z) = 1 / (1 + e^(-z))

The variable z is usually defined as

z = β0 + β1·x1 + β2·x2 + β3·x3 + ...

where β0 is called the "intercept" and β1, β2, β3, and so on, are called the "regression coefficients" of x1, x2, x3 respectively. To find the parameters βi, one optimizes them so as to obtain the maximum likelihood of getting the actual observed data. Making a few semi-reasonable assumptions, this amounts to maximizing

L*(θ) = ln L(θ) = Σi [ yi · ln f(zi) + (1 − yi) · ln(1 − f(zi)) ]

where L(θ) is the likelihood of getting the observed data, L* is just the log of that, and yi is the observed 0/1 outcome for case i.
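In text form (a rough Python sketch standing in for the block diagram; the variable names are just illustrative), those pieces look something like:

```python
import numpy as np

def logistic(z):
    """f(z) = 1 / (1 + exp(-z)): the probability of the event."""
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(beta, X, y):
    """L*(theta): log-likelihood of the observed outcomes y given measurements X.

    beta -- array [beta0, beta1, ..., betak] (intercept plus regression coefficients)
    X    -- (n_cases, k) array of measurements x1..xk, one row per case
    y    -- (n_cases,) array of observed 0/1 outcomes
    """
    z = beta[0] + X @ beta[1:]          # intercept plus weighted measurements
    p = logistic(z)
    return np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
```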

Why use this approach? In my application, I've got a few categories of scrap metal that I'm trying to sort, let's say Al, Mg, and Zn. They're on a belt, and several diagnostic tests are done as they whiz by; each test outputs one or more real numbers. No one test is absolutely decisive, thus the need for optimizing their weightings.

I'm thinking of negating L*(θ) and using Conjugate Gradient nD.vi from the Optimization palette. Does anyone have experience with this type of problem, and want to offer advice? If you think I'm barking up the wrong statistical/optimization tree, I'd like to hear your thoughts on that too.
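For anyone who wants to check the math outside LabVIEW, a rough Python sketch of the negated objective and its analytic gradient would look something like the following (SciPy's conjugate-gradient routine standing in for Conjugate Gradient nD.vi; the data here are made up just to exercise it):

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(beta, X, y):
    """-L*(theta): maximizing L* is the same as minimizing -L*."""
    z = beta[0] + X @ beta[1:]
    p = 1.0 / (1.0 + np.exp(-z))
    return -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def neg_log_likelihood_grad(beta, X, y):
    """Analytic gradient: d(-L*)/dbeta_j = sum_i (p_i - y_i) * x_ij."""
    z = beta[0] + X @ beta[1:]
    p = 1.0 / (1.0 + np.exp(-z))
    Xd = np.column_stack([np.ones(len(y)), X])   # design matrix with a column of ones for beta0
    return Xd.T @ (p - y)

# Made-up data: 200 pieces, 4 diagnostic measurements each
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(size=200) > 0).astype(float)

beta_start = np.zeros(5)                         # intercept + 4 coefficients
fit = minimize(neg_log_likelihood, beta_start, args=(X, y),
               jac=neg_log_likelihood_grad, method="CG")
print(fit.x)                                     # fitted [beta0, beta1, ..., beta4]
```

The analytic gradient is cheap here, so if the LabVIEW VI accepts a gradient function reference, supplying it should help with larger datasets rather than relying on numerical differencing.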


Why not use principal component analysis to cluster the different scrap classes together instead? From my perspective that would be easier (and probably more robust). Once you have found the classes, you could build a classifier based on the PCA scores.

Another method would be discriminant analysis using partial least squares.


QUOTE (Anders Björk @ Jul 2 2008, 03:15 PM)

Why not use principal component analysis to cluster the different scrap classes together instead? From my perspective that would be easier (and probably more robust). Once you have found the classes, you could build a classifier based on the PCA scores.

Another method would be discriminant analysis using partial least squares.

Thanks for the ideas. Without buying the toolkit from NI (Advanced Signal Processing, I think it was), how would you set up PCA in LabVIEW?


QUOTE (torekp @ Jul 3 2008, 08:23 PM)

Thanks for the ideas. Without buying the toolkit from NI (Advanced Signal Processing, I think it was), how would you set up PCA in LabVIEW?

Using the SVD VI or coding up the NIPALS algorithm. There are also examples of PCA here on LAVA, with some VIs. The SVD approach would work well as long as you don't have extremely many scrap pieces, say fewer than a few thousand, since I guess you don't have more than 10 measures per scrap piece.
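In text form (Python here just as pseudocode for the SVD route; mean-centering and the score computation are the essential steps), a rough sketch:

```python
import numpy as np

def pca_scores(X, n_components):
    """PCA via SVD: scores T and loadings P for the first n_components PCs.

    X -- (n_pieces, n_measures) data matrix, one row per scrap piece
    """
    Xc = X - X.mean(axis=0)                 # mean-center each measure
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                 # loadings (n_measures x n_components)
    T = Xc @ P                              # scores   (n_pieces  x n_components)
    return T, P
```

NIPALS computes the same components one at a time, which is handy when you only need the first few.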


Thanks again. Would you like to comment on the "philosophical" aspects of PCA vs Logistic modeling vs whatever? I'm not very familiar with PCA. From what I've read, it seems like it's designed more to select which measures to use, rather than (or in addition to) telling you how much weight to put on each measure.

In my case, I already have a good idea which measures I want to use. All of them contribute to improving the signal to noise ratio. You're right by the way, there are less than 10.

(2 weeks later...)

QUOTE (torekp @ Jul 11 2008, 05:51 PM)

Thanks again. Would you like to comment on the "philosophical" aspects of PCA vs Logistic modeling vs whatever? I'm not very familiar with PCA. From what I've read, it seems like it's designed more to select which measures to use, rather than (or in addition to) telling you how much weight to put on each measure.

In my case, I already have a good idea which measures I want to use. All of them contribute to improving the signal to noise ratio. You're right by the way, there are less than 10.

Sorry for not answering sooner; I have been on vacation.

Since the scores from a PCA model are just a few new coordinates, it is easier to apply a classification rule to them than to the original variables.

The scores are the new coordinates of your data (along the directions of the first, second, ... largest variation in your measured variables).

You could see a PCA model as a model that captures most of the variation in your data, where you select how many underlying phenomena you want to describe by choosing the number of principal components, PCs (the new coordinates). The principal components are perpendicular to each other, i.e. orthogonal.

The PCA model is

X = TP' + E

where

X is the data matrix (objects in rows, variables in columns),

T is the scores matrix (objects in rows; columns hold the score values for PC 1, 2, ..., n),

P is the loadings matrix, and

E is the error (residuals) matrix.
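As a rough numerical check of that decomposition (Python as a stand-in again, with made-up data and two components kept):

```python
import numpy as np

# X: data matrix, objects (scrap pieces) in rows, variables (measures) in columns
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 8))

n_pc = 2                                    # number of principal components kept
Xc = X - X.mean(axis=0)                     # PCA is usually done on mean-centered data

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
P = Vt[:n_pc].T                             # loadings matrix P
T = Xc @ P                                  # scores matrix T
E = Xc - T @ P.T                            # error (residual) matrix, so Xc = T P' + E

explained = s[:n_pc] ** 2 / np.sum(s ** 2)  # fraction of variance captured by each PC
print(explained, np.abs(E).max())
```

A simple classifier on the scores could then be, for example, nearest class mean in the T coordinates.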


Thanks Anders. I wrote a preliminary logistic regression solver, but it performs very poorly when the number of pieces of scrap metal is >300 or so. Here it is, in case anyone's interested. It requires the Optimization VIs, which you only get with LabVIEW ... Full? Professional? Not sure.

I also found this web-based Logistic Regression calculator, which really helped me test my stuff. I haven't tested large datasets in the web-based one yet, to see if it's faster. Edit: the web page algorithm is a lot faster.

Meanwhile, I haven't done anything on the PCA front yet. But I'll start by looking on LAVA as you suggested.

