
Consulting Help to Speed Up a Curve Smoothing VI



Hi:

This link points to a great utility to do curve smoothing.

http://zone.ni.com/d...a/epd/p/id/4499

The challenge, though, is that the VI is really slow when smoothing a long curve. I am using this on geological logs that happen to be ~45 pages long.

Question: Is anyone willing to do code optimization to make the VI run faster? I need it to run more than 10 times faster. Let me know how much the consulting effort would cost.

NOTE: While I will pay for the effort to optimize the code, the VI will still be available for others to use and the original author will be acknowledged.

Regards

Anthony


I would not be able to help you for a few weeks as I'm off on vacation. I can see why your smoothing function is so slow, and I'm sure someone could easily improve its performance by orders of magnitude on large data sets. However, are you sure you would not be better served by using one of the many "Filter" VIs in LabVIEW? I tend to use the Savitzky-Golay filter, but there are many others that can be used for smoothing. They'll be much, much faster.
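For readers outside LabVIEW, here is a minimal sketch of Savitzky-Golay smoothing in Python using SciPy's `savgol_filter` (the window length and polynomial order are illustrative choices, not values from this thread):

```python
import numpy as np
from scipy.signal import savgol_filter

# Uniformly sampled noisy signal (savgol_filter assumes equal spacing).
x = np.linspace(0.0, 10.0, 500)
y = np.sin(x) + 0.2 * np.random.default_rng(0).normal(size=x.size)

# Window length must be odd and larger than the polynomial order.
y_smooth = savgol_filter(y, window_length=31, polyorder=3)
```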


drjdpowell:

I checked that Savitzky-Golay filtering algorithm. I like the fact that it leaves in all the 'peaks' and 'valleys' while smoothing, but how can I use it with an array of X and Y values where the X values are not collected at the same X-interval?

The VI connector pane shows that it will accept only one array of X values at a time?

Note that the Lowess VI in the example that I provided above already has array inputs for pairs of 'X' and 'Y' values... Or am I missing something?

Anthony L.


Ah, non-uniform spacing, I see.

Greg,

My first thought would have been to truncate the weighting calculation and fitting to only a region around the point where the weights are non-negligible. Currently, the algorithm uses the entire data set in the calculation of each point, even though most of the data has near-zero weighting. For very large datasets this will be very significant.
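For illustration, here is a rough Python sketch of that windowed approach (not the NI VI's exact algorithm; the tricube weights, window sizing, and lack of robustness iterations are all assumptions):

```python
import numpy as np

def lowess_windowed(x, y, frac=0.1):
    """LOWESS-style local linear smoothing that only touches the points
    inside a window around each sample, instead of weighting the whole
    data set for every output point. Rough sketch: tricube weights, no
    robustness iterations."""
    order = np.argsort(x)
    xs, ys = np.asarray(x, float)[order], np.asarray(y, float)[order]
    n = xs.size
    k = min(n, max(3, int(np.ceil(frac * n))))   # points per local fit
    out = np.empty(n)
    for i in range(n):
        lo = max(0, min(i - k // 2, n - k))      # clamp window to the array
        xi, yi = xs[lo:lo + k], ys[lo:lo + k]
        d = np.abs(xi - xs[i])
        dmax = d.max()
        w = (1.0 - (d / dmax) ** 3) ** 3 if dmax > 0 else np.ones_like(d)
        sw = np.sqrt(w)
        A = np.column_stack([np.ones_like(xi), xi]) * sw[:, None]
        b = np.linalg.lstsq(A, yi * sw, rcond=None)[0]
        out[i] = b[0] + b[1] * xs[i]             # evaluate the local line
    return xs, out
```

Each point only ever sees k neighbouring samples, so the cost grows roughly as n·k rather than n² for the full-data-set version.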

— James

BTW: Sometimes it can be worth using interpolation to produce a uniform spacing of non-uniform data in order to be able to use more powerful analysis tools. Savitzky-Golay, for example, can be used to determine the first and higher-order derivatives of the data for use in things like peak identification.
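A quick sketch of that interpolate-then-filter idea in Python (the placeholder data, grid step, and filter settings are illustrative assumptions):

```python
import numpy as np
from scipy.signal import savgol_filter

# Placeholder non-uniformly sampled data standing in for a log curve.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 100.0, 400))
y = np.sin(x / 10.0) + 0.1 * rng.normal(size=x.size)

# Resample onto a uniform grid so uniform-spacing tools can be applied.
dx = np.median(np.diff(x))
xu = np.arange(x[0], x[-1], dx)
yu = np.interp(xu, x, y)

# One Savitzky-Golay pass gives both the smoothed curve and its derivative.
y_smooth = savgol_filter(yu, window_length=21, polyorder=3)
dy_dx = savgol_filter(yu, window_length=21, polyorder=3, deriv=1, delta=dx)
```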


Greg Sands:

Wow! 15x faster. I owe you a beer, or a soda just in case you don't drink beer. I have been in search of something more efficient. I really appreciate the wizardry that went into speeding up this routine. I will test it and get back to you.

Thanks and Regards

Anthony

drjdpowell:

I like the idea of interpolation to attain uniform X-spacing, because we have to do that to conform to the LAS file export standard. I will try the Savitzky-Golay filter when going through the interpolation algorithm. The only thing is that the interpolation is only done during final reporting, to conform with Log ASCII file export requirements for reporting drilling info. In all other cases the data needs to be presented as-is, but smoothed.

Thanks for all these great suggestions. It is amazing what can be learned from these forums when minds from around the world come together to contribute solutions.

Anthony


Greg,

My first thought would have been to truncate the weighting calculation and fitting to only a region around the point where the weights are non-negligible. Currently, the algorithm uses the entire data set in the calculation of each point, even though most of the data has near-zero weighting. For very large datasets this will be very significant.

Yes, in fact the weighting calculation already truncates the values so that they are set to zero outside of a window around each point. However, with variable X-spacing, the size of the window (in both samples and X-distance) can vary, so it would be a little more complicated to work out that size for each point.

Just had a quick go: with a fixed window size, you get a further 10x speedup, but if you have to threshold to find the appropriate window, it's only another 2-3x. Still, that's around 40x overall, with further increases possible using multiple cores.
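One way to handle that thresholding step for sorted, non-uniformly spaced X values is a binary search per point; here is a Python sketch of the idea (an assumption about the approach, not what the attached VI actually does):

```python
import numpy as np

def window_bounds(x_sorted, h):
    """For every sample of a sorted, non-uniformly spaced x array, return
    the index range [lo, hi) of neighbours lying within +/- h in x."""
    lo = np.searchsorted(x_sorted, x_sorted - h, side="left")
    hi = np.searchsorted(x_sorted, x_sorted + h, side="right")
    return lo, hi
```

Both searches are vectorised, so the whole pass costs O(n log n) instead of scanning the full array for every point.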

SmoothCurveFit_subset.zip

I'd only ever used it for arrays up to about 5000 points, so it had been fast enough. Interestingly, the greatest speedup still comes from replacing the Power function with a Multiply -- never use x^y for integer powers!
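The same power-versus-multiply effect is easy to demonstrate outside LabVIEW; a small Python timing sketch (the exact ratio will vary by machine and library):

```python
import timeit
import numpy as np

x = np.random.default_rng(0).random(1_000_000)

t_pow = timeit.timeit(lambda: x ** 3, number=50)     # general power routine
t_mul = timeit.timeit(lambda: x * x * x, number=50)  # repeated multiply
print(f"x ** 3: {t_pow:.3f} s    x * x * x: {t_mul:.3f} s")
```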

Any further improvements?


Any further improvements?

Just had a cursory glance. But it looks like you are calculating the coefficients and passing the XY parameters for the linear fit twice with the same data (it's only the weightings that change from the first "fit" to the second). You could pre-calculate them in a separate loop and just pass them into the other loops. Also, you might benefit from passing through the X array (through the coefficient VI).

Edited by ShaunR

Just had a cursory glance. But it looks like you are calculating the coefficients and passing the XY parameters for the linear fit twice with the same data (it's only the weightings that change from the first "fit" to the second). You could pre-calculate them in a separate loop and just pass them into the other loops.

I used a separate loop to start with, but the speed improvement was minimal, and the memory use would be increased fairly significantly.

  • 2 weeks later...

I tried this on the logs and the smoothing assigned NaN values to some of the Y-points. Do you know what could be causing this?

The final result is that the output curve has many broken regions. I have attached various screenshots that show this.

Hard to tell without seeing the data, but if your screenshot is correct, you have a "Weighting Fraction" equal to zero, so I wonder if that is causing the problem. I'm pretty sure that it should be greater than zero; it's the fraction of the dataset to use for fitting at each point.
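In other words, the fraction just sets how many points go into each local fit; a Python sketch of one plausible mapping (an assumption, the VI's exact rule may differ):

```python
import math

def points_per_fit(n_samples, weighting_fraction):
    """Number of points used in each local fit: the weighting fraction
    times the dataset size, clamped so a zero fraction cannot starve a
    linear fit (the clamp is an assumed guard, not the VI's exact rule)."""
    return max(2, math.ceil(weighting_fraction * n_samples))
```

For example, a fraction of 0.05 on a 10,000-sample log would use 500 points per fit.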

Edited by GregSands

Hi Greg:

There is code in your routine which forces the ratio to something other than zero. I later found out that I need to input a higher coefficient compared to the ones I've been using with the original routine. So your revised routine does work; I just need to adjust the smoothing factor after trying it on several logs.

Thanks and Regards

Anthony L.

