Does anyone use netCDF file format?

Rolf Kalbermatter · January 3, 2017

As has already been pointed out, there are a number of possible reasons why a library might not be thread safe. The most common is the use of global variables in the library. One solution is to always call the library from the same thread. Since a thread can't magically split into two threads, that is a safe way to call such a library. Theoretically, a library developer could check each function for use of any global and sort the library APIs into safe functions that don't access any global state and unsafe functions that need to be called in a protected way. Another way is to use a semaphore. That can be done explicitly by the caller (what drjdpowell describes) or in the library itself, but the latter has the potential to lock up if the library uses multiple global resources that are each protected by their own semaphore. OpenSSL, which Shaun probably refers to, requires the caller to install callback functions that provide the semaphore functionality, which OpenSSL then uses to protect access to its internal global variables. Without those callbacks installed, OpenSSL is not thread safe and dies catastrophically sooner rather than later when called from LabVIEW in multithreaded mode.
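To make the caller-side semaphore concrete, here is a minimal sketch in Python rather than LabVIEW or C. Everything in it is made up for illustration: the module-level `_counter` stands in for a library's global state, `unsafe_increment` for a library call that touches it, and a single `threading.Lock` plays the role of the semaphore taken around every call.

```python
import threading
import time

# Hypothetical stand-in for a non-thread-safe library: its state lives in a
# module-level global that is updated with a non-atomic read-modify-write.
_counter = 0


def unsafe_increment():
    global _counter
    value = _counter      # read the global
    time.sleep(0)         # give other threads a chance to run mid-update
    _counter = value + 1  # write back, possibly clobbering another update


# Caller-side protection: one lock taken around every library call.
_library_lock = threading.Lock()


def safe_increment():
    with _library_lock:
        unsafe_increment()


def run(worker, n_threads=4, n_calls=200):
    """Reset the global, call `worker` from several threads, return the total."""
    global _counter
    _counter = 0

    def loop():
        for _ in range(n_calls):
            worker()

    threads = [threading.Thread(target=loop) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return _counter
```

With the lock in place, `run(safe_increment)` always accounts for every call. Calling `run(unsafe_increment)` may or may not lose updates on a given run, which is exactly what makes this kind of bug hard to catch.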

An entirely different issue is thread local storage (TLS). That is memory that the OS reserves and associates with every thread. When you call a library that uses TLS from a multithreaded environment, you have to make sure that the current thread has the library-specific TLS slots initialized to the correct values. The OpenGL library is such a library, and if you check out the LabVIEW examples you will see that each C function wrapper, on entry, copies the TLS values from the current refnum to the TLS and, on exit, restores those values from TLS back into the refnum. In a way it's another form of global storage, but it requires a completely different approach.

But for all of these issues guaranteeing that all library functions are always called from the same thread solves the problems too.
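Outside LabVIEW, one way to get the "always the same thread" guarantee is to funnel every call through a single dedicated worker. The sketch below uses Python's `ThreadPoolExecutor` with one worker; `lib_call` and `call_on_library_thread` are hypothetical names, and the thread id is returned only so the behaviour can be checked.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# A pool with exactly one worker: every submitted call runs on that thread.
_library_thread = ThreadPoolExecutor(max_workers=1)


def lib_call(x):
    """Hypothetical library function; also reports which thread executed it."""
    return x + 1, threading.get_ident()


def call_on_library_thread(func, *args):
    # Blocks the caller until the dedicated thread has run the call.
    return _library_thread.submit(func, *args).result()
```

Calls from any number of caller threads are queued and executed one at a time, so this also serializes access to any global state and keeps thread-local state consistent.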

drjdpowell · January 3, 2017

49 minutes ago, rolfk said:

Another way is to use a semaphore. That can be done explicitly by the caller (what drjdpowell describes) or in the library itself, but the latter has the potential to lock up if the library uses multiple global resources that are each protected by their own semaphore. OpenSSL, which Shaun probably refers to, requires the caller to install callback functions that provide the semaphore functionality, which OpenSSL then uses to protect access to its internal global variables.

Ahh, so the callback is the locking mechanism of your choice. Having the library do the lock means a lock is only made when necessary, in contrast to my Semaphore in LabVIEW, where I always lock before any call, even where it might not be needed.

Edited January 3, 2017 by drjdpowell

ShaunR · January 3, 2017

6 minutes ago, drjdpowell said:

Ahh, so the callback is the locking mechanism of your choice. Having the library do the lock means a lock is only made when necessary, in contrast to my Semaphore in LabVIEW, where I always lock before any call, even where it might not be needed.

It's not a choice and it is always needed. You need to invert your view of who is locking who.

Rolf Kalbermatter · January 3, 2017

56 minutes ago, ShaunR said:

It's not a choice and it is always needed. You need to invert your view of who is locking who.

I'm not sure I understand you well here. If the library offers to install semaphore callbacks, that is of course preferable from a performance viewpoint, but you can still choose to protect it on the calling side with a semaphore instead (and you could even use implicit serialization, by packing all CLNs into the same VI with an extra function selector and setting the VI to be non-reentrant, instead of wrapping each CLN in an Obtain Semaphore and Release Semaphore).

A library offering semaphore callback installation is pretty likely to use the callbacks only around critical code sections, so yes, there might be many function calls that don't invoke a semaphore lock at all because it is not needed there. Even when it is needed, the library may choose to lock only around critical accesses, freeing the semaphore during (relatively) lengthy calculations so that other parallel calls are not blocked, which can give quite a performance gain when called from a true multitasking system like LabVIEW.
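As a rough illustration of library-side locking through installed callbacks, here is a hypothetical `TinyLibrary` in Python. It is not modelled on the real OpenSSL interface; the class, `install_lock_callbacks`, and both methods are invented for the sketch. The point is only that the library decides where the lock is needed: not at all in `pure_function`, and only around the short shared update in `accumulate`.

```python
import threading


class TinyLibrary:
    """Hypothetical library with one piece of shared state and pluggable locking."""

    def __init__(self):
        self._total = 0
        # Default callbacks do nothing: without installed callbacks the
        # library is NOT thread safe.
        self._lock_cb = lambda: None
        self._unlock_cb = lambda: None

    def install_lock_callbacks(self, lock_cb, unlock_cb):
        self._lock_cb = lock_cb
        self._unlock_cb = unlock_cb

    def pure_function(self, x):
        # Touches no shared state, so no lock is taken.
        return x * x

    def accumulate(self, values):
        # The lengthy calculation on local data happens outside the lock...
        partial = sum(v * v for v in values)
        # ...and only the short shared-state update is protected.
        self._lock_cb()
        try:
            self._total += partial
            return self._total
        finally:
            self._unlock_cb()
```

A caller would install something like `lock.acquire` and `lock.release` from a `threading.Lock` once at startup, and after that can call the library from parallel threads without any locking of its own.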

Edited January 3, 2017 by rolfk
Extra explanations

ShaunR · January 3, 2017

1 hour ago, rolfk said:

I'm not sure I understand you well here. If the library offers to install semaphore callbacks, that is of course preferable from a performance viewpoint, but you can still choose to protect it on the calling side with a semaphore instead (and you could even use implicit serialization, by packing all CLNs into the same VI with an extra function selector and setting the VI to be non-reentrant, instead of wrapping each CLN in an Obtain Semaphore and Release Semaphore).

A library offering semaphore callback installation is pretty likely to use the callbacks only around critical code sections, so yes, there might be many function calls that don't invoke a semaphore lock at all because it is not needed there. Even when it is needed, the library may choose to lock only around critical accesses, freeing the semaphore during (relatively) lengthy calculations so that other parallel calls are not blocked, which can give quite a performance gain when called from a true multitasking system like LabVIEW.

The HDF5 library uses recursive locks and keeps a separate error stack for each thread.

drjdpowell · January 3, 2017

9 minutes ago, ShaunR said:

The HDF5 library uses recursive locks and keeps a separate error stack for each thread.

The “multithreaded” version of HDF5, you mean. The “separate error stack” is actually an example of something you need a LabVIEW-level lock to solve. LabVIEW does multi-tasking in the UI thread, so if one queries the error information after a function throws an error, it is quite possible that LabVIEW has called some other function from a parallel loop which has overwritten the error stack. Using Thread-Local Memory to allow “multithreading” is based on the assumption that the threads are not multitasking. Instead, you need to have a lock held over the original (error-throwing) function call and the secondary call to read the error information.
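A minimal sketch of that last point, in Python and with made-up names (`lib_call`, `lib_get_last_error`, `checked_call`): the error state here is deliberately shared, which models two parallel tasks that happen to run on the same thread and therefore see the same per-thread error stack. It is not the HDF5 API.

```python
import threading

# Hypothetical library that records the most recent error in shared state,
# which any later call overwrites.
_last_error = None


def lib_call(name, fail):
    global _last_error
    _last_error = f"{name} failed" if fail else None
    return not fail


def lib_get_last_error():
    return _last_error


_lock = threading.Lock()


def checked_call(name, fail):
    """Run a call and fetch its error text under one lock, so no other call
    can overwrite the error state in between."""
    with _lock:
        ok = lib_call(name, fail)
        error = None if ok else lib_get_last_error()
    return ok, error
```

The important detail is that the lock spans both library calls. Locking each call separately would still leave a window in which another task's call replaces the error information.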

ShaunR · January 3, 2017

1 hour ago, drjdpowell said:

The “multithreaded” version of HDF5, you mean. The “separate error stack” is actually an example of something you need a LabVIEW-level lock to solve. LabVIEW does multi-tasking in the UI thread, so if one queries the error information after a function throws an error, it is quite possible that LabVIEW has called some other function from a parallel loop which has overwritten the error stack. Using Thread-Local Memory to allow “multithreading” is based on the assumption that the threads are not multitasking. Instead, you need to have a lock held over the original (error-throwing) function call and the secondary call to read the error information.

Multi-tasking <> multithreading. You seem to be conflating many different aspects of application execution.

drjdpowell · January 3, 2017

7 minutes ago, ShaunR said:

Multi-tasking <> multithreading. You seem to be conflating many different aspects of application execution.

I'm only interested in LabVIEW applications executing correctly when one is calling the same library from two parallel loops. If libraries fail that, then I do not care if they are "multithreaded" by some technical definition.

drjdpowell · January 4, 2017

Anyway, back to netCDF. Looks like no one has used it and, though it looks like a nice data storage API, it doesn’t offer any improvement over the existing HDF5 LabVIEW implementations, unless one needs compatibility with existing netCDF tools, and I think that only applies if you are in the atmospheric science community. The MDSplus library I mentioned above is even more niche; it is only used by those doing research in nuclear fusion.

BenK · January 23, 2017

Hello

Although I can hardly contribute at the deeper developer level seen in the middle of this discussion, I'd like to give some feedback on the initial question:

Yes, me.

No, seriously: In my opinion, netCDF4 has, despite "only" being some kind of fork on top of HDF5, some conceptually important differences, which are for me personally the reason to choose it again and again over HDF5.

(I looked at HDF5 several times because of the existing LV interface, but I always found it to be too general for my needs.)

The data I receive from my measurement device is n-dimensional, and n can differ from measurement to measurement. This was the main reason I was searching for something other than TDMS or plain ASCII files.

Important from my point of view is that the independent variables of such a dataset are coordinates, and I want that information to be stored in the data file. Perhaps this is best described here http://www.unidata.ucar.edu/blogs/developer/entry/dimensions_scales.

So what I did was run the LV assistant that creates wrapper VIs from given C headers, and then manually adjust things step by step until it ran without errors...

But I have to admit that this usage of netCDF4 has not made it into my main everyday LV project, which has been running my measurement device for years now. It's still part of the new beta-like project, but it works and produces fine nc files.

Edited January 23, 2017 by BenK
Link contained period so didn't work.

kilgo · November 7, 2017

An ancient thread, I know... but, for information, netCDF is used as an interchange format for mass spectrometry, mainly for gas chromatography mass spectrometry (GCMS). So there is a potential market out there for a netCDF toolkit in LabVIEW.

For most other kinds of mass spectrometry, the interchange formats (e.g. mzML or mzXML) are easier to read, so improving users' ability to get at netCDF-formatted GCMS data might be welcomed. I know I'd like it.

I'll let the thread rest quietly now!

mcduff · October 22, 2019

On 12/28/2016 at 7:12 PM, martijnj said:

My understanding is that netCDF-4 is an HDF5 file with specific applied structure

@drjdpowell

Would you happen to have a document that lists the HDF5 attributes needed to make an HDF5 file netCDF-4 compliant? I have downloaded some example netCDF-4 files, and the attributes among them are slightly different. In addition, finding information on this site is not straightforward.

Thanks for your help.

Regards,

mcduff

drjdpowell · October 22, 2019

I'm afraid I only ever had the documentation from the netCDF website, sorry.

mcduff · October 22, 2019

36 minutes ago, drjdpowell said:

I'm afraid I only ever had the documentation from the netCDF website, sorry.

Thanks!
