Posts posted by Herbert

  1. I agree that classes in general should be tested through their public interfaces. On the other hand, I would want to design my tests so they lead me to the root cause of a problem in the shortest amount of time possible. If a "black box" test using my public interface fails, I don't want to have to dig down my VI hierarchy in order to find the root cause. Not if I know that a "white box" test inside the class could have provided me with that information without me doing anything. So I guess I want to test both the public interface and the private methods.

    Obviously, black box testing is the only way of making sure that you're testing the exact behavior your class will expose to its callers. A white box test can interfere with the inner workings of a class, bearing the risk that it alters the class's behavior or otherwise produces results that couldn't occur in a black box test. So, if a black box test fails, I'll probably have to fix my code. If a white box test fails, I might have to fix the test instead. Sometimes it's worthwhile adding and maintaining a white box test, sometimes it's not ...

    I strongly encourage everyone who is interested in unit testing to watch out for new releases on ni.com/softwareengineering and related content on ni.com/largeapps on Friday, 02/06/2009.

  2. I might look at this through my TDMS glasses too much, but to me, the natural way of storing the events you have mentioned would have been to create a channel for each cluster element - where the channel is of the same data type as your cluster element. I realize that this requires you to unbundle and bundle the cluster for writing and reading, respectively. But you wouldn't lose any numeric accuracy, any timestamp tidbits or other things. The only advantage I can see in storing everything as strings would be less coding. Am I missing something there?

    I have thought a lot about allowing arbitrary clusters in TDMS. The problem, as you mentioned, is that you don't know what kind of data you're really dealing with, so it's impossible to magically do the right thing. Some cluster elements are better off being stored as properties, but how would I know? If I store them as properties because they are scalar, I'm out of luck if they change their value after 1000 iterations. Similarly, what would I do with a numeric array in the cluster? Create a channel? Append the array values from the next cluster to that channel? What if these are FFT results? I have not been able to come up with a good way of identifying these things automatically. Of course, you can always come up with some fancy piece of UI that allows users to assign cluster elements to TDMS objects (smells like Express VI :P ), but the best interface we have for making that assignment is the block diagram.

    If a cluster doesn't contain arrays or other clusters, you could make a case that we should handle it by making each cluster element a channel. That would be a viable thing to do. But when it comes to nested clusters and clusters that include arrays, providing "automatic" handling creates expectations that can hardly be fulfilled.
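
    To make the "one channel per cluster element" idea concrete, here is a minimal sketch in Python using the third-party npTDMS library (not the LabVIEW TDMS API discussed above); the field names, values and file name are made up for the example:

    import numpy as np
    from nptdms import TdmsWriter, ChannelObject

    # A "cluster" of per-event values, collected over a few iterations
    events = [
        {"temperature": 23.5, "pressure": 101.2, "valve_open": 1},
        {"temperature": 23.7, "pressure": 101.4, "valve_open": 0},
    ]

    with TdmsWriter("events.tdms") as writer:
        # One channel per cluster element, each keeping its native numeric type
        for name, dtype in [("temperature", np.float64),
                            ("pressure", np.float64),
                            ("valve_open", np.int32)]:
            values = np.array([e[name] for e in events], dtype=dtype)
            writer.write_segment([ChannelObject("Events", name, values)])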

    Herbert

  3. QUOTE(Kevin P @ Jun 19 2007, 10:50 AM)

    1. Talk to NI internal account people so I can install older LabVIEW 8.20 on my newly purchased license rather than 8.2.1 (Would also install DAQmx 8.3 instead of 8.5). This seems like an easy way to be sure that my stand-alone executable will be compatible with the older deployed PC.

    2. Go ahead and install LabVIEW 8.2.1, but stick with DAQmx 8.3. Then I'd be building an 8.2.1 executable that I want to deploy to an older PC with 8.20 runtime. Will this work, both in general due to runtime version difference and in particular with respect to the TDMS functions? Bear in mind that the older PC is creating the TDMS files with a LV 8.20 app.

    Kevin,

    the second option will not work. The executable needs to be compiled with the same LabVIEW version as the Run-Time Engine you are using. The incompatibility is not in the TDMS files; it exists between LabVIEW-compiled code and the LabVIEW Run-Time Engine. So I'm afraid the first option is the only way to go (other than updating everything to 8.2.1, regardless of the DAQmx version).

    Herbert

  4. I had a chance to see motorcycle traffic in Vietnam recently. People entering or joining traffic never look to the left, right or back, but that's OK since everybody is aware of it. If you want to pass someone, you generally honk, so they know something is coming from behind. The whole thing might look quite familiar to you if you have one of those screen savers that simulate a school of fish. There are more details, but it is scary enough just like that.

    Herbert

    http://forums.lavag.org/index.php?act=attach&type=post&id=6142

  5. Thang,

    A) The idea here is that users should never have to touch properties like wf_increment or even know about them. We use the wf_xxx properties to store things that are embedded in LabVIEW data types (e.g. T0 and dT are embedded in the waveform data type). If you use waveforms correctly, all of these properties should be written and read without you doing anything special. That of course only works if the waveforms have the correct values in them. Since you are asking - here are the important ones:

    • T0 is saved to wf_start_time (timestamp).
    • dT is saved to wf_increment (double).
    • If your data is not time-domain, wf_start_time will still be set, but your X0 value goes into wf_start_offset (double). This will happen for example with frequency-domain data or histogram results.
    • If you exchange data with DIAdem, you need to set the wf_samples property to something other than 0 (we usually set it to the number of values in the incoming waveform, so in your file, it is 1). DIAdem will use this property to determine whether a channel is a waveform or not.

    B) That's exactly right. The only thing you need to do is set the property NI_MinimumBufferSize (integer) for each of your data channels to 1000 or 10000 or something. The TDMS API does the buffering automatically (requires LV 8.2.1). This is not crucial to the functionality of your application, but it will speed up writing and reading quite a bit.
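
    For readers who end up writing or inspecting these properties from outside LabVIEW, here is a rough sketch of the wf_* convention described in A) and B), using the third-party npTDMS Python library (channel names and values are invented; note that NI_MinimumBufferSize is only acted upon by the LabVIEW TDMS writing API and is stored here merely as an ordinary property):

    import numpy as np
    from nptdms import TdmsWriter, ChannelObject

    t0 = np.datetime64("2007-06-18T12:00:00")   # waveform T0
    dt = 0.001                                  # waveform dT in seconds
    data = np.sin(np.linspace(0.0, 1.0, 1000))

    channel = ChannelObject(
        "Measurement", "Signal", data,
        properties={
            "wf_start_time": t0,        # T0 -> wf_start_time (timestamp)
            "wf_increment": dt,         # dT -> wf_increment (double)
            "wf_samples": len(data),    # non-zero so DIAdem treats the channel as a waveform
            "NI_MinimumBufferSize": np.int32(10000),  # buffering hint used by LabVIEW 8.2.1+
        })

    with TdmsWriter("waveform.tdms") as writer:
        writer.write_segment([channel])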

    Unrelated) I see from the flags on your account that you're from Vietnam. I just came back from 2 weeks of vacation, visiting friends in Vietnam. They took me on a roundtrip through the country, including Hanoi, Ha Long, Nha Trang and Saigon. Best vacation I had in a long time. I'm addicted to cà phê sữa đá now :thumbup:

    Herbert

  6. Can't you just go with the one waveform you acquire and split it up, e.g. using "Get Waveform Components" combined with "Get Digital Components" or using some of the functions on the "Digital Waveform" -> "Conversion" palette?

    Herbert

  7. QUOTE(Thang Nguyen @ Jun 18 2007, 12:23 PM)

    Yeah, they are stored in the right order. And I store data at different rate. You can see the number of data in high frequency is larger the number of data in low frequency. All of them start and stop at the same time.

    Looking at the file with the TDMS Viewer, what you have is:

    • different channel lengths (high freq channels have 618 values, low freq channels have 224)
    • same dT (1.00 for all channels)
    • varying starting times (T0) for every channel

    It looks like you are using waveforms to save single values. In that case, I'm not sure that DAQmx or other functions that put out waveforms will set dT correctly, because there is no second value to reference against. If you save a series of single values to a waveform channel, you need to be really sure that they are equally sampled. If you're not sure, you should instead split up the waveform data type and store the timestamps and the data values in separate channels (e.g. one timestamp channel and one double channel), as sketched below.

    Saving single values to TDMS like this is also not a very efficient thing to do. It is a lot more efficient to gather a bunch of values and write them as a larger array. You can have the TDMS API do that for you by setting the channel property "NI_MinimumBufferSize" to the number of values that you wish to buffer. In your case, good values might be 1000 or 10000.
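
    As a rough illustration of the "one timestamp channel plus one value channel" layout for unequally sampled single values (again a Python sketch with the third-party npTDMS library and made-up data, not the LabVIEW code under discussion):

    import numpy as np
    from nptdms import TdmsWriter, ChannelObject

    # Irregularly spaced single values: store time and value as two separate channels
    timestamps = np.array(["2007-06-18T12:00:00.000",
                           "2007-06-18T12:00:00.950",
                           "2007-06-18T12:00:02.100"], dtype="datetime64[ms]")
    values = np.array([1.02, 1.05, 0.98])

    with TdmsWriter("low_rate.tdms") as writer:
        writer.write_segment([
            ChannelObject("LowFreq", "Time", timestamps),
            ChannelObject("LowFreq", "Value", values),
        ])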

    Hope that helps,

    Herbert

  8. QUOTE(torekp @ May 17 2007, 08:43 AM)

    I'm almost sure you've seen it, but just in case ... I posted some more details on how we benchmark file formats at NI in this thread: http://forums.lavag.org/index.php?s=&showtopic=7939&view=findpost&p=30185 - including prerequisites and the actual VIs we use to run our benchmarks.

    For relatively short periods of writing, the profiler returns only the time it takes to shove your data into the Windows buffer, but that doesn't mean it's on disc yet. Don't yell at it - the poor thing doesn't know any better :blink:

    Herbert

  9. QUOTE(Tomi Maila @ May 16 2007, 11:28 AM)

    Tomi,

    I used HDF5 version 1.6.4. The LabVIEW API for that was never released to the public. I also don't have that code in my benchmark tool any more.

    You might need to rip some stuff out of the code, e.g. DAQmx or HWS, depending on what you have on your machine. Adding a format is rather simple: just add it to the typedef for the pulldown list and add new cases to the open, write and close case structures.

    Some remarks:

    Hope that helps,

    Herbert

    http://forums.lavag.org/index.php?act=attach&type=post&id=5888

  10. Benchmarks

    Every benchmark was run on a "clean" machine. The machine is what used to be a good office machine two years ago. It has software RAID, which has a minor influence on some of the benchmarks. Depending on what machine you use (e.g. what hard drive, single-processor vs. dual-processor etc.), results may obviously vary. If you see spikes in time consumption where my benchmarks don't show any, you might need a better hard disc / controller. The hard disc on Windows needs to be defragmented and at least half empty in order to achieve reproducible results. No on-demand virus scanning. Better to shut down any service that Windows can survive without. Load your benchmark VI, open the Task Manager, wait until processor usage settles at zero, and hit run. Make sure you have plenty of memory, so your system never starts paging.

    We did not care about the time it takes to write small amounts of data to disc. Windows will buffer that data, and your application continues to run before the data is actually on disc. We only cared about sustained performance that you can hold up for an extended period of time. In order to reach this "steady state", we stored at least 1000 scans in each of our benchmarks. The graphs in the attached PDF files show the number of scans stored on the x axis and the time it took for a single scan to be written on the y axis. The time consumed is only the time for the writing operation; time consumed by acquisition and other parts of the application is not included.
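
    The original benchmarks were LabVIEW VIs; purely to illustrate the measurement idea (time each individual write over many scans and exclude data generation), a harness could look like this Python sketch, where write_scan and make_scan stand in for whichever file API and data source are under test:

    import time

    def benchmark(write_scan, make_scan, n_scans=1000):
        """Return the wall-clock duration of each individual write."""
        times = []
        for i in range(n_scans):
            scan = make_scan(i)          # data generation is not part of the measurement
            start = time.perf_counter()
            write_scan(scan)             # only the write call is timed
            times.append(time.perf_counter() - start)
        return times                     # plot scan index vs. time to see spikes and trends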

    There are several things we were looking for in a benchmark:

    1. Overall time to completion (duh).
    2. Number and duration of spikes in time consumption. Minor spikes are normal and will occur with any file format on pretty much any system. Larger spikes can be a killer for high-speed streaming.
    3. Any dependency of performance on file size and/or file contents. This is where we eliminated most existing formats from our list. Performance often degrades linearly or even exponentially when meta data is added.

    The source data was always a 1D array of analog waveforms with several waveform attributes set. The formats under test were:

    • TDMS
    • TDM
    • LabVIEW Bytestream
    • LabVIEW Datalog (datalog type 1d array of wfm)
    • NI HWS (NI format for Modular Instruments, reuses HDF5 codebase)
    • HDF5
    • LVM (ASCII based, Excel-friendly)

    Some benchmarks only include a subset of these formats. The ones that are missing didn't perform well enough to fit in our graphs. HDF5 was tested only in the "Triggered Measurements" use case, because with the HDF5-based NI HWS format we already had a benchmark "on the safe side". The reason TDM goes down in flames in some benchmarks is that it stores channels as contiguous pieces of data.

    Mainstream DAQ

    The first benchmark is a mainstream DAQ use case: acquire 100 channels with 1000 samples per scan and do that 1000 times in a row. Note the spikes where Datalog and HWS/HDF5 are updating their lookup trees. TDMS beats bytestream by a small margin because of more efficient processing of waveform attributes.

    http://forums.lavag.org/index.php?act=attach&type=post&id=5882

    Modular Instruments

    Acquire 10 channels with 100000 values per scan. Here's where HWS/HDF5 still has TDMS beat. They do that by using asynchronous, unbuffered Windows File I/O. According to MS, that's the fastest way of writing to disc on Windows. We're working on that for TDMS. An interesting detail is the first value in the upper diagram. Note that HWS/HDF5 takes almost a second to initially create the file.

    http://forums.lavag.org/index.php?act=attach&type=post&id=5883

    Industrial Automation

    Acquire single values from 1000 channels. These are LabVIEW 8.20 benchmarks. With the 8.2.1 NI_MinimumBufferSize feature TDMS should look better than that, but I haven't run this test yet. Note that HWS/HDF5 takes about 3 seconds where all 3 native LabVIEW formats stay below 100ms.

    http://forums.lavag.org/index.php?act=attach&type=post&id=5884

    Triggered Measurements

    In this use case, every scan creates a new group with a new set of channels. This typically occurs in triggered measurements, or when you're storing FFTs or other analysis results that you cannot just append. We acquire 1000 values from 16 channels per scan for this use case. Of all things, I've lost the original data for the HDF5 test, so I need to attach 2 diagrams. The first one is the 8.20 benchmark without HDF5:

    http://forums.lavag.org/index.php?act=attach&type=post&id=5885

    The second one is an older benchmark that was done with a purely G-based prototype of TDMS (work title TDS). I attached it because it has the HDF5 data in it. The reason HWS is faster than the underlying HDF5 is that it stores only a limited set of properties.

    http://forums.lavag.org/index.php?act=attach&type=post&id=5886

    Reading

    I also have a bunch of reading benchmarks, e.g. read all meta data from a file, read a whole channel from a file, read a whole scan from a file etc. These are less exciting to look at though, because I only have aggregate numbers for them.

    We also recently conducted a benchmark on how fast DIAdem can load and display data from multi-gigabyte files, where TDMS was the overall fastest reading format.

    Hope that helps,

    Herbert

    QUOTE(Tomi Maila @ May 16 2007, 12:39 AM)

    I'd love to use TDMS but it doesn't suit our needs as it is today with only two hierarchy levels and lacking support multidimensional arrays (3-15d) and scalars. Are you intending to extend tdms format to support these features?

    Tomi

    Yes, we are planning to add these features. The underlying infrastructure (TDMS.DLL) is already fully equipped to do that, and the file format already has placeholders for all the necessary information in it. The reason we don't have these things yet is that TDMS is used for data exchange with DIAdem, where deep hierarchies and multi-dimensional arrays are not supported. So every time we add something like this, we need to coordinate with other groups that use TDMS (CVI, SignalExpress, DIAdem...) to make sure everybody has an acceptable way of handling whatever is in the file. We're working on that.

    Herbert :headbang:

  11. QUOTE(Tomi Maila @ May 15 2007, 02:24 PM)

    Herbert, could you please specify the performance issues and if possible refer to the source.

    Tomi

    Tomi,

    prior to making TDMS, we ran a bunch of different benchmarks on a variety of file formats. Test cases included high-speed logging on 10 channels, single-value logging on 10000 channels, saving FFT results (the point being that you cannot append FFT channels) and more. HDF5 does great on small numbers of channels, but it started having issues once we had about 100 data channels, where a channel in HDF5 is a node with a bunch of properties and a 1D array. If you keep adding channels (as you have to in the FFT results use case), performance goes down exponentially (!) with the number of channels.

    HDF5 furthermore tends to produce spikes in time consumption when writing. We contacted the HDF5 development team about that and they responded that it was a known issue they would be working on, but they couldn't give us a timeline for when it would be fixed.

    Herbert

  12. I think I answered this on Info-LabVIEW earlier today ... for the kind of dataset you describe, the TDMS file format and the TDM Streaming functions (a subpalette of File I/O) would be a good solution. TDMS files are binary, so their disc footprint is going to be much smaller than ASCII. The file size is only limited by your hard disc size. Within the file, you can organize data in groups that you can assign names and properties to (so you can use a smaller number of files). LabVIEW comes with a viewer application for TDMS. TDMS is also supported in CVI, SignalExpress and DIAdem, plus we provide an Excel Add-In for TDMS as a free download. If you need more connectivity than that, there's also a C DLL and documentation of the file format available on ni.com.

    An SQL database might be a reasonable solution, too - if it is well designed. It'll certainly help you maintain the large number of tests that you are storing. HDF5 is probably a bad idea. It is great for storing a few signals at high speed, but it has some really bad performance issues when it comes to storing large numbers of data sets.

    Hope that helps,

    Herbert

  13. When you open a file and write to it, the values are not immediately written to disc. They are cached in memory until a certain amount of data has accumulated, then the operating system will "flush" the cache to disc. If you need an event log for debugging purposes, you have two options:

    • Open, write and close every time you write. Closing the file will force the operating system to flush your data to disc.
    • Use the "Flush" function from the "Advanced" subpalette in File I/O. That essentially does the same thing, but it performs a little better, since you save the overhead of re-opening the file.

    If your system powers off without the OS properly shutting down, and that happens at a point in time when the system is flushing its disc buffers, you might end up with a corrupted file. In that case, I would recommend narrowing down what causes the error, and then writing one log file for each message from the code that appears to cause the problem. This will obviously result in tons of files, but it will tell you when your system bailed out.
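
    In plain-file terms, the two options amount to something like the following sketch (ordinary byte-stream logging in Python, not TDMS; os.fsync is the more aggressive variant that also asks the OS to push its own cache to disc):

    import os

    def log_close_each_time(path, message):
        # Option 1: open, write and close for every record; closing flushes the application buffer
        with open(path, "a") as f:
            f.write(message + "\n")

    def log_keep_open(f, message):
        # Option 2: keep the file open and flush after each record (saves the re-open overhead)
        f.write(message + "\n")
        f.flush()             # push the application buffer down to the OS
        os.fsync(f.fileno())  # ask the OS to write its cache to disc as well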

    Hope that helps,

    Herbert

  14. QUOTE(Thang Nguyen @ May 8 2007, 10:05 AM)

    I want to now if there is any limitation in the size of the TDMS file? If YES, is there any solution to determine it?

    There is practically no limit to the file size. We can address any piece of data within a signed 64-bit integer range (that's several million terabytes). The limiting factor there would be your hard disc size.

    The only practical limit is that we cache meta data and index information. That means we keep the names and properties of all groups and channels in memory, and every time you store data to a channel, we keep a uInt64 value that tells us where in the file that piece of data went. Hence, if you store a large number of objects in a loop with a large number of iterations, you can eventually run out of memory (a rough back-of-the-envelope estimate follows after the list below).

    To give you an idea of what I mean by large: one of our test cases for TDMS has 10000 channels with dozens of properties and 10000 chunks of raw data each. This test runs just fine on an average office machine. If your requirements are below that, you should be fine. If you ever run into that kind of limit, you might be successful by

    • using NI_MinimumBufferSize to reduce the number of times we actually write to disc
    • changing your group/channel setup to be more efficient
    • defragmenting your file
    • using multiple files.
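
    A rough back-of-the-envelope estimate of the index memory described above, as a small Python sketch (8 bytes per chunk pointer, one pointer per channel per write call; the counts are example numbers, not our test case):

    channels = 100
    writes_per_channel = 100_000
    index_bytes = channels * writes_per_channel * 8
    print(f"index alone: ~{index_bytes / 1e6:.0f} MB")  # ~80 MB, before names and properties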

    Hope that helps,

    Herbert

  15. WSDL

    I've been playing with WSDL quite a bit, using a variety of tools to create WSDL files and read them back in. Unfortunately, there are often compatibility issues, e.g. some tools use namespaces that other tools don't support, some tools declare their data types in a way other tools won't recognize, etc. So I went back to what I think is the granddaddy of web services - the Google Search API. I took their WSDL and did some text processing in order to adapt it to my server. Crude method. Works like a charm. For reference on WSDL, I was pleasantly surprised to see that the W3C specification of WSDL is very well written and has a lot of examples. Great resource: http://www.w3.org/TR/wsdl#_http.

    Web Service

    The WSDL file just declares your API; you still have to implement it. Programming the actual web server in LabVIEW is probably quite a bit of work. If you want it running stand-alone, you'll have to implement your own networking code. If you run a LabVIEW-built executable as a CGI app in a web server, you can avoid that effort, but you still need to write code to serialize / deserialize the SOAP strings. That's not exactly rocket science, but it might not be fun either. Plus, you need to live with the disadvantages of CGI apps, the most important of which is that they are slow, because every call starts up a new process.
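
    Just to make the "serialize / deserialize the SOAP strings" part concrete, here is a minimal client-side sketch in standard-library Python; the endpoint, namespace and operation name are invented for the example:

    import urllib.request

    ENDPOINT = "http://example.com/labview-service"   # hypothetical CGI endpoint
    ENVELOPE = """<?xml version="1.0" encoding="utf-8"?>
    <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
      <soap:Body>
        <GetMeasurement xmlns="http://example.com/labview-service/">
          <channelName>Signal</channelName>
        </GetMeasurement>
      </soap:Body>
    </soap:Envelope>"""

    request = urllib.request.Request(
        ENDPOINT,
        data=ENVELOPE.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8",
                 "SOAPAction": "http://example.com/labview-service/GetMeasurement"})
    with urllib.request.urlopen(request) as response:
        print(response.read().decode("utf-8"))        # raw SOAP response, still to be parsed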

    If you are using Visual Studio, I'd recommend building your own WSDL as described above, having Visual Studio generate a C# web service from it, and hooking that up to either the LabVIEW ActiveX server interface or a LabVIEW-built DLL (the ActiveX server has the advantage that you can keep LabVIEW running in order to save performance). If you are using a C/C++ compiler, check out http://gsoap2.sourceforge.net/. I know there are many ways of implementing web services in other languages (I think about 90% of them are done in Java), but I'm not quite sure how the LabVIEW connectivity works from there.

    Hope that helps,

    Herbert

  16. You should be safe there. DAQmx refnums are indeed special in that they can be cast to strings. That applies to a variety of them, including channels, tasks, tags etc. I'm not sure it applies to all of them. LabVIEW automatically coerces these refnums to strings if they are wired to a terminal that expects a string.

    There have been some issues in the past if you use DAQmx refnums as attributes of variants or waveforms, where the variant/waveform goes into a LabVIEW function that tries to process the attributes. It doesn't look like you're going to have that use case, though.

    Herbert

  17. QUOTE(PJM_labview @ May 1 2007, 02:34 PM)

    Unfortunately the described workaround (Check the found output) does not work.

    PJM

    :throwpc:

    I guess we'll have to get ugly then. "Get Properties" with no property name and no type wired to it will give you a list of all property names. Worst case, that list can be used to verify whether a property is there or not.

    Herbert
