Real-time acquisition and plotting of large data

wohltemperiert · October 2, 2014

Hi everyone,

My current application has to handle real-time data acquisition (200 Hz or above) on multiple channels simultaneously and plot the acquired data on XY graphs (including decimation) in real time. Such an acquisition typically lasts several days, so a very large amount of data is created, and the PC would run out of memory very quickly if all of it were kept in memory. Since the XY graphs have to be replotted periodically (at least in quasi real time), it seems to me that the whole range of data somehow has to remain available, whether or not decimation is performed. What I have been thinking about is streaming the acquired data to the hard disk and reading only the decimated data points into memory for plotting. Has anyone dealt with this kind of issue and could share some tips or experiences? Thanks in advance!

Best,

Fred

ShaunR · October 2, 2014

There is the "Data Logging" example which demonstrates this exactly in the SQLite API for LabVIEW. The issue would be whether you could log continuously at >200Hz - maybe with the right hardware and a bit of buffering.

wohltemperiert · October 2, 2014

Hi Shaun,

Thanks for the rapid reply. Your toolkit seems to solve my whole issue, which actually also includes zooming. I still have some questions and would very much appreciate it if you could answer them:

- Can the toolkit also work with a comma as the decimal mark (as used in Germany)? When I first tried zooming out, the VI "row col count" reported an error, and the error went away after I changed the Windows decimal mark setting to a point. Or maybe you know an elegant workaround that doesn't require changing the Windows setting.

- Is it possible to manage and access several "tables" in parallel within the same database, or do they have to be kept in separate databases?

- Without going through all the examples and documents: is there any description of the syntax used for the where clause?

Thanks for your reply in advance!

Best,

Fred

ShaunR · October 2, 2014

- Can the toolkit also work with a comma as the decimal mark (as used in Germany)? When I first tried zooming out, the VI "row col count" reported an error, and the error went away after I changed the Windows decimal mark setting to a point. Or maybe you know an elegant workaround that doesn't require changing the Windows setting.

Can you post the error? Row and column count return integers and should not have anything to do with decimal points.

- Is it possible to manage and access several "tables" in parallel within the same database, or do they have to be kept in separate databases?

Yes. You can have parallel reads (but only a single write without getting busy errors).

- Without going through all the examples and documents: is there any description of the syntax used for the where clause?

The syntax is standard SQLite SQL, which is 99% compatible with MS SQL and MySQL. If you work with DBs, you have to learn SQL.

Edited October 2, 2014 by ShaunR

drjdpowell · October 2, 2014

SQLite is the nicest solution, but if you only need a decimated graph of the full data (no zoom in for fine scale), then a quick fix is a "self-compressing array". An example taken from a past project:

Self Compressing Array.zip

This automatically decimates the data to keep the total under a fixed size. Never allocates memory except at initialization. But you can't zoom in; for that and other cool features you have to go with SQLite.
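
The attachment itself is LabVIEW and is not reproduced here. As a rough illustration of the idea in a text language, a minimal Python sketch might look like the following. The class name, the keep-every-other-point rule and the capacity are assumptions made for illustration, not the actual implementation from the zip file.

```python
class SelfCompressingArray:
    """Fixed-capacity buffer that halves its resolution whenever it fills up."""

    def __init__(self, capacity=1000):
        if capacity < 2 or capacity % 2:
            raise ValueError("capacity must be an even number >= 2")
        self.buf = [0.0] * capacity  # allocated once, never resized
        self.count = 0               # number of valid entries in buf
        self.stride = 1              # one stored point per `stride` raw samples
        self._seen = 0               # raw samples received so far

    def append(self, value):
        # Only every `stride`-th raw sample is a candidate for storage.
        keep = (self._seen % self.stride) == 0
        self._seen += 1
        if not keep:
            return
        if self.count == len(self.buf):
            # Buffer full: keep every other stored point, in place.
            half = self.count // 2
            for i in range(half):
                self.buf[i] = self.buf[2 * i]
            self.count = half
            self.stride *= 2
            # With an even capacity the current sample still lies on the
            # coarser grid, so it can be stored right away.
        self.buf[self.count] = value
        self.count += 1

    def values(self):
        """Return the currently stored (decimated) points."""
        return self.buf[:self.count]


arr = SelfCompressingArray(capacity=4)
for i in range(10):
    arr.append(float(i))
print(arr.values(), arr.stride)
```

The stored points stay evenly spaced over the whole acquisition, which is why this works for an overview plot but cannot serve a zoomed-in view.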

eberaud · October 6, 2014

Is there a reason why nobody is suggesting TDMS files? Wouldn't they be appropriate?

wohltemperiert · October 7, 2014

Hi Shaun,

I've attached the error message. My assumption is that the time stamps for the start and end of the time range are causing the error, since they are formatted using the system default decimal point, which in my case is a comma.

I've tried the following workarounds, which force the VI to use a point as the decimal mark each time data is converted between string and value. The example VI now seems to work with a comma as the default Windows decimal mark (see attachment):

- adding %.; to the format string

- replacing the VI "SQLite_Select DBL" with "SQLite_Select Str" and wiring the value False to the VI "Fract/Exp String To Number" (this modification was also made in the timeout case)

Do you think these workarounds solve the whole decimal point issue? Are there any other aspects I've neglected?

Thanks for your reply in advance!

wohltemperiert · October 7, 2014

Hi drjdpowell,

Thanks for your reply. I think your method is a really elegant and more resource-saving way of dealing with the issue if zooming is not needed.

I had also tried to come up with a similar solution, keeping only the data to be plotted and discarding the raw data, back when I was going to finish my application without the zoom function. But I could not figure out a way to do it, as you did by compressing the array.

Now I am going to try the SQLite approach first, keeping in mind that there is another alternative.

Is there a reason why nobody is suggesting TDMS files? Wouldn't they be appropriate?

Hi Manudelaveg,

I was asking myself the same thing when I was looking for a ready-made solution for my issue. I have no experience with the TDMS format, and I think it could be worth considering if it makes it possible to "select" and decimate data from a large database the way SQLite does. Do you think that's possible?

ShaunR · October 7, 2014

Hi Shaun,

<snip>

- adding %.; to the format string

- replacing the VI "SQLite_Select DBL" with "SQLite_Select Str" and adding the Value False to the VI "Fract/Exp String to Number" (this modification was also done for the timeout case)

<snip>

Yes, I see. SQLite is actually the reason here. They removed localisation from the API some time ago so that it will only accept decimal points:

Always use "." instead of "," as the decimal point even if the locale requests ",".

So the solution, as you say, is to use "%.;" in the query string to enforce it.

And yes, you may run into another issue in that the SELECT for double precision uses the LabVIEW primitive "Fract/Exp String To Number".

This will cause integer truncation when reading floating-point numbers from the DB on localised machines. I've created a ticket to modify it; you can get updates on the progress from there. In the meantime, your suggestion to use the string version of SELECT and call "Fract/Exp String To Number" yourself with "use system decimal point" set to false is correct. Alternatively, you can modify SQLite_Select Dbl.vi yourself like this.

Those are the only two issues, and thanks for finding them. Both changes will be added in the next release of the API which, now that I finally have an issue to work on as an excuse, will be out in the next couple of days.
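
For readers without LabVIEW at hand, the decimal-point behaviour on the SQLite side can be reproduced with a short Python sketch using the standard `sqlite3` module. It only illustrates how SQLite treats "," versus "."; the truncation described above happens in the LabVIEW primitive, and the `CAST` below merely shows a comparable effect inside SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A comma inside a SQL statement separates expressions, so "1,5" is two columns.
two_columns = conn.execute("SELECT 1,5").fetchone()

# Text with a decimal comma is only parsed up to the comma when cast to REAL.
truncated = conn.execute("SELECT CAST('1,5' AS REAL)").fetchone()[0]

# With a decimal point the value survives intact.
intact = conn.execute("SELECT CAST('1.5' AS REAL)").fetchone()[0]

# Binding the number as a parameter avoids string formatting altogether.
bound = conn.execute("SELECT ?", (1.5,)).fetchone()[0]

print(two_columns, truncated, intact, bound)
conn.close()
```

Wherever the binding layer allows it, passing numbers as bound parameters rather than formatting them into the query text sidesteps the locale question entirely.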

Edited October 7, 2014 by ShaunR

eberaud · October 7, 2014

I was asking myself the same thing when I was looking for a ready-made solution for my issue. I have no experience with the TDMS format, and I think it could be worth considering if it makes it possible to "select" and decimate data from a large database the way SQLite does. Do you think that's possible?

Well lately I have been digging quite deeply into TDMS, and after a few struggles, I now have it working quite nicely for a need quite similar to yours I believe. There are a lot of considerations to take into account when choosing between a database solution and a TDMS solution, so I wouldn't advise you to switch to TDMS just yet, but this is something you could look into...

ShaunR · October 7, 2014

Is there a reason why nobody is suggesting TDMS files? Wouldn't they be appropriate?

A relational database (RDB) is far more useful than a flat-file database, generally, as you can do arbitrary queries.

Here, we are using the query capability to decimate without having to retrieve all the data and try and decimate in memory (which may not be possible). We can ask the DB to just give us every nth data-point between a start and a finish. To do this with TDMS requires a lot of jumping through hoops to find and load portions of the TDMS file if the total data cannot be loaded completely into memory. That aspect is a part of the RDB already coded for us.

It is, of course, achievable in TDMS but far more complicated, more coding and requires fine-grained memory management. With the RDB it is a one-line query and it's job done. Additionally, there is an example written that demonstrates exactly what to do, so what's not to like?
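
To make the "every nth data-point between a start and a finish" idea concrete, here is a minimal Python sketch using the standard `sqlite3` module. The table name `samples`, its columns and the use of `rowid` for decimation are assumptions for illustration; the toolkit's "Data Logging" example may well do it differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (t REAL, ch0 REAL)")  # hypothetical schema

# 10 s of fake data at 200 Hz.
rows = [(i / 200.0, float(i)) for i in range(2000)]
conn.executemany("INSERT INTO samples VALUES (?, ?)", rows)
conn.commit()


def decimated(conn, t_start, t_end, n):
    """Return every n-th row with t_start <= t <= t_end."""
    query = (
        "SELECT t, ch0 FROM samples "
        "WHERE t BETWEEN ? AND ? AND rowid % ? = 0 "
        "ORDER BY t"
    )
    return conn.execute(query, (t_start, t_end, n)).fetchall()


# Only the decimated points for the visible time range are loaded into memory.
points = decimated(conn, 2.0, 4.0, 50)
print(len(points))
```

Decimating on `rowid` assumes rows are only ever appended and never deleted. For multi-day logs, an index on `t` would keep the range lookup fast.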

If the OP finds that he cannot achieve his 200 Hz acquisition via the RDB, then he will have no other choice but to use TDMS. It is, however, not the preferred option in this case (or in most cases IMHO).

Edited October 7, 2014 by ShaunR

JoeQ · October 8, 2014

Before getting into too much detail, you need to say what sort of data rate you need. Do you need to record two 8-bit channels at 200 Hz? Or a hundred 24-bit channels at 200 Hz?

There are big differences in drive write speeds. One of the faster systems I played with used two flash drives configured as RAID 0. Other things, like what else the PC is running, may come into play. I have had to use compression to overcome problems with write times. To give you an idea of the amount of data that can be captured, the system I am currently working on can fill a terabyte drive in an evening. In this instance there is no compression and I'm using mechanical, non-RAID storage. All the code is in LabVIEW.

Normally, I will have two separate data paths: one for storage, the other for the display. Typically I will do something similar to peak detect on a scope for the display data. There are lots of ways to display the data, and it really depends on your requirements.

ShaunR · October 9, 2014

It's an excellent point. For example, trying to log 200 double-precision (8-byte) datapoints at 200 Hz on an sbRIO to a class 4 flash memory card is probably asking a bit much. The same on a modern PC with an SSD should be a breeze, if money is no object. However, we are all constrained by budgets, so a 4 TB mechanical drive is a better fiscal choice for long-term logging just because of the sheer size.

The toolkit comes with a benchmark, so you can test it on your hardware. On my laptop SSD it could just about manage to log 500 8-byte (DBL) datapoints in 5 ms (a channel per column, one record per write). Whether that is sustainable on a non-real-time platform is debatable, and it would probably require buffering. 200 datapoints worked out to about 2 ms and 100 was under 1 ms, so there could be a near-linear relationship between the number of columns (or channels, if you like) and write times. The numbers could be improved by writing more than one record at a time, but a single record is the easiest.
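
As an illustration of "writing more than one record at a time", the following Python sketch buffers one second of samples and inserts them in a single transaction. It uses Python's standard `sqlite3` module rather than the LabVIEW toolkit, and the table layout and channel count are made up for the example. No timings are claimed; measure on the target hardware.

```python
import sqlite3

N_CHANNELS = 8  # hypothetical channel count, one column per channel

conn = sqlite3.connect(":memory:")
cols = ", ".join(f"ch{i} REAL" for i in range(N_CHANNELS))
conn.execute(f"CREATE TABLE log (t REAL, {cols})")
placeholders = ", ".join("?" * (N_CHANNELS + 1))
insert_sql = f"INSERT INTO log VALUES ({placeholders})"


def write_batch(conn, records):
    """Insert many records inside a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(insert_sql, records)


# Buffer one second of 200 Hz data, then write it in one go.
batch = [(k / 200.0,) + (0.0,) * N_CHANNELS for k in range(200)]
write_batch(conn, batch)
row_count = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
print(row_count)
```

Grouping records this way trades a little latency (data reaches the disk once per batch) for far fewer commits, which is usually where the per-record write time goes.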

I have performance graphs of number of records versus insert and retrieve times. I think I'll do the same for number of columns, as I think the max is a couple of thousand.

Edited October 9, 2014 by ShaunR

wohltemperiert · October 9, 2014

Hi JoeQ,

In my current application, I am dealing with 16 NI USB DAQ devices, each running at 200 Hz and simultaneously recording data (DBL) on 3 analog channels plus 1 time stamp channel.

You said you normally have two data paths, one of them for display. Does the display data path have some kind of fixed data length? Could you please describe in a bit more detail how you did it with peak detection, or suggest some other ways? Thanks in advance!

JoeQ · October 9, 2014

Sub 1 Mbit/s then. This should be no problem for storage. If the boards are able to put out data in, say, a raw 24-bit mode, it would be better to store it in that format and post-process it to double. For your speeds, it doesn't matter.
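
The "sub 1 Mbit/s" figure can be checked with a few lines of arithmetic. The sketch below assumes that the time stamp is stored as an 8-byte value like the three DBL channels, which the thread does not state explicitly.

```python
DEVICES = 16
CHANNELS_PER_DEVICE = 4   # 3 analog channels + 1 time stamp
BYTES_PER_SAMPLE = 8      # DBL; assumed for the time stamp as well
SAMPLE_RATE_HZ = 200

bytes_per_second = DEVICES * CHANNELS_PER_DEVICE * BYTES_PER_SAMPLE * SAMPLE_RATE_HZ
bits_per_second = bytes_per_second * 8
bytes_per_day = bytes_per_second * 86_400

print(bytes_per_second)   # 102400
print(bits_per_second)    # 819200
print(bytes_per_day)      # 8847360000
```

So the raw stream is about 0.82 Mbit/s, or a little under 9 GB per day of acquisition.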

...and really depends on your requirements.

Typically I don't care about the GUI when I am collecting data. Maybe I just need some sort of sanity check. So I may take a segment of data, say 1000 samples, and do a min/max on it. Then just use the two data points that will be sent to the GUI. I don't like to throw out data or average as it is just too misleading when looking at the data. Again, think about how a peak detect on a DSO works. I may for example change the sample size for the min/max as the user zooms in.

This min/max data path is normally separate from the rest of the data collection. If the GUI stalls for a half second it may not be a problem but normally, missing collected data is not something I can have.
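
A min/max reduction of the kind described above can be sketched in a few lines of Python. The function name and the segment size are illustrative assumptions; in LabVIEW the same thing would be done on array chunks in the display path.

```python
def min_max_decimate(samples, segment_size):
    """Reduce each segment of `segment_size` samples to its (min, max) pair."""
    if segment_size < 1:
        raise ValueError("segment_size must be >= 1")
    out = []
    for start in range(0, len(samples), segment_size):
        segment = samples[start:start + segment_size]
        out.append((min(segment), max(segment)))
    return out


# A single-sample spike survives, unlike with averaging or plain subsampling.
data = [0.0] * 3000
data[1234] = 5.0
envelope = min_max_decimate(data, 1000)
print(envelope)
```

Changing `segment_size` as the user zooms in gives the adaptive behaviour mentioned above: smaller segments for a narrow time window, larger ones for the full record.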

For your rates, you should be able to do almost anything and get away with it. The system I mentioned previously has about a 500x higher sustained data rate and can't drop data. It's a little tight but workable without any special hardware.

I still use that old Microsoft CPU Stress program to stress my LabVIEW apps. Launch a few instances of it and you can get a good idea of how your design is going to hold up.

ShaunR · October 10, 2014

<snip>

so it could be a near linear relationship between number of columns (or channels, if you like) and write times.

<snip>

I have performance graphs of number of records versus insert and retrieve times. I think I'll do the same for number of columns, as I think the max is a couple of thousand.

yup. It is linear.

A while later.............

That was up to the default maximum number of columns (1000 ish for that version). As I was building version 3.8.6 for uploading to LVs-Tools, I thought I would abuse it a bit and compile the binaries so that SQLite could use 32,767 columns (3.8.6 is a bit faster than 3.7.13, but the results are still comparable).

I think I've taken the thread too far off from the OP now, so that's the end of this rabbit hole. Meanwhile, back at the ranch...

Edited October 10, 2014 by ShaunR

bigjoepops · October 14, 2014

Fred,

If you haven't found it already, I would check out the Continuous Measurement and Logging template. I have had pretty good success with that. It should be able to give you reasonable "real time" results, depending on the specifics of what you are acquiring. You will have to make sure you use a fast method for logging data. I don't know if it is better to wait until you have a large amount of buffered data before saving it, or if you can save it "one at a time".

Do you need graphs, or can you just use charts for your monitoring? If you are saving/building large arrays in a loop, you will slow things down.

Joe
