
Reading in large data files



I need to read in files over 1 GB and the Express read VI doesn't cut it anymore. The algorithm I used in MATLAB to read files like this worked pretty well: read in 10000 pieces of data, process them, then spit them out. I have the OpenG library and I tried some of the read-data-file VIs in there, but they all returned garbage: where there was supposed to be a zero, the VI would return 2.65487E-28 or something like that... The data is not complicated: the first column is relative time and the other columns are channels of data.

Any suggestions? Thanks...


You've given us too little information to solve the problem. Are you trying to read a text file? Is it tab-delimited? What datatypes are stored in the file? Do you have code you are using to read the file, and could you share it with us? Does the problem occur only with files over 1 GB, or also with small files? Can you provide a small sample file that reproduces the problem?


Sorry... I can't post a copy of my VI because my work has some sort of firewall up...

It's a tab-delimited file with voltage readings for each channel against the time column. It's just a regular ASCII file, so I can open it in Notepad and read through the data. The problem only occurs with the larger files, because the Read From Measurement File Express VI can only read in relatively small files. The file looks exactly like this: it starts with the names of the channels, then moves on to the data:

FAIRING_NOSE

FAIR_FWD_BOLT_1ST

FAIR_FWD_BOLT_2ND

S2_SEPARATION

S3_TVC_BATTERY

FAIR_BASE_JOINT

PAYLOAD_SEP_1ST

PAYLOAD_SEP_2ND

0.000000 0.004883 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000

0.000200 0.000000 0.000000 0.000000 0.000000 -0.004883 0.000000 0.000000 0.004883

0.000400 -0.004883 -0.004883 -0.004883 -0.004883 0.000000 0.000000 0.000000 0.000000

0.000600 0.000000 0.000000 -0.004883 -0.004883 -0.004883 -0.004883 0.000000 0.000000

0.000800 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000

0.001000 0.000000 -0.004883 0.000000 0.004883 0.000000 0.000000 0.000000 0.000000

The first column is time in seconds. It goes on for about an hour, so the file ends up being pretty big.


QUOTE(Yuri33 @ Aug 10 2007, 10:29 AM)

Is there any reason you can't implement the same solution in LV that you implemented in Matlab (read in / process / spit out 10000 samples at a time)? LV provides open/read/seek/close functions equivalent to Matlab's.

Depending on which version of LV you have, this may or may not work.

LV 8.0 and up handles 64-bit file pointers (I believe). Prior to that, the file offset was limited to 32 bits. But even then, the file I/O primitives tracked the file size internally using 64 bits, so as long as you just let the file I/O functions track the file pointer, it would still work.

HDF5 will also handle files that large.

Under Windows XP and prior, you could seldom read in ALL of the file at once, because Windows could only provide about 1.2 GB of memory.

Ben


I was using textscan in Matlab.

Yeah, LabVIEW throws an exception before it even begins reading in the file. Can anyone post a screenshot of a VI that would read in that kind of file using the functions in LabVIEW 8.0.1?

This is what I have in Matlab:

% Setup (example values; adjust to match your file -- these were defined
% elsewhere in the original script)
fid = fopen('data.txt', 'rt');                 % the big tab-delimited file
sizeChannels = 8;                              % data channels (columns after time)
formatStr = repmat('%f', 1, sizeChannels+1);   % one %f per column, time included
startPoint = 8;                                % header lines (channel names) to skip
outofbounds = ones(1, sizeChannels);           % per-channel bound
count = ones(1, sizeChannels);                 % next free slot per channel
out_of_bounds_matrix = cell(1, sizeChannels);
time_counter = 0;

fprintf('Working');
% first read: skip the header lines, then take 10000 rows
totalData = textscan(fid, formatStr, 10000, 'delimiter', '\t', 'headerlines', startPoint);

while ~isempty(totalData{1})                   % continue until no more data is read
    for C = 1:sizeChannels+1                   % across each column
        for L = 1:length(totalData{C})         % down each row
            if C == 1
                time_counter = time_counter + 1;   % adds up to find total time of test
            elseif totalData{C}(L) > outofbounds(C-1) || totalData{C}(L) < -outofbounds(C-1)
                % adds time and value to the out-of-bounds array
                out_of_bounds_matrix{C-1}(1, count(C-1)) = totalData{1}(L);
                out_of_bounds_matrix{C-1}(2, count(C-1)) = totalData{C}(L);
                count(C-1) = count(C-1) + 1;
            end
        end
    end
    fprintf('.');
    % reads in the next 10000 rows
    totalData = textscan(fid, formatStr, 10000, 'delimiter', '\t');
end
fclose(fid);


Here's some example code:

http://forums.lavag.org/index.php?act=attach&type=post&id=6603

But it's not quite optimal.

Be aware that you have a serious load of data:

5 kHz * 3600 s * 9 columns * 8 bytes = 1.3 GB. This will most likely not fit in your memory, since LabVIEW needs the array in one contiguous block of data.

So I'd try to create a list of pointers with the byte position of every second; that index is only 8 bytes * 3600 = 28.8 kB.
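In Matlab terms, since a block diagram can't be typed out, the idea looks roughly like this (a sketch only; the file name and the 5000-lines-per-second rate are assumptions taken from the math above):

% Build a byte-offset index: one pointer per second of data.
% (The 8 header lines would need skipping first; omitted for brevity.)
fid = fopen('data.txt', 'rt');          % assumed file name
linesPerSecond = 5000;                  % assumed sample rate, per the math above
nSeconds = 3600;
offsets = zeros(nSeconds, 1);
for s = 1:nSeconds
    offsets(s) = ftell(fid);            % byte position of this second's first line
    for k = 1:linesPerSecond
        fgetl(fid);                     % skip one line of text
    end
end
% Later, jump straight to any second and read just that slice:
fseek(fid, offsets(42), 'bof');
oneSecond = textscan(fid, repmat('%f', 1, 9), linesPerSecond, 'delimiter', '\t');
fclose(fid);

With the index in hand you only ever hold one second of data (about 45000 values) in memory at a time.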

Ton

PS: Update your profile; it shows you're using 8.2.1, but you stated 8.0.1 earlier.


I have a large (multi-gigabyte), fixed record length binary file that I extract values from in a post-processing step (no data acq running). I used queues and parallel loops to improve my data load time from ~5 minutes to ~30 seconds!

Here is a picture of my VI. My records are binary, but you could use "read lines" and a for loop to enqueue the strings in the lower loop. Part of my conversion to actual values is done in a sub-vi that I use in other places; you could use Ton's example to extract the values from the string (spreadsheet string to array).

To improve your load time: use parallel loops to retrieve the data and convert it, predefine your array sizes, and use replace element instead of build array or autoindexing.
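The preallocation advice translates directly to the Matlab code earlier in the thread (a sketch; nRows and nextRow are hypothetical stand-ins, not anything from the posts above):

nRows = 5000 * 3600;                    % total samples expected (assumed)
data = zeros(nRows, 9);                 % predefine the array size, once
for i = 1:nRows
    data(i, :) = nextRow();             % replace element: overwrite in place
end
% The slow pattern reallocates and copies the whole array on every append:
% data = [data; nextRow()];             % build array: avoid in a tight loop

The queue half of the trick is LabVIEW's producer/consumer pattern: one loop reads raw records from disk and enqueues them, while a second loop dequeues and converts them, so the disk never waits on the conversion.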

