
Read a CSV file like Excel



I've got a CSV file I'm trying to read with lines like the following:

"Chris","Davis, Jr.","Nothing",100.00,80.00,34.10,"Nothing"

"Christopher","Relf","LabVIEW Expert",100.00,90.00, 50.10,"LabVIEW Expert"

When I read this type of file into Excel I get 7 columns; when I read it with LabVIEW I get 8 columns. I'm using Spreadsheet String to Array with a comma delimiter. I know that I can separate the file just like Excel does, but I don't know the easiest and most efficient way of doing it.
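To make the 7-vs-8 column discrepancy concrete, here is a small sketch (Python for illustration, since the original is LabVIEW): a plain split on every comma breaks the quoted "Davis, Jr." field in two, while a quote-aware CSV parser keeps it intact.

```python
import csv
import io

# One row from the post: the second field contains an embedded comma.
row = '"Chris","Davis, Jr.","Nothing",100.00,80.00,34.10,"Nothing"'

# A blind split on "," (what a plain comma-delimited parse does)
# cuts "Davis, Jr." in half, yielding 8 columns:
naive = row.split(',')
print(len(naive))  # 8

# A quote-aware CSV parse keeps the quoted field whole, yielding 7:
proper = next(csv.reader(io.StringIO(row)))
print(len(proper))   # 7
print(proper[1])     # Davis, Jr.
```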

Any pointers?

Thanks in advance...

The easy solution would be to use TAB as the delimiter instead of a comma in your input :blink:

If you insist on a comma as the delimiter:

Here is my solution, which uses a regexp to split the strings up. I'm not sure I would call it either easy or efficient...
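The screenshot itself isn't reproducible here, but the regexp idea can be sketched in text form (Python here for illustration; the exact pattern in the screenshot is unknown, so this particular regex is an assumption): split only on commas that are followed by an even number of quotes, i.e. commas that sit outside any quoted field.

```python
import re

row = '"Chris","Davis, Jr.","Nothing",100.00,80.00,34.10,"Nothing"'

# A comma is a real delimiter only if an even number of quote marks
# follows it to the end of the line (it is outside any quoted field).
fields = re.split(r',(?=(?:[^"]*"[^"]*")*[^"]*$)', row)
print(len(fields))  # 7
```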

post-5175-1162897649.png

-Mikkel :)

I wish I could change the format of the data file, but, for various reasons, I can't.

Thanks for the help. I'll try this tomorrow.

For future reference, I believe the function you are using in your screenshot was introduced in LV 8.

Thanks! :thumbup:


I don't know this "Christopher Relf" guy - but he sounds like he's awesome! Please give him my details - I'd love to offer him a job :) (I bet he's really handsome too, and smart, and witty, and humble...)

Anyways, yep - it's that rogue "," that's the issue, and parsing is the only way out of it. Excel does some "smarts" by not parsing text between the " and ", but the LabVIEW functions are more basic, allowing for far more flexibility. Alas, with more flexibility comes less smarts - you need to include that yourself :P

I like Mikkel's answer - it's probably the cleanest you'll get. Another option is to go through the raw text once it's read in, replace the delimiters outside the quotes with a different delimiter (say, \t), and then parse into an array.

  • 4 weeks later...

Well, I did get a chance to try the posted code, but it didn't work exactly as intended, probably because I didn't give a good representation of the actual data file. It's more like what you see below:

"Chris","Davis, Jr.","asdfpoiu2134098",12/01/2006,01:15:06 PM,12.00,13.00,24.00

"asdfwer234098lkj","weoirtuwe234098",1,1,0,0,"This is a test"

"weoriu324098lj","This question has a, comma in it",1,1,0,0,"More text here"

"wqeroi2340981","This question has been cut off, with a "quoted piece of text",0,0,1,1,"in it"

"23498osfuoiwuer","This question has a "QUOTE", and a comma in it",0,0,1,1,"Kinda Rough"

The solution I'm using right now is as follows...

--Search for all double quotes, log their locations

--Look through the double quote locations, checking each quote to see if it has a comma immediately before or after it; if so, keep it.

--Set everything between known-good quote locations to a space

--Search for all commas, log their locations

--Split the string (using the OpenG string subset) into an array of strings

This technique works on everything but the last line of my example above. Right now I'm logging those types of lines and forcing the user to fix them manually. I was wondering if someone else had another idea. I've included some LV7 code that implements the steps above. I'm using LV 8.2 for the final project, but I wanted to reach the widest possible audience with this example. I'm open to any ideas or suggestions.
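The steps above can be sketched in text form. This is a minimal Python translation of the quote-location heuristic (the function name and details are my own; the original is a LabVIEW VI), and it shares the same limitation: a quote that happens to sit next to a comma *inside* a field still fools it.

```python
def split_heuristic(line: str):
    # Step 1: find all double quotes and log their locations.
    quotes = [i for i, c in enumerate(line) if c == '"']
    # Step 2: keep only quotes with a comma (or line boundary) adjacent;
    # those are assumed to be field-delimiting quotes.
    good = [i for i in quotes
            if (i == 0 or line[i - 1] == ',')
            or (i == len(line) - 1 or line[i + 1] == ',')]
    # Step 3: blank out everything between paired good quotes, so
    # commas inside quoted fields no longer look like delimiters.
    masked = list(line)
    for start, end in zip(good[0::2], good[1::2]):
        for j in range(start, end + 1):
            masked[j] = ' '
    masked = ''.join(masked)
    # Steps 4-5: log the surviving commas and split the ORIGINAL
    # line at those positions.
    cuts = [i for i, c in enumerate(masked) if c == ',']
    fields, prev = [], 0
    for c in cuts:
        fields.append(line[prev:c])
        prev = c + 1
    fields.append(line[prev:])
    return fields

print(split_heuristic('"a","has, comma",1'))
# ['"a"', '"has, comma"', '1']
```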

Thanks!

Download File:post-2547-1165029240.vi


Chris (Davis),

This is a tricky problem. One thing worth noting is that Excel escapes quotes by doubling them (e.g., " becomes "" in the CSV file). For example, I think Excel would save your example as the following:

"Chris","Davis, Jr.","asdfpoiu2134098",12/01/2006,01:15:06 PM,12.00,13.00,24.00

"asdfwer234098lkj","weoirtuwe234098",1,1,0,0,"This is a test"

"weoriu324098lj","This question has a, comma in it",1,1,0,0,"More text here"

"wqeroi2340981","This question has been cut off, with a ""quoted piece of text",0,0,1,1,"in it"

"23498osfuoiwuer","This question has a ""QUOTE"", and a comma in it",0,0,1,1,"Kinda Rough"

I have created a VI (attached) that seems to handle Excel CSV files correctly. (However, it might not work on your original example strings, since it expects quotes to be escaped.)
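The doubled-quote convention is what makes the escaped form parseable. For illustration only (this is Python's standard csv module, not the attached VI): a parser that expects doubled quotes turns "" back into a single " on read, while embedded commas stay inside the field.

```python
import csv
import io

# Excel escapes an embedded quote by doubling it; a standard CSV
# reader undoes that when parsing.
line = '"This question has a ""QUOTE"", and a comma in it",0,1'
fields = next(csv.reader(io.StringIO(line)))
print(fields[0])   # This question has a "QUOTE", and a comma in it
print(len(fields)) # 3
```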

This is something that I was thinking of adding to the OpenG code, someday, so I have BSD'ed the VI. If you want to help polish it up, so that we can add it to the OpenG string library, that would be great :)

Download File:post-17-1165043092.zip


Jim,

Thanks for saving it back to 6.1 for me. I tried it out, and it worked great. The code is easier to understand than mine, and I don't know what I could contribute to the VI. It seems well documented and understandable; other than missing VI documentation and Subversion revision information, it should fit right into the OpenG String library. I've just got to find a way to make my data files have the right # of quotes where they should be.

I could contribute some more test cases to help flesh out the VI, but I don't know the format of your automatic testing VIs. Perhaps I can post/email them to you?

Thanks for the help.

  • 2 years later...

Jim,

Here we are in 2008, and you saved me a lot of headache. While I didn't use your VI, I saw that you were using Match Pattern.vi, which was easier than the Scan String for Token.vi that I was using.

Coming from C, it's frustrating knowing what I want to do, but not knowing what's available or the easiest implementation...

thanks!

