RFC-4180 formatted CSV file parser


After many frustrating experiences with patched-together CSV parsers (admittedly of my own creation), I've finally broken down and developed a CSV file parser that follows the RFC-4180 format: http://tools.ietf.org/html/rfc4180

 

That means it takes into account double quotes, commas inside double quotes, and multi-line entries. This is essential if you plan on storing user-input text or numbers that may use a comma as the decimal separator.
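For readers who don't have LabVIEW at hand, here is a minimal text-language sketch of those quoting rules in Python. It is not the attached VI, only an illustration; the function name `parse_csv` and its `delimiter` parameter are made up for this example.

```python
def parse_csv(text, delimiter=","):
    """Parse RFC 4180 style CSV text into a list of rows (lists of strings)."""
    rows, row, field = [], [], []
    in_quotes = False
    pending = False  # True while the current record has consumed any character
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_quotes:
            pending = True
            if ch == '"':
                if i + 1 < n and text[i + 1] == '"':
                    field.append('"')  # doubled quote is an escaped quote
                    i += 1
                else:
                    in_quotes = False  # closing quote
            else:
                field.append(ch)  # commas and line breaks are kept as data
        elif ch == '"':
            in_quotes = True
            pending = True
        elif ch == delimiter:
            row.append("".join(field))
            field = []
            pending = True
        elif ch in "\r\n":
            if ch == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1  # treat CR+LF as a single record break
            row.append("".join(field))
            rows.append(row)
            row, field = [], []
            pending = False
        else:
            field.append(ch)
            pending = True
        i += 1
    if pending:  # last record without a trailing line break
        row.append("".join(field))
        rows.append(row)
    return rows
```

Calling `parse_csv('a,"b, c"\r\n')` should give one row with the fields `a` and `b, c`. Note that nothing is trimmed: spaces outside quotes stay in the field.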

 

From what I can tell, this is the same format that OpenOffice, LibreOffice and Microsoft Excel use for their CSV file export. Therefore, using this format, you can export from LabVIEW to CSV, open the file in Excel, make changes, export from Excel to CSV, then import it back into LabVIEW without everything being broken by a missing double quotation mark or an extra comma.

 

Attached is the VI to parse a CSV formatted string to a 2D array of strings for LV 2013, 2012 and 8.6. I've also attached an example CSV file (inside Test1.zip) exported from LibreOffice to demonstrate some of the tricky cases that most CSV parsers can't handle.

 

Please give it a try and let me know your thoughts.

 

Update: I've posted the finished library to the Code Repository, in the Database & File IO category, here: http://lavag.org/files/file/239-robust-csv/

Edited by Porter

I posted something similar on the NI forums several years ago.

 

Someone pointed out to me there that some apps will generate opening and closing quotes (rather than just straight quotes) when saving as CSV.

 

Here is some test data with opening and closing quotes, based on the example in the Wikipedia entry for CSV files:

 

Year,Make,Model,Description,Price
1997,Ford,E350, “ac, abs, moon”,3000.00
1999,Chevy,“Venture ““Extended Edition”””,“”,4900.00
1999,Chevy,“ Venture ““ Extended Edition, Very Large”””,“”,5000.00
1996,Jeep,Grand Cherokee,“MUST SELL!
air, moon roof, loaded”,4799.00
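One possible way to cope with such files, sketched in Python rather than LabVIEW, is to map the opening and closing quotes (U+201C and U+201D) to straight quotes before parsing. This is an assumption about how one might handle it, not a description of either VI, and the name `parse_with_curly_quotes` is hypothetical. Parsing here is delegated to Python's standard `csv` module.

```python
import csv
import io

# Map opening (U+201C) and closing (U+201D) quotes to the straight quote.
CURLY_TO_STRAIGHT = str.maketrans({"\u201c": '"', "\u201d": '"'})

def parse_with_curly_quotes(text):
    """Normalize curly quotes, then parse the text as CSV."""
    normalized = text.translate(CURLY_TO_STRAIGHT)
    # newline="" leaves line endings untouched so the csv module sees them as-is.
    return list(csv.reader(io.StringIO(normalized, newline="")))
```

The trade-off is that any curly quote that was meant as literal data is converted as well, so this only makes sense when you know the exporting application produced them as field quotes.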

 

My VI can be found here:

 

http://forums.ni.com/t5/LabVIEW/Read-csv-file-with-double-quotes/m-p/1591640#M580390


Thanks for the info Phillip.

 

It's unfortunate that I didn't find your post earlier. Your VI is about 8 times faster than mine. Converting to a byte array and replacing CR/LF with ASCII record separators and commas with ASCII unit separators is a very good idea. Using indexed for loops instead of while loops to build the array must also make a big difference.
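To make the separator idea concrete for anyone reading without the VI open, here is how I understand it, sketched in Python. This is only my reading of the approach, not the actual block diagram: outside of quotes, line breaks become the ASCII record separator (0x1E) and commas become the ASCII unit separator (0x1F), after which two plain splits do the rest. The function name is made up, and the sketch assumes the data itself never contains bytes 0x1E or 0x1F.

```python
RS, US = 0x1E, 0x1F        # ASCII record separator and unit separator
QUOTE, COMMA, CR, LF = 0x22, 0x2C, 0x0D, 0x0A

def split_via_separators(data: bytes):
    """Split CSV bytes into rows of strings using RS/US substitution."""
    out = bytearray()
    in_quotes = False
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b == QUOTE:
            if in_quotes and i + 1 < n and data[i + 1] == QUOTE:
                out.append(QUOTE)  # doubled quote inside a quoted field
                i += 1
            else:
                in_quotes = not in_quotes  # opening or closing quote is dropped
        elif not in_quotes and b == COMMA:
            out.append(US)
        elif not in_quotes and b in (CR, LF):
            if b == CR and i + 1 < n and data[i + 1] == LF:
                i += 1  # CR+LF counts as one record break
            out.append(RS)
        else:
            out.append(b)  # includes commas and line breaks inside quotes
        i += 1
    records = bytes(out).split(bytes([RS]))
    if records and records[-1] == b"":
        records.pop()  # drop the empty tail after the final line break
    return [[f.decode("utf-8") for f in rec.split(bytes([US]))] for rec in records]
```

Once the substitution is done, the remaining commas and line breaks are all data, which is why the final splitting step can be so simple.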

 

I was also trimming white space from every cell, which is not correct: RFC 4180 treats spaces as part of the field.

 

I modified your VI slightly to fit my application:

- removed file open and path input/output (I don't want that functionality)

- added support for field delimiters other than the comma. Tested with tab and semicolon.

- forced all line endings to LF before processing. This avoids a CR+LF producing a blank row.
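For illustration only, the last two changes look roughly like this in Python. The modified VI is not shown in this thread, so treat this as a sketch; the name `parse_delimited` is hypothetical and the standard `csv` module stands in for the parser.

```python
import csv
import io

def parse_delimited(text, delimiter=","):
    """Force all line endings to LF, then parse with the given field delimiter."""
    # Replace CR+LF first so a lone CR left over cannot create a blank row.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return list(csv.reader(io.StringIO(normalized, newline=""), delimiter=delimiter))
```

For example, `parse_delimited('a;b\r\nc;"d;e"\r\n', delimiter=";")` should give two rows, the second containing the field `d;e`. One side effect worth knowing: a CR+LF inside a quoted field also becomes a plain LF.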

 

Would you like me to post the modified VI here?

 

