I'm trying to use the DOM parser in LabVIEW, and the HTML I'm trying to parse is throwing errors. I'm wondering if someone can help me work around this. The device that publishes the HTML is quite old, and my knowledge of HTML is limited. I've played around for a bit and been unable to get it to work, so I figured I'd come here and see if anyone could troubleshoot. I'm open to minor manipulations of the HTML to get the parser to work, so feel free to modify as needed.

test.html

ParseHTML.vi

A quick look at the HTML source shows me it's old-school HTML, not XHTML. That 'X' is very important, as it implies the content would be valid XML in addition to HTML. The LabVIEW DOM functions operate on the XML DOM, be it XHTML or any other XML, and demand well-formed input.
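To illustrate the difference outside LabVIEW, here is a minimal Python sketch using the standard library's XML DOM parser. The two fragments are hypothetical and are not taken from the attached test.html; the point is only that a strict XML parser rejects unclosed tags that old-school HTML allows.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError

# Hypothetical fragments for illustration; not taken from test.html.
OLD_HTML = "<html><body><p>Temp: 21.5<br><p>Flow: 3.2</body></html>"
XHTML = "<html><body><p>Temp: 21.5<br/></p><p>Flow: 3.2</p></body></html>"


def is_well_formed_xml(text):
    """Return True if a strict XML DOM parser accepts the text."""
    try:
        xml.dom.minidom.parseString(text)
        return True
    except ExpatError:
        # Raised for unclosed or mismatched tags, among other errors.
        return False


print(is_well_formed_xml(OLD_HTML))  # False: <br> and <p> are never closed
print(is_well_formed_xml(XHTML))     # True: every element is closed
```

Any XML DOM parser should behave the same way on input like this, which is consistent with the errors described above.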

 

If you want to parse (non X) HTML you'll probably have to do search and replace string operations. Either that or work some sort of browser engine into your code which can interpret HTML and allow you to operate on the HTML (not XML) DOM.
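As a rough sketch of the "something that understands HTML" option, Python's standard `html.parser` module tolerates unclosed tags and unquoted attributes. The `TableCellCollector` class, the `extract_rows` helper, and the sample markup below are all assumptions made for illustration; the real device output may look different.

```python
from html.parser import HTMLParser


class TableCellCollector(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TR> arrives as "tr".
        if tag == "tr":
            self.rows.append([])
            self._in_cell = False
        elif tag in ("td", "th"):
            if not self.rows:
                self.rows.append([])  # cell that appears without a <tr>
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th", "tr", "table"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def extract_rows(html_text):
    """Return a list of rows, each a list of stripped cell strings."""
    parser = TableCellCollector()
    parser.feed(html_text)
    parser.close()
    return [[cell.strip() for cell in row] for row in parser.rows if row]


# Hypothetical old-style markup: uppercase tags, no closing </TD> or </TR>.
SAMPLE = """<TABLE BORDER=1>
<TR><TD>Temperature<TD>21.5
<TR><TD>Flow<TD>3.2
</TABLE>"""

print(extract_rows(SAMPLE))
# [['Temperature', '21.5'], ['Flow', '3.2']]
```

This avoids the XML well-formedness requirement entirely, at the cost of writing the row and cell bookkeeping by hand.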

This will be on a Windows machine, so maybe I can use .NET in LabVIEW, but we'll see. Is there something about regular HTML that makes a generic parser impractical to write, so that it's not worth digging for one? If that fails I'll just manipulate the string myself (ugh).

Sooooo.... perhaps let's just short circuit and skip to regexes... how sophisticated exactly is this modification you're wanting to do? (related: attachment)

 

For the future, here's a convenient link to test XHTML compliance of a document, as a "first-pass" check whether the LabVIEW DOM parser might have a rough go at it: http://validator.w3.org/

 

post-17237-0-93793600-1389757572.png

Whatttt? I didn't crash at all on mine and I ran lots of times. Must be your Mac ;) (disregard if running in a Windows VM...which it looks like you are).

 

This program I'm trying to write is generally simple. I don't actually have to modify the HTML, just pull a couple of numbers out of the table. I was just open to modifying the HTML if there was a simple change that would make it compatible with the LabVIEW DOM parser without affecting the data I need to grab.

 

That said, I will probably go the regex route, and it will be a good exercise for me. I'll post back if I get stuck on that. With all due respect, I'd like to forge (see: hack) ahead on my own for the time being, for learning's sake.
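For anyone finding this thread later, a minimal sketch of the regex route in Python follows. The pattern, the `numbers_in_cells` helper, and the sample markup are assumptions made for illustration, not the actual contents of test.html, and the pattern would need adjusting to the real table layout.

```python
import re

# Hypothetical old-style table markup; the real device output may differ.
SAMPLE = "<TR><TD>Temperature<TD>21.5\n<TR><TD>Flow<TD>3.2\n"

# A <td> tag (any case, optional attributes) whose whole content is a number,
# followed by the next tag or the end of the string.
CELL_NUMBER = re.compile(
    r"<td[^>]*>\s*([-+]?\d+(?:\.\d+)?)\s*(?=<|$)",
    re.IGNORECASE,
)


def numbers_in_cells(html_text):
    """Return every purely numeric table cell as a float, in document order."""
    return [float(match) for match in CELL_NUMBER.findall(html_text)]


print(numbers_in_cells(SAMPLE))
# [21.5, 3.2]
```

The same pattern should carry over to LabVIEW's Match Regular Expression function with little change, though that is untested here.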
