LabView Regexp Problem


zlocm

I'm trying to delete all HTML tags (like <.....>) from a sequence, but LabVIEW doesn't find all of the tags in my sequence.

My sequence: <tr id = "ololo">Hello World <br> !!!</tr>

Regexp: <(.*?)>

My VI (LV 2009) is in the attachments.

Regards.

Funny, I just recently posted a VI (on the dark side) that I wrote a while back to remove HTML tags from TestStand HTML reports. Maybe it will help you.

There are some other regex nuggets in the thread...

http://forums.ni.com/t5/BreakPoint/Regular-Expressions-Board/m-p/1269088#M14343

As far as the problem in your original VI, you're using Shift Registers with the Match Regular Expression node incorrectly. You can't use the Offset After Match if you're only going to search what was found before and after the previous match. Savvy?
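One way to read that advice, sketched in Python rather than LabVIEW: keep searching the full, original string and carry only the offset from one iteration to the next (the job a shift register would do). The function name `find_tags` is made up for illustration, and the pattern is the `<[^>]+>` form suggested elsewhere in this thread.

```python
import re

TAG = re.compile(r"<[^>]+>")

def find_tags(text):
    """Collect every tag by resuming the search where the previous match ended."""
    tags = []
    offset = 0  # carried between iterations, like a shift register value
    while True:
        # Always search the original string; only the start offset changes.
        match = TAG.search(text, offset)
        if match is None:
            break
        tags.append(match.group())
        offset = match.end()  # plays the role of the offset past the match
    return tags

sample = '<tr id = "ololo">Hello World <br> !!!</tr>'
print(find_tags(sample))
```

The alternative is to feed only the "after" substring back in and always start at offset 0. Mixing the two (a shortened string plus an offset measured on the longer one) is what skips matches.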

For the record, /<(.*?)>/ is considered a fairly "bad" regular expression. You should work with something more like /<[^>]+>/. A cookie to whoever can explain why that's better ;)

Yours will not miss line break characters. If the tag spans multiple lines, the "bad" regular expression will fail.

Edited by jcarmody
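A quick check of that point in Python's `re` module (not LabVIEW, so treat it as an illustration of the regex behaviour only). The multi-line sample string is invented for the test.

```python
import re

lazy_dot = re.compile(r"<(.*?)>")       # the original pattern
negated_class = re.compile(r"<[^>]+>")  # the suggested pattern

# Hypothetical input: the first tag is split across two lines.
multi_line = '<tr\n id = "ololo">Hello World <br> !!!</tr>'

# '.' does not match '\n' by default, so the first tag survives.
lazy_result = lazy_dot.sub("", multi_line)

# '[^>]' matches anything except '>', including '\n'.
negated_result = negated_class.sub("", multi_line)

print(repr(lazy_result))
print(repr(negated_result))
```

The negated class also never needs to backtrack past a `>`, which is the other usual argument for it.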
The Match Regular Expression node does have a multiline parameter to account for that; when set to True, the ^ and $ anchors no longer match line endings and the . wildcard will also match \r and \n. But more importantly, you should rarely, if ever, write a regex that uses the . wildcard. You might compare it to global variables in LV: there are valid use cases, but they are few and far between. Using more specific matching makes debugging and readability much more straightforward.
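For comparison only: in Python's `re` module, anchor behaviour and dot behaviour are controlled by two separate flags, `re.MULTILINE` and `re.DOTALL`. The sketch below shows just the Python side; how LabVIEW's multiline input maps onto these is not something it demonstrates.

```python
import re

tag = '<tr\n id = "ololo">'  # a tag that spans two lines

# Default: '.' stops at the newline, so the lazy pattern finds nothing.
default_match = re.search(r"<.*?>", tag)

# re.MULTILINE only changes where ^ and $ may match; '.' is unaffected.
multiline_match = re.search(r"<.*?>", tag, re.MULTILINE)

# re.DOTALL lets '.' match '\n', so the whole tag is found.
dotall_match = re.search(r"<.*?>", tag, re.DOTALL)

print(default_match, multiline_match, dotall_match.group() == tag)
```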

Though my solution does have a flaw: if there's a nested > (within an attribute, for example), the regex will break. This is why regular expressions are almost never the correct solution for HTML/XML/*ML problems; it's a job for a proper parser (TidyHTML, for example).
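To make the nested `>` case concrete, here is a small Python sketch. It swaps in the standard library's `html.parser` as the "proper parser" (not TidyHTML, which the post names), and the class `TextOnly`, the helper `strip_tags`, and the sample string are all invented for illustration.

```python
import re
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Keep only the text between tags and discard the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def strip_tags(markup):
    parser = TextOnly()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)

# Hypothetical input with a '>' inside a quoted attribute value.
tricky = '<a title="a > b">link</a> text'

regex_result = re.sub(r"<[^>]+>", "", tricky)
parser_result = strip_tags(tricky)

print(repr(regex_result))   # the regex cuts the tag off at the first '>'
print(repr(parser_result))  # the parser understands the quoted attribute
```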

I have the expression: <tr id = "ololo">Hello World <br> !!!</tr>

The output should be: Hello World !!!

But it was: My sequence:Hello World <br> !!!

This library (Perl regexp) works fine with Python, so this regexp gets the right result there, but not in LabVIEW.

This Python script:

    import re

    if __name__ == '__main__':
        data = '''<tr id = "ololo">Hello World <br> !!!</tr>'''
        table_regex = re.compile("<(.*?)>", re.IGNORECASE)
        print("FIRST EXRESSION ---------------------")
        print(table_regex.search(data).group())
        print("SECOND EXRESSION---------------------")
        p2 = data[table_regex.search(data).end():]
        print(table_regex.search(p2).group())
        print("THIRD EXRESSION---------------------")
        p3 = p2[table_regex.search(p2).end():]
        print(table_regex.search(p3).group())

returns:

FIRST EXRESSION ---------------------

<tr id = "ololo">

SECOND EXRESSION---------------------

<br>

THIRD EXRESSION---------------------

</tr>

So this regexp should work fine.
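For reference, the tag removal itself can be done in Python in a single substitution rather than three chained searches. This is only a sketch on the sample string from this thread; the whitespace clean-up step is an assumption about what the desired output "Hello World !!!" implies.

```python
import re

data = '<tr id = "ololo">Hello World <br> !!!</tr>'
tag_regex = re.compile(r"<(.*?)>")

# Replace every match with an empty string in one pass.
without_tags = tag_regex.sub("", data)

# Removing <br> leaves two adjacent spaces; collapse runs of whitespace.
tidy = " ".join(without_tags.split())

print(repr(without_tags))
print(tidy)
```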

What's the problem? What should the output be?

Oh, you're right, thanks.
