LabView Regexp Problem

zlocm · October 15, 2010

I am trying to delete all HTML tags (like <.....>) from a string, but LabVIEW doesn't find all of the tags in my string.

My sequence: <tr id = "ololo">Hello World !!!</tr>

Regexp: <(.*?)>

My VI (LV 2009) is attached.

Regards.

Phillip Brooks · October 15, 2010

Funny, I just recently posted a VI (on the dark side) that I wrote a while back to remove HTML tags from TestStand HTML reports. Maybe it will help you.

There are some other regex nuggets in the thread...

http://forums.ni.com/t5/BreakPoint/Regular-Expressions-Board/m-p/1269088#M14343

jcarmody · October 15, 2010

How about this?

zlocm · October 15, 2010

Thanks for the advice!

But the regexp still has a problem...

Edited October 15, 2010 by zlocm

jcarmody · October 15, 2010

What's the problem? What should the output be?

asbo · October 15, 2010

For the record, /<(.*?)>/ is considered a fairly "bad" regular expression. You should work with something more like /<[^>]+>/. A cookie to whoever can explain why that's better.

jcarmody · October 15, 2010

As for the problem in your original VI, you're using Shift Registers with the Match Regular Expression node incorrectly. You can't use the Offset After Match if you're only going to search what was found before and after the previous match, because that offset refers to the string that was just searched, not to the shortened one. Savvy?
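A rough Python analogue of that pitfall may help. It is not a reconstruction of the attached VI (which is not shown here), and the function names `strip_tags_buggy` and `strip_tags_fixed` are made up for illustration. The point is only that an offset is relative to the string that was searched, so it cannot be reused on a string that has since been shortened.

```python
import re

TAG = re.compile(r"<[^>]+>")

def strip_tags_buggy(text):
    """Rebuilds the string AND reuses the offset from the old string."""
    offset = 0
    while True:
        m = TAG.search(text, offset)
        if m is None:
            return text
        # Cut the match out of the string ...
        text = text[:m.start()] + text[m.end():]
        # ... but keep the end offset measured in the OLD, longer string.
        offset = m.end()

def strip_tags_fixed(text):
    """Rebuilds the string and resumes at the cut position in the NEW string."""
    offset = 0
    while True:
        m = TAG.search(text, offset)
        if m is None:
            return text
        text = text[:m.start()] + text[m.end():]
        offset = m.start()

data = '<tr id = "ololo">Hello World !!!</tr>'
print(repr(strip_tags_buggy(data)))  # the closing tag is skipped
print(repr(strip_tags_fixed(data)))
```

In the buggy variant the search resumes past the start of the closing tag, so that tag is never found.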

Yours will not miss tags that contain line break characters. If the tag spans multiple lines, the "bad" regular expression will fail, because . does not match a line break by default.
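A small sketch of that difference, using Python's `re` module as a stand-in (LabVIEW's node may differ in details). The input string with a line break inside the opening tag is an invented example.

```python
import re

# Hypothetical input: the opening tag is split across two lines.
html = '<tr\n id = "ololo">Hello World !!!</tr>'

lazy_dot = re.compile(r"<(.*?)>")   # "." stops at the line break
negated = re.compile(r"<[^>]+>")    # "[^>]" matches anything except ">"

lazy_hits = [m.group(0) for m in lazy_dot.finditer(html)]
negated_hits = [m.group(0) for m in negated.finditer(html)]

print(lazy_hits)     # the two-line opening tag is missed
print(negated_hits)  # both tags are found
```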

Edited October 15, 2010 by jcarmody

asbo · October 15, 2010

[Image: half-eaten cookie]

The Match Regular Expression node does have a multiline parameter, but it only changes the anchors: when set to True, ^ and $ match at the start and end of every line instead of only at the start and end of the whole string. It does not make the . wildcard match line breaks; in PCRE that is what the (?s) modifier is for. But more importantly, you should rarely, if ever, write a regex that uses the . wildcard. You might compare it to global variables in LV: there are valid use cases, but they are few and far between. Using more specific matching makes debugging and readability much more straightforward.
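As an illustration only, here is how Python's `re` module handles anchors and the . wildcard, where the two behaviours are controlled by separate flags (`re.MULTILINE` and `re.DOTALL`). This is Python, not the LabVIEW node, and the sample text is invented.

```python
import re

text = "first line\n<b\nclass='x'>second line"

# MULTILINE changes only the anchors: ^ matches at the start of every line.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)

# Without DOTALL, "." stops at a line break, so the two-line tag is not found.
without_dotall = re.findall(r"<.*?>", text)

# With DOTALL, "." matches line breaks as well.
with_dotall = re.findall(r"<.*?>", text, re.DOTALL)

print(line_starts)
print(without_dotall)
print(with_dotall)
```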

Though my solution does have a flaw: if there's a nested > (within an attribute, for example), the regex will break. This is why regular expressions are almost never the correct solution for HTML/XML/*ML problems; it's a job for a proper parser (TidyHTML, for example).
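A minimal sketch of the parser approach, using `html.parser` from Python's standard library rather than TidyHTML. The class name `TextExtractor`, the function `text_only`, and the sample input with a > inside an attribute are all assumptions made for this example.

```python
import re
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects only the text between tags."""
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def text_only(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)

# Hypothetical input with a ">" inside a quoted attribute value.
html = '<tr title="a > b">Hello World !!!</tr>'

regex_result = re.sub(r"<[^>]+>", "", html)  # cuts the tag off too early
parser_result = text_only(html)

print(repr(regex_result))
print(repr(parser_result))
```

The regex treats the > inside the attribute as the end of the tag and leaves part of the attribute behind, while the parser knows about quoted attribute values.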

ShaunR · October 15, 2010

....debugging and readability much more straightforward.

This is regex we're talking about

zlocm · October 15, 2010

My input string is: <tr id = "ololo">Hello World !!!</tr>

The output should be: Hello World !!!

But it was: My sequence:Hello World !!!

The same Perl-compatible regexp works fine in Python and gives the right result there, but not in LabVIEW.

This Python script:

import re

if __name__ == '__main__':
    data = '''<tr id = "ololo">Hello World <br> !!!</tr>'''
    table_regex = re.compile("<(.*?)>", re.IGNORECASE)
    print("FIRST EXPRESSION ---------------------")
    print(table_regex.search(data).group())
    print("SECOND EXPRESSION---------------------")
    # Search again in what follows the first match
    p2 = data[table_regex.search(data).end():]
    print(table_regex.search(p2).group())
    print("THIRD EXPRESSION---------------------")
    # Search again in what follows the second match
    p3 = p2[table_regex.search(p2).end():]
    print(table_regex.search(p3).group())

returns:

FIRST EXPRESSION ---------------------
<tr id = "ololo">
SECOND EXPRESSION---------------------
<br>
THIRD EXPRESSION---------------------
</tr>

So this regexp should work fine.
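For comparison, the chained searches can be collapsed into single calls in Python. This is only a sketch of the same pattern applied to the same sample string; it says nothing about how the LabVIEW node should be wired.

```python
import re

data = '<tr id = "ololo">Hello World <br> !!!</tr>'
tag_regex = re.compile(r"<(.*?)>")

# Every tag in order, without slicing the string by hand.
tags = [m.group(0) for m in tag_regex.finditer(data)]

# The string with every tag removed.
stripped = tag_regex.sub("", data)

print(tags)
print(repr(stripped))
```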

Oh, you're right, thanks.
