Pattern Match performance

dannyt · April 7, 2011

Hi

I was looking at Ton Plomp's Mercurial API over the last few days (I hope to announce why in a few weeks) and I noticed a piece of code similar to this

It took me a couple of seconds to see it was a simple pattern match. As I always use the Match Pattern primitive, or for something more complicated the Match Regular Expression primitive, I was intrigued as to why he did it this way. Looking further, I found out something I suspect many of you know already.

I ran some tests using Ton's method, Match Pattern and Match Regular Expression, each just matching something at the start of a string. I ran each in a loop 500,000 times.
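LabVIEW diagrams cannot be shown as text, so here is a rough Python analogy of this kind of test rather than the actual VI. It times three ways of checking a prefix: a direct comparison of the leading characters, a built-in prefix check, and an anchored regular expression. The prefix, the input line and all function names are made up for illustration, and Ton's actual code is not reproduced in this thread, so none of these should be read as his method.

```python
import re
import timeit

# Hypothetical prefix and input line, chosen only for illustration.
PREFIX = "changeset:"
LINE = "changeset:   42:abcdef012345"

# Anchored regex compiled once, outside the timed loop.
ANCHORED = re.compile(r"^changeset:")


def match_by_slice(text: str) -> bool:
    """Compare the leading characters of the string directly."""
    return text[:len(PREFIX)] == PREFIX


def match_by_startswith(text: str) -> bool:
    """Use the built-in prefix check."""
    return text.startswith(PREFIX)


def match_by_regex(text: str) -> bool:
    """Use the regex engine for the same prefix test."""
    return ANCHORED.match(text) is not None


def time_all(iterations: int = 500_000) -> dict:
    """Return the total seconds each method takes for `iterations` calls."""
    methods = {
        "slice": match_by_slice,
        "startswith": match_by_startswith,
        "regex": match_by_regex,
    }
    return {
        name: timeit.timeit(lambda fn=fn: fn(LINE), number=iterations)
        for name, fn in methods.items()
    }


if __name__ == "__main__":
    for name, seconds in time_all().items():
        print(f"{name:>12}: {seconds:.3f} s")
```

The absolute numbers will have nothing to do with LabVIEW's, but the shape of the test is the same: identical input, identical iteration count, one method per run.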

EDIT: Picture updated, thank you Shaun.

I think I knew Match Regular Expression was a "heavier" VI than Match Pattern (I think it says so in the documentation somewhere), but it was interesting to see it was so much slower than the other two methods. I am sure that at times in my code I have used Match Regular Expression where it is not really called for. Often I was trying to match something more complicated; then, when the matching was optimised, I never changed back to the simpler Match Pattern.

It is interesting to see that the other two methods perform the same in terms of time, but Ton's method only uses ~75K for my test VI compared with ~210K for Match Pattern.

Anyway, thanks Ton. I must look at more of the LAVA CR code; it is so interesting seeing how others do things.

cheers

Dannyt

ShaunR · April 7, 2011

The regular expression primitive is actually an xnode (the tell-tale busy cursor whilst it generates the code).

Once the code is generated by the xnode, your string is passed to the regex function in labview.exe (MatchRegExpEfficient). But when it returns, it is parsed in G code using several split string functions to split out to the terminals. So not only do you have the regex engine overhead (which may or may not be the same as Match Pattern; I don't think it is), you also have 4 or 5 (more if you grow the terminals) additional split strings. So at best, the regex function would be 4 or 5 times slower, even without the call to LabVIEW.
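To picture the extra work after the engine call, here is a small Python sketch. It is not the code the xnode generates; it only mimics the idea of one regex call followed by several string splits to produce separate outputs. The names `MatchParts` and `match_and_split` are invented for this example.

```python
import re
from typing import NamedTuple, Optional, Tuple


class MatchParts(NamedTuple):
    """The separate outputs produced from a single regex match."""
    before: str
    match: str
    after: str
    submatches: Tuple[Optional[str], ...]


def match_and_split(pattern: str, text: str) -> Optional[MatchParts]:
    """Run one regex search, then slice the input into the separate outputs."""
    found = re.search(pattern, text)
    if found is None:
        return None
    # Each slice below is extra string work on top of the engine call itself.
    return MatchParts(
        before=text[:found.start()],
        match=found.group(0),
        after=text[found.end():],
        submatches=found.groups(),
    )
```

For example, `match_and_split(r"(\d+):(\w+)", "changeset: 42:abcdef tip")` gives `"changeset: "` before the match, `"42:abcdef"` as the match, `" tip"` after it, and the two submatches `"42"` and `"abcdef"`. Every extra capture group adds another output to fill.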

But match pattern (as blazingly fast as it is) becomes unwieldy and difficult to use on selective string matching (where something may or may not be present - like area codes in a telephone number). That's when you need the regex.
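As an illustration of the "may or may not be present" case, here is a minimal Python sketch using an optional group for the area code. The number format, such as `(555) 123-4567` or `123-4567`, is an assumption made for the example, and the syntax is Python's `re`, not LabVIEW's.

```python
import re
from typing import Optional, Tuple

# Assumed format: an optional "(ddd) " area code, then "ddd-dddd".
PHONE = re.compile(r"^(?:\((\d{3})\) ?)?(\d{3})-(\d{4})$")


def parse_phone(text: str) -> Optional[Tuple[Optional[str], str, str]]:
    """Return (area_code, exchange, number); area_code is None when absent."""
    found = PHONE.match(text)
    if found is None:
        return None
    area_code, exchange, number = found.groups()
    return (area_code, exchange, number)
```

With this pattern, `parse_phone("(555) 123-4567")` returns `("555", "123", "4567")` and `parse_phone("123-4567")` returns `(None, "123", "4567")`. One expression covers both forms, which is the sort of thing that gets awkward with a simpler matcher.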

NB:

Your image shows 75k for Match Pattern and 210k for Ton's.

Edited April 7, 2011 by ShaunR

jgcode · April 7, 2011

Yer, MP does not support all of the same special characters as regex, so I too jump between them for speed or functionality to meet requirements.

ShaunR - how do you know so much about the LabVIEW internals on these functions (which is great btw)?

ShaunR · April 7, 2011

how do you know so much about the LabVIEW internals on these functions (which is great btw)?

Just right click and select "generate code". Nothing clever. Even I can do it

Edited April 7, 2011 by ShaunR

dannyt · April 7, 2011

Just right click and select "generate code". Nothing clever. Even I can do it

wow

Meant to give +1 to Shaun but gave it to jgcode, hey, so both have +1 :thumbup1:

Daklu · April 7, 2011

Danny, you should add this to the Wiki. I've often thought there should be an optimization section with tips and tricks (mainly because I can never remember them all)--this would be perfect.

(2 :star: for Shaun! One for knowing the details and one for telling us how to learn them ourselves. :thumbup1: )

Rolf Kalbermatter · April 8, 2011

The regular expression primitive is actually an xnode (the tell-tale busy cursor whilst it generates the code).

Once the code is generated by the xnode, your string is passed to the regex function in labview.exe (MatchRegExpEfficient). But when it returns, it is parsed in G code using several split string functions to split out to the terminals. So not only do you have the regex engine overhead (which may or may not be the same as Match Pattern; I don't think it is), you also have 4 or 5 (more if you grow the terminals) additional split strings. So at best, the regex function would be 4 or 5 times slower, even without the call to LabVIEW.

But match pattern (as blazingly fast as it is) becomes unwieldy and difficult to use on selective string matching (where something may or may not be present - like area codes in a telephone number). That's when you need the regex.

The internal MatchPattern function is similar to regex but by far not as extensive. I haven't timed MatchPattern against a normal PCRE call yet. Personally I think MatchPattern, while being a nice function, most likely isn't as optimized as PCRE, but since it does less and is therefore simpler, it still likely does its thing faster than PCRE. Since the MatchPattern function has existed in LabVIEW since at least version 3 or 4, and since it would be way too tricky to develop a wrapper around PCRE that behaves EXACTLY like the old Match Pattern function, I'm very sure it is still the old function.

On the other hand, I just recently ran into a problem where the behaviour of Match Pattern was changed. First they changed the documentation in 8.2 and then they changed the functionality to match the documentation in 8.5!! It struck me as very sneaky.
