Pattern Match performance


Hi

I was looking at Ton Plomp's Mercurial API over the last few days (I hope to announce why in a few weeks) and I noticed a piece of code similar to this

[Image attachment: post-7256-0-24320300-1302165188_thumb.pn]

It took me a couple of seconds to see that it was a simple pattern match. As I always use the Match Pattern primitive or, for something more complicated, the Match Regular Expression primitive, I was intrigued as to why he did it this way. Looking further, I found out something I suspect many of you know already.

I ran some tests using Ton's method, Match Pattern and Match Regular Expression, each just matching something at the start of a string. I ran each in a loop 500000 times.

EDIT: Picture updated, thank you Shaun.

[Image attachment: post-7256-0-96164400-1302181214_thumb.pn]

I think I knew Match Regular Expression was a "heavier" VI than Match Pattern (I think it says so in the documentation somewhere), but it was interesting to see that it was so much slower than the other two methods. I am sure that at times in my code I have used Match Regular Expression where it is not really called for. Often I was trying to match something more complicated, and then, once the matching was optimised, I never changed back to the simpler Match Pattern.

It is interesting to see that the other two methods take the same time, but Ton's method only uses ~75K for my test VI, against ~210K for Match Pattern.

Anyway, thanks Ton. I must look at more of the LAVA CR code; it is so interesting seeing how others do things.
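For anyone who wants to try the same kind of comparison without LabVIEW to hand, here is a rough text-language analogue in Python. It is only a sketch of the test idea (match something at the start of a string, in a loop, once with a plain prefix check and once with a regular expression). It does not reproduce Ton's code or the LabVIEW primitives, the sample line and the names `match_prefix_plain`, `match_prefix_regex` and `time_both` are made up for illustration, and the timings it prints say nothing about LabVIEW itself.

```python
import re
import timeit

# Hypothetical sample data: a made-up line and the prefix we look for.
PREFIX = "changeset:"
LINE = "changeset:   42:abcdef012345"

# Compile once, outside the loop, so only the matching itself is timed.
PREFIX_RE = re.compile(r"^changeset:")


def match_prefix_plain(line: str) -> bool:
    """Plain prefix check, no pattern engine involved."""
    return line.startswith(PREFIX)


def match_prefix_regex(line: str) -> bool:
    """The same check done with a regular expression."""
    return PREFIX_RE.match(line) is not None


def time_both(iterations: int = 500_000) -> dict:
    """Time each approach over the given number of iterations (seconds)."""
    return {
        "plain": timeit.timeit(lambda: match_prefix_plain(LINE), number=iterations),
        "regex": timeit.timeit(lambda: match_prefix_regex(LINE), number=iterations),
    }


if __name__ == "__main__":
    for name, seconds in time_both().items():
        print(f"{name}: {seconds:.3f} s")
```

Both functions give the same answer for a start-of-string match, so any difference in the printed times is the cost of the matching machinery on your machine.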

cheers

Dannyt


The regular expression primitive is actually an xnode (the tell-tale busy cursor whilst it generates the code).

Once the code is generated by the xnode, your string is passed to the regex function in labview.exe (MatchRegExpEfficient). But when it returns, the result is parsed in G code using several split string functions to split it out to the terminals. So not only do you have the regex engine overhead (which may or may not be the same as Match Pattern; I don't think it is), you also have 4 or 5 (more if you grow the terminals) additional split strings. So at best, the regex function would be 4 or 5 times slower, even without the call to LabVIEW.

But match pattern (as blazingly fast as it is) becomes unwieldy and difficult to use on selective string matching (where something may or may not be present - like area codes in a telephone number). That's when you need the regex.
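To make the "may or may not be present" case concrete, here is a small sketch in Python rather than LabVIEW. The phone number format (an optional area code in parentheses, then `ddd-dddd`) and the name `parse_phone` are assumptions chosen purely for illustration, and Python's `re` module is standing in for the regex primitive, so treat it as an example of the idea only.

```python
import re

# Assumed format for illustration: optional "(ddd)" area code, then "ddd-dddd".
# The outer (?: ... )? makes the whole area-code part optional.
PHONE_RE = re.compile(r"^(?:\((\d{3})\)\s*)?(\d{3})-(\d{4})$")


def parse_phone(text: str):
    """Return the parts of a phone number, or None if the text does not match."""
    m = PHONE_RE.match(text.strip())
    if m is None:
        return None
    area, exchange, number = m.groups()
    # 'area' is None when the optional area code was not present.
    return {"area": area, "exchange": exchange, "number": number}


if __name__ == "__main__":
    print(parse_phone("(555) 123-4567"))
    print(parse_phone("123-4567"))
```

One expression handles both forms and hands back the pieces separately, which is the sort of thing that gets awkward with a simpler pattern matcher.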

NB:

Your image shows 75k for the Match Pattern and 210k for Ton's.

Edited by ShaunR

Yeah, MP does not support all of the same special characters as regex, so I too jump between them, for speed or functionality, to meet requirements.

ShaunR - how do you know so much about the LabVIEW internals on these functions (which is great btw)?


Danny, you should add this to the Wiki. I've often thought there should be an optimization section with tips and tricks (mainly because I can never remember them all)--this would be perfect.

(2 :star: for Shaun! One for knowing the details and one for telling us how to learn them ourselves. :thumbup1: )


The internal MatchPattern function is similar to regex but by far not as extensive. I haven't timed MatchPattern against a normal PCRE call yet. Personally I think MatchPattern, while being a nice function, most likely isn't as optimized as PCRE, but since it does less and is therefore simpler, it still likely does its thing faster than PCRE. Since the MatchPattern function has existed in LabVIEW since at least version 3 or 4, and since it would be way too tricky to develop a wrapper around PCRE that behaves EXACTLY like the old Match Pattern function, I'm very sure it is still the old function.

On the other hand, I just recently ran into a problem where the behaviour of Match Pattern was changed. First they changed the documentation in 8.2 and then they changed the functionality to match the documentation in 8.5!! It struck me as very sneaky.
