jgcode

MD5 Hash (MD5 Package)


This OpenG Review is closed. See Summary Post here. Please start a new thread to discuss new changes to this VI. Please PM me if there are any issues with this thread.

Community,

This VI was in the Candidates folder for the String Package.

post-10325-0-01203200-1314576207.png

It has been sitting in there for a while; therefore, I have just gone ahead and posted it as is (so the license will be migrated on confirmation from the author, etc.).

What are your thoughts on this VI?

  • Would you like to see such a function in OpenG?
  • Can you optimize the code?
  • Would it be better suited to another package, e.g. the Comparison Package?
  • Should it be rejected?

Kind regards

Jonathon Green

OpenG Developer

Is an MD5.vi

TEST - Is an MD5.vi

Code is in LabVIEW 2009


I propose that we use RegEx and clean up the BD.

I am not sure of the best way to handle the error; the above is an example that maintains the VI's interface.

post-10325-0-62032400-1314606524.png

Is an MD5.vi

Code is in LabVIEW 2009


A regex would be slower. I would ignore the error (if using Match Regular Expression), and \w will match non-hex characters.

post-7534-0-10196800-1314612257.png

post-7534-0-40661600-1314612644_thumb.pn


The RFC states that it must be lower case...

http://www.ietf.org/rfc/rfc2831.txt

Let HEX(n) be the representation of the 16 octet MD5 hash n as a string of 32 hex digits (with alphabetic characters always in lowercase, since MD5 is case sensitive).
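
For readers without LabVIEW, the difference between \w and an explicit hex character class can be sketched in Python's re module. This is only an illustration of the point being discussed, not the VI's implementation; the pattern names are made up for the example.

```python
import re

# Hypothetical patterns for illustration; the real VI is a LabVIEW diagram.
LOOSE = re.compile(r"\w{32}")          # any 32 "word" characters
STRICT = re.compile(r"[0-9a-f]{32}")   # 32 lowercase hex digits, as in RFC 2831's HEX(n)

valid = "d41d8cd98f00b204e9800998ecf8427e"   # MD5 of the empty string
not_hex = "g" * 32                           # right length, but not hex
upper = valid.upper()                        # hex digits, but uppercase

for candidate in (valid, not_hex, upper):
    # fullmatch requires the whole string to match, not just a prefix.
    print(candidate, bool(LOOSE.fullmatch(candidate)), bool(STRICT.fullmatch(candidate)))
```

The loose pattern accepts all three candidates, while the strict one accepts only the lowercase hex string.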

Edited by Phillip Brooks

A regex would be slower. I would ignore the error (if using Match Regular Expression), and \w will match non-hex characters.

post-7534-0-10196800-1314612257.png

post-7534-0-40661600-1314612644_thumb.pn

Yes, RegEx is slower, but your code with Match Pattern does not pass the Test VI.

I don't think Match Pattern supports quantifiers, but I am not sure. (I always find myself looking up the Match Pattern help, as it's different from RegEx.)

Can anyone confirm whether we can use Match Pattern (MP) for this? It is much faster.

Anyway, here is the fix: instead of matching any alphanumeric character, it now specifies a character class, based on Phillip's post.

(Sorry, I cut, pasted and edited from the GUID thread - thanks for pointing that out.)

I had to update the original Test VI to fail MD5s that contain characters in the [A-F] range (see new VI attached).

Is an MD5.vi

TEST - Is an MD5.vi

Code is in LabVIEW 2009

post-10325-0-68391200-1314619191.png


My only comment is that the Regex doesn't need the pipe. I don't understand why Match Pattern doesn't work, but it definitely doesn't.


My only comment is that the Regex doesn't need the pipe. I don't understand why Match Pattern doesn't work, but it definitely doesn't.

I can take the alternation metacharacter out if no one likes it in there - no dramas. (Unsure if it impacts performance or not).


Agreed, the alternation character is unnecessary. It'll work just the same, but it changes the logic of the expression (though it may have no performance impact).

The Match Pattern node does not support finite quantifiers, but dot and star work as expected. Because string length is an O(1) operation, maybe this is worth considering:

post-13461-0-69352700-1314629366.png

Edit: Oops, had a-z as the character range, when it should have been a-f.
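
The idea in this post (check the length separately, then use an unbounded pattern that needs no finite quantifier) can be sketched in Python for readers who cannot open the snippet. This is a rough analogue under the assumption that the diagram does a length test plus a star pattern; the function name is hypothetical.

```python
import re

# Unbounded pattern: hex characters only, any count (no {32} quantifier needed).
HEX_ONLY = re.compile(r"[0-9a-f]*")

def looks_like_md5(text: str) -> bool:
    """Return True if text is exactly 32 lowercase hex characters."""
    # The length check is cheap, so do it first and scan characters only if it passes.
    # fullmatch is used instead of ^...$ because $ would also accept a trailing newline.
    return len(text) == 32 and HEX_ONLY.fullmatch(text) is not None

print(looks_like_md5("d41d8cd98f00b204e9800998ecf8427e"))
print(looks_like_md5("d41d8cd98f00b204"))
```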


Agreed, the alternation character is unnecessary. It'll work just the same, but it changes the logic of the expression (though it may have no performance impact).

The Match Pattern node does not support finite quantifiers, but dot and star work as expected. Because string length is an O(1) operation, maybe this is worth considering:

That's quite clever, and would be faster than RegEx.


My question... why would you ever want to test if a string looks like it is an MD5 string?

Most likely for data validation.

Of course, the main reason for the review is to verify whether this would be a useful function (and then how to implement it, if it is useful).

  • Should it be rejected?

So, please continue to let us know either way.

My question... why would you ever want to test if a string looks like it is an MD5 string? Further, I would much rather see VIs in OpenG that could create SHA1 and SHA2 hashes. Maybe ask Ton... https://bitbucket.org/tcplomp/labviewencryption/src/ee5c00d513e7

I sort of agree with the question above: what is the use case for checking that a string just looks like an MD5 string? It is either a correct, valid MD5 string for something or it is not. Maybe I am just not seeing the obvious here.

Danny


I sort of agree with the question above: what is the use case for checking that a string just looks like an MD5 string? It is either a correct, valid MD5 string for something or it is not. Maybe I am just not seeing the obvious here.

Danny

Let's say you have a very large file that you want to verify (the LabVIEW 2011 image :) )

Before extracting and installing, you might want to verify the file. The time to generate the MD5 from the downloaded file may be significant. If you are going to compare the result of a time-consuming operation with an invalid string, you will be wasting time (by re-downloading a file that may be good).

It is not easy to visually read or count the characters of an MD5 checksum...
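
The workflow described here can be sketched in Python: validate the expected checksum string before spending time hashing the file. This is a minimal sketch, not part of the VI under review; the function name is hypothetical, and accepting either letter case is an assumption made for the example.

```python
import hashlib
import re

# Assumption for this sketch: accept upper- or lowercase hex digits.
MD5_HEX = re.compile(r"[0-9a-fA-F]{32}")

def verify_download(path: str, expected_md5: str) -> bool:
    """Compare a file's MD5 with expected_md5, rejecting malformed input early."""
    expected = expected_md5.strip()
    if MD5_HEX.fullmatch(expected) is None:
        # Fail before the expensive hashing step, not after it.
        raise ValueError("expected checksum is not a 32-digit hex string")

    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower()
```

A mistyped or truncated checksum then fails immediately, rather than after the whole file has been hashed.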


If we are going to implement this, I wouldn't want it to be strict regarding case; one of the reasons is that LabVIEW's %x formatter outputs uppercase hex values.

Regarding the SHA hashes: yes, my intention is to offer that code in the OpenG package. However, I haven't had time to write unit tests for the SHA and HMAC hashing, though I have a complete set of unit tests for the AES cipher/decipher routines.

Ton


In summary:

  • Add this VI to OpenG
  • Move to MD5 package
  • Allow non-strict capitalization
  • Implement code similar to this:

post-13461-0-69352700-1314629366.png

With the non-strict capitalization issue, could an optional Boolean flag called "Strict Capitalization" (default = F) be added to the interface?

Would that be a good feature to have?
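
As a rough textual analogue of the proposed interface, here is a Python sketch in which the flag simply selects which pattern is used. The names is_md5 and strict_case are hypothetical stand-ins for the VI and its "Strict Capitalization" input; treating strict as lowercase-only follows the RFC text quoted earlier and is an assumption.

```python
import re

LOWER_ONLY = re.compile(r"[0-9a-f]{32}")     # strict: lowercase hex only
ANY_CASE = re.compile(r"[0-9a-fA-F]{32}")    # non-strict: either case

def is_md5(text: str, strict_case: bool = False) -> bool:
    """Return True if text looks like an MD5 hex string.

    strict_case defaults to False, mirroring the proposed default of F.
    """
    pattern = LOWER_ONLY if strict_case else ANY_CASE
    return pattern.fullmatch(text) is not None

print(is_md5("D41D8CD98F00B204E9800998ECF8427E"))                    # default, non-strict
print(is_md5("D41D8CD98F00B204E9800998ECF8427E", strict_case=True))  # strict
```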


If we are going to use this functionality, we need to change the MD5 hash VI to add an extra output.

Currently the VI outputs a binary string, which should be converted to a hex string to produce a valid MD5 hash.
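
The binary-versus-hex distinction can be illustrated with Python's hashlib, used here only as a stand-in for the OpenG MD5 VI: the raw digest is 16 bytes, and the familiar 32-character form is its hex encoding.

```python
import hashlib

raw = hashlib.md5(b"abc").digest()   # 16 raw bytes (the "binary string")
hex_text = raw.hex()                 # 32 lowercase hex characters

print(len(raw), len(hex_text))       # 16 32
print(hex_text)                      # 900150983cd24fb0d6963f7d28e17f72
```

Only the hex form would pass an "is this an MD5 string" check like the one discussed in this thread.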


I'm not sure I see the value in implementing a strict option. Unless you're actually evaluating whether something follows the MD5 spec, the case of your hex string is irrelevant to the data it contains. There's trivial overhead in exposing the option to do this, but the execution of a mixed-case regular expression suffers a ~20% performance penalty in some brief benchmarks.

I like Ton's idea RE: the hex string and would actually go as far as to say that the string should be the only output - I've never had a use case where I need an MD5 in binary. It's always been for display or inclusion as a checksum (like VIPM does for package files and the like).

...but the execution of a mixed-case regular expression suffers ~20% performance penalty in some brief benchmarks.

No, I was thinking that the flag would select the correct RegEx string to use in the MP primitive. So overhead should be negligible.


No, I was thinking that the flag would select the correct RegEx string to use in the MP primitive. So overhead should be negligible.

Right, I was comparing the performance of /^[0-9a-f]$/ versus /^[0-9a-fA-F]$/. The addition of the second range of characters is where the 20% comes from.

(I should have included a snippet...)
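
In place of the missing snippet, a comparison of this kind could be sketched in Python with timeit. The ~20% figure above came from LabVIEW and should not be expected to carry over; the {32} quantifier and all names here are assumptions for the sketch.

```python
import re
import timeit

LOWER = re.compile(r"[0-9a-f]{32}")      # single character range
MIXED = re.compile(r"[0-9a-fA-F]{32}")   # second range added for uppercase
SAMPLE = "d41d8cd98f00b204e9800998ecf8427e"

def time_pattern(pattern, repeats=100_000):
    """Return the total seconds taken to match SAMPLE `repeats` times."""
    return timeit.timeit(lambda: pattern.fullmatch(SAMPLE), number=repeats)

if __name__ == "__main__":
    print("lowercase only:", time_pattern(LOWER))
    print("mixed case:    ", time_pattern(MIXED))
```

Results vary between runs and regex engines, so several repetitions would be needed before drawing any conclusion.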


My only complaints are that of verbiage:

* "Strict Character Case" vs. "Strict Capitalization"

* "Is an MD5? returns TRUE if Input String represents an MD5 hash, otherwise it returns FALSE."

* "Valid characters are ..." vs. "Expected characters are ..."

Other than that, thanks for dedicating the time to this :)


Other than that, thanks for dedicating the time to this :)

Likewise (to everyone who's posted too)

* "Is an MD5? returns TRUE if Input String represents an MD5 hash, otherwise it returns FALSE."

I changed everything else you mentioned, but to be consistent I use the variable names, so "MD5?" instead of "Is an MD5?".

post-10325-0-29527900-1316100425_thumb.p

post-10325-0-77081100-1316100343_thumb.p

Is an MD5.vi

TEST - Is an MD5.vi

Code is in LabVIEW 2009.


I changed everything else you mentioned, but to be consistent I use the variable names, so "MD5?" instead of "Is an MD5?".

Ahh, that makes sense. Looks good!

This topic is now closed to further replies.