MD5 Hash (MD5 Package)


This OpenG Review is closed. See Summary Post here. Please start a new thread to discuss new changes to this VI. Please PM me if there are any issues with this thread.

Community,

This VI was in the Candidates folder for the String Package.

post-10325-0-01203200-1314576207.png

It has been sitting in there for a while, so I have gone ahead and posted it as is (the license will be migrated on confirmation from the author, etc.).

What are your thoughts on this VI?

  • Would you like to see such a function in OpenG?
  • Can you optimize the code?
  • It may be better suited in e.g. Comparison Package?
  • Should it be rejected?

Kind regards

Jonathon Green

OpenG Developer

Is an MD5.vi

TEST - Is an MD5.vi

Code is in LabVIEW 2009

Link to comment

A Regex would be slower. I would ignore the error (if using Match Regular Expression), and note that \w will also match non-hex characters.
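To illustrate the point about \w, here is a minimal sketch in Python rather than LabVIEW (the thread's actual expressions are only shown in the attached images, so both patterns below are assumptions). It shows that a 32-character "word" pattern accepts strings that are not hexadecimal, while an explicit character class rejects them.

```python
import re

# Hypothetical patterns for illustration; the originals live in the VI images.
LOOSE = re.compile(r"\w{32}")            # any word character, 32 times
HEX_ONLY = re.compile(r"[0-9a-f]{32}")   # lowercase hex digits only, 32 times

candidate = "z" * 32  # 32 characters, none of which is a hex digit

print(bool(LOOSE.fullmatch(candidate)))     # True: \w accepts g-z, _, etc.
print(bool(HEX_ONLY.fullmatch(candidate)))  # False: not a hex string
```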

post-7534-0-10196800-1314612257.png

post-7534-0-40661600-1314612644_thumb.pn

Yes, RegEx is slower, but your code with Match Pattern does not pass the Test VI.

I don't think Match Pattern supports quantifiers, but I am not sure. (I always find myself looking up the Match Pattern help, as it's different from RegEx.)

Can anyone confirm whether we can use MP for this? (It is much faster.)

Anyway, here is the fix, changing from any alphanumeric character to a specific character class, based on Phillip's post.

(Sorry, I cut, pasted and edited from the GUID thread. Thanks for pointing that out.)

I had to update the original Test VI to fail MD5s that contain characters in the [A-F] range (see new VI attached).

Is an MD5.vi

TEST - Is an MD5.vi

Code is in LabVIEW 2009

post-10325-0-68391200-1314619191.png

Link to comment

My only comment is that the Regex doesn't need the pipe. I don't understand why Match Pattern doesn't work, but it definitely doesn't.

I can take the alternation metacharacter out if no one likes it in there - no dramas. (Unsure if it impacts performance or not).

Link to comment

Agreed, the alternation character is unnecessary. It'll work just the same, but it changes the logic of the expression (though it may have no performance impact).

The Match Pattern node does not support finite quantifiers, but dot and star work as expected. Because string length is an O(1) operation, maybe this is worth considering:

post-13461-0-69352700-1314629366.png

Edit: Oops, had a-z as the character range, when it should have been a-f.
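For readers without LabVIEW, the idea in the snippet above (check the length separately, then match an unbounded run of hex characters) can be sketched in Python. This is only an approximation of the approach: Python's `re` does support finite quantifiers, so the split is purely illustrative, and the function name is hypothetical.

```python
import re

# Only a star quantifier is used, mirroring what Match Pattern supports.
HEX_RUN = re.compile(r"[0-9a-f]*")

def looks_like_md5(text: str) -> bool:
    """Return True if text is exactly 32 lowercase hex characters."""
    # The cheap length test runs first, so the pattern match is skipped
    # entirely for strings of the wrong size.
    return len(text) == 32 and HEX_RUN.fullmatch(text) is not None

print(looks_like_md5("d41d8cd98f00b204e9800998ecf8427e"))  # MD5 of empty input
print(looks_like_md5("d41d8cd98f00b204e9800998ecf8427"))   # one character short
```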

Link to comment

That's quite clever, and would be faster than RegEx.

Link to comment

My question... why would you ever want to test whether a string looks like an MD5 string?

Most likely for data validation.

Of course, the main reason for the review is to verify whether this would be a useful function (and then how to implement it, if it is useful).

  • Should it be rejected?

So, please continue to let us know either way.

Link to comment
My question... why would you ever want to test whether a string looks like an MD5 string? Further, I would much rather see VIs in OpenG that could create SHA1 and SHA2 hashes. Maybe ask Ton... https://bitbucket.org/tcplomp/labviewencryption/src/ee5c00d513e7

I sort of agree with the question above: what is the use case for checking that a string just looks like an MD5 string? It is either a correct, valid MD5 string for something or it is not. Maybe I am just not seeing the obvious here.

Danny

Link to comment

Let's say you have a very large file that you want to verify (the LabVIEW 2011 image :) )

Before extracting and installing, you might want to verify the file. The time to generate the MD5 from the downloaded file may be significant. If you are going to compare the results of a time consuming operation with an invalid string, you will be wasting time (by re-downloading a file that may be good).

It is not easy to visually read or count the characters of an MD5 checksum...
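The use case described above can be sketched roughly as follows, in Python rather than LabVIEW. The function name and error message are hypothetical; the point is only that the expected checksum is validated before the expensive hashing step begins.

```python
import hashlib
import re

def file_matches_md5(path: str, expected: str) -> bool:
    """Compare a file's MD5 with an expected hex string (hypothetical helper)."""
    # Reject a malformed checksum up front, before reading any file data.
    if re.fullmatch(r"[0-9a-fA-F]{32}", expected) is None:
        raise ValueError("expected value does not look like an MD5 hex string")

    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so a very large file is never fully in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)

    # hexdigest() is lowercase, so normalise the expected value to match.
    return digest.hexdigest() == expected.lower()
```

With a check like this, a mistyped or truncated checksum fails immediately instead of after the whole file has been hashed.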

Link to comment

If we are going to implement this, I wouldn't want it to be strict regarding case, one of the reasons being that LabVIEW's %x formatter outputs uppercase hex values.

Regarding the SHA hashes: yes, my intention is to offer that code in the OpenG package. However, I haven't had time to write unit tests for the SHA and HMAC hashing, though I have a complete set of unit tests for the AES cipher/decipher routines.

Ton

Link to comment

In summary:

  • Add this VI to OpenG
  • Move to MD5 package
  • Allow non-strict capitalization
  • Implement code similar to this:

post-13461-0-69352700-1314629366.png

With the non-strict capitalization issue, could an optional Boolean flag called "Strict Capitalization" (default = F) be added to the interface?

Would that be a good feature to have?
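As a rough illustration of the proposed interface, here is a Python sketch in which an optional flag simply selects between two patterns. It assumes that "strict" means lowercase only (as in the earlier Test VI, which failed hashes containing [A-F]); the names are hypothetical and this is not the VI's actual implementation.

```python
import re

# Assumption: strict mode accepts lowercase hex only.
STRICT = re.compile(r"[0-9a-f]{32}")
RELAXED = re.compile(r"[0-9a-fA-F]{32}")

def is_md5(text: str, strict_capitalization: bool = False) -> bool:
    """Return True if text looks like an MD5 hex string."""
    # The flag only chooses which precompiled pattern to apply.
    pattern = STRICT if strict_capitalization else RELAXED
    return pattern.fullmatch(text) is not None

upper = "D41D8CD98F00B204E9800998ECF8427E"
print(is_md5(upper))                              # accepted by default
print(is_md5(upper, strict_capitalization=True))  # rejected in strict mode
```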

Link to comment

I'm not sure I see the value in implementing a strict option. Unless you're actually evaluating whether something follows the MD5 spec, the case of your hex string is irrelevant to the data it contains. There's trivial overhead in exposing the option, but the execution of a mixed-case regular expression suffers a ~20% performance penalty in some brief benchmarks.

I like Ton's idea RE: the hex string and would actually go as far as to say that the string should be the only output. I've never had a use case where I need an MD5 in binary. It's always been for display or inclusion as a checksum (like VIPM does for package files and the like).

Link to comment
...but the execution of a mixed-case regular expression suffers ~20% performance penalty in some brief benchmarks.

No, I was thinking that the flag would select the correct RegEx string to use in the MP primitive, so the overhead should be negligible.

Link to comment

Right, I was comparing the performance of /^[0-9a-f]$/ versus /^[0-9a-fA-F]$/. The addition of the second range of characters is where the 20% comes from.

(I should have included a snippet...)
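In lieu of the missing snippet, a comparison of this kind could be sketched in Python with `timeit`. Two caveats: the `{32}` quantifier is an assumption (the patterns as quoted above have none), and Python's regex engine is not LabVIEW's, so the ~20% figure should not be expected to reproduce.

```python
import re
import timeit

# Assumed patterns: the quoted ones plus a {32} quantifier.
LOWER_ONLY = re.compile(r"^[0-9a-f]{32}$")
MIXED_CASE = re.compile(r"^[0-9a-fA-F]{32}$")
SAMPLE = "d41d8cd98f00b204e9800998ecf8427e"

def time_pattern(pattern: re.Pattern, repeats: int = 100_000) -> float:
    """Return the total seconds taken to match SAMPLE `repeats` times."""
    return timeit.timeit(lambda: pattern.match(SAMPLE), number=repeats)

if __name__ == "__main__":
    print("lowercase only:", time_pattern(LOWER_ONLY))
    print("mixed case:    ", time_pattern(MIXED_CASE))
```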

Link to comment

My only complaints are that of verbiage:

* "Strict Character Case" vs. "Strict Capitalization"

* "Is an MD5? returns TRUE if Input String represents an MD5 hash, otherwise it returns FALSE."

* "Valid characters are ..." vs. "Expected characters are ..."

Other than that, thanks for dedicating the time to this :)

Link to comment

Other than that, thanks for dedicating the time to this :)

Likewise (to everyone who's posted too)

* "Is an MD5? returns TRUE if Input String represents an MD5 hash, otherwise it returns FALSE."

I changed everything else you mentioned, but to be consistent I use the variable names, so MD5? instead of Is an MD5?

post-10325-0-29527900-1316100425_thumb.p

post-10325-0-77081100-1316100343_thumb.p

Is an MD5.vi

TEST - Is an MD5.vi

Code is in LabVIEW 2009.

Link to comment