jgcode Posted August 29, 2011 Report Posted August 29, 2011 This OpenG Review is closed. See Summary Post here. Please start a new thread to discuss new changes to this VI. Please PM me if there are any issues with this thread. Community, This VI was in the Candidates folder for the String Package. It has been sitting in there for a while; therefore, I have just gone ahead and posted it as is (so the license will be migrated on confirmation from the author etc...). What are your thoughts on this VI? Would you like to see such a function in OpenG? Can you optimize the code? Might it be better suited to e.g. the Comparison Package? Should it be rejected? Kind regards Jonathon Green OpenG Developer Is an MD5.vi TEST - Is an MD5.vi Code is in LabVIEW 2009
jgcode Posted August 29, 2011 Author Report Posted August 29, 2011 I propose that we use RegEx and clean up the BD. Not sure of the best way to handle the error, above is an example to maintain the VI's interface. Is an MD5.vi Code is in LabVIEW 2009
jcarmody Posted August 29, 2011 Report Posted August 29, 2011 A Regex would be slower, I would ignore the error (if using Match Regular Expression) and the \w will include non-hex characters.
jgcode Posted August 29, 2011 Author Report Posted August 29, 2011 and the \w will include non-hex characters. Of course! Thx! Does it always have to be lower case hex only as per these examples?
Phillip Brooks Posted August 29, 2011 Report Posted August 29, 2011 (edited) The RFC states that it must be lower case... http://www.ietf.org/rfc/rfc2831.txt Let HEX(n) be the representation of the 16 octet MD5 hash n as a string of 32 hex digits (with alphabetic characters always in lowercase, since MD5 is case sensitive). Edited August 29, 2011 by Phillip Brooks
jgcode Posted August 29, 2011 Author Report Posted August 29, 2011 A Regex would be slower, I would ignore the error (if using Match Regular Expression) and the \w will include non-hex characters. Yes, RegEx is slower, but your code with Match Pattern does not pass the Test VI? I don't think Match Pattern supports quantifiers, but I am not sure. (I always find myself looking up the Match Pattern help as it's different to RegEx.) Can anyone confirm: can we use MP for this (it is much faster)? Anyways, here is the fix, changing from any alphanumeric to a character class based on Phillip's post. (Sorry, I cut, pasted and edited from the GUID thread - thanks for pointing that out.) I had to update the original Test VI to fail MD5s that contain the [A-F] range (see new VI attached). Is an MD5.vi TEST - Is an MD5.vi Code is in LabVIEW 2009
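For readers without LabVIEW, the \w versus character-class point can be sketched in Python. This is only an illustration of the idea, not the VI's block diagram (which was attached as an image); the names LOOSE, HEX_ONLY and looks_like_md5 are made up for the example, and the exact expression used in the VI is an assumption.

```python
import re

# Hypothetical patterns for illustration only.
LOOSE = re.compile(r"\w{32}")            # \w also accepts g-z, A-Z, digits and underscore
HEX_ONLY = re.compile(r"[0-9a-f]{32}")   # lower-case hex digits only

def looks_like_md5(text: str, pattern: re.Pattern = HEX_ONLY) -> bool:
    """Return True when the whole string matches the given pattern."""
    return pattern.fullmatch(text) is not None

good = "d41d8cd98f00b204e9800998ecf8427e"   # MD5 of the empty string
bad = "z41d8cd98f00b204e9800998ecf8427e"    # 'z' is not a hex digit

print(looks_like_md5(good), looks_like_md5(bad))   # True False
print(looks_like_md5(bad, LOOSE))                  # True: \w lets the 'z' through
```

The last line shows why the test VI needed a stricter pattern: a 32-character string of arbitrary word characters passes the \w version.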
jcarmody Posted August 29, 2011 Report Posted August 29, 2011 My only comment is that the Regex doesn't need the pipe. I don't understand why Match Pattern doesn't work, but it definitely doesn't.
jgcode Posted August 29, 2011 Author Report Posted August 29, 2011 My only comment is that the Regex doesn't need the pipe. I don't understand why Match Pattern doesn't work, but it definitely doesn't. I can take the alternation metacharacter out if no one likes it in there - no dramas. (Unsure if it impacts performance or not).
asbo Posted August 29, 2011 Report Posted August 29, 2011 Agreed, the alternation character is unnecessary. It'll work just the same, but it changes the logic of the expression (though it may have no performance impact). The Match Pattern node does not support finite quantifiers, but dot and star work as expected. Because string length is an O(1) operation, maybe this is worth considering: Edit: Oops, had a-z as the character range, when it should have been a-f.
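The snippet in the post above was an image, so the following Python is only a guess at the idea it describes: check the length first, then use a star-quantified character class so that no finite quantifier is needed. The function name is hypothetical.

```python
import re

# Star instead of {32}: mirrors a matcher that has no finite quantifiers.
HEX_RUN = re.compile(r"[0-9a-f]*")

def is_md5_length_first(text: str) -> bool:
    """Cheap length test first, then check that every character is lower-case hex."""
    if len(text) != 32:       # len() on a Python str is O(1)
        return False
    return HEX_RUN.fullmatch(text) is not None

print(is_md5_length_first("d41d8cd98f00b204e9800998ecf8427e"))  # True
print(is_md5_length_first("d41d8cd98f00b204e9800998ecf8427"))   # False: 31 characters
```

Because the length is fixed beforehand, the open-ended star cannot accept a string that is too short or too long.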
jgcode Posted August 29, 2011 Author Report Posted August 29, 2011 Agreed, the alternation character is unnecessary. It'll work just the same, but it changes the logic of the expression (though it may have no performance impact). The Match Pattern node does not support finite quantifiers, but dot and star work as expected. Because string length is an O(1) operation, maybe this is worth considering: That's quite clever, and would be faster than RegEx.
Wouter Posted August 30, 2011 Report Posted August 30, 2011 (edited) My question... why would you ever want to test if a string looks like it is a md5 string? Further I had much rather see VI's in OpenG which could create SHA1 and SHA2 hashes. Maybe ask Ton... https://bitbucket.org/tcplomp/labviewencryption/src/ee5c00d513e7 Edited August 30, 2011 by Wouter
jgcode Posted August 30, 2011 Author Report Posted August 30, 2011 My question... why would you ever want to test if a string looks like it is a md5 string? Most likely for data validation. Of course, the main reason for the review is to verify whether this would be a useful function (and then how to implement it, if it is useful). Should it be rejected? So, please continue to let us know either way.
dannyt Posted August 30, 2011 Report Posted August 30, 2011 My question... why would you ever want to test if a string looks like it is a md5 string? Further I had much rather see VI's in OpenG which could create SHA1 and SHA2 hashes. Maybe ask Ton... https://bitbucket.org/tcplomp/labviewencryption/src/ee5c00d513e7 I sort of agree with the question above, what is the use case for checking a string just looks like an MD5 string, it is either a correct valid md5 string for something or it is not. Maybe I am just not seeing the obvious here. Danny
Phillip Brooks Posted August 30, 2011 Report Posted August 30, 2011 I sort of agree with the question above, what is the use case for checking a string just looks like an MD5 string, it is either a correct valid md5 string for something or it is not. Maybe I am just not seeing the obvious here. Danny Let's say you have a very large file that you want to verify (the LabVIEW 2011 image). Before extracting and installing, you might want to verify the file. The time to generate the MD5 from the downloaded file may be significant. If you are going to compare the results of a time-consuming operation with an invalid string, you will be wasting time (by re-downloading a file that may be good). It is not easy to visually read or count the characters of an MD5 checksum...
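The use case above can be sketched in Python: reject a malformed checksum string before spending time hashing a large file. This is a minimal sketch, not anything from OpenG; verify_download and MD5_TEXT are hypothetical names, and accepting either letter case is an assumption made for the example.

```python
import hashlib
import re

# Assumption: either letter case is accepted for the expected checksum.
MD5_TEXT = re.compile(r"[0-9a-fA-F]{32}")

def verify_download(path: str, expected: str) -> bool:
    """Validate the expected checksum's shape first, then hash the file in chunks."""
    expected = expected.strip()
    if MD5_TEXT.fullmatch(expected) is None:
        # Fail fast: no point hashing a large file against a malformed checksum.
        raise ValueError("expected checksum is not a 32-digit hex string")
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so a large image does not have to fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower()
```

A caller would catch the ValueError and ask for the checksum again, rather than concluding that the download is corrupt.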
Ton Plomp Posted September 1, 2011 Report Posted September 1, 2011 If we are going to implement this, I wouldn't want it to be strict regarding the case, one of the reasons being that LabVIEW using the %x formatter outputs uppercase hex values. Regarding the SHA hashes, yes, my intention is to offer that code in the OpenG package; however, I haven't got time to write unit tests for the SHA and HMAC hashing, though I have a complete set of unit tests for the AES cipher/decipher routines. Ton
Francois Normandin Posted September 1, 2011 Report Posted September 1, 2011 This VI was in the Candidates folder for the String Package. I'd find it more natural to find this in the MD5 palette.
jgcode Posted September 7, 2011 Author Report Posted September 7, 2011 In summary: add this VI to OpenG; move it to the MD5 package; allow non-strict capitalization; implement code similar to this: With the non-strict capitalization issue, could an optional Boolean flag called "Strict Capitalization" (default = F) be added to the interface? Would that be a good feature to have?
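The proposed interface can be sketched in Python as follows. The real implementation is a LabVIEW VI whose diagram was posted as an image, so this is only an approximation of the summary above; the function name, parameter name and both patterns are assumptions.

```python
import re

# Two precompiled patterns; the flag only selects which one is used.
_STRICT = re.compile(r"[0-9a-f]{32}")        # lower case only
_RELAXED = re.compile(r"[0-9a-fA-F]{32}")    # either case

def is_an_md5(input_string: str, strict_capitalization: bool = False) -> bool:
    """Return True if the string has the shape of an MD5 hash (32 hex digits)."""
    pattern = _STRICT if strict_capitalization else _RELAXED
    return pattern.fullmatch(input_string) is not None

print(is_an_md5("D41D8CD98F00B204E9800998ECF8427E"))                              # True
print(is_an_md5("D41D8CD98F00B204E9800998ECF8427E", strict_capitalization=True))  # False
```

With the default of False, upper-case strings such as those produced by a %x-style formatter are accepted, while callers that need the lower-case form can opt in to the strict check.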
Ton Plomp Posted September 7, 2011 Report Posted September 7, 2011 If we are going to use this functionality, we need to change the MD5 hash VI to add an extra output. Currently the VI outputs a binary string, which should be converted to a hex string to produce a valid MD5 hash.
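The binary-versus-hex distinction looks like this in Python, using the standard hashlib module as a stand-in for the OpenG MD5 VI. The helper name md5_hex and its uppercase option are hypothetical and only illustrate the conversion step.

```python
import hashlib

def md5_hex(data: bytes, uppercase: bool = False) -> str:
    """Hash the data and return the digest as a 32-character hex string."""
    raw = hashlib.md5(data).digest()   # 16 raw bytes, comparable to a binary-string output
    text = raw.hex()                   # two lower-case hex digits per byte
    return text.upper() if uppercase else text

print(md5_hex(b""))                  # d41d8cd98f00b204e9800998ecf8427e
print(md5_hex(b"", uppercase=True))  # D41D8CD98F00B204E9800998ECF8427E
```

Only the hex form is something a string check for MD5 shape can be applied to; the 16 raw bytes would fail it.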
asbo Posted September 7, 2011 Report Posted September 7, 2011 I'm not sure I see the value in implementing a strict option. Unless you're actually evaluating whether something is following the MD5 spec, the case of your hex string is irrelevant to the data it contains. There's trivial overhead in exposing the option to do this, but the execution of a mixed-case regular expression suffers a ~20% performance penalty in some brief benchmarks. I like Ton's idea RE: the hex string and would actually go as far as to say that the string should be the only output - I've never had a use case where I need an MD5 in binary. It's always been for display or inclusion as a checksum (like VIPM does for package files and the like).
jgcode Posted September 7, 2011 Author Report Posted September 7, 2011 ...but the execution of a mixed-case regular expression suffers ~20% performance penalty in some brief benchmarks. No, I was thinking that the flag would select the correct RegEx string to use in the MP primitive. So overhead should be negligible.
asbo Posted September 7, 2011 Report Posted September 7, 2011 No, I was thinking that the flag would select the correct RegEx string to use in the MP primitive. So overhead should be negligible. Right, I was comparing the performance of /^[0-9a-f]$/ versus /^[0-9a-fA-F]$/. The addition of the second range of characters is where the 20% comes from. (I should have included a snippet...)
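A comparison of this kind could be set up as below. Note that this sketch uses Python's regex engine, not LabVIEW's Match Pattern or Match Regular Expression nodes, so the ~20% figure quoted above should not be expected to reproduce; the patterns, sample string and function name are all assumptions.

```python
import re
import timeit

LOWER = re.compile(r"[0-9a-f]{32}")
MIXED = re.compile(r"[0-9a-fA-F]{32}")
SAMPLE = "d41d8cd98f00b204e9800998ecf8427e"

def time_pattern(pattern: re.Pattern, repeats: int = 5, number: int = 100_000) -> float:
    """Best-of-`repeats` wall time in seconds for `number` full matches of SAMPLE."""
    runs = timeit.repeat(lambda: pattern.fullmatch(SAMPLE), repeat=repeats, number=number)
    return min(runs)

if __name__ == "__main__":
    lower, mixed = time_pattern(LOWER), time_pattern(MIXED)
    print(f"lower-case only: {lower:.4f} s, mixed case: {mixed:.4f} s")
```

Taking the minimum of several runs reduces noise from other processes, which matters when the difference being measured is small.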
jgcode Posted September 15, 2011 Author Report Posted September 15, 2011 What are people's thoughts on this (does it cover all use cases)? Is an MD5.vi TEST - Is an MD5.vi Code in LabVIEW 2009
asbo Posted September 15, 2011 Report Posted September 15, 2011 My only complaints are that of verbiage: * "Strict Character Case" vs. "Strict Capitalization" * "Is an MD5? returns TRUE if Input String represents an MD5 hash, otherwise it returns FALSE." * "Valid characters are ..." vs. "Expected characters are ..." Other than that, thanks for dedicating the time to this
jgcode Posted September 15, 2011 Author Report Posted September 15, 2011 Other than that, thanks for dedicating the time to this Likewise (to everyone who's posted too) * "Is an MD5? returns TRUE if Input String represents an MD5 hash, otherwise it returns FALSE." I changed everything else you mentioned but to be consistent I use the variable names so MD5? instead of Is an MD5? Is an MD5.vi TEST - Is an MD5.vi Code is in LabVIEW 2009.
asbo Posted September 15, 2011 Report Posted September 15, 2011 I changed everything else you mentioned but to be consistent I use the variable names so MD5? instead of Is an MD5? Ahh, that makes sense. Looks good!