jgcode Posted August 20, 2011 Report Posted August 20, 2011 This OpenG Review is now closed. See Summary Post here. Please start a new thread to discuss new changes to this VI. Community, This is the integration of ShaunR's Fast Trim to OpenG's Trim Whitespace function. Please review the code and submit all comments below. On testing PC the new Trim Whitespace VI is >3x faster than the native version and ~5x faster than current OpenG version. Aside from the algorithm change the VI is now a subroutine (like natvie version). The new interface supports an additional argument Remove non-printable characters (False), see BD for further comments. Kind regards Jonathon Green OpenG Developer Trim Whitespace (String) 1.vi TEST - Trim Whitespace (Performance).vi TEST - Trim Whitespace 1.vi 1
Jim Kring Posted August 20, 2011 Report Posted August 20, 2011 Hey Jon, Thanks for posting this design change for review. A couple questions: A) Do users often need to "Remove non-printable characters (False)" -- is this a common use case? B) Does it add much value to couple the removal of non-printable characters to the trimming of leading/trailing whitespace? (For example, do users often need to perform these operations together?) It seems to me that if the answers to these questions aren't an overwhelming "YES" (in all caps , then removal of non-printable characters should be decoupled from Trim Whitespace, since decoupling orthogonal functionality tends to make the software more maintainable, understandable, testable, etc. If the decision is to decouple them, then maybe it makes sense to add a new function called Remove Non-printable Characters. Regards, -Jim 1
jgcode Posted August 20, 2011 Author Report Posted August 20, 2011 Hey Jon, Thanks for posting this design change for review. A couple questions: A) Do users often need to "Remove non-printable characters (False)" -- is this a common use case? B) Does it add much value to couple the removal of non-printable characters to the trimming of leading/trailing whitespace? (For example, do users often need to perform these operations together?) It seems to me that if the answers to these questions aren't an overwhelming "YES" (in all caps , then removal of non-printable characters should be decoupled from Trim Whitespace, since decoupling orthogonal functionality tends to make the software more maintainable, understandable, testable, etc. If the decision is to decouple them, then maybe it makes sense to add a new function called Remove Non-printable Characters. Regards, -Jim Thanks for the feedback Jim, the Remove non-printable characters (False) argument was a feature of Fast Trim so I included it in the proposed VI however, I had to make Fast Trim fit the original OpenG Trim Whitespace leading/trailing interface. i agree with respect to orthogonal functionality - we could definitively decompose it into two VIs. What do others think?
ShaunR Posted August 20, 2011 Report Posted August 20, 2011 (edited) @ JG Why have you slowed it down? (those arrays were inside the for loops for a reason ) @JK. You are right. The original Trim Whitespace+.vi (thanks for pointing out the two chars I missed JG-are they considered whitespace?) was a direct replacement for the built in Trim [whitespace] function. However. the Fast Trim was more useful on comms strings (especially serial) where you can have all sorts of non-printable chars and not just "whitespace". Edited August 20, 2011 by ShaunR 1
jgcode Posted August 20, 2011 Author Report Posted August 20, 2011 @ JG Why have you slowed it down? (those arrays were inside the for loops for a reason ) Hi Shaun, thanks for your reply. Here is some interesting points from testing (note: I did it in 2011 as I had it fired up at the time) after I stripped out the new argument: The arrays inside or outside the loop made no difference I cut and pasted your arrays in - this made it faster. It must be to do with the Search 1D Array primitive? Did you optimize the order of elements in the array? I replaced the order of the tests (e.g. ran your VI in first frame, mine in last frame (in the performance test) and this changed the results! It made mine faster. Can you verify this? With the new argument stripped its pretty similar however, if the argument is to be removed I think your version (above) is better (I had constraint of handling both arguments when I wrote the proposed VI). Cheers -JG
Jim Kring Posted August 20, 2011 Report Posted August 20, 2011 A few thoughts on performance benchmarking (which is a pretty tricky thing): 1) In addition to performing a short operation thousands of times and averaging, I like to keep track of the fastest time over some period (run continuously and then stop when I see the time settle) -- the fastest time is more representative of it's true performance capabilities. Trim Test - Fastest.zip 2) Care should also be taken to use appropriate test vectors and compare performance for each class of vector. For example: Small strings vs. Large strings Mixed whitespace vs most common whitespace I think it's important that the code be optimized for the most common scenarios (a few space "\s" characters at the head and tail, and probably some EOLs at the tail). So, if there are design choices that affect performance (like the order of characters in the whitespace array being searched), then we should lean toward those. Regards, -Jim
ShaunR Posted August 20, 2011 Report Posted August 20, 2011 @JG Well. Both functions of yours are much faster than on my machine. Not running on a quad core are you? (it does use parallelism). Yes I have optimised the order. But for the purposes of your tests; I moved the space char to be the last. But in my tests (poor lowly Dual core) it made no difference whether it was in the first frame or last. The original was about 10ms faster, but since your tests are 100 ms faster than mine anyway;that's about a gnats fart It's interesting to note that in both our cases, the native LV function is pretty similar. Again I'm coming back to parallelism since the regex functions only use the UI thread (if I remember correctly). Note that I don't have the openG libs installed so the old one is disabled (hence zero) Still waiting to see the performance improvements in 2011...lol. @JK. The main advantage of this method (over and above just being faster than the in-built function) is that it is very robust in terms of the length of the string (very similar times for a short string of say 10 chars and one of ..say 10,000 chars-all beginning and end white spaces being equal). Performance is dictated by the number of white spaces only, unlike the other methods that need to iterate over the entire string.
jgcode Posted August 21, 2011 Author Report Posted August 21, 2011 Well. Both functions of yours are much faster than on my machine. Not running on a quad core are you? (it does use parallelism). i5-2410M only 2 cores I am afraid!
ShaunR Posted August 21, 2011 Report Posted August 21, 2011 i5-2410M only 2 cores I am afraid! Hmmm. Foesn't make a lot of sense. Must be sheer willpower then
jgcode Posted August 22, 2011 Author Report Posted August 22, 2011 Ok, so this is an update of the VI based on the above. Is everyone happy (e.g. this is the most optimized version for common use case). These are the relative times (note: Time New and Trim Whitespace+ are now the same BD): Trim Whitespace.zip
ShaunR Posted August 22, 2011 Report Posted August 22, 2011 Ok, so this is an update of the VI based on the above. Is everyone happy (e.g. this is the most optimized version for common use case). These are the relative times (note: Time New and Trim Whitespace+ are now the same BD): Trim Whitespace.zip Looks familair . But I think the second 0xD in the arrays needs to be a 0xB if it is considered a white-space (personally, I don't think 0xB and 0xC are...but that's probably a judgement call).
Bjarne Joergensen Posted August 22, 2011 Report Posted August 22, 2011 (edited) Hi jgcode In the array constant there is two D's! Is that right? Should the one of them not be a B? regards Bjarne Too late. I have not seen that ShaunR pointed out the same Edited August 22, 2011 by Bjarne Joergensen
jgcode Posted August 22, 2011 Author Report Posted August 22, 2011 But I think the second 0xD in the arrays needs to be a 0xB if it is considered a white-space In the array constant there is two D's! Is that right? Ha - just making sure you are all paying attention (This is now just now personal taste, I always like to eliminate unnecessary bends and crossings if simple to do). Cheers! On the contrary, I like to code the index in the bottom LHS and the condition in the bottom RHS. However, I did try to do less bends on the last bit of code for you ...(personally, I don't think 0xB and 0xC are...but that's probably a judgement call). This is inline with original OpenG version. Trim Whitespace.zip
Darin Posted August 22, 2011 Report Posted August 22, 2011 I have a similar version except that I use the Lexical Class comparison (which also sets my definition of white space). My guess (no hard data) is that it is a wee bit faster than the array search. Rearranging my code to match the above image gives me this. Also I would look at the result in your code when an all-whitespace string is trimmed at the end. 1
jgcode Posted August 23, 2011 Author Report Posted August 23, 2011 I have a similar version except that I use the Lexical Class comparison (which also sets my definition of white space). My guess (no hard data) is that it is a wee bit faster than the array search. Rearranging my code to match the above image gives me this. Note to all: it is more helpful to post VIs to this thread than snippets as CP and VI settings are lost. Nice one! - Yes ,the Lexical Class primitive is faster at searching than the Search Array primitive in this case. Below is the new proposed version. We will add you to the copyright for your contribution. Please post here or PM me your full name (or company name) and email address which is required for copyright. Trim New time is from VI above. Trim Whitespace NEW.vi I created a VI from Darin's snippet and tested it in the performance framework to get the above times: Trim Whitespace.zip
Popular Post Darin Posted August 23, 2011 Popular Post Report Posted August 23, 2011 If you feel the need for speed there is one more step you can take. It obfuscates the process a bit, but you can generate a lookup table which is indexed by the byte array. Simply put true at values which correspond to whitespace and index it. VI attached below. (I tend to use the version I posted earlier because I can make the lexical class a control to trim more than whitespace, but this seems like a unitasker). Darin2.vi 3
jgcode Posted August 23, 2011 Author Report Posted August 23, 2011 If you feel the need for speed there is one more step you can take. It obfuscates the process a bit, but you can generate a lookup table which is indexed by the byte array. Simply put true at values which correspond to whitespace and index it. Here are the relative times after I integrated this VI into the test framework. Very impressive speed increase. IMHO I don't think the obfuscation warrants not implementing it, we can add comments on the BD (your description above made enough sense to me to understand it straight away) and we can even made a support VI which creates the lookup and save that. So the question is - can anyone make it faster?
jaegen Posted August 23, 2011 Report Posted August 23, 2011 So the question is - can anyone make it faster? Well after about an hour of trying the only thing I could come up with that wasn't slower was to delete all but the first 32 elements of the boolean lookup array constant, since every character greater than 0x20 is not whitespace. But given that this didn't seem to speed up things at all, only saves 224 bytes, and further obfuscates the code, it's probably not worth it. Nice code Darin. Jaegen
jgcode Posted August 23, 2011 Author Report Posted August 23, 2011 I am really excited how well this process has turned out and thank everyone who participated. I will be changing the status to pending in the near future so anyone who missed out can still respond. After that I will lock the topic and integrate the code for an upcoming release.
Darin Posted August 23, 2011 Report Posted August 23, 2011 So I do have a bit of a disorder when it comes to round numbers, and my latest version was clocking around 102-104 nsec on my machine. Had to get those last 2-4 nsec out of there and noticed a little detail that was counter-intuitive (for me). I insisted on putting the reverse array inside the case, why would I reverse every time if I wasn't going to trim the end? Turns out the compiler (LV9 at least) is happier with the reverse array node on the root diagram (as JG astutely shows above). That simple change shaved about 6 nsecs on my machine so the test clocks in at a tidy 98 nsec (next fastest was at 114). Happy day, got to learn something new. Happier day, it may prove useful.
Ton Plomp Posted August 24, 2011 Report Posted August 24, 2011 I think we should implement the fastest (boolean array if I'm following this correct), IMHO we should call the lookup table 'ASCII==Printable'. Ton
jgcode Posted August 24, 2011 Author Report Posted August 24, 2011 I think we should implement the fastest (boolean array if I'm following this correct), IMHO we should call the lookup table 'ASCII==Printable'. Ton Additional to the Table Name, I think we should add sufficient BD documentation so that it is clear how the table was generated.
Phillip Brooks Posted August 24, 2011 Report Posted August 24, 2011 (edited) Just a quick observation. Does this design trade memory for speed? If so, would this function ever be used in a memory constrained environment such as RT or FieldPoint? It appears that two copies of the string data (as U8 arrays, one reversed) are created to iterate over. Is the LabVIEW compiler is smart enough to only use one buffer for the U8 array data? What does LabVIEW Profiler tell us about the buffer allocations? I don't have 2009 installed, so I can't play with the examples. If there are two buffer allocations for the U8 array data, would there any difference in performance if the 'end trim' loop were to use the same U8 array and simply index from the end of the array (N-i-1) until a hit was found? Edited August 24, 2011 by Phillip Brooks 1
Cloedu72 Posted August 24, 2011 Report Posted August 24, 2011 (edited) What about using the Inline functionality? Claude Darin2_jcm.vi Edited August 24, 2011 by Cloedu72
Recommended Posts