Neil Pate Posted October 6, 2022

Hi everyone,

I am trying to wring as much performance as possible out of a single VI I have created. It is part of a rendering application and will be called many, many times per second. In essence it just draws a horizontal line in a 2D array of pixels. I have four techniques implemented (3 work, 1 crashes LabVIEW). Is anyone up for the challenge of trying to get it any faster?

I have attached versions for 2015 and 2022. I developed in 2022 and back-saved, but for some reason just opening the code in my 2015 LV instance crashes (it opens fine in 2021).

Attachments: Raster 2022Q3.zip, Raster 2015.zip
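For readers without LabVIEW at hand, the problem can be sketched in C. This is only an illustration of two of the approaches discussed in the thread (writing one element at a time versus copying a pre-filled block, which is roughly what MoveBlock does); the function names, the row-major `uint32_t` buffer, and the inclusive `x0..x1` convention are assumptions, not the contents of the attached VIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write one pixel at a time into row y,
   columns x0..x1 inclusive, of a row-major buffer. */
static void hline_per_pixel(uint32_t *pixels, size_t width, size_t y,
                            size_t x0, size_t x1, uint32_t colour)
{
    assert(x0 <= x1 && x1 < width);
    uint32_t *row = pixels + y * width;
    for (size_t x = x0; x <= x1; ++x)
        row[x] = colour;
}

/* Hypothetical helper: copy a pre-filled span of pixels in one block.
   memcpy stands in for LabVIEW's MoveBlock here. */
static void hline_block_copy(uint32_t *pixels, size_t width, size_t y,
                             size_t x0, size_t x1, const uint32_t *span)
{
    assert(x0 <= x1 && x1 < width);
    memcpy(pixels + y * width + x0, span, (x1 - x0 + 1) * sizeof *span);
}
```

The block copy needs a source span that already holds the colour, so its cost depends on how often that span has to be rebuilt.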
dadreamer Posted October 6, 2022

1 hour ago, Neil Pate said: "for some reason just opening the code in my 2015 LV instance crashes"

As I wrote here, the ArrayMemInfo node was introduced only in the 2017 version. It simply didn't exist in 2015, which is why opening the VI crashes. After a quick test in 2022 Q3, MoveBlock didn't crash my LV. I'm going to take a closer look at the code tomorrow.
Neil Pate (Author) Posted October 6, 2022

9 minutes ago, dadreamer said: "As I wrote here, ArrayMemInfo node was introduced only in 2017 version. It just didn't exist in 2015. That's why it crashes. After a quick test in 2022 Q3 MoveBlock didn't crash my LV. Gonna get a closer look at the code tomorrow."

Ah, that explains the crash when opening in 2015. Thanks!
Gribo Posted October 6, 2022

Are you on Windows? If so, the .NET PictureBox control is faster than the native LV control for such operations. It also has other nice features, such as double buffering.
Neil Pate (Author) Posted October 6, 2022

2 hours ago, Gribo said: "Are you on Windows? If so, the .NET picturebox control is faster than the native LV control for such operations. It also has other nice features, such as double buffering."

Windows for now, but the point of the exercise is to implement as much as possible from source.
Dataflow_G Posted October 7, 2022

Is working with the image data as a contiguous 1D array, rather than a 2D array, an option? I wrote a small image library for LabVIEW (G-Image) and found working with 1D arrays of image data consistently faster than working with 2D arrays. Attached is a quick version which performs the per-item and row methods using a 1D array; both are quicker than their 2D counterparts.

Attachment: Raster 1D 2015.zip
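The 1D layout suggested above comes down to a single index calculation per pixel. A minimal sketch, assuming a row-major layout (the thread does not state which layout G-Image uses) and hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: flat index of pixel (x, y) in a row-major buffer. */
static size_t flat_index(size_t width, size_t x, size_t y)
{
    assert(x < width);
    return y * width + x;
}

/* Hypothetical helper: recover (x, y) from a flat index. */
static void flat_to_xy(size_t width, size_t index, size_t *x, size_t *y)
{
    assert(width > 0);
    *x = index % width;
    *y = index / width;
}
```

With this mapping, a horizontal line from `x0` to `x1` in row `y` is the contiguous index range `flat_index(width, x0, y)` to `flat_index(width, x1, y)`.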
ShaunR Posted October 7, 2022

A warning about the ArrayMemInfo function: be aware of whether it is returning a subarray or not. That can change when you change VI properties, and for a plethora of other reasons.

Edited October 7, 2022 by ShaunR
dadreamer Posted October 7, 2022

@Neil Pate As far as I can see, your pointer maths is okay, as long as you provide valid coordinates in the Point 1 and Point 2 parameters. Thanks to @ShaunR, it appears that ArrayMemInfo has a bug that reproduces in 64-bit LabVIEW; I couldn't reproduce it in 32-bit LabVIEW. The stride should not be zero unless the array is empty, whether or not it is a subarray. But even when the stride is 0, it shouldn't lead to a crash, because in that case we're just writing into row 0 instead of the intended one. To eliminate the influence of the bug (if any), you could ignore the stride from the ArrayMemInfo node and use (Array Width x 4) instead, as it's a constant in your case.
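The pointer arithmetic being discussed can be written out as a short C sketch. It assumes 4-byte (U32) pixels and a row stride derived from the array width, as suggested above, rather than the stride reported by ArrayMemInfo; `pixel_address` and `hline_bytes` are hypothetical names, and `memcpy` stands in for MoveBlock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BYTES_PER_PIXEL = 4 };   /* U32 pixels, as in the thread */

/* Hypothetical helper: byte address of pixel (x, y), with the row stride
   computed as width * 4 instead of being read from ArrayMemInfo. */
static uint8_t *pixel_address(uint8_t *base, size_t width, size_t x, size_t y)
{
    size_t stride = width * BYTES_PER_PIXEL;   /* bytes per row */
    return base + y * stride + x * BYTES_PER_PIXEL;
}

/* Hypothetical helper: copy a pre-filled span into row y, columns x0..x1. */
static void hline_bytes(uint8_t *base, size_t width, size_t height, size_t y,
                        size_t x0, size_t x1, const uint32_t *span)
{
    assert(y < height && x0 <= x1 && x1 < width);
    memcpy(pixel_address(base, width, x0, y), span,
           (x1 - x0 + 1) * BYTES_PER_PIXEL);
}
```

If `stride` were 0, the `y * stride` term would vanish and every write would land in row 0, which matches the behaviour described in the post.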
Neil Pate (Author) Posted October 7, 2022

Thanks everyone, I really appreciate the help. I will take a look over the weekend at the suggestions.
ShaunR Posted October 7, 2022

4 hours ago, Neil Pate said: "Thanks everyone, I really appreciate the help. I will take a look over the weekend at the suggestions."

I haven't finished yet.

6 hours ago, dadreamer said: "@Neil Pate As I can see, your pointer maths are okay, as long as you provide valid coordinates in Point 1 and Point 2 parameters. Thanks to @ShaunR it appears, that ArrayMemInfo has a bug, that's reproduced in 64-bit LabVIEW. I couldn't reproduce it in 32-bit LabVIEW though. The stride should not be zero, unless the array is empty, no matter if subarray or not. But even when the stride is 0, it shouldn't lead to crash, because in this case we're just writing into row 0 instead of intended one. To eliminate the bug influence (if any), you might not use the stride of ArrayMemInfo node, but use (Array Width x 4) instead as it's a constant in your case."

Agreed. Bug. I too couldn't reproduce it in 32-bit. However, you need both strides for the 2D array (offset = x*stride_1 + y*stride_2). Granted, LabVIEW uses a contiguous allocation, but that also assumes packed and aligned. Not sure we can guarantee that on all platforms (Rolf will know).

You can obviate the array allocation for this particular example by auto-initialising it on the first run (or when the length changes). You can remove the calculation of the length too, since that is returned as one of the size params. You can calculate the length as in the other examples, but not calculating it improves jitter immensely. I also did your little trick with the wrapper, which makes a nice difference here too.

The following was on LV 2021 x32. Compared to...

Attachment: Raster 2021_SR_1.zip

Edited October 7, 2022 by ShaunR
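The "auto-initialise on the first run (or when the length changes)" idea can be sketched in C as a small cache around the source span. This is an illustration under assumptions, not the contents of the attached VI: the `SpanCache` type, the `rebuilds` counter, and the extra check on colour are all hypothetical additions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical cache for the pre-filled source span. */
typedef struct {
    uint32_t *data;     /* span filled with `colour`, `length` pixels long */
    size_t    length;
    uint32_t  colour;
    unsigned  rebuilds; /* how many times the span was (re)filled */
} SpanCache;

/* Return a span of `length` pixels of `colour`, refilling it only when
   the requested length or colour differs from the cached one. */
static const uint32_t *span_cache_get(SpanCache *c, size_t length,
                                      uint32_t colour)
{
    assert(length > 0);
    if (c->data == NULL || c->length != length || c->colour != colour) {
        uint32_t *p = realloc(c->data, length * sizeof *p);
        if (p == NULL)
            return NULL;            /* old buffer is still owned by c */
        for (size_t i = 0; i < length; ++i)
            p[i] = colour;
        c->data = p;
        c->length = length;
        c->colour = colour;
        c->rebuilds++;
    }
    return c->data;
}
```

The benefit depends on how often consecutive calls share a length and colour; when both change on almost every call, the cache is rebuilt almost every time.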
Gribo Posted October 8, 2022

Hardware acceleration was invented exactly for this use case.
Neil Pate (Author) Posted October 8, 2022

22 hours ago, ShaunR said: "You can obviate the array allocation for this particular example by auto-initialising it on the first run (or when the length changes). You can remove the calculation of the length too since that is returned as one of the size parms. You can calculate the length as in the other examples but not calculating it improves jitter immensely. I also did your little trick with the wrapper which makes a nice difference here. too. The following was on LV 2021 x32."

Unfortunately I cannot actually do this optimisation, as normally the length of the line and the pixel colour will differ on every single call. Well, actually the colour will stay the same for some number of calls, but the line length will normally change. In my benchmark VI I just used a worst-case scenario of a line filling the whole row.

Your MoveBlock2 method still seems faster, thanks! I will see how it affects the performance of my actual application.
Neil Pate (Author) Posted October 8, 2022

1 hour ago, Gribo said: "Hardware acceleration was invented exactly for this use case.."

Nope. The use case in this situation is just to learn 🙂
dadreamer Posted October 8, 2022

On 10/7/2022 at 9:27 PM, ShaunR said: "Granted LabVIEW uses a contiguous allocation but that also assumes packed and aligned. Not sure we can guarantee that on all platforms (Rolf will know)."

I believe it is, at least for desktop platforms. Given the pointer that ArrayMemInfo outputs, I can subtract 4 bytes (for a 1D array of U32) or 8 bytes (for a 2D array of U32) from it and call DSRecoverHandle, which gives a valid handle. I can even get its size with DSGetHandleSize, and it will correspond to the size of the array that was passed to ArrayMemInfo (plus N I32-sized fields, where N is the number of array dimensions). According to docs such as Using External Code in LabVIEW, handles are relocatable and contiguous. Of course, Rolf could add more here.

@Neil Pate If you wonder what that wrapper trick is, check this post. Using those tokens you could somewhat enhance your CLF Nodes and reduce the time spent on each call.

Edited October 19, 2022 by dadreamer (reason: stand corrected)
Neil Pate (Author) Posted October 8, 2022

Plot twist... it turns out that the MoveBlock technique is quicker when replacing long-ish rows, but for smaller chunks of pixels the naive approach of replacing one element at a time is actually faster. I just used the worst-case scenario in the benchmark, but have realised this is not actually sensible when rendering things made of smaller triangles (and hence smaller line segments).

This is what I am rendering; it has close to 20k vertices.

Edited October 8, 2022 by Neil Pate
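The finding above suggests choosing the technique by span length. A hedged C sketch of such a hybrid follows; the threshold of 16 pixels is purely an assumed placeholder (the thread gives no crossover value and it would have to be measured), `hline_hybrid` is a hypothetical name, and `memcpy` again stands in for MoveBlock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed crossover point; not taken from the thread, measure before use. */
enum { BLOCK_COPY_THRESHOLD = 16 };

/* Fill row y, columns x0..x1 inclusive. Short spans are written one pixel
   at a time; longer spans are copied from a pre-filled `span` buffer that
   must hold at least (x1 - x0 + 1) pixels of `colour`.
   Returns 1 if the block copy path was taken, 0 otherwise. */
static int hline_hybrid(uint32_t *pixels, size_t width, size_t y,
                        size_t x0, size_t x1, uint32_t colour,
                        const uint32_t *span)
{
    assert(x0 <= x1 && x1 < width);
    uint32_t *dst = pixels + y * width + x0;
    size_t n = x1 - x0 + 1;

    if (n < BLOCK_COPY_THRESHOLD) {
        for (size_t i = 0; i < n; ++i)
            dst[i] = colour;
        return 0;
    }
    memcpy(dst, span, n * sizeof *dst);
    return 1;
}
```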
Neil Pate (Author) Posted October 8, 2022

Frame rate is now at 100 FPS, time to stop optimising 🙂

Attachment: 2022-10-08 21-49-45.mkv
Rolf Kalbermatter Posted October 18, 2022

On 10/8/2022 at 8:58 PM, dadreamer said: "I believe it is, at least for desktop platforms. Given the pointer that ArrayMemInfo outputs, I can subtract 4 bytes (for 32-bit) or 8 bytes (for 64-bit) from it and do DSRecoverHandle, that gives a valid handle. I can even get its size with DSGetHandleSize and it will correspond to the array size that was passed to the ArrayMemInfo (plus I32 size field plus I32 padding on 64 bits). According to the doc's such as Using External Code in LabVIEW handles are relocatable and contiguous. Of course, Rolf could add more here. @Neil Pate If you wonder what that wrapper trick is, check this post. Using those tokens you could somewhat enhance your CLF Nodes and reduce time spent on each call."

Yes, arrays in LabVIEW are one single block of memory; for multi-dimensional arrays, the dimensions are all concatenated together. There is no row padding, since the data area already starts at an address aligned to the natural size of the elements. The data area is prepended with the I32 values indicating the size of each dimension. And yes, arrays can have 0 * x * y * z elements, which is in fact an empty array, but it still maintains the original lengths for each dimension and therefore also allocates a memory block to store those dimension sizes.

Only for empty one-dimensional arrays (and strings) does LabVIEW internally allow a NULL pointer handle to be equivalent to an array with a dimension size of 0. If you pass such handles to C code through the Call Library Node, you have to be prepared for that if the handle is passed by reference (e.g. LStrHandle *string). Here the string variable can be a valid handle with a length of 0 or greater, or it can be a NULL pointer.
If your C code doesn't account for that and tries to reference the string variable with LStrBuf(**string), for instance, bad things will happen (you should use LStrBufH(*string) instead anyhow, which is prepared to not crash on a NULL handle). For handles passed by value (e.g. LStrHandle string) this doesn't apply: while handles are relocatable in principle, there would be no way for the function to create a new handle and pass it back to LabVIEW if LabVIEW passed a NULL handle in. In this case LabVIEW will always allocate a handle and set its length to 0 if an empty array is to be passed to the function.

I do believe that your explanation about the value to subtract is likely misleading, however. The pointer reported by the MemInfo function is likely the actual data pointer to the first element of the array. There is one int32 for each dimension located before that, before you get to the actual pointer value contained within the handle. And that value is what DSRecoverHandle() needs.

The way it works is that the pointer area of the memory block referred to by a handle actually contains extra bytes in front of the start address of the handle pointer. This area stores information such as the actual handle that refers to this handle pointer, the total allocated storage in bytes for that handle (minus this extra management information), and some area for flags that were used when LabVIEW still had two distinct handle types (AZ and DS). AZ handles could be dynamically resized by the memory manager whenever it felt like it, unless there was a flag that indicated that the handle was locked. To set and clear this flag there were the AZLock() and AZUnlock() functions. Trying to access an AZ handle without locking it could bomb your computer, the Macintosh equivalent of blue screens back in those days. You got a dialog with a number of bombs that indicated the type of 68k exception that had been triggered.
And yes, after acknowledgment of that dialog, the application in question was gone. DS handles are never relocated by the memory manager itself. The application needs to do an explicit DSSetHandleSize() or DSDisposeHandle() for a particular handle to change.

However, you should not try to rely on this information: the location where LabVIEW stores the handle value and handle size (and whether it even does so) is platform, compiler and version dependent. And since it is private code deep down in the memory manager, that is fine. The entire remainder of LabVIEW does not care about this and is not aware of it. The only people who can do anything useful with that information are LabVIEW developers who actually might need to debug memory manager things. For all the rest, including LabVIEW users, this is utterly useless.

So how much you would need to subtract from that pointer would almost certainly depend on the number of dimensions of your array and not on the bitness you operate in. It's 4 bytes per dimension, BUT! There is a little gotcha: on platforms other than Windows 32-bit, the first data element in the array is naturally aligned. So if your array is an array of 64-bit integers or double-precision floats, the actual difference to the real start of the handle needs to be a multiple of 8 bytes on non-Windows 32-bit (and Pharlap) platforms, since that is the size of the array data element.

Edited October 18, 2022 by Rolf Kalbermatter
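The NULL handle case described in this post can be illustrated with a defensive length check in C. This is a sketch under assumptions: the `LStr` typedefs below are local stand-ins that mirror the layout commonly shown for LabVIEW's `extcode.h` (an int32 count followed by the bytes), so that the example compiles without the LabVIEW headers, and `lstr_length` is a hypothetical helper, not a LabVIEW API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins (assumption) mirroring the string handle layout from
   LabVIEW's extcode.h; real code would include that header instead. */
typedef struct {
    int32_t       cnt;      /* number of bytes that follow */
    unsigned char str[1];   /* first byte of the string data */
} LStr, *LStrPtr, **LStrHandle;

/* Hypothetical helper for a handle passed by reference (LStrHandle *string):
   treat a NULL handle as an empty string instead of dereferencing it. */
static int32_t lstr_length(const LStrHandle *string)
{
    assert(string != NULL);             /* the reference itself must exist */
    if (*string == NULL || **string == NULL)
        return 0;                       /* NULL handle means empty string */
    return (**string)->cnt;
}
```

A function that dereferenced `**string` without the check would crash whenever LabVIEW passed an empty string as a NULL handle.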
dadreamer Posted October 19, 2022

19 hours ago, Rolf Kalbermatter said: "So how much you would need to subtract from that pointer would almost certainly depend on the number of dimensions of your array and not the bitness you operate in. It's 4 bytes per dimension, BUT! There is a little gotcha, On other platforms than Windows 32-bit, the first data element in the array is naturally aligned. So if your array is an array of 64-bit integers or double precision floats, the actual difference to the real start of the handle needs to be a multiple of 8 bytes on non-Windows 32-bit (and Pharlap) platforms, since that is the size of the array data element."

Yeah, I was thinking of double numbers while dealing with 4-byte integers, hence the confusion. In that thread I was introducing a filler field of 4 bytes in 64-bit LabVIEW using a Conditional Disable structure. That's unnecessary here.
Rolf Kalbermatter Posted October 19, 2022

46 minutes ago, dadreamer said: "Yeah, I was thinking of double numbers whereas dealing with 4-byte integers, hence the confusion. In that thread I was introducing a filler field of 4 bytes in 64-bit LabVIEW using Conditional Disable structure. That's unnecessary here."

Makes sense. In this case it's doubly unneeded. Since it is a 2D array, the two dimension sizes already add up to 8 bytes, so there would be no padding even for 64-bit integers. And since the array uses 32-bit integer values here, there is never any padding anyhow.
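The arithmetic in the last few posts (4 bytes per dimension, rounded up to the element size where the first element is naturally aligned) can be written as a small C sketch. `dim_header_bytes` is a hypothetical helper that only restates the rule as described in the thread for simple numeric elements; as noted above, the layout is internal to LabVIEW and should not be relied on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes between the start of the handle's data block
   and the first array element. `naturally_aligned` is nonzero on platforms
   that align the first element to its own size (per the thread, everything
   except 32-bit Windows and Pharlap). */
static size_t dim_header_bytes(size_t ndims, size_t elem_size,
                               int naturally_aligned)
{
    assert(elem_size > 0);
    size_t bytes = ndims * sizeof(int32_t);   /* one I32 size per dimension */
    if (naturally_aligned) {
        size_t rem = bytes % elem_size;
        if (rem != 0)
            bytes += elem_size - rem;         /* pad up to element size */
    }
    return bytes;
}
```

For the 2D U32 array in this thread the result is 8 bytes with or without alignment; a 1D array of doubles would need 4 bytes of padding on naturally aligned platforms.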