
Another reason why "copy dots" is a bad name for "buffer allocations"



There is a tool in LabVIEW called the "Show Buffer Allocations" tool. You can find it in a VI's menu at Tools>>Profile>>Show Buffer Allocations.... It is a useful tool for doing memory optimizations because it shows a little dot at every place where the LV compiler made a buffer allocation. Some people call these "copy dots" because they *think* the dots indicate where LV is making a copy of the data. About 90% of the time, that is accurate. But there are some cases where a "buffer allocation" and a "copy" are not the same thing at all. Today I want to mention one that I haven't posted before.

Arrays are one of the major cases where buffer allocations are of interest to programmers, and not all array buffer allocations are copy dots. Whenever possible, LabVIEW will perform an operation on an array by creating a "subarray". If you pay close attention to the Context Help window, you'll sometimes see wires that are of "subarray" type. A subarray is simply a reference to another array, together with an index into that array, a stride, and a length. This is a very efficient way to do some pretty complex operations, such as Decimate Array, Split Array, and Reverse Array, without actually making new arrays. That last one returns an index to the last element of the array and a stride of -1. Returning a subarray is only possible when LV knows that no other operation is going to modify the value of the array, so that it is safe for the reference to the array in memory to be shared.
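
To make that concrete, here is a rough C sketch of what such a descriptor might look like. LabVIEW's actual internal layout is undocumented, so the names and types here are purely illustrative:

    /* Purely illustrative -- LabVIEW's real subarray layout is undocumented. */
    typedef struct {
        double *base;   /* pointer into the parent array's data            */
        int     index;  /* starting offset within the parent               */
        int     stride; /* step between logical elements (may be negative) */
        int     length; /* number of logical elements in the subarray      */
    } SubArray;

    /* Logical element i of a subarray lives at base[index + i * stride]. */
    double SubArrayGet(SubArray s, int i) { return s.base[s.index + i * s.stride]; }

    double data[6] = { 0, 1, 2, 3, 4, 5 };

    /* "Reverse 1D Array" without copying anything: start at the last
       element, stride -1.  SubArrayGet(reversed, 0) == 5, and so on.  */
    SubArray reversed = { data, 5, -1, 6 };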

Now, take a look at this diagram. Notice the buffer allocations:

[Image: post-5877-1206141304.png]

The "Split 1D Array" node has two output buffer allocations. A lot of people would think, "Look at those copy dots. That means LV is making two new arrays, one for the first part of the array, and one for the second part of the array." Not true. The buffer allocations are *for the subarrays*. Remember I said that a subarray records a reference to the array, a starting index, a length and a stride. Those four items of data have to be stored somewhere. The buffer allocation is the allocation to store those four items. It is not a copy of the entire array. The output of the Build Array, on the other hand, is a full array allocation.

To see what is being allocated at any given buffer allocation, look at the type of the wire.

And don't call them "copy dots." :-)


QUOTE (Tomi Maila @ Mar 22 2008, 11:43 AM)

Are subarrays ever used or can subarrays somehow be used when passing data to and from external code? AQ or Rolf?

Not at this time. The full array is always exposed to external code. If a subarray is passed to a DLL call, LV will go ahead and allocate a new array containing what the subarray represented and pass that array to the DLL call.
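
Conceptually the conversion is just a strided copy into a fresh contiguous buffer, something like this (still using the illustrative descriptor from the first post; the DLL function name is hypothetical):

    #include <stdlib.h>

    extern void TheDllFunction(double *arr, int length);   /* hypothetical */

    /* Flatten a subarray into the plain contiguous array the DLL expects. */
    void CallWithSubArray(SubArray s)
    {
        double *flat = malloc(s.length * sizeof *flat);
        for (int i = 0; i < s.length; i++)
            flat[i] = s.base[s.index + i * s.stride];
        TheDllFunction(flat, s.length);
        free(flat);
    }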


QUOTE (Aristos Queue @ Mar 22 2008, 10:06 PM)

Not at this time. The full array is always exposed to external code. If a subarray is passed to a DLL call, LV will go ahead and allocate a new array containing what the subarray represented and pass that array to the DLL call.

And if you think about it, this can't work any other way. The external code would need some way to detect that it received a subarray instead of a real array, plus a documented memory layout or access functions to deal with such an array. Neither exists so far, since the information about whether a wire carries a subarray is in the diagram wire's type descriptor, not in the data pointer.

Rolf Kalbermatter

QUOTE (rolfk @ Mar 24 2008, 07:43 AM)

And if you think about it, this can't work any other way. The external code would need some way to detect that it received a subarray instead of a real array, plus a documented memory layout or access functions to deal with such an array. Neither exists so far, since the information about whether a wire carries a subarray is in the diagram wire's type descriptor, not in the data pointer.

It would also be possible if the Call DLL node were able to explicitly mark a terminal as "takes a subarray". If a full array were being passed, that would just be the initial pointer, index 0, stride 1, and a length equal to the entire array. But it would only work for calls that expected all of this information. It would be a lot of work for minimal benefit, but it could be done.
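
In C terms, such a hypothetical "takes a subarray" terminal might map to a prototype like the following; none of this exists in LabVIEW today:

    /* Hypothetical prototype for a subarray-aware DLL entry point. */
    void ProcessSubArray(double *base, int index, int stride, int length);

    /* A full n-element array would then arrive as:
           ProcessSubArray(data, 0, 1, n);            */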

QUOTE (rolfk @ Mar 24 2008, 02:43 PM)

And if you think about it, this can't work any other way.

Well, there could be at least two options. Either you could 1) query whether the current handle is of subarray type or regular array type using some function call, or 2) configure the DLL call so that the datatype passed would always be of subarray type.


QUOTE (Tomi Maila @ Mar 24 2008, 12:30 PM)

Well, there could be at least two options. Either you could 1) query whether the current handle is of subarray type or regular array type using some function call, or 2) configure the DLL call so that the datatype passed would always be of subarray type.

Well, option 1 I assume is not yet possible, since there is no way to pass type descriptor information along with the handle itself. Option 2 would require a new datatype that contains all that information, including the pointer to the original handle and data pointer, even for arrays that are not really subarrays. Lots of work for little benefit, I would think.

Rolf Kalbermatter


QUOTE (Aristos Queue @ Mar 21 2008, 04:24 PM)

Today I want to mention one that I haven't posted before.

Are your other posts regarding buffer allocation dots stored in a central location? I checked the wiki but didn't find anything.


AQ, does this "stride" array stuff explain why there is a buffer allocation dot in "Reshape Array" even when the number of elements do not change? That has always bothered me.

And as a separate comment, I have to say that while I'm happy that NI has gone to such lengths to optimize this stuff under the hood, it seems like a lot of work for little gain in terms of speed/memory usage. As far as I know, the output of any array native that produces a sub-array (i.e., AQ's "stride" array) either goes to another array native, a non-array native, a subVI, or a terminal. If the sub-array goes to anything other than the first in that list, LV has to automatically allocate a full array before proceeding, which wipes out any gain in efficiency.

So the only instances where I could see this sub-array implementation providing significant gain is when you have a whole bunch of native array operations to do consecutively with nothing else in between. Even splitting these operations up into subVIs would destroy the sub-array optimization. As soon as anything has to be done other than a change in indexing, the sub-array has to allocate a full array for itself. Or am I missing something?
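
For example, in the illustrative C terms from the first post, chaining a decimate with a reverse is pure index/stride arithmetic on the descriptors; no element is touched until something downstream actually reads the data:

    /* Take every 2nd element of an n-element array, then reverse the
       result.  Both steps just build descriptors over the same data.  */
    SubArray dec = { data, 0, 2, n / 2 };
    SubArray rev = { dec.base,
                     dec.index + (dec.length - 1) * dec.stride,  /* last element  */
                     -dec.stride,                                /* walk backward */
                     dec.length };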

I do some work with sparse matrices, but I use Matlab for that because it has native functions for operations on sparse matrices, and because Matlab provides lots of complex indexing capabilities. I might be persuaded to do some of that work in LV if NI ever exposes something like this sub-array stuff to us. But I wouldn't hold my breath: the sub-array structure that AQ describes would mean that a pointer to an array in memory would be exposed, and that's a no-no for NI.

QUOTE (Yuri33 @ Mar 25 2008, 11:26 AM)
AQ, does this "stride" array stuff explain why there is a buffer allocation dot in "Reshape Array" even when the number of elements do not change? That has always bothered me.
Now we're straying into particulars of specific prims... I think the following is correct but I'm not certain. I believe that the Reshape Array has to reallocate in order to have space to store the size of the second dimension as part of the array data.
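
That lines up with the documented layout of LabVIEW array handles for external code: the dimension sizes are stored inline, directly ahead of the element data, in one contiguous block. Sketched in C (with a stand-in for the int32 type from LabVIEW's extcode.h):

    typedef int int32;   /* stand-in for LabVIEW's int32 */

    /* A 1D and a 2D double array as seen by external code: the dimension
       sizes live in the same memory block as the elements themselves.   */
    typedef struct { int32 dimSize;    double elt[1]; } Arr1D;
    typedef struct { int32 dimSize[2]; double elt[1]; } Arr2D;

    /* Reshaping 1D -> 2D changes the size of that inline header, so the
       whole block has to be reallocated.                                */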

QUOTE (Aristos Queue @ Mar 25 2008, 12:29 PM)

Now we're straying into particulars of specific prims... I think the following is correct but I'm not certain. I believe that the Reshape Array has to reallocate in order to have space to store the size of the second dimension as part of the array data.

I realize that, but if Reshape Array produced a sub-array, then the only reallocation would be for four items: a pointer to the original array data, the new dimensions (whether more or fewer than the original array), the index (=0), and the stride (=1). That would be quite efficient.

Edit: I'm sorry, I just realized I was mistaken about what is stored in a sub-array. The dimensional information is not stored, only the pointer, length, start index, and stride. I guess my focus was on flattening arrays, which is something I do often, and that doesn't require the dimensional information, since all flattened arrays are 1D. I've always used Reshape Array to do this, and it's bothered me that there's a buffer allocation, but AQ's explanation (new dimension information, which is stored inline with the original array) makes sense for a new buffer. Perhaps NI should consider a new array primitive that simply flattens an array--that way it could truly produce a sub-array and be quite efficient.
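
In the illustrative descriptor terms from the first post, flattening an n-by-m array would then be nothing more than (the data pointer name here is hypothetical):

    /* All n*m elements, in row-major order, with no copy at all. */
    SubArray flat = { data2d, 0, 1, n * m };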


QUOTE (Yuri33 @ Mar 25 2008, 11:26 AM)

AQ, does this "stride" array stuff explain why there is a buffer allocation dot in "Reshape Array" even when the number of elements do not change? That has always bothered me.

And as a separate comment, I have to say that while I'm happy that NI has gone to such lengths to optimize this stuff under the hood, it seems like a lot of work for little gain in terms of speed/memory usage. As far as I know, the output of any array native that produces a sub-array (i.e., AQ's "stride" array) either goes to another array native, a non-array native, a subVI, or a terminal. If the sub-array goes to anything other than the first in that list, LV has to automatically allocate a full array before proceeding, which wipes out any gain in efficiency.

So the only instances where I could see this sub-array implementation providing significant gain is when you have a whole bunch of native array operations to do consecutively with nothing else in between. Even splitting these operations up into subVIs would destroy the sub-array optimization. As soon as anything has to be done other than a change in indexing, the sub-array has to allocate a full array for itself. Or am I missing something?

I'm not 100% sure whether subarrays are passed through subVI boundaries, but there is no reason why this couldn't be done. The information about whether a wire carries a subarray or a normal array is in the wire's type description, and a subVI has access to that. The only case where a subarray must always be converted into a normal array, for now, is when it is passed to external code (a shared library or CIN), since there is no documented way to deal with subarrays.

In fact, CINs do have access to the type descriptor for all parameters (SetCINArraySize(), and GetTDPtr() which is used internally by SetCINArraySize(), are proof of that), so in theory it could already be done. But since subarrays are not documented, there is no way for non-NI programmers to handle them, and therefore NI always has to pass a normal array (unless they added an additional CINProperties() attribute telling LabVIEW that a CIN is subarray aware). But since CIN has long been a legacy technology that NI is trying to move away from entirely, I don't think they even considered that possibility.

Rolf Kalbermatter

