Allocating a very large array: out-of-memory without crashing?

drjdpowell · October 21, 2019

I have an application where the User provides files from another program, which are typically 10 MB, but with no real limit on their size. One User managed to generate a 900+ MB file, which caused an out-of-memory condition and made my app fail. Rather than the annoying "Out of memory" dialog that pops up, I would much rather get a standard error that I can handle (by telling the User that the file is too big to load and then returning to normal operation). So I have been trying to think of a way to attempt to allocate a very large string with a graceful way to fail if there is not enough memory. Has anyone else found a way to do this?

Antoine Chalons · October 21, 2019

Quote

So I have been trying to think of a way to attempt to allocate a very large string with a graceful way to fail if there is not enough memory. Has anyone else found a way to do this?

I have not.

The last time I faced a similar case, a few years ago (on 32-bit Windows), after some discussion with the users we agreed to check the file size before loading it. If it was larger than 400 MB, the software would reject the load request with a nice message telling the user to use another program to split the large file into several 400 MB files.
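A size check of this kind could look roughly like the following in C. This is only a minimal sketch: the names `file_size_bytes`, `may_load_file` and `MAX_LOAD_BYTES` are hypothetical, the 400 MB figure is simply the one mentioned in the post, and the `fseek`/`ftell` approach is limited to sizes that fit in a `long`.

```c
#include <stdio.h>

/* Hypothetical limit, using the 400 MB figure from the post. */
#define MAX_LOAD_BYTES (400L * 1024L * 1024L)

/* Returns the file size in bytes, or -1 if it cannot be determined
   (missing file, or a size that does not fit in a long). */
long file_size_bytes(const char *path)
{
    FILE *f = fopen(path, "rb");
    long size;

    if (f == NULL)
        return -1;
    if (fseek(f, 0, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }
    size = ftell(f);
    fclose(f);
    return size;
}

/* Returns 1 if the file may be loaded, 0 if it exceeds the limit,
   and -1 if the size could not be read. */
int may_load_file(const char *path, long limit)
{
    long size = file_size_bytes(path);

    if (size < 0)
        return -1;
    return size <= limit ? 1 : 0;
}
```

A caller would test `may_load_file(path, MAX_LOAD_BYTES)` before reading anything and show the "file too large" message when it returns 0.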

drjdpowell · October 21, 2019

The special math functions like "Create Special Matrix" can do it; they return error code -20001 ("Analysis: (Hex 0xFFFFB1DF) There is not enough memory to perform the specified routine.") if you try to create a too-large matrix. They do their work in a DLL, and this might be the way I would have to go.

Rolf Kalbermatter · October 21, 2019

It depends what you want to do with the memory and how but in principle it is pretty easy.

This function will simply return error 2 when the allocation was not successful.

The challenge is to use this allocated buffer with built-in LabVIEW functions. Depending on what functions you want to use it with, you could for instance pass the buffer into a VI in which you read the binary file in chunks and copy each chunk into this buffer with the Replace Array Subset function. Memory management is a bitch and you often have to choose between preallocating memory and passing it all the way down a call chain hierarchy to use it there, or letting the low-level functions attempt to do it and pass the result up through the call chain. LabVIEW chooses the latter, and for good reasons. The former is a lot more complicated to implement and use and generally performs worse, since you tend to copy data twice or more (when using streams, for instance, each inversion of the data direction will usually involve a data copy).

Allocate Array Buffer.vi

Edited October 21, 2019 by Rolf Kalbermatter

ShaunR · October 21, 2019

28 minutes ago, Rolf Kalbermatter said:

It depends what you want to do with the memory and how but in principle it is pretty easy.

This function will simply return error 2 when the allocation was not successful.

The challenge is to use this allocated buffer with built-in LabVIEW functions. Depending on what functions you want to use it with, you could for instance pass the buffer into a VI in which you read the binary file in chunks and copy each chunk into this buffer with the Replace Array Subset function. Memory management is a bitch and you often have to choose between preallocating memory and passing it all the way down a call chain hierarchy to use it there, or letting the low-level functions attempt to do it and pass the result up through the call chain. LabVIEW chooses the latter, and for good reasons. The former is a lot more complicated to implement and use and generally performs worse, since you tend to copy data twice or more (when using streams, for instance, each inversion of the data direction will usually involve a data copy).

Allocate Array Buffer.vi

There is DSMaxMem and DSMemStats which aren't much use. Do you have any info on DSMemStatsSlow and DSMemStats2?

Rolf Kalbermatter · October 21, 2019

25 minutes ago, ShaunR said:

There is DSMaxMem and DSMemStats which aren't much use. Do you have any info on DSMemStatsSlow and DSMemStats2?

Nope, sorry.

Still, trying to find out whether a memory allocation might succeed by looking at whatever memory statistics are available can never be a foolproof approach. It has the classic race: between checking whether you can and actually doing it, the statistics may no longer be current, and you still fail. The only foolproof approach is to actually do the allocation and deal with the failure of it.

Of course for memory allocations that is always tricky, as seen here. We want to read in a 900 MB file and want to be sure we can read it in. Checking whether we can and then trying can still fail. We have to allocate the entire buffer beforehand and then copy the file into this buffer piece by piece. Another approach might be a memory-mapped file, but trying to trick LabVIEW's built-in functions into using such a beast is an entire exercise of its own. You basically invert the complete execution flow, from calling a function that returns some data to first preparing a buffer and handing it to a function that uses it to eventually return that data.
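Outside LabVIEW, the "allocate everything first, then fill it piece by piece" pattern might be sketched in plain C as follows. The names `load_file`, `load_status` and `CHUNK_BYTES` are made up for illustration, and the sketch assumes a platform where `malloc` actually reports failure by returning NULL (systems that overcommit memory may accept the allocation and fail later).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical status codes for this sketch. */
enum load_status { LOAD_OK = 0, LOAD_NO_MEMORY, LOAD_IO_ERROR };

#define CHUNK_BYTES (1024u * 1024u)  /* read 1 MB at a time */

/* Allocates one buffer for the whole file up front, then fills it chunk
   by chunk. On success the caller owns *out and must free() it. */
enum load_status load_file(const char *path, unsigned char **out, size_t *out_len)
{
    FILE *f = fopen(path, "rb");
    long size;
    unsigned char *buf;
    size_t done = 0;

    *out = NULL;
    *out_len = 0;
    if (f == NULL)
        return LOAD_IO_ERROR;
    if (fseek(f, 0, SEEK_END) != 0 || (size = ftell(f)) < 0 ||
        fseek(f, 0, SEEK_SET) != 0) {
        fclose(f);
        return LOAD_IO_ERROR;
    }

    /* The single point where running out of memory is detected. */
    buf = malloc(size > 0 ? (size_t)size : 1);
    if (buf == NULL) {
        fclose(f);
        return LOAD_NO_MEMORY;
    }

    while (done < (size_t)size) {
        size_t want = (size_t)size - done;
        size_t got;

        if (want > CHUNK_BYTES)
            want = CHUNK_BYTES;
        got = fread(buf + done, 1, want, f);
        if (got == 0)
            break;  /* read error or unexpected end of file */
        done += got;
    }
    fclose(f);

    if (done != (size_t)size) {
        free(buf);
        return LOAD_IO_ERROR;
    }
    *out = buf;
    *out_len = done;
    return LOAD_OK;
}
```

The point of the design is that the only allocation happens before any data is read, so a `LOAD_NO_MEMORY` result can be turned into an ordinary error message instead of a crash halfway through the file.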

If you have ever dealt with streams (in Java, or in .NET, which took, among other things, the whole stream concept verbatim from Java) you will know this problem. It's super handy and normally quite easy to use, but internally quite complex. And you always end up with two distinct types that can't be easily connected without some intermediate proxy: Input Streams and Output Streams. Such a proxy will always involve copying data from one stream to the other, adding significant overhead to the originally very simple and seemingly beautiful idea.

Now, one solution that in hindsight would be beneficial in the OP's case would be for those LabVIEW low-level functions to return an error 2 or so in these cases, rather than throw up a dialog that gives you only the option to quit, crash or puke. With the error cluster now present almost everywhere and handled consistently throughout LabVIEW, this would seem the logical choice. Back when LabVIEW was invented, however, error clusters were not even thought of yet, and an out-of-memory condition was an end-of-story condition in almost all cases anyhow, since once it happened LabVIEW would almost surely run into other out-of-memory conditions when trying to handle the previous error. When LabVIEW for Windows came out, most users found 8 MB of memory an outrageously expensive requirement and insisted that LabVIEW should be able to fly to the moon and back with the 4 MB it was claiming to work with in the marketing material.

Edited October 21, 2019 by Rolf Kalbermatter

ShaunR · October 21, 2019

17 minutes ago, Rolf Kalbermatter said:

Another approach might be a memory-mapped file, but trying to trick LabVIEW's built-in functions into using such a beast is an entire exercise of its own.

Been there. Done that.

20 minutes ago, Rolf Kalbermatter said:

The only foolproof approach is to actually do the allocation and deal with the failure of it.

Wouldn't a DSNewHandle suffice?

Rolf Kalbermatter · October 21, 2019

14 minutes ago, ShaunR said:

Wouldn't a DSNewHandle suffice?

That was my first thought too 😆. But!!!!

The Call Library Node only allows for Void, Numeric and String return types and the String is restricted to C String Pointer and Pascal String Pointer. The String Handle type is not selectable. -> Bummer!

And the logic with the two MoveBlock functions, which tells the array in the handle what size it actually has, needs to be done anyway. Otherwise the handle might be resized automatically by LabVIEW at various places when passing through Array nodes, such as the Replace Array Subset node. Replace Array Subset also would not copy data into an array beyond the indicated array size. Handle size and array size are not strictly coupled beyond the obvious requirement

handle size >= dimensions * sizeof(int32) + array size * array element size
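As a worked example of this lower bound, the helper below computes it in C. It is only a sketch: `min_handle_size` is a hypothetical name, "array size" is taken to mean the total number of elements, and any alignment padding between the dimension header and the data is ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Lower bound on the handle size for an array, following the formula
   above: one int32 per dimension plus the element data.
   Returns 0 if the computation would overflow size_t. */
size_t min_handle_size(size_t dimensions, size_t element_count, size_t element_size)
{
    size_t header;

    if (dimensions > SIZE_MAX / sizeof(int32_t))
        return 0;
    header = dimensions * sizeof(int32_t);
    if (element_size != 0 && element_count > (SIZE_MAX - header) / element_size)
        return 0;
    return header + element_count * element_size;
}
```

For a one-dimensional byte array of 900 elements this gives 4 + 900 = 904 bytes; for a two-dimensional array of ten 8-byte elements it gives 8 + 80 = 88 bytes.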

Edited October 21, 2019 by Rolf Kalbermatter

Mads · October 21, 2019

The application in question - would it otherwise behave smoothly with the 900 MB file, if it was able to load it, or would it become so sluggish that it would not make any sense to load that much data anyhow (i.e. the technical issue might just be of technical interest...)?

Why do you not just put a limit on the file size you will load? You can always get a handle on how much memory a file of x megabytes typically takes when loaded, and calculate your suggested limit based on that, either alone or combined with a reading of the available memory.

If the file is above the limit and the processing permits it, you could offer the user the option to decimate the data or extract a subsection of it. You can also allow the user to proceed with the full file, but at least you have given him a warning. If a crash will erase previous work the user might opt out, and if not, it will not look as bad when it does crash.

Edited October 22, 2019 by Mads

JKSH · October 22, 2019

8 hours ago, Rolf Kalbermatter said:

That was my first thought too 😆. But!!!!

The Call Library Node only allows for Void, Numeric and String return types and the String is restricted to C String Pointer and Pascal String Pointer. The String Handle type is not selectable. -> Bummer!

How about creating the initial handle in LabVIEW (e.g. an empty string), passing it into the DLL as a handle, then calling DSSetHandleSize()?

drjdpowell · October 22, 2019

On 10/21/2019 at 4:50 PM, Rolf Kalbermatter said:

It depends what you want to do with the memory and how but in principle it is pretty easy.

This function will simply return error 2 when the allocation was not successful.

Thanks Rolf, your definition of "pretty easy" is slightly wider ranging than mine.

Rolf Kalbermatter · October 22, 2019

31 minutes ago, drjdpowell said:

Thanks Rolf, your definition of "pretty easy" is slightly wider ranging than mine.

Well it is when you look at how the equivalent looks in C 😄

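#include "extcode.h"

/* Resize the LabVIEW string handle to hold `size` bytes; returns the manager error on failure. */
MgErr AllocateArray(LStrHandle *pHandle, size_t size)
{
    MgErr err = NumericArrayResize(uB, 1, (UHandle*)pHandle, size);
    /* The resize does not update the length field, so set it explicitly. */
    if (size && !err)
        LStrLen(**pHandle) = (int32)size;
    return err;
}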

Very simple! The complexity comes from what in C is that simple LStrLen() macro, which does some pointer voodoo that is tricky to reproduce in LabVIEW.

Edited October 22, 2019 by Rolf Kalbermatter

Rolf Kalbermatter · October 22, 2019

17 hours ago, JKSH said:

How about creating the initial handle in LabVIEW (e.g. an empty string), passing it into the DLL as a handle, then calling DSSetHandleSize()?

Would work but has the same problem of having to set the array length too, so you save nothing except that you use DSSetHandleSize() instead of NumericArrayResize() (and need to do some extra calculations, as you also have to account for the extra int32 that is in there).

Edited October 22, 2019 by Rolf Kalbermatter

ShaunR · October 22, 2019

On 10/21/2019 at 5:48 PM, Rolf Kalbermatter said:

That was my first thought too 😆. But!!!!

The Call Library Node only allows for Void, Numeric and String return types and the String is restricted to C String Pointer and Pascal String Pointer. The String Handle type is not selectable. -> Bummer!

And the logic with the two MoveBlock functions, which tells the array in the handle what size it actually has, needs to be done anyway. Otherwise the handle might be resized automatically by LabVIEW at various places when passing through Array nodes, such as the Replace Array Subset node. Replace Array Subset also would not copy data into an array beyond the indicated array size. Handle size and array size are not strictly coupled beyond the obvious requirement

handle size >= dimensions * sizeof(int32) + array size * array element size

Yup. At first glance I thought something could have been done with Byte Array To String etc. But the handle created isn't an array (in the LabVIEW sense), so a MoveBlock would be needed to copy it into another array to dereference it (double the memory required, which is not what we want). The really devious bit of your code is the sequence structure. That's some voodoo I probably would never have thought of.

Edited October 22, 2019 by ShaunR

Rolf Kalbermatter · October 22, 2019

2 minutes ago, ShaunR said:

That's some voodoo I probably would never have thought of.

That's just to force execution of the copying of the array size before assigning the handle to the control. Looks strange when you have created an array with elements but the control shows an empty array. For use as subVI it wouldn't really matter as by the time the subVI returns the array it is correctly sized but when you test run it from the front panel it looks weird.

ShaunR · October 22, 2019

5 minutes ago, Rolf Kalbermatter said:

That's just to force execution of the copying of the array size before assigning the handle to the control. Looks strange when you have created an array with elements but the control shows an empty array. For use as subVI it wouldn't really matter as by the time the subVI returns the array it is correctly sized but when you test run it from the front panel it looks weird.

"Just".

Without it, it would be a debugging nightmare and if I was coming up with that method, I would "just" think it wasn't working.

Rolf Kalbermatter · October 22, 2019

3 minutes ago, ShaunR said:

"Just".

Without it, it would be a debugging nightmare and if I was coming up with that method, I would "just" think it wasn't working.

Sssssht! My first version was without that sequence structure and I was for a brief moment wondering if maybe my ability to do the pointer juggling had failed me. After looking over it once more I figured the problem must be elsewhere and then it struck me that the control assignment was happening right after the NumericArrayResize() call. LabVIEW has a preference to do terminal assignments always as soon as possible.

ShaunR · October 22, 2019

9 minutes ago, Rolf Kalbermatter said:

the control assignment was happening right after the NumericArrayResize() call. LabVIEW has a preference to do terminal assignments always as soon as possible.

Like I said. Voodoo.
