Investigating LabVIEW Crashes - dmp file


Hi,

My LabVIEW 2016 32 bit is crashing a lot.

I can guess it is related to:

1. incompatibility with DAQmx 9.1.5 and the related device driver which I need for LabVIEW 8.5.1 which I also use (NI's support jumped on that issue yet I think it is not related)

2. an addon that I use to access OpenCV leaving some references open in the dlls

3. Controls with same label or with no label (I believe that this is the issue -LV can't recover correctly in such cases)

However, I don't want to guess like NI's support that it is point 1 and do nothing to prove it.

I expect to see the reason in a log file yet the support wasn't able to show me where to find it.

Searching around I found the dmp file under documents\LabVIEWdata

Is that the way to check those crashes, or does anyone know of a better way to check it besides elimination?

Thanks in advance

Link to comment

Opinions of Applications Engineer and not of NI R&D

Each crash is going to create a .zip with an lvlog.txt and multiple .dmp files. The lvlog.txt will first print some basic information about your setup: OS build, AppName (LabVIEW if it's the dev environment), LV version, etc. The log will then print out some thread information, which has never been all that informative to me. Finally, you will see a Debug Output section for each DWarn or DAbort. A DWarn is something that LV could recover from and a DAbort is something that LV could not recover from. Usually the DAbort is what we are concerned with, but if we get 20+ DWarns printed quickly before a crash, I typically suspect they are closely related to the crash. I looked at a log on my computer and found the following.

<DEBUG_OUTPUT>
2/27/2018 8:21:52.351 AM
DWarnInternal 0xB0BC654A: No path given for current component in JSON map file. C:\ProgramData\National Instruments\LVComponentLocationInfo\LVToolsProductMap.json
e:\builds\penguin\labview\branches\2017\dev\source\panel\showerr.cpp(3732) : DWarnInternal 0xB0BC654A: No path given for current component in JSON map file. C:\ProgramData\National Instruments\LVComponentLocationInfo\LVToolsProductMap.json
minidump id: 9c36ea77-c0f5-45bb-8b2f-2c4267d223ca
$Id: //labview/branches/2017/dev/source/panel/showerr.cpp#5 $

The first line is the time of the error. The next two lines print the actual error message (if it has one) and the source where that error happened. Finally it prints the minidump id which ties this particular error back to a specific .dmp file in the zip folder. If you open the dump file of a minidump and have the correct symbols loaded you will be able to see the call stack but not much else (unless you have enabled full dumps). How much help you get from looking at this information is going to vary greatly depending on the particular crash but the most I personally hope for is a good indication of where to start looking. For instance, if the entirety of the call stack is DAQmx calls, looking at the DAQmx portion of your application sounds like a good place to start at least. Now the stack may just show that LV is allocating/copying memory and then messing with the Data Space. If that's the case, then we have to look for when LV may be allocating memory which means we haven't really learned anything too useful.
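To make the layout described above concrete, here is a minimal Python sketch that pulls the timestamp, the DWarn/DAbort code, the message and the minidump id out of each Debug Output section. It assumes the sections look like the two excerpts quoted in this thread; the `parse_debug_output` helper, the `DebugEntry` type and the assumption that a DAbort line has the same shape as a DWarn line are mine, not anything NI documents.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DebugEntry:
    timestamp: str
    kind: str                    # "DWarn" or "DAbort"
    code: str                    # e.g. "0xB0BC654A"
    message: str
    minidump_id: Optional[str]   # ties the entry to a .dmp file, if printed


# Assumption: a DAbort line is shaped like the DWarn lines seen in the thread.
HEADER = re.compile(r"^(DWarn|DAbort)(?:Internal)?\s+(0x[0-9A-Fa-f]+):\s*(.*)$")
MINIDUMP = re.compile(r"^minidump id:\s*(\S+)")


def parse_debug_output(text: str) -> List[DebugEntry]:
    """Split the log on <DEBUG_OUTPUT> markers and parse each section."""
    entries = []
    for block in text.split("<DEBUG_OUTPUT>")[1:]:
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        timestamp = lines[0]          # first line of a section is the time
        header = None
        dump_id = None
        for ln in lines[1:]:
            m = HEADER.match(ln)
            if m and header is None:
                header = m.groups()
                continue
            m = MINIDUMP.match(ln)
            if m:
                dump_id = m.group(1)
        if header is not None:
            kind, code, message = header
            entries.append(DebugEntry(timestamp, kind, code, message, dump_id))
    return entries


# The two excerpts quoted in this thread, used as sample input.
SAMPLE = r"""
<DEBUG_OUTPUT>
2/27/2018 8:21:52.351 AM
DWarnInternal 0xB0BC654A: No path given for current component in JSON map file. C:\ProgramData\National Instruments\LVComponentLocationInfo\LVToolsProductMap.json
e:\builds\penguin\labview\branches\2017\dev\source\panel\showerr.cpp(3732) : DWarnInternal 0xB0BC654A: No path given for current component in JSON map file. C:\ProgramData\National Instruments\LVComponentLocationInfo\LVToolsProductMap.json
minidump id: 9c36ea77-c0f5-45bb-8b2f-2c4267d223ca
$Id: //labview/branches/2017/dev/source/panel/showerr.cpp#5 $
<DEBUG_OUTPUT>
21/02/2018 17:31:12.240
DWarn 0x50CBD7C1: Got corruption with error 1097 calling library OpenCVWrapperToCpp.dll function ?lv_Canny@@YAXHHNNH_NPAH@Z
"""

if __name__ == "__main__":
    parsed = parse_debug_output(SAMPLE)
    print(Counter(e.kind for e in parsed))
    for e in parsed:
        print(e.timestamp, e.kind, e.code, e.minidump_id)
```

Counting entries per kind and per code across several crash zips is a cheap way to see whether the same DWarn keeps showing up shortly before each DAbort.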

Now let's say that we actually get useful information out of this and I can tell you that the reason for the crash is that you are trying to read data from a NULL channel reference (I'm just making that up). Now we've found the root cause in our source but we're not really closer to actually resolving your crash are we? The crash logs may tell us what went wrong at the lowest level in the LabVIEW source but without knowing how we ever got into that situation we aren't really able to make any recommendations for you to act upon. This is why having a reproducing test case is so important. If we know what went wrong and can show how we got there, we can take it to the appropriate people.

TLDR: If you do attach some crash logs I am willing to take a quick look and let you know what I see but I can't promise anything.

Link to comment

Thank you for the reply.

If it was code related I could have sent you a code sample.

However, it feels more like an environment issue to me.

If I save everything there will be no crash. If I code for a couple of hours without saving... a crash will be very likely no matter what project I'm working on.

It can even be related to a strong antivirus access blocking issue. I don't want to guess, do an endless elimination or reinstall LV. I want proof.

The direction you give is what I found. Thanks for the analysis example. I wish NI gave a tool that would analyze that dmp file for us.

Attached you can find the dmps and the MAX report in case you can understand the reason for the crashes.

How do I load the correct symbols in order to understand that dmp file? You didn't give an example for that part :)

16.0f5 (32-bit).zip

ni_support.zip

That said, I would love to understand what NI R&D would do? How do I turn on full dump reports?

Are there other tricks that could help me find the culprit?

How do other programmers here handle such issues?

Edited by 0_o
Link to comment

Some of your crash logs point towards the TSVN Toolkit. Possibly related: https://knowledge.ni.com/KnowledgeArticleDetails?id=kA00Z0000019RInSAM

Edit: OpenCV also causes some issues:

<DEBUG_OUTPUT>
21/02/2018 17:31:12.240
DWarn 0x50CBD7C1: Got corruption with error 1097 calling library OpenCVWrapperToCpp.dll function ?lv_Canny@@YAXHHNNH_NPAH@Z

I suppose you were working with the Call Library function at that moment, so these problems are to be expected. Just make sure to restart LV every now and then to reset the memory or LV will crash eventually.

Edited by LogMAN
Link to comment

LogMAN, you are correct. Thanks!

I wonder why NI's support eng that connected to my computer couldn't tell me the same? He simply searched for an incompatibility issue that could save him from dealing with the problem.

I too saw those issues:

1. providers always race with the internal provider and cause problems. 

2. OpenCV DLL wrapper... DLL...

However, the crashes didn't happen during execution and you refer to DWarns and not DAborts.

Thus, I think more in the line of memory issues from call libraries.

Can you elaborate as to the reason it happens?

I have several ways to check for memory issues and already saw that even when I close a reference to a dll using the call library the memory is not freed till the calling vi is closed (BD closed) and that is often the stage when LV crashes.

Analyzing a video and keeping the snapshots in an array can get LV from 50MB RAM to 380 MB in no time and this is a 32bit version (2GB max RAM). From my experience the issue is FP memory usage. Even 50MB under 1 control is an issue. 

Why should the call library behavior be expected? Any way to prevent it without restarting once an hour? The exe should work 24/7. I can't ask for it to be restarted for memory issues every couple of hours.

Thanks in advance. 

Edited by 0_o
Link to comment
40 minutes ago, 0_o said:

However, the crashes didn't happen during execution and you refer to DWarns and not DAborts.

You mean they didn't happen in the executable, right? The executable has no automatic error handling (unless compiled with debugging enabled, maybe), so the errors still happen, but silently.

Error 1097 indicates that the library corrupted memory. Memory that doesn't belong to the library, which is a serious issue. I'm not an expert with Call Library nodes, but this should be taken care of. Here is an article that might be of help: https://knowledge.ni.com/KnowledgeArticleDetails?id=kA00Z000000P6tcSAC

@jacobson mentioned before:

19 hours ago, jacobson said:

A DWarn is something that LV could recover from and a DAbort is something that LV could not recover from.

A DWarn doesn't mean LV restored the corrupted memory, just that it was able to continue operation despite the memory being corrupted. You should still make sure it doesn't happen again!

42 minutes ago, 0_o said:

I have several ways to check for memory issues and already saw that even when I close a reference to a dll using the call library the memory is not freed till the calling vi is closed (BD closed) and that is often the stage when LV crashes.

Sounds to me like the Call Library nodes are not configured properly. The library is unloaded by LV when closing a VI which causes the library to free all of its memory. Since the library accessed memory in the address space of LV, it tries to free it, causing LV to crash in the process.

44 minutes ago, 0_o said:

Why should the call library behavior be expected?

I meant that this is to be expected while working on the Call Library node. At least it takes me a while to figure out the correct settings before it works without errors.

Link to comment

I meant execution in that paragraph, not executable. In the later paragraph I meant the executable.

4 minutes ago, LogMAN said:

Sounds to me like the Call Library nodes are not configured properly. The library is unloaded by LV when closing a VI which causes the library to free all of its memory. Since the library accessed memory in the address space of LV, it tries to free it, causing LV to crash in the process.

This line you wrote is the solution in my case I guess. 

I'll update the person who created that code and hope for a quick fix.

Thanks!

 

Link to comment

In your case, regardless of how many DWarns are thrown, the actual DAbort seems pretty much the same every time. Basically, there are a bunch of general LabVIEW calls, a few mxLvProvider calls and a single msvcr call that seem the same between every crash, and then we go straight into crash reporting calls. I believe the mxLvProvider is the LabVIEW project provider (maybe @David_L remembers some of this?) and if that is the case my best guess would be problems with the TSVN toolkit as LogMAN suggested (not only because of the DWarns but because I believe TSVN does some project provider things).

As for one of your earlier questions, you won't have the correct symbols unless you work for NI. One thing I do like quite a bit about LabVIEW NXG is that the call stack with full function names is printed in the log so that's something to look forward to.

Link to comment

mxLvProvider definitely refers to some project provider code, and TSVN is almost completely a project provider add-on.  The best bet in this case would be to uninstall TSVN and see if you still see crashing.  If the problem points to TSVN you might be able to reach out to the developer, but since it's a free toolkit I don't know how much bandwidth they put into development and updates on it.  

Link to comment
16 hours ago, 0_o said:

However, the crashes didn't happen during execution and you refer to DWarns and not DAborts.

Thus, I think more in the line of memory issues from call libraries.

Can you elaborate as to the reason it happens?

... the memory is not freed till the calling vi is closed (BD closed) and that is often the stage when LV crashes.

That is the nature of memory corruption: Often, it doesn't cause a crash immediately. The crash happens later, when something else tries to use the corrupted memory.

 

16 hours ago, 0_o said:

Analyzing a video and keeping the snapshots in an array can get LV from 50MB RAM to 380 MB in no time and this is a 32bit version (2GB max RAM). From my experience the issue is FP memory usage. Even 50MB under 1 control is an issue.

This is monitoring memory allocation. It helps you detect memory leaks, but doesn't detect memory corruption. They are different issues.

Memory leaks cause crashes by using up all of your application's memory. Memory corruptions cause crashes by scrambling your application data.

Link to comment

Error 1097 is hardly related to a resource not being deallocated, but almost always to some sort of memory corruption due to overwritten memory. LabVIEW sets up an exception handler around Call Library Nodes, unless you disable that in the Call Library Node configuration, which catches low-level OS exceptions, and when you set the debug level to the highest, it also creates some sort of trampoline around buffers. These are memory areas placed before and after the buffers passed to the DLL function as parameters and filled with a specific pattern; after the function returns to LabVIEW, it checks that these trampoline areas still contain the original pattern. If they don't, then the DLL call somehow wrote beyond (or before) the buffer it is supposed to write to, and that is then reported as error 1097 too. It may only have affected the trampoline area, which would mean that nothing really bad happened, but if it overwrote the trampoline areas it may just as well have overwritten more, and then crashes are going to happen for sure, sooner rather than later.
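The pattern check described above can be illustrated with a small Python simulation. This is only a sketch of the general idea (padding before and after a buffer, filled with a known pattern and verified after the call); the `GUARD` bytes, the `call_with_guards` helper and the two callee functions are made up for illustration and say nothing about the pattern or sizes LabVIEW actually uses.

```python
# Arbitrary illustrative pattern; LabVIEW's real pattern is not given in this thread.
GUARD = b"\xDE\xAD\xBE\xEF" * 4


def call_with_guards(callee, size):
    """Run `callee` on a `size`-byte buffer surrounded by guard areas.

    Returns (payload, guards_intact). The callee receives the whole arena,
    the offset of the payload and the payload size, much like a C function
    receives a raw pointer and has to respect the size on its own.
    """
    g = len(GUARD)
    arena = bytearray(GUARD + bytes(size) + GUARD)
    callee(arena, g, size)
    front_ok = bytes(arena[:g]) == GUARD
    back_ok = bytes(arena[g + size:]) == GUARD
    return bytes(arena[g:g + size]), front_ok and back_ok


def well_behaved(arena, offset, size):
    # Writes exactly `size` bytes, staying inside the payload.
    arena[offset:offset + size] = b"\x01" * size


def off_by_one(arena, offset, size):
    # Writes one byte too many and clobbers the first byte of the rear guard.
    arena[offset:offset + size + 1] = b"\x01" * (size + 1)


if __name__ == "__main__":
    for fn in (well_behaved, off_by_one):
        _, intact = call_with_guards(fn, 8)
        print(fn.__name__, "guards intact" if intact else "guard overwritten")
```

A check like this only notices writes that land in the padding. A write that jumps further past the buffer leaves the pattern untouched and goes undetected, which is why a clean run does not prove the call is safe.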

In most cases the reason for error 1097 is actually a buffer passed to the function that the function is supposed to write some information into. Unlike in normal LabVIEW code, where the LabVIEW nodes will allocate whatever buffer is needed to return an array of values for instance, this is not how C calls usually work. Here the caller has to preallocate a buffer large enough for the function to return information in. One byte too little and the function will overwrite some memory it is not supposed to touch.
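As a rough illustration of that calling convention, the sketch below uses Python's `ctypes` to play the role of the caller. `fill_samples` is a hypothetical stand-in for a C function of the form `int fill_samples(uint8_t *dest, size_t capacity, size_t count)`; it is not part of any real library. The point is that the caller allocates the buffer up front and tells the callee how large it is.

```python
import ctypes


def fill_samples(dest, capacity, count):
    """Hypothetical stand-in for a C function that fills a caller-owned buffer.

    A careful C API takes the capacity and refuses to write past it; many
    real functions do not, and simply trust the caller.
    """
    if count > capacity:
        return -1                      # refuse rather than overrun
    data = bytes(i % 256 for i in range(count))
    ctypes.memmove(dest, data, count)  # raw copy into the caller's memory
    return count


needed = 16
buf = ctypes.create_string_buffer(needed)   # caller preallocates 16 bytes
written = fill_samples(buf, len(buf), needed)

if __name__ == "__main__":
    print(written, buf.raw.hex())
    print(fill_samples(buf, len(buf), needed + 1))  # one byte too many
```

In LabVIEW the equivalent of `create_string_buffer` is usually initializing an array or string of the right size and wiring it into the Call Library Node, with the size passed as a separate parameter when the function accepts one.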

I'm not sure which OpenCV wrapper library you are using, but image acquisition interfaces are notorious for making such buffer allocation errors. Here you have relatively large buffers that need to be preallocated, and the size of that buffer depends on various things such as width, height, line padding, bit depth, color coding etc., and it is terribly easy to go wrong there and calculate a buffer size to allocate that will not match what the C function tries to fill in because of a mismatch of one or more of these parameters.

For instance, if you create an IMAQ image and then retrieve its pointer to pass to the OpenCV library to copy data into, it will be very important to use the same image type and size that the OpenCV library wants to fill in, and to tell OpenCV about the IMAQ constraints such as the border size, line stride, etc.
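To show how quickly these parameters change the required size, here is a generic Python sketch of the arithmetic. The `image_buffer_size` function, its parameters and the example border and alignment values are assumptions for illustration only. The real stride and border of an IMAQ image should be read back from the image itself rather than computed like this.

```python
def image_buffer_size(width, height, bits_per_pixel, border=0, alignment=1):
    """Return (stride_in_bytes, total_bytes) for a padded image buffer.

    border:    extra pixels added on every side of the image
    alignment: each row is padded up to a multiple of this many bytes
    """
    bytes_per_pixel = (bits_per_pixel + 7) // 8
    row_bytes = (width + 2 * border) * bytes_per_pixel
    stride = -(-row_bytes // alignment) * alignment   # round up to alignment
    rows = height + 2 * border
    return stride, stride * rows


if __name__ == "__main__":
    # Tightly packed 8-bit image, as a library might assume by default.
    print(image_buffer_size(640, 480, 8))
    # Same image with an illustrative 3-pixel border and 64-byte row alignment.
    print(image_buffer_size(640, 480, 8, border=3, alignment=64))
```

If one side assumes the tightly packed layout and the other side uses the padded one, every row after the first lands at the wrong offset, and whichever side assumed the smaller buffer ends up overrun by the other.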

Edited by rolfk
Link to comment
