
Multiple EXE instances crashing at the same time?



Hello, 

I have a complex application which calls a proprietary dll from another company. Recently, we have been having repeated crashes (silent, no error dialog, only "Application Error" entries in the Windows Event Log). As a test, I ran 5 copies of the application (using AllowMultipleInstances=True in the EXE's ini file). All four copies calling the dll crashed within a couple of seconds of each other (even though crashes normally happen only on a timescale of hours to days), while a fifth, not calling that dll, did not.

I am confused as to how all 4 copies failed at the same time. The dll just does a calculation (complex, but does not access anything external), so I do not understand how all the copies could fail together (they were running independently on simulated random data). Does anyone have a similar experience? All copies are presumably running in the same LabVIEW runtime engine; could there be any relation to the Runtime Engine?

8 hours ago, drjdpowell said:

All copies are presumably running in the same LabVIEW runtime engine; could there be any relation to the Runtime Engine?

Multiple instances of the same LV executable spawn multiple processes in Windows 10 (tested on 2017 SP1 32-bit), which means they (and their DLLs) have separate memory spaces even if they use the same version of the LabVIEW RTE. My test used a 3rd-party DLL which contains a global "quit()" function; calling the global "quit()" on 1 instance did not affect the other instance, which confirms the separation of memory.

Other things I'd check:

  • Does the crash occur if you only run 1 instance of your test with simulated random data?
  • Does the crash occur if you run your multi-instance test on a different PC?
  • Does the crash occur if you run one instance built with LV 201(x) and another instance built with LV 201(x+y) on the same PC? (Preferably with older versions of LabVIEW, before NI introduced backward-compatible LV RTEs)

 

9 hours ago, drjdpowell said:

(they were running independently on simulated random data)

How does the DLL cope with invalid data? (e.g. divide by 0, Inf, NaN)
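One cheap way to rule this out is to screen the inputs before they reach the DLL. Below is a minimal sketch in Python, assuming the inputs are plain floating-point samples; `guarded_call` and `dll_func` are hypothetical stand-ins for whatever wrapper actually calls the vendor DLL, not part of any real API.

```python
import math
from typing import Callable, Iterable, List, Sequence


def find_non_finite(samples: Iterable[float]) -> List[int]:
    """Return the indices of samples that are NaN or +/-Inf."""
    return [i for i, x in enumerate(samples) if not math.isfinite(x)]


def guarded_call(dll_func: Callable[[Sequence[float]], object],
                 samples: Sequence[float]) -> object:
    """Call dll_func(samples) only if every sample is finite.

    dll_func is a placeholder for the real DLL wrapper (an assumption).
    """
    bad = find_non_finite(samples)
    if bad:
        raise ValueError(f"non-finite input at indices {bad}")
    return dll_func(samples)
```

If the crashes continue with a guard like this in place, bad input values become a much less likely explanation.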

 

9 hours ago, drjdpowell said:

The dll just does a calculation (complex, but does not access anything external)

Are you 100% sure that the DLL doesn't attempt any inter-process communication, network access, file access (including the temp folder), etc.?


Hi, thanks.

It crashes when running only one instance, and on multiple computers. I only ran 5 at once to increase the chance of the crash happening, as it is random and may not happen for many days. I was surprised that they all failed together (the "Application Error" messages in the Windows Event Log were within seconds of each other). This more or less rules out something in the simulated data, like a NaN triggering a bug, as the instances are all independent.

Thanks for the idea of running instances compiled under different LabVIEW versions; I may try that.

I will also ask the DLL maker about file access.


You could see if Dependency Walker (https://dependencywalker.com/) shows anything suspicious as a dependency of the DLL. You can also use Resource Monitor on Windows to see if it's opening handles or files at that time, but I don't know how to log it -- to the best of my knowledge it only works in real time. My other thought is antivirus. Have you tried running with it disabled?

On 4/10/2020 at 2:50 AM, JKSH said:

Multiple instances of the same LV executable spawn multiple processes in Windows 10 (tested on 2017 SP1 32-bit), which means they (and their DLLs) have separate memory spaces even if they use the same version of the LabVIEW RTE. My test used a 3rd-party DLL which contains a global "quit()" function; calling the global "quit()" on 1 instance did not affect the other instance, which confirms the separation of memory.

It is actually possible to force a DLL to use a shared memory space, so that may still be the problem. I had a similar problem once: when I ran the DLL in development OR in an exe, it was OK. But when the exe was running and I simultaneously ran the dev VI, it would crash. I never bothered to look for the actual solution, as I only needed one exe in production. But I've heard that renaming the DLL for each instance of the application might be a solution, as it allocates the memory space, shared included, according to the name. I'm not really sure if it's true, but if you're desperate, it is worth a try.
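To illustrate the general idea of memory that is shared by name (this says nothing about what the vendor DLL really does, and it does not confirm the renaming trick), here is a small sketch using Python's standard `multiprocessing.shared_memory` module. Both handles live in one process here for brevity; a second process could attach in the same way using only the block's name.

```python
from multiprocessing import shared_memory

# Create a named block; the name is the key other processes would use to find it.
owner = shared_memory.SharedMemory(create=True, size=16)
try:
    # A second handle attaches purely by name, as a separate process could.
    peer = shared_memory.SharedMemory(name=owner.name)
    try:
        owner.buf[0] = 0xAB          # write through one handle
        seen_by_peer = peer.buf[0]   # read the same byte through the other
    finally:
        peer.close()
finally:
    owner.close()
    owner.unlink()  # ask the OS to destroy the block once it is unused
```

If a DLL keeps state in something like this, every process that maps the same name sees the same bytes, so one corrupted value could bring down all of them at about the same moment.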

 

Quote

It is actually possible to force a DLL to use a shared memory space, so that may still be the problem.

Another possibility is that the DLL creates a separate process to do some work and then communicates with that process... if that process is a single instance that all DLL instances talk to, that would be another way to get a crash like this. Look for another process appearing in your Task Manager right after you start the first copy of the app. 
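If watching Task Manager by eye is awkward, the before/after comparison can be scripted. This is a rough sketch that assumes Windows and the stock `tasklist` command with CSV output; the function names are my own. It compares image names only, so a helper with the same name as something already running would be missed.

```python
import csv
import io
import subprocess
from typing import Set


def parse_tasklist_csv(text: str) -> Set[str]:
    """Extract image names (first column) from `tasklist /FO CSV /NH` output."""
    return {row[0] for row in csv.reader(io.StringIO(text)) if row}


def snapshot_process_names() -> Set[str]:
    """Windows only: names of the processes running right now."""
    out = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_tasklist_csv(out)


def new_processes(before: Set[str], after: Set[str]) -> Set[str]:
    """Names present in the second snapshot that were absent from the first."""
    return after - before
```

Typical use: take a snapshot, start the first copy of the app, wait a few seconds, take a second snapshot, and print `new_processes(before, after)`.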


Another idea: Does the DLL possibly reach for some external resource, like a database or network connection? If that thing fails, maybe their error handling for the failure is poor, and all the EXEs would see the failure at roughly the same time. 


Thanks for the ideas. For clarification: I don't actually want multiple running instances of the EXE; rather, I need to find out why it is crashing silently. I am running multiple EXE instances to try to increase the amount of debug info (since I can try different configurations in each). So far, my only clue is that all running copies using the dll die within seconds of each other, which implies a common trigger event.

Sadly, there are no obvious issues using Dependency Walker, and no network connections to the dll. The problem is seen on multiple PCs, running single or multiple instances. I am trying to rebuild the app in LabVIEW 2019 as a test, but it is proving difficult (builds are broken, and it is a very large app).
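For anyone wanting to check the "common trigger" pattern over a longer period, the crash times from the Event Log can be grouped automatically. A minimal sketch, assuming the timestamps have already been exported and parsed into `datetime` objects; the function name and the window size are arbitrary choices of mine.

```python
from datetime import datetime, timedelta
from typing import List


def cluster_events(times: List[datetime],
                   window: timedelta) -> List[List[datetime]]:
    """Group timestamps so that consecutive events in a group are
    no more than `window` apart."""
    clusters: List[List[datetime]] = []
    for t in sorted(times):
        if clusters and t - clusters[-1][-1] <= window:
            clusters[-1].append(t)   # close enough to the previous event
        else:
            clusters.append([t])     # start a new group
    return clusters
```

Groups with more than one entry are the crashes that happened together; groups of one would suggest independent failures.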

1 hour ago, drjdpowell said:

Thanks for the ideas. For clarification: I don't actually want multiple running instances of the EXE; rather, I need to find out why it is crashing silently. I am running multiple EXE instances to try to increase the amount of debug info (since I can try different configurations in each). So far, my only clue is that all running copies using the dll die within seconds of each other, which implies a common trigger event.

Sadly, there are no obvious issues using Dependency Walker, and no network connections to the dll. The problem is seen on multiple PCs, running single or multiple instances. I am trying to rebuild the app in LabVIEW 2019 as a test, but it is proving difficult (builds are broken, and it is a very large app).

Probably not related to your issue... but once upon a time I had a nasty piece of hardware that I had to interact with using their DLL. On a certain type of PC it would randomly crash in LabVIEW after some number of days. The same PC running a simple python script calling the same DLL functions ran without issue for weeks at a time. In the end I had to change the PC 😞 and the problem never happened again. I tried every combination of DLL settings I could think of, but nothing ever got it working nicely on that PC.
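For reference, the kind of script I mean looks roughly like the sketch below. It is not the original script, and the DLL path and the `compute` function in the trailing comment are placeholders, not any vendor's real API. The point is to log and flush before every call, so that after a hard crash the last line of the log names the iteration that was in progress.

```python
import time
from typing import Callable, TextIO


def soak(call: Callable[[int], object], iterations: int, log: TextIO,
         pause_s: float = 0.0) -> int:
    """Invoke call(i) repeatedly, logging before and after each call.

    Every line is flushed immediately, so if the process dies inside
    the call, the last line in the log shows which iteration was running.
    Returns the number of completed iterations.
    """
    done = 0
    for i in range(iterations):
        log.write(f"{time.time():.3f} start {i}\n")
        log.flush()
        result = call(i)
        log.write(f"{time.time():.3f} ok {i} -> {result!r}\n")
        log.flush()
        done += 1
        if pause_s:
            time.sleep(pause_s)
    return done


# Hypothetical wiring to a real DLL (all names below are placeholders):
#   import ctypes
#   lib = ctypes.CDLL(r"C:\path\to\vendor.dll")
#   lib.compute.restype = ctypes.c_double
#   lib.compute.argtypes = [ctypes.c_double]
#   with open("soak.log", "a") as f:
#       soak(lambda i: lib.compute(float(i)), 10**9, f)
```

If the same calls survive for weeks outside LabVIEW, that at least narrows the problem down to the interaction between the DLL and the host application or PC.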


I've been dealing with silent crashes for some months now, and it's been a frustrating exercise (I created some NI support tickets to try to help, but that's been difficult as well). I'm using LV2017.0.1f4 (64-bit), and I suspect it's related to a possibly misbehaving NI frame grabber (NI PCIe-1433), since it seems to crash most often when starting up/initializing communication with the cameras. The system has 6 frame grabbers and also has various other hardware (power supplies, temperature chamber, etc.).

 

Recently (yesterday) I disabled the parallel startup of the communications. One of the frame grabbers was also acting up (noise on the camera com channel), so I only used 5 of the 6 frame grabbers, and last night the entire run was successful! My coworker was amazed! We are running another test tonight to make sure it wasn't a fluke.

 

So right now I'm pointing the finger at either faulty hardware or some bug not allowing many simultaneous IMAQ connections to all start up at the same time.

 

Bruce

41 minutes ago, bmoyer said:

or some bug not allowing many simultaneous IMAQ connections to all start up at the same time.

That is interesting! I actually came to the same conclusion. I was trying to launch 4 of my own style actors which each opened an IMAQ reference to a (different) camera and quite often trying to do this launch in parallel would cause the system to hang. I changed it to a serial launch process and the problem never happened again.
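In text-language terms, the change amounts to keeping the workers parallel but putting a lock around the "open" step. The real fix was done in LabVIEW; this Python sketch only shows the pattern, and `open_device` is a hypothetical stand-in for opening an IMAQ reference.

```python
import threading
from typing import Callable, List

_open_lock = threading.Lock()  # one lock shared by every worker


def run_workers(open_device: Callable[[int], None], count: int) -> None:
    """Start `count` threads in parallel, but let only one of them
    run open_device at any given time."""
    def worker(index: int) -> None:
        with _open_lock:          # serialise just the risky open step
            open_device(index)
        # ... the rest of the worker would run in parallel from here ...

    threads: List[threading.Thread] = [
        threading.Thread(target=worker, args=(i,)) for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Only the initialisation is serialised, so the startup cost is a slightly slower launch rather than a loss of parallelism afterwards.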

