LV8.6 application builder, shared library glibc problem


Posted

Dear LAVAers

After having the Linux version of LabVIEW 8.6 for a while, we decided it was time to move our applications from LabVIEW 8.2.1 to LabVIEW 8.6. Most things went fine, but I got into trouble when trying to port some of the applications that make use of shared libraries.

There is one library in particular I have been battling with: an implementation of Role Based Access Control (RBAC) done in C++ and used in most of our front ends at work. Unfortunately I can't upload the source of the libraries here, since the whole RBAC project pulls in several external shared libraries and server dependencies, and won't compile without them.

What puzzles me, and led me to believe that there is something wrong with the application builder, is that the shared library call works just fine under the full development environment but fails once I build an executable from it.

It also works just fine as a VI and as an executable in LabVIEW 7.1, 8.2 and 8.2.1, but not in 8.5 and 8.6 (I haven't gotten the 8.6.1 distro for Linux yet).

So here is the error message I get:

*** glibc detected *** free (): invalid pointer 0x0832d930 ***

And here is what I have tried so far to solve this problem:

- Simplified the library call and narrowed the trigger down to one function in the library (a class calling a pointer to another class).

- Gathered and installed all the latest libraries needed by LabVIEW executables (on a modified Red Hat 5 distro that meets all the requirements for LabVIEW 8.6) // NOT WORKING

- Copied all these libraries into the ./ path of the executable, to ensure that these versions were loaded (and set LD_LIBRARY_PATH to ./) // NOT WORKING, STILL CRASHES

- Installed the latest .rpm of glibc (glibc-2.9) (LabVIEW 8.6 for Linux supposedly works with 2.2.4 and later) // NOT WORKING, SAME PROBLEM

- Installed LabVIEW 8.6 through rpm on a different machine running Fedora Core 10 with all the latest patches, to verify that the problem does not come from the core libraries (our main installation is a custom one, so I thought we had missed something in the installation process) // NOT WORKING

- Used the shared library from an application other than LabVIEW (made a small C wrapper calling the same functions as LabVIEW does) // WORKS JUST FINE :(

- Compiled and ran with LabVIEW 8.2.1 // WORKS JUST FINE :(

- Took a core dump of the executable and debugged it through gdb // NO DEBUG INFORMATION (of any use to me)

- Compiled the shared library on Windows and made a LabVIEW 8.6 executable calling the library there // WORKS :(

- Tried to find memory leaks through valgrind, but valgrind sends an abort signal when the library is called through the LabVIEW executable // OF NO HELP

- Ate lots of biscuits and drank too much coffee, realising that 4 days have passed and nothing has happened.

Here is what I'm planning to do next:

- Make another shared library that calls the RBAC wrapper library, set up so that (hopefully) it uses the system memory space rather than going through LabVIEW, with only the actual data passed in between.
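A minimal sketch of what such an in-between library could look like, assuming a thin `extern "C"` layer that keeps every C++ object and exception on the library side and hands the caller only plain bytes. The names `fetch_token` and `rbac_get_token` and the error codes are made up for illustration; they are not part of the real RBAC project.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the real C++ call; it may throw.
static std::string fetch_token(const std::string& user) {
    if (user.empty()) throw std::invalid_argument("empty user name");
    return "token-for-" + user;
}

// C entry point intended for a Call Library Node.
// No C++ type and no exception crosses this boundary.
// Returns 0 on success, -1 on bad arguments,
// -2 if the caller's buffer is too small, -3 if the C++ side threw.
extern "C" int rbac_get_token(const char* user, char* out, std::size_t out_capacity) {
    if (user == NULL || out == NULL || out_capacity == 0) return -1;
    try {
        const std::string token = fetch_token(user);
        if (token.size() + 1 > out_capacity) return -2;
        std::memcpy(out, token.c_str(), token.size() + 1);  // includes the '\0'
        return 0;
    } catch (...) {
        out[0] = '\0';  // leave a valid empty string behind
        return -3;
    }
}
```

The caller owns the output buffer and the library owns everything else, so nothing allocated on one side is ever freed on the other.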

If this compiled shared library also crashed on LabVIEW 8.2 and on Windows, I would probably be tempted to believe that the problem was caused by some bad memory handling in the C++ project (calling delete twice on the same pointer, or something of that sort). But it works in the LabVIEW 8.6 development environment on my Linux box, it works in the Windows environment, it works when building executables on Windows, and it works when building executables in LabVIEW 8.2.1 on Linux. So it has to be the application builder??!! (Which is another way of saying that I really don't have any clue at this point why the lvexec fails and the development environment doesn't.)

Can any of you bright minds out there shed some light on this? Right now I don't know what to do next. Any tips or links to useful stuff are highly appreciated.

Cheers

X :headbang:

(Unfortunately the upload thingy here prevented me from uploading the .tar.gz of my test vi)

Posted

Have you tried working with NI support? You definitely should. If they can reproduce it then they should be able to figure out if it's a bug in LabVIEW or a bug in your code. If it's a bug in LabVIEW then they should help you get a workaround and file a bug report to make sure it gets fixed.

From your attempt to debug, can you tell if the free call came from LabVIEW or your DLL?

It's tempting to say "it must be LabVIEW because it works all these other ways", but memory corruptions, lack of initialization, and other kinds of memory-related bugs can be very sneaky. They can lurk for a long time and not cause any (noticeable) problems, and then suddenly some code around it changes and everything goes to hell. I'm not saying it's NOT a bug in LabVIEW, but I haven't seen this problem before, so I don't know.

Posted

QUOTE (Adam Kemp @ Mar 27 2009, 04:36 PM)

Hi Adam and thanks for your reply!

I have started the process of narrowing down the problem, and it now seems to be coming from the shared library. My problem is that this implementation is huge and involves several developers in different countries, so getting hold of all the sources for recompilation and debugging isn't always easy.

I agree that it might be a bit too hasty to blame the application builder, but I just find it a bit strange that everything seemingly runs fine in LabVIEW 8.2.1 executables.

I will try to do some more debugging and let you know :)

cheers

X

Posted

QUOTE (xavier30 @ Mar 27 2009, 11:17 AM)

So you are presumably using the Call Library Node to call those external libraries. Have you checked (really, really thoroughly) that you pass all the right data types to the shared lib and, most importantly, that you never pass in an array or string buffer to be filled in by the library that could be too small?

The fact that it works in the IDE or on Windows really means nothing for this kind of problem. The buffer you pass into the library might often border non-vital data that gets corrupted too, but without causing a fatal crash. Or it might even be a buffer for a file path that is shorter in those other situations, so that illegal memory is never overwritten except in the runtime installation.

There are lots of things to consider here, but using the Call Library Node and testing that your application does not crash on one specific installation/build is really not enough. You really ought to validate every single Call Library Node: all the data types must be right and, most importantly, there must either be no output data buffer (an array or string filled in by the library) or those buffers must be preallocated large enough in the LabVIEW diagram under all possible circumstances.
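To illustrate the buffer point from the library side: one common way to make the required size checkable is to have the exported function take the buffer capacity and report how many elements it needs. The sketch below assumes a hypothetical export named `get_samples` with made-up data; it is not taken from the library discussed in this thread.

```cpp
#include <cstddef>
#include <stdint.h>

// Hypothetical export: fills `out` with at most `capacity` values and
// reports through `needed` how many values exist in total.
// Returns the number of elements written, or -1 on bad arguments.
extern "C" int32_t get_samples(int32_t* out, int32_t capacity, int32_t* needed) {
    static const int32_t data[] = {10, 20, 30, 40, 50};
    const int32_t total = static_cast<int32_t>(sizeof data / sizeof data[0]);

    if (needed == NULL || capacity < 0) return -1;
    *needed = total;
    if (out == NULL) return 0;  // size query only, nothing is written

    // Never write past what the caller said it allocated.
    const int32_t n = capacity < total ? capacity : total;
    for (int32_t i = 0; i < n; ++i) out[i] = data[i];
    return n;
}
```

On the diagram this would mean calling once to read `needed`, preallocating an array of at least that size (for example with Initialize Array), and then calling again with that array and its length.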

Rolf Kalbermatter

Posted

Along those lines, the Call Library Node now has error checking options to detect errors like this and attempt to recover from them. You should try enabling the highest error checking level. See if that detects any problems.

Posted

Thanks Rolfk and Adam

To Rolfk: I tried removing all inputs and outputs between LabVIEW and the library, and instead piped all error messages and events in the library to stdio. I also changed all inputs to constants in the C wrapper itself, recompiled, and checked whether the LabVIEW executable would crash just by calling the library, which it did.

To Adam: I will see if I can get some more data by enabling the full error checking.

I did a small debugging session with gdb and managed to get a backtrace from the shared library and its callers. It seems that the problem could come from an exception raised in what you see in frame #12, the "curl wrapper", causing the "rbac" object to be deleted.

What I also now realise is that my shared library is compiled against the standard libstdc++.so.6, while LabVIEW normally includes libstdc++.so.5?

I will try setting up an environment that links all the sources with the standard libraries included by NI.

#0 0x00b2f7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x00b70825 in raise () from /lib/tls/libc.so.6
#2 0x00b72289 in abort () from /lib/tls/libc.so.6
#3 0x00ba4cda in __libc_message () from /lib/tls/libc.so.6
#4 0x00bab56f in _int_free () from /lib/tls/libc.so.6
#5 0x00bab94a in free () from /lib/tls/libc.so.6
#6 0x00933b31 in operator delete () from /usr/lib/libstdc++.so.6
#7 0x0082ca0b in __gnu_cxx::new_allocator<std::_List_node<std::string> >::deallocate (this=0xbfffca0c, __p=0x8331888)
    at /usr/lib/gcc/i386-redhat-linux/3.4.6/../../../../include/c++/3.4.6/ext/new_allocator.h:86
#8 0x0082c9d4 in std::_List_base<std::string, std::allocator<std::string> >::_M_put_node (this=0xbfffca0c, __p=0x8331888)
    at /usr/lib/gcc/i386-redhat-linux/3.4.6/../../../../include/c++/3.4.6/bits/stl_list.h:315
#9 0x008300bc in std::_List_base<std::string, std::allocator<std::string> >::_M_clear (this=0xbfffca0c)
    at /usr/lib/gcc/i386-redhat-linux/3.4.6/../../../../include/c++/3.4.6/bits/list.tcc:78
#10 0x0082fff8 in ~_List_base (this=0xbfffca0c) at /usr/lib/gcc/i386-redhat-linux/3.4.6/../../../../include/c++/3.4.6/bits/stl_list.h:330
#11 0x0082ffbd in ~list (this=0xbfffca0c) at ../rbac/LoginModule.h:40
#12 0x0083852f in RBAC::CurlWrapper::curlRequest (servers=@0x83187d0, request=@0xbfffcadc, replyHeader=@0xbfffca8c,
    replyData=@0xbfffca7c) at CurlWrapper.cpp:202
#13 0x0082fc3e in RBAC::LoginModule::getAndSaveToken (this=0x83187c8, query=@0xbfffcadc) at LoginModule.cpp:51
#14 0x008308c6 in RBAC::ExplicitLoginModule::login (this=0x83187c8) at ExplicitLoginModule.cpp:64
#15 0x0082d6f3 in RBAC::LoginContext::login (this=0xbfffcc9c) at LoginContext.cpp:166
#16 0x0082dcad in RBAC::LoginContext::login (application=@0xbfffccfc, userName=@0xbfffcd1c, password=@0xbfffcd3c) at LoginContext.cpp:254
#17 0x00826bd0 in Token (pwd=0x879dcbc "MYFakePassWord", back=0x87b4464 "") at Test.cpp:49
#18 0x0832c4a4 in ?? ()
#19 0x0879dcbc in ?? ()
#20 0x087b4464 in ?? ()
#21 0x00000000 in ?? ()

Posted

I'm not sure the extra error checking is going to help in this situation. It looks like your code is really trying to free some memory twice. Either your list itself is being deleted twice (look at frame #12) or your list implementation is trying to free something inside it that has already been freed. Maybe you tried to copy a list and copied some pointers instead of doing a deep copy. In that case you take list A, copy it into list B, delete either list A or B, and then when you try to delete the other list you crash because you already deleted those pointers. Make sure you have a valid copy constructor and copy assignment operator in your list class.
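For illustration, here is a minimal sketch of a class that owns raw memory and copies it deeply, so that two copies never end up deleting the same pointer. `IntBuffer` is a hypothetical class written for this example; it is not the list class from the RBAC code.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical owning container, for illustration only.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    // Deep copy: the new object gets its own allocation.
    IntBuffer(const IntBuffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Copy-and-swap: `other` is already a deep copy (taken by value),
    // so swapping is safe and self-assignment needs no special case.
    IntBuffer& operator=(IntBuffer other) {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
        return *this;
    }

    // Each object frees only the allocation it owns.
    ~IntBuffer() { delete[] data_; }

    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};
```

Without the two copy operations the compiler-generated versions would copy only the pointer, which is exactly the "copy list A into list B, delete both" double free described above.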

There shouldn't be any problem with using libstdc++.so.6 alongside libstdc++.so.5 as long as you don't try to take an STL object created by one and pass it to the other. I don't think that would happen in this case. The only real downside to using them both is that they take up more memory.
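One way to make sure no STL object ever crosses between the two runtimes is to hide it behind an opaque handle with a C interface. The following is only a sketch under that assumption; `NameList` and the `namelist_*` functions are hypothetical names, not an existing API.

```cpp
#include <cstddef>
#include <list>
#include <new>
#include <string>

// The std::list never leaves this library; callers only hold an opaque pointer.
struct NameList {
    std::list<std::string> names;
};

extern "C" {

// Returns NULL if the allocation fails.
NameList* namelist_create() { return new (std::nothrow) NameList(); }

// Returns 0 on success, -1 on bad arguments or allocation failure.
int namelist_add(NameList* h, const char* name) {
    if (h == NULL || name == NULL) return -1;
    try {
        h->names.push_back(name);
        return 0;
    } catch (...) {
        return -1;  // keep exceptions inside the library
    }
}

std::size_t namelist_count(const NameList* h) {
    return h != NULL ? h->names.size() : 0;
}

// Destroyed by the same library, and therefore the same libstdc++, that created it.
void namelist_destroy(NameList* h) { delete h; }

}  // extern "C"
```

The caller treats the returned pointer as an opaque value and passes it back unchanged, so allocation and deallocation always happen inside the same libstdc++.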

I will mention that we have found bugs in libstdc++.so.5 that show up as double-frees (their std::string library implements copy-on-write in a not-thread-safe way), and we had to work around that. I believe that's fixed in libstdc++.so.6, though, so I doubt that's what's going on.

Posted

QUOTE (Adam Kemp @ Mar 30 2009, 07:11 PM)

Thanks Adam,

Unfortunately I haven't gotten the sources of the code causing the double-free problem yet (it's not my part of the code, and it might take some time :) but I found a nifty little tool called "DieHard" (http://www.diehard-software.org/) that I'll check out in the meantime, to see if I can suppress the error until I get the sources and hopefully manage to fix the problem.

Not really the solution I wanted, but if it works, I'll use it in the meantime :)

Thanks for all the inputs though.

X

Posted

I have been doing some debugging for a while now, and everything so far indicates that the problems we are seeing come from using the g++ 3.4.x compiler, which links to libstdc++.so.6, when using things like "list" and "vector" inside the library (we are not reshaping memory allocated by LabVIEW, but we are using C++ functions in other libraries called by mine to do some of the work).

I think I managed to reproduce the fault by making a small test library in which I create some C++ functions that don't do anything except initialise things like vectors and lists. I compile it as a shared library with g++ 3.4.x and call it from LabVIEW. When I build this VI and run it as an executable, it crashes, but when I compiled everything with g++ 3.2.x it seemingly works fine.
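For readers without the attachment, a test library of the kind described might look roughly like the sketch below. This is not the attached test code; the function name `stl_smoke_test` is hypothetical, and the code only assumes what the post says: a function that does nothing except create and destroy a few STL containers.

```cpp
#include <list>
#include <string>
#include <vector>

// Creates and destroys a few STL containers, so that operator new and
// operator delete from the library's own libstdc++ run inside the host process.
extern "C" int stl_smoke_test() {
    std::vector<int> numbers(100, 1);
    std::list<std::string> words;
    words.push_back("alpha");
    words.push_back("beta");

    int sum = 0;
    for (std::vector<int>::size_type i = 0; i < numbers.size(); ++i) {
        sum += numbers[i];
    }
    // 100 from the vector plus 2 list entries; the containers are
    // destroyed (and their memory freed) when the function returns.
    return sum + static_cast<int>(words.size());
}
```

Built as a shared library (for example with `g++ -shared -fPIC`) and called from a Call Library Node with a 32-bit integer return value, it should return 102 if allocation and deallocation both work.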

So to me it seems that the LabVIEW 8.6 runtime library (liblvrt.so.8.6.0) is not compatible with libraries compiled with g++ 3.4.x?

If anyone could verify this or dig a bit deeper into it, it would be highly appreciated.

Cheers

X

P.S. I have included the test code and makefile.

Posted

QUOTE (xavier30 @ Apr 6 2009, 08:36 AM)

Or the libraries g++ 3.4 uses are buggy. :rolleyes: It has happened before. Open source doesn't mean bug free!

Or using g++ 3.2 only hides the problem for now and there is still a bug in your shared lib. Just because a shared lib is not crashing really does not say much about whether it contains illegal pointer releases or accesses. It could just happen to occur in a situation that does not crash for the moment, and seemingly small changes in the code or even in the LabVIEW app can trigger the crash.

Rolf Kalbermatter
