Issues TCP-ing with a C program on same computer


Posted

OK - I may have found out something useful here - here's the timeline.

I loaded the LV read and write VIs (using LV 2011 - downloaded it here at NI Week!). The first time, it ran until my computer went to sleep (about one hour). Then, it would pretty consistently fail between 2 and 40 minutes. So I started poking around and looking into the error code and who failed first. That led me to suspect a timeout problem. I set the timeouts on both the read and write TCP functions to -1 (wait forever) and started it up last night. It was still running this morning (8+ hours). So, this is only a single data point, but it may be useful. Also, you have the writer listening for the reader. Although I think there's no reason why this is wrong, it seems to me that it's more typical for the data consumer to invoke the listener (most LV written server examples use this, I think). I also can't think of why you wouldn't want the reader/writer to wait for data up to forever as long as they have a valid connection - if the connection dies, the functions will stop blocking and return the error (not just lock up the system).
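For anyone following along without LabVIEW, here is a minimal Python sketch of the same idea: a loopback reader whose blocking read either waits forever or gives up after a timeout. It is not the VIs or the C code from this thread. The function names, the 1024-byte block, and the 0.5 second simulated stall are all made up for illustration. As in the original setup, the writer is the side that listens.

```python
import socket
import threading
import time


def recv_exact(sock, nbytes):
    """Read exactly nbytes; raise if the peer closes the connection first."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # A dead connection ends the blocking read with an error
            # instead of hanging, even when the timeout is "forever".
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def run_once(read_timeout):
    """Writer listens on loopback, stalls 0.5 s, then sends one block.

    read_timeout=None means wait forever. Returns "ok" or "timeout".
    """
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: OS picks one
    port = listener.getsockname()[1]

    def writer():
        conn, _ = listener.accept()
        with conn:
            time.sleep(0.5)  # hypothetical stall before data flows
            try:
                conn.sendall(b"x" * 1024)
            except OSError:
                pass  # the reader may already have given up and closed

    thread = threading.Thread(target=writer, daemon=True)
    thread.start()
    with socket.create_connection(("127.0.0.1", port)) as reader:
        reader.settimeout(read_timeout)
        try:
            recv_exact(reader, 1024)
            result = "ok"
        except socket.timeout:
            result = "timeout"
    thread.join()
    listener.close()
    return result
```

Calling `run_once(None)` rides out the stall, while `run_once(0.1)` reports a timeout because the stall is longer than the reader is willing to wait.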

I hope this helps,

Mark

Posted

I set the timeouts on both the read and write TCP functions to -1 (wait forever) and started it up last night. It was still running this morning (8+ hours).

Oh, right, sorry, I really should have mentioned this.

The reason why I have a 3 second timeout is because after data stops flowing, about 5 seconds later it starts flowing again. A 4-6 second flow stop is pretty consistent. So yes, with no timeout, this can run forever. Of course this 5 second pause is happening every few minutes. I haven't done any tests to see if data is actually lost during that time.

This 5 second pause does not happen between the C-write/C-read or LV-write/C-read. It only happens with the LV-read.

Unfortunately, the "real" C code errors out long before 5 seconds because as far as it's concerned my LV reader side stops responding to data sends. This causes it to reset the connections and that is a Bad Thing when it happens while data is actually being recorded. I don't have much control of the real C-write because it also has to work interfaced with other systems. A 5 second pause there means everything is dead, not just that the TCP read is taking a little unscheduled break.
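One way to put numbers on those pauses is to timestamp each completed read and look for gaps. The sketch below is an assumption about how such a check could look in Python, not something that was run for this thread. `timed_reads`, `find_stalls`, and the threshold value are hypothetical names and choices.

```python
import time


def timed_reads(sock, block_size, count):
    """Read `count` blocks of `block_size` bytes from a connected socket.

    Returns a list with one time.monotonic() stamp per completed block.
    """
    stamps = []
    for _ in range(count):
        remaining = block_size
        while remaining > 0:
            chunk = sock.recv(remaining)
            if not chunk:
                raise ConnectionError("peer closed the connection")
            remaining -= len(chunk)
        stamps.append(time.monotonic())
    return stamps


def find_stalls(stamps, threshold_s):
    """Return (start_time, gap) for every gap longer than threshold_s."""
    stalls = []
    for prev, cur in zip(stamps, stamps[1:]):
        gap = cur - prev
        if gap > threshold_s:
            stalls.append((prev, gap))
    return stalls
```

With blocks expected once a second, a threshold of around 3 seconds would flag the 4-6 second stops while ignoring ordinary jitter. Comparing the number of blocks received against the number sent would also answer whether any data is lost during a pause.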

Also, you have the writer listening for the reader. Although I think there's no reason why this is wrong, it seems to me that it's more typical for the data consumer to invoke the listener (most LV written server examples use this, I think).

The existing C-write I have to tie into is set up to do the listening.

Thanks for trying this out! Are you running on a Windows 7 machine??

Latest tests:

Win7 / LV2011: died after about 90 minutes

WinXP / LV8.6.1: I finally stopped it after 4 hours with no problems. Hmm.

I'm trying to figure out if I really want to go thru the pain and agony of uninstalling LV2010 from one of my (2) Win7 machines and reinstalling LV8.6.1. And then after testing, uninstalling LV8.6.1 and reinstalling LV2010. There goes 2 or 3 days of computer time. Especially since every time my boss walks by my workbench where all my computers are lined up, he says, "I thought you were done playing around with that!"

However, I think I finally convinced him it's important to figure out if this is something that's crept into LV since 8.6.1 or if it's a Win7/any LV problem. Granted, with our usual projects we would never have reason to do loopback TCP, but it may be indicative of some larger problem with Win7/LV/TCP and that wouldn't be good.

Posted

I really appreciate you running these LV/OS combo tests.

Not a problem, especially as this is related to something I'm working on!

Tim

Posted

I also have LV 8.6.1 on my Win7 64 bit machine - I'll run your VI's and report back, although it won't be until tomorrow that I expect to be able to do this - I'm traveling today and I can't keep my machine on that long.

Also, I'm starting to understand your constraints, I think. So effectively the C program is a data server? It listens for requests for data and then writes it to the client. And it doesn't maintain the connection once it finishes the initial write. You might try using the toolkit at

http://lavag.org/fil...ls-for-labview/

I don't really know if this might help since the cause of this behavior is so hard to identify, but since it's a different implementation it might either work or at least return an error message that might throw some light on the subject. If the Windows Socket error isn't recognized and converted to a LabVIEW error code, it will return the winsock error code which might be useful. It will work as a direct replacement for the TCP/IP functions with the caveat that you always have to supply an IP address (no lookups).

Mark

Posted

I also have LV 8.6.1 on my Win7 64 bit machine - I'll run your VI's and report back, although it won't be until tomorrow that I expect to be able to do this - I'm traveling today and I can't keep my machine on that long.

Also, I'm starting to understand your constraints, I think. So effectively the C program is a data server? It listens for requests for data and then writes it to the client. And it doesn't maintain the connection once it finishes the initial write. You might try using the toolkit at

http://lavag.org/fil...ls-for-labview/

Hopefully I've caught you in time... I ran Win7-64/LV8.6.1 yesterday and it died in around a half hour. Verification from someone else would be great, but if it's too much work, don't worry about it.

I'll take a look at the toolkit when I get to work. Thanks!

Posted

Hopefully I've caught you in time... I ran Win7-64/LV8.6.1 yesterday and it died in around a half hour. Verification from someone else would be great, but if it's too much work, don't worry about it.

I'll take a look at the toolkit when I get to work. Thanks!

Have you tried running in compatibility mode on Win7?

Tim

Posted

Have you tried running in compatibility mode on Win7?

No I hadn't, but I just did. :-) Failed in both Windows Server 2008 and Vista. Not that I'm really surprised. From what I've been reading, MS changed their network implementation for Vista/2008/7. So if it doesn't work in one that might make it more likely it wouldn't work in the others.

Posted

No I hadn't, but I just did. :-) Failed in both Windows Server 2008 and Vista. Not that I'm really surprised. From what I've been reading, MS changed their network implementation for Vista/2008/7. So if it doesn't work in one that might make it more likely it wouldn't work in the others.

Please forgive me for not reviewing this thread for Qs already asked.

Desktop Trace Execution shows no clues?

Doesn't NI Spy have TCP stuff built in?

Do you have access to a LAN analyzer or a Sniffer?

Dumb questions can and should be ignored.

Ben

Posted

I hope this doesn't just get appended to my last post. That happens a lot on this site. I must have something set wrong somewhere. Anyway...

Test results:

Windows --- LV --- Result (PASS = ran for over 4 hours without failing)
XP --- 8.6.1f1 --- PASS
XP --- 10SP1 --- PASS
7 --- 10SP1 --- FAIL
7 --- 11 --- FAIL
7 --- 8.6.1f1 --- FAIL

Anyone else see a trend here? ;)

FWIW, all Windows 7 boxes have been 64-bit.

I'm going to run the write and read on 2 separate Windows 7 computers for a Long Time again, since I probably never ran much longer than a couple hours before stopping with no error. For completeness' sake, I should do a 4 hour test. Because if *that* fails, there's a really big problem somewhere.

Thanks again for all the help/suggestions and especially thanks to those of you who were running tests for me!

Please forgive me for not reviewing this thread for Qs already asked.

Desktop Trace Execution shows no clues?

Doesn't NI Spy have TCP stuff built in?

Do you have access to a LAN analyzer or a Sniffer?

Dumb questions can and should be ignored.

I have v1.0 of Desktop Execution Trace Toolkit. This should actually be a small enough program for it not to choke. I can try it even tho it doesn't say it works on Win7.

Network analyzers aren't a lot of help since the data never actually leaves the machine.

No questions are dumb!

Posted

I hope this doesn't just get appended to my last post. That happens a lot on this site. I must have something set wrong somewhere. Anyway...

Test results:

Windows --- LV --- Result (PASS = ran for over 4 hours without failing)
XP --- 8.6.1f1 --- PASS
XP --- 10SP1 --- PASS
7 --- 10SP1 --- FAIL
7 --- 11 --- FAIL
7 --- 8.6.1f1 --- FAIL

Anyone else see a trend here? ;)

FWIW, all Windows 7 boxes have been 64-bit.

I'm going to run the write and read on 2 separate Windows 7 computers for a Long Time again, since I probably never ran much longer than a couple hours before stopping with no error. For completeness' sake, I should do a 4 hour test. Because if *that* fails, there's a really big problem somewhere.

Thanks again for all the help/suggestions and especially thanks to those of you who were running tests for me!

I have v1.0 of Desktop Execution Trace Toolkit. This should actually be a small enough program for it not to choke. I can try it even tho it doesn't say it works on Win7.

Network analyzers aren't a lot of help since the data never actually leaves the machine.

No questions are dumb!

Hmm. Yes. A bit of a trend apart from LV2010. And it may be why I cannot see any problems on my machines (none of the examples fall over after running for 29 hrs now :) ). My Windows TCPIP is highly modified from a standard install. It was the only way I could "reliably" get TCPIP transfer rates of up to 80 MB/sec. (Not in loop back; across the network). The sorts of things that were changed were the TCPIP auto-tuning, and Chimney Offload. Also had to play with the TCPIP Optimiser, but can't remember exactly what now. This was in addition to the TX buffers. But I wouldn't have thought 25MB/sec would/should be that much of a problem, but I guess it is Windows eh?

Posted

A bit of a trend apart from LV2010. And it may be why I cannot see any problems on my machines (none of the examples fall over after running for 29 hrs now).

Are you running them on a Windows 7 64 bit machine??

29 hours is probably long enough. :)

Posted

My Windows TCPIP is highly modified from a standard install. It was the only way I could "reliably" get TCPIP transfer rates of up to 80 MB/sec. (Not in loop back; across the network). The sorts of things that were changed were the TCPIP auto-tuning, and Chimney Offload. Also had to play with the TCPIP Optimiser, but can't remember exactly what now. This was in addition to the TX buffers.

I messed with all sorts of stuff when I was playing with buffer sizes. I haven't tried auto-tuning, since the point was to get my TCP buffers up and if you disable auto-tuning they can only go to 64kB. I did turn off heuristics which is supposed to keep Windows from changing the auto-tuning level to something much more restricted (I have it set at normal).
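The settings above are system-wide Windows ones. For comparison, buffer sizes can also be requested per socket from the program itself. Below is a minimal Python sketch of that per-socket approach; the helper names and the sizes are hypothetical and are not taken from the code in this thread. The OS is free to round, cap, or otherwise adjust what is asked for, so the sketch reads the values back rather than assuming they took effect.

```python
import socket


def make_socket_with_buffers(rcvbuf_bytes, sndbuf_bytes):
    """Create a TCP socket and request specific receive/send buffer sizes."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # These are requests; the OS may adjust the values it actually uses.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf_bytes)
    return sock


def effective_buffers(sock):
    """Return (receive, send) buffer sizes as reported by the OS."""
    rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return rcv, snd
```

For example, `effective_buffers(make_socket_with_buffers(256 * 1024, 256 * 1024))` shows what the OS actually granted, which may differ from the request.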

And then there's always the fact that this works fine in C without having to change any settings...

Posted

And then there's always the fact that this works fine in C without having to change any settings...

Windows 7 x64 with LV 2009 x64.

Indeed. My problem was just sheer throughput and it didn't matter what it was written in.

I know it's curing the symptom rather than the problem (and it will be blocking), but have you tried getting the C read and write stuff compiled into a DLL and using that instead? Just a thought to see if the specific problem goes away.

What do NI say about it (after all it is repeatable by a number of people)?

Posted

I know it's curing the symptom rather than the problem (and it will be blocking), but have you tried getting the C read and write stuff compiled into a DLL and using that instead? Just a thought to see if the specific problem goes away.

What do NI say about it (after all it is repeatable by a number of people)?

We thought about trying a dll, but none of the C programmers here are experienced with that sort of thing (they are all UNIX programmers and are having enough issues trying to deal in the Windows world) so we dropped it. We're going to write files to a ram drive. It's low enough datarate, plus the fact that all the data is available once a second and not spread out, makes that a pretty good option. Hopefully. :-)
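For anyone considering the same file-based handoff, one common pitfall is the reader picking up a file that is only half written. A minimal Python sketch of one way around that follows: write to a temporary name and rename once the block is complete. This is an assumption about how such a handoff could be built, not a description of the actual implementation here. The function names, the `.tmp` suffix, and the file naming are hypothetical.

```python
import os
import tempfile


def publish_block(directory, name, data):
    """Write data to a temp file, then rename it into place when complete."""
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # The final name only appears once the whole block is on "disk".
        os.replace(tmp_path, os.path.join(directory, name))
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def collect_blocks(directory):
    """Read and delete all finished blocks, in name order."""
    blocks = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(".tmp"):
            continue  # still being written by the producer
        path = os.path.join(directory, name)
        with open(path, "rb") as f:
            blocks.append((name, f.read()))
        os.remove(path)
    return blocks
```

With data arriving once a second, the producer could call `publish_block` with a sequence-numbered name and the consumer could poll `collect_blocks` at a similar rate. Zero-padded sequence numbers keep the name order equal to the arrival order.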

My next move is to kick this to NI, I guess. It's such a narrow issue -- I doubt many folks are doing loopback TCP in LabVIEW on a regular basis -- but my concern is that there's some underlying issue with LV that may affect other TCP functions.

Posted

We thought about trying a dll, but none of the C programmers here are experienced with that sort of thing (they are all UNIX programmers and are having enough issues trying to deal in the Windows world) so we dropped it. We're going to write files to a ram drive. It's low enough datarate, plus the fact that all the data is available once a second and not spread out, makes that a pretty good option. Hopefully. :-)

My next move is to kick this to NI, I guess. It's such a narrow issue -- I doubt many folks are doing loopback TCP in LabVIEW on a regular basis -- but my concern is that there's some underlying issue with LV that may affect other TCP functions.

Well. If they have difficulty with DLLs (SOs on Linux) then kernel level drivers will slay them. The ramdrive.sys driver is no longer available in Windows 7 (hope they weren't thinking of using it ;) ) but there are a few 3rd party solutions I think.

One final thought. Turn off the Nagle algorithm. It is known to play hell with things like games and it is known to silently introduce delays in packet sending through the loopback. It is off for my setups for this very reason, although I never saw 2 second delays.
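At the socket level, turning Nagle off means setting the standard `TCP_NODELAY` option on the sending socket. Here is a minimal Python sketch; the two helper names are hypothetical, and how to reach the same option from LabVIEW or from the existing C writer is not covered here.

```python
import socket


def disable_nagle(sock):
    """Set TCP_NODELAY so small writes are sent without being coalesced."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


def nagle_disabled(sock):
    """Return True if TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
```

The option applies per socket, so it has to be set on the side doing the sending, on every new connection.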

Posted

Well. If they have difficulty with DLLs (SOs on Linux) then kernel level drivers will slay them. The ramdrive.sys driver is no longer available in Windows 7 (hope they weren't thinking of using it ;) ) but there are a few 3rd party solutions I think.

One final thought. Turn off the Nagle algorithm. It is known to play hell with things like games and it is known to silently introduce delays in packet sending through the loopback. It is off for my setups for this very reason, although I never saw 2 second delays.

I'm the one setting up the ramdrive and had originally hoped to use ramdrive.sys. :rolleyes: The Unix guys got a little glassy-eyed about it until I told them not to worry -- it just looks like another disk to their code. The whole ramdrive thing was a step into the wayback machine for me. I haven't used one since my BASIC/MS-DOS days. I'm using RamDisk and so far it seems to be working.

Disabling Nagle's algorithm is something I haven't tried yet, but only because it's supposed to be for optimizing small data packets, and 5MB isn't very small. But hey, I've tried everything else, I can try that, too.

Posted

I'm the one setting up the ramdrive and had originally hoped to use ramdrive.sys. :rolleyes: The Unix guys got a little glassy-eyed about it until I told them not to worry -- it just looks like another disk to their code. The whole ramdrive thing was a step into the wayback machine for me. I haven't used one since my BASIC/MS-DOS days. I'm using RamDisk and so far it seems to be working.

Disabling Nagle's algorithm is something I haven't tried yet, but only because it's supposed to be for optimizing small data packets, and 5MB isn't very small. But hey, I've tried everything else, I can try that, too.

I've also experienced TCP/IP issues with Windows 7. We haven't fully isolated the issue but an application we have that sends quite a bit of data over TCP/IP in Win7 experiences lots of problems but runs like a charm on XP. I also did some traces of the communications and the traffic pattern in the Win7 cases was very strange from a networking perspective, including unexpected TCP-RSTs.

Posted

I've also experienced TCP/IP issues with Windows 7. We haven't fully isolated the issue but an application we have that sends quite a bit of data over TCP/IP in Win7 experiences lots of problems but runs like a charm on XP. I also did some traces of the communications and the traffic pattern in the Win7 cases was very strange from a networking perspective, including unexpected TCP-RSTs.

Hmm... yes, that's my concern -- that this isn't just a loopback problem.

Thanks for the input. NI is working on this and I'll post here when they get back to me.

  • 4 weeks later...
  • 3 weeks later...
Posted

Here's the bottom line from the AE at NI (Kyle, who has been very helpful):

It appears that there is investigation being done into the TCP stack in LabVIEW, but it's still not clear if this is the cause of the problem. It's still possible that the problem may be the fault of Windows 7 and not LabVIEW. If there will be a fix for this, it will be a part of the normal patch release cycle. This means that any fix will be pushed live in 2012 at the earliest. We have verified, however, that the issue you're seeing only occurs with TCP loopback, and is not present in any other TCP scenarios.

The important part, I guess, is that this issue shouldn't be affecting "regular" TCP usage. This isn't great for anyone else out there who might be thinking about using TCP to communicate with other applications on the same machine, however.

Also from Kyle: The R&D folks working on this are very impressed by the amount of information they were able to start with.

I want to again thank everyone who tested code on various platforms and offered suggestions, so I had all that information to pass on to NI. At this point, the ramdrive is working well, so that's what I'll stick with.

Cat

Posted
I would try using TCP-optimizer to tweak the Win7 network settings.

Thanks for the link to the software. Unfortunately, I got the same negative results. I had actually played with a lot of these settings by hand using this article. The issue isn't aggregate TCP performance, it's that the TCP Read just seems to go to sleep every once in a while for a really long time.

Posted

Thanks for the link to the software. Unfortunately, I got the same negative results. I had actually played with a lot of these settings by hand using this article. The issue isn't aggregate TCP performance, it's that the TCP Read just seems to go to sleep every once in a while for a really long time.

Just one final thought on this (probably not relevant). I have seen exactly what you describe when a machine has Bluetooth. The 5-8 second freezes go away when the Bluetooth adapter is disabled. I never got to the bottom of this since the machine didn't need Bluetooth, so I just disabled it.

Posted

Just one final thought on this (probably not relevant). I have seen exactly what you describe when a machine has Bluetooth. The 5-8 second freezes go away when the Bluetooth adapter is disabled. I never got to the bottom of this since the machine didn't need Bluetooth, so I just disabled it.

Bluetooth and TCP both use the Windows Socket API, so it might be a hang-up internally where Bluetooth blocks TCP for some reason. 5-8 seconds sounds a lot like the timeframe necessary for a discovery scan.
