
Issues TCP-ing with a C program on same computer



I'm working on a project that has 3 parts:

P1) data simulator

P2) data parser

P3) data analyzer

Original version of the project:

P1 and P2 are written in C and live on a remote machine. P3 is written in LabVIEW and runs on a laptop. All the parts communicate via TCP. This works great.

Current version of project:

P1 is in C on remote machine. P2 is in C on laptop. P3 is in LabVIEW on same laptop. All parts still communicating via TCP. This is not working so great. Everything connects fine and will run for anywhere from 2 to 10 minutes. Then P3 starts complaining about timeout errors and/or network peer disconnecting. P2 starts complaining that it can't send its data to P2. Everybody eventually times out and resets connections. Sometimes starting up another program (WordPad, Program Manager, etc.) will cause this, but most often it happens even if there is no user interaction with the laptop.

The part of the LV code dealing with this consists of 1 reentrant VI called 4 times to handle 4 different data messages. Each message has its own port.

In order to debug I've ripped everything out of P3 except doing the TCP reads. I've knocked it back to reading just one message. I set the subvi with the TCP read to time critical priority (yes, I'm getting desperate). None of this helps.

Just to add insult to injury, if P3 is written in C and run on the laptop with P2, it all works fine...

Help?!?


P2 starts complaining that it can't send its data to P2.

Do you do it through localhost? This is really strange and may point to issues with the OS installation as well as issues inside P2. (If it's not a mistype.)



All connections are made using actual IP addresses. But either way, the parts are always connecting, and data always flows for some amount of time. Just not for as long as I would like it to.

I would love it to be the P2 code since I didn't write that. :) However, since P2 (C) and P3 (LV) work fine if they are on different computers, and P2 (C) and P3 (C) work fine on the same computer, it's hard to point a finger at P2 being the problem.



Hrm... Do you have anything to monitor network traffic?



Just downloaded and started playing with Wireshark. Looks like a great program. Except, I couldn't get it to work. Finally discovered that "loopback interfaces are not available on windows platform". :frusty:

The Wireshark wiki does have links to a couple of commercial products that monitor loopback connections. But it will take me a while to get my hands on those, and I'm not even sure they'll tell me anything.



Wireshark is a good product; unfortunately I've only worked with it with the detailed instructions of tech support. It occurred to me you might be able to get load information from a managed switch.

Tim


How many NICs do you have on each PC?

I'm assuming you are running windows, and connected to an internet gateway/LAN on another connection.

One thing that might help you is changing the interface metric of the individual NICs on each PC. Put the ones used in your setup as a 1, and all the other ones 2 or higher. This will probably mess up your internet connection, or make it unbearably slow when it works.

~Jon



One NIC. It's a very simple setup -- 3 computers on a standalone switch. They have hardcoded IP addresses. Yes, I am running Windows 7 (and using LV10). There was another NIC on the laptop, but I took it out in case it was the problem. Didn't help any.


What sort of throughput are you trying to achieve? You could try to give LV a bit more time to service the TCP/IP stack by increasing the buffer size.

Each of the data streams is ~8 MB/s, so worst-case throughput is 24 MB/s (counting P2 --> P3 twice). I've been staring at Resource Monitor a lot. The network (1 Gb) is loping, the CPUs are loping.

Thanks for the vi. Any chance you've got a "Get Buffer" vi? I'd like to know what the buffers are set at before I start playing around with them. I tried to reverse-engineer the wsock32.dll call in "Set Buffer" but can't really test it here on my home box.


Wireshark is a good product; unfortunately I've only worked with it with the detailed instructions of tech support. It occurred to me you might be able to get load information from a managed switch.

I'll talk to my network guru about that and see what hardware we've got available. But I'm thinking it's very possible this data never really makes it out of the computer onto the network.


Each of the data streams is ~8 MB/s, so worst-case throughput is 24 MB/s (counting P2 --> P3 twice). I've been staring at Resource Monitor a lot. The network (1 Gb) is loping, the CPUs are loping.

Thanks for the vi. Any chance you've got a "Get Buffer" vi? I'd like to know what the buffers are set at before I start playing around with them. I tried to reverse-engineer the wsock32.dll call in "Set Buffer" but can't really test it here on my home box.

Well, that's not a huge amount. Even the default LV examples should be able to cope with that.

The default Windows buffer is 8192 (if I remember correctly; I don't have the getsocketoption... maybe later in the week). There are a few ways of calculating the optimum size depending on the network characteristics, but I usually just set it to 65536 (64K) unless it's a particularly slow network (like dial-up). It really makes a difference with UDP rather than TCP (datagram size errors). Note, however, that it only makes a difference if you are setting the "Listener" connection. It has no effect on "Open".
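For anyone wanting to check the values before changing them: at the sockets level, the "Get Buffer" operation asked about above corresponds to `getsockopt` with `SO_RCVBUF` / `SO_SNDBUF`. The following is a minimal Python sketch of that idea, not the VI from this thread. The helper names are made up for illustration, and the OS is free to round or cap whatever size is requested, so the value read back may differ from the value set.

```python
import socket


def get_buffer_sizes(sock):
    """Return (receive, send) buffer sizes the OS reports for this socket."""
    rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return rcv, snd


def set_buffer_sizes(sock, size):
    """Request new buffer sizes; the OS may round or cap the request."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("defaults: ", get_buffer_sizes(s))
        set_buffer_sizes(s, 65536)  # 64 KiB
        print("after set:", get_buffer_sizes(s))
```

Reading the values back after setting them is the quickest way to see what the platform actually granted.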

It's strange that the C program doesn't exhibit the same problem. If you are doing the usual "write size, write data" then you can try to combine it into one write operation (just concatenate the strings), but I haven't run into a problem with the former. If you have the C code then you can take a peek to see if they are doing anything fancy with the sockets... but I doubt it.

Are you trying to send and receive on the same port (Cmd-Resp), or do you have two separate channels, one for sending and one for receiving? If the latter, what part disconnects: the receipt of the data or the send of the data (or arbitrarily both)? And what is the error thrown (66?)? If you get timeout errors on a read, then you should see that in the network monitor (Task Manager) as drops, but you say that it "lopes" (had to look that one up..lol). That's normally indicative of a terminated messaging scheme where the terminator gets dropped for some reason.


The default windows buffer is 8192 (if I remember correctly-don't have the getsockettoption....maybe later in the week).

That's what the vi you sent defaults to. I can just go with that.

Note, however, that it only makes a difference if you are setting the "Listener" connection. It has no effect on "Open".

Ohh... so the fact that I'm the client (open) not the server (listen) means this isn't going to help?

It's strange that the C program doesn't exhibit the same problem. If you are doing the usual "write size, write data" then you can try to combine it into one write operation (just concatenate the strings), but I haven't run into a problem with the former. If you have the C code then you can take a peek to see if they are doing anything fancy with the sockets... but I doubt it.

Everyone involved agrees it's strange. And no, there's nothing fancy going on with the sockets on the C side. Not that we know of. This *is* a port of Unix C code, done by someone with lots of Unix C experience but zero Windows experience.

In order to remove any other possible code issues, I've written a couple of VIs that do a basic read/write on the same port. The C programmer has done the same. My two programs run fine together. Her two programs run fine together. Run her write and my read and it runs for a while but errors out eventually.

Tho this is making me wonder what might happen if we run my write and her read together...

Are you trying to send and receive on the same port? (Cmd-Resp) or do you have two separate channels; one for sending and one for receiving.

One port. The C side most of the time throws a 10060 error. The LV side throws 56 for a while and then 66. Basically, "what we've got here is a failure to communicate".

you should see that in the network monitor (Task Manager) as drops but you say that it "lopes" (had to look that one up..lol)

Oops. Sorry. :P



Have you tried playing with the mode input of TCP Read? Do you use a message termination scheme with CR LF or something else? Basically, LabVIEW defaults to a semi-buffered mode, but most simple socket programming without any poll() operation more closely resembles the immediate mode of LabVIEW. A mismatch between this mode and what a C program expects is usually the most likely reason for the behavior you see.

The C Read with LabVIEW Write most likely will simply work if the above is the culprit.
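To make the difference between the two read styles concrete, here is a rough Python sketch. It is only an analogy and not LabVIEW's actual implementation: `read_immediate` behaves like a bare `recv()` and returns as soon as any bytes are available, while `read_standard` keeps reading until the requested count arrives and hands back a partial result on timeout. Both function names are made up for illustration, and the timeout here applies to each `recv()` call rather than to the read as a whole.

```python
import socket


def read_immediate(sock, max_bytes, timeout):
    """Return as soon as any bytes are available (like a bare recv())."""
    sock.settimeout(timeout)
    return sock.recv(max_bytes)


def read_standard(sock, n_bytes, timeout):
    """Read until n_bytes arrive; on timeout return what was read so far."""
    sock.settimeout(timeout)  # applies per recv() call in this sketch
    chunks = []
    remaining = n_bytes
    try:
        while remaining > 0:
            chunk = sock.recv(remaining)
            if not chunk:  # peer closed the connection
                break
            chunks.append(chunk)
            remaining -= len(chunk)
    except socket.timeout:
        pass  # fall through and return the partial data
    return b"".join(chunks)


if __name__ == "__main__":
    a, b = socket.socketpair()
    a.sendall(b"abc")
    print(read_immediate(b, 10, 1.0))  # only what has arrived so far
    a.sendall(b"defgh")
    a.sendall(b"ij")
    print(read_standard(b, 7, 1.0))    # waits for all 7 requested bytes
    a.close()
    b.close()
```

A reader written in the first style has to do its own reassembly; a reader written in the second style can stall if the sender never delivers the full count.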


I guess it won't help then.

Rolf's got a point, but immediate mode really puts a burden on the CPU since you've got to (wo)manhandle characters as they arrive, then concatenate and terminate the loop on whatever it's supposed to terminate on (number of bytes or term char).

This is the sort of thing:

As you can probably see from the snippet, there is a (small) possibility that the first 4 bytes are garbage, or maybe you start reading halfway through a string and therefore expect a huge number.

So are you using character-terminated messages or prepending a payload size? You haven't said much of the inner workings. Example perhaps?
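The snippet referred to above was an image and is not part of this text. As a stand-in, here is a rough Python sketch of the same idea: read a 4-byte size header, then loop until that many payload bytes have arrived. The big-endian header layout, the `MAX_PAYLOAD` limit, and all function names are assumptions for illustration, not details taken from the thread. The sanity check covers the case described above, where a garbage or misaligned header would otherwise be interpreted as a huge payload size.

```python
import socket
import struct

MAX_PAYLOAD = 16 * 1024 * 1024  # assumed sanity limit, not from the thread


def recv_exact(sock, n):
    """Loop on recv() until exactly n bytes have been collected."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def send_message(sock, payload):
    """Send a 4-byte big-endian size header followed by the payload."""
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def read_message(sock):
    """Read one size-prefixed message, rejecting implausible sizes."""
    (size,) = struct.unpack(">I", recv_exact(sock, 4))
    if size > MAX_PAYLOAD:
        # Garbage or misaligned header: better to fail than wait forever.
        raise ValueError("implausible payload size: %d" % size)
    return recv_exact(sock, size)


if __name__ == "__main__":
    a, b = socket.socketpair()
    send_message(a, b"hello")
    print(read_message(b))
    a.close()
    b.close()
```

Without the size check, a single corrupted header leaves the reader waiting for bytes that never come, which shows up as a read timeout.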


Every once in a long while I'm presented with a problem I just can't figure out. It's been quite some time; I guess I'm overdue. I've run so many different tests I'm seeing them in my sleep, but here's the summary of tearing my hair out for the past two weeks:

The basic problem is to write 5MBytes of data from one program to another, on the same computer, every second, via TCP. In its original configuration, this data is literally 5MB all in one TCP write every second, not paced out. It uses payload size to determine the end of the data.

If the two programs are written in C, it works. I was incorrect in my original statement that if both programs are in LabVIEW it also works. It doesn't, or rather I haven't been able to figure out how to make it work. It does work if both LV programs are on different computers, but not if they are on the same computer. And if the LV is doing the write, the C read works fine. So the issue seems to be the LV read, on the same computer with any type of write. The two programs connect and are sending/receiving the data for several minutes (2-50). Then both sides stop with various errors. With both sides in LV, most often, the read errors out with a timeout (56), and the write errors out saying the system caused a network disconnect (62).

Here are some things I've tried that made little or no difference:

running the LV programs together on a different computer

Immediate mode read (thanks for the suggestion, Rolf)

breaking the 5MB write up into 10 500KB writes and 100 50kB writes

breaking the 5MB write up into 10 500KB writes and pacing them out over 750ms

reading the 5MB all at once

breaking the read up into 500kB and 50kB passes
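For readers who want to reproduce the paced-write variant outside LabVIEW, here is a rough Python sketch of "split one block into N writes spread over a fixed time". The function name and parameters are made up for illustration. The demo uses a local socket pair with a draining thread so that the large write does not block on a full buffer.

```python
import socket
import threading
import time


def send_paced(sock, data, n_chunks, spread_seconds):
    """Send data in roughly n_chunks equal writes spread over spread_seconds."""
    if not data:
        return 0
    chunk_size = -(-len(data) // n_chunks)  # ceiling division
    interval = spread_seconds / n_chunks
    sent_chunks = 0
    for offset in range(0, len(data), chunk_size):
        sock.sendall(data[offset:offset + chunk_size])
        sent_chunks += 1
        time.sleep(interval)
    return sent_chunks


if __name__ == "__main__":
    a, b = socket.socketpair()
    data = bytes(5_000_000)  # 5 MB of zeros
    received = bytearray()

    def drain():
        # Keep reading so the writer never stalls on a full buffer.
        while len(received) < len(data):
            chunk = b.recv(65536)
            if not chunk:
                break
            received.extend(chunk)

    reader = threading.Thread(target=drain)
    reader.start()
    send_paced(a, data, 10, 0.75)  # ten 500 kB writes over 750 ms
    reader.join()
    print(len(received), "bytes received")
    a.close()
    b.close()
```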

Shaun, your suggestion to play with the TCP buffer sizes helped in that instead of failing in a few minutes, it would go for several minutes. Oh, and controlling buffer size on a Windows 7 machine is a PITA. Check out this article if you're doing it on a Win7 platform. I tried every buffer size possible, but it never really helped much more. I even posted to serverfault.com with the problem, and also many other questions about Win7 TCP buffers that no one seems to understand, and have gotten a deafening silence in response.

At this point, I'm going with Plan B. I've configured a ramdisk on the computer and we're just going to write/read files. In retrospect, this may actually be a better solution, but dang it, I want to know why the TCP way isn't working.
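One thing to watch with a file-based handoff is the reader picking up a file that is only half written. A common way around that, sketched below in Python, is to write to a temporary name and rename it into place once the write is complete. This is only an illustration of the idea and not the scheme actually used in this project; the function names and the file name in the demo are hypothetical.

```python
import os
import tempfile


def publish_block(directory, name, payload):
    """Write payload to a temp file, then rename it to its final name."""
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    final_path = os.path.join(directory, name)
    # The final name only appears once the data is completely written.
    os.replace(tmp_path, final_path)
    return final_path


def consume_block(path):
    """Read a published block and delete it so it is not processed twice."""
    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)
    return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as ramdisk:  # stand-in for the ramdisk path
        path = publish_block(ramdisk, "block_0001.bin", bytes(1024))
        print(len(consume_block(path)), "bytes read")
```

The reader should ignore anything ending in `.tmp` and only open files that carry a final name.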

I'm attaching a couple of very simple vis to demonstrate the problem. Just run them on the same machine, with that machine's IP address (it's an input instead of default for testing on 2 different machines). Written with LV 2010 SP1 64-bit on Windows 7. About all I haven't been able to try is a different combo of LV version/OS. The longest these test vis have ever run has been 50 minutes, and that was much longer than the norm. If anyone has a chance to run these vis, please let me know how it goes.

Thanks,

Cat

Write TCP Data - Simple.vi

Read TCP Data - Simple.vi
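The attached VIs are not reproduced in this text. For anyone who wants to try the same experiment without LabVIEW, here is a rough Python analogue under some assumptions: the writer listens and sends one size-prefixed block per period, the reader connects as a client and reads each block in full, and both run on the same machine. The 4-byte big-endian header, the `127.0.0.1` default, and all names are choices made for this sketch, not details taken from the VIs.

```python
import socket
import struct
import threading
import time


def recv_exact(sock, n):
    """Loop on recv() until exactly n bytes have been collected."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before the full block arrived")
        buf.extend(chunk)
    return bytes(buf)


def serve_blocks(listener, block, count, period):
    """Writer side: accept one client and send one block per period."""
    conn, _ = listener.accept()
    with conn:
        for _ in range(count):
            conn.sendall(struct.pack(">I", len(block)) + block)
            time.sleep(period)


def run_loopback_test(block_size=5_000_000, count=5, period=1.0,
                      host="127.0.0.1"):
    """Run writer and reader on one machine; return the blocks received."""
    block = bytes(block_size)
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind((host, 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        writer = threading.Thread(target=serve_blocks,
                                  args=(listener, block, count, period))
        writer.start()
        received = 0
        # Reader side: connect as a client with a 10 s read timeout.
        with socket.create_connection((host, port), timeout=10.0) as reader:
            for _ in range(count):
                (size,) = struct.unpack(">I", recv_exact(reader, 4))
                if len(recv_exact(reader, size)) == size:
                    received += 1
        writer.join()
    return received


if __name__ == "__main__":
    print(run_loopback_test(count=3), "blocks received")
```

Raising `count` turns this into a soak test; passing the machine's real IP address as `host` instead of the loopback address matches how the VIs were run.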



Try it this way....

Edited by ShaunR

I'm attaching a couple of very simple vis to demonstrate the problem. Just run them on the same machine, with that machine's IP address (it's an input instead of default for testing on 2 different machines). Written with LV 2010 SP1 64-bit on Windows 7. About all I haven't been able to try is a different combo of LV version/OS. The longest these test vis have ever run has been 50 minutes, and that was much longer than the norm. If anyone has a chance to run these vis, please let me know how it goes.

I've had the VIs running on a WinXP SP3 Core2 Duo 2 GHz machine with 2 GB of RAM under LV 2011 for about 1-1/2 hours with no sign of issues. Can try with LV 2010, but don't have 9.0 installed.

Tim



I knew LV2011 would be the answer to all my problems!! :P

I do not have an XP machine with anything other than LV8.6 on it, and it's out on loan at the moment. I should go see if I can borrow it back...

Thanks for the feedback!

Try it this way

I ran your two vis together on 2 different machines and they died after 7 minutes on 1 machine and after 42 minutes on the other...


Hmmm......

That's ok! I appreciate any help! Have you had a chance to run my code for any length of time?

Wait for SP1

That's usually my mantra, but if it actually makes this work... I've got my XP/8.6 machine back and will be trying it soon.


LV 2010 on the same machine ran for 2 hours with no signs of stopping.

Wow. So maybe it's a Win7/LV10 problem...

I'm installing LV11 on my Win7 homebox as I type. I'll try it on that.

BTW, I ran both sides on my XP/LV8.6 machine this afternoon for over an hour with no problem. I'll try a longer test tomorrow.

I really appreciate you running these LV/OS combo tests.



I know it's one of those silly questions (especially since the C program would suffer from it too). But has to be asked.....

Are you sure the power saving is turned off on the network card(s)?



At this point, there are no such things as silly questions. But yes, I've turned off any and all power savings settings I could find, including the network card. At least on my computers here at work.

