Issues TCP-ing with a C program on same computer
#1
Posted 22 July 2011 - 02:54 PM
P1) data simulator
P2) data parser
P3) data analyzer
Original version of the project:
P1 and P2 are written in C and live on a remote machine. P3 is written in LabVIEW and runs on a laptop. All the parts communicate via TCP. This works great.
Current version of project:
P1 is in C on the remote machine. P2 is in C on the laptop. P3 is in LabVIEW on the same laptop. All parts still communicate via TCP. This is not working so great. Everything connects fine and will run for anywhere from 2 to 10 minutes. Then P3 starts complaining about timeout errors and/or the network peer disconnecting. P2 starts complaining that it can't send its data to P2. Everybody eventually times out and resets connections. Sometimes starting up another program (WordPad, Program Manager, etc.) will cause this, but most often it happens even if there is no user interaction with the laptop.
The part of the LV code dealing with this consists of 1 reentrant VI called 4 times to handle 4 different data messages. Each message has its own port.
In order to debug I've ripped everything out of P3 except doing the TCP reads. I've knocked it back to reading just one message. I set the subvi with the TCP read to time critical priority (yes, I'm getting desperate). None of this helps.
Just to add insult to injury, if P3 is written in C and run on the laptop with P2, it all works fine...
Help?!?
#2
Posted 22 July 2011 - 03:07 PM
Quote: P2 starts complaining that it can't send its data to P2.
Do you do it through localhost? This is really strange and may point to issues with the OS installation as well as to issues inside P2 (if it's not a mistype).
CLD
#3
Posted 22 July 2011 - 04:23 PM
Quote: Do you do it through localhost? This is really strange and may point to issues with the OS installation as well as to issues inside P2 (if it's not a mistype).
All connections are made using actual IP addresses. But either way, the parts are always connecting, and data always flows for some amount of time. Just not for as long as I would like it to.
I would love it to be the P2 code since I didn't write that.
#4
Posted 22 July 2011 - 05:05 PM
Quote: All connections are made using actual IP addresses. But either way, the parts are always connecting, and data always flows for some amount of time. Just not for as long as I would like it to. I would love it to be the P2 code since I didn't write that.
Hrm... Do you have anything to monitor network traffic?
However, since P2 (C) and P3 (LV) work fine if they are on different computers, and P2 (C) and P3 (C) work fine on the same computer, it's hard to point a finger at P2 being the problem.
"If this was easy our kids would be doing it." - Coworker
#5
Posted 22 July 2011 - 06:28 PM
Quote: Hrm... Do you have anything to monitor network traffic?
Just downloaded and started playing with Wireshark. Looks like a great program. Except, I couldn't get it to work. Finally discovered that "loopback interfaces are not available on windows platform".
The Wireshark wiki does have links to a couple of commercial products that monitor loopback connections. But it will take me a while to get my hands on those, and I'm not even sure they'll tell me anything.
#6
Posted 22 July 2011 - 07:21 PM
Quote: Just downloaded and started playing with Wireshark. Looks like a great program. Except, I couldn't get it to work. Finally discovered that "loopback interfaces are not available on windows platform".
Wireshark is a good product; unfortunately I've only worked with it with the detailed instructions of tech support. It occurred to me you might be able to get load information from a managed switch.
Tim
#7
Posted 22 July 2011 - 09:20 PM
How many NICs do you have on each PC?
I'm assuming you are running Windows, and connected to an internet gateway/LAN on another connection.
One thing that might help you is changing the interface metric of the individual NICs on each PC. Put the ones used in your setup as a 1, and all the other ones 2 or higher. This will probably mess up your internet connection, or make it unbearably slow when it works.
~Jon
#8
Posted 23 July 2011 - 09:03 AM
What sort of throughput are you trying to achieve? You could try and give LV a bit more time to service the TCPIP stack by increasing the buffer size.
Founder and general mischief maker on www.labview-tools.com.
SQlite aficionado and websocket zealot.
If it 'aint in LabVIEW, then you 'aint got a clue!
#9
Posted 23 July 2011 - 08:33 PM
Quote: How many NICs do you have on each PC? I'm assuming you are running windows, and connected to an internet gateway/LAN on another connection.
One NIC. It's a very simple setup -- 3 computers on a standalone switch. They have hardcoded IP addresses. Yes, I am running Windows 7 (and using LV10). There was another NIC on the laptop, but I took it out in case it was the problem. Didn't help any.
#10
Posted 23 July 2011 - 09:21 PM
Quote: What sort of throughput are you trying to achieve? You could try and give LV a bit more time to service the TCPIP stack by increasing the buffer size.
Each of the data streams is ~8 MB/s, so worst case throughput is 24 MB/s (counting P2 --> P3 twice). I've been staring at Resource Monitor a lot. The network (1 Gb) is loping, the CPUs are loping.
Thanks for the vi. Any chance you've got a "Get Buffer" vi? I'd like to know what the buffers are set at before I start playing around with them. I tried to reverse-engineer the wsock32.dll call in "Set Buffer" but can't really test it here on my home box.
#11
Posted 24 July 2011 - 01:08 PM
Quote: Wireshark is a good product; unfortunately I've only worked with it with the detailed instructions of tech support. It occurred to me you might be able to get load information from a managed switch.
I'll talk to my network guru about that and see what hardware we've got available. But I'm thinking it's very possible this data never really makes it out of the computer onto the network.
#12
Posted 24 July 2011 - 03:52 PM
Quote: Each of the data streams is ~8 MB/s, so worst case throughput is 24 MB/s (counting P2 --> P3 twice). I've been staring at Resource Monitor a lot. The network (1 Gb) is loping, the CPUs are loping. Thanks for the vi. Any chance you've got a "Get Buffer" vi? I'd like to know what the buffers are set at before I start playing around with them. I tried to reverse-engineer the wsock32.dll call in "Set Buffer" but can't really test it here on my home box.
Well, that's not a huge amount. Even the default LV examples should be able to cope with that.
The default Windows buffer is 8192 (if I remember correctly; I don't have a getsockopt VI to hand... maybe later in the week). There are a few ways of calculating the optimum size dependent on the network characteristics, but I usually just set it to 65536 (64K) unless it's a particularly slow network (like dial-up). It really makes a difference with UDP rather than TCP (datagram size errors). Note, however, that it only makes a difference if you are setting the "Listener" connection. It has no effect on "Open".
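For the "Get Buffer" question, here is a minimal sketch in Python rather than LabVIEW. It uses the standard getsockopt and setsockopt socket calls with SO_RCVBUF and SO_SNDBUF. The helper names and the 65536 default are illustrative assumptions, not the contents of the VI mentioned in this thread. The OS may round or cap a requested size, so the sketch reads the value back after setting it.

```python
import socket


def get_buffer_sizes(sock):
    """Return the (receive, send) buffer sizes the OS reports for this socket."""
    rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return rcv, snd


def set_buffer_sizes(sock, size=65536):
    """Request new buffer sizes, then return what the OS actually applied."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    return get_buffer_sizes(sock)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("default (rcv, snd):", get_buffer_sizes(s))
        print("after requesting 64K:", set_buffer_sizes(s))
```

The printed numbers depend on the operating system and its tuning, so no expected output is given here.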
It's strange that the C program doesn't exhibit the same problem. If you are doing the usual write size, write data then you can try and combine it into one write operation (just concatenate the strings) but I haven't run into a problem with the former. If you have the C code then you can take a peek to see if they are doing anything fancy with the sockets... but I doubt it.
Are you trying to send and receive on the same port (Cmd-Resp), or do you have two separate channels, one for sending and one for receiving? If the latter, what part disconnects? The receipt of the data or the send of the data (or arbitrarily both), and what is the error thrown (66?). If you get timeout errors on a read, then you should see that in the network monitor (task manager) as drops but you say that it "lopes" (had to look that one up..lol). That's normally indicative of a terminated messaging scheme where the terminator gets dropped for some reason.
#13
Posted 27 July 2011 - 12:12 AM
Quote: The default windows buffer is 8192 (if I remember correctly-don't have the getsockettoption....maybe later in the week).
That's what the vi you sent defaults to. I can just go with that.
Quote: Note, however, that it only makes a difference if you are setting the "Listener" connection. It has no effect on "Open".
Ohh... so the fact that I'm the client (open) not the server (listen) means this isn't going to help?
Quote: It's strange that the C program doesn't exhibit the same problem. If you are doing the usual write size, write data then you can try and combine it into one write operation (just concatenate the strings) but I haven't run into a problem with the former. If you have the C code then you can take a peek to see if they are doing anything fancy with the sockets....but I doubt it.
Everyone involved agrees it's strange. And no, there's nothing fancy going on with the sockets on the C side. Not that we know of. This *is* a port of Unix C code, done by someone with lots of Unix C experience but zero Windows experience.
In order to remove any other possible code issues, I've written a couple vis that are basic read/write to the same port. The C programmer has done the same. My two programs run fine together. Her two programs run fine together. Run her write and my read and it runs for awhile but errors out eventually.
Tho this is making me wonder what might happen if we run my write and her read together...
Quote: Are you trying to send and receive on the same port? (Cmd-Resp) or do you have two separate channels; one for sending and one for receiving.
One port. The C side most of the time throws a 10060 error. The LV side throws 56 for a while and then 66. Basically, "what we've got here is a failure to communicate".
Quote: you should see that in the network monitor (task manager) as drops but you say that it "lopes" (had to look that one up..lol)
Oops. Sorry.
#14
Posted 27 July 2011 - 05:19 AM
Quote: One port. The C side most of the time throws a 10060 error. The LV side throws 56 for a while and then 66. Basically, "what we've got here is a failure to communicate".
Have you tried playing with the mode input of TCP Read? Do you use a message termination scheme with CR LF or something else? Basically LabVIEW defaults to a semi-buffered mode, but most simple socket programming without any poll() operation more closely resembles the immediate mode of LabVIEW. A mismatch between this mode and what a C program expects is usually the most likely reason for the behavior you see.
The C Read with LabVIEW Write most likely will simply work if above is the culprit.
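To illustrate the CR LF termination scheme asked about above, here is a minimal Python sketch of a terminated read over a plain socket. It is not LabVIEW's TCP Read and not code from this thread; recv_line and the 4096-byte read size are hypothetical choices. The receiver keeps appending whatever arrives until the terminator shows up, and holds on to any bytes that belong to the next message.

```python
import socket


def recv_line(sock, buffer, terminator=b"\r\n"):
    """Return one message without its terminator; leftover bytes stay in buffer."""
    while terminator not in buffer:
        chunk = sock.recv(4096)  # may return any number of bytes, like an immediate read
        if not chunk:
            raise ConnectionError("peer closed before a terminator arrived")
        buffer.extend(chunk)
    line, _, rest = bytes(buffer).partition(terminator)
    buffer[:] = rest  # keep the start of the next message for the next call
    return line


if __name__ == "__main__":
    left, right = socket.socketpair()
    left.sendall(b"first\r\nsecond\r\n")
    pending = bytearray()
    print(recv_line(right, pending))
    print(recv_line(right, pending))
    left.close()
    right.close()
```

If the sender never writes the terminator, or the receiver looks for a different one, the loop simply waits until the socket times out, which is the kind of mode mismatch described above.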
Rolf Kalbermatter
CIT Engineering Netherlands
A division of Test & Measurement Solutions
#15
Posted 30 July 2011 - 09:01 PM
Rolf's got a point, but immediate really puts a burden on the CPU since you've got to (wo)man-handle characters as they arrive, then concatenate and terminate the loop on whatever it's supposed to terminate on (number of bytes or term char).
This is the sort of thing:
As you can probably see from the snippet, there is a (small) possibility that the first 4 bytes are garbage, or maybe you read 1/2 way through a string and therefore expect a huge number.
So are you using character terminated or pretending a payload size? You haven't said much of the inner workings. Example perhaps?
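A payload-size scheme of the kind discussed here can be sketched in Python as follows. This is not the LabVIEW snippet referred to above. The 4-byte big-endian size header, the MAX_PAYLOAD limit and the function names are assumptions made for illustration. The size check guards against the case mentioned above, where a garbage or misaligned header would otherwise be read as a huge byte count.

```python
import socket
import struct

MAX_PAYLOAD = 16 * 1024 * 1024  # hypothetical sanity limit on one message


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv may return fewer."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(min(n - len(buf), 65536))
        if not chunk:
            raise ConnectionError("peer closed after %d of %d bytes" % (len(buf), n))
        buf.extend(chunk)
    return bytes(buf)


def recv_message(sock):
    """Read a 4-byte big-endian size header, then that many payload bytes."""
    (size,) = struct.unpack(">I", recv_exact(sock, 4))
    if size > MAX_PAYLOAD:
        raise ValueError("implausible payload size %d, stream may be out of sync" % size)
    return recv_exact(sock, size)


def send_message(sock, payload):
    """Send header and payload in one write so they cannot be interleaved."""
    sock.sendall(struct.pack(">I", len(payload)) + payload)
```

Both ends have to agree on the header width and byte order; if they do not, the reader sees exactly the "huge number" symptom.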
#16
Posted 03 August 2011 - 01:42 PM
The basic problem is to write 5MBytes of data from one program to another, on the same computer, every second, via TCP. In its original configuration, this data is literally 5MB all in one TCP write every second, not paced out. It uses payload size to determine the end of the data.
If the two programs are written in C, it works. I was incorrect in my original statement that if both programs are in LabVIEW it also works. It doesn't, or rather I haven't been able to figure out how to make it work. It does work if both LV programs are on different computers, but not if they are on the same computer. And if the LV is doing the write, the C read works fine. So the issue seems to be the LV read, on the same computer with any type of write. The two programs connect and are sending/receiving the data for several minutes (2-50). Then both sides stop with various errors. With both sides in LV, most often, the read errors out with a timeout (56), and the write errors out saying the system caused a network disconnect (62).
Here are some things I've tried that made little or no difference:
running the LV programs together on a different computer
Immediate mode read (thanks for the suggestion, Rolf)
breaking the 5 MB write up into 10 500 KB writes and 100 50 KB writes
breaking the 5 MB write up into 10 500 KB writes and pacing them out over 750 ms
reading the 5 MB all at once
breaking the read up into 500 KB and 50 KB passes
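The chunked and paced write tried above can be sketched in Python like this. It is not the code of the attached VIs; send_paced and its defaults (500,000-byte chunks spread over 0.75 s) are assumptions that mirror the numbers in the list.

```python
import socket
import time


def send_paced(sock, payload, chunk_size=500_000, window_s=0.75):
    """Send payload in chunk_size pieces spread evenly over window_s seconds."""
    chunks = [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]
    interval = window_s / len(chunks) if chunks else 0.0
    start = time.monotonic()
    for index, chunk in enumerate(chunks):
        sock.sendall(chunk)
        # Sleep until the end of this chunk's time slot so writes stay evenly spaced.
        delay = start + (index + 1) * interval - time.monotonic()
        if delay > 0:
            time.sleep(delay)
    return len(chunks)
```

Called once per second with a 5 MB payload, this would produce ten writes of 500 KB followed by roughly 250 ms of idle time, assuming the receiver keeps up.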
Shaun, your suggestion to play with the TCP buffer sizes helped in that instead of failing in a few minutes, it would go for several minutes. Oh, and controlling buffer size on a Windows 7 machine is a PITA. Check out this article if you're doing it on a Win7 platform. I tried every buffer size possible, but it never really helped much more. I even posted to serverfault.com with the problem, and also many other questions about Win7 TCP buffers that no one seems to understand, and have gotten a deafening silence in response.
At this point, I'm going with Plan B. I've configured a ramdisk on the computer and we're just going to write/read files. In retrospect, this may actually be a better solution, but dang it, I want to know why the TCP way isn't working.
I'm attaching a couple of very simple vis to demonstrate the problem. Just run them on the same machine, with that machine's IP address (it's an input instead of default for testing on 2 different machines). Written with LV 2010 SP1 64-bit on Windows 7. About all I haven't been able to try is a different combo of LV version/OS. The longest these test vis have ever run has been 50 minutes, and that was much longer than the norm. If anyone has a chance to run these vis, please let me know how it goes.
Thanks,
Cat
Attachments: Write TCP Data - Simple.vi (15.11K), Read TCP Data - Simple.vi (14.06K)
#17
Posted 03 August 2011 - 02:53 PM
Quote: Every once in a long while I'm presented with a problem I just can't figure out. It's been quite some time; I guess I'm overdue. I've run so many different tests I'm seeing them in my sleep, but here's the summary of tearing my hair out for the past two weeks: [...]
Try it this way....
Edited by ShaunR, 03 August 2011 - 02:53 PM.
#18
Posted 03 August 2011 - 05:00 PM
Quote: I'm attaching a couple of very simple vis to demonstrate the problem. Just run them on the same machine, with that machine's IP address (it's an input instead of default for testing on 2 different machines). Written with LV 2010 SP1 64-bit on Windows 7. About all I haven't been able to try is a different combo of LV version/OS. The longest these test vis have ever run has been 50 minutes, and that was much longer than the norm. If anyone has a chance to run these vis, please let me know how it goes.
I've had the VIs running on a WinXP SP3 Core2 Duo 2 GHz machine with 2 GB of RAM under LV 2011 for about 1-1/2 hours with no sign of issues. Can try with LV 2010, but don't have 9.0 installed.
Tim
#19
Posted 03 August 2011 - 05:14 PM
Quote: I've had the VIs running on a WinXP SP3 Core2 Duo 2 GHz machine with 2 GB of RAM under LV 2011 for about 1-1/2 hours with no sign of issues. Can try with LV 2010, but don't have 9.0 installed.
I knew LV2011 would be the answer to all my problems!!
I do not have an XP machine with anything other than LV8.6 on it, and it's out on loan at the moment. I should go see if I can borrow it back...
Thanks for the feedback!
Quote: Try it this way
I ran your two vis together on 2 different machines and they died after 7 minutes on 1 machine and after 42 minutes on the other...
#20
Posted 03 August 2011 - 05:14 PM
Quote: I knew LV2011 would be the answer to all my problems!!
Wait for SP1
Edited by ShaunR, 03 August 2011 - 05:17 PM.