Lost UDP packets due to ARP


Posted

I have LabVIEW code on 2 separate PCs using UDP to send data. It's a one-way path from a LV RT Phar Lap target to a Windows 10 receiver. I'm streaming data at a rate of 50 Hz and normally I don't lose any packets; I have confirmed this with Wireshark. However, every 15 min the sender sends an ARP packet instead of a UDP packet. This lost packet is critical to my process and not acceptable. Is there a way to turn off ARP or change its timeout to something longer than 15 min? I realize I need ARP, or do I? Can I just update this magical table manually?

Info on ARP from Wireshark.

Posted

Why are you using UDP if lost packets aren't acceptable? Even if you are on a closed network, there is no guarantee that all the packets get through. Sounds like you should be using TCP.

Posted

ARP is part of address discovery. It links MAC addresses to IP addresses. Yes you do need it.

Ditto gleichman or implementing something like RUDP at the application level.

Posted

I'm using LabVIEW to do all this. So not sure how to do RUDP. I'm using UDP because of the low overhead. But maybe I can still get the throughput with other methods. I will have to experiment and see. I've just never considered that this issue would come up at the slow rate I'm using.

Edit: It seems like ARP has a configurable timeout. Since my test only lasts for about 2 hrs, perhaps I can do an ARP at the start of the test and set the timeout longer than 2 hrs. Now to figure out how to configure this timeout in Phar Lap.

Posted (edited)

Here are some links to MS Docs that might be helpful:

The second one is for Vista. I couldn't come up with any information on W10, so details can/will differ.

Not sure if I'm misinterpreting the information, but shouldn't the ARP table keep updating (i.e. not send additional ARP requests) while packets are being transferred, or is this limited to TCP?

If an entry is not used for a time between 15 to 45 seconds, it changes to the "Stale" state.
Then, the host must send an ARP Request for IPV4 to the network when any IP datagram is sent
to that destination.

RFC 826 only mentions timeouts briefly:

It may be desirable to have table aging and/or timeouts. The
implementation of these is outside the scope of this protocol.
Edited by LogMAN
Posted

Considering your slow rate, why do you want to depend on the generally unreliable UDP instead of TCP? At such a rate TCP's overhead should not matter, so, if I may ask, what benefits are you looking for? Or is it more a matter of curiosity: UDP should work in principle, but ARP interferes with it?

Posted (edited)
 

I'm using LabVIEW to do all this. So not sure how to do RUDP. I'm using UDP because of the low overhead. But maybe I can still get the throughput with other methods. I will have to experiment and see. I've just never considered that this issue would come up at the slow rate I'm using.

On a closed network where lost packets are unlikely, TCP with Nagle turned off should have minimal overhead/latency vs. UDP.
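For anyone wanting to see what "Nagle turned off" looks like outside LabVIEW, here is a minimal Python sketch. It is only an illustration: the function name `open_low_latency_tcp` and the timeout value are made up for this example, and nothing here is the LabVIEW or Phar Lap way of doing it. The only real API used is the standard `TCP_NODELAY` socket option.

```python
import socket


def open_low_latency_tcp(host: str, port: int, timeout_s: float = 2.0) -> socket.socket:
    """Connect a TCP socket and disable Nagle's algorithm on it.

    Hypothetical helper for illustration; host, port and timeout_s are
    whatever the application needs.
    """
    sock = socket.create_connection((host, port), timeout=timeout_s)
    # TCP_NODELAY=1 asks the stack to send small writes right away
    # instead of coalescing them while waiting for outstanding ACKs.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

With a small message every 20 ms, each `sendall()` on such a socket should go out as its own segment rather than being held back for batching.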

 

Looking at this post by an NI employee, it seems that the ARP table cannot be statically defined on Phar Lap.

Is there any way to send the ARP manually, i.e. call into the Winsock API (Phar Lap :( ) and force an ARP every 5 minutes to refresh the table?

Edited by smithd
Posted
 

So not sure how to do RUDP

You would have to create/send the packet header(s) as defined by RUDP in each data packet in LabVIEW on the Phar Lap side by placing it before the data you send. Then you would have to send a response packet with the RUDP header(s) on the LabVIEW host side based on whether you received a packet out of sequence (or with an invalid checksum, etc.). You would effectively be creating your own slimmed-down version of TCP at the LabVIEW application layer. Quite a pain unless absolutely necessary.
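To make the idea concrete, here is a rough Python sketch of the header handling described above. It is not the RUDP header format from the spec: the 8-byte layout (sequence number plus CRC-32 of the payload) and all function names are assumptions chosen for illustration, and sequence wraparound and the acknowledgement packets themselves are left out.

```python
import struct
import zlib
from typing import List, Tuple

# Hypothetical header layout (NOT the RUDP spec): two big-endian uint32
# fields, the sequence number followed by the CRC-32 of the payload.
HEADER = struct.Struct("!II")


def pack_packet(seq: int, payload: bytes) -> bytes:
    """Sender side: place the header before the data."""
    return HEADER.pack(seq & 0xFFFFFFFF, zlib.crc32(payload)) + payload


def unpack_packet(datagram: bytes) -> Tuple[int, bytes]:
    """Receiver side: split header and data, rejecting corrupted datagrams."""
    if len(datagram) < HEADER.size:
        raise ValueError("datagram shorter than header")
    seq, crc = HEADER.unpack_from(datagram)
    payload = datagram[HEADER.size:]
    if zlib.crc32(payload) != crc:
        raise ValueError("checksum mismatch")
    return seq, payload


def missing_sequences(expected: int, received: int) -> List[int]:
    """Sequence numbers skipped before `received` (no wraparound handling)."""
    return list(range(expected, received))
```

The receiver would compare each incoming sequence number against the one it expects and use the result of `missing_sequences` to decide what to request again in its response packet.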

Posted
 

I'm using LabVIEW to do all this. So not sure how to do RUDP. I'm using UDP because of the low overhead. But maybe I can still get the throughput with other methods. I will have to experiment and see. I've just never considered that this issue would come up at the slow rate I'm using.

Edit: It seems like ARP has a configurable timeout. Since my test only lasts for about 2 hrs, perhaps I can do an ARP at the start of the test and set the timeout longer than 2 hrs. Now to figure out how to configure this timeout in Phar Lap.

It seems to me that this is an extremely poor cure of the symptom rather than addressing the problem. If it ever gets deployed somewhere other than your particular network, with your particular data needs, it will undoubtedly run into problems.

RUDP is an application overlay. So you would need to code it yourself (from the spec I linked to). I don't know of any LabVIEW implementations but there are quite a few similar ones (RTP?) since UDP is used all the time with VOIP and video streaming.

Posted

TCP is not free of pain either though.  I've been on networks where the IT network traffic monitors will automatically close TCP connections if no data flows across them EVEN if TCP keep alive packets flow across the connections.  For whatever reason the packet inspection policies effectively ignore keep alive packets as legitimate.  We ended up having to send NO-OP packets with some dummy data in them every 5 minutes or so if no "real data" was flowing.
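The bookkeeping for that kind of application-level NO-OP can be sketched roughly like this in Python. The class name `IdleKeepAlive` and the injectable `clock` parameter are inventions for this example; the 300 s default just mirrors the "every 5 minutes or so" figure, and the content of the NO-OP packet itself is left to the application.

```python
import time
from typing import Callable


class IdleKeepAlive:
    """Tracks when the last real packet went out and says when a NO-OP is due.

    Hypothetical helper: call mark_sent() after every send (real data or
    NO-OP) and poll noop_due() from the sender loop.
    """

    def __init__(self, idle_limit_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._idle_limit_s = idle_limit_s
        self._clock = clock
        self._last_sent = clock()

    def mark_sent(self) -> None:
        # Any outgoing packet resets the idle timer.
        self._last_sent = self._clock()

    def noop_due(self) -> bool:
        # True once the connection has been idle for the full limit.
        return self._clock() - self._last_sent >= self._idle_limit_s
```

A monotonic clock is used so that wall-clock adjustments on the host do not trigger or suppress a NO-OP by accident.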

Posted (edited)

For fun I thought I'd make a list of the reasons I can remember why people sometimes choose UDP over TCP.

  • Connection overhead of TCP (initiating a connection)
    • Mainly a big deal with web browsers (each page has to connect to several domains, and each takes a few (usually 2 I believe) TCP connections, which introduces latency)
      • This is part of why HTTP/3 exists
    • Not a big deal for a 2 hour test where you open one connection
  • Don't need packet de-duplication or re-transmits
    • video streaming
    • or there is an application-specific usage pattern that makes application-layer handling of faults the better route (HTTP/3)
    • This application needs reliable transmission as it does not implement reliability at a higher level
  • Want to avoid ordered transmission/head-of-line blocking
    • This really means you are implementing multiplexing at the application level rather than at the TCP level -- it's a hell of a lot easier to open 10 TCP connections, especially in applications on closed networks which are not "web scale"
      • This is the reason HTTP/2 exists. HTTP/2 has connection multiplexing on TCP, HTTP/3 has connection multiplexing over UDP.
    • Given the reliable transmission and rate requirement, I'm assuming ordered transmission is desired
  • Want to avoid congestion control
    • Bad actor attempting to cause network failures
    • or: self-limited bandwidth use
      • This application falls under this category
    • or: Implement congestion control at the application layer (HTTP/3)
  • Memory/CPU usage of TCP implementation
    • Erm...LabVIEW
  • Network engineers want to heavily fiddle with parameters and algorithms without waiting for the OS kernel to update
    • HTTP/3 is supposed to be faster because of this -- TCP is tuned for 20 years ago or so it's been said, and HTTP/3 can be tuned for modern networks
    • I'm assuming this is not Michael

 

On a closed network, for this application, it's hard to see a benefit to UDP. (It occurs to me Michael never said it was a closed network, but if he put a Phar Lap system on the internet...😵)

Edited by smithd
Posted
 

I know nothing about ARP, but why would this come instead of a UDP packet?

I have seen something similar. I supported a system that had 10 cRIOs all transmitting UDP data to be logged at a reasonable rate, and packets would indeed just get "lost" (as noticed by the receiver PC). I never did any analysis to see if they were not being transmitted or not being received, I just solved the problem by using a different port per cRIO.

Posted
 

I have seen something similar. I supported a system that had 10 cRIOs all transmitting UDP data to be logged at a reasonable rate, and packets would indeed just get "lost" (as noticed by the receiver PC). I never did any analysis to see if they were not being transmitted or not being received, I just solved the problem by using a different port per cRIO.

If you've ever started up Wireshark, you'll see occasional TCP retransmits due to different factors, especially under load and especially with Wi-Fi. These don't get retransmitted with UDP. It sounds like most of your losses were due to packet collisions.

Posted

Thanks for the interesting conversation. I've resolved the issue, however, in a roundabout way: by cleaning up my code.

The UDP communication code in question is a model written in LabVIEW running on VeriStand, which is essentially a bunch of LabVIEW RT timed loops running on a Phar Lap target. So the model essentially opens a UDP connection, sends the message, then closes the connection. This is done at a rate of 50Hz. I changed the code so that it opens the connection at the start of the test and the model just continuously sends the message at the 50Hz rate, then at the end of the test I close the UDP connection. This resolved the issue because now the ARP is not sent anymore, like at all. Well, maybe at the start? But I haven't checked that, so I should do that later.
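The real code is a LabVIEW model, but the shape of the change (open once, send at a fixed rate, close once) can be sketched in Python like this. The function name `stream_samples` and its parameters are hypothetical, and a desktop Python loop is nowhere near as deterministic as an RT timed loop, so treat the timing as approximate.

```python
import socket
import time
from typing import Iterable, Tuple


def stream_samples(dest: Tuple[str, int], samples: Iterable[bytes],
                   period_s: float = 0.02) -> int:
    """Send each sample as one UDP datagram from a single long-lived socket.

    Hypothetical helper: period_s=0.02 corresponds to the 50 Hz rate.
    Returns the number of datagrams sent.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # opened once
    sent = 0
    try:
        next_deadline = time.monotonic()
        for payload in samples:
            sock.sendto(payload, dest)
            sent += 1
            # Schedule against absolute deadlines so timing error
            # does not accumulate from one iteration to the next.
            next_deadline += period_s
            delay = next_deadline - time.monotonic()
            if delay > 0:
                time.sleep(delay)
    finally:
        sock.close()  # closed once, at the end of the test
    return sent
```

The version that caused trouble would have had the `socket.socket(...)` and `close()` calls inside the loop, once per sample.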

The explanation I can come up with is, I was closing the connection every 20ms, so the OS considered the port closed, so it took that time to do the normal housekeeping of sending the ARP? But since the port was always open to a known IP address (and thus mac ID), it didn't need to do the ARP. I don't know, just my guess.

Ok, so bring on the comments about, why I would keep opening and closing the port. Bring it on, I can handle it... 😀

Posted (edited)
 

Ok, so bring on the comments about, why I would keep opening and closing the port. Bring it on, I can handle it... 

Okay... I'll bite just because I'm curious. 

While I was reading your post on how you fixed it, I was thinking of all of the times that I've reviewed code and shook my head when I see that type of programming.

 

Edited by Bryan
Posted (edited)
 

so bring on the comments about, why I would keep opening and closing the port. Bring it on, I can handle it... 

Actors? (AKA Throwing lots of cats in the air then trying to herd them.:D)

Edited by ShaunR