
Failure to acknowledge TCP message



I've been scratching my head with this one and am hoping someone has dealt with it before.

My physical setup is a PC, CAT5 cables, a couple of switches, an Eaton HMI, and an Eaton PLC. I'm talking Modbus TCP to the PLC using the library on NI's website.

I'm using Wireshark to troubleshoot other problems and found the PLC retransmitting its response to the PC very frequently (not every time). The PC reads and writes 32 words to the PLC every 100 msec, but only performs a write when the data changes. The PLC transmits a response within 6 msec of a read message (function 3 in Modbus). The PC never acknowledges that first response from the PLC. The PLC retransmits 50 msec later and the PC acknowledges the retransmitted message. The write messages occur about every 1 second (my heartbeat to the PLC).

I've tried switching to the second port on the PC, which goes to a second card in the PLC. That network has four drives, two air manifolds, a remote I/O block and an RFID system on it that are all talking to the PLC. The PLC sequences through all the remote devices, including the PC, so the response to the PC can take 75 - 150+ msec. The PC always acknowledges the first message with that network.

Anyone have some thoughts as to what's going on?

Tim

Forgot to mention that I'm using LabVIEW 8.6.1 and have swapped all of the network components.


I'm talking Modbus TCP to the PLC using the library on NI's website.

Is it this library you are using? If so, did you correct the bug pointed out in the comments:

a flaw in MB CRC-16.vi in NI Modbus.llb

A failure mode turned up in testing at 1200 baud. It turned out to be a flaw in MB CRC-16.vi in NI Modbus.llb. The failure mode is that MB Serial Receive.vi, which waits for a Modbus reply to a command, may falsely determine that the message is complete before the last byte is received. This is because MB CRC-16.vi coerces the U16 CRC to a U8, and the U8 becomes zero for a message with all but the last byte transmitted. The correction is to change the data type of the Exception Code indicator from U8 to U16.

- Jim Figucia, Code G Automation. labview_work@msn.com - Sep 15, 2009

I haven't used it but when I downloaded the library a few months ago the bug was still there. Don't know if it could cause your problem.
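To make the quoted failure mode concrete, here is a rough Python sketch (not the LabVIEW code). It computes the standard Modbus RTU CRC-16 and shows how keeping only 8 bits of a "CRC over everything received so far" check can make a frame that is missing its last byte look complete. The byte-swapped representation in `swap_bytes` is an assumption, chosen only because it reproduces the symptom described in the comment; how MB CRC-16.vi actually holds the value is not confirmed here.

```python
def crc16_modbus(data: bytes) -> int:
    """Standard Modbus RTU CRC-16 (init 0xFFFF, reflected polynomial 0xA001)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def swap_bytes(value: int) -> int:
    """ASSUMPTION: the CRC is held in wire order (low byte first)."""
    return ((value & 0xFF) << 8) | (value >> 8)


# Read 1 holding register at address 0 from slave 1 (function 3).
request = bytes([0x01, 0x03, 0x00, 0x00, 0x00, 0x01])
crc = crc16_modbus(request)
frame = request + bytes([crc & 0xFF, crc >> 8])  # CRC goes out low byte first

# A complete frame checks out: the CRC over message plus CRC bytes is zero.
full_check = crc16_modbus(frame)

# Drop the last byte, as if it had not arrived yet.
partial_check = swap_bytes(crc16_modbus(frame[:-1]))
as_u16 = partial_check          # 16-bit value: zero only if the CRC high byte is 0
as_u8 = partial_check & 0xFF    # truncated to 8 bits: the low byte is always zero

print(f"full frame check:     0x{full_check:04X}")
print(f"partial check as U16: 0x{as_u16:04X}")
print(f"partial check as U8:  0x{as_u8:02X}")
```

This only applies to serial (RTU) framing. Modbus TCP frames carry no CRC, so the sketch illustrates the quoted bug rather than anything in the TCP path.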


Before I forget, I appreciate any and all thoughts.

Sounds like a network stack issue. What OS are you running? Is this your problem?

Sorry, forgot to put that in. The OS is Windows 7. It is possible that I'm seeing an OS-level issue, which appears to be how that person solved the problem (by replacing the machine with a WinXP system). I don't have a reset of the Ethernet card, and I'm starting the PC side after the PLC has booted up. I'll check the start of the connection, though.

Is it this library you are using? If so, did you correct the bug pointed out in the comments:

Ah, yes, that's it.


Unfortunately that CRC bug is not the cause. TCP is a connection-based protocol, meaning everything sent has to be acknowledged by the receiver; this happens down in the TCP primitives, below the Modbus library. The only symptom I see in LabVIEW is that the TCP Read takes ~56 msec to complete. It should complete in ~6 msec if the PC acknowledged the response when it first arrived, instead of only after the PLC retransmits it for lack of an acknowledgement.
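For anyone who wants to reproduce the timing outside LabVIEW, a minimal Python sketch follows. It builds the same kind of function 3 request over Modbus TCP and times each request/response round trip. The unit id of 1, start address 0, and the `time_reads` helper itself are assumptions for illustration; port 502 is the standard Modbus TCP port. Wireshark remains the place to see the ACKs themselves.

```python
import socket
import struct
import time


def build_read_request(transaction_id: int, unit_id: int,
                       start_addr: int, count: int) -> bytes:
    """Build a Modbus TCP 'read holding registers' (function 3) request."""
    # MBAP header: transaction id, protocol id (0), length of what follows,
    # unit id. PDU: function code, start address, register count.
    return struct.pack(">HHHBBHH", transaction_id, 0, 6, unit_id,
                       3, start_addr, count)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise if the peer closes the connection."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-response")
        buf.extend(chunk)
    return bytes(buf)


def time_reads(host: str, port: int = 502, polls: int = 50) -> list:
    """Return round-trip times in msec for repeated 32-word reads."""
    times = []
    with socket.create_connection((host, port), timeout=2.0) as sock:
        for tid in range(polls):
            request = build_read_request(tid, unit_id=1, start_addr=0, count=32)
            start = time.perf_counter()
            sock.sendall(request)
            header = recv_exact(sock, 7)                # 7-byte MBAP header
            length = struct.unpack(">HHHB", header)[2]  # counts unit id + PDU
            recv_exact(sock, length - 1)                # PDU (unit id already read)
            times.append((time.perf_counter() - start) * 1000.0)
            time.sleep(0.1)                             # 100 msec poll interval
    return times
```

Calling `time_reads("192.168.1.10")` (a made-up address) against the PLC would give a list of per-poll times to compare against the ~6 msec and ~56 msec figures above.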


I recently chased a TCP issue under LV 8.6 and came away believing that 10 Hz is about the best I can expect out of Windows.

Since it's OK on the other NIC, I would expect that the other network traffic is getting in the way and tying up the TCP stack, so the packet is not being acked.

I found a lot of info posted by gamers trying to reduce Lag so searching on Lag and speed will get you some good hits.

This link

http://technet.microsoft.com/en-us/library/bb726981.aspx

gives you the internals of the TCP stack under Windows, and if there is anything that can be tweaked, it's in there. If you find something out, please share so we can all benefit.

Ben


Most of the gaming advice has been about removing unneeded services and setting the performance options (system properties->advanced tab->performance options button) to "adjust for best performance". These did not help the acknowledge issue.

I found one YouTube video at:

It recommends going into the registry and adding "TcpAckFrequency" and "TCPNoDelay" to the parameters for the NIC. I'm seeing a positive change, but I was fooled a couple of times today, so I'm not ready to say it's fixed yet. The registry path is:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Tcpip\Parameters\Interfaces\{ .... some GUID identifying your NIC card port .... }

Check the IP address to figure out which interface you should be looking at. Under the path add the values:

TcpAckFrequency DWORD data=0x1

TCPNoDelay DWORD data=0x1

TCPNoDelay = 1 turns off Nagling (Nagle's algorithm).

TcpAckFrequency is far more interesting per: http://support.microsoft.com/kb/328890

It seems Windows sends out a TCP acknowledge for every other segment received (per RFC 1122). Setting the ack frequency from 2 (default of every other segment) to 1 will acknowledge every segment. :frusty:
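As a convenience, the two values can be written out as a .reg file rather than typed into regedit by hand. The Python sketch below only generates the text; `make_reg_file` and the all-zero GUID are placeholders for illustration, and the real GUID has to be the one for your NIC as described above. Review the output before importing it, since it changes settings for every connection on that interface.

```python
TCPIP_INTERFACES = (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services"
                    r"\Tcpip\Parameters\Interfaces")


def make_reg_file(interface_guid: str) -> str:
    """Return .reg file text that sets both values to 1 for one interface."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{TCPIP_INTERFACES}\\{interface_guid}]",
        '"TcpAckFrequency"=dword:00000001',
        '"TCPNoDelay"=dword:00000001',
        "",
    ]
    return "\r\n".join(lines)


# Placeholder GUID for illustration only; substitute the one for your NIC.
EXAMPLE_GUID = "{00000000-0000-0000-0000-000000000000}"
print(make_reg_file(EXAMPLE_GUID))
```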

Tim


If you use the link that I provided earlier, you will find that there is a way to turn off Nagling for a specific connection ref inside LabVIEW rather than disabling Nagle's Algorithm for all connections on an interface (registry setting).

The change could be done completely from within LabVIEW. Note the name of the LLB in the link: TCP_NODELAY.LLB
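The LLB itself is not reproduced here, but the per-connection option it refers to is the standard TCP_NODELAY socket option. As a rough equivalent in Python (the helper name `open_nodelay_socket` is made up for this sketch):

```python
import socket


def open_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled for this socket only."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = open_nodelay_socket()
# getsockopt returns a nonzero value when the option is enabled.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
print("TCP_NODELAY enabled:", nodelay_enabled)
```

Note that this only changes how the local side coalesces small sends. It does not change when the local side sends acknowledgements for data it receives, which is a separate mechanism.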


Sorry, thought I'd posted.

Adding the TcpAckFrequency with data = 0x01 to the registry for the physical interface resolved it. The default value is 2, meaning it will acknowledge every other TCP message or after a 200 msec timeout (the PLC retransmitted after 50 msec). Setting the data to 1 forces it to acknowledge every message that arrives.
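The arithmetic behind those numbers can be sketched with a toy model in Python. It is deliberately simplified and rests on stated assumptions: one response segment per poll, the PLC retransmitting every 50 msec until acknowledged, each arrival (including a retransmission) counting toward the ack frequency, and the 200 msec timer starting at the first arrival. The function name and defaults are illustrative, taken from the figures in this thread.

```python
def ack_time_ms(ack_frequency: int,
                first_arrival_ms: float = 6.0,
                retransmit_ms: float = 50.0,
                delayed_ack_timeout_ms: float = 200.0) -> float:
    """Time at which the receiver sends its ACK, in a deliberately simple model.

    ASSUMPTIONS: one response segment per poll, the sender retransmits it every
    retransmit_ms until acknowledged, every arrival (including a retransmission)
    counts toward ack_frequency, and the delayed-ACK timer starts at first arrival.
    """
    deadline = first_arrival_ms + delayed_ack_timeout_ms
    arrival = first_arrival_ms
    segments_seen = 0
    while arrival < deadline:
        segments_seen += 1
        if segments_seen >= ack_frequency:
            return arrival          # enough segments seen: ACK now
        arrival += retransmit_ms    # wait for the sender's next retransmission
    return deadline                 # timer expired first


for freq in (2, 1):
    print(f"TcpAckFrequency={freq}: ACK at {ack_time_ms(freq):.0f} msec")
```

With the defaults, a frequency of 2 gives an ACK at 56 msec (the retransmission is the second arrival) and a frequency of 1 gives 6 msec, which lines up with the ~56 msec and ~6 msec TCP Read times reported earlier.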

I've passed the info along to NI. Interestingly, I had to use email support as I used the word "Modbus"; it seems there is a recon (I think that was the name) group responsible for such and they only support through email at the moment. I could have continued with phone support if I had stuck to the TCP aspect.

Tim



Not really much NI can do about it. This is not a standard socket property at all, and Windows socket implementations don't even support setting it through the API. So even if NI ever decides to add a property interface to network refnums, it's not possible to change this setting from the program in any way. And the 200 ms delayed acknowledgment is a standard TCP/IP feature that Microsoft implemented in Windows 2000 to conform to the standard. So the really faulty party here is the PLC, which already resends after 50 ms, if you want to call that a fault. Realistically the registry setting is just a workaround to guarantee that any packet is acknowledged within 50 ms :-)

Enabling this registry setting in Windows is the only way to change this setting, and you don't want LabVIEW to change this behind your back in the registry ever! Especially since it enables this feature for any connection on that interface, which can be a real burden on network traffic if normal internet traffic also happens to go through this interface.



I completely agree. NI tech support is adding it to their database for when someone has the same issue. I expect Eaton and Delta (not sure who wrote the firmware for the ethernet module; Eaton repackages Delta PLCs) are not the only ones who are not aware of or compliant with the RFC standard. The other pieces of my system (Eaton/Vaccon drives, Festo manifolds, and Balluff RFID reader) have no issues.

