giopper Posted July 14, 2009

Hi guys,

I have a data acquisition system based on a NI-6122-PCI, 4 AI channels simultaneously sampled at 500 kHz over 16 bit. It's a lot of data: 4 x 500 kHz x I16 = 4 Mbyte/s (pure data). I need to transfer those samples (after timestamping and some other fast manipulation) to another machine for further processing, while the data taking is running. The transfer doesn't have to be real-time, but as simultaneous as possible with the data taking.

I have a DAQ loop where data are collected from the hardware once per second (4 x 500000 samples/s) and stored in a FIFO buffer (data-FIFO, an ad-hoc functional global). No problem here. Another loop takes the data from the data-FIFO, does some pre-processing and stores the result in another FIFO, the TCP-FIFO. No problem here. A third loop runs a TCP server: when a client is connected, it takes the data out of the TCP-FIFO and sends them using the LabVIEW native TCP functions.

The client is connected through a private Gbit network, a direct cable between two Gbit network adapters, proven to work properly with other software (although I never really measured the true maximum throughput). Unfortunately, somewhere there must be a bottleneck, because I see the data pile up in the TCP-FIFO, i.e. the transfer from hardware to the TCP server is faster than the transfer to the client over the direct cable connection. The TCP data flow is very stable, a continuous flow almost fixed at 15% of Gbit network capability as measured by the WinXP Task Manager, which (in my opinion) is ~15 Mbyte/s, assuming that the max Gbit LAN throughput is ~100 Mbyte/s.

The other machine runs a dedicated TCP client written in C++ and running under Scientific Linux 5.4. The client is definitely not the bottleneck.

I am afraid that the bottleneck could be in LabVIEW: does anybody know what the maximum throughput of LabVIEW's native TCP functions is? Does LabVIEW access the NIC drivers directly, or does it use an additional interface layer? If I write/compile my own TCP functions in C and call them from LabVIEW as external code, will that improve the efficiency of the TCP connection between the two machines?

Suggestions to improve the Gbit transfer rate or ideas about how to improve the data transfer are more than welcome.

Thanks
G.

LabVIEW 8.2, NI-DAQmx 8.6, WinXP32-SP2, Gbit NIC integrated in the mobo (nForce)
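For reference, one second of raw data is 4 channels x 500000 samples x 2 bytes = 4,000,000 bytes, and the TCP-server loop is essentially pushing one such block per iteration. A minimal C++/POSIX sketch of what that per-block send looks like on the socket side (this is not the LabVIEW code above, just an illustration; `send_all` is a hypothetical helper):

```cpp
// Minimal sketch of a per-block TCP send (POSIX sockets).
// send() may write less than requested, so loop until the whole block is out.
#include <sys/types.h>
#include <sys/socket.h>
#include <cstddef>

bool send_all(int sock, const char* buf, size_t len)
{
    while (len > 0) {
        ssize_t n = send(sock, buf, len, 0);
        if (n <= 0)                     // error or connection closed
            return false;
        buf += n;
        len -= static_cast<size_t>(n);
    }
    return true;                        // the full block went out
}
```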
Gary Rubin Posted July 14, 2009

Have you configured your connection to set TCPNoDelay?
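At the OS level, TCPNoDelay is just a socket option that turns off the Nagle algorithm. A minimal C++ sketch, assuming an already-connected socket descriptor `sock`:

```cpp
// Disable the Nagle algorithm on an existing TCP socket (POSIX; on Windows the
// same option is set via setsockopt with a char* cast for the option value).
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

bool set_tcp_nodelay(int sock)
{
    int flag = 1;   // 1 = send small segments immediately, do not coalesce
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag)) == 0;
}
```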
giopper Posted July 14, 2009

Hi Gary, thanks for your answer. No, I didn't know about that; I'm going to try it right now and I'll post the result here. And thanks for the link to those nice TechNet pages.

G.
ned Posted July 14, 2009

Quoting giopper: "Does LabVIEW access the NIC drivers directly, or does it use an additional interface layer? If I write/compile my own TCP functions in C and call them from LabVIEW as external code, will that improve the efficiency of the TCP connection between the two machines?"

It's unlikely that you could improve throughput by writing your own external code, unless you think you can do better than your operating system. From an NI employee in this thread: "The networking primitives are a thin wrapper around the OS's networking stack."

Try transferring your data in larger chunks, so that each packet of data has a greater ratio of data to overhead.

EDIT: just to follow up on Gary's suggestion, take a look at this VI from NI for disabling the Nagle algorithm on a single connection.
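To make the "larger chunks" idea concrete: instead of one write per small pre-processed record, several records can be concatenated into a single buffer and handed to one send call. A minimal C++ sketch (the record container and the 64 KB batch size are assumptions, not the actual LabVIEW code):

```cpp
// Batch several small records into one large send, so per-call and per-packet
// overhead is amortized over more payload. For brevity, partial writes are not
// handled here (see the send_all sketch earlier in the thread).
#include <sys/types.h>
#include <sys/socket.h>
#include <cstdint>
#include <vector>

bool send_batched(int sock, const std::vector<std::vector<uint8_t>>& records,
                  size_t batch_bytes = 64 * 1024)
{
    std::vector<uint8_t> out;
    out.reserve(batch_bytes);
    for (const auto& r : records) {
        out.insert(out.end(), r.begin(), r.end());
        if (out.size() >= batch_bytes) {            // flush a full batch
            if (send(sock, out.data(), out.size(), 0) < 0) return false;
            out.clear();
        }
    }
    if (!out.empty())                               // flush the remainder
        return send(sock, out.data(), out.size(), 0) >= 0;
    return true;
}
```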
mzu Posted July 14, 2009

giopper, I ran into this problem and managed to squeeze 50-60% of gigabit speed out of a PCI system.

First of all, your DAQ card is PCI and your Gbit card is PCI as well, right? So they both share the same bus bandwidth (out of a theoretical total of 133 Mbyte/s, practically ~100 Mbyte/s), and you cannot expect much more than 1/2 of that throughput going through your Gbit network.

If there is a direct link, why don't you use UDP to get rid of the TCP overhead? I do not think the Nagle algorithm delay is causing a problem here. There are two different characteristics of a network connection: throughput and latency. We want to improve the former, but disabling the Nagle delay addresses the latter.

The second thing is enabling jumbo frames, if both of your Gbit cards support them, and sending data in chunks just below the jumbo frame size (to account for the UDP header).

I did all the above, but I was not able to get more than ~25% of the Gbit throughput. (Data was streamed from the shared memory of a PCI card through the Gbit network using the LabVIEW API.) I believe it has to do with LabVIEW memory management and, most importantly, with the fact that UDP Write returns only when the data is actually written, i.e. there is no non-blocking mode. This may justify creating a separate thread to send the data over the Gbit network. The speed was slightly higher, but I do not remember the exact figures.

What did I do to raise it to ~50-60%? I implemented my own FIFO in C++, so that it does not use LabVIEW memory management heavily. Then I implemented a reader for this FIFO which stuffed the data into the UDP socket. This was done in C++, using a separate elevated-priority thread and non-blocking Windows sockets. A simple API was provided for LabVIEW.
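A rough sketch of that last approach: a dedicated sender thread drains a thread-safe FIFO into a UDP socket in chunks below the jumbo-frame payload size. This is a POSIX-flavoured, modern-C++ illustration of the idea only (mzu's actual code used non-blocking Windows sockets); the queue type and the 8192-byte chunk size are assumptions:

```cpp
// Dedicated sender thread: drain a FIFO into a UDP socket in chunks below the
// jumbo-frame payload size (sketch; chunk size and addressing are assumed).
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <algorithm>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <mutex>
#include <vector>

struct Fifo {
    std::deque<std::vector<uint8_t>> q;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    void push(std::vector<uint8_t> block) {
        { std::lock_guard<std::mutex> lk(m); q.push_back(std::move(block)); }
        cv.notify_one();
    }
    void finish() {                                   // producer signals end of acquisition
        { std::lock_guard<std::mutex> lk(m); done = true; }
        cv.notify_all();
    }
    bool pop(std::vector<uint8_t>& block) {           // blocks until data or finish()
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return !q.empty() || done; });
        if (q.empty()) return false;
        block = std::move(q.front());
        q.pop_front();
        return true;
    }
};

void udp_sender(Fifo& fifo, const char* dest_ip, uint16_t dest_port)
{
    const size_t kChunk = 8192;                       // assumed: below a 9000-byte jumbo frame
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    sockaddr_in dst{};
    dst.sin_family = AF_INET;
    dst.sin_port = htons(dest_port);
    inet_pton(AF_INET, dest_ip, &dst.sin_addr);

    std::vector<uint8_t> block;
    while (fifo.pop(block)) {
        for (size_t off = 0; off < block.size(); off += kChunk) {
            size_t len = std::min(kChunk, block.size() - off);
            sendto(sock, block.data() + off, len, 0,
                   reinterpret_cast<sockaddr*>(&dst), sizeof(dst));
        }
    }
    close(sock);
}
```

The producer side would push each pre-processed block into the Fifo and run udp_sender in its own (possibly elevated-priority) thread, which is the separation mzu describes.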