Jump to content

DMA FIFO


Tim_S

Recommended Posts

Posted

I'm hoping someone has run into this and can point me in a good direction.

The test machine I'm using is a 2.79 GHz Pentium 4, WinXP, 1 GB RAM system with a PCI-7831R card in it. I'm running LabVIEW 8.6.1.

I'm trying to stream data from the FPGA to the PC over a DMA FIFO. I've picked 250 kHz as my current benchmark as that is significantly above where the actual system is going to need to run. The FPGA side uses a fixed time interval (4 usec) FIFO write. The PC side uses two loops, one to read a latched timeout flag in the FPGA and one to read out the FIFO in a 20 msec loop.

This works most of the time until I do something like minimize Windows Task Manager. The CPU usage doesn't blip and stays at ~25%; the number of elements remaining in the FIFO jumps to 30,000+ and I start to lose data points as the FIFO write in the FPGA times out.

I've attached my test code.

I appreciate any light people can shed on this.

Tim

FPGA DMA Test.zip

Posted (edited)

I'm hoping someone has run into this and can point me in a good direction.

The test machine I'm using is a 2.79 GHz Pentium 4, WinXP, 1 GB RAM system with a PCI-7831R card in it. I'm running LabVIEW 8.6.1.

I'm trying to stream data from the FPGA to the PC over a DMA FIFO. I've picked 250 kHz as my current benchmark as that is significantly above where the actual system is going to need to run. The FPGA side uses a fixed time interval (4 usec) FIFO write. The PC side uses two loops, one to read a latched timeout flag in the FPGA and one to read out the FIFO in a 20 msec loop.

This works most of the time until I do something like minimize Windows Task Manager. The CPU usage doesn't blip and stays at ~25%; the number of elements remaining in the FIFO jumps to 30,000+ and I start to lose data points as the FIFO write in the FPGA times out.

I've attached my test code.

I appreciate any light people can shed on this.

Tim

What's the FIFO depth? You need to set it both for the FPGA (when you create it) AND the host (default for the host is 10,000 if memory serves me correctly.)

You'll use up the 10,000 n 40ms which can easily happen if you task switch in windows.

Edited by ShaunR
Posted

What's the FIFO depth? You need to set it both for the FPGA (when you create it) AND the host (default for the host is 10,000 if memory serves me correctly.)

You'll use up the 10,000 n 40ms which can easily happen if you task switch in windows.

The depth is 16,383 on the FPGA side. I wasn't setting it on the PC side either, but did find it after posting and tried up to 500,000 with no improvement.

I was able to avoid the timeout by changing the value of the wait to 0, but that hard-pegged the CPU. I can create the timeout with a wait of 1 msec.

Tim

Posted

Try this for the read loop.

ShaunR - I appreciate the replies. At this point I have NI Tech Support performing a lot of the same head-scratching that I've been doing.

I like the use of the shift register to read additional elements.

The CPU pegged at 100% usage, which isn't surprising since there is not a wait in the loop. I opened and closed various windows and was still able to create the timeout condition in the FPGA FIFO. The timeout didn't consistently happen when I minimized or restored the Windows Task Manager as before, but it does seem to be repeatable if I switch to the block diagram of the PC Main VI and then minimize the block diagram window. I charted the elements remaining instead of the data; the backlog had jumped to ~27,800 elements.

I dropped the "sample rate" of the FPGA side to 54 kHz from 250 kHz. I was able to run an antivirus scan as well as minimize and restore various windows without the timeout occuring. That is good, though the CPU usage is bad as there are other loops that have to occur in my final system.

Tim

Posted

The FIFO depth that you configure on the FPGA side is more of a "transfer buffer" if I understand it correctly. The depth you can dynamically set on the host side is the DMA's footprint in system RAM. You can make the host depth very large as long as you have the free RAM available. So increasing that depth on the host side should help a bit.

The CPU hogging is happening, as has been pointed out before, because you have no wait in the while loop. Here is where an interrupt would help you. It is far less costly in CPU cycles to wait for an interrupt than to wait for samples to arrive in the FIFO. Of course, you have a different problem. The samples are already there! That's because you are using a loop rate in Windows that is much too fast to be reliable. The latency in Windows is just not low enough for 20 msec. You must interpret your requirements and see if you really need to loop that fast. If so, I would recommend RT.

I think if you boost your FIFO depth on the host side (does not require FPGA recompile!) and grab more samples less often, you should be good to go.

- Dan

Posted

ShaunR - I appreciate the replies. At this point I have NI Tech Support performing a lot of the same head-scratching that I've been doing.

I like the use of the shift register to read additional elements.

The CPU pegged at 100% usage, which isn't surprising since there is not a wait in the loop. I opened and closed various windows and was still able to create the timeout condition in the FPGA FIFO. The timeout didn't consistently happen when I minimized or restored the Windows Task Manager as before, but it does seem to be repeatable if I switch to the block diagram of the PC Main VI and then minimize the block diagram window. I charted the elements remaining instead of the data; the backlog had jumped to ~27,800 elements.

I dropped the "sample rate" of the FPGA side to 54 kHz from 250 kHz. I was able to run an antivirus scan as well as minimize and restore various windows without the timeout occuring. That is good, though the CPU usage is bad as there are other loops that have to occur in my final system.

Tim

There is not supposed to be a wait. The read function will wait until it either has the number of elements requested, or there is a time-out. In this respect you change the loop execution time by reading more samples. If you are pegging the CPU then increase the number of samples (say to 15000) and increase your PCs buffer appropriately (I usually use 2x te FPGA size). 5000 data points was an arbitrary choice to give a couple of ms between iterations but if the PC is is still struggling (i.e there is some left over in the buffer at the end of every read) then it may not be able to keep up still when other stuff is happening.

Posted

There is not supposed to be a wait. The read function will wait until it either has the number of elements requested, or there is a time-out. In this respect you change the loop execution time by reading more samples. If you are pegging the CPU then increase the number of samples (say to 15000) and increase your PCs buffer appropriately (I usually use 2x te FPGA size). 5000 data points was an arbitrary choice to give a couple of ms between iterations but if the PC is is still struggling (i.e there is some left over in the buffer at the end of every read) then it may not be able to keep up still when other stuff is happening.

The FIFO read appears to poll the memory heavily thus causing the high CPU usage. I did try bumping the number of samples read at a time; the CPU usage stayed at 100%, hence why I'm thinking the FIFO read is polling and thus pegging the CPU.

I think if you boost your FIFO depth on the host side (does not require FPGA recompile!) and grab more samples less often, you should be good to go.

I have boosted the FIFO depth to 5,000,000 (i.e., a insane amount) on the host side and put a wait in of 100 msec. I get ~25,000 samples each read until I perform my 'minimize-Windows-Task-Manager' check at which point the samples goes to ~32,000 and Timeout check in the FPGA flags. Bumping the FIFO depth from 16k to 32k exceeds the resources on the FPGA card.

Tim

Posted

NI Tech support tested this code with their own PC and the same series card I have without any problems transferring the data across the FIFO. The difference seems to be that they used a multi-core PC and I have a single core PC. DMA transfer appears to have an exceptionally low priority, so any operation is enough to potentially disrupt the transfer. Their recommendation was to increase the size of the PC-side buffer.

Tim

  • 4 months later...

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Unfortunately, your content contains terms that we do not allow. Please edit your content to remove the highlighted words below.
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.