
GregSands

Members
  • Posts: 264
  • Days Won: 25

Posts posted by GregSands

  1. Reds, did you miss my earlier comment? While lvanlys.dll does not use multicore functions, the LabVIEW Multicore Analysis and Sparse Matrix Toolkit (MASM) does. Despite X's comments on the lack of documentation, which are quite fair and accurate, I'm seeing huge speedups on my i7-12700F with 8P+4E cores (20 logical processors according to CPU Information).

    For a 1499x1499 CDB (complex double) 2D array, the built-in NI FFT takes ~270ms, the MASM FFT takes ~20ms, and the MASM FFT on the same array as CSG (complex single) takes ~11ms (the built-in FFTs don't support single precision, but MASM does, and it's often all that is needed).

     

    [Image: FFTComparison.png]

  2. I would recommend trying the NI Multicore Analysis and Sparse Matrix Toolkit (MASM). The FFT included there seems robust and much faster than the native LabVIEW FFT. It also supports single precision (which is faster again, and usually sufficient) and 3D FFTs as well. All of the MASM VIs are useful replacements for the single-threaded defaults, and I think the toolkit is based on the Intel Math Kernel Library. I think I tried FFTW at one stage - in fact I just found a post I made in 2013 which has links to an FFTW wrapper - I'd forgotten about that! Also at the end of the page is a link to a 32/64-bit version. But I'd still try the MASM toolkit.
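As a rough illustration of the single-precision point, here is a sketch in Python using SciPy's `scipy.fft` instead of the MASM toolkit. That substitution is an assumption made purely for illustration, and it says nothing about how MASM works internally. The sketch times a 2D FFT of a 1499x1499 complex array in double and single precision, with `workers=-1` asking SciPy to use all available cores. Timings depend on the machine and are not comparable to the LabVIEW figures.

```python
import time

import numpy as np
import scipy.fft


def time_fft2(a, workers=-1, repeats=3):
    """Return (best wall time in seconds, result) for a 2D FFT of `a`."""
    best = float("inf")
    out = None
    for _ in range(repeats):
        t0 = time.perf_counter()
        out = scipy.fft.fft2(a, workers=workers)  # workers=-1: use all cores
        best = min(best, time.perf_counter() - t0)
    return best, out


rng = np.random.default_rng(0)
n = 1499  # same array size as quoted in the posts; the data itself is random

# Complex double (LabVIEW "CDB") and the same data as complex single ("CSG").
cdb = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
csg = cdb.astype(np.complex64)

t_double, f_double = time_fft2(cdb)
t_single, f_single = time_fft2(csg)

# Worst-case deviation of the single-precision result, relative to the
# largest magnitude in the double-precision result.
rel_err = np.max(np.abs(f_single - f_double)) / np.max(np.abs(f_double))

print(f"double: {t_double * 1e3:.1f} ms, single: {t_single * 1e3:.1f} ms")
print(f"single-precision relative error: {rel_err:.2e}")
```

The relative error printed at the end gives a feel for whether single precision is "usually sufficient" for a given application.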

  3. It appears that the shared libraries are fully threadsafe, given the calls are all set to run in any thread, and I don't think the zlib library is multithreaded.  Would there be any issues with setting the VIs to "Shared clone reentrant" to allow multiple simultaneous calls?

     

  4. The Parallel For Loop is perfect for parallel processing of an input array and reassembling the results in the correct order; however, this only works if the array is available before the loop starts. There is no equivalent "Parallel While Loop" which might process a data stream, so what is the best architecture for doing this?

    In my case, I'm streaming image data from a camera via FPGA, acquiring 1MB every ~5ms - call this a "chunk" of data - and I know I will acquire N chunks (N could be 1000 or more).  I then want to process (compress) this data before writing to disk.  The compression time varies, but is longer than the acquisition time.  So I'd like to have a group of tasks which will each take chunks and return the results - however it's no longer guaranteed that the results are in the same order, so there's a bit of housekeeping to handle that.

    I have a workable architecture using channels, but I'd be interested in any better options. It is easiest to explain with simplified code which mimics the real program:

    [Image: ParallelProcessTest.png]

    It requires the processing to use a Messenger channel (i.e. Queue) because a Stream channel cannot work in a Parallel For Loop, but this doesn't maintain order. And the reordering is a little messy - perhaps it could be tidied using Maps, but I don't have 2019 at the moment. The full image is too large to keep in memory (I'm restricted to 32-bit because the acquisition is from an FPGA card), so I need to process and write the data as it becomes available. I've considered writing a separate file for each chunk, but writing millions of small files a day is not particularly efficient or sustainable.

    Is there a better approach?  Have I missed something?  I feel like this must be a solved problem, but I haven't come across an equivalent example. Could there be a Parallel Stream Channel which maintains ordering, or a Parallel While Loop which handles a defined number of tasks?

    Thanks.
    Greg
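For readers who think in text code, the housekeeping described in this post can be sketched in Python. This is only an analogy to the LabVIEW channel design, not a translation of it: the chunk source, the sleep that mimics variable compression time, and all names below are assumptions made for illustration. Workers compress chunks in parallel and finish in any order. A map keyed by chunk index holds finished results until their turn, and a semaphore limits how many chunks are in flight so that memory stays bounded.

```python
import queue
import random
import threading
import time
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1 << 16  # small stand-in for the 1 MB chunks in the post


def acquire_chunks(n_chunks):
    """Stand-in for the camera/FPGA stream: yields (index, bytes) in order."""
    for i in range(n_chunks):
        yield i, bytes([i % 256]) * CHUNK_SIZE


def compress(index, chunk):
    """Worker task; the random sleep mimics variable compression time."""
    time.sleep(random.uniform(0.0, 0.01))
    return index, zlib.compress(chunk)


def process_stream(n_chunks, write, workers=4, max_pending=16):
    """Compress chunks in parallel, calling write(index, data) in index order."""
    pending = {}                              # finished results awaiting their turn
    slots = threading.Semaphore(max_pending)  # caps chunks held in memory
    done = queue.Queue()                      # results, in completion order

    def producer(pool):
        for i, chunk in acquire_chunks(n_chunks):
            slots.acquire()                   # wait if too many chunks in flight
            pool.submit(lambda i=i, c=chunk: done.put(compress(i, c)))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        feeder = threading.Thread(target=producer, args=(pool,))
        feeder.start()
        next_index = 0
        while next_index < n_chunks:
            i, data = done.get()
            pending[i] = data
            # Flush every result that is now contiguous with what was written.
            while next_index in pending:
                write(next_index, pending.pop(next_index))
                next_index += 1
                slots.release()
        feeder.join()


written = []
process_stream(20, lambda i, data: written.append(i))
print(written == list(range(20)))  # True: written in acquisition order
```

Because chunks are submitted in order and the pool runs tasks first-in first-out, the oldest unwritten chunk is always either running or finished, so the bounded window cannot stall.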

  5. I've not used the PCIe-1477, but have been using the earlier PCIe-1473 - different FPGA chip but I presume the coding is similar.  If you want to code the FPGA directly, rather than using the IMAQ routines, have a look at examples such as this one, which also show how to write to/from the CameraLink serial lines.  However, as @Antoine Chalons says, you do need to know the specific commands for your camera.

  6. So this gets a little more interesting with the output type of the DDS:

    DiagramDisable.png

    1. Following directly with a VIM causes the output to back-propagate from the VIM's default input type.

    2. This does not happen if the Types Must Match is used directly, even though this is essentially the contents of the VIM.

    3. Wrapping a sequence around either the DDS or the VIM causes the type to be defined correctly.

    4. Putting the DDS inside its own VIM also solves the problem, but only if there is also a sequence wrapping the DDS inside - if not, then the output type from the DDS VIM is always its own default output type.

    In any case, here's the Default Element VIM (saved for 2012) for any who might use it.

    Default Element.vim

     

  7. Does anyone know of a way to create a single (default) element of an arbitrary-dimension array?  I'm trying to create some Malleable VIs which have the same code for 1D-3D arrays, but have different code for floating-point vs integer arrays.  A second possible use in Malleable VIs would be to Initialize a new array based on an input array.

    Any thoughts from anyone?

  8. This probably won't help you, but you should be using HDF5 files - it can do exactly this. H stands for Hierarchical, and it is quite straightforward to write data to multiple files, and create a "master" file which transparently links them.  That works for writing as well as reading, so you can create the master file at the start, and write data to that which will be stored in separate files, or create it after writing individual files.  The HDF5 library handles all of the connection, and can be as simple as I said or far more complex if needed.
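The linking idea can be sketched with h5py, the Python binding for HDF5, used here only for illustration in place of a LabVIEW HDF5 library. The file names, dataset names and sample values are hypothetical. Each part is an ordinary HDF5 file, and the master file contains only external links that point at the datasets in those parts.

```python
import os
import tempfile

import h5py

# Hypothetical part-file names, chosen only for this illustration.
part_names = ["part_000.h5", "part_001.h5", "part_002.h5"]

with tempfile.TemporaryDirectory() as folder:
    # Write each part as an independent HDF5 file with one dataset.
    for k, name in enumerate(part_names):
        with h5py.File(os.path.join(folder, name), "w") as part:
            part.create_dataset("data", data=[k * 10 + j for j in range(5)])

    # The master file stores links only; the data stays in the part files.
    # Relative file names are resolved against the master file's folder.
    master_path = os.path.join(folder, "master.h5")
    with h5py.File(master_path, "w") as master:
        for k, name in enumerate(part_names):
            master[f"part_{k:03d}"] = h5py.ExternalLink(name, "/data")

    # Reading through the master file follows the links transparently.
    with h5py.File(master_path, "r") as master:
        values = [int(v) for key in sorted(master) for v in master[key][...]]

print(values)
```

The links can be created before or after the part files are written, since a link is only resolved when it is accessed.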
