In LabVIEW 7.1, the approach you are suggesting is, in most cases, the fastest way to send data from the FPGA to a host machine. You may be able to speed it up slightly by buffering the data in an intermediate array of 16 to 32 elements, so that the synchronization overhead is paid once per chunk rather than once per element.
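To see why chunking helps, here is a rough cost model in Python. It is only a sketch and not LabVIEW code: the function names and the cost parameters (`sync_cost`, `element_cost`) are made up for illustration, and the real numbers depend on your hardware and on how the handshake is implemented.

```python
def handshakes_needed(num_elements: int, chunk_size: int) -> int:
    """Count the host/FPGA synchronizations needed to move num_elements
    when chunk_size elements are transferred per synchronization."""
    if num_elements < 0 or chunk_size < 1:
        raise ValueError("num_elements must be >= 0 and chunk_size >= 1")
    # Integer ceiling division: a partially filled last chunk still costs one handshake.
    return (num_elements + chunk_size - 1) // chunk_size


def transfer_cost(num_elements: int, chunk_size: int,
                  sync_cost: float, element_cost: float) -> float:
    """Total cost in arbitrary time units (hypothetical model):
    one sync_cost per handshake plus one element_cost per element moved."""
    return (handshakes_needed(num_elements, chunk_size) * sync_cost
            + num_elements * element_cost)


if __name__ == "__main__":
    # Compare per-element transfers with 16- and 32-element chunks.
    for chunk in (1, 16, 32):
        print(chunk, handshakes_needed(1024, chunk),
              transfer_cost(1024, chunk, sync_cost=10.0, element_cost=1.0))
```

With 1024 elements, a chunk size of 32 cuts the number of synchronizations from 1024 to 32. The per-element cost is unchanged, which is why the gain is limited when the handshake is not the dominant cost.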
In LabVIEW 8.0, FPGA FIFOs can use DMA to send data directly from the FPGA to the host, which is orders of magnitude faster than the method above.