Hm....well I think a continuous read might be the fastest. Set it up to perform something like N channels N samples, reading continuously at a fast rate, then perform a read, taking in all samples ready to be read. You will likely get lots of samples because of your fast rate but who cares, just grab the newest for each channel, do calculations, and write single point, or maybe have the write task already open set to a continuous write, and do not regen.
The point I'm making, is I suspect that if you have your read task set to a continuous read, then just read how ever many samples are on the buffer, then this might be faster than reading a single point on each channel one at a time, because the task is already running and all that needs to happen is transfer the data.
Maybe the write can work the same way but it might be more complicated. If you already have the task running, maybe performing a continuous write would be faster than multiple single point writes, but regen needs to be off. Otherwise your output will repeat because this is usually used to do something like continually write a sine wave, you use the write once and it loops back around but, in this case you wouldn't want that. Sorry I don't have any hardware to test on.