So I did a quick run, and I should say up front that the methods above can't really be compared apples to apples, partly for the following reasons:
Some methods only accept one element at a time. If you need to enter 1000 points at once, these methods will probably be slower and involve more operations.
Some methods, like the circular buffer, are much more useful in certain situations, such as when the buffer is needed in different loops or the data is acquired at different rates.
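To make the single-element vs. block-input point concrete, here is a minimal text-language sketch of a circular buffer in Python. This is not any of the LabVIEW implementations discussed above; the class and method names (`CircularBuffer`, `push`, `extend`, `ordered`) are made up for illustration. It shows that a one-element-at-a-time interface has to loop once per sample when handed a block.

```python
class CircularBuffer:
    """Fixed-size ring buffer; overwrites the oldest sample when full.

    Hypothetical illustration only, not the XNode or DVR implementation.
    """

    def __init__(self, size):
        self.size = size
        self.data = [0.0] * size
        self.head = 0    # index where the next sample will be written
        self.count = 0   # number of valid samples stored so far

    def push(self, sample):
        """Single-element input: one write and one index update."""
        self.data[self.head] = sample
        self.head = (self.head + 1) % self.size
        self.count = min(self.count + 1, self.size)

    def extend(self, samples):
        """Block input built on push(): one loop iteration per sample.

        Only the last `size` samples can survive, so older ones are skipped.
        """
        for s in samples[-self.size:]:
            self.push(s)

    def ordered(self):
        """Return the stored samples, oldest first."""
        if self.count < self.size:
            return self.data[:self.count]
        return self.data[self.head:] + self.data[:self.head]


buf = CircularBuffer(4)
for x in [1, 2, 3]:
    buf.push(x)          # single-point input
buf.extend([4, 5, 6])    # block input
print(buf.ordered())
```

Note that nothing is shifted on a write; only the `head` index moves, which is also why the same storage can be read from another loop without copying.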
Here are the numbers for single-point (one element at a time) inputs:
How long does it take to process 10000 input samples with a buffer size of 1000 on my computer?
Taylorh140 => 8.28579 ms
infinitenothing => 2.99063 ms (looks like shifting wins)
Data Queue Pt-by-pt => 9.03695 ms (I expected that this would beat mine)
hooovahh Circular Buffer => 8.7936 ms (Nearly as good as mine and uses DVR)
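For anyone who wants to reproduce the shape of this test outside LabVIEW, a rough Python sketch of the same setup (10000 samples, buffer size 1000, one sample per call) might look like the following. The function names are hypothetical, and the two variants (list shifting and a bounded `collections.deque`) are stand-ins rather than ports of the VIs above, so the timings say nothing about how the LabVIEW versions rank.

```python
import time
from collections import deque

BUFFER_SIZE = 1000
N_SAMPLES = 10_000


def run_shift(samples, size):
    """Shift-style buffer: drop the oldest element, append the newest."""
    buf = [0.0] * size
    for s in samples:
        buf = buf[1:] + [s]   # copies the whole buffer on every sample
    return buf


def run_deque(samples, size):
    """Bounded deque: the oldest element is discarded automatically."""
    buf = deque([0.0] * size, maxlen=size)
    for s in samples:
        buf.append(s)
    return list(buf)


def time_ms(fn, *args):
    """Run fn once and return (elapsed milliseconds, result)."""
    start = time.perf_counter()
    result = fn(*args)
    return (time.perf_counter() - start) * 1000.0, result


samples = [float(i) for i in range(N_SAMPLES)]
shift_ms, shift_out = time_ms(run_shift, samples, BUFFER_SIZE)
deque_ms, deque_out = time_ms(run_deque, samples, BUFFER_SIZE)

# Both variants must end up holding the same last 1000 samples.
assert shift_out == deque_out

print(f"shift: {shift_ms:.3f} ms")
print(f"deque: {deque_ms:.3f} ms")
```

A single run like this is noisy; repeating the measurement and taking the minimum or median would give more trustworthy numbers.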
I would consider all of these to be winners, except maybe the Data Queue pt-by-pt (though it is available by default, which gives it a slight edge). Perhaps I'll have to do another test where the inputs are more than one sample.
Note: if you want to try the source, you'll need the circular buffer XNodes installed.
buffer.zip