Is the frame size 16 bytes? In the first you use 14, in the second 16.
This is slightly faster:
The biggest hurdle is converting the endianness; I do not think you can get much faster with the conversion itself.
I tried to be clever by using BLAS dcopy to copy out the relevant part, but the endianness conversion kills the performance gain.
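For anyone who wants the idea in text form, here is a rough Python sketch of the two steps: a strided copy of the relevant part (what dcopy does with an increment larger than 1), followed by the byte swap. It is not what the attached VI contains. The layout is purely an assumption for illustration: 16-byte frames, each holding two big-endian doubles, of which only the first is wanted. The names FRAME_SIZE, VALUE_INDEX and decode_frames are made up.

```python
import struct
import sys
from array import array

FRAME_SIZE = 16                      # assumed frame size in bytes
DOUBLES_PER_FRAME = FRAME_SIZE // 8  # 8 bytes per double
VALUE_INDEX = 0                      # assumed position of the wanted double in a frame


def decode_frames(raw):
    """Return the wanted double from every frame as a list of floats."""
    if len(raw) % FRAME_SIZE:
        raise ValueError("buffer length is not a multiple of the frame size")

    # Reinterpret the buffer as doubles in native byte order. The values are
    # still byte-swapped at this point on a little-endian machine.
    words = array("d")
    words.frombytes(raw)

    # Strided copy: take one double per frame, comparable to dcopy with
    # an increment of DOUBLES_PER_FRAME on the source.
    picked = words[VALUE_INDEX::DOUBLES_PER_FRAME]

    # Endianness conversion, done only on the values that were kept.
    if sys.byteorder == "little":
        picked.byteswap()

    return picked.tolist()


if __name__ == "__main__":
    demo = struct.pack(">dd", 1.5, 99.0) + struct.pack(">dd", -2.25, 7.0)
    print(decode_frames(demo))
```

Doing the strided copy first means only the kept values are swapped, but the swap still touches every byte of every kept value, so the copy being fast does not help much.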
decode-frame-cm.vi