Basically the same as Jacobson said. The DMA FIFOs internally are 64-bit aligned. If you try to push data through it that doesn't fit into the 64 bits (8 * 8 bit, 4 * 16 bit, 2 * 32 bit or 1 * 64 bit) then the FPGA will actually force alignment by stuffing extra filler bytes into the DMA channel. In that case you would loose some of the throughput as there is extra data transferred that simply is discarded on the other side. That loss is however typically very low. The worst case would be if you try to push 5 byte data elements (clusters of 5 bytes for instance) through the channel. Then you would waste 3/8 of the DMA bandwidth.
The performance on the FPGA side should not change at all purely from different data sizes. What could somewhat change is the usage of FPGA resources as binary bit data is stuffed, shifted, packed/unpacked and otherwise manipulated to push into or pull from the DMA interface logic.
The performance on the realtime side could change however as more complex packing/unpacking will incur some extra CPU consumption.