When a link becomes saturated and its queue depth grows significantly, a sudden delay is introduced into the stream. This often causes unnecessary retransmissions because the ACKs for the queued data segments weren't received within the expected time window (or the ACKs themselves were delayed in the opposite direction).
With a constant load level, the retransmissions should even out as TCP's RTT tracking adapts, but with the load changing constantly, retransmission counts will likely stay at that level.
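To illustrate the adaptation, here is a minimal sketch of the standard retransmission timeout estimator from RFC 6298 (smoothed RTT plus four times the RTT variance). The RTT values, the 200 ms floor (similar to Linux's default minimum; the RFC itself recommends 1 second) and the helper name `spurious_timeouts` are assumptions for illustration. Clock granularity, timer backoff and Karn's algorithm are ignored, so treat it as a rough model rather than what any particular stack does.

```python
from dataclasses import dataclass
from typing import List, Optional

ALPHA = 1 / 8  # gain for the smoothed RTT (RFC 6298)
BETA = 1 / 4   # gain for the RTT variance (RFC 6298)
K = 4          # variance multiplier (RFC 6298)


@dataclass
class RtoEstimator:
    min_rto: float = 0.2          # assumed floor in seconds (illustrative)
    srtt: Optional[float] = None  # smoothed RTT, unset until the first sample
    rttvar: float = 0.0
    rto: float = 1.0              # initial RTO before any measurement

    def update(self, sample: float) -> float:
        """Feed one RTT sample (seconds) and return the new RTO."""
        if self.srtt is None:
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # RTTVAR is updated first, using the previous SRTT.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - sample)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * sample
        self.rto = max(self.min_rto, self.srtt + K * self.rttvar)
        return self.rto


def spurious_timeouts(samples: List[float], min_rto: float = 0.2) -> List[int]:
    """Return indices of samples whose RTT exceeded the RTO in force at the time.

    Hypothetical helper: each such sample stands for an ACK that arrived
    after the timer fired, i.e. an unnecessary retransmission.
    """
    est = RtoEstimator(min_rto=min_rto)
    flagged = []
    for i, rtt in enumerate(samples):
        if est.srtt is not None and rtt > est.rto:
            flagged.append(i)
        est.update(rtt)
    return flagged


# 20 samples on an idle link (50 ms), then the queue fills (500 ms).
step = [0.05] * 20 + [0.5] * 20
print(spurious_timeouts(step))  # [20]
```

Only the first sample after the jump exceeds the timer; after that the variance term inflates the RTO and the estimator settles on the new RTT. With load that keeps shifting, each new jump can catch the timer out again, which is why the retransmission count doesn't drop.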
QoS can help reduce the queueing delay, but there's only so much you can do from one side of a link.