TurboQuant breakthrough: Google's TurboQuant compresses LLM KV-cache up to 6x without quality loss, freeing GPU memory and boosting inference speed. Hybrid attention savings: DeltaNet-style ...