KV Cache Preemption

[Interactive visualization, step 1 of 10. Panels: Requests; GPU KV Cache Pressure; CPU Memory (swap space); GPU Physical Memory (12 blocks)]