vllm bench sweep plot_pareto¶
JSON CLI Arguments¶
When passing JSON CLI arguments, the following sets of arguments are equivalent:
--json-arg '{"key1": "value1", "key2": {"key3": "value2"}}'
--json-arg.key1 value1 --json-arg.key2.key3 value2
Additionally, list elements can be passed individually using +:
--json-arg '{"key4": ["value3", "value4", "value5"]}'
--json-arg.key4+ value3 --json-arg.key4+='value4,value5'
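The equivalence between the two forms can be illustrated with a small Python sketch. This is not vLLM's actual argument parser. The helper name expand_dotted is hypothetical, and the sketch assumes that every value stays a string and that a comma inside a "+" value separates list elements, as in the example above.

```python
import json


def expand_dotted(pairs):
    """Build a nested dict from (dotted_key, value) pairs.

    A trailing "+" on the key appends to a list; a comma-separated
    value is split into several list elements.
    """
    result = {}
    for dotted_key, value in pairs:
        append = dotted_key.endswith("+")
        parts = dotted_key.rstrip("+").split(".")
        node = result
        # Walk (and create) the intermediate dicts for key2.key3-style paths.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        leaf = parts[-1]
        if append:
            node.setdefault(leaf, []).extend(value.split(","))
        else:
            node[leaf] = value
    return result


# --json-arg.key1 value1 --json-arg.key2.key3 value2
nested = expand_dotted([("key1", "value1"), ("key2.key3", "value2")])
assert nested == json.loads('{"key1": "value1", "key2": {"key3": "value2"}}')

# --json-arg.key4+ value3 --json-arg.key4+='value4,value5'
listed = expand_dotted([("key4+", "value3"), ("key4+", "value4,value5")])
assert listed == json.loads('{"key4": ["value3", "value4", "value5"]}')
```

Both assertions compare the expanded dotted form against the parsed JSON form, mirroring the two equivalent pairs shown above.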
Arguments¶
--user-count-var¶
Result key that stores the concurrent user count. Falls back to max_concurrent_requests if that key is missing.
Default: max_concurrency
--gpu-count-var¶
Result key that stores the GPU count. If not provided, falls back to num_gpus or gpu_count, or to tensor_parallel_size * pipeline_parallel_size.
Default: None
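The GPU-count fallback can be sketched in Python as follows. This is a minimal illustration, not the tool's implementation. The function name resolve_gpu_count is hypothetical, the lookup order among the fallbacks is assumed from the description, and a missing parallel size is assumed to count as 1.

```python
def resolve_gpu_count(record, gpu_count_var=None):
    """Return the GPU count for one benchmark result record (a dict)."""
    # 1. An explicitly named key wins when it is present in the record.
    if gpu_count_var is not None and gpu_count_var in record:
        return int(record[gpu_count_var])
    # 2. Otherwise try the two conventional keys (order is an assumption).
    for key in ("num_gpus", "gpu_count"):
        if key in record:
            return int(record[key])
    # 3. Finally derive the count from the parallelism settings.
    #    Assumption: a missing parallel size counts as 1.
    tp = int(record.get("tensor_parallel_size", 1))
    pp = int(record.get("pipeline_parallel_size", 1))
    return tp * pp


print(resolve_gpu_count({"tensor_parallel_size": 2, "pipeline_parallel_size": 4}))
```

With a tensor parallel size of 2 and a pipeline parallel size of 4, the sketch derives 2 * 4 GPUs.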
--label-by¶
Comma-separated list of fields to annotate on Pareto frontier points.
Default: max_concurrency,gpu_count
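To make the idea of annotating frontier points concrete, here is a minimal Python sketch. It assumes two metrics that are both maximized. The metric names per_user and per_gpu, the helper names, and the key=value label format are hypothetical and not taken from the tool.

```python
def pareto_frontier(points, x_key, y_key):
    """Return the points not dominated on (x_key, y_key), both maximized."""
    # Sort by x descending (ties: y descending). A point is then on the
    # frontier only if its y exceeds every y seen so far.
    ordered = sorted(points, key=lambda p: (-p[x_key], -p[y_key]))
    frontier = []
    best_y = float("-inf")
    for p in ordered:
        if p[y_key] > best_y:
            frontier.append(p)
            best_y = p[y_key]
    return frontier


def make_label(point, label_by):
    """Format the fields named in a comma-separated label_by string."""
    fields = [f.strip() for f in label_by.split(",") if f.strip()]
    return ", ".join(f"{f}={point[f]}" for f in fields if f in point)


# Made-up example records, for illustration only.
runs = [
    {"max_concurrency": 1, "gpu_count": 1, "per_user": 90.0, "per_gpu": 90.0},
    {"max_concurrency": 8, "gpu_count": 1, "per_user": 60.0, "per_gpu": 480.0},
    {"max_concurrency": 8, "gpu_count": 2, "per_user": 55.0, "per_gpu": 300.0},
    {"max_concurrency": 32, "gpu_count": 1, "per_user": 20.0, "per_gpu": 640.0},
]
frontier = pareto_frontier(runs, "per_user", "per_gpu")
for p in frontier:
    print(make_label(p, "max_concurrency,gpu_count"))
```

In this made-up data the third run is dominated by the second (lower on both metrics), so only the other three runs receive a label.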
--dry-run¶
If set, prints the figures to plot without drawing them.
Default: False