vLLM V1 raises the default value of `max_num_seqs`, which allows higher parallelism. `max_requests` is therefore doubled to 1000 for V1.
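As a rough illustration of the change, the selection could look like the sketch below. The helper name `default_max_requests`, the `use_v1` flag, and the constant `V0_MAX_REQUESTS` are hypothetical; the previous value of 500 is only inferred from "doubled to 1000" and is not stated in the source.

```python
# Hypothetical sketch: the names here are illustrative, not the project's API.
# The pre-V1 value of 500 is an assumption inferred from "doubled to 1000".
V0_MAX_REQUESTS = 500


def default_max_requests(use_v1: bool) -> int:
    """Return the request cap for the selected vLLM engine version."""
    # V1 raises the default max_num_seqs, so the cap is doubled for it.
    return V0_MAX_REQUESTS * 2 if use_v1 else V0_MAX_REQUESTS
```

Keeping the V1 value as a multiple of the older one makes the relationship between the two caps explicit if the baseline changes later.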