ubicloud/config/ai_models.yml
Benjamin Satzger 8984f5ebe9 Double max_requests for vLLM V1
vLLM V1 increases the default value of `max_num_seqs`, allowing for
higher parallelism. Therefore, `max_requests` is doubled to 1000 for V1.
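
As a rough illustration of the change described above, a model entry in `ai_models.yml` might look like the fragment below. This is only a sketch: the file body is not shown here, so the surrounding keys (`name`, `engine`) and the model name are hypothetical placeholders. Only the `max_requests` key and its new value of 1000 come from the commit message, which implies a previous value of 500.

```yaml
# Hypothetical entry: only `max_requests` and its value are taken from the
# commit message; every other key and value is an assumed placeholder.
- name: example-model      # placeholder, not a real entry in the file
  engine: vllm             # assumed key marking a vLLM V1 deployment
  max_requests: 1000       # doubled from 500 for vLLM V1
```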
2025-02-14 16:35:59 +01:00

10 KiB