Benjamin Satzger 1ae2804639 External Inference Engine via RunPod
Previously, each inference replica ran as a self-contained VM, hosting
both the inference gateway and inference engine.

This change introduces a new option: an inference replica whose VM still runs
the inference gateway, but replaces the local inference engine with a secure
tunnel to an external inference engine. In this setup, the external inference
engine is vLLM running on a RunPod pod.
2025-03-17 12:56:32 +01:00