Commit Graph

  • 0ce554985d Extend the runner's destroy deadline if it's still running a job Enes Cakir 2025-03-17 16:17:47 +03:00
  • 1271487236 Validate stored enum value, not method return value Jeremy Evans 2025-03-17 07:44:03 -07:00
  • 1ae2804639 External Inference Engine via RunPod Benjamin Satzger 2025-03-12 11:37:14 +01:00
  • c0c9e982fa Introduce empty AI model image Benjamin Satzger 2025-03-12 11:37:04 +01:00
  • e4ff060e52 New config for HuggingFace token Benjamin Satzger 2025-03-12 14:38:15 +01:00
  • d7f4cc2169 New config for RunPod API key Benjamin Satzger 2025-03-10 15:07:32 +01:00
  • 54ef0917af Migration: Store external config and state for inference Benjamin Satzger 2025-03-10 15:07:17 +01:00
  • 480b8e33a9 Configure restart behavior for inference services Benjamin Satzger 2025-03-10 15:06:43 +01:00
  • df49c3a306 Provide engine start command to inference replica Benjamin Satzger 2025-02-25 15:06:06 +01:00
  • cf0d7c1c46 Use the Sequel pg_auto_validate_enums plugin to automatically validate enum values Jeremy Evans 2025-03-12 11:46:42 -07:00
  • 32a5ea76a1 Disconnect Sequel::Database instance before fork Jeremy Evans 2025-03-11 17:10:24 -07:00
  • affcc39a51 Use spaces around | in destroy usage banners in the CLI Jeremy Evans 2025-03-13 09:31:06 -07:00
  • 971c4730d2 Update rhizome script location for Kubernetes (ben/ie-rhizome-user) Benjamin Satzger 2025-03-13 12:51:45 +01:00
  • 7f0a5758e1 Update rhizome script location for inference endpoints Benjamin Satzger 2025-03-13 12:50:17 +01:00
  • 23f049be63 Use rhizome user for service VM sshables Benjamin Satzger 2025-03-13 12:49:12 +01:00
  • ac85e59d9b Support showing examples for commands Jeremy Evans 2025-03-12 17:02:02 -07:00
  • 9fc326c965 Minor CLI help improvements Jeremy Evans 2025-03-12 14:16:38 -07:00
  • 355f7d3f97 Restructure CLI help to support command descriptions Jeremy Evans 2025-03-10 15:11:36 -07:00
  • db81fbb35c Avoid unnecessary spec subclasses Jeremy Evans 2025-03-12 12:06:42 -07:00
  • 3c186b1af5 Increase E2E timeout until the internal MinIO cluster stabilizes Enes Cakir 2025-03-13 14:01:03 +03:00
  • d03a9e516c Pin rubocop-capybara to 2.21 Daniel Farina 2025-03-12 18:55:18 -07:00
  • d6ac202364 Bump json from 2.10.1 to 2.10.2 dependabot[bot] 2025-03-12 16:02:06 +00:00
  • 1b54d1b87a Install rhizome programs in all Vm based services to /home/rhizome Daniel Farina 2025-03-12 11:33:11 -07:00
  • 9969d524ad Minor association improvements to MinioPool Jeremy Evans 2025-03-11 09:14:19 -07:00
  • 2c4131765e Avoid unnecessary join table for MinioServer cluster association Jeremy Evans 2025-03-11 09:11:39 -07:00
  • f136ee61a5 Avoid unnecessary join table for Project github_runners association Jeremy Evans 2025-03-11 09:05:01 -07:00
  • ef5285b3dc Remove unnecessary join table for MinioCluster servers associations Jeremy Evans 2025-03-10 09:07:11 -07:00
  • 66b4fafc35 Raise error in Rakefile instead of going forward with wrong environment Jeremy Evans 2025-03-05 17:34:26 -08:00
  • d722196bf2 Use the table component in the current usage view Enes Cakir 2025-02-19 18:03:11 +03:00
  • bd0de4ddaa Show finalized invoices as PDF Enes Cakir 2025-02-19 18:03:46 +03:00
  • 58efb43beb Add footer to the table card component Enes Cakir 2025-02-19 18:00:27 +03:00
  • 0c1eb41d3e Bump the development-dependencies group across 1 directory with 3 updates dependabot[bot] 2025-03-11 00:14:23 +00:00
  • 58b5ebeb17 Bump the js-dependencies group across 1 directory with 15 updates dependabot[bot] 2025-03-11 10:19:57 +00:00
  • 76aa15e1da Fix learn with greptile link Enes Cakir 2025-03-11 02:56:15 +03:00
  • ee0c344ea9 Fix cloudify_server script, adapting to Location being moved to DB Eren Başak 2025-03-11 16:16:09 +03:00
  • 504effca55 Associate Billing Records for Kubernetes Clusters Eren Başak 2025-03-05 02:06:57 +03:00
  • 44e1b31c69 Add Billing Rate Entries for k8s resources Eren Başak 2025-03-04 00:21:17 +03:00
  • a6d6a0705a Make ProvisionKubernetesNode cancel itself if cluster is dropped Eren Başak 2025-03-08 02:41:33 +03:00
  • 726768fc6b Convert push to bud in ProvisionKubernetesNode calls Eren Başak 2025-03-08 01:08:43 +03:00
  • fd000bc776 Increase nap time of k8s nodepool nexus Eren Başak 2025-03-08 00:46:20 +03:00
  • d6b881c4e6 Fix idempotency issue on k8s software install step Eren Başak 2025-03-08 00:45:39 +03:00
  • dabcb9d0ec Add workflow_dispatch trigger to the workflow jobs Enes Cakir 2025-02-28 17:35:44 +03:00
  • de5411cbd4 Set workflow job permissions explicitly Enes Cakir 2025-02-28 17:34:08 +03:00
  • fdf22ccdbb Bump prismjs from 1.29.0 to 1.30.0 dependabot[bot] 2025-03-10 22:49:27 +00:00
  • 6036001c25 Bump rack from 3.1.11 to 3.1.12 dependabot[bot] 2025-03-10 22:35:16 +00:00
  • 387c0de0a0 Fix the runner recycling issue for last-second assignments Enes Cakir 2025-03-11 11:25:41 +03:00
  • 9435c0b384 Update to Sequel checkout that supports deterministic pg_auto_constraint_validations cache file Jeremy Evans 2025-03-05 20:26:39 -08:00
  • 6cfb2ffd66 Fix regeneration of pg_auto_constraint_validations.cache Jeremy Evans 2025-03-05 17:31:27 -08:00
  • 1149ad0c3f Expand destroy guard of VM Nexus to all 3 states Eren Başak 2025-03-08 04:52:36 +03:00
  • a84fda5cad Create billing records for GPUs Benjamin Satzger 2025-03-07 14:42:31 +01:00
  • 5dbddc275c Add GPUs as new billing resource type Benjamin Satzger 2025-03-07 14:36:18 +01:00
  • 63be63c68c Get name of PCI device Benjamin Satzger 2025-03-07 14:27:45 +01:00
  • 4ba9b08390 Bump the js-dependencies group with 10 updates dependabot[bot] 2025-03-03 23:09:37 +00:00
  • 65c4ce0fd9 Update aws-sdk-s3 from 1.177 to 1.182 Enes Cakir 2025-02-28 17:09:39 +03:00
  • c029660d8f fixup! Use a timeout by default in Sshable#cmd (jeremy-reduce-apoptosis) Daniel Farina 2025-02-28 19:03:59 -08:00
  • 7b0bad25ed Use a timeout by default in Sshable#cmd Jeremy Evans 2025-01-28 13:58:37 -08:00
  • 5e0a94e2b5 Add new 20250302.1.1 runner image Burak Velioglu 2025-03-06 18:07:07 +03:00
  • fad8fa7cbf Enable speculative decoding for DeepSeek R1 32B Junhao Li 2025-02-25 11:19:36 -05:00
  • 0e1b8759a9 Make thread printer test robust to multithreading Daniel Farina 2025-03-05 09:30:19 -08:00
  • abe5fe1a25 E2E: Test host reboot for slices. Hadi Moshayedi 2025-03-03 13:34:59 -08:00
  • 479fad557b Use frozen constant for TARGET_STANDBY_COUNT_MAP Burak Yucesoy 2025-03-05 11:51:37 +01:00
  • d18f5720f9 Remove /failover endpoint for Postgres Burak Yucesoy 2025-03-04 15:51:33 +01:00
  • b3e48865b2 Do not pick servers that needs recycling as failover targets Burak Yucesoy 2025-01-02 04:04:17 +01:00
  • cdb79d20eb Do not destroy standby if it is picked for failover Burak Yucesoy 2025-01-02 04:02:26 +01:00
  • 080608948f Destroy primary only after failover candidate started to work Burak Yucesoy 2025-01-02 04:01:48 +01:00
  • f0622b062f Rename failover labels Burak Yucesoy 2025-01-02 03:58:28 +01:00
  • 70578c9871 Add target_server_count helper Burak Yucesoy 2025-01-02 03:48:44 +01:00
  • 8f3fcb1d42 Change the "Tax ID" label to "VAT ID" for EU customers Enes Cakir 2025-03-06 09:54:17 +03:00
  • 77e64f94ce Removing use_slices_for_allocation flag Maciek Sarnowicz 2025-02-24 10:34:05 -06:00
  • b4653fa011 Removing use_slices_for_allocation flag (maciek/remove-project-flag) Maciek Sarnowicz 2025-02-24 10:34:05 -06:00
  • 48247acb15 Bump uri from 1.0.2 to 1.0.3 dependabot[bot] 2025-03-05 00:56:49 +00:00
  • 8aba445291 Bump rack from 3.1.10 to 3.1.11 dependabot[bot] 2025-03-04 21:23:05 +00:00
  • a94f0e5081 Have cli program send version in header Jeremy Evans 2025-02-28 13:06:48 -08:00
  • e56831124b Reduce file size of cross-compiled cli binaries Jeremy Evans 2025-02-28 13:08:10 -08:00
  • 572261a85a Do not build Windows 386 cli version Jeremy Evans 2025-02-28 13:07:31 -08:00
  • 2c84e7cbc8 Support showing separating reasoning content from vllm Junhao Li 2025-03-03 17:25:25 -05:00
  • 798072c045 Add new 20250301.1.0 runner image Burak Velioglu 2025-03-02 23:55:53 +03:00
  • 1102863d6d Remove old 20250105.1.1 runner image Burak Velioglu 2025-03-02 23:47:13 +03:00
  • 4ed2ad9c88 Drop support for optional leading underscore in routes Jeremy Evans 2025-02-28 13:34:47 -08:00
  • e0822ab4b3 Start slices after host reboot. Hadi Moshayedi 2025-03-03 11:25:06 -08:00
  • d549cb59a0 Execute the block passed to Sshable.start_fresh_session Hadi Moshayedi 2025-03-03 11:17:10 -08:00
  • d2141acb6b Add golang to .tool-versions to compile ubi cli Daniel Farina 2025-02-28 19:04:44 -08:00
  • fe78bbf215 Disable burstable in some regions Maciek Sarnowicz 2025-02-26 13:45:45 -06:00
  • a40eabb3de Prevent showing Burstable family for locations where it is not allowed Maciek Sarnowicz 2025-02-27 13:46:32 -06:00
  • 85c2679a45 Switch from embedded Rodish to using the rodish gem Jeremy Evans 2025-02-27 13:20:51 -08:00
  • ae48b114db Clover: Increase SPDK hugepages. Hadi Moshayedi 2025-02-27 10:21:58 -08:00
  • b68e3a541d Rhizome: Increase SPDK iobuf sizes. Hadi Moshayedi 2025-02-27 10:12:34 -08:00
  • de34f40cb6 Update AI base image to 20250301.1.0 Benjamin Satzger 2025-02-28 18:59:43 +01:00
  • 4b724d1be7 Update AI base image to 20250228.1.0 Benjamin Satzger 2025-02-28 14:08:31 +01:00
  • 3c4ae99614 Store device id instead of device name for StorageDevices mohi-kalantari 2025-02-27 15:51:31 +01:00
  • 7add7a39d2 Update AI base image to 20250227.1.0 Benjamin Satzger 2025-02-27 22:36:28 +01:00
  • ace63b7249 Fix reasoning output of ds-r1-qwen-32b Benjamin Satzger 2025-02-27 11:50:55 +01:00
  • ab2105f8b7 Use deadline pages for VM host unavailability Enes Cakir 2025-02-21 11:54:08 +03:00
  • cac357e988 Fix initialization issue when populating StorageDevices and increase error verbosity mohi-kalantari 2025-02-26 10:59:24 +01:00
  • 23ab2a59b2 Qualify docker base image names Daniel Farina 2025-02-26 14:54:20 -08:00
  • 154b5799e9 Add libffi to docker image to fix build Daniel Farina 2025-02-26 14:50:50 -08:00
  • b5578b0036 Clover: Use bdev_ubi-0.3. Hadi Moshayedi 2025-02-21 15:09:50 -08:00
  • 30a0b468d3 Rhizome: Use bdev_ubi-0.3. Hadi Moshayedi 2025-02-21 14:27:55 -08:00
  • 052118d781 Update to Ruby 3.2.7 Enes Cakir 2025-02-11 10:24:10 +03:00
  • f081a866cf Remove "Install ruby for ARM runners if not cached" step Jeremy Evans 2025-02-25 14:31:10 -08:00