Files
ubicloud/Procfile
Jeremy Evans 09cc4626cb Make partitioned respirate automatically rebalance based on number of workers
Instead of having a fixed partition of the uuid space, workers now
only have a fixed partition number (based on the ARGV[0], DYNO on
Heroku, or PS on Foreman).  If a partition number is given, the
dispatcher creates a repartition thread.  This thread uses LISTEN
to receive messages from other respirate processes. When it starts,
and every 2 seconds after, it uses NOTIFY to notify other listening
processes of its existance. If it has been more than 4 seconds
since it has been notified that a respirate process exists, it
assumes the respirate process has exited.

When it receives a notification that a respirate process has been
added with a higher number than the number of partitions it is
currently configured for, it repartitions the uuid space it monitors
(decreasing the size), and updates the number of current partitions.

If it removes stale respirate processes, and the highest remaining
process is lower than the number of partitions it is currently
configured for, it repartitions the uuid space it monitors
(increasing the size), and updates the number of current partitions.

Let's say you are running 4 respirate workers: respirate.{1,2,3,4}
These will quickly divide the uuid space into 4 partitions.  If you
add respirate.5 and respirate.6, the existing processes will quickly
repartition to use 6 partitions.  If you remove respirate.6, the
existing processes will repartition to use 5 partitions within 4-6
seconds.  If you then remove respirate.4, it will not repartition,
since respirate.5 is still running.  This is not designed to
repartition if respirate processes crash, only if the number of
respirate processes increases or decreases, with the assumption
that if any increases will start at the number after the highest
current process, and decreases will remove starting with the highest
number.

While this does not repartition for crashed respirate processeses,
strands for those processes are still handled by the old strand scan.

This moves the uuid partitioning logic from respirate to dispatcher,
since it no longer static. Similarly, creating the prepared statement
for the scan query is moved to a separate method, since it is no
longer only called by initialize.  This may appear to be thread unsafe,
but the way Sequel works, is before every prepared statement execution,
it checks whether the SQL of the already prepared statement on the
connection matches what the current SQL for the prepared statement for
the Database object, and if it doesn't match, it DEALLOCATEs the
existing prepared statement on the connection, and the PREPAREs a new
statement, using the current SQL (in this case, the new partition.)

Change the smoke test when running fewer processes than partitions
(such as the 3/4 test) to not create processes for the lower
partition numbers instead of not creating processes for the higher
partition numbers.  Otherwise, due to the rebalancing, instead of
testing the behavior with crashed respirate processes, you end up
only testing with a lower number of total processes.

To monitor the partitioning, this emits logs in 2 additional cases:

* When initially partitioning or repartitioning, so you can see what
  partition each respirate process is monitoring.

* If an unexpected notification is received while listening. This
  should never happen, but we do not want to crash the repartition
  thread if it does.
2025-06-14 01:46:43 +09:00

100 B