Phase 2: droplet create/destroy saga

The most load-bearing write workflow — droplet provisioning is the spine
of phase 4a deployment onboarding.

DigitalOcean.Client: create_droplet, get_droplet, list_droplets_by_tag,
destroy_droplet. list_paginated/3 now threads caller-supplied params
(opts[:params]) through pagination so tag-filtered listing works.
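
A minimal sketch of the param threading described above, not the actual
client code. PaginationSketch, fetch_page, and the per_page value are
hypothetical; fetch_page stands in for the real HTTP call. The point it
illustrates is that caller params such as tag_name are merged into every
page request, not only the first.

```elixir
defmodule PaginationSketch do
  # Hypothetical stand-in for list_paginated/3. `fetch_page` replaces the
  # HTTP request and receives the merged query params for each page.
  def list_paginated(fetch_page, key, opts \\ []) do
    base_params = Map.new(opts[:params] || %{})
    do_list(fetch_page, key, base_params, 1, [])
  end

  defp do_list(fetch_page, key, base_params, page, acc) do
    # Caller params are re-sent on every page, so the filter is never lost.
    params = Map.merge(base_params, %{"page" => page, "per_page" => 2})

    case fetch_page.(params) do
      {:ok, %{^key => items} = body} ->
        acc = acc ++ items

        # Keep going while the response advertises a next page.
        if get_in(body, ["links", "pages", "next"]) do
          do_list(fetch_page, key, base_params, page + 1, acc)
        else
          {:ok, acc}
        end

      {:error, reason} ->
        {:error, reason}
    end
  end
end
```

With a fake fetch_page that returns two pages, every request it sees
carries the same tag_name alongside the changing page number.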

Four droplet saga steps:
- CreateDroplet — POST a droplet, tagged arcadia-saga-<saga8> +
  managed-by-arcadia-cloud. Idempotency: a re-run first checks the saga
  context for droplet_id, then queries DO by the saga tag, so a re-run
  after a crash between the POST and the context save adopts the existing
  droplet instead of creating a second one. compensate destroys it.
- WaitDropletActive — polls get_droplet until status "active" (up to 96
  attempts at 5 s intervals, about 8 minutes); records the public IP. No
  compensation (waiting has no side effect).
- RegisterDroplet — fetches the droplet, upserts it into cloud_resources
  (inventory consistent immediately, not at next 15-min sync) and writes
  cloud_provisioned desired-state {size_slug, region, image}. compensate
  removes the DB rows (the droplet itself is destroyed by CreateDroplet's
  compensate).
- DestroyDroplet — DELETE the droplet + mark its cloud_resources row
  deleted. Terminal/irreversible: compensate is a logged no-op, because by
  saga design destroy-class steps don't roll back.
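
The CreateDroplet idempotency check can be sketched roughly as follows.
This is an illustration under assumptions, not the step's real code:
CreateDropletSketch, ensure_droplet/4, and the client map of functions
are hypothetical stand-ins for the step and for DigitalOcean.Client, and
the context is a plain map.

```elixir
defmodule CreateDropletSketch do
  # Hypothetical adopt-or-create decision. `client` is a map of functions
  # standing in for the real API client; `context` is a plain map.
  def ensure_droplet(context, saga_tag, spec, client) do
    case Map.fetch(context, :droplet_id) do
      {:ok, id} ->
        # An earlier run already saved the id: nothing to create.
        {:ok, id, :from_context}

      :error ->
        adopt_or_create(saga_tag, spec, client)
    end
  end

  defp adopt_or_create(saga_tag, spec, client) do
    case client.list_by_tag.(saga_tag) do
      {:ok, [%{"id" => id} | _]} ->
        # The POST succeeded before a crash: adopt the tagged droplet.
        {:ok, id, :adopted}

      {:ok, []} ->
        tags = [saga_tag, "managed-by-arcadia-cloud"]

        with {:ok, %{"id" => id}} <- client.create.(Map.put(spec, "tags", tags)) do
          {:ok, id, :created}
        end

      {:error, reason} ->
        {:error, reason}
    end
  end
end
```

The saga tag is what makes adoption safe: it is unique per saga, so a
tagged droplet found on DO can only have come from an earlier attempt of
the same saga.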

Provisioning helpers:
- provision_droplet/1 — [CreateDroplet, WaitDropletActive, RegisterDroplet]
- destroy_droplet/2   — [DestroyDroplet]
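
For orientation, a step list like the ones above could be driven by a
runner along these lines. This is a sketch under assumptions, not the
project's saga engine: SagaRunnerSketch is hypothetical, steps are maps
of functions where the real steps are modules implementing the Step
behaviour, and state is a plain map. It shows execution in order and
compensation of completed steps in reverse order on failure.

```elixir
defmodule SagaRunnerSketch do
  # Hypothetical runner. Each step is a map with :name, :execute and
  # :compensate entries; state is a plain map.
  def run(steps, state), do: do_run(steps, state, [])

  defp do_run([], state, _done), do: {:ok, state}

  defp do_run([step | rest], state, done) do
    case step.execute.(state) do
      {:ok, new_state} ->
        # Remember completed steps newest-first.
        do_run(rest, new_state, [step | done])

      {:error, reason} ->
        # `done` is newest-first, so compensation runs in reverse order.
        Enum.each(done, fn s -> s.compensate.(state) end)
        {:error, step.name, reason}
    end
  end
end
```

Under this shape, a failure in the third provisioning step would first
undo the second step and then the first, which matches the note that
CreateDroplet's compensate is what destroys the droplet.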

Live smoke verified end-to-end (full create + destroy on a real
s-1vcpu-512mb-10gb droplet in syd1):
- provision saga completed: droplet 572017320 created, reached active
  with public IP, registered into cloud_resources (status=active) +
  cloud_provisioned (spec recorded).
- destroy saga completed: cloud_resources row marked deleted; droplet
  confirmed 404 on DO afterward. Account back to its original 5
  droplets, zero leftover, ~1 cent total cost.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 13:31:47 +10:00
parent e3bcd3fc77
commit b9fc4f9cf3
6 changed files with 427 additions and 4 deletions

@@ -0,0 +1,59 @@
defmodule ArcadiaCloud.Provisioning.Steps.WaitDropletActive do
  @moduledoc """
  Polls a droplet (created by a prior CreateDroplet step) until its
  status is "active". Reads `droplet_id` from saga context.

  No compensation — waiting has no side effect to undo. If the saga
  rolls back, the prior CreateDroplet step's compensate destroys the
  droplet regardless of whether it ever reached active.
  """

  @behaviour ArcadiaCloud.Provisioning.Step

  alias ArcadiaCloud.DigitalOcean.Client
  alias ArcadiaCloud.Provisioning.SagaState

  @poll_interval_ms 5_000
  @poll_max_attempts 96

  @impl true
  def name, do: "wait_droplet_active"

  @impl true
  def execute(state) do
    case SagaState.get_output(state, :droplet_id) do
      nil -> {:error, :no_droplet_id_in_context}
      droplet_id -> poll(state, droplet_id, 1)
    end
  end

  defp poll(_state, _droplet_id, attempt) when attempt > @poll_max_attempts do
    {:error, :droplet_active_timeout}
  end

  defp poll(state, droplet_id, attempt) do
    case Client.get_droplet(droplet_id) do
      {:ok, %{"status" => "active"} = droplet} ->
        public_ip = extract_public_ip(droplet)
        {:ok, SagaState.put_output(state, :droplet_public_ip, public_ip)}

      {:ok, %{"status" => status}} when status in ["new", "off"] ->
        Process.sleep(@poll_interval_ms)
        poll(state, droplet_id, attempt + 1)

      {:ok, %{"status" => other}} ->
        {:error, {:unexpected_droplet_status, other}}

      {:error, reason} ->
        {:error, reason}
    end
  end

  defp extract_public_ip(droplet) do
    droplet
    |> get_in(["networks", "v4"])
    |> List.wrap()
    |> Enum.find(%{}, &(&1["type"] == "public"))
    |> Map.get("ip_address")
  end
end
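
To make the behaviour of the private extract_public_ip/1 helper concrete,
the sketch below copies its pipeline into a hypothetical public module
(PublicIpSketch) and runs it on hand-written payloads. The payloads are
illustrative, using documentation-range addresses, and are not captured
API responses.

```elixir
defmodule PublicIpSketch do
  # Same pipeline as extract_public_ip/1 above, exposed publicly so it can
  # be called directly. The module name is hypothetical.
  def extract_public_ip(droplet) do
    droplet
    |> get_in(["networks", "v4"])
    |> List.wrap()
    |> Enum.find(%{}, &(&1["type"] == "public"))
    |> Map.get("ip_address")
  end
end

# Illustrative payload with one private and one public v4 network.
droplet = %{
  "status" => "active",
  "networks" => %{
    "v4" => [
      %{"ip_address" => "10.0.0.2", "type" => "private"},
      %{"ip_address" => "203.0.113.5", "type" => "public"}
    ]
  }
}

PublicIpSketch.extract_public_ip(droplet)
# => "203.0.113.5"

# A missing "networks" key or an empty v4 list falls through to nil.
PublicIpSketch.extract_public_ip(%{"status" => "active"})
# => nil
```

One consequence worth noting: an active droplet that reports no public v4
network still completes the step, with droplet_public_ip stored as nil, so
later steps presumably need to tolerate a nil IP.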