Close inventory gaps: Spaces buckets (URN-discover), droplet backups, snapshot URN aliases

Two patterns added:

1. ProjectsWorker now does URN-discover for kinds without a dedicated
   sync worker (spaces_bucket, managed_db, k8s_cluster, etc.). For these,
   it inserts a minimal placeholder row when the URN points to something
   not yet in inventory. Kinds with dedicated workers (droplet, snapshot,
   volume, etc.) still get attribution only, because the dedicated worker
   is the source of truth for richer attrs. Implemented by splitting
   attribute_or_discover/4 on a @dedicated_kinds whitelist.

2. New BackupsWorker pulls /v2/droplets/:id/backups for each active
   droplet. DO automated backups aren't in /v2/snapshots; they live per
   droplet. Cron: hourly at :41. Kind="droplet_backup".
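
The split in item 1 can be sketched roughly as below. This is an
illustration only, not the code in this commit: the module name, the
route/3 function, its return values, and the MapSet standing in for the
inventory table are all hypothetical, and the dedicated-kinds list is
only the subset named above.

```elixir
defmodule DiscoverSketch do
  # Hypothetical subset; the real @dedicated_kinds list is longer ("etc.").
  @dedicated_kinds ~w(droplet snapshot volume)

  # `inventory` is a MapSet of {kind, provider_id} tuples standing in for
  # the inventory table. Return values are assumptions for illustration.
  def route(kind, provider_id, inventory) do
    known? = MapSet.member?(inventory, {kind, provider_id})

    cond do
      # Row already exists: only attribute it to the project.
      known? ->
        :attribute

      # Dedicated worker owns this kind: leave row creation to it.
      kind in @dedicated_kinds ->
        :skip

      # No dedicated worker: insert a minimal placeholder row.
      true ->
        {:insert_placeholder,
         %{kind: kind, provider_id: provider_id, name: "#{kind}-#{provider_id}"}}
    end
  end
end
```

With an inventory holding only {"droplet", "1"}, an unknown spaces_bucket
URN yields a placeholder, while an unknown droplet URN is skipped and left
for the droplet worker.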

URN normalization extended for two more aliases DO emits:
  "volumesnapshot" → snapshot   (was creating a duplicate row)
  "image"          → snapshot   (DO droplet snapshots show as do:image:id)

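A minimal sketch of the alias handling described above, assuming URNs
have the shape "do:<kind>:<id>" (as in do:image:id). The module and
function names are hypothetical, and only the two aliases from this
commit are listed.

```elixir
defmodule UrnSketch do
  # Only the two aliases added in this commit; any earlier aliases are omitted.
  @aliases %{"volumesnapshot" => "snapshot", "image" => "snapshot"}

  # Splits "do:<kind>:<id>" and maps aliased kinds onto the canonical kind.
  def parse("do:" <> rest) do
    case String.split(rest, ":", parts: 2) do
      [kind, id] when id != "" -> {:ok, Map.get(@aliases, kind, kind), id}
      _ -> :error
    end
  end

  def parse(_other), do: :error
end
```

Mapping both aliases to "snapshot" before lookup is what keeps a
do:volumesnapshot URN from creating a second row for the same resource.
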
Billing.find_resource/1 gets a kind-specific clause for droplet_backup
that matches to the parent droplet by name, since invoice lines for
backups read "<droplet-name> (Weekly Backup Services)" — the line is a
per-droplet subscription, not a per-backup-snapshot fee.
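
One way the droplet name could be pulled out of such an invoice line is
sketched below. This is not the clause added to Billing.find_resource/1;
the module and function names are hypothetical, and it assumes the suffix
text is exactly as quoted above.

```elixir
defmodule BackupLineSketch do
  # Suffix as quoted in this commit message; other wordings are not handled.
  @suffix " (Weekly Backup Services)"

  # Returns {:ok, droplet_name} for a backup line, :error for anything else.
  def droplet_name(description) when is_binary(description) do
    if String.ends_with?(description, @suffix) do
      {:ok, String.replace_suffix(description, @suffix, "")}
    else
      :error
    end
  end
end
```

The extracted name would then be looked up against active droplets, which
is why a line for a destroyed droplet cannot match.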

Verified live against the same April 2026 invoice:
- 6 Spaces buckets discovered via URN. The account has 6, but the invoice
  shows only the single $5 subscription line; that line is account-level,
  so it can't tie to a specific bucket (expected).
- 4 droplet backups discovered via BackupsWorker; the git.sky-ai.com
  backup line now matches (the repo.sky-ai.com backup line can't match
  because that droplet was destroyed).
- Of 16 unmatched lines: 11 are destroyed historic resources, 1 is GST,
  1 is the account-level Spaces subscription, and 3 are likely minor
  snapshot name variances. Effectively ~100% of currently existing
  billable resources match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 22:46:29 +10:00
parent 8bdf500214
commit ea3101ca2f
5 changed files with 150 additions and 6 deletions

@@ -0,0 +1,84 @@
defmodule ArcadiaCloud.Sync.BackupsWorker do
  @moduledoc """
  Sync of DO automated droplet backups.

  Backups are not exposed via /v2/snapshots; they live under each
  droplet at /v2/droplets/:id/backups. We iterate every active droplet
  in inventory and pull its backups, normalizing them as
  kind="droplet_backup" with the parent droplet_id in attrs.
  """

  use Oban.Worker, queue: :cloud_sync_full, max_attempts: 3

  import Ecto.Query

  alias ArcadiaCloud.Cloud
  alias ArcadiaCloud.Cloud.CloudResource
  alias ArcadiaCloud.DigitalOcean.Client
  alias ArcadiaCloud.Repo

  @kind "droplet_backup"
  @provider "digitalocean"

  @impl Oban.Worker
  def perform(_job) do
    now = DateTime.utc_now() |> DateTime.truncate(:second)
    droplets = list_active_droplets()

    Enum.each(droplets, fn d ->
      case Client.list_droplet_backups(d.provider_id) do
        {:ok, backups} ->
          Enum.each(backups, fn b ->
            Cloud.upsert_resource(normalize(b, d, now))
          end)

        {:error, _} ->
          # Soft-fail per droplet; mark_stale below handles disappearances.
          :skip
      end
    end)

    Cloud.mark_stale(@kind, now)
    :ok
  end

  defp list_active_droplets do
    from(r in CloudResource,
      where:
        r.provider == ^@provider and r.kind == "droplet" and is_nil(r.deleted_at) and
          r.status != "archived",
      select: %{
        id: r.id,
        provider_id: r.provider_id,
        cloud_project_id: r.cloud_project_id,
        tenant_id: r.tenant_id
      }
    )
    |> Repo.all()
  end

  defp normalize(b, droplet, now) do
    region =
      case b["regions"] do
        [first | _] when is_binary(first) -> first
        _ -> nil
      end

    %{
      provider: @provider,
      provider_id: to_string(b["id"]),
      kind: @kind,
      name: b["name"] || "backup-#{b["id"]}",
      region: region,
      status: "active",
      tags: [],
      cloud_project_id: droplet.cloud_project_id,
      tenant_id: droplet.tenant_id,
      attrs: %{
        droplet_id: droplet.provider_id,
        size_gigabytes: b["size_gigabytes"],
        min_disk_size: b["min_disk_size"],
        regions: b["regions"],
        do_created_at: b["created_at"]
      },
      first_seen_at: now,
      last_seen_at: now
    }
  end
end