Storage Architecture (Walrus)

This section describes how content is addressed, stored, certified, and delivered using Walrus and edge infrastructure, and how clients retrieve data with integrity and resilience guarantees—without relying on a traditional backend in the hot path.

Why Walrus for media and sites

Walrus is a decentralized blob store coordinated by Sui. It erasure‑codes large objects using fast linear fountain codes and stores the resulting slivers across a dynamic committee of storage nodes, offering high availability at a storage overhead of roughly 5× rather than full replication (design overview): a 1 GiB blob consumes about 5 GiB of raw capacity across the whole committee, whereas full replication would cost one complete copy per node. Sui smart contracts manage storage resources, committees, and availability events, so availability can be proven on‑chain while reads use efficient HTTP aggregator endpoints and CDNs.

Authoritative data and addressing

  • Media and static assets are stored on Walrus. The authoritative reference is the Walrus blob ID and expected content hash, both linked from on‑chain objects (e.g., Content, Submission) for client verification; the sketch after this list models this reference.
  • Static catalogs/manifests (front page, author pages, topic lists) are immutable Walrus blobs referenced from on‑chain indices, enabling deterministic discovery without a mutable backend.
  • Single‑page apps (the Frontends) are distributed as Walrus Sites: Sui objects with dynamic fields that map resource paths (e.g., /index.html, /assets/app.js) to Walrus blob IDs (Walrus Sites).
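A minimal sketch of the reference shape a client consumes, with illustrative field names (blobId, sha256, cdnUrl) rather than the actual Move struct layout:

```ts
// Minimal sketch of an on-chain content reference as seen by a client.
// Field names are illustrative assumptions, not the actual Move layout.
interface ContentRef {
  blobId: string;   // Walrus blob ID, the authoritative address
  sha256: string;   // expected content hash, checked after every fetch
  cdnUrl?: string;  // optional accelerated path; the aggregator is the fallback
}
```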

Write path and Point of Availability (PoA)

Walrus writes follow a two‑phase protocol coordinated by Sui (write paths, on‑chain life cycle): the client first registers the blob on‑chain and distributes encoded slivers to the storage nodes, then submits the nodes' signed availability certificate back on‑chain. The platform uses the Quilt SDK for efficient blob management and streaming uploads.
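As a rough illustration of the write path from the client's side, the sketch below stores a blob through a Walrus HTTP publisher, which drives both phases (registration and sliver distribution, then on‑chain certification) on the caller's behalf. The publisher URL is a placeholder, and while the endpoint shape (PUT /v1/blobs?epochs=N) follows the public publisher API, response field names vary across Walrus versions:

```ts
// Sketch: store bytes via a Walrus HTTP publisher and return the blob ID.
// The publisher registers the blob, distributes slivers, and certifies it
// on-chain; the caller only sees the final result.
const PUBLISHER = "https://publisher.walrus.example"; // assumed deployment URL

async function storeBlob(bytes: Uint8Array, epochs: number): Promise<string> {
  const res = await fetch(`${PUBLISHER}/v1/blobs?epochs=${epochs}`, {
    method: "PUT",
    body: bytes,
  });
  if (!res.ok) throw new Error(`store failed: HTTP ${res.status}`);
  const info = await res.json();
  // The publisher reports either a fresh registration or an already
  // certified blob; both carry the blob ID.
  const blobId =
    info.newlyCreated?.blobObject?.blobId ?? info.alreadyCertified?.blobId;
  if (!blobId) throw new Error("unexpected publisher response shape");
  return blobId;
}
```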

Operational notes

  • A user acquires storage capacity from the Walrus system object; capacity is split/merged/transferred as needed. Fees flow via the storage fund to nodes per epoch performance (contracts).
  • The availability certificate attests that ≥2/3 of shards hold the slivers; the emitted availability event defines the PoA for the blob ID (the quorum condition is sketched below). From this point, other nodes fetch missing slivers automatically.
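The 2/3 threshold itself is simple arithmetic; a minimal sketch, with plain shard counts standing in for the signature weighting a real certificate carries:

```ts
// Minimal sketch of the PoA condition: signatures must cover at least 2/3 of
// all shards. Real certificates aggregate node signatures weighted by the
// shards each node holds; plain counts are used here for illustration.
function reachesPoA(signedShards: number, totalShards: number): boolean {
  return 3 * signedShards >= 2 * totalShards;
}
```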

Read path and multi‑path delivery

Clients prefer the CDN for performance and fall back to Walrus aggregator URLs when necessary. Integrity is verified against the expected content hash from the on‑chain reference.
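A minimal sketch of this read path, assuming a SHA‑256 digest in the on‑chain reference and a placeholder aggregator URL; the path shape (GET /v1/blobs/&lt;blobId&gt;) follows the public aggregator API:

```ts
// Sketch: CDN-first read with aggregator fallback; accept the first response
// whose hash matches the on-chain reference.
const AGGREGATOR = "https://aggregator.walrus.example"; // assumed deployment URL

async function sha256Hex(bytes: ArrayBuffer): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-256", bytes);
  return [...new Uint8Array(digest)]
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

async function fetchVerified(
  cdnUrl: string,
  blobId: string,
  expectedSha256: string,
): Promise<ArrayBuffer> {
  const candidates = [cdnUrl, `${AGGREGATOR}/v1/blobs/${blobId}`];
  for (const url of candidates) {
    try {
      const res = await fetch(url);
      if (!res.ok) continue;
      const body = await res.arrayBuffer();
      if ((await sha256Hex(body)) === expectedSha256) return body;
      // Hash mismatch: treat this path as bad and fall through to the next.
    } catch {
      // Network error: try the next path.
    }
  }
  throw new Error(`no verified copy of ${blobId} available`);
}
```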

Design choices

  • CDN is the default read path for low latency. Aggregator URLs are embedded as first‑class fallbacks; a Walrus read does not depend on a central origin.
  • Clients verify the hash of fetched content before rendering. Mismatches trigger fallback or user‑visible errors.
  • The Quilt SDK handles chunking and reassembly transparently, amortizing the fixed per‑blob costs of the roughly 5× erasure‑coding overhead across many small assets.

Storage resources and lifecycle on Sui

Walrus maintains Sui objects to coordinate storage epochs and capacity (on‑chain operations). Two of these shapes are modeled in the sketch after the list.

  • System object: holds node committee, total capacity, and pricing per KiB; values are agreed by ≥2/3 of storage nodes each epoch.
  • Storage fund: escrow of payments across storage epochs; pays nodes per performance.
  • Storage resources: user‑owned objects representing capacity and lifetime; can be split/merged/transferred; assigned to a blob ID to authorize storage.
  • Certificates and events: availability certificates are checked on‑chain; availability events mark PoA. Inconsistency proofs can mark a blob as invalid, signaling that reads return None and that its shards can be reclaimed.
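A minimal client‑side model of the system object and storage resource, with illustrative field names rather than the actual Move struct layouts:

```ts
// Sketch: illustrative TypeScript shapes for the Sui-side coordination
// objects described above. Field names are assumptions for exposition.
interface SystemObject {
  committee: string[];   // storage node IDs for the current epoch
  totalCapacity: bigint; // bytes available this epoch
  pricePerKiB: bigint;   // agreed by >= 2/3 of storage nodes each epoch
}

interface StorageResource {
  owner: string;          // user address; the object is transferable
  sizeBytes: bigint;      // capacity; split/merge adjusts this
  startEpoch: number;     // lifetime bounds
  endEpoch: number;
  assignedBlobId?: string; // set once the resource authorizes a blob
}
```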

Hosting frontends as Walrus Sites

Frontends are treated as Walrus Sites, where a Site Sui object holds dynamic fields keyed by resource path and pointing to Walrus blob IDs (sites overview). This yields a fully decentralized distribution of the browser application, with DNS for resolution and CDN for acceleration, and Walrus as a built‑in fallback without a mutable web origin.
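A minimal sketch of the resolution step a portal performs, with the dynamic‑field query abstracted behind a hypothetical lookup function (in practice this is a Sui RPC dynamic‑field read against the Site object) and a placeholder aggregator URL:

```ts
// Sketch: map a site path to a servable URL. The lookup function is
// hypothetical; a real portal queries the Site object's dynamic fields
// via Sui RPC.
type BlobLookup = (siteObjectId: string, path: string) => Promise<string | undefined>;

async function resolveSiteResource(
  siteObjectId: string,
  path: string, // e.g. "/index.html"
  lookupBlobId: BlobLookup,
): Promise<string> {
  const blobId = await lookupBlobId(siteObjectId, path);
  if (blobId === undefined) throw new Error(`no resource at ${path}`);
  // The same blob ID keys both the CDN cache and the aggregator fallback.
  return `https://aggregator.walrus.example/v1/blobs/${blobId}`;
}
```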

Encrypted submissions and privacy

Source submissions and sensitive materials are never stored on‑chain. Clients upload encrypted packages to Walrus and commit only hashes (and optional Walrus IDs) on‑chain. End‑to‑end encryption and minimal metadata ensure confidentiality while retaining verifiability and dispute resolution via on‑chain anchors.
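A minimal sketch of the client side, assuming AES‑GCM via Web Crypto for the envelope; the section specifies only that packages are end‑to‑end encrypted and that hashes are anchored on‑chain, so the cipher choice and key handling here are illustrative:

```ts
// Sketch: encrypt a submission client-side before upload. Key distribution
// (wrapping for recipients) is out of scope for this sketch.
async function encryptSubmission(plaintext: Uint8Array, key: CryptoKey) {
  const iv = crypto.getRandomValues(new Uint8Array(12)); // 96-bit GCM nonce
  const ciphertext = await crypto.subtle.encrypt(
    { name: "AES-GCM", iv },
    key,
    plaintext,
  );
  // Only the ciphertext goes to Walrus; only its hash goes on-chain.
  const commitment = await crypto.subtle.digest("SHA-256", ciphertext);
  return {
    iv,
    ciphertext: new Uint8Array(ciphertext),
    commitment: new Uint8Array(commitment),
  };
}
```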

Caching and versioning strategy

  • Immutability: published assets are content‑addressed and immutable; cache headers maximize CDN hit rates (see the header sketch after this list).
  • Busting and rollback: catalogs/manifests are versioned blobs; rolling back is a matter of switching references. Clients do not depend on mutable indices.
  • Large media: chunk sizes and fountain code parameters follow Walrus best practices for efficient upload and high recovery probability under partial failures.
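For the immutability point above, a minimal sketch of the headers an edge configuration would attach to content‑addressed assets; the configuration surface is an assumption, but the policy follows directly from content addressing:

```ts
// Sketch: headers for content-addressed, immutable assets. Because a blob ID
// changes whenever content changes, assets can be cached indefinitely;
// "busting" happens by referencing a new blob ID, never by invalidation.
const immutableAssetHeaders: Record<string, string> = {
  "Cache-Control": "public, max-age=31536000, immutable", // one year
};
```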

Availability and failure modes

  • Node failures: with < 1/3 faulty nodes, reads remain possible by requesting slivers across the committee; writes achieve PoA with 2/3 shard signatures.
  • CDN degradation: clients automatically fall back to aggregator URLs; integrity checks ensure correctness.
  • Inconsistency: if an incorrectly encoded blob is later proven inconsistent, an on‑chain inconsistency event signals that reads should yield None, and shards can be reclaimed (inconsistency proofs).

Security considerations

  • Integrity: hashes and blob IDs are recorded in on‑chain objects and verified on read.
  • Access control: access is enforced by on‑chain entitlements; storage is not the gatekeeper.
  • Confidentiality: only encrypted packages are stored for sensitive submissions; no doxable PII is written on‑chain.

Operations and observability

  • Health: clients expose fetch timing and path selection (CDN vs Walrus) to telemetry (sketched after this list); anomalies suggest CDN issues or aggregator congestion.
  • Cost: storage capacity planning is tied to expected asset sizes and retention windows; the 5× overhead guidance informs budgets (objectives & use cases).
  • Upgrades: site assets and catalogs are treated as immutable releases; flips are managed by updating on‑chain references rather than mutating remote state.
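A minimal sketch of the health point above, with a placeholder telemetry sink; the intent is simply to tag every read with the path that served it and its latency:

```ts
// Sketch: record fetch timing and path selection for telemetry.
type ReadPath = "cdn" | "aggregator";

function reportFetch(blobId: string, path: ReadPath, ms: number, ok: boolean): void {
  // Replace with the real telemetry sink (e.g., an analytics beacon).
  console.debug("walrus.read", { blobId, path, ms, ok });
}

async function timedFetch(url: string, blobId: string, path: ReadPath): Promise<Response> {
  const t0 = performance.now();
  const res = await fetch(url);
  reportFetch(blobId, path, performance.now() - t0, res.ok);
  return res;
}
```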