Blob Storage

  • Blob Storage (Binary Large Object storage) is an object storage system designed for storing large-scale unstructured data such as images, videos, logs, backups, and documents.

  • It provides simple operations:

    • PUT (upload)
    • GET (download)
    • DELETE
    • LIST
  • No querying, no transactions, no joins.
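
The four operations above can be sketched as a toy in-memory store. This is only an illustration of the API surface, not how any real service is implemented; the class name `InMemoryBlobStore` and the method name `list_keys` are made up for this example.

```python
from typing import Dict, List


class InMemoryBlobStore:
    """Toy blob store: opaque bytes addressed by a string key (hypothetical)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        # PUT stores the whole object; an existing key is replaced entirely.
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        # GET returns the full object, or raises KeyError if the key is missing.
        return self._objects[key]

    def delete(self, key: str) -> None:
        # DELETE is a no-op here when the key does not exist.
        self._objects.pop(key, None)

    def list_keys(self, prefix: str = "") -> List[str]:
        # LIST returns keys in sorted order, optionally filtered by prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = InMemoryBlobStore()
store.put("logs/app.log", b"started")
store.put("img/cat.png", b"\x89PNG")
print(store.list_keys("logs/"))
```

Note what is absent: there is no way to search inside an object, update part of it, or change several objects atomically.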

Why Blob Storage Exists

  • DBs are optimized for structured data and transactions, not large files.

Problems with using DBs for files

  • Requires expensive SSDs
  • Heavy DB engine overhead (indexes, locks, MVCC, ACID)
  • High CPU and RAM usage
  • Full replication multiplies storage cost (3× with three copies)

Blob storage solves this by

  • Removing the DB engine
  • Using cheap HDDs (some premium blob storage tiers use SSDs as well)
  • Using simple APIs
  • Optimizing for scale and durability
  • Using erasure coding instead of full replication
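
To make the cost difference concrete, here is a rough sketch comparing storage overhead, followed by a deliberately simplified erasure code. The 10 data + 4 parity layout is an assumed example, not a figure from these notes. Production systems generally use stronger codes such as Reed-Solomon; the single XOR parity shard below only survives the loss of one shard and is here just to show the idea.

```python
from typing import List, Sequence


def replication_overhead(copies: int) -> float:
    # Raw bytes stored per byte of user data with full replication.
    return float(copies)


def erasure_overhead(data_shards: int, parity_shards: int) -> float:
    # Raw bytes stored per byte of user data with k data + m parity shards.
    return (data_shards + parity_shards) / data_shards


def xor_parity(shards: Sequence[bytes]) -> bytes:
    # Byte-wise XOR of equal-length shards (simplified single-parity code).
    parity = bytearray(len(shards[0]))
    for shard in shards:
        for i, b in enumerate(shard):
            parity[i] ^= b
    return bytes(parity)


def recover_missing(surviving: Sequence[bytes], parity: bytes) -> bytes:
    # XOR of the parity with all surviving shards yields the one lost shard.
    return xor_parity(list(surviving) + [parity])


print(replication_overhead(3))   # three full copies
print(erasure_overhead(10, 4))   # assumed 10+4 layout

shards: List[bytes] = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(shards)
print(recover_missing([shards[0], shards[2]], parity))
```

With the assumed 10+4 layout, the store keeps 1.4 bytes per user byte instead of 3, while still tolerating the loss of several shards when a real code is used.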

Core Idea

  • Write once, read many (objects are replaced whole rather than edited in place; storage is append-only)
  • Large sequential I/O
  • Flat namespace (no real folders)
  • Metadata separated from actual data
  • High durability, relaxed latency requirements
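
The "flat namespace" point can be illustrated with a short sketch: keys are plain strings, and "folders" are just an interpretation of a delimiter during listing. The function `list_with_delimiter` and the sample keys are hypothetical, loosely modelled on the prefix and delimiter options that many object stores offer.

```python
from typing import Iterable, List, Tuple


def list_with_delimiter(
    keys: Iterable[str], prefix: str = "", delimiter: str = "/"
) -> Tuple[List[str], List[str]]:
    """Return (objects directly under prefix, pseudo-folders under prefix)."""
    objects: List[str] = []
    common_prefixes = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter is reported as a "folder".
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


keys = [
    "photos/2024/a.jpg",
    "photos/2024/b.jpg",
    "photos/readme.txt",
    "logs/app.log",
]
print(list_with_delimiter(keys, prefix="photos/"))
```

No directory objects exist anywhere: removing every key that starts with `photos/2024/` makes that "folder" disappear on its own.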

How Blob Storage Stores Data

  • An uploaded file is split into chunks (typically 4–16 MB each).
  • Chunks are distributed across many servers.
  • Chunks are stored mostly on cheap HDDs; premium tiers may use SSDs.
  • Metadata (object name, size, chunk locations) is stored separately on fast storage.
  • Reads and writes happen in parallel.
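
The storage path described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the class `ToyChunkedStore`, the content-hash chunk IDs, and the hash-modulo placement rule are not taken from any particular system, and the "servers" are plain dictionaries.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MB, inside the 4-16 MB range mentioned above


class ToyChunkedStore:
    def __init__(self, num_servers: int, chunk_size: int = CHUNK_SIZE) -> None:
        # Each "server" is a dict of chunk_id -> bytes (stand-in for HDD nodes).
        self.servers: List[Dict[str, bytes]] = [{} for _ in range(num_servers)]
        # Metadata lives separately: object name -> size and chunk locations.
        self.metadata: Dict[str, dict] = {}
        self.chunk_size = chunk_size

    def put(self, name: str, data: bytes) -> None:
        locations: List[Tuple[int, str]] = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            chunk_id = hashlib.sha256(chunk).hexdigest()
            # Assumed placement rule: spread chunks by hash modulo server count.
            server = int(chunk_id, 16) % len(self.servers)
            self.servers[server][chunk_id] = chunk
            locations.append((server, chunk_id))
        self.metadata[name] = {"size": len(data), "chunks": locations}

    def get(self, name: str) -> bytes:
        locations = self.metadata[name]["chunks"]
        # Fetch chunks in parallel; map() preserves the original chunk order.
        with ThreadPoolExecutor(max_workers=4) as pool:
            parts = pool.map(lambda loc: self.servers[loc[0]][loc[1]], locations)
            return b"".join(parts)


store = ToyChunkedStore(num_servers=3, chunk_size=4)  # tiny chunks for the demo
store.put("notes.txt", b"abcdefghij")
print(store.metadata["notes.txt"])
print(store.get("notes.txt"))
```

A real system would also add redundancy per chunk and keep the metadata in a replicated, low-latency store; both are left out to keep the sketch short.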

Blob Storage vs Database (Quick Comparison)

Aspect      Database      Blob Storage
Engine      Heavy         Minimal
Disk        SSD           HDD
Updates     Frequent      Rare
Redundancy  Replication   Erasure Coding
Cost        High          Low
Use case    Transactions  Files & media