๐ธ Database Snapshots
๐ง What is a Snapshot?โ
A snapshot is a read-only, point-in-time view of the database.
- It is NOT a full copy (in modern systems)
- It is just a logical view based on data versions
๐ค Why Snapshots Existโ
1. Backup & Recoveryโ
- Restore database after failure
2. Analyticsโ
- Run heavy queries without affecting production
3. Replicationโ
- Create replicas quickly
4. Testing / Debuggingโ
- Reproduce past state
5. Point-in-Time Recoveryโ
- Restore DB to exact timestamp
โ๏ธ How Snapshots Work Internally (MVCC)โ
Instead of copying data, databases store multiple versions of rows.
Each row has metadata:
xminโ when version was createdxmaxโ when version became invalid
Exampleโ
| Value | xmin | xmax |
|---|---|---|
| A=10 | T1 | T2 |
| A=100 | T2 | โ |
Snapshot Ruleโ
A snapshot reads rows where:
xmin โค snapshot_time < xmax
๐ Snapshot = just a timestamp + visibility rules
๐ฆ How Data is Storedโ
Database stores versions of data, NOT snapshots
Storage looks like:
[Row1: A=10, xmin=T1, xmax=T2]
[Row2: A=100, xmin=T2, xmax=โ]
๐ Multiple Snapshotsโ
- Many snapshots can exist at the same time
- Each query/transaction gets its own snapshot
Example:
| Snapshot | Value Seen |
|---|---|
| S1 | A = 10 |
| S2 | A = 100 |
| Current | A = 200 |
๐งน Garbage Collection (GC)โ
Old versions are deleted when:
- No active snapshot needs them
Flowโ
- Data updated โ new versions created
- Old snapshots finish
- GC removes unused versions
๐ Storage Growthโ
Case 1: Few updatesโ
- when some rows or DB is updated
100GB DB + 10GB changes = 110GB
Case 2: Full DB updatedโ
- when almost all rows present in data is updated
100GB old + 100GB new = 200GB
๐ Temporary 2x storage possible
โ ๏ธ Storage Does NOT Grow Foreverโ
Because:
- Old versions are cleaned by GC
Final storage:
โ Current DB size
๐ด Real-World Issue: Bloatโ
If:
- Long-running queries exist
- Snapshots live too long
๐ Old data cannot be deleted
Result:
- Storage increases
- Performance degrades
๐พ Backup Using Snapshotsโ
Step 1: Take Snapshot (Base Backup)โ
A = 10, B = 20
Step 2: Record Logs (WAL)โ
A โ 100
B โ 200
Step 3: Store Bothโ
- Snapshot
- Logs
๐ Restore Processโ
- Load snapshot
- Replay logs
Final state:
A = 100
B = 200
๐ง Key Insights (Must Remember)โ
- Snapshot is not stored as full data
- It is a logical view over versions
- Database stores multiple versions of rows
- Old versions are cleaned via garbage collection
- Storage may temporarily grow but stabilizes