# DockerVault
> Intelligent Docker backup discovery for real systems
DockerVault scans your Docker environments and figures out **what actually matters to back up**, automatically.
No guesswork. No forgotten volumes. No broken restores.
---
## Contents
* [What is DockerVault?](#what-is-dockervault)
* [Quick Start](#quick-start)
* [How it Works](#how-it-works)
* [Classification Model](#classification-model)
* [Borg Integration](#borg-integration)
* [Automation Mode](#automation-mode)
* [Exit Codes](#exit-codes)
* [Tech Stack](#tech-stack)
* [Example](#example)
* [Current Features](#current-features)
* [Roadmap](#roadmap)
* [Future Ideas](#future-ideas)
* [Philosophy](#philosophy)
* [License](#license)
* [Credits](#credits)
---
## What is DockerVault?
DockerVault analyzes your `docker-compose.yml` and identifies:
* What **must** be backed up
* What can be **ignored**
* What needs **human review**
It bridges the gap between "everything looks fine" and "restore just failed".
---
## Quick Start
```bash
git clone https://github.com/YOUR-USER/dockervault.git
cd dockervault
pip install -e .
```
Run analysis:
```bash
python -m dockervault.cli docker-compose.yml --borg --repo /backup-repo
```
Run backup:
```bash
python -m dockervault.cli docker-compose.yml \
--run-borg \
--repo /backup-repo
```
---
## How it Works
DockerVault parses your compose file and inspects:
* bind mounts
* volume targets
* known data paths
It then classifies them using heuristics:
* database paths → critical (INCLUDE)
* logs/cache → optional (SKIP)
* unknown → review (REVIEW)
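The heuristic above could be sketched roughly as follows. This is a minimal illustration, not DockerVault's actual code: the names `Decision`, `classify`, and the two prefix tuples are hypothetical, and the prefixes are simply the example paths listed in this README.

```python
from enum import Enum
from pathlib import PurePosixPath


class Decision(Enum):
    INCLUDE = "include"
    REVIEW = "review"
    SKIP = "skip"


# Hypothetical pattern lists, taken from the examples in this README.
CRITICAL_PREFIXES = ("/var/lib/mysql", "/data", "/config")
OPTIONAL_PREFIXES = ("/var/log", "/tmp", "/cache")


def _is_under(path: PurePosixPath, prefix: str) -> bool:
    """Return True if path equals prefix or lies somewhere below it."""
    base = PurePosixPath(prefix)
    return path == base or base in path.parents


def classify(container_path: str) -> Decision:
    """Classify a container path: critical first, then optional, else review."""
    path = PurePosixPath(container_path)
    if any(_is_under(path, p) for p in CRITICAL_PREFIXES):
        return Decision.INCLUDE
    if any(_is_under(path, p) for p in OPTIONAL_PREFIXES):
        return Decision.SKIP
    return Decision.REVIEW
```

Matching on whole path components rather than raw string prefixes keeps `/database` from being mistaken for something under `/data`.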
---
## Classification Model
DockerVault divides everything into three categories:
### INCLUDE
Must be backed up.
Example:
```
/var/lib/mysql
/data
/config
```
### REVIEW
Needs human decision.
Triggered when:
* path does not exist
* path exists but is empty
* the source is a named volume (Docker-managed)
Example:
```
./mc-missing → /data
```
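The three REVIEW triggers could be checked along these lines. This is a sketch under assumptions: `review_reason` and `is_named_volume` are hypothetical names, and the rule that a source not starting with `.`, `/`, or `~` is a named volume is a simplification of Compose's short volume syntax.

```python
from pathlib import Path
from typing import Optional


def is_named_volume(source: str) -> bool:
    """Assumption: host paths start with '.', '/' or '~'; anything else is a volume name."""
    return not source.startswith((".", "/", "~"))


def review_reason(source: str, base_dir: Path) -> Optional[str]:
    """Return why a mount source needs review, or None if no trigger applies."""
    if is_named_volume(source):
        return "named volume (Docker-managed)"
    host_path = Path(source).expanduser()
    if not host_path.is_absolute():
        # Relative sources are resolved against the compose file's directory.
        host_path = base_dir / host_path
    if not host_path.exists():
        return "path does not exist"
    if host_path.is_dir() and not any(host_path.iterdir()):
        return "path exists but is empty"
    return None
```

With the example above, `review_reason("./mc-missing", Path("."))` reports a missing path as long as no `mc-missing` directory sits next to the compose file.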
### SKIP
Safe to ignore.
Example:
```
/var/log
/tmp
/cache
```
---
## Borg Integration
DockerVault can generate and run Borg backups directly.
Example:
```bash
python -m dockervault.cli docker-compose.yml \
--run-borg \
--repo /mnt/backups/borg/dockervault
```
Generated command:
```bash
borg create --stats --progress \
/repo::hostname-2026-03-23_12-44-19 \
/path/to/data
```
### Features
* automatic archive naming (timestamped to the second)
* deduplicated paths
* safe command generation
* subprocess execution
* optional passphrase support
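A command like the one shown above could be assembled roughly as follows. This is only a sketch: `archive_name` and `build_borg_command` are hypothetical helpers, and treating "safe command generation" as an argument list rather than a shell string is an assumption about the approach.

```python
import socket
from datetime import datetime
from typing import Iterable, List, Optional


def archive_name(now: Optional[datetime] = None, host: Optional[str] = None) -> str:
    """Build '<host>-<YYYY-MM-DD_HH-MM-SS>' as in the generated command above."""
    now = now or datetime.now()
    host = host or socket.gethostname()
    return f"{host}-{now:%Y-%m-%d_%H-%M-%S}"


def build_borg_command(repo: str, paths: Iterable[str], name: str) -> List[str]:
    """Return an argv list for 'borg create' with duplicate paths removed."""
    unique = list(dict.fromkeys(paths))  # dedupe while keeping first-seen order
    if not unique:
        raise ValueError("no include paths")
    # A list of arguments avoids shell quoting problems with unusual paths.
    return ["borg", "create", "--stats", "--progress", f"{repo}::{name}", *unique]
```

The resulting list can be passed directly to `subprocess.run(cmd, check=True)` without going through a shell.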
---
## Automation Mode
Designed for cron / scripts / servers.
```bash
python -m dockervault.cli docker-compose.yml \
--run-borg \
--quiet \
--automation \
--repo /backup-repo
```
### Behavior
* no plan output
* no interactive prompts
* minimal output
* suitable for logs / CI
---
## Exit Codes
| Code | Meaning |
| ---- | ------------------------------------ |
| 0 | Success |
| 1 | General error |
| 2 | Missing required args |
| 3 | No include paths |
| 4 | Review required (`--fail-on-review`) |
### Fail on review
```bash
--fail-on-review
```
Makes the run exit with code 4 when a path needs human attention, which stops the surrounding automation.
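A wrapper script could react to these codes along the following lines. This is a sketch: `ExitCode`, `describe`, and `run_backup` are hypothetical names, and combining `--fail-on-review` with the automation flags in a single call is an assumption based on the options shown in this README.

```python
import subprocess
import sys
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes as listed in the table above."""
    SUCCESS = 0
    GENERAL_ERROR = 1
    MISSING_ARGS = 2
    NO_INCLUDE_PATHS = 3
    REVIEW_REQUIRED = 4


def describe(code: int) -> str:
    """Turn a numeric exit code into a short log-friendly label."""
    try:
        return ExitCode(code).name.replace("_", " ").lower()
    except ValueError:
        return f"unknown exit code {code}"


def run_backup(compose_file: str, repo: str) -> int:
    """Run DockerVault non-interactively and return its exit code."""
    cmd = [
        sys.executable, "-m", "dockervault.cli", compose_file,
        "--run-borg", "--quiet", "--automation", "--fail-on-review",
        "--repo", repo,
    ]
    return subprocess.run(cmd).returncode
```

A cron job might call `run_backup(...)`, log `describe(code)`, and send an alert only when the code equals `ExitCode.REVIEW_REQUIRED`.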
---
## Tech Stack
* Python 3.10+
* PyYAML
* BorgBackup
* CLI-first design
---
## Example
Input:
```yaml
services:
  db:
    volumes:
      - ./db:/var/lib/mysql
  mc:
    volumes:
      - ./mc-missing:/data
  nginx:
    volumes:
      - ./logs:/var/log/nginx
```
Output:
```
INCLUDE:
db
REVIEW:
mc-missing
SKIP:
logs
```
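The first step for an input like this, extracting source/target pairs from each service, might look like the sketch below. It is illustrative only: `bind_mounts` is a hypothetical helper, it works on the dictionary that `yaml.safe_load` would return for the input above, and it handles only the short `source:target[:mode]` syntax.

```python
from typing import Dict, List, Tuple


def bind_mounts(compose: dict) -> Dict[str, List[Tuple[str, str]]]:
    """Map each service name to (source, target) pairs from short-syntax volumes."""
    result: Dict[str, List[Tuple[str, str]]] = {}
    for name, service in (compose.get("services") or {}).items():
        pairs = []
        for entry in (service or {}).get("volumes") or []:
            if not isinstance(entry, str):
                continue  # long (mapping) syntax is out of scope for this sketch
            parts = entry.split(":")  # note: does not handle Windows drive letters
            if len(parts) < 2:
                continue  # anonymous volume: container path only, no source
            pairs.append((parts[0], parts[1]))
        result[name] = pairs
    return result


# The parsed form of the example input above.
compose = {
    "services": {
        "db": {"volumes": ["./db:/var/lib/mysql"]},
        "mc": {"volumes": ["./mc-missing:/data"]},
        "nginx": {"volumes": ["./logs:/var/log/nginx"]},
    }
}
mounts = bind_mounts(compose)
```

Each pair can then be classified by its target path and checked on the host by its source path.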
---
## Current Features
* Docker Compose parsing
* Bind mount detection
* Intelligent classification
* Borg backup integration
* Automation mode
* Exit codes for scripting
* Safe path handling
* Deduplication
---
## Roadmap
DockerVault is built with a clear philosophy:
**simple core, intelligent behavior, and extensible design, without unnecessary complexity or vendor lock-in.**
---
### v1: Core Engine (Current Focus)
> Build a reliable, deterministic backup discovery engine
- [x] Docker Compose scanning
- [x] Volume and bind mount detection
- [x] Intelligent classification (critical / review / skip)
- [x] Backup plan generation
- [x] Borg backup integration
- [x] Dry-run mode
- [x] Automation mode (`--automation`, `--quiet`)
---
### v2: Observability & Automation
> Make DockerVault production-ready
- [ ] Advanced logging (human + JSON output)
- [ ] Webhook support (primary notification system)
- [ ] ntfy integration (lightweight alerts)
- [ ] Email notifications (optional reports)
- [ ] Change detection (new/missing volumes)
- [ ] Backup summaries (stats, duration, warnings)
- [ ] Basic run history (file-based, no database)
---
### v3: Intelligence Layer
> Move from tool → system awareness
- [ ] "Explain why" classification decisions
- [ ] Anomaly detection (size, duration, structure)
- [ ] System understanding confidence
- [ ] Backup diff between runs
- [ ] Smarter classification patterns
---
### v4: Reliability & Safety
> Ensure backups are actually usable
- [ ] Restore testing (ephemeral container validation)
- [ ] Integrity checks (borg/restic verify)
- [ ] Pre/post execution hooks
- [ ] Backup profiles (critical / full / custom)
---
### v5: Security & Encryption
> Strong, transparent data protection
- [ ] Engine-native encryption (Borg / Restic)
- [ ] Encryption validation checks
- [ ] Optional post-process encryption (age / gpg)
- [ ] Clear key handling guidelines
---
### v6: Plugin Ecosystem
> Extend without bloating core
- [ ] Storage backends (S3, WebDAV, SSH, etc.)
- [ ] Optional cloud integrations (Dropbox, Google Drive, Proton Drive)
- [ ] Notification plugins (webhook-first approach)
- [ ] Pluggable architecture for extensions
---
### v7: Platform & Deployment
> Make DockerVault easy to run anywhere
- [ ] Official Docker image
- [ ] Non-interactive container mode
- [ ] Unraid Community Apps template
- [ ] Configurable via environment + config file
---
### Design Principles
- **No vendor lock-in**: webhooks over platform integrations
- **Self-hosting friendly**: works fully offline/local
- **Transparency over magic**: explain decisions
- **Stateless-first**: no database required by default
- **Extensible architecture**: plugins over core bloat
- **Backup ≠ done until restore works**
---
### Future Ideas
> Ideas that push DockerVault beyond backup, towards system awareness and control.
#### System Intelligence
- Change detection (new/missing volumes, structure changes)
- "Explain why" classification decisions
- System understanding confidence score
- Backup diff between runs
- Detection of unknown/unclassified data
#### Observability & Insight
- Historical trends (size, duration, change rate)
- Growth analysis (detect abnormal data expansion)
- Backup performance tracking
- Structured JSON logs for external systems
#### Alerting & Automation
- Webhook-first automation triggers
- ntfy notifications
- Email reporting
- Conditional alerts (failures, anomalies, missing data)
- Integration with external systems (Node-RED, Home Assistant, OpenObserve)
#### Reliability & Verification
- Automated restore testing (ephemeral containers)
- Service-level validation (DB start, app health)
- Integrity checks (borg/restic verification)
- Backup validation reports
#### Control & Extensibility
- Pre/post execution hooks
- Backup profiles (critical / full / custom)
- Simulation mode (predict behavior before execution)
- Advanced dry-run with diff preview
#### Security & Encryption
- Engine-native encryption support
- Optional post-process encryption (age, gpg)
- Encryption validation and key awareness
- Secure offsite export workflows
#### Plugin Ecosystem
- Storage backends (S3, WebDAV, SSH, etc.)
- Optional cloud targets (Dropbox, Google Drive, Proton Drive)
- Notification plugins (webhook-first design)
- Pluggable architecture for extensions
#### Multi-System Awareness
- Multi-host environments (Lanx-style setups)
- Centralized reporting and monitoring
- Cross-node backup visibility
#### Platform & UX
- Optional Web UI (status, history, alerts)
- Docker-native deployment mode
- Unraid Community Apps integration
- Config-driven operation (env + config files)
---
> Built with ❤️ for real systems, not toy setups.
---
## Philosophy
DockerVault is built on a simple idea:
> Backups should reflect reality, not assumptions.
* No blind backups
* No hidden data
* No silent failures
Just clarity.
---
## License
GNU GPLv3
This project is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License v3.
---
## Credits
Created by
**Ed https://lanx.dk
NodeFox 🦊 https://nodefox.lanx.dk**
Built with ❤️ for Lanx
Maintained by Eddie Nielsen
Feel free to contribute, suggest improvements or fork the project.