<p align="center">
<img src="images/dockervault_logo.png" width="600">
</p>

# DockerVault

> Intelligent Docker backup discovery for real systems

DockerVault scans your Docker environments and figures out **what actually matters to back up** — automatically.
No guesswork. No forgotten volumes. No broken restores.
---
## 📚 Contents
* [🚀 What is DockerVault?](#what-is-dockervault)
* [⚡ Quick Start](#quick-start)
* [🧠 How it Works](#how-it-works)
* [🗂 Classification Model](#classification-model)
* [💾 Borg Integration](#borg-integration)
* [🤖 Automation Mode](#automation-mode)
* [🔢 Exit Codes](#exit-codes)
* [🛠 Tech Stack](#tech-stack)
* [🔍 Example](#example)
* [🧱 Current Features](#current-features)
* [🗺 Roadmap](#roadmap)
* [🔮 Future Ideas](#future-ideas)
* [🧠 Philosophy](#philosophy)
* [📜 License](#license)
* [❤️ Credits](#credits)
---
## 🚀 What is DockerVault?
DockerVault analyzes your `docker-compose.yml` and identifies:
* What **must** be backed up
* What can be **ignored**
* What needs **human review**
It bridges the gap between:
👉 “everything looks fine”
and
👉 “restore just failed”
---
## ⚡ Quick Start
```bash
git clone https://github.com/YOUR-USER/dockervault.git
cd dockervault
pip install -e .
```
Run analysis:
```bash
python -m dockervault.cli docker-compose.yml --borg --repo /backup-repo
```
Run backup:
```bash
python -m dockervault.cli docker-compose.yml \
--run-borg \
--repo /backup-repo
```
---
## 🧠 How it Works
DockerVault parses your compose file and inspects:
* bind mounts
* volume targets
* known data paths
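It then classifies each path using heuristics that map onto the three categories described under Classification Model:
* database paths → critical (`INCLUDE`)
* logs/cache → optional (`SKIP`)
* unknown → review (`REVIEW`)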
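
The heuristics above could be sketched roughly as follows. This is a minimal illustration, not DockerVault's actual implementation: the function name `classify_target` and both prefix lists are assumptions, filled only with the example paths that appear in this README.

```python
from pathlib import PurePosixPath

# Hypothetical prefix lists, taken from the examples in this README.
CRITICAL_PREFIXES = ("/var/lib/mysql", "/data", "/config")
OPTIONAL_PREFIXES = ("/var/log", "/tmp", "/cache")


def _is_under(path: str, prefix: str) -> bool:
    """Return True if path equals prefix or lies somewhere below it."""
    p = PurePosixPath(path)
    root = PurePosixPath(prefix)
    return p == root or root in p.parents


def classify_target(container_path: str) -> str:
    """Map a container mount target to 'critical', 'optional' or 'review'."""
    if any(_is_under(container_path, pre) for pre in CRITICAL_PREFIXES):
        return "critical"
    if any(_is_under(container_path, pre) for pre in OPTIONAL_PREFIXES):
        return "optional"
    # Anything the heuristics do not recognise is left for a human.
    return "review"
```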
---
## 🗂 Classification Model
DockerVault divides everything into three categories:
### ✅ INCLUDE
Must be backed up.
Example:
```
/var/lib/mysql
/data
/config
```
### ⚠️ REVIEW
Needs human decision.
Triggered when:
* the host path does not exist
* the host path exists but is empty
* the mount is a named volume (Docker-managed)
Example:
```
./mc-missing → /data
```
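
The three triggers can be checked with a few filesystem calls. The sketch below is an assumption about how such a check might look, not the tool's real code: `needs_review` is a hypothetical helper, and treating any source that does not start with `/` or `.` as a named volume is a simplification.

```python
from pathlib import Path
from typing import Optional


def needs_review(source: str, base_dir: Path) -> Optional[str]:
    """Return a reason if the mount source needs review, else None."""
    # Simplification: sources without a path prefix are named volumes.
    if not source.startswith(("/", ".")):
        return "named volume (Docker-managed)"
    # Relative sources are resolved against the compose file's directory;
    # an absolute source replaces base_dir when joined.
    host_path = base_dir / source
    if not host_path.exists():
        return "path does not exist"
    if host_path.is_dir() and not any(host_path.iterdir()):
        return "path exists but is empty"
    return None
```

With the example above, `needs_review("./mc-missing", Path("."))` would return `"path does not exist"` as long as no such directory exists next to the compose file.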
### ❌ SKIP
Safe to ignore.
Example:
```
/var/log
/tmp
/cache
```
---
## 💾 Borg Integration
DockerVault can generate and run Borg backups directly.
Example:
```bash
python -m dockervault.cli docker-compose.yml \
--run-borg \
--repo /mnt/backups/borg/dockervault
```
Generated command:
```bash
borg create --stats --progress \
/mnt/backups/borg/dockervault::hostname-2026-03-23_12-44-19 \
/path/to/data
```
### Features
* automatic archive naming (with seconds precision)
* deduplicated paths
* safe command generation
* subprocess execution
* optional passphrase support
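
A rough sketch of how the archive naming, path deduplication and safe command generation could fit together. `build_borg_command` and its parameters are hypothetical names, not DockerVault's API. The command shape mirrors the generated command shown above.

```python
import socket
from datetime import datetime


def build_borg_command(repo, paths, now=None, hostname=None):
    """Build a `borg create` argument list for the given repo and paths."""
    now = now or datetime.now()
    hostname = hostname or socket.gethostname()
    # Archive name with seconds precision, e.g. myhost-2026-03-23_12-44-19.
    archive = f"{hostname}-{now:%Y-%m-%d_%H-%M-%S}"
    # dict.fromkeys drops duplicate paths while keeping their order.
    unique_paths = list(dict.fromkeys(paths))
    # Returning a list (not a shell string) means no shell quoting is needed
    # when it is passed to subprocess.run(cmd, check=True).
    return ["borg", "create", "--stats", "--progress",
            f"{repo}::{archive}", *unique_paths]
```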
---
## 🤖 Automation Mode
Designed for cron / scripts / servers.
```bash
python -m dockervault.cli docker-compose.yml \
--run-borg \
--quiet \
--automation \
--repo /backup-repo
```
### Behavior
* no plan output
* no interactive prompts
* minimal output
* suitable for logs / CI
---
## 🔢 Exit Codes
| Code | Meaning |
| ---- | ------------------------------------ |
| 0 | Success |
| 1 | General error |
| 2 | Missing required args |
| 3 | No include paths |
| 4 | Review required (`--fail-on-review`) |
### Fail on review
```bash
--fail-on-review
```
Stops an automated run with exit code 4 if any path needs human attention.
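
One way to map a classification result to the exit codes in the table, as a rough sketch. The names `ExitCode` and `pick_exit_code` are assumptions, and so is the order in which the checks are applied. DockerVault's real precedence may differ.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes as listed in the table above."""
    SUCCESS = 0
    GENERAL_ERROR = 1
    MISSING_ARGS = 2
    NO_INCLUDE_PATHS = 3
    REVIEW_REQUIRED = 4


def pick_exit_code(include, review, fail_on_review=False):
    """Choose an exit code from the INCLUDE and REVIEW path lists."""
    # Assumed precedence: a pending review wins when --fail-on-review is set.
    if review and fail_on_review:
        return ExitCode.REVIEW_REQUIRED
    if not include:
        return ExitCode.NO_INCLUDE_PATHS
    return ExitCode.SUCCESS
```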
---
## 🛠 Tech Stack
* Python 3.10+
* PyYAML
* BorgBackup
* CLI-first design
---
## 🔍 Example
Input:
```yaml
services:
  db:
    volumes:
      - ./db:/var/lib/mysql
  mc:
    volumes:
      - ./mc-missing:/data
  nginx:
    volumes:
      - ./logs:/var/log/nginx
```
Output:
```
INCLUDE:
db
REVIEW:
mc-missing
SKIP:
logs
```
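
To illustrate the parsing step behind this example, here is a minimal sketch that walks an already-parsed compose file and extracts the mounts. `iter_mounts` is a hypothetical helper, not part of DockerVault. The dictionary stands in for what a YAML loader such as PyYAML's `yaml.safe_load` would return for the input above. Only short-syntax `source:target` entries are handled, and Windows drive-letter paths are not.

```python
def iter_mounts(compose):
    """Yield (service, source, target) for short-syntax volume entries."""
    for name, service in (compose.get("services") or {}).items():
        for entry in (service or {}).get("volumes") or []:
            if not isinstance(entry, str):
                continue  # long-syntax (mapping) entries are skipped here
            parts = entry.split(":")
            if len(parts) < 2:
                continue  # anonymous volume: container path only, no source
            # parts[2], if present, is the access mode (e.g. "ro").
            yield name, parts[0], parts[1]


# The example input above, as a parsed dictionary.
compose = {
    "services": {
        "db": {"volumes": ["./db:/var/lib/mysql"]},
        "mc": {"volumes": ["./mc-missing:/data"]},
        "nginx": {"volumes": ["./logs:/var/log/nginx"]},
    }
}
mounts = list(iter_mounts(compose))
```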
---
## 🧱 Current Features
* Docker Compose parsing
* Bind mount detection
* Intelligent classification
* Borg backup integration
* Automation mode
* Exit codes for scripting
* Safe path handling
* Deduplication
---
## 🗺 Roadmap
DockerVault is built with a clear philosophy:
**simple core, intelligent behavior, and extensible design — without unnecessary complexity or vendor lock-in.**
---
### 🚀 v1 — Core Engine (Current Focus)
> Build a reliable, deterministic backup discovery engine
- [x] Docker Compose scanning
- [x] Volume and bind mount detection
- [x] Intelligent classification (critical / review / skip)
- [x] Backup plan generation
- [x] Borg backup integration
- [x] Dry-run mode
- [x] Automation mode (`--automation`, `--quiet`)
---
### 🔧 v2 — Observability & Automation
> Make DockerVault production-ready
- [ ] Advanced logging (human + JSON output)
- [ ] Webhook support (primary notification system)
- [ ] ntfy integration (lightweight alerts)
- [ ] Email notifications (optional reports)
- [ ] Change detection (new/missing volumes)
- [ ] Backup summaries (stats, duration, warnings)
- [ ] Basic run history (file-based, no database)
---
### 🧠 v3 — Intelligence Layer
> Move from tool → system awareness
- [ ] "Explain why" classification decisions
- [ ] Anomaly detection (size, duration, structure)
- [ ] System understanding confidence
- [ ] Backup diff between runs
- [ ] Smarter classification patterns
---
### 🧪 v4 — Reliability & Safety
> Ensure backups are actually usable
- [ ] Restore testing (ephemeral container validation)
- [ ] Integrity checks (borg/restic verify)
- [ ] Pre/post execution hooks
- [ ] Backup profiles (critical / full / custom)
---
### 🔐 v5 — Security & Encryption
> Strong, transparent data protection
- [ ] Engine-native encryption (Borg / Restic)
- [ ] Encryption validation checks
- [ ] Optional post-process encryption (age / gpg)
- [ ] Clear key handling guidelines
---
### 🔌 v6 — Plugin Ecosystem
> Extend without bloating core
- [ ] Storage backends (S3, WebDAV, SSH, etc.)
- [ ] Optional cloud integrations (Dropbox, Google Drive, Proton Drive)
- [ ] Notification plugins (webhook-first approach)
- [ ] Pluggable architecture for extensions
---
### 🌐 v7 — Platform & Deployment
> Make DockerVault easy to run anywhere
- [ ] Official Docker image
- [ ] Non-interactive container mode
- [ ] Unraid Community Apps template
- [ ] Configurable via environment + config file
---
### 🧭 Design Principles
- **No vendor lock-in** — webhook over platform integrations
- **Self-hosting friendly** — works fully offline/local
- **Transparency over magic** — explain decisions
- **Stateless-first** — no database required by default
- **Extensible architecture** — plugins over core bloat
- **Backup ≠ done until restore works**
---
### 🔮 Future Ideas
> Ideas that push DockerVault beyond backup — towards system awareness and control.
#### 🧠 System Intelligence
- Change detection (new/missing volumes, structure changes)
- "Explain why" classification decisions
- System understanding confidence score
- Backup diff between runs
- Detection of unknown/unclassified data
#### 📊 Observability & Insight
- Historical trends (size, duration, change rate)
- Growth analysis (detect abnormal data expansion)
- Backup performance tracking
- Structured JSON logs for external systems
#### 🚨 Alerting & Automation
- Webhook-first automation triggers
- ntfy notifications
- Email reporting
- Conditional alerts (failures, anomalies, missing data)
- Integration with external systems (Node-RED, Home Assistant, OpenObserve)
#### 🧪 Reliability & Verification
- Automated restore testing (ephemeral containers)
- Service-level validation (DB start, app health)
- Integrity checks (borg/restic verification)
- Backup validation reports
#### ⚙️ Control & Extensibility
- Pre/post execution hooks
- Backup profiles (critical / full / custom)
- Simulation mode (predict behavior before execution)
- Advanced dry-run with diff preview
#### 🔐 Security & Encryption
- Engine-native encryption support
- Optional post-process encryption (age, gpg)
- Encryption validation and key awareness
- Secure offsite export workflows
#### 🔌 Plugin Ecosystem
- Storage backends (S3, WebDAV, SSH, etc.)
- Optional cloud targets (Dropbox, Google Drive, Proton Drive)
- Notification plugins (webhook-first design)
- Pluggable architecture for extensions
#### 🌐 Multi-System Awareness
- Multi-host environments (Lanx-style setups)
- Centralized reporting and monitoring
- Cross-node backup visibility
#### 🖥 Platform & UX
- Optional Web UI (status, history, alerts)
- Docker-native deployment mode
- Unraid Community Apps integration
- Config-driven operation (env + config files)
---
> Built with ❤️ for real systems — not toy setups.
---
## 🧠 Philosophy
DockerVault is built on a simple idea:
> Backups should reflect reality — not assumptions.
* No blind backups
* No hidden data
* No silent failures

Just clarity.
---
## 📜 License
GNU GPLv3

This project is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License, version 3.
---
## ❤️ Credits
Created by **Ed https://lanx.dk <br> NodeFox 🦊 https://nodefox.lanx.dk**

Built with ❤️ for Lanx

Maintained by Eddie Nielsen

Feel free to contribute, suggest improvements, or fork the project.