feat: add borg backup support and classification improvements

Eddie Nielsen 2026-03-23 14:46:33 +00:00
parent 483e2720f1
commit e5ef50a74a
15 changed files with 1293 additions and 649 deletions

367
README.md

@ -1,5 +1,5 @@
<p align="center">
<img src="images/dockervault-logo.png" width="600">
<img src="images/dockervault_logo.png" width="600">
</p>
# DockerVault
@ -15,172 +15,262 @@ No guesswork. No forgotten volumes. No broken restores.
## 📚 Contents
* [🚀 What is DockerVault?](#-what-is-dockervault)
* [🧠 Why DockerVault?](#-why-dockervault)
* [⚡ Quick Start](#-quick-start)
* [🧠 How it Works](#-how-it-works)
* [🗂 Classification Model](#-classification-model)
* [💾 Borg Integration](#-borg-integration)
* [🤖 Automation Mode](#-automation-mode)
* [🔢 Exit Codes](#-exit-codes)
* [🛠 Tech Stack](#-tech-stack)
* [🔍 Example](#-example)
* [🧱 Current Features](#-current-features)
* [🔥 Roadmap](#-roadmap)
* [🧠 Philosophy](#-philosophy)
* [📜 License](#-license)
* [🤝 Contributing](#-contributing)
* [❤️ Credits](#-credits)
---
## 🚀 What is DockerVault?
DockerVault is a CLI tool that:
DockerVault analyzes your `docker-compose.yml` and identifies:
* Scans Docker Compose projects
* Parses services, volumes, env files
* Identifies **real data vs noise**
* Builds a structured backup understanding
* What **must** be backed up
* What can be **ignored**
* What needs **human review**
Built for people running real systems — not toy setups.
It bridges the gap between:
---
## 🧠 Why DockerVault?
Most backup setups fail because:
* You forget a volume
* You miss an `.env` file
* You back up cache instead of data
* You don't know what actually matters
DockerVault solves this by **thinking like an operator**.
👉 “everything looks fine”
and
👉 “restore just failed”
---
## ⚡ Quick Start
```bash
git clone https://git.lanx.dk/ed/dockervault.git
git clone https://github.com/YOUR-USER/dockervault.git
cd dockervault
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
dockervault scan /path/to/docker
```
Run analysis:
```bash
python -m dockervault.cli docker-compose.yml --borg --repo /backup-repo
```
Run backup:
```bash
python -m dockervault.cli docker-compose.yml \
--run-borg \
--repo /backup-repo
```
---
## 🧠 How it Works
DockerVault parses your compose file and inspects:
* bind mounts
* volume targets
* known data paths
It then classifies them using heuristics:
* database paths → critical
* logs/cache → optional
* unknown → review
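The heuristics above can be sketched roughly as follows. This is a minimal illustration, not DockerVault's actual implementation: the function name `classify_target` and both hint lists are hypothetical, chosen only to mirror the three rules listed.

```python
from pathlib import PurePosixPath

# Hypothetical hint lists for illustration; the real rules may differ.
CRITICAL_PREFIXES = ("/var/lib/mysql", "/var/lib/postgresql", "/data", "/config")
OPTIONAL_PARTS = {"log", "logs", "cache", "tmp"}


def classify_target(target: str) -> str:
    """Classify a container-side mount target as critical, optional, or review."""
    # Known database/data locations are treated as critical.
    for prefix in CRITICAL_PREFIXES:
        if target == prefix or target.startswith(prefix + "/"):
            return "critical"
    # Any path component that looks like logs or cache is optional.
    if OPTIONAL_PARTS.intersection(PurePosixPath(target).parts):
        return "optional"
    # Everything else is left for a human to decide.
    return "review"
```

Checking critical prefixes first means a path such as `/data/cache` is kept rather than skipped, which errs on the side of backing up too much.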
---
## 🗂 Classification Model
DockerVault divides everything into three categories:
### ✅ INCLUDE
Must be backed up.
Example:
```
/var/lib/mysql
/data
/config
```
### ⚠️ REVIEW
Needs human decision.
Triggered when:
* path does not exist
* path exists but is empty
* named volumes (Docker-managed)
Example:
```
./mc-missing → /data
```
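The three REVIEW triggers could be checked along these lines. This is a sketch under assumptions: `needs_review` is a hypothetical helper, and it treats any source that does not start with `.` or `/` as a named volume, which is a simplification of Compose's short volume syntax.

```python
from pathlib import Path


def needs_review(source: str, base_dir: Path) -> bool:
    """Return True if the host side of a mount needs a human decision."""
    # Assumption: a source that is not a relative or absolute path
    # is a Docker-managed named volume.
    if not source.startswith((".", "/")):
        return True
    host_path = base_dir / source
    # Trigger: path does not exist.
    if not host_path.exists():
        return True
    # Trigger: path exists but is an empty directory.
    if host_path.is_dir() and not any(host_path.iterdir()):
        return True
    return False
```

With this helper, `./mc-missing` would be flagged because the directory is absent, matching the example above.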
### ❌ SKIP
Safe to ignore.
Example:
```
/var/log
/tmp
/cache
```
---
## 💾 Borg Integration
DockerVault can generate and run Borg backups directly.
Example:
```bash
python -m dockervault.cli docker-compose.yml \
--run-borg \
--repo /mnt/backups/borg/dockervault
```
Generated command:
```bash
borg create --stats --progress \
/repo::hostname-2026-03-23_12-44-19 \
/path/to/data
```
### Features
* automatic archive naming (with seconds precision)
* deduplicated paths
* safe command generation
* subprocess execution
* optional passphrase support
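As a rough illustration of the archive naming and path deduplication listed above, the command could be assembled like this. The function `build_borg_command` and its parameters are hypothetical; only the `borg create --stats --progress` flags and the `repo::archive` form are taken from the generated command shown earlier.

```python
import socket
from datetime import datetime


def build_borg_command(repo, paths, now=None, hostname=None):
    """Build a `borg create` argument list for the given include paths."""
    now = now or datetime.now()
    hostname = hostname or socket.gethostname()
    # Archive name: <hostname>-<date>_<time>, with seconds precision.
    archive = f"{repo}::{hostname}-{now:%Y-%m-%d_%H-%M-%S}"
    # Deduplicate paths while preserving their original order.
    unique_paths = list(dict.fromkeys(paths))
    return ["borg", "create", "--stats", "--progress", archive, *unique_paths]
```

Returning a list rather than a single string lets the result be passed to `subprocess.run(cmd, check=True)` without going through a shell, so paths with spaces need no quoting. Borg itself reads a passphrase from the `BORG_PASSPHRASE` environment variable; how DockerVault supplies it is not shown here.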
---
## 🤖 Automation Mode
Designed for cron / scripts / servers.
```bash
python -m dockervault.cli docker-compose.yml \
--run-borg \
--quiet \
--automation \
--repo /backup-repo
```
### Behavior
* no plan output
* no interactive prompts
* minimal output
* suitable for logs / CI
---
## 🔢 Exit Codes
| Code | Meaning |
| ---- | ------------------------------------ |
| 0 | Success |
| 1 | General error |
| 2 | Missing required args |
| 3 | No include paths |
| 4 | Review required (`--fail-on-review`) |
### Fail on review
```bash
--fail-on-review
```
Stops automation if something needs human attention.
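One way to model the table above in code is an `IntEnum` plus a small decision function. The names `ExitCode` and `pick_exit_code` are hypothetical, and the order of the checks is an assumption, since the README does not say which condition takes precedence.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    SUCCESS = 0
    GENERAL_ERROR = 1
    MISSING_ARGS = 2
    NO_INCLUDE_PATHS = 3
    REVIEW_REQUIRED = 4


def pick_exit_code(include_paths, review_paths, fail_on_review):
    """Choose an exit code from the classification result."""
    # Nothing to back up at all.
    if not include_paths:
        return ExitCode.NO_INCLUDE_PATHS
    # Review items only fail the run when --fail-on-review is set.
    if review_paths and fail_on_review:
        return ExitCode.REVIEW_REQUIRED
    return ExitCode.SUCCESS
```

A wrapper script can then branch on the numeric value, for example treating `4` as "notify an operator" rather than as a hard failure.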
---
## 🛠 Tech Stack
DockerVault is built using simple, reliable components:
* **Python 3.10+**: core language
* **PyYAML**: parsing Docker Compose files
* **argparse**: CLI interface
* **pip / venv**: environment management
---
### 🔧 Designed for
* Linux systems (Ubuntu, Debian, Unraid environments)
* Docker Compose based setups
* CLI-first workflows
* Python 3.10+
* PyYAML
* BorgBackup
* CLI-first design
---
## 🔍 Example
```bash
dockervault scan ~/test-docker --json
Input:
```yaml
services:
db:
volumes:
- ./db:/var/lib/mysql
mc:
volumes:
- ./mc-missing:/data
nginx:
volumes:
- ./logs:/var/log/nginx
```
```json
[
{
"name": "app2",
"services": [
{
"name": "app",
"image": "ghcr.io/example/app:latest",
"env_files": [".env"],
"mounts": [
"./data:/app/data",
"./config:/app/config"
]
}
],
"named_volumes": ["app_cache"]
}
]
Output:
```
INCLUDE:
db
REVIEW:
mc-missing
SKIP:
logs
```
---
## 🧱 Current Features
* CLI interface
* Recursive project scanning
* Docker Compose parsing (YAML)
* Service detection
* Volume + bind mount detection
* Environment file detection
* Docker Compose parsing
* Bind mount detection
* Intelligent classification
* Borg backup integration
* Automation mode
* Exit codes for scripting
* Safe path handling
* Deduplication
---
## 🔥 Roadmap
### ✅ Phase 1: Discovery
* [x] CLI
* [x] Scan command
* [x] YAML parsing
---
### 🚧 Phase 2: Intelligence
* [ ] Classify mounts (data / config / cache)
* [ ] Detect backup candidates
* [ ] Generate backup plan
---
### 🔜 Phase 3: Storage
* [ ] SQLite inventory
* [ ] Historical tracking
* [ ] Change detection
---
### 🔜 Phase 4: Execution
* [ ] Borg integration
* [ ] Backup automation
* [ ] Named volume inspection (`docker volume inspect`)
* [ ] Docker API integration
* [ ] Multiple compose files support
* [ ] Email / ntfy notifications
* [ ] Web interface
* [ ] Backup history tracking
* [ ] Restore validation
---
### 🔔 Phase 5: Notifications & Monitoring
* [ ] Email notifications
* [ ] ntfy.sh integration
* [ ] Webhook support
* [ ] Alerts on:
* missing backups
* new volumes
* changed data paths
* [ ] Daily/weekly reports
---
### 🧠 Future Ideas
* [ ] Auto-detect Docker hosts on network
* [ ] Multi-node backup coordination (Lanx-style)
* [ ] Backup simulation ("what would be backed up?")
* [ ] Restore dry-run validation
* [ ] Tagging system (critical / optional / ignore)
* [ ] Scheduling integration
---
@ -188,43 +278,30 @@ dockervault scan ~/test-docker --json
DockerVault is built on a simple idea:
> Backups should be **correct by default**
> Backups should reflect reality — not assumptions.
Not configurable chaos.
* No blind backups
* No hidden data
* No silent failures
Not guesswork.
But **system understanding**.
Just clarity.
---
## 📜 License
This project is licensed under the **GNU General Public License v3.0 (GPL-3.0)**.
GNU GPLv3
You are free to:
* Use the software
* Study and modify it
* Share and redistribute it
Under the condition that:
* Any derivative work must also be licensed under GPL-3.0
* Source code must be made available when distributed
See the full license in the [LICENSE](LICENSE) file.
This project is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License v3.
---
## 🤝 Contributing
## ❤️ Credits
Created by **Ed & NodeFox 🦊**
Built with ❤️ for Lanx
Maintained by Eddie Nielsen
Feel free to contribute, suggest improvements or fork the project.
---
<p align="center">
Built with ❤️ for Lanx by NodeFox 🦊
Maintained by Eddie Nielsen & NodeFox 🦊
Feel free to contribute, suggest improvements or fork the project.
</p>