foxcrawler

Eddie Nielsen b30ad8de0d Clarify responsible use and scope		2026-04-27 20:53:53 +02:00
backend/app	Freeze FoxCrawler v1	2026-04-27 20:21:04 +02:00
frontend	Use neutral defaults and clean public docs	2026-04-27 20:37:16 +02:00
.dockerignore	Freeze FoxCrawler v1	2026-04-27 20:21:04 +02:00
.gitignore	Ignore local backup script	2026-04-27 20:31:14 +02:00
docker-compose.yml	Freeze FoxCrawler v1	2026-04-27 20:21:04 +02:00
Dockerfile	Add responsible use policy	2026-04-27 20:29:31 +02:00
README.md	Use absolute responsible use policy link	2026-04-27 20:47:19 +02:00
requirements.txt	Freeze FoxCrawler v1	2026-04-27 20:21:04 +02:00
RESPONSIBLE_USE.md	Clarify responsible use and scope	2026-04-27 20:53:53 +02:00

README.md

Local-first site crawler, link checker, API discovery tool, and HTML report generator.

Overview

FoxCrawler is a small, self-hosted crawler for inspecting websites and services you own, operate, or have permission to test.

It is built for controlled local use. The goal is to help maintain and understand your own web services, not to perform large-scale public scraping.

FoxCrawler can check site health, discover internal links and API-like endpoints, inspect browser-rendered pages, and generate clean reports.

Features

Crawling

Configurable crawl depth
Maximum page limit
Configurable request delay
Page, asset, API, and error detection
Slow response detection
Error grouping
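
One way the depth, page limit, and delay settings could interact is a bounded breadth-first traversal. The sketch below is an illustration only, not FoxCrawler's actual code: the function `crawl`, the callback `fetch_links`, the parameter names `max_depth`, `max_pages`, and `delay_seconds`, and the in-memory `SITE` map that stands in for real HTTP requests are all hypothetical.

```python
import time
from collections import deque


def crawl(start, fetch_links, max_depth=2, max_pages=50, delay_seconds=0.0):
    """Breadth-first crawl bounded by depth and page count (illustrative only)."""
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # record the page, but do not follow its links
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
        if delay_seconds:
            time.sleep(delay_seconds)  # pause after each link fetch
    return visited


# Hypothetical site structure used in place of real fetching.
SITE = {
    "/": ["/about", "/docs"],
    "/about": ["/"],
    "/docs": ["/docs/api"],
    "/docs/api": [],
}

pages = crawl("/", lambda url: SITE.get(url, []), max_depth=1)
```

With `max_depth=1` the traversal records the start page and its direct links but never reaches `/docs/api`. Lowering `max_pages` stops the loop earlier, whatever the depth.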

Browser mode

Optional rendered link discovery
Optional safe button testing
Button result summaries
Skipped/error grouping

Reports

Saved crawl history
Polished HTML report download
JSON export
CSV export
Result filters
Search across URL, title, source, type, and error text

Saved sites

Site registry for repeat targets
Per-site crawl presets
Latest report metadata
Quick actions for repeat crawling

Quick start

Docker

Build and start:

docker compose up -d --build

Open locally on the server:

http://127.0.0.1:8098

Open from another machine on the same network:

http://<server-ip>:8098

Example:

http://192.168.1.50:8098

Stop:

docker compose down

View logs:

docker logs --tail=100 foxcrawler

Local development

cd ~/projects/foxcrawler
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python -m playwright install chromium
uvicorn backend.app.main:app --host 0.0.0.0 --port 8098 --log-level info

Runtime data

The runtime database is stored here:

./data/foxcrawler.db

Runtime data is intentionally ignored by git:

data/*.db
data/*.db-shm
data/*.db-wal
reports/
backups/

Back up data/foxcrawler.db if you want to preserve crawl history, saved sites, presets, and reports.
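
The `-shm` and `-wal` patterns above suggest the database is SQLite running in write-ahead-log mode. This is an inference, not something the README states. If it holds, copying the file while the service is running may miss recent writes, and SQLite's online backup API is a safer option. A minimal sketch, where `backup_database` is a hypothetical helper name:

```python
import sqlite3
from pathlib import Path


def backup_database(source, destination):
    """Copy a SQLite database using the online backup API."""
    Path(destination).parent.mkdir(parents=True, exist_ok=True)
    src = sqlite3.connect(source)
    dst = sqlite3.connect(destination)
    try:
        # Takes a consistent snapshot, including committed WAL content.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
    return Path(destination)
```

For example, `backup_database("data/foxcrawler.db", "backups/foxcrawler-backup.db")` would write a copy into the git-ignored `backups/` directory. The destination file name is an assumption.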

API

Crawl

POST /api/crawl
GET  /api/jobs/{job_id}

Reports

GET /api/reports
GET /api/reports/{report_id}
GET /api/reports/{report_id}/export/json
GET /api/reports/{report_id}/export/csv
GET /api/reports/{report_id}/export/html
GET /api/reports/{report_id}/download/html
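
The three export routes differ only in their final path segment, so a client can build them from one template. The helper names `report_export_url` and `report_download_url` below are hypothetical. The README does not explain how `export/html` differs from `download/html`, so the sketch treats them as separate routes without assuming their behavior.

```python
from urllib.parse import quote

BASE_URL = "http://127.0.0.1:8098"  # default address from the quick start
EXPORT_FORMATS = ("json", "csv", "html")


def report_export_url(report_id, fmt, base_url=BASE_URL):
    """Return the URL for GET /api/reports/{report_id}/export/{fmt}."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    rid = quote(str(report_id), safe="")
    return f"{base_url}/api/reports/{rid}/export/{fmt}"


def report_download_url(report_id, base_url=BASE_URL):
    """Return the URL for GET /api/reports/{report_id}/download/html."""
    rid = quote(str(report_id), safe="")
    return f"{base_url}/api/reports/{rid}/download/html"
```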

Sites

GET    /api/sites
POST   /api/sites
PUT    /api/sites/{site_id}/presets
DELETE /api/sites/{site_id}
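
The four site routes can be kept in a small lookup table so that client code refers to actions rather than hard-coded paths. The action names (`list`, `create`, `update_presets`, `delete`) and the helper `site_route` are hypothetical. Only the methods and paths come from the list above, and request bodies are not documented here.

```python
SITE_ROUTES = {
    "list": ("GET", "/api/sites"),
    "create": ("POST", "/api/sites"),
    "update_presets": ("PUT", "/api/sites/{site_id}/presets"),
    "delete": ("DELETE", "/api/sites/{site_id}"),
}


def site_route(action, site_id=None):
    """Return (method, path) for a site action, filling in site_id if needed."""
    method, template = SITE_ROUTES[action]
    if "{site_id}" in template:
        if site_id is None:
            raise ValueError(f"action {action!r} requires a site_id")
        return method, template.format(site_id=site_id)
    return method, template
```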

Responsible use

FoxCrawler is intended for websites and services you own, operate, or have explicit permission to test.

See RESPONSIBLE_USE.md for the full responsible use policy.

Credits

Built with ❤️ for Lanx by NodeFox 🦊
Maintained by Eddie Nielsen

Learn. Adopt. Survive. Share.