# ParentZone Downloader Docker Setup

This Docker setup runs the ParentZone snapshot downloaders automatically every day at 2:00 AM.

## Quick Start

1. **Copy the example config file and customize it:**

   ```bash
   cp config.json.example config.json
   # Edit config.json with your credentials and preferences
   ```

2. **Build and run with Docker Compose:**

   ```bash
   docker-compose up -d
   ```

## Configuration Methods

### Method 1: Using config.json (Recommended)

Edit `config.json` with your ParentZone credentials:

```json
{
  "api_url": "https://api.parentzone.me",
  "output_dir": "snapshots",
  "api_key": "your-api-key-here",
  "email": "your-email@example.com",
  "password": "your-password",
  "date_from": "2021-01-01",
  "date_to": null,
  "type_ids": [15],
  "max_pages": null,
  "debug_mode": false
}
```

### Method 2: Using Environment Variables

Create a `.env` file:

```bash
API_KEY=your-api-key-here
EMAIL=your-email@example.com
PASSWORD=your-password
TZ=America/New_York
```

## Schedule Configuration

The downloaders run daily at 2:00 AM by default. To change this:

1. Edit the `crontab` file
2. Rebuild the Docker image: `docker-compose build`
3. Restart: `docker-compose up -d`

## File Organization

```
./
├── snapshots/          # Generated HTML reports
├── logs/               # Scheduler and downloader logs
├── config.json         # Main configuration
├── Dockerfile
├── docker-compose.yml
└── scheduler.sh        # Daily execution script
```

## Monitoring

### View logs in real-time:

```bash
docker-compose logs -f
```

### Check scheduler logs:

```bash
docker exec parentzone-downloader tail -f /app/logs/scheduler_$(date +%Y%m%d).log
```

### View generated reports:

HTML files are saved in the `./snapshots/` directory and can be opened in any web browser.

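
Because `./logs/` and `./snapshots/` are mounted from the host, the monitoring checks above can also be run without `docker exec`. The following is a minimal sketch of two host-side helpers. The function names `todays_scheduler_log` and `count_reports` are hypothetical and not part of this project, and the sketch assumes that reports end in `.html` and that scheduler logs follow the `scheduler_YYYYMMDD.log` naming shown above.

```bash
#!/usr/bin/env bash
# Hypothetical host-side helpers; not shipped with the downloader.

# Print the expected path of today's scheduler log.
# Usage: todays_scheduler_log [log-dir]   (defaults to ./logs)
todays_scheduler_log() {
    local log_dir="${1:-./logs}"
    printf '%s/scheduler_%s.log\n' "$log_dir" "$(date +%Y%m%d)"
}

# Count the HTML reports directly inside a directory.
# Usage: count_reports [report-dir]   (defaults to ./snapshots)
count_reports() {
    local dir="${1:-./snapshots}"
    # tr strips the padding some wc implementations add around the number.
    find "$dir" -maxdepth 1 -type f -name '*.html' | wc -l | tr -d ' '
}
```

For example, `tail -f "$(todays_scheduler_log)"` follows today's log from the host, and `count_reports` gives a quick indication of whether a run produced any output.
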
## Maintenance

### Update the container:

```bash
docker-compose down
docker-compose build
docker-compose up -d
```

### Manual run (for testing):

```bash
docker exec parentzone-downloader /app/scheduler.sh
```

### Cleanup old files:

The system automatically:

- Keeps logs for 30 days
- Keeps HTML reports for 90 days
- Limits cron.log to 50MB

## Troubleshooting

### Check if cron is running:

```bash
docker exec parentzone-downloader pgrep cron
```

### View cron logs:

```bash
docker exec parentzone-downloader tail -f /var/log/cron.log
```

### Test configuration:

```bash
docker exec parentzone-downloader python3 config_snapshot_downloader.py --config /app/config.json --max-pages 1
```

## Security Notes

- Keep your `config.json` file secure and don't commit it to version control
- Consider using environment variables for sensitive credentials
- The Docker container runs with minimal privileges
- Network access is only required for ParentZone API calls

## Volume Persistence

Data is persisted in:

- `./snapshots/` - Generated HTML reports
- `./logs/` - Application logs

These directories are automatically created and mounted as Docker volumes.
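
Since the persisted directories grow over time, it may help to see how the retention rules listed under "Cleanup old files" (30 days for logs, 90 days for HTML reports, a 50MB cap on `cron.log`) could be expressed. The sketch below is an illustration only and is not taken from `scheduler.sh`, which may work differently. The function names `prune_older_than` and `cap_log_size` are hypothetical, as is the choice to truncate the log rather than rotate it.

```bash
#!/usr/bin/env bash
# Hypothetical retention helpers; the real scheduler.sh may do this differently.

# Delete regular files matching a name pattern that are older than N days.
# Usage: prune_older_than <dir> <name-pattern> <days>
prune_older_than() {
    local dir="$1" pattern="$2" days="$3"
    # Nothing to do if the directory has not been created yet.
    [ -d "$dir" ] || return 0
    find "$dir" -maxdepth 1 -type f -name "$pattern" -mtime +"$days" -delete
}

# Empty a log file once it grows past a size limit given in megabytes.
# Usage: cap_log_size <file> <max-mb>
cap_log_size() {
    local file="$1" max_mb="$2"
    [ -f "$file" ] || return 0
    local size_bytes
    size_bytes=$(wc -c < "$file")
    if [ "$size_bytes" -gt $((max_mb * 1024 * 1024)) ]; then
        # Truncate in place so a process holding the file open keeps writing to it.
        : > "$file"
    fi
}
```

Under these assumptions, `prune_older_than ./logs '*.log' 30`, `prune_older_than ./snapshots '*.html' 90`, and `cap_log_size /var/log/cron.log 50` would approximate the documented policy. Test any such script against a copy of the data first, since `find -delete` removes files permanently.
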