`parentzone_downloader/docs/Docker-README.md`, commit d8637ac2ea ("repo restructure") by Tudor Sitaru, 2025-10-14 21:58:54 +01:00


# ParentZone Downloader Docker Setup
This Docker setup runs the ParentZone snapshot downloaders automatically every day at 2:00 AM.
## Quick Start
1. **Copy the example config file and customize it:**
```bash
cp config.json.example config.json
# Edit config.json with your credentials and preferences
```
2. **Build and run with Docker Compose:**
```bash
docker-compose up -d
```
## Configuration Methods
### Method 1: Using config.json (Recommended)
Edit `config.json` with your ParentZone credentials:
```json
{
  "api_url": "https://api.parentzone.me",
  "output_dir": "snapshots",
  "api_key": "your-api-key-here",
  "email": "your-email@example.com",
  "password": "your-password",
  "date_from": "2021-01-01",
  "date_to": null,
  "type_ids": [15],
  "max_pages": null,
  "debug_mode": false
}
```
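Before starting the container, it can help to confirm that the example values above were actually replaced. The following is a minimal sketch and is not part of the repository: `check_placeholders` is a hypothetical helper name, and the strings it searches for are simply the example values shown in this README.

```bash
# Hypothetical pre-flight check (not shipped with the project):
# fail if the file still contains the example values from this README.
check_placeholders() {
    file="$1"
    if grep -Eq 'your-api-key-here|your-email@example\.com|your-password' "$file"; then
        echo "placeholder values remain in $file" >&2
        return 1
    fi
    echo "ok: no placeholder values in $file"
}
```

One way to use it is `check_placeholders config.json && docker-compose up -d`, so the container only starts once the placeholders are gone.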
### Method 2: Using Environment Variables
Create a `.env` file in the same directory as `docker-compose.yml`:
```bash
API_KEY=your-api-key-here
EMAIL=your-email@example.com
PASSWORD=your-password
TZ=America/New_York
```
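A quick way to catch a forgotten or empty variable is to scan the file before starting the container. This is a sketch under assumptions: `missing_env_keys` is a hypothetical helper, and treating `API_KEY`, `EMAIL` and `PASSWORD` as all required is a guess for illustration; the downloader may accept a subset.

```bash
# Hypothetical helper: print each variable that has no non-empty
# KEY=value line in the given env file. The list of keys is an assumption.
missing_env_keys() {
    env_file="$1"
    for key in API_KEY EMAIL PASSWORD; do
        grep -Eq "^${key}=.+" "$env_file" || echo "$key"
    done
}
```

Running `missing_env_keys .env` prints nothing when every listed variable has a value.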
## Schedule Configuration
The downloaders run daily at 2:00 AM by default. To change this:
1. Edit the `crontab` file
2. Rebuild the Docker image: `docker-compose build`
3. Restart: `docker-compose up -d`
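
When editing the `crontab` file, recall that a cron schedule has five fields: minute, hour, day of month, month, and day of week. The sketch below prints what a daily entry might look like. `daily_cron_line` is a hypothetical helper, and the command and log path are assumptions taken from paths mentioned elsewhere in this README; the project's actual `crontab` file may differ.

```bash
# Cron schedule fields: minute hour day-of-month month day-of-week.
# Hypothetical helper that prints a daily entry for a given hour and minute.
# The command and log path are assumptions, not copied from the repository.
daily_cron_line() {
    hour="$1"
    minute="$2"
    printf '%s %s * * * /app/scheduler.sh >> /var/log/cron.log 2>&1\n' "$minute" "$hour"
}

daily_cron_line 2 0    # the default schedule, 2:00 AM
daily_cron_line 6 30   # an alternative, 6:30 AM
```

Note that the minute comes before the hour in the entry, which is an easy detail to get backwards.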
## File Organization
```
./
├── snapshots/ # Generated HTML reports
├── logs/ # Scheduler and downloader logs
├── config.json # Main configuration
├── Dockerfile
├── docker-compose.yml
└── scheduler.sh # Daily execution script
```
## Monitoring
### View logs in real time:
```bash
docker-compose logs -f
```
### Check scheduler logs:
```bash
# Run date inside the container so the file name follows the container's clock and TZ
docker exec parentzone-downloader sh -c 'tail -f /app/logs/scheduler_$(date +%Y%m%d).log'
```
### View generated reports:
HTML files are saved in the `./snapshots/` directory and can be opened in any web browser.
## Maintenance
### Update the container:
```bash
docker-compose down
docker-compose build
docker-compose up -d
```
### Manual run (for testing):
```bash
docker exec parentzone-downloader /app/scheduler.sh
```
### Cleanup old files:
The system automatically:
- Keeps logs for 30 days
- Keeps HTML reports for 90 days
- Limits `cron.log` to 50 MB
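
To preview which files an age-based policy like the one above would touch, a dry run with `find` is one option. This is a sketch of the stated retention periods, not the project's own cleanup code; `list_expired` is a hypothetical helper and the file patterns in the usage note are assumptions.

```bash
# Hypothetical dry run: list files under DIR matching PATTERN whose
# modification time is more than DAYS days in the past.
# Nothing is deleted; the function only prints matching paths.
list_expired() {
    dir="$1"
    pattern="$2"
    days="$3"
    find "$dir" -type f -name "$pattern" -mtime +"$days" -print
}
```

For example, `list_expired ./logs '*.log' 30` and `list_expired ./snapshots '*.html' 90` mirror the 30-day and 90-day periods. Because `find` counts age in whole days, `-mtime +30` matches files that are at least 31 days old.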
## Troubleshooting
### Check if cron is running:
```bash
docker exec parentzone-downloader pgrep cron
```
### View cron logs:
```bash
docker exec parentzone-downloader tail -f /var/log/cron.log
```
### Test configuration:
```bash
docker exec parentzone-downloader python3 config_snapshot_downloader.py --config /app/config.json --max-pages 1
```
## Security Notes
- Keep your `config.json` file secure and don't commit it to version control
- Consider using environment variables for sensitive credentials
- The Docker container runs with minimal privileges
- Network access is only required for ParentZone API calls
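
One way to follow the first note is to make sure Git ignores the credential files. The helper below is a minimal sketch; `ignore_secret` is a hypothetical name, and it assumes it is run from the repository root where `.gitignore` lives.

```bash
# Hypothetical helper: add a file name to .gitignore in the current
# directory unless that exact line is already present.
ignore_secret() {
    file="$1"
    grep -qxF -- "$file" .gitignore 2>/dev/null || echo "$file" >> .gitignore
}
```

Calling `ignore_secret config.json` and `ignore_secret .env` is safe to repeat, since each entry is only appended once.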
## Volume Persistence
Data is persisted in:
- `./snapshots/` - Generated HTML reports
- `./logs/` - Application logs
These directories are automatically created and mounted as Docker volumes.