# ParentZone Downloader Docker Setup

This Docker setup runs the ParentZone snapshot downloaders automatically every day at 2:00 AM.

## Quick Start

1. **Copy the example config file and customize it:**

   ```bash
   cp config.json.example config.json
   # Edit config.json with your credentials and preferences
   ```

2. **Build and run with Docker Compose:**

   ```bash
   docker-compose up -d
   ```

## Configuration Methods

### Method 1: Using config.json (Recommended)

Edit `config.json` with your ParentZone credentials:

```json
{
  "api_url": "https://api.parentzone.me",
  "output_dir": "snapshots",
  "api_key": "your-api-key-here",
  "email": "your-email@example.com",
  "password": "your-password",
  "date_from": "2021-01-01",
  "date_to": null,
  "type_ids": [15],
  "max_pages": null,
  "debug_mode": false
}
```

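Before starting the container, it can help to confirm that none of the example placeholder values shown above were left in place. This is a minimal sketch, assuming the placeholders are exactly the strings in the sample config; `has_placeholders` is a hypothetical helper and not part of this project:

```bash
# Hypothetical helper: succeeds (exit 0) if the file still contains any of
# the example placeholder values from the sample config above.
has_placeholders() {
  grep -qE 'your-api-key-here|your-email@example\.com|your-password' "$1"
}
```

For example, `has_placeholders config.json && echo "config.json still has example values"` prints a warning when a placeholder remains.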
### Method 2: Using Environment Variables

Create a `.env` file:

```bash
API_KEY=your-api-key-here
EMAIL=your-email@example.com
PASSWORD=your-password
TZ=America/New_York
```

## Schedule Configuration

The downloaders run daily at 2:00 AM by default. To change the schedule:

1. Edit the `crontab` file
2. Rebuild the Docker image: `docker-compose build`
3. Restart the container: `docker-compose up -d`

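The `crontab` file uses standard five-field cron syntax (minute, hour, day of month, month, day of week). As a rough sketch, the hypothetical helper below builds a daily entry for a given `HH:MM` time. It assumes the job runs `/app/scheduler.sh` and appends to `/var/log/cron.log`; the entry actually shipped in `crontab` may differ, so compare before replacing it:

```bash
# Hypothetical helper: print a daily cron entry for a 24-hour HH:MM time.
# The command and log path are assumptions based on paths used elsewhere
# in this README; check the real crontab file before copying the output.
cron_line() {
  hour="${1%%:*}"
  minute="${1##*:}"
  # Drop one leading zero so "02" becomes "2"; fall back to "0" if empty.
  hour="${hour#0}";     hour="${hour:-0}"
  minute="${minute#0}"; minute="${minute:-0}"
  printf '%s %s * * * /app/scheduler.sh >> /var/log/cron.log 2>&1\n' "$minute" "$hour"
}
```

For instance, `cron_line 06:30` prints `30 6 * * * /app/scheduler.sh >> /var/log/cron.log 2>&1`.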
## File Organization

```
./
├── snapshots/          # Generated HTML reports
├── logs/               # Scheduler and downloader logs
├── config.json         # Main configuration
├── Dockerfile
├── docker-compose.yml
└── scheduler.sh        # Daily execution script
```

## Monitoring

### View logs in real time:

```bash
docker-compose logs -f
```

### Check scheduler logs:

```bash
docker exec parentzone-downloader tail -f /app/logs/scheduler_$(date +%Y%m%d).log
```

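The command above only works once today's log file exists. Because `./logs/` is persisted on the host, the most recent scheduler log can also be located from outside the container. This is a small sketch, assuming the `scheduler_YYYYMMDD.log` naming shown above; `latest_scheduler_log` is a hypothetical helper:

```bash
# Hypothetical helper: print the newest scheduler log in a directory.
# Relies on the date-stamped file names sorting chronologically.
latest_scheduler_log() {
  ls "$1"/scheduler_*.log 2>/dev/null | sort | tail -n 1
}
```

A typical use would be `tail -n 50 "$(latest_scheduler_log ./logs)"`.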
### View generated reports:

HTML files are saved in the `./snapshots/` directory and can be opened in any web browser.

## Maintenance

### Update the container:

```bash
docker-compose down
docker-compose build
docker-compose up -d
```

### Manual run (for testing):

```bash
docker exec parentzone-downloader /app/scheduler.sh
```

### Cleanup old files:

The system automatically:

- Keeps logs for 30 days
- Keeps HTML reports for 90 days
- Limits cron.log to 50MB

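The retention rules above could be implemented with standard tools. The following is only a sketch of one way to do it, not the project's actual cleanup code; the helper names, the `*.html` pattern, and the choice to keep the tail of an oversized log are all assumptions:

```bash
# Hypothetical helper: delete files matching a name pattern that were last
# modified more than N days ago.
prune_older_than() {
  find "$1" -type f -name "$2" -mtime +"$3" -exec rm -f {} +
}

# Hypothetical helper: if a file is larger than max_bytes, keep only its
# last max_bytes bytes.
cap_file_size() {
  size=$(( $(wc -c < "$1") ))
  if [ "$size" -gt "$2" ]; then
    tail -c "$2" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
  fi
}
```

With the limits listed above, the calls would look like `prune_older_than ./logs '*.log' 30`, `prune_older_than ./snapshots '*.html' 90`, and `cap_file_size /var/log/cron.log 52428800` (50 × 1024 × 1024 bytes).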
## Troubleshooting

### Check if cron is running:

```bash
docker exec parentzone-downloader pgrep cron
```

### View cron logs:

```bash
docker exec parentzone-downloader tail -f /var/log/cron.log
```

### Test configuration:

```bash
docker exec parentzone-downloader python3 config_snapshot_downloader.py --config /app/config.json --max-pages 1
```

## Security Notes

- Keep your `config.json` file secure and don't commit it to version control
- Consider using environment variables for sensitive credentials
- The Docker container runs with minimal privileges
- Network access is only required for ParentZone API calls

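The first point can be made concrete with file permissions and a `.gitignore` entry. This is a minimal sketch, assuming the project directory is (or will be) a Git repository; `protect_secret_file` is a hypothetical helper:

```bash
# Hypothetical helper: make a file readable and writable by its owner only,
# and list it in the .gitignore next to it (without adding duplicates).
protect_secret_file() {
  chmod 600 "$1"
  ignore_file="$(dirname "$1")/.gitignore"
  name="$(basename "$1")"
  grep -qxF "$name" "$ignore_file" 2>/dev/null || printf '%s\n' "$name" >> "$ignore_file"
}
```

Running `protect_secret_file config.json` (and likewise for `.env`) covers both steps. If the container process runs as a different non-root user than the file's owner, mode 600 may prevent it from reading the file, so adjust ownership or permissions to match.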
## Volume Persistence

Data is persisted in:

- `./snapshots/` - Generated HTML reports
- `./logs/` - Application logs

These directories are automatically created and mounted as Docker volumes.