# Image Downloader Script

A Python script to download images from a REST API that provides endpoints for listing assets and downloading them in full resolution.

## Features

- **Concurrent Downloads**: Download multiple images simultaneously for better performance
- **Error Handling**: Robust error handling with detailed logging
- **Progress Tracking**: Real-time progress bar with download statistics
- **Resume Support**: Skip already downloaded files
- **Flexible API Integration**: Supports various API response formats
- **Filename Sanitization**: Automatically handles invalid characters in filenames
- **File Timestamps**: Preserves original file modification dates from API

## Installation

1. Clone or download this repository
2. Install the required dependencies:

```bash
pip install -r requirements.txt
```

## Usage

### Basic Usage

```bash
python image_downloader.py \
  --api-url "https://api.example.com" \
  --list-endpoint "/assets" \
  --download-endpoint "/download" \
  --output-dir "./images" \
  --api-key "your_api_key_here"
```

### Advanced Usage

```bash
python image_downloader.py \
  --api-url "https://api.example.com" \
  --list-endpoint "/assets" \
  --download-endpoint "/download" \
  --output-dir "./images" \
  --max-concurrent 10 \
  --timeout 60 \
  --api-key "your_api_key_here"
```

### Parameters

- `--api-url`: Base URL of the API (required)
- `--list-endpoint`: Endpoint to get the list of assets (required)
- `--download-endpoint`: Endpoint to download individual assets (required)
- `--output-dir`: Directory to save downloaded images (required)
- `--max-concurrent`: Maximum number of concurrent downloads (default: 5)
- `--timeout`: Request timeout in seconds (default: 30)
- `--api-key`: API key for authentication (x-api-key header)
- `--email`: Email for login authentication
- `--password`: Password for login authentication

## Authentication

The script supports two authentication methods:

### API Key Authentication

- Uses `x-api-key` header for list endpoint
- Uses `key`
parameter for download endpoint
- Configure with `--api-key` parameter or `api_key` in config file

### Login Authentication

- Performs login to `/v1/auth/login` endpoint
- Uses session token for list endpoint
- Uses `key` parameter for download endpoint
- Configure with `--email` and `--password` parameters or in config file

**Note**: Only one authentication method should be used at a time. API key takes precedence over login credentials.

## API Integration

The script is designed to work with REST APIs that follow these patterns:

### List Endpoint

The list endpoint should return a JSON response with asset information. The script supports these common formats:

```json
// Array of assets
[
  {"id": "1", "filename": "image1.jpg", "url": "..."},
  {"id": "2", "filename": "image2.png", "url": "..."}
]

// Object with data array
{
  "data": [
    {"id": "1", "filename": "image1.jpg"},
    {"id": "2", "filename": "image2.png"}
  ]
}

// Object with results array
{
  "results": [
    {"id": "1", "filename": "image1.jpg"},
    {"id": "2", "filename": "image2.png"}
  ]
}
```

### Download Endpoint

The download endpoint should accept an asset ID and return the image file.
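
To make this concrete, here is a minimal sketch of how a download URL could be assembled from the base URL, the download endpoint, and an asset ID. It assumes the simplest layout, where the ID is appended as a path segment and any extra values (such as an API key) go in the query string. The helper name `build_download_url` and its parameters are hypothetical and are not taken from the script itself.

```python
from urllib.parse import quote, urlencode


def build_download_url(api_url, download_endpoint, asset_id, params=None):
    """Join base URL, endpoint and asset ID into one download URL.

    Hypothetical helper for illustration; the real script may build
    its URLs differently.
    """
    base = api_url.rstrip("/")
    endpoint = "/" + download_endpoint.strip("/")
    # Percent-encode the ID so characters like spaces or slashes
    # cannot break the path.
    url = f"{base}{endpoint}/{quote(str(asset_id), safe='')}"
    if params:
        # Optional query parameters, e.g. {"key": "your_api_key_here"}
        url += "?" + urlencode(params)
    return url


print(build_download_url("https://api.example.com", "/download", "42"))
```

APIs that place the asset ID elsewhere in the path need a different template.
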
Common patterns:

- `GET /download/{asset_id}`
- `GET /assets/{asset_id}/download`
- `GET /images/{asset_id}`

**ParentZone API Format:**

- `GET /v1/media/{asset_id}/full?key={api_key}&u={updated_timestamp}`

### Asset Object Fields

The script looks for these fields in asset objects:

**Required for identification:**

- `id`, `asset_id`, `image_id`, `file_id`, `uuid`, or `key`

**Optional for better filenames:**

- `fileName`: Preferred filename (ParentZone API)
- `filename`: Alternative filename field
- `name`: Alternative name
- `title`: Display title
- `mimeType`: MIME type for proper file extension (ParentZone API)
- `content_type`: Alternative MIME type field

**Required for ParentZone API downloads:**

- `updated`: Timestamp used in download URL parameter and file modification time

## Examples

### Example 1: ParentZone API with API Key

```bash
python image_downloader.py \
  --api-url "https://api.parentzone.me" \
  --list-endpoint "/v1/gallery" \
  --download-endpoint "/v1/media" \
  --output-dir "./parentzone_images" \
  --api-key "your_api_key_here"
```

### Example 2: ParentZone API with Login

```bash
python image_downloader.py \
  --api-url "https://api.parentzone.me" \
  --list-endpoint "/v1/gallery" \
  --download-endpoint "/v1/media" \
  --output-dir "./parentzone_images" \
  --email "your_email@example.com" \
  --password "your_password_here"
```

### Example 3: API with Authentication

The script now supports API key authentication via the `--api-key` parameter.
For other authentication methods, you can modify the script to include custom headers:

```python
# In the get_asset_list method, add headers:
headers = {
    'Authorization': 'Bearer your_token_here',
    'Content-Type': 'application/json'
}

async with session.get(url, headers=headers, timeout=self.timeout) as response:
    ...  # the method's existing response handling continues here
```

### Example 4: Custom Response Format

If your API returns a different format, you can modify the `get_asset_list` method:

```python
# For an API that returns: {"images": [...]}
if 'images' in data:
    assets = data['images']
```

## Output

The script creates:

1. **Downloaded Images**: All images are saved to the specified output directory with original modification timestamps
2. **Log File**: `download.log` in the output directory with detailed information
3. **Progress Display**: Real-time progress bar showing:
   - Total assets
   - Successfully downloaded
   - Failed downloads
   - Skipped files (already exist)

### File Timestamps

The downloader automatically sets the file modification time to match the `updated` timestamp from the API response.
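
As a rough illustration of that step, the sketch below parses an `updated` value and applies it with `os.utime`. It assumes the timestamp is an ISO 8601 string such as `2024-01-15T10:30:00Z`; the actual format returned by the API is not documented here, and the function name `apply_updated_timestamp` is hypothetical rather than part of the script.

```python
import os
from datetime import datetime, timezone


def apply_updated_timestamp(path, updated):
    """Set a file's access and modification time from an `updated` string.

    Assumption: `updated` is ISO 8601, e.g. "2024-01-15T10:30:00Z".
    """
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    parsed = datetime.fromisoformat(updated.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    ts = parsed.timestamp()
    os.utime(path, (ts, ts))  # (access time, modification time)
    return ts
```
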
This preserves the original file dates and helps with:

- **File Organization**: Files can be sorted by their original creation/update dates
- **Backup Systems**: Backup tools can properly identify changed files
- **Media Libraries**: Media management software can display correct dates
- **Data Integrity**: Maintains the temporal relationship between files

## Error Handling

The script handles various error scenarios:

- **Network Errors**: Retries and continues with other downloads
- **Invalid Responses**: Logs errors and continues
- **File System Errors**: Creates directories and handles permission issues
- **API Errors**: Logs HTTP errors and continues

## Performance

- **Concurrent Downloads**: Configurable concurrency (default: 5)
- **Connection Pooling**: Efficient HTTP connection reuse
- **Chunked Downloads**: Memory-efficient large file handling
- **Progress Tracking**: Real-time feedback on download progress

## Troubleshooting

### Common Issues

1. **"No assets found"**: Check your list endpoint URL and response format
2. **"Failed to fetch asset list"**: Verify API URL and network connectivity
3. **"Content type is not an image"**: API might be returning JSON instead of image data
4. **Permission errors**: Check write permissions for the output directory

### Debug Mode

For detailed debugging, you can modify the logging level:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## License

This script is provided as-is for educational and personal use.

## Contributing

Feel free to submit issues and enhancement requests!