Image Downloader Script
A Python script to download images from a REST API that provides endpoints for listing assets and downloading them in full resolution.
Features
- Concurrent Downloads: Download multiple images simultaneously for better performance
- Error Handling: Robust error handling with detailed logging
- Progress Tracking: Real-time progress bar with download statistics
- Resume Support: Skip already downloaded files
- Flexible API Integration: Supports various API response formats
- Filename Sanitization: Automatically handles invalid characters in filenames
- File Timestamps: Preserves original file modification dates from API
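As an illustration of the filename sanitization feature, here is a minimal sketch. The helper name `sanitize_filename`, the exact set of replaced characters, and the `unnamed` fallback are assumptions for this example and may differ from what the script does.

```python
import re


def sanitize_filename(name: str, replacement: str = "_") -> str:
    """Replace characters that are commonly invalid in filenames (hypothetical helper)."""
    # Assumption: treat Windows-reserved characters and control characters as invalid.
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', replacement, name)
    # Leading/trailing spaces and dots cause trouble on some filesystems.
    cleaned = cleaned.strip(" .")
    # Fall back to a placeholder if nothing usable is left.
    return cleaned or "unnamed"


print(sanitize_filename("holiday/2024: beach?.jpg"))
```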
Installation
- Clone or download this repository
- Install the required dependencies:
pip install -r requirements.txt
Usage
Basic Usage
python image_downloader.py \
--api-url "https://api.example.com" \
--list-endpoint "/assets" \
--download-endpoint "/download" \
--output-dir "./images" \
--api-key "your_api_key_here"
Advanced Usage
python image_downloader.py \
--api-url "https://api.example.com" \
--list-endpoint "/assets" \
--download-endpoint "/download" \
--output-dir "./images" \
--max-concurrent 10 \
--timeout 60 \
--api-key "your_api_key_here"
Parameters
- --api-url: Base URL of the API (required)
- --list-endpoint: Endpoint to get the list of assets (required)
- --download-endpoint: Endpoint to download individual assets (required)
- --output-dir: Directory to save downloaded images (required)
- --max-concurrent: Maximum number of concurrent downloads (default: 5)
- --timeout: Request timeout in seconds (default: 30)
- --api-key: API key for authentication (x-api-key header)
- --email: Email for login authentication
- --password: Password for login authentication
Authentication
The script supports two authentication methods:
API Key Authentication
- Uses x-api-key header for list endpoint
- Uses key parameter for download endpoint
- Configure with --api-key parameter or api_key in config file
Login Authentication
- Performs login to /v1/auth/login endpoint
- Uses session token for list endpoint
- Uses key parameter for download endpoint
- Configure with --email and --password parameters or in config file
Note: Only one authentication method should be used at a time. API key takes precedence over login credentials.
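The precedence rule can be sketched as follows. The function names `choose_auth_method` and `api_key_headers` are hypothetical and only illustrate the documented behavior; the script's own structure may differ.

```python
from typing import Dict, Optional


def choose_auth_method(api_key: Optional[str] = None,
                       email: Optional[str] = None,
                       password: Optional[str] = None) -> str:
    """Pick the authentication method (hypothetical helper, not the script's API)."""
    if api_key:
        # API key takes precedence, even if login credentials are also present.
        return "api_key"
    if email and password:
        return "login"
    raise ValueError("Provide either an API key or both email and password")


def api_key_headers(api_key: str) -> Dict[str, str]:
    """Headers for the list endpoint when API key authentication is used."""
    return {"x-api-key": api_key}


print(choose_auth_method(api_key="abc", email="me@example.com", password="secret"))
```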
API Integration
The script is designed to work with REST APIs that follow these patterns:
List Endpoint
The list endpoint should return a JSON response with asset information. The script supports these common formats:
// Array of assets
[
  {"id": "1", "filename": "image1.jpg", "url": "..."},
  {"id": "2", "filename": "image2.png", "url": "..."}
]
// Object with data array
{
  "data": [
    {"id": "1", "filename": "image1.jpg"},
    {"id": "2", "filename": "image2.png"}
  ]
}
// Object with results array
{
  "results": [
    {"id": "1", "filename": "image1.jpg"},
    {"id": "2", "filename": "image2.png"}
  ]
}
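A minimal sketch of how the three formats above could be normalized into a single list. The helper name `extract_assets` is hypothetical, and returning an empty list for unrecognized payloads is an assumption.

```python
from typing import Any, Dict, List


def extract_assets(payload: Any) -> List[Dict[str, Any]]:
    """Return the list of asset objects from a decoded JSON response (hypothetical helper)."""
    # Format 1: the response is already an array of assets.
    if isinstance(payload, list):
        return payload
    # Formats 2 and 3: the array sits under "data" or "results".
    if isinstance(payload, dict):
        for key in ("data", "results"):
            value = payload.get(key)
            if isinstance(value, list):
                return value
    # Assumption: anything else is treated as "no assets found".
    return []


print(len(extract_assets({"data": [{"id": "1"}, {"id": "2"}]})))
```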
Download Endpoint
The download endpoint should accept an asset ID and return the image file. Common patterns:
- GET /download/{asset_id}
- GET /assets/{asset_id}/download
- GET /images/{asset_id}
ParentZone API Format:
GET /v1/media/{asset_id}/full?key={api_key}&u={updated_timestamp}
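The ParentZone URL pattern above could be assembled roughly like this. The function name `build_parentzone_url` and its parameters are hypothetical; only the path shape and the `key` and `u` query parameters come from the format shown.

```python
from urllib.parse import urlencode


def build_parentzone_url(api_url: str, download_endpoint: str,
                         asset_id: str, api_key: str, updated: str) -> str:
    """Build {api_url}{download_endpoint}/{asset_id}/full?key=...&u=... (hypothetical helper)."""
    # Normalize slashes so "https://host/" + "/v1/media" does not produce "//".
    base = f"{api_url.rstrip('/')}/{download_endpoint.strip('/')}/{asset_id}/full"
    # urlencode percent-encodes values that are not URL-safe.
    query = urlencode({"key": api_key, "u": updated})
    return f"{base}?{query}"


print(build_parentzone_url("https://api.parentzone.me", "/v1/media",
                           "12345", "your_api_key_here", "1700000000"))
```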
Asset Object Fields
The script looks for these fields in asset objects:
Required for identification:
id, asset_id, image_id, file_id, uuid, or key
Optional for better filenames:
- fileName: Preferred filename (ParentZone API)
- filename: Alternative filename field
- name: Alternative name
- title: Display title
- mimeType: MIME type for proper file extension (ParentZone API)
- content_type: Alternative MIME type field
Required for ParentZone API downloads:
updated: Timestamp used in download URL parameter and file modification time
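To illustrate the field lookup described above, here is a sketch that picks an ID and a filename from an asset object. The helper names, the lookup order within each group, and the `asset_<id>` fallback are assumptions, not confirmed behavior of the script.

```python
import mimetypes
from typing import Any, Dict, Optional, Sequence

ID_FIELDS = ("id", "asset_id", "image_id", "file_id", "uuid", "key")
NAME_FIELDS = ("fileName", "filename", "name", "title")
MIME_FIELDS = ("mimeType", "content_type")


def first_present(asset: Dict[str, Any], fields: Sequence[str]) -> Optional[str]:
    """Return the first non-empty value among the given fields, as a string."""
    for field in fields:
        value = asset.get(field)
        if value not in (None, ""):
            return str(value)
    return None


def pick_filename(asset: Dict[str, Any]) -> str:
    """Choose a filename for an asset (hypothetical helper)."""
    asset_id = first_present(asset, ID_FIELDS)
    if asset_id is None:
        raise ValueError("asset has no usable identifier field")
    # Assumption: fall back to a name derived from the ID when no name field exists.
    name = first_present(asset, NAME_FIELDS) or f"asset_{asset_id}"
    if "." not in name:
        # No extension yet: try to derive one from the MIME type, if given.
        mime = first_present(asset, MIME_FIELDS)
        extension = mimetypes.guess_extension(mime) if mime else None
        name += extension or ""
    return name


print(pick_filename({"uuid": "abc", "title": "photo", "mimeType": "image/png"}))
```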
Examples
Example 1: ParentZone API with API Key
python image_downloader.py \
--api-url "https://api.parentzone.me" \
--list-endpoint "/v1/gallery" \
--download-endpoint "/v1/media" \
--output-dir "./parentzone_images" \
--api-key "your_api_key_here"
Example 2: ParentZone API with Login
python image_downloader.py \
--api-url "https://api.parentzone.me" \
--list-endpoint "/v1/gallery" \
--download-endpoint "/v1/media" \
--output-dir "./parentzone_images" \
--email "your_email@example.com" \
--password "your_password_here"
Example 3: API with Custom Authentication
API key and login authentication are built in (see Authentication). For other schemes, such as bearer tokens, you can modify the script to send custom headers:
# In the get_asset_list method, add headers:
headers = {
    'Authorization': 'Bearer your_token_here',
    'Content-Type': 'application/json'
}
async with session.get(url, headers=headers, timeout=self.timeout) as response:
    ...  # existing response handling continues here
Example 4: Custom Response Format
If your API returns a different format, you can modify the get_asset_list method:
# For API that returns: {"images": [...]}
if 'images' in data:
    assets = data['images']
Output
The script creates:
- Downloaded Images: All images are saved to the specified output directory with original modification timestamps
- Log File: download.log in the output directory with detailed information
- Progress Display: Real-time progress bar showing:
  - Total assets
  - Successfully downloaded
  - Failed downloads
  - Skipped files (already exist)
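The "skipped" count comes from files that already exist in the output directory. A minimal sketch of such a check follows; the helper name `should_skip` is hypothetical, and treating empty files as not yet downloaded is an assumption that the script may not share.

```python
from pathlib import Path


def should_skip(target: Path) -> bool:
    """Return True if the file already exists and can be skipped (hypothetical helper)."""
    # Assumption: a zero-byte file is a leftover from a failed download,
    # so it is downloaded again rather than skipped.
    return target.is_file() and target.stat().st_size > 0


print(should_skip(Path("./images/image1.jpg")))
```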
File Timestamps
The downloader automatically sets the file modification time to match the updated timestamp from the API response. This preserves the original file dates and helps with:
- File Organization: Files are sorted by their original creation/update dates
- Backup Systems: Backup tools can properly identify changed files
- Media Libraries: Media management software can display correct dates
- Data Integrity: Maintains the temporal relationship between files
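Setting the modification time can be done with the standard library. The sketch below assumes the updated field is an ISO 8601 string (for example 2024-01-15T10:30:00Z); the actual format returned by the API is not documented here, and the helper name `apply_updated_timestamp` is hypothetical.

```python
import os
from datetime import datetime, timezone


def apply_updated_timestamp(path: str, updated: str) -> float:
    """Set a file's access and modification time from an ISO 8601 string (hypothetical helper)."""
    # datetime.fromisoformat does not accept a trailing "Z" before Python 3.11.
    parsed = datetime.fromisoformat(updated.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    timestamp = parsed.timestamp()
    # os.utime takes (access_time, modification_time) in seconds since the epoch.
    os.utime(path, (timestamp, timestamp))
    return timestamp
```

Calling `apply_updated_timestamp("./images/image1.jpg", asset["updated"])` after a download finishes would stamp the file with the API's date.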
Error Handling
The script handles various error scenarios:
- Network Errors: Retries and continues with other downloads
- Invalid Responses: Logs errors and continues
- File System Errors: Creates directories and handles permission issues
- API Errors: Logs HTTP errors and continues
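The "retry, log, and continue" behavior can be sketched with a small wrapper. The retry count, the delay, the caught exception types, and the `fetch` callable are all assumptions for illustration; the script's actual retry policy is not documented here.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger("image_downloader_sketch")


async def download_with_retries(fetch: Callable[[str], Awaitable[None]],
                                asset_id: str,
                                retries: int = 3,
                                delay: float = 0.0) -> bool:
    """Try a download several times; return False instead of raising (hypothetical helper)."""
    for attempt in range(1, retries + 1):
        try:
            await fetch(asset_id)
            return True
        except (OSError, asyncio.TimeoutError) as exc:
            # Log the failure and retry; an HTTP client library would add its own error types.
            logger.warning("Attempt %d/%d for %s failed: %s", attempt, retries, asset_id, exc)
            if attempt < retries:
                await asyncio.sleep(delay)
    # Giving up on this asset lets the caller continue with the others.
    return False
```

Returning a boolean rather than raising keeps one failed asset from aborting the whole batch.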
Performance
- Concurrent Downloads: Configurable concurrency (default: 5)
- Connection Pooling: Efficient HTTP connection reuse
- Chunked Downloads: Memory-efficient large file handling
- Progress Tracking: Real-time feedback on download progress
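Configurable concurrency is commonly implemented with a semaphore. The sketch below shows that general pattern using only asyncio; the helper name `run_limited` and the `worker` callable are hypothetical, and whether the script uses exactly this approach is an assumption.

```python
import asyncio
from typing import Any, Awaitable, Callable, Iterable, List


async def run_limited(items: Iterable[Any],
                      worker: Callable[[Any], Awaitable[Any]],
                      max_concurrent: int = 5) -> List[Any]:
    """Run worker(item) for every item with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item: Any) -> Any:
        # Each task waits here until one of the max_concurrent slots is free.
        async with semaphore:
            return await worker(item)

    # return_exceptions=True keeps one failure from cancelling the other tasks.
    return await asyncio.gather(*(guarded(item) for item in items),
                                return_exceptions=True)
```

Results come back in input order, with exception objects in place of failed items.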
Troubleshooting
Common Issues
- "No assets found": Check your list endpoint URL and response format
- "Failed to fetch asset list": Verify API URL and network connectivity
- "Content type is not an image": API might be returning JSON instead of image data
- Permission errors: Check write permissions for the output directory
Debug Mode
For detailed debugging, you can modify the logging level:
logging.basicConfig(level=logging.DEBUG)
License
This script is provided as-is for educational and personal use.
Contributing
Feel free to submit issues and enhancement requests!