# ParentZone Snapshots Web Server

A built-in web server that serves your downloaded snapshot HTML files and their assets through a clean, responsive web interface.

## Features

- **📂 Directory Listing**: Browse all your snapshot files with file sizes and modification dates
- **🖼️ Asset Serving**: Properly serves images, CSS, and other assets referenced in HTML files
- **📱 Responsive Design**: Works great on desktop, tablet, and mobile devices
- **🔒 Security**: Path traversal protection and secure file serving
- **📊 Request Logging**: Detailed logging of all web requests
- **⚡ Caching**: Optimized caching headers for better performance

## Quick Start

### Using Docker (Recommended)

The web server starts automatically when you run the Docker container:

```bash
# Build and start with docker-compose
docker-compose up -d

# Or build and run manually
docker build -t parentzone-downloader .
# An absolute host path works on all Docker versions; older ones reject ./snapshots
docker run -d -p 8080:8080 -v "$(pwd)/snapshots:/app/snapshots" parentzone-downloader
```

The web interface will be available at: **http://localhost:8080**

### Running Standalone

You can also run the web server independently:

```bash
# Start web server with default settings
python webserver.py

# Custom port and directory
python webserver.py --port 3000 --snapshots-dir ./my-snapshots

# Bind to all interfaces
python webserver.py --host 0.0.0.0 --port 8080
```

## Configuration Options

### Command Line Arguments

| Argument | Default | Description |
|----------|---------|-------------|
| `--snapshots-dir` | `./snapshots` | Directory containing snapshot files |
| `--port` | `8080` | Port to run the server on |
| `--host` | `0.0.0.0` | Host interface to bind to |

### Examples

```bash
# Serve from custom directory on port 3000
python webserver.py --snapshots-dir /path/to/snapshots --port 3000

# Local access only
python webserver.py --host 127.0.0.1

# Production setup (binding to port 80 usually requires root privileges)
python webserver.py --host 0.0.0.0 --port 80 --snapshots-dir /var/snapshots
```

## Web Interface

### Main Directory Page

- **Clean Layout**: Modern, responsive design with file cards
- **File Information**: Shows file names, sizes, and last modified dates
- **Sorting**: Files are sorted by modification date (newest first)
- **Direct Links**: Click any file name to view the snapshot

### File Serving

- **HTML Files**: Served with proper content types and encoding
- **Assets**: Images, CSS, JS, and other assets are served correctly
- **Caching**: Efficient browser caching for better performance
- **Security**: Path traversal protection prevents unauthorized access

## URL Structure

| URL Pattern | Description | Example |
|-------------|-------------|---------|
| `/` | Main directory listing | `http://localhost:8080/` |
| `/{filename}.html` | Serve HTML snapshot file | `http://localhost:8080/snapshots_2024-01-01.html` |
| `/assets/{path}` | Serve asset files | `http://localhost:8080/assets/images/photo.jpg` |
| `/{filename}.{ext}` | Serve other files | `http://localhost:8080/snapshots.log` |

## Docker Integration

### Environment Variables

The web server respects these environment variables when running in Docker:

- `SNAPSHOTS_DIR`: Directory to serve files from (default: `/app/snapshots`)
- `WEB_PORT`: Port for the web server (default: `8080`)
- `WEB_HOST`: Host interface to bind to (default: `0.0.0.0`)

### Volume Mounts

Make sure your snapshots directory is properly mounted:

```yaml
# docker-compose.yml
volumes:
  - ./snapshots:/app/snapshots  # Your local snapshots folder
  - ./logs:/app/logs            # Log files
```

### Port Mapping

The default port `8080` is exposed and mapped in the Docker setup:

```yaml
# docker-compose.yml
ports:
  - "8080:8080"  # Host:Container
```

To use a different port:

```yaml
ports:
  - "3000:8080"  # Access via http://localhost:3000
```

## File Types Supported

### HTML Files

- **Snapshot files**: Main HTML files with embedded images and styles
- **Content-Type**: `text/html; charset=utf-8`
- **Features**: Full HTML rendering with linked assets

### Asset Files

- **Images**: JPG, PNG, GIF, WebP, SVG, ICO
- **Stylesheets**: CSS files
- **Scripts**: JavaScript files
- **Data**: JSON files
- **Documents**: PDF files
- **Logs**: TXT and LOG files

### Content Type Detection

The server automatically detects content types based on file extensions:

```python
content_types = {
    ".html": "text/html; charset=utf-8",
    ".css": "text/css; charset=utf-8",
    ".js": "application/javascript; charset=utf-8",
    ".jpg": "image/jpeg",
    ".png": "image/png",
    ".pdf": "application/pdf",
    # ... and more
}
```

## Security Features

### Path Traversal Protection

The server prevents access to files outside the snapshots directory:

- ✅ `/snapshots_2024-01-01.html` - Allowed
- ✅ `/assets/images/photo.jpg` - Allowed
- ❌ `/../../../etc/passwd` - Blocked
- ❌ `/../../config.json` - Blocked

### Safe File Serving

- Only serves files from designated directories
- Validates all file paths before serving
- Returns proper HTTP error codes for invalid requests
- Logs suspicious access attempts

## Performance Optimization

### Caching Headers

The server sets appropriate caching headers:

- **HTML files**: `Cache-Control: public, max-age=3600` (1 hour)
- **Asset files**: `Cache-Control: public, max-age=86400` (24 hours)
- **Last-Modified**: Proper modification time headers

### Connection Handling

- Built on `aiohttp` for high-performance async handling
- Efficient file serving with proper buffer sizes
- Graceful error handling and recovery

## Logging

### Request Logging

All requests are logged with details:

```
2024-01-15 10:30:45 - webserver - INFO - 192.168.1.100 - GET /snapshots_2024-01-01.html - 200 - 0.045s
2024-01-15 10:30:46 - webserver - INFO - 192.168.1.100 - GET /assets/images/photo.jpg - 200 - 0.012s
```

### Error Logging

Errors and security events are logged:

```
2024-01-15 10:31:00 - webserver - WARNING - Attempted path traversal: ../../../etc/passwd
2024-01-15 10:31:05 - webserver - ERROR - Error serving file unknown.html: File not found
```

### Log Location

- **Docker**: Logs to `/app/logs/startup.log` and container stdout
- **Standalone**: Logs to console and any configured log files

## Troubleshooting

### Common Issues

#### Port Already in Use

```bash
# Error: Address already in use
# Solution: Use a different port
python webserver.py --port 8081
```

#### Permission Denied

```bash
# Error: Permission denied (port 80)
# Solution: Use sudo or a higher port number
sudo python webserver.py --port 80
# Or
python webserver.py --port 8080
```

#### No Files Visible

- Check that the snapshots directory exists and contains HTML files
- Verify that the directory is readable by the server process
- Check that the Docker volume mounts are correct

#### Assets Not Loading

- Ensure the assets directory exists within the snapshots folder
- Check that asset files are properly referenced in the HTML
- Verify file permissions on asset files

#### AttributeError: 'Application' object has no attribute 'remote'

This error occurs with older versions of aiohttp. The web server has been updated to use the correct request attributes:

- Uses `request.transport.get_extra_info("peername")` for client IP
- Handles cases where transport is not available
- Falls back to "unknown" for client identification

### Debug Mode

For more verbose logging, modify the logging level:

```python
# In webserver.py
logging.basicConfig(level=logging.DEBUG)
```

### Health Check

Test if the server is running:

```bash
# Check if server responds
curl http://localhost:8080/

# Check specific file
curl -I http://localhost:8080/snapshots_2024-01-01.html
```

## Development

### Adding New Features

The web server is designed to be easily extensible:

```python
from aiohttp import web

# Add new route
async def custom_handler(request):
    return web.Response(text="Custom response")

# Register route (`app` is the server's aiohttp application object)
app.router.add_get("/custom", custom_handler)
```

### Custom Styling

You can customize the directory listing appearance by modifying the CSS in `_generate_index_html()`.
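
As a rough illustration of what such a change can look like, the sketch below keeps the stylesheet in a module-level constant and injects it into the page markup. It is not the actual body of `_generate_index_html()`, which is not reproduced in this document; `INDEX_CSS`, `render_index_html`, and the `(name, size)` tuple shape are hypothetical names chosen for this example.

```python
import html
from urllib.parse import quote

# Hypothetical stylesheet constant; in the real server the CSS lives
# inside _generate_index_html().
INDEX_CSS = """
body { font-family: sans-serif; margin: 2rem; }
.file-card { border: 1px solid #ddd; border-radius: 8px; padding: 1rem; }
"""


def render_index_html(files, css=INDEX_CSS):
    """Build a minimal directory page from (name, size_in_bytes) tuples."""
    cards = "\n".join(
        # URL-quote the link target and HTML-escape the visible file name.
        f'<div class="file-card"><a href="/{quote(name)}">'
        f"{html.escape(name)}</a> ({size} bytes)</div>"
        for name, size in files
    )
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><style>{css}</style></head>\n"
        f"<body>{cards}</body></html>"
    )


if __name__ == "__main__":
    print(render_index_html([("snapshots_2024-01-01.html", 2048)]))
```

Passing a different `css` string changes the appearance without touching the markup, which keeps styling edits separate from the listing logic.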

### API Endpoints

Consider adding REST API endpoints for programmatic access:

```python
# Example: JSON API for file listing
async def api_files(request):
    files = get_file_list()  # Your logic here
    return web.json_response(files)

app.router.add_get("/api/files", api_files)
```

## Production Deployment

### Reverse Proxy Setup

For production, consider using nginx as a reverse proxy:

```nginx
server {
    listen 80;
    server_name your-domain.com;

    location / {
        proxy_pass http://localhost:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

### SSL/HTTPS

Add SSL termination at the reverse proxy level:

```nginx
server {
    listen 443 ssl;
    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    location / {
        proxy_pass http://localhost:8080;
    }
}
```

### Process Management

Use systemd or supervisor to manage the web server process:

```ini
# /etc/systemd/system/parentzone-webserver.service
[Unit]
Description=ParentZone Web Server
After=network.target

[Service]
Type=simple
User=parentzone
WorkingDirectory=/opt/parentzone
ExecStart=/usr/bin/python3 webserver.py
Restart=always

[Install]
WantedBy=multi-user.target
```

## Contributing

The web server is part of the ParentZone Downloader project. To contribute:

1. Fork the repository
2. Make your changes to `webserver.py`
3. Test thoroughly
4. Submit a pull request

## License

This web server is part of the ParentZone Downloader project and follows the same license terms.