| # CLAUDE.md | |
| This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. | |
| ## Project Overview | |
| This is the AI Toolkit by Ostris, packaged as a Hugging Face Space for Docker deployment. It is a training suite for diffusion models that supports recent models on consumer-grade hardware. The toolkit provides both a CLI and a web UI for training LoRA models, with a particular focus on FLUX.1 models. | |
| ## Architecture | |
| ### Core Structure | |
| - **Main Entry Points**: | |
| - `run.py` - CLI interface for running training jobs with config files | |
| - `flux_train_ui.py` - Gradio-based simple training interface | |
| - `start.sh` - Docker entry point that launches the web UI | |
| - **Web UI** (`ui/`): Next.js application with TypeScript | |
| - Frontend in `src/app/` with API routes | |
| - Background worker process for job management | |
| - SQLite database via Prisma for job persistence | |
| - **Core Toolkit** (`toolkit/`): Python modules for ML operations | |
| - Model implementations in `toolkit/models/` | |
| - Training processes in `jobs/process/` | |
| - Configuration management and data loading utilities | |
| - **Extensions** (`extensions_built_in/`): Modular training components | |
| - Support for various model types (FLUX, SDXL, SD 1.5, etc.) | |
| - Different training strategies (LoRA, fine-tuning, etc.) | |
| ### Key Configuration | |
| - Training configs in `config/examples/` with YAML format | |
| - Docker setup supports GPU passthrough with nvidia runtime | |
| - Environment variables for HuggingFace tokens and authentication | |
| ## Common Development Commands | |
| ### Setup and Installation | |
| ```bash | |
| # Python environment setup | |
| python3 -m venv venv | |
| source venv/bin/activate # or .\venv\Scripts\activate on Windows | |
| pip3 install --no-cache-dir torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu126 | |
| pip3 install -r requirements.txt | |
| ``` | |
| ### Running Training Jobs | |
| ```bash | |
| # CLI training with config file | |
| python run.py config/your_config.yml | |
| # Simple Gradio UI for FLUX training | |
| python flux_train_ui.py | |
| ``` | |
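To queue several jobs, the CLI command above can be wrapped in a short script. The following is a minimal sketch, not part of the toolkit: `build_commands` and `run_all` are hypothetical helper names, and it assumes each config is passed to `run.py` as a single positional argument, as in the example above.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(config_dir, runner="run.py"):
    """Return one `python run.py <config>` command per YAML file, sorted by name."""
    configs = sorted(
        p for p in Path(config_dir).iterdir()
        if p.is_file() and p.suffix.lower() in {".yml", ".yaml"}
    )
    # Use the current interpreter so the active virtual environment is respected.
    return [[sys.executable, runner, str(cfg)] for cfg in configs]


def run_all(config_dir):
    """Run the jobs one after another, stopping at the first failure."""
    for command in build_commands(config_dir):
        subprocess.run(command, check=True)
```

Calling `run_all("config")` from the repository root would then train each config in turn; `check=True` raises `subprocess.CalledProcessError` if a job exits with a non-zero status.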
| ### Web UI Development | |
| ```bash | |
| # Development mode (from ui/ directory) | |
| cd ui | |
| npm install | |
| npm run dev | |
| # Production build and start | |
| npm run build_and_start | |
| # Database updates | |
| npm run update_db | |
| ``` | |
| ### Docker Operations | |
| ```bash | |
| # Run with docker-compose | |
| docker-compose up | |
| # Build custom image | |
| docker build -f docker/Dockerfile -t ai-toolkit . | |
| ``` | |
| ## Authentication Requirements | |
| ### HuggingFace Access | |
| - FLUX.1-dev requires accepting license at https://huggingface.co/black-forest-labs/FLUX.1-dev | |
| - Set `HF_TOKEN` environment variable with READ access token | |
| - Create `.env` file in root: `HF_TOKEN=your_key_here` | |
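To check which token a job would see, the lookup can be sketched as follows. This is an illustration only: `load_env_file` and `resolve_hf_token` are hypothetical names, the parser handles only plain `KEY=VALUE` lines, and the toolkit's own `.env` handling may differ.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_hf_token(env_file=".env"):
    """Return HF_TOKEN from the environment, else from the .env file, else None."""
    token = os.environ.get("HF_TOKEN") or load_env_file(env_file).get("HF_TOKEN")
    return token or None
```

In this sketch the process environment takes precedence over the file, which is an assumption rather than documented toolkit behavior.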
| ### UI Security | |
| - Set `AI_TOOLKIT_AUTH` environment variable for UI authentication | |
| - If `AI_TOOLKIT_AUTH` is not set, the password defaults to `password` | |
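Because the fallback password is well known, a pre-launch check can catch an unprotected deployment. The sketch below is a hypothetical helper, not toolkit code; treating an empty value the same as an unset one is an assumption.

```python
import os

DEFAULT_PASSWORD = "password"  # fallback value described in this section


def ui_auth_is_default(environ=None):
    """Return True when AI_TOOLKIT_AUTH is unset, empty, or equal to the default."""
    env = os.environ if environ is None else environ
    value = env.get("AI_TOOLKIT_AUTH", "")
    return value == "" or value == DEFAULT_PASSWORD


if __name__ == "__main__":
    if ui_auth_is_default():
        print("Warning: the UI is using the default password; set AI_TOOLKIT_AUTH.")
```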
| ## Training Configuration | |
| ### Model Support | |
| - **FLUX.1-dev**: Requires HF token, non-commercial license | |
| - **FLUX.1-schnell**: Apache 2.0, needs training adapter | |
| - **SDXL, SD 1.5**: Standard Stable Diffusion models | |
| - **Video models**: Various I2V and text-to-video architectures | |
| ### Memory Requirements | |
| - FLUX.1 training requires a minimum of 24 GB of VRAM | |
| - Set `low_vram: true` in the config if the training GPU is also driving displays | |
| - Supports various quantization options to reduce memory usage | |
| ### Dataset Format | |
| - Images: JPG, JPEG, PNG (no WebP) | |
| - Captions: `.txt` files with same name as images | |
| - Use `[trigger]` placeholder in captions, replaced by `trigger_word` config | |
| - Images are automatically resized and bucketed; no manual preprocessing is needed | |
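The dataset rules above can be checked before starting a job. The following is a minimal sketch under stated assumptions: `find_pairs` and `fill_trigger` are hypothetical helpers, not toolkit functions, and whether the toolkit treats a missing caption as an error is not stated here, so the sketch only reports it.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # WebP is not supported


def find_pairs(folder):
    """Return (pairs, missing): images with captions, and images lacking a .txt file."""
    pairs, missing = [], []
    for image in sorted(Path(folder).iterdir()):
        if not image.is_file() or image.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        # The caption shares the image's name, with a .txt extension.
        caption = image.with_suffix(".txt")
        if caption.is_file():
            pairs.append((image, caption))
        else:
            missing.append(image)
    return pairs, missing


def fill_trigger(caption_text, trigger_word):
    """Substitute the [trigger] placeholder with the configured trigger word."""
    return caption_text.replace("[trigger]", trigger_word)
```

For example, `fill_trigger("a photo of [trigger]", "p3r5on")` yields `"a photo of p3r5on"`, mirroring how the `trigger_word` setting is described as being applied to captions.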
| ## Key Files to Understand | |
| - `run.py:46-85` - Main training job runner and argument parsing | |
| - `toolkit/job.py` - Job management and configuration loading | |
| - `ui/src/app/api/jobs/route.ts` - API endpoints for job management | |
| - `config/examples/train_lora_flux_24gb.yaml` - Standard FLUX training template | |
| - `extensions_built_in/sd_trainer/SDTrainer.py` - Core training logic | |
| ## Development Notes | |
| - Jobs run independently of the UI; the UI is only for management | |
| - Training can be stopped and resumed via checkpoints | |
| - Output is stored in the `output/` directory, including samples and models | |
| - The extensions system allows custom training implementations | |
| - Multi-GPU support is provided via the `accelerate` library | |