forked from LiveCarta/ContentGeneration
Refactored code, added a Dockerfile, replaced bash scripts with Python alternatives, and added a README with instructions for running the pipeline
README.md (new file, 202 lines)
# ContentGeneration Pipeline

This project runs a 3-step video pipeline:

1. Generate shot videos from images + prompts.
2. Merge each generated video with its audio.
3. Concatenate merged clips into one final output.

The pipeline entrypoint is `run_video_pipeline.py`.

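The three stages above can be sketched as a minimal orchestration loop. The function names below are illustrative, not the actual API of `run_video_pipeline.py`; only the folder layout (`videos/`, `merged/`, `results/final_output.mp4`) comes from this README.

```python
# Illustrative sketch of the 3-step pipeline flow; the real pipeline calls
# the Hunyuan model in step 1 and ffmpeg in steps 2-3.

def generate(shot):
    # Step 1: shot image + prompt -> raw video clip.
    return {"video": f"videos/shot_{shot['n']}.mp4", **shot}

def merge(shot):
    # Step 2: mux the generated clip with its audio track.
    return {"merged": f"merged/shot_{shot['n']}.mp4", **shot}

def concat(shots):
    # Step 3: concatenate merged clips, in shot order, into the final output.
    return "results/final_output.mp4"

def run_pipeline(shots):
    return concat([merge(generate(s)) for s in shots])
```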
## Quick Start

Local Python:

```bash
cp .env.example .env
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python run_video_pipeline.py
```

Docker (GPU):

```bash
cp .env.example .env
docker build -t content-generation:latest .
docker run --rm --gpus all --env-file .env -v "$(pwd)":/app -w /app content-generation:latest
```

First run (skip S3 upload):

```bash
python run_video_pipeline.py --skip-s3-upload
```

Docker first run (skip S3 upload):

```bash
docker run --rm --gpus all --env-file .env -v "$(pwd)":/app -w /app content-generation:latest \
  python run_video_pipeline.py --skip-s3-upload
```

## Project Layout

- `run_video_pipeline.py`: main entrypoint.
- `src/scripts/`: helper scripts used by the pipeline.
- `HunyuanVideo-1.5/`: Hunyuan inference code and model dependencies.
- `reel_script.json`: required script input containing a `shots` array.
- `images/`, `audios/`, `videos/`, `merged/`, `results/`: working/output folders.
- `.env.example`: environment variable template.

## Prerequisites

1. Linux with an NVIDIA GPU and CUDA runtime.
2. `ffmpeg` and `ffprobe` available on `PATH`.
3. Python 3.10+.
4. Hunyuan model checkpoints under `HunyuanVideo-1.5/ckpts`.
5. If downloading FLUX locally, approved access to `black-forest-labs/FLUX.1-schnell`.

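Prerequisites 2 and 3 can be checked before a long run with a small preflight script. This is a sketch, not part of the repo; the tool list and version floor come from the list above.

```python
import shutil
import sys

def preflight(tools=("ffmpeg", "ffprobe"), min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means ready."""
    # shutil.which returns None when the executable is not on PATH.
    problems = [f"{t} not found on PATH" for t in tools if shutil.which(t) is None]
    if sys.version_info[:2] < min_python:
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ required")
    return problems
```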
## Environment Variables

1. Create a local env file:

```bash
cp .env.example .env
```

2. Fill the required variables in `.env`:
   - `ELEVENLABS_API_KEY` for audio generation.
   - `HUGGINGFACE_HUB_TOKEN` if gated Hugging Face model access is needed.
   - `AWS_S3_BUCKET` (+ optional AWS vars) if you want the final output uploaded to S3.

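A check like the following catches a missing key before the pipeline starts. It is a sketch of the kind of validation the pipeline performs; treating only `ELEVENLABS_API_KEY` as hard-required is an assumption based on the grouping above.

```python
import os

# Required per the list above; the other variables are conditional.
REQUIRED = ["ELEVENLABS_API_KEY"]

def missing_env(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]
```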
## Run Locally (Python)

1. Create and activate a virtual environment:

```bash
python3 -m venv .venv
source .venv/bin/activate
```

2. Install Python dependencies:

```bash
python -m pip install --upgrade pip
pip install -r requirements.txt
```

3. Install Hunyuan dependencies:

```bash
pip install -r HunyuanVideo-1.5/requirements.txt
pip install --upgrade tencentcloud-sdk-python
pip install sgl-kernel==0.3.18
```

4. Run the full pipeline:

```bash
python run_video_pipeline.py
```

5. Common options:

```bash
# Skip generation and only merge + concat
python run_video_pipeline.py --skip-generate

# Skip S3 upload
python run_video_pipeline.py --skip-s3-upload

# Override base directory
python run_video_pipeline.py --base-dir /absolute/path/to/workdir

# Change logging verbosity
python run_video_pipeline.py --log-level DEBUG
```

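The options above suggest an argparse interface roughly like the following. This is a sketch reconstructed from the flags shown; check `python run_video_pipeline.py --help` for the authoritative defaults and choices.

```python
import argparse

def build_parser():
    # Mirrors the flags demonstrated in the "Common options" block above.
    p = argparse.ArgumentParser(description="3-step video pipeline")
    p.add_argument("--skip-generate", action="store_true")
    p.add_argument("--skip-s3-upload", action="store_true")
    p.add_argument("--base-dir", default=".")
    p.add_argument("--log-level", default="INFO",
                   choices=["DEBUG", "INFO", "WARNING", "ERROR"])
    return p
```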
## Run with Docker

1. Build the image:

```bash
docker build -t content-generation:latest .
```

2. Optional build with extra attention backends:

```bash
docker build -t content-generation:latest --build-arg INSTALL_OPTIONAL_ATTENTION=1 .
```

3. Run the pipeline in a container (GPU required):

```bash
docker run --rm --gpus all \
  --env-file .env \
  -v "$(pwd)":/app \
  -w /app \
  content-generation:latest
```

4. Pass extra pipeline args:

```bash
docker run --rm --gpus all \
  --env-file .env \
  -v "$(pwd)":/app \
  -w /app \
  content-generation:latest \
  python run_video_pipeline.py --skip-s3-upload --log-level DEBUG
```

## Input Expectations

1. `reel_script.json` must exist and contain a `shots` array.
2. `images/shot_<n>.png` and `audios/output_<n>.mp3` should align by shot number.
3. Final output is written by default to `results/final_output.mp4`.

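The alignment in items 1-2 can be verified up front with a helper like this. It is a hypothetical sketch, not part of the repo; the filename patterns are taken from the expectations above.

```python
import json
from pathlib import Path

def check_inputs(base_dir="."):
    """Return a list of missing per-shot assets; an empty list means aligned."""
    base = Path(base_dir)
    script = json.loads((base / "reel_script.json").read_text())
    problems = []
    # Shots are 1-indexed in the file naming scheme.
    for n in range(1, len(script["shots"]) + 1):
        if not (base / "images" / f"shot_{n}.png").exists():
            problems.append(f"missing images/shot_{n}.png")
        if not (base / "audios" / f"output_{n}.mp3").exists():
            problems.append(f"missing audios/output_{n}.mp3")
    return problems
```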
## S3 Upload Behavior

1. If `AWS_S3_BUCKET` is set, the pipeline uploads the final output to S3 using `S3VideoStorage`.
2. If `AWS_S3_BUCKET` is missing, the upload is skipped with a warning.
3. Disable the upload explicitly with `--skip-s3-upload`.

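The decision described above amounts to logic like the following. This is a simplified sketch of the documented behavior, not the `S3VideoStorage` implementation; the precedence of `--skip-s3-upload` over the bucket check is an assumption.

```python
import os

def should_upload(skip_s3_upload, env=None):
    """Return (upload?, reason) following the rules above."""
    env = os.environ if env is None else env
    if skip_s3_upload:
        return False, "disabled via --skip-s3-upload"
    if not env.get("AWS_S3_BUCKET"):
        return False, "AWS_S3_BUCKET not set; skipping upload (warning)"
    return True, f"uploading to s3://{env['AWS_S3_BUCKET']}"
```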
## Troubleshooting

1. `torch.cuda.is_available()` is false in Docker.
   - Run with GPU flags: `docker run --gpus all ...`
   - Verify the NVIDIA Container Toolkit is installed on the host.
   - Check host GPU visibility: `nvidia-smi`.

2. `ffmpeg` or `ffprobe` not found.
   - Local: install ffmpeg with your package manager.
   - Docker: ffmpeg is installed by the provided Dockerfile.

3. Hunyuan generate step fails due to missing checkpoints.
   - Ensure checkpoints are available under `HunyuanVideo-1.5/ckpts`.
   - Confirm the project path mounted into Docker includes the checkpoints.

4. Hugging Face model download fails (401/403).
   - Accept the model access terms for gated models (for example, FLUX.1-schnell).
   - Set `HUGGINGFACE_HUB_TOKEN` in `.env`.

5. S3 upload fails.
   - Confirm `AWS_S3_BUCKET` is set.
   - If needed, set `AWS_REGION` and credentials (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, optional `AWS_SESSION_TOKEN`).
   - For S3-compatible providers, set `AWS_S3_ENDPOINT_URL`.

6. Permission issues when running Docker with mounted volumes.
   - Use host user mapping if needed: `docker run --rm --gpus all -u "$(id -u):$(id -g)" ...`

7. Out-of-memory during video generation.
   - Keep `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True,max_split_size_mb:128`.
   - Reduce the workload by skipping optional enhancements or lowering resolution/steps in the generation scripts.

8. Verify syntax quickly before running:

```bash
python3 -m py_compile run_video_pipeline.py src/scripts/*.py
```