# ContentGeneration Pipeline

This project runs a three-step video pipeline:

1. Generate shot videos from images and prompts.
2. Merge each generated video with its audio.
3. Concatenate the merged clips into one final output.

The pipeline entrypoint is `run_video_pipeline.py`.
## Quick Start

Local Python:

```bash
cp .env.example .env
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python run_video_pipeline.py
```

Docker (GPU):

```bash
cp .env.example .env
docker build -t content-generation:latest .
docker run --rm --gpus all --env-file .env \
  -v "$(pwd)":/app \
  -v "$HOME/.cache/huggingface":/root/.cache/huggingface \
  -w /app content-generation:latest
```

First run (skip S3 upload):

```bash
python run_video_pipeline.py --skip-s3-upload
```

Docker first run (skip S3 upload):

```bash
docker run --rm --gpus all --env-file .env \
  -v "$(pwd)":/app \
  -v "$HOME/.cache/huggingface":/root/.cache/huggingface \
  -w /app \
  content-generation:latest \
  python run_video_pipeline.py --skip-s3-upload
```
## Project Layout

- `run_video_pipeline.py`: main entrypoint.
- `src/`: helper scripts used by the pipeline.
- `HunyuanVideo-1.5/`: Hunyuan inference code and model dependencies.
- `reel_script.json`: required script input with `shots`.
- `images/`, `audios/`, `videos/`, `merged/`, `results/`: working and output folders.
- `.env.example`: environment variable template.
## Prerequisites

1. Linux with an NVIDIA GPU and the CUDA runtime.
2. `ffmpeg` and `ffprobe` available on `PATH`.
3. Python 3.10+.
4. Hunyuan model checkpoints under `HunyuanVideo-1.5/ckpts`.
5. If downloading FLUX locally, approved access to `black-forest-labs/FLUX.1-schnell`.
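The tool checks above can be sketched as a quick preflight script (illustrative only; `preflight` is not part of the pipeline):

```bash
# Preflight sketch: report which required tools are visible on PATH.
preflight() {
  local tool
  for tool in python3 ffmpeg ffprobe nvidia-smi; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "MISSING: $tool"
    fi
  done
}
preflight
```

Run it before the first pipeline run; any `MISSING` line points at a prerequisite to install.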
## Environment Variables

1. Create the local env file:

```bash
cp .env.example .env
```

2. Fill in the required variables in `.env`:

- `ELEVENLABS_API_KEY` for audio generation.
- `HUGGINGFACE_HUB_TOKEN` if gated Hugging Face model access is needed.
- `AWS_S3_BUCKET` (plus optional AWS vars) if you want the final output uploaded to S3.
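A filled-in `.env` might look like this (all values are placeholders):

```bash
# Required for audio generation
ELEVENLABS_API_KEY=your-elevenlabs-key
# Only needed for gated Hugging Face models
HUGGINGFACE_HUB_TOKEN=your-hf-token
# Set to enable S3 upload of the final output; omit to skip
AWS_S3_BUCKET=your-output-bucket
```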
## Run Locally (Python)

1. Create and activate a virtual environment:

```bash
python3 -m venv .venv
source .venv/bin/activate
```

2. Install Python dependencies:

```bash
python -m pip install --upgrade pip
pip install -r requirements.txt
```

3. Install Hunyuan dependencies:

```bash
pip install -r HunyuanVideo-1.5/requirements.txt
pip install --upgrade tencentcloud-sdk-python
pip install sgl-kernel==0.3.18
```

4. Run the full pipeline:

```bash
python run_video_pipeline.py
```

5. Common options:

```bash
# Skip generation and only merge + concat
python run_video_pipeline.py --skip-generate

# Skip S3 upload
python run_video_pipeline.py --skip-s3-upload

# Override the base directory
python run_video_pipeline.py --base-dir /absolute/path/to/workdir

# Change logging verbosity
python run_video_pipeline.py --log-level DEBUG
```
## Run with Docker

1. Build the image:

```bash
docker build -t content-generation:latest .
```

2. Optional: build with extra attention backends:

```bash
docker build -t content-generation:latest --build-arg INSTALL_OPTIONAL_ATTENTION=1 .
```

3. Run the pipeline in a container (GPU required):

```bash
docker run --rm --gpus all \
  --env-file .env \
  -v "$(pwd)":/app \
  -v "$HOME/.cache/huggingface":/root/.cache/huggingface \
  -w /app \
  content-generation:latest
```

4. Pass extra pipeline args:

```bash
docker run --rm --gpus all \
  --env-file .env \
  -v "$(pwd)":/app \
  -v "$HOME/.cache/huggingface":/root/.cache/huggingface \
  -w /app \
  content-generation:latest \
  python run_video_pipeline.py --skip-s3-upload --log-level DEBUG
```
## Input Expectations

1. `reel_script.json` must exist and contain a `shots` array.
2. `images/shot_<n>.png` and `audios/output_<n>.mp3` should align by shot number.
3. The final output is written by default to `results/final_output.mp4`.
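The alignment rule above can be checked up front with a small helper (a sketch; `check_shot_alignment` is an illustrative name, not part of the pipeline):

```bash
# Sketch: verify every images/shot_<n>.png has a matching audios/output_<n>.mp3.
check_shot_alignment() {
  local base="${1:-.}" img n missing=0
  for img in "$base"/images/shot_*.png; do
    [ -e "$img" ] || continue            # glob matched nothing
    n="${img##*/shot_}"                  # strip path prefix up to shot_
    n="${n%.png}"                        # strip extension -> shot number
    if [ ! -f "$base/audios/output_${n}.mp3" ]; then
      echo "missing audio for shot ${n}"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_shot_alignment .` run from the project root prints one line per shot that lacks audio and returns nonzero if any are missing.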
## S3 Upload Behavior

1. If `AWS_S3_BUCKET` is set, the pipeline uploads the final output to S3 using `S3VideoStorage`.
2. If `AWS_S3_BUCKET` is missing, the upload is skipped with a warning.
3. Disable the upload explicitly with `--skip-s3-upload`.
## Troubleshooting

1. `torch.cuda.is_available()` is `False` in Docker.
   - Run with GPU flags: `docker run --gpus all ...`
   - Verify the NVIDIA Container Toolkit is installed on the host.
   - Check host GPU visibility with `nvidia-smi`.

2. `ffmpeg` or `ffprobe` not found.
   - Local: install ffmpeg with your package manager.
   - Docker: ffmpeg is installed by the provided Dockerfile.

3. The Hunyuan generate step fails due to missing checkpoints.
   - Ensure checkpoints are available under `HunyuanVideo-1.5/ckpts`.
   - Confirm the project path mounted into Docker includes the checkpoints.

4. Hugging Face model download fails (401/403).
   - Accept the access terms for gated models (for example, FLUX.1-schnell).
   - Set `HUGGINGFACE_HUB_TOKEN` in `.env`.

5. S3 upload fails.
   - Confirm `AWS_S3_BUCKET` is set.
   - If needed, set `AWS_REGION` and credentials (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, optional `AWS_SESSION_TOKEN`).
   - For S3-compatible providers, set `AWS_S3_ENDPOINT_URL`.

6. Permission issues when running Docker with mounted volumes.
   - Map your host user if needed:
     `docker run --rm --gpus all -u "$(id -u):$(id -g)" ...`
7. Out-of-memory during video generation.
   - Keep `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True,max_split_size_mb:128`.
   - Reduce the workload by skipping optional enhancements or lowering resolution/steps in the generation scripts.
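The allocator setting from point 7 can be exported in the shell (or set via `.env`) before launching the pipeline:

```bash
# CUDA caching-allocator tuning recommended above; helps with fragmentation OOMs.
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True,max_split_size_mb:128
```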
8. Verify syntax quickly before running:

```bash
python3 -m py_compile run_video_pipeline.py src/*.py
```