# ContentGeneration Pipeline

This project runs a three-step video pipeline:

1. Generate shot videos from images and prompts.
2. Merge each generated video with its audio.
3. Concatenate the merged clips into one final output.

The pipeline entrypoint is `run_video_pipeline.py`.
## Quick Start

Local Python:

```bash
cp .env.example .env
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python run_video_pipeline.py
```

Docker (GPU):

```bash
cp .env.example .env
docker build -t content-generation:latest .
docker run --rm --gpus all --env-file .env \
  -v "$(pwd)":/app \
  -v "$HOME/.cache/huggingface":/root/.cache/huggingface \
  -w /app content-generation:latest
```

First run (skip S3 upload):

```bash
python run_video_pipeline.py --skip-s3-upload
```

Docker first run (skip S3 upload):

```bash
docker run --rm --gpus all --env-file .env \
  -v "$(pwd)":/app \
  -v "$HOME/.cache/huggingface":/root/.cache/huggingface \
  -w /app \
  content-generation:latest \
  python run_video_pipeline.py --skip-s3-upload
```
## Project Layout

- `run_video_pipeline.py`: main entrypoint.
- `src/`: helper scripts used by the pipeline.
- `HunyuanVideo-1.5/`: Hunyuan inference code and model dependencies.
- `reel_script.json`: required script input with `shots`.
- `images/`, `audios/`, `videos/`, `merged/`, `results/`: working and output folders.
- `.env.example`: environment variable template.
## Prerequisites

1. Linux with an NVIDIA GPU and the CUDA runtime.
2. `ffmpeg` and `ffprobe` available on `PATH`.
3. Python 3.10+.
4. Hunyuan model checkpoints under `HunyuanVideo-1.5/ckpts`.
5. If downloading FLUX locally, approved access to `black-forest-labs/FLUX.1-schnell`.
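The tool checks above can be sketched as a quick preflight script (illustrative only; `preflight` is not part of the pipeline):

```bash
# Preflight sketch: report which required tools are visible on PATH.
preflight() {
  local tool
  for tool in python3 ffmpeg ffprobe nvidia-smi; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "MISSING: $tool"
    fi
  done
}
preflight
```

Run it before the first pipeline run; any `MISSING` line points at a prerequisite to install.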
## Environment Variables

1. Create the local env file:

```bash
cp .env.example .env
```

2. Fill in the required variables in `.env`:

- `ELEVENLABS_API_KEY` for audio generation.
- `HUGGINGFACE_HUB_TOKEN` if gated Hugging Face model access is needed.
- `AWS_S3_BUCKET` (plus optional AWS vars) if you want the final output uploaded to S3.
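A filled-in `.env` might look like this (all values are placeholders):

```bash
# Required for audio generation
ELEVENLABS_API_KEY=your-elevenlabs-key
# Only needed for gated Hugging Face models
HUGGINGFACE_HUB_TOKEN=your-hf-token
# Set to enable S3 upload of the final output; omit to skip
AWS_S3_BUCKET=your-output-bucket
```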
## Run Locally (Python)

1. Create and activate a virtual environment:

```bash
python3 -m venv .venv
source .venv/bin/activate
```

2. Install Python dependencies:

```bash
python -m pip install --upgrade pip
pip install -r requirements.txt
```

3. Install Hunyuan dependencies:

```bash
pip install -r HunyuanVideo-1.5/requirements.txt
pip install --upgrade tencentcloud-sdk-python
pip install sgl-kernel==0.3.18
```

4. Run the full pipeline:

```bash
python run_video_pipeline.py
```

5. Common options:

```bash
# Skip generation and only merge + concat
python run_video_pipeline.py --skip-generate

# Skip S3 upload
python run_video_pipeline.py --skip-s3-upload

# Override the base directory
python run_video_pipeline.py --base-dir /absolute/path/to/workdir

# Change logging verbosity
python run_video_pipeline.py --log-level DEBUG
```
## Run with Docker

1. Build the image:

```bash
docker build -t content-generation:latest .
```

2. Optional: build with extra attention backends:

```bash
docker build -t content-generation:latest --build-arg INSTALL_OPTIONAL_ATTENTION=1 .
```

3. Run the pipeline in a container (GPU required):

```bash
docker run --rm --gpus all \
  --env-file .env \
  -v "$(pwd)":/app \
  -v "$HOME/.cache/huggingface":/root/.cache/huggingface \
  -w /app \
  content-generation:latest
```

4. Pass extra pipeline args:

```bash
docker run --rm --gpus all \
  --env-file .env \
  -v "$(pwd)":/app \
  -v "$HOME/.cache/huggingface":/root/.cache/huggingface \
  -w /app \
  content-generation:latest \
  python run_video_pipeline.py --skip-s3-upload --log-level DEBUG
```
## Input Expectations

1. `reel_script.json` must exist and contain a `shots` array.
2. `images/shot_<n>.png` and `audios/output_<n>.mp3` should align by shot number.
3. The final output is written by default to `results/final_output.mp4`.
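The alignment rule above can be checked up front with a small helper (a sketch; `check_shot_alignment` is an illustrative name, not part of the pipeline):

```bash
# Sketch: verify every images/shot_<n>.png has a matching audios/output_<n>.mp3.
check_shot_alignment() {
  local base="${1:-.}" img n missing=0
  for img in "$base"/images/shot_*.png; do
    [ -e "$img" ] || continue            # glob matched nothing
    n="${img##*/shot_}"                  # strip path prefix up to shot_
    n="${n%.png}"                        # strip extension -> shot number
    if [ ! -f "$base/audios/output_${n}.mp3" ]; then
      echo "missing audio for shot ${n}"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_shot_alignment .` run from the project root prints one line per shot that lacks audio and returns nonzero if any are missing.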
## S3 Upload Behavior

1. If `AWS_S3_BUCKET` is set, the pipeline uploads the final output to S3 using `S3VideoStorage`.
2. If `AWS_S3_BUCKET` is missing, the upload is skipped with a warning.
3. Disable the upload explicitly with `--skip-s3-upload`.
## Troubleshooting

1. `torch.cuda.is_available()` is `False` in Docker.
   - Run with GPU flags: `docker run --gpus all ...`
   - Verify the NVIDIA Container Toolkit is installed on the host.
   - Check host GPU visibility with `nvidia-smi`.

2. `ffmpeg` or `ffprobe` not found.
   - Local: install ffmpeg with your package manager.
   - Docker: ffmpeg is installed by the provided Dockerfile.

3. The Hunyuan generate step fails due to missing checkpoints.
   - Ensure checkpoints are available under `HunyuanVideo-1.5/ckpts`.
   - Confirm the project path mounted into Docker includes the checkpoints.

4. Hugging Face model download fails (401/403).
   - Accept the access terms for gated models (for example, FLUX.1-schnell).
   - Set `HUGGINGFACE_HUB_TOKEN` in `.env`.

5. S3 upload fails.
   - Confirm `AWS_S3_BUCKET` is set.
   - If needed, set `AWS_REGION` and credentials (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, optional `AWS_SESSION_TOKEN`).
   - For S3-compatible providers, set `AWS_S3_ENDPOINT_URL`.

6. Permission issues when running Docker with mounted volumes.
   - Map your host user if needed:
     `docker run --rm --gpus all -u "$(id -u):$(id -g)" ...`
7. Out-of-memory during video generation.
   - Keep `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True,max_split_size_mb:128`.
   - Reduce the workload by skipping optional enhancements or lowering resolution/steps in the generation scripts.
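The allocator setting from point 7 can be exported in the shell (or set via `.env`) before launching the pipeline:

```bash
# CUDA caching-allocator tuning recommended above; helps with fragmentation OOMs.
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True,max_split_size_mb:128
```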
8. Verify syntax quickly before running:

```bash
python3 -m py_compile run_video_pipeline.py src/*.py
```