Multi-GPU / Parallel Processing Support
#3
by Iotcv - opened
We are trying to run this model on multiple GPUs, but it currently uses only a single GPU, which leads to out-of-memory (OOM) errors.
Any guidance on best practices for running this model across multiple GPUs would be very helpful.
Looking forward to exploring more with this model.
Thanks!
Thank you for your interest. You can try the script below to run inference in parallel on multiple GPUs, with one process per GPU:
...
import torch.distributed as dist
dist.init_process_group(backend="nccl")
rank = dist.get_rank()
...
pipeline = NextStepPipeline(tokenizer=tokenizer, model=model).to(device=f"cuda:{rank}")
...
image = pipeline.generate_image(
    ....
    seed=42 + rank,
)[0]
image.save(f"./assets/output_{rank}.png")
Then use torchrun to start the inference:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc-per-node=8 your_scripts.py
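Note that the script above places a full copy of the pipeline on each GPU, so it runs several generations at once rather than lowering the memory needed per GPU. If the goal is to spread a batch of prompts over the ranks, a minimal sketch of the bookkeeping could look like the following. It assumes the script is launched with torchrun, which sets the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables; the helper names, the prompt list, and the output naming scheme are hypothetical and not part of the NextStep code.

```python
import os


def torchrun_env() -> tuple[int, int, int]:
    """Read the rank variables that torchrun sets; fall back to a single process."""
    rank = int(os.environ.get("RANK", "0"))
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    world_size = int(os.environ.get("WORLD_SIZE", "1"))
    return rank, local_rank, world_size


def shard_for_rank(items: list, rank: int, world_size: int) -> list:
    """Round-robin split: take every world_size-th item, starting at index rank."""
    return items[rank::world_size]


def output_path(rank: int, index: int, out_dir: str = "./assets") -> str:
    """Hypothetical naming scheme so that ranks never overwrite each other's files."""
    return os.path.join(out_dir, f"output_rank{rank}_{index}.png")


if __name__ == "__main__":
    rank, local_rank, world_size = torchrun_env()
    # Placeholder prompts; replace with the real list of editing instructions.
    prompts = [f"hypothetical prompt {i}" for i in range(10)]
    for i, prompt in enumerate(shard_for_rank(prompts, rank, world_size)):
        # In the real script, pipeline.generate_image(...) would be called here
        # on device f"cuda:{local_rank}", with a per-rank seed such as 42 + rank.
        print(rank, local_rank, prompt, "->", output_path(rank, i))
```

On a single node `RANK` and `LOCAL_RANK` are the same, which is why the script above can use `rank` for the device index; on multiple nodes `LOCAL_RANK` is the safer choice for selecting the GPU.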
jingwwu changed discussion status to closed