Instructions to use ICTNLP/bayling-13b-v1.1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use ICTNLP/bayling-13b-v1.1 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ICTNLP/bayling-13b-v1.1")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ICTNLP/bayling-13b-v1.1")
model = AutoModelForCausalLM.from_pretrained("ICTNLP/bayling-13b-v1.1")
```
- Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use ICTNLP/bayling-13b-v1.1 with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "ICTNLP/bayling-13b-v1.1"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ICTNLP/bayling-13b-v1.1",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```shell
docker model run hf.co/ICTNLP/bayling-13b-v1.1
```
- SGLang
How to use ICTNLP/bayling-13b-v1.1 with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "ICTNLP/bayling-13b-v1.1" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ICTNLP/bayling-13b-v1.1",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "ICTNLP/bayling-13b-v1.1" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ICTNLP/bayling-13b-v1.1",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use ICTNLP/bayling-13b-v1.1 with Docker Model Runner:
```shell
docker model run hf.co/ICTNLP/bayling-13b-v1.1
```
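The curl calls to the vLLM and SGLang servers shown earlier can also be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and it assumes a server is already running locally on port 8000 (vLLM) or 30000 (SGLang) and returns the usual OpenAI-style completions response.

```python
import json
import urllib.request

MODEL_ID = "ICTNLP/bayling-13b-v1.1"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint.

    base_url defaults to the vLLM port used in the curl example; pass
    "http://localhost:30000" for the SGLang server.
    """
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text of the first choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: the server answers in the OpenAI completions format,
    # where the generated text is stored under choices[0]["text"].
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` queries vLLM, and `complete("Once upon a time,", base_url="http://localhost:30000")` queries SGLang.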
BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models
BayLing (百聆, bǎi líng) is an instruction-following LLM equipped with advanced language alignment, showing superior capability in English/Chinese generation, instruction following, and multi-turn interaction. BayLing can be deployed on a consumer-grade GPU with 16GB of memory and assists users with tasks such as translation, writing, creation, and suggestion.
This model is version 1.1 of BayLing-13B.
Compared with BayLing-13B-v1.0, BayLing-13B-v1.1 additionally incorporates extensive Chinese knowledge.
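As a rough sanity check on the 16GB deployment figure mentioned above, the memory needed for the weights alone can be estimated from the parameter count and the storage precision. The sketch below assumes roughly 13 billion parameters (read off the "13B" in the model name) and ignores activations, the KV cache, and framework overhead; the model card does not state which precision the 16GB figure refers to.

```python
# Assumption: ~13 billion parameters, inferred from "13B" in the model name.
PARAM_COUNT = 13e9

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "int8": 1.0, "int4": 0.5}

GPU_MEMORY_GB = 16  # the consumer-grade GPU size quoted in the description


def weight_memory_gb(param_count, bytes_per_param):
    """Memory for the weights only, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return param_count * bytes_per_param / 1e9


estimates = {
    dtype: weight_memory_gb(PARAM_COUNT, size)
    for dtype, size in BYTES_PER_PARAM.items()
}

for dtype, gb in estimates.items():
    verdict = "fits" if gb <= GPU_MEMORY_GB else "does not fit"
    print(f"{dtype:>8}: {gb:5.1f} GB of weights ({verdict} in {GPU_MEMORY_GB} GB)")
```

Under these assumptions, half-precision weights alone exceed 16GB, so fitting the model on such a GPU would likely require 8-bit or lower-precision loading.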
More info:
- Paper: https://arxiv.org/abs/2306.10968
- GitHub Repo: https://github.com/ictnlp/BayLing
- Online Demo: http://nlp.ict.ac.cn/bayling/demo
Authors
| Shaolei Zhang | Qingkai Fang | Zhuocheng Zhang | Zhengrui Ma |
| Yan Zhou | Langlin Huang | Mengyu Bu | Shangtong Gui |
| Yunji Chen | Xilin Chen | Yang Feng * |
Citation
If our work is helpful for you, please cite as:
@article{bayling,
title={BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models},
author={Shaolei Zhang and Qingkai Fang and Zhuocheng Zhang and Zhengrui Ma and Yan Zhou and Langlin Huang and Mengyu Bu and Shangtong Gui and Yunji Chen and Xilin Chen and Yang Feng},
journal={arXiv preprint arXiv:2306.10968},
year={2023},
url={https://arxiv.org/abs/2306.10968}
}