Tags: Text Generation, Transformers, PyTorch, Safetensors, English, Chinese, llama, conversational, text-generation-inference
Instructions to use infly/OpenCoder-1.5B-Base with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use infly/OpenCoder-1.5B-Base with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="infly/OpenCoder-1.5B-Base")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("infly/OpenCoder-1.5B-Base")
model = AutoModelForCausalLM.from_pretrained("infly/OpenCoder-1.5B-Base")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use infly/OpenCoder-1.5B-Base with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "infly/OpenCoder-1.5B-Base"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "infly/OpenCoder-1.5B-Base",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker
docker model run hf.co/infly/OpenCoder-1.5B-Base
- SGLang
How to use infly/OpenCoder-1.5B-Base with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "infly/OpenCoder-1.5B-Base" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "infly/OpenCoder-1.5B-Base",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "infly/OpenCoder-1.5B-Base" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "infly/OpenCoder-1.5B-Base",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
- Docker Model Runner
How to use infly/OpenCoder-1.5B-Base with Docker Model Runner:
docker model run hf.co/infly/OpenCoder-1.5B-Base
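The curl calls in the vLLM and SGLang sections send the same JSON body to an OpenAI-compatible `/v1/chat/completions` endpoint. A minimal Python sketch of that request, using only the standard library, might look like the following. The helper names `build_chat_request` and `send_chat_request` are hypothetical and not part of either server's tooling; the ports (8000 for vLLM, 30000 for SGLang) are taken from the commands above.

```python
import json
import urllib.request


def build_chat_request(prompt,
                       model="infly/OpenCoder-1.5B-Base",
                       base_url="http://localhost:8000"):
    """Build (but do not send) the POST request that the curl examples issue.

    base_url is an assumption about where the server listens:
    port 8000 matches the vLLM example, port 30000 the SGLang one.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request):
    """Send the request and return the decoded JSON response.

    Only works while a server started as described above is running.
    """
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running, `send_chat_request(build_chat_request("What is the capital of France?"))` would return the parsed response; for SGLang, pass `base_url="http://localhost:30000"`.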
File size: 1,127 Bytes
3c0f75f
{
"<code_to_intermediate>": 96521,
"<empty_output>": 96520,
"<file_sep>": 96511,
"<fim_middle>": 96508,
"<fim_prefix>": 96507,
"<fim_suffix>": 96509,
"<intermediate_to_code>": 96522,
"<issue_closed>": 96514,
"<issue_comment>": 96513,
"<issue_start>": 96512,
"<jupyter_code>": 96517,
"<jupyter_output>": 96518,
"<jupyter_script>": 96519,
"<jupyter_start>": 96515,
"<jupyter_text>": 96516,
"<pr>": 96523,
"<pr_base>": 96526,
"<pr_base_code>": 96528,
"<pr_comment>": 96531,
"<pr_diff>": 96529,
"<pr_diff_hunk>": 96530,
"<pr_diff_hunk_comment_line>": 96538,
"<pr_event_id>": 96532,
"<pr_file>": 96527,
"<pr_in_reply_to_comment_id>": 96537,
"<pr_in_reply_to_review_id>": 96536,
"<pr_is_merged>": 96525,
"<pr_review>": 96533,
"<pr_review_comment>": 96535,
"<pr_review_state>": 96534,
"<pr_status>": 96524,
"<repo_name>": 96510,
"<|endoftext|>": 96506,
"<|end|>": 96500,
"<|im_end|>": 96539,
"<|im_start|>": 96540,
"<|message|>": 96501,
"<|pad|>": 96505,
"<|start|>": 96499,
"<|tool_end|>": 96504,
"<|tool_excute|>": 96503,
"<|tool_start|>": 96502
}
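The file maps each added special token to its vocabulary id; the 42 entries cover ids 96499 through 96540 without gaps. A small sketch of how such a mapping could be parsed and inverted into an id-to-token lookup follows. It embeds only an excerpt of the entries above so that it stays self-contained; `invert_token_map` is a hypothetical helper name, not something shipped with the model.

```python
import json

# Excerpt of the mapping above, copied verbatim (token -> id).
ADDED_TOKENS_JSON = """
{
  "<fim_prefix>": 96507,
  "<fim_middle>": 96508,
  "<fim_suffix>": 96509,
  "<|im_end|>": 96539,
  "<|im_start|>": 96540
}
"""


def invert_token_map(token_to_id):
    """Return an id -> token dict, refusing duplicate ids."""
    id_to_token = {}
    for token, token_id in token_to_id.items():
        if token_id in id_to_token:
            raise ValueError(f"id {token_id} is assigned to more than one token")
        id_to_token[token_id] = token
    return id_to_token


token_to_id = json.loads(ADDED_TOKENS_JSON)
id_to_token = invert_token_map(token_to_id)

# List the excerpt in id order, the way the tokenizer's vocabulary is laid out.
for token_id in sorted(id_to_token):
    print(token_id, id_to_token[token_id])
```

To run this against the whole file, replace the embedded string with the contents of a locally downloaded `added_tokens.json`.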