
Libraries

How to use Vaultkeeper/Corpus-Callosum-1.5B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Vaultkeeper/Corpus-Callosum-1.5B")

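# Load model directly (the causal-LM class keeps the LM head needed for generation)
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Vaultkeeper/Corpus-Callosum-1.5B")
model = AutoModelForCausalLM.from_pretrained("Vaultkeeper/Corpus-Callosum-1.5B", dtype="auto")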


vLLM

How to use Vaultkeeper/Corpus-Callosum-1.5B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Vaultkeeper/Corpus-Callosum-1.5B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Vaultkeeper/Corpus-Callosum-1.5B",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/Vaultkeeper/Corpus-Callosum-1.5B

SGLang

How to use Vaultkeeper/Corpus-Callosum-1.5B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Vaultkeeper/Corpus-Callosum-1.5B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Vaultkeeper/Corpus-Callosum-1.5B",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Vaultkeeper/Corpus-Callosum-1.5B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Vaultkeeper/Corpus-Callosum-1.5B",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use Vaultkeeper/Corpus-Callosum-1.5B with Docker Model Runner:
```
docker model run hf.co/Vaultkeeper/Corpus-Callosum-1.5B
```

Corpus-Callosum

by VaultAI

Deployment Status: UNRELEASED

[ PRE-ALPHA ] SOVEREIGN-CODE & CORPUS-CALLOSUM | ARCHITECTING...

✅ Routing, Millisecond-Fast.

The "Central Nervous System" of the VaultAI ecosystem. Corpus-Callosum is not a chatbot; it is a high-speed intent classifier and traffic controller.

Engineered for millisecond latency, Corpus-Callosum is a 1.5B-parameter SLERP merge. It bridges the generalist comprehension of Qwen2.5-1.5B-Instruct with the syntax sensitivity of Qwen2.5-Coder-1.5B-Instruct. Its sole job is to analyze your prompt and decide which expert in your fleet is best equipped to handle it.

🧠 Architecture & Identity: The Traffic Cop

Corpus-Callosum is designed to run quietly as a background process in system RAM. Using a 50/50 Spherical Linear Interpolation (SLERP) of its two parent models, VaultAI has built a small router intended to tell a request for lore from a request for code.
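
For readers unfamiliar with the merge method, here is a minimal sketch of spherical linear interpolation between two weight vectors in pure Python. It illustrates one common formulation of the formula only and is not VaultAI's merge pipeline; the function name `slerp`, the `eps` threshold, and the fallback to plain linear interpolation for near-parallel or zero vectors are assumptions of this sketch. Setting `t = 0.5` corresponds to a 50/50 blend.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors.

    t=0 returns v0, t=1 returns v1, t=0.5 is the midpoint along the arc.
    """
    if len(v0) != len(v1):
        raise ValueError("vectors must have the same length")

    # Plain linear interpolation, used as a fallback below.
    lerp = [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        # A zero vector has no direction, so the angle is undefined.
        return lerp

    # Angle between the two vectors, clamped against rounding error.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        # Nearly parallel or antiparallel: the SLERP weights are ill-conditioned.
        return lerp

    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# 50/50 blend of two orthogonal unit vectors.
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

In a real merge the same interpolation would be applied tensor by tensor across both parent checkpoints, typically through a dedicated merge tool rather than hand-written code.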

Key Routing Commands:

[1] Creative/Abstract: Routes to Ouroboros-level creative reasoning.
[2] Logic/Code: Routes to Sovereign-level execution engines.
[3] Hybrid Relay: Triggers a multi-stage collaborative workflow between experts.
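
The three routing commands above could be consumed by a small dispatcher like the one below. This is a sketch under stated assumptions: it assumes the router's one-token output is the digit `1`, `2`, or `3` (possibly wrapped in brackets or whitespace), and the route names and the `parse_route` helper are hypothetical, not part of any VaultAI API.

```python
# Hypothetical route names keyed by the digit the router is assumed to emit.
ROUTES = {
    "1": "creative",    # [1] Creative/Abstract
    "2": "logic_code",  # [2] Logic/Code
    "3": "hybrid",      # [3] Hybrid Relay
}


def parse_route(raw_output, default=None):
    """Return the route name for the first routing digit found in raw_output.

    Falls back to `default` when the output contains no known digit,
    so the caller decides how to handle an unparseable response.
    """
    for ch in raw_output:
        if ch in ROUTES:
            return ROUTES[ch]
    return default
```

Returning a default instead of raising keeps the router from blocking the pipeline when the output is malformed; a caller might choose to send such prompts to a general-purpose expert.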

⚡ Performance & Efficiency

Corpus-Callosum is optimized to live entirely in CPU RAM, leaving 100% of your GPU VRAM available for the primary experts.

| Metric | Value | Hardware Requirement | System Footprint |
| --- | --- | --- | --- |
| Latency (prompt processing) | < 50 ms | 0% GPU VRAM | ~1.1 GB (Q4_K_M) |
| Model Size | 1.5B parameters | CPU/RAM only | Lightweight background process |
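
As a rough sanity check on the footprint figure in the table above, the arithmetic below compares it with an unquantized 16-bit copy of the same parameter count. The 16-bit baseline and the use of decimal gigabytes (1 GB = 10^9 bytes) are assumptions of this sketch; only the 1.5B parameter count and the ~1.1 GB footprint come from the table.

```python
PARAMS = 1.5e9              # parameter count from the table
STATED_FOOTPRINT_GB = 1.1   # Q4_K_M footprint from the table
BYTES_PER_GB = 1e9          # assumption: decimal gigabytes


def fp16_footprint_gb(params):
    """Size of the weights alone at 2 bytes per parameter."""
    return params * 2 / BYTES_PER_GB


def implied_bits_per_param(params, footprint_gb):
    """Average bits per parameter implied by a given footprint."""
    return footprint_gb * BYTES_PER_GB * 8 / params


fp16_gb = fp16_footprint_gb(PARAMS)
bits = implied_bits_per_param(PARAMS, STATED_FOOTPRINT_GB)
print(f"16-bit baseline: {fp16_gb:.1f} GB")
print(f"Implied average: {bits:.2f} bits per parameter")
```

The implied average lands above 4 bits per parameter, which is plausible for a mixed-precision quantization format, but the exact breakdown depends on the actual GGUF file.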

Standardized Accuracy Benchmarks

Benchmarks are currently queued to test classification accuracy.

| Benchmark | Focus Area | Accuracy | Status |
| --- | --- | --- | --- |
| Intent Classification | Logic vs Creative | TBD | ⏳ Pending Eval |
| MMLU (Micro) | Knowledge Retention | TBD | ⏳ Pending Eval |

Model Details

Type: Classification Language Model (Slerp Merge)
Base Architecture: Qwen 2.5 (1.5B)
Merge Method: SLERP (Spherical Linear Interpolation)
Blend Ratio:
- 50% — Qwen2.5-1.5B-Instruct
- 50% — Qwen2.5-Coder-1.5B-Instruct
Tokenizer: Qwen 2.5 (1.5B Base)
License: Apache 2.0

Why Corpus-Callosum?

Low-Latency Orchestration: Doesn't waste time being polite. It reads, classifies, and hands off, targeting the < 50 ms latency listed above.
VRAM Preservation: Designed specifically for users with limited VRAM who need to manage multi-model expert fleets.
Embedded Directive: Metadata-overridden to ensure it never breaks its "one-token" output rule.
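
A client that relies on the one-token rule might cap generation at a single token when calling an OpenAI-compatible server such as the vLLM endpoint shown earlier. The sketch below uses only the standard library. The helper names, the `temperature` of 0, the default `base_url`, and passing the user prompt verbatim without any wrapper template are all assumptions; the model card does not document the expected prompt format.

```python
import json
import urllib.request

MODEL_ID = "Vaultkeeper/Corpus-Callosum-1.5B"


def build_route_request(prompt, base_url="http://localhost:8000"):
    """Build a /v1/completions request that asks for exactly one token."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,      # assumption: raw prompt, no template
        "max_tokens": 1,       # matches the "one-token" output rule
        "temperature": 0.0,    # assumption: deterministic routing
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def route(prompt, base_url="http://localhost:8000", timeout=5.0):
    """Send the prompt to a running server and return the raw routing text."""
    request = build_route_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

Calling `route("Write a sonnet about rust")` requires a server already listening at `base_url`; `build_route_request` can be inspected on its own without one.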

