Text Generation
PEFT
Safetensors
English
qwen3
lora
unsloth
agent
tool-use
agentbench
alfworld
dbbench
conversational
Instructions to use AF0815/agentbench with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use AF0815/agentbench with PEFT:
Task type is invalid.
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- Unsloth Studio
How to use AF0815/agentbench with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for AF0815/agentbench to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for AF0815/agentbench to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for AF0815/agentbench to start chatting
Load model with FastModel
pip install unsloth from unsloth import FastModel model, tokenizer = FastModel.from_pretrained( model_name="AF0815/agentbench", max_seq_length=2048, )
File size: 2,811 Bytes
fd778b3 d928909 3327c76 fd778b3 d928909 fd778b3 d928909 fd778b3 d928909 fd778b3 d928909 fd778b3 d928909 fd778b3 d928909 8ada862 d928909 79fe4e9 d928909 348e1a8 bdf7c69 03ff6d6 d928909 5fd2d89 d928909 348e1a8 fd778b3 46a04e4 5fd2d89 8ada862 5fd2d89 d928909 fd778b3 4f4b30a fd778b3 d928909 fd778b3 d928909 fd778b3 d928909 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 | ---
base_model: Qwen/Qwen3-4B-Instruct-2507
datasets:
- u-10bei/dbbench_sft_dataset_react_v4
- u-10bei/sft_alfworld_trajectory_dataset_v5
language:
- en
license: apache-2.0
library_name: peft
pipeline_tag: text-generation
tags:
- lora
- peft
- unsloth
- agent
- tool-use
- agentbench
- alfworld
- dbbench
---
# qwen3-4b-agentbench-dbalf-lora
This repository provides a **LoRA adapter** fine-tuned from **Qwen/Qwen3-4B-Instruct-2507**
using **LoRA + Unsloth** for **AgentBench-style multi-turn agent trajectories**.
This repository contains **LoRA adapter weights only**.
The base model must be loaded separately.
## Training Objective
This adapter is trained to improve **multi-turn agent task performance** on:
- **DBBench** (database operation / SQL generation trajectories)
- **ALFWorld** (household task trajectories)
Loss is applied to **all assistant turns** in the trajectory, enabling the model to learn:
- environment observation
- action selection
- tool use / operation formatting
- recovery from intermediate errors
## Training Data
- DBBench dataset: `u-10bei/dbbench_sft_dataset_react_v4`
- ALFWorld dataset: `u-10bei/sft_alfworld_trajectory_dataset_v5`
- Mixing ratio (pre-merge target): **DB:ALF = 1:0**
### DB Oversampling (category-aware)
Enabled: **False**
DB category weights used during training-data preparation:
- counting: 1
- comparison: 1
- ranking: 1
- select: 1
- insert: 1
- update: 1
- other: 1
## Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: LoRA (full precision base)
- Max sequence length: 2048
- Epochs: 1.2
- Learning rate: 3e-06
- LoRA: r=32, alpha=64, dropout=0.0
- Per-device train batch size: 2
- Gradient accumulation: 4
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "AF0815/agentbench"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
base,
torch_dtype=torch.float16,
device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)
```
## Notes
- This repository is intended for **adapter-only** distribution.
- Please ensure compliance with the **base model license/terms** in addition to this repository's license.
- If you publish evaluation results, it is recommended to report:
- AgentBench task split / seeds
- DBBench / ALFWorld mix ratio
- DB oversampling settings
- decoding settings
## Sources & Terms (IMPORTANT)
Training data:
- u-10bei/dbbench_sft_dataset_react_v4
- u-10bei/sft_alfworld_trajectory_dataset_v5
Dataset license / terms:
- Please follow the original license and terms of each dataset repository.
- This adapter repository license (**apache-2.0**) applies to the adapter files in this repository.
|