arxiv:2606.04694

DuDi: Dual-Signal Distillation with Cross-Lingual Verbalizer

Published on Jun 3

Abstract

DuDi is a dual-signal multilingual distillation framework that enhances small language models' performance across diverse languages through sequence-level and token-level signals combined with cross-lingual verbalization.

Small language models (SLMs) are efficient and scalable, but their multilingual capabilities degrade severely at sub-billion scales, especially for Southeast Asian (SEA) languages. We introduce DuDi, a dual-signal multilingual distillation framework that combines an online sequence-level signal with off-policy and on-policy token-level signals. DuDi further uses a cross-lingual verbalizer to refine teacher feedback and improve teacher-student transferability in multilingual settings. Experiments on SEA-HELM across multiple model families, scales, and teacher-student settings show that DuDi consistently outperforms competitive distillation baselines. Ablations and analyses confirm that sequence-level optimization, token-level supervision, and cross-lingual verbalization provide complementary and transferable learning signals for multilingual SLMs.
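The abstract does not state the actual objective, so the following is only a minimal sketch of how a sequence-level signal and a token-level signal could be mixed into one training loss. Everything in it is an assumption made for illustration: the function names (`token_kl`, `sequence_term`, `dual_signal_loss`), the forward KL for the token-level term, the score-weighted negative log-likelihood for the sequence-level term, and the mixing weight `alpha`. None of these are taken from the paper, and the cross-lingual verbalizer is not modeled.

```python
import math


def token_kl(teacher_probs, student_probs, eps=1e-12):
    """Mean per-token KL(teacher || student).

    Each argument is a list of per-token probability distributions
    (one list of floats per token position). Hypothetical helper.
    """
    if not teacher_probs or len(teacher_probs) != len(student_probs):
        raise ValueError("need equally long, non-empty sequences")
    total = 0.0
    for p_teacher, p_student in zip(teacher_probs, student_probs):
        # Terms with zero teacher probability contribute nothing to the KL.
        total += sum(
            p * math.log((p + eps) / (q + eps))
            for p, q in zip(p_teacher, p_student)
            if p > 0
        )
    return total / len(teacher_probs)


def sequence_term(student_token_logprobs, score):
    """Assumed sequence-level term: score-weighted negative log-likelihood
    of a sequence sampled from the student. `score` stands in for whatever
    sequence-level feedback the teacher provides."""
    return -score * sum(student_token_logprobs)


def dual_signal_loss(teacher_probs, student_probs,
                     student_token_logprobs, score, alpha=0.5):
    """Convex mix of the two signals; `alpha` is an assumed hyperparameter."""
    seq = sequence_term(student_token_logprobs, score)
    tok = token_kl(teacher_probs, student_probs)
    return alpha * seq + (1.0 - alpha) * tok


if __name__ == "__main__":
    teacher = [[0.9, 0.1], [0.2, 0.8]]
    student = [[0.5, 0.5], [0.3, 0.7]]
    print(dual_signal_loss(teacher, student, [-0.7, -0.4], score=1.0))
```

In this sketch, setting `alpha=0` reduces the loss to pure token-level supervision and `alpha=1` to the pure sequence-level term, which loosely mirrors the kind of ablation the abstract mentions.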
