Fine-tuned openai/whisper-small (244M params) for Italian ASR on multiple datasets.
Author: Ettore Di Giacinto
Brought to you by the LocalAI team. This model is ready to use with LocalAI via the whisperx backend.
Save the following as whisperx-small-it-multi.yaml in your LocalAI models directory:
name: whisperx-small-it-multi
backend: whisperx
known_usecases:
  - transcript
parameters:
  model: LocalAI-io/whisper-small-it-multi-ct2-int8
  language: it
Then transcribe audio via the OpenAI-compatible endpoint:
curl http://localhost:8080/v1/audio/transcriptions \
-H "Content-Type: multipart/form-data" \
-F file="@audio.mp3" \
-F model="whisperx-small-it-multi"
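The same request can be sent from Python. The following is a minimal sketch using only the standard library, not an official LocalAI client. The helper names `build_multipart` and `transcribe` are hypothetical, and it assumes the endpoint returns OpenAI-style JSON with a `text` field and that LocalAI is listening on `localhost:8080` as in the curl example.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode plain form fields plus one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; '
            f'filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, model="whisperx-small-it-multi",
               url="http://localhost:8080/v1/audio/transcriptions"):
    """POST an audio file to the transcription endpoint and return the parsed JSON."""
    with open(path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart(
        {"model": model}, "file", os.path.basename(path), audio
    )
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Usage (requires a running LocalAI instance):
#   print(transcribe("audio.mp3")["text"])  # "text" key is assumed, OpenAI-style
```

In practice a library such as `requests` would shorten this considerably; the standard-library version is shown only to avoid extra dependencies.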
Evaluated on a combined test set (Common Voice + MLS + VoxPopuli). Word error rate (WER) by training step:
| Step | WER |
|---|---|
| 1000 | 21.51% |
| 3000 | 18.30% |
| 5000 | 17.32% |
| 7000 | 16.21% |
| 10000 | 15.63% |
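For readers unfamiliar with the metric, WER is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below illustrates that definition only; it is not the evaluation script used for the table above, and it applies no text normalization (casing, punctuation), which real evaluations usually do.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.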
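To run the model with Hugging Face Transformers:
from transformers import pipeline

# Load the fine-tuned checkpoint as a speech-recognition pipeline.
pipe = pipeline("automatic-speech-recognition", model="LocalAI-io/whisper-small-it-multi")
# Force Italian transcription rather than relying on language detection.
result = pipe("audio.mp3", generate_kwargs={"language": "it", "task": "transcribe"})
print(result["text"])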
For optimized CPU inference, use the CTranslate2 INT8 version: LocalAI-io/whisper-small-it-multi-ct2-int8