🌟 T1 Series Collection Instruction-tuned Gemma-3 models optimized for agentic workflows in Traditional Chinese. • 5 items • Updated 9 days ago • 3
🧠Traditional Chinese Reasoning Datasets Collection A curated collection of datasets designed to evaluate and train reasoning capabilities in Traditional Chinese across various domains. • 3 items • Updated 9 days ago • 9
Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks nvidia • Aug 11, 2025 • 76
Article What’s MXFP4? The 4-Bit Secret Powering OpenAI’s GPT‑OSS Models on Modest Hardware RakshitAralimatti • Aug 8, 2025 • 35
Article How To Build a News Agent with GPT-OSS, Hugging Face Inference & Gradio fdaudens • Aug 14, 2025 • 26
Article Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training smohammadi, siro1, winglian, marcsun13, djsaunde • Aug 8, 2025 • 98
Article SmolLM3: smol, multilingual, long-context reasoner eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 775
Article Vision Language Models (Better, faster, stronger) merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 611
Common Pile v0.1 Collection All resources related to Common Pile v0.1, an 8TB dataset of public domain and openly licensed text • 4 items • Updated Jun 6, 2025 • 40
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM ariG23498, merve, pcuenq, reach-vb • Mar 12, 2025 • 496
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 32 items • Updated Mar 2 • 97
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 43 items • Updated Mar 2 • 720
Article MTEB Leaderboard : User guide and best practices lyon-nlp-group • Mar 13, 2024 • 9
Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets dvilasuero • Jun 4, 2024 • 79
Article 🔥 Argilla 2.0: the data-centric tool for AI makers 🤗 dvilasuero • Jul 30, 2024 • 39