🔄 In a Training Loop

minyi

minyichen

51 20 439

min-yi-chen-68b6ab130

AI & ML interests

None yet

Recent Activity

liked a dataset 4 days ago

liodon-ai/gemma4-code-review-instruct

updated a dataset 10 days ago

minyichen/hermes_fc_call_mt_R1

liked a dataset 13 days ago

miromind-ai/MiroMind-M1-SFT-719K


Organizations

upvoted 2 collections 7 months ago

🌟 T1 Series

Collection

Instruction-tuned Gemma-3 models optimized for agentic workflows in Traditional Chinese. • 5 items • Updated May 6 • 3

🧠 Traditional Chinese Reasoning Datasets

Collection

A curated collection of datasets designed to evaluate and train reasoning capabilities in Traditional Chinese across various domains. • 3 items • Updated May 6 • 9

upvoted 4 articles 11 months ago

Article

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

nvidia • Aug 11, 2025 • 76

Article

What’s MXFP4? The 4-Bit Secret Powering OpenAI’s GPT‑OSS Models on Modest Hardware

RakshitAralimatti • Aug 8, 2025 • 36

Article

How To Build a News Agent with GPT-OSS, Hugging Face Inference & Gradio

fdaudens • Aug 14, 2025 • 27

Article

Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training

smohammadi, siro1, winglian, marcsun13, djsaunde • Aug 8, 2025 • 99

upvoted 4 articles about 1 year ago

Article

SmolLM3: smol, multilingual, long-context reasoner

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 782

Article

The Large Language Model Course

mlabonne • Jan 16, 2025 • 230

Article

Vision Language Models Explained

merve, edbeeching • Apr 11, 2024 • 541

Article

Vision Language Models (Better, faster, stronger)

merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 614

upvoted a collection about 1 year ago

Common Pile v0.1

Collection

All resources related to Common Pile v0.1, an 8TB dataset of public domain and openly licensed text • 4 items • Updated Jun 6, 2025 • 41

upvoted an article over 1 year ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

ariG23498, merve, pcuenq, reach-vb • Mar 12, 2025 • 497

upvoted a collection over 1 year ago

Tulu 3 Datasets

Collection

All datasets released with Tulu 3 -- state of the art open post-training recipes. • 32 items • Updated Mar 2 • 102

upvoted a collection almost 2 years ago

Qwen2.5

Collection

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 43 items • Updated Mar 2 • 731

upvoted 5 articles almost 2 years ago

Article

Merge Large Language Models with mergekit

mlabonne • Jan 9, 2024 • 156

Article

Uncensor any LLM with abliteration

mlabonne • Jun 13, 2024 • 875

Article

MTEB Leaderboard : User guide and best practices

lyon-nlp-group • Mar 13, 2024 • 9

Article

🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets

dvilasuero • Jun 4, 2024 • 79

Article

🔥 Argilla 2.0: the data-centric tool for AI makers 🤗

dvilasuero • Jul 30, 2024 • 40
