- 295B total / 21B active / 256K context
- Fused fast-and-slow thinking in a single model
- First model trained on Hunyuan's rebuilt pretraining + RL infra (Feb-Apr)

Benchmarks:
- SWE-Bench Verified, Terminal-Bench 2.0, BrowseComp, WideSearch: competitive results, particularly strong on agentic tool use
- Top score on Tsinghua's 2026 Spring math PhD qualifying exam
- Strong context-learning and instruction-following on Tencent's CL-bench / CL-bench-Life
🔥 GRM-2.5 - The most POWERFUL model for local inference

GRM-2.5 is the newest model from Orion LLM Labs. It delivers consistent raw reasoning and can generate very precise responses, comparable to large models, while staying at only 4B parameters.

GRM-2.5 is also a strong option for local agentic environments, performing very well at coding, terminal-agent tasks, and more. It can generate 1,000 lines of consistent code and program like large models. GRM-2.5 is the best base for fine-tuning to date and has vision, meaning it can interpret images and videos.
🔥 GRM2 - The small one that surpasses the big ones. What if a 3B-parameter model could beat a 32B-parameter model on every benchmark? We show that it can. GRM2 is a 3B-parameter model based on the Llama architecture, trained for long reasoning and high performance on complex tasks. It is the first 3B model to outperform Qwen3-32B on ALL benchmarks, and it outperforms o3-mini on almost all benchmarks. 🤗 Model: OrionLLM/GRM2-3b. It is also the first 3B model to generate over 1,000 lines of code and to score 39.0 on xBench-DeepSearch-2510.
Over the past year, we've seen a shift in LLM Post-Training. Previously, Supervised Fine-Tuning was the most important part: making models imitate curated Question-Answer pairs.
Now we also have Reinforcement Learning with Verifiable Rewards. With techniques like GRPO, models can learn through trial and error in dynamic environments. They can climb to new heights without relying on expensively prepared data.
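The idea behind verifiable rewards can be sketched in a few lines. Below is a minimal, illustrative example (not any specific library's API; the function names and the regex-based answer extraction are my own assumptions): a reward is computed by programmatically checking the model's answer, and a GRPO-style group baseline turns those rewards into advantages.

```python
import re

def math_reward(completion: str, expected_answer: str) -> float:
    """Return 1.0 if the last number in the completion matches the expected answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0
    return 1.0 if numbers[-1] == expected_answer else 0.0

# In a GRPO-style loop, several completions are sampled for one prompt,
# each is scored, and the group mean serves as the baseline for advantages.
completions = ["The answer is 12.", "Let me think... 7", "12"]
rewards = [math_reward(c, "12") for c in completions]
baseline = sum(rewards) / len(rewards)
advantages = [r - baseline for r in rewards]
```

No human-labeled preference data is needed here: the checker itself is the reward signal, which is what makes this style of training cheap to scale.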
But what are these environments in practice? And how do you build them effectively?
Fascinated by these concepts, I spent time exploring this space through experiments, post-training Small Language Models. I've packaged everything I learned into this short course.
What you'll learn
🔹 Agents, Environments, and LLMs: how to map Reinforcement Learning concepts to the LLM domain
🔹 How to use Verifiers (open-source library by Prime Intellect) to build RL environments as software artifacts
🔹 Common patterns: how to build single-turn, multi-turn, and tool-use environments
🔹 Hands-on: turn a small language model (LFM2-2.6B by LiquidAI) into a Tic Tac Toe master
  🔸 Build the game Environment
  🔸 Use it to generate synthetic data for SFT warm-up
  🔸 Group-based Reinforcement Learning
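To give a flavor of what the game environment step looks like, here is a minimal sketch of Tic Tac Toe with a reward signal suitable for RL post-training. This is illustrative only; the course builds its environment with the Verifiers library, whose actual API will differ, and the `step`/`winner` names and reward values here are my own choices.

```python
from typing import Optional

# All eight winning lines on a 3x3 board stored as a flat list of 9 cells.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board: list) -> Optional[str]:
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] and board[a] == board[b] == board[c]:
            return board[a]
    return None

def step(board: list, move: int, player: str):
    """Apply a move; return (new_board, reward, done).

    Illegal moves are penalized and end the episode, so the model
    first has to learn to play valid moves at all.
    """
    if not (0 <= move < 9) or board[move]:
        return board, -1.0, True   # illegal move: negative reward, episode over
    board = board[:]               # copy so the caller's board is untouched
    board[move] = player
    if winner(board) == player:
        return board, 1.0, True    # win
    if all(board):
        return board, 0.0, True    # draw: board full, no winner
    return board, 0.0, False       # game continues
```

An episode wraps this loop: the model emits a move as text, the environment parses it, calls `step`, and the terminal reward is what the RL algorithm optimizes; intermediate legal moves get zero reward so credit flows from the game outcome.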
If you're interested in building "little worlds" where LLMs can learn, this course is for you.